This page can be found: http://centauri.ezy.net.au/~fastvid/picsound.htm

Record+play fast 1bit sound on a PIC!

Roman Black - December 2001
This is public domain, use it as you like, give me credit if you like.
Jan 20 2002 NEW! Full playback code for PIC and I2C serial EEPROM - see page bottom

BTc "Binary Time constant" algorithm.

A system to record and/or play sound in a bitstream format using just one digital output pin.

This is a sound playback system for a PIC or any other microcontroller. It uses a clever encoding system to mathematically model the actual performance of the RC filter when the signal is encoded. This allows playback of good quality sound with the absolute minimum software and hardware. The RC filter modeling (encoding algorithm) has been refined to be PIC friendly in binary math, giving the ability to playback AND RECORD in real time even on a PIC, even with high rates up to 150+ kbit/sec.

The playback hardware is:

* 1 PIC digital output pin
* 1 resistor
* 1 capacitor

This makes it suitable for adding to small and low-cost products, providing speech or sound confirmation of keypress, talking PICs, sound record/playback devices, etc.

Playback theory:

With this system sound playback only requires an RC filter, and speaker, earphone socket etc. Sound playback is very simple and fast. Data is a single stream of bits, at a constant rate. The PIC only needs to output one bit, on one pin, at a regular timed rate to make the sound or speech. Sound can be added to an existing PIC project, if one pin and a few instructions each interrupt are available.

Of course a digital output driving a simple RC filter is a common system, used in a number of cheap applications, but it has drawbacks due to the RC time constant giving a non-linear waveform. My BTc algorithm provides a simple fix, and is fast enough that it can do real-time compensation during encoding. The math modeling of my encoder maintains a model of the RC filter with zero accumulated error. If the bitrate is high enough this system will give CD-quality audio, due to the PWM effect of the dithering output, and the RC filter linearity problem is fully pre-compensated by the math modeling in the encoder. So the player doesn't need any math. :o)

Encoding theory:

Encoding the sound is the hard part. The stream of bits must be chosen correctly so that the final output waveform is as close as possible to the original waveform. But the actual output waveform depends on the electrical characteristics of the RC filter, ohms, uF etc. We can make the voltage on the RC filter rise by sending a "1" bit, or fall by sending a "0" bit, but unfortunately the voltage doesn't rise or fall by an equal amount for each bit... A form of compensation is needed to make the filter produce the correct waveform.

This picture shows the bitstream and the resulting waveform. Not very linear, is it?

Encoding methods.

In order to make the final playback waveform the closest reproduction of the original waveform I tried some closed-loop encoding systems, starting with the two most obvious types of encoding;

* REACTIVE: If sample is higher than math model make a 1 bit. If sample is lower than math model make a 0 bit.
* PREDICTIVE: In math model, make both bits, "predicting" each result. Then pick the bit that gives the output closest to the desired sound sample.

Reactive is easier with only one model needing to be generated with each bit. Predictive works a lot better with all the sound samples I tried. I only used the predictive algorithm in my encoder program. Actually, I did support the reactive algorithm too, but deleted it later when it proved inferior.

The trick is getting a fast and accurate math model generated, especially fast enough for high speed bitstream recording/encoding on a PIC.
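To make the difference between the two methods concrete, here is a minimal Python sketch of a single encoding decision. It is an illustration only, not code from my encoder: the 0..255 model range, the function names, and the tie-break toward HI are all assumptions, and the model step borrows the divide-by-8 rule described later on this page.

```python
VMIN, VMAX = 0, 255  # assumed 8-bit model range (stands in for 0v..5v)

def predict(model, bit, shift=3):
    """Model voltage after one bit; shift=3 means divide by 8."""
    if bit:
        return model + ((VMAX - model) >> shift)
    return model - ((model - VMIN) >> shift)

def reactive_bit(sample, model):
    """Reactive rule: 1 if the sample is above the model, else 0."""
    return 1 if sample > model else 0

def predictive_bit(sample, model, shift=3):
    """Predictive rule: try both bits, keep the one that lands closest.
    Ties go to HI here, which is an arbitrary choice."""
    hi_err = abs(predict(model, 1, shift) - sample)
    lo_err = abs(predict(model, 0, shift) - sample)
    return 1 if hi_err <= lo_err else 0

# Near the top rail a LO bit drops far more than a HI bit rises.
model, sample = 240, 239
print(reactive_bit(sample, model))    # 0
print(predictive_bit(sample, model))  # 1
```

With the model at 240 and the sample at 239, the reactive rule sends a 0 and the model falls to 210, while the predictive rule sends a 1 and lands on 241. That kind of case is probably why predictive did better on the samples I tried.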

BTc Encoding Algorithm:

I haven't seen this BTc algorithm used or claimed by anyone. It is basic encoding methodology, just done in a clever way to make it very easy and fast on a PIC. The use of "Binary Time constants" to make the filter calculation easy is my idea, but as always some other person may have thought of it before I did. I don't read enough technical papers these days. :o)

Why BTc??

This was my solution to the problem of math modeling the RC filter on a PIC, and it allows high speed encoding. Modeling the RC filter requires the "time constant", or Tc = RC if you remember the math. In one Tc the voltage moves 0.632 (1 - 1/e) of the way to its target, which is not of much use to us when trying to encode each bit in real time unless you have a Pentium with floating point.

When I was testing different high speed encoding solutions I came across a very simple idea. Instead of calculating Tc as given and then doing more calcs to reduce to a bit speed level suitable for encoding, I could simply combine the whole lot if a specific Tc could be assumed. This allows ONE calculation, a simple binary division, to do the entire RC math modelling!

If a SPECIFIC time constant is chosen, the TIME for each bit matches the performance of the RC filter in a very specific way, ie; the charge on C will rise by exactly 1/4 or 1/8 or 1/16 of the remaining voltage during the time that it takes for one bit! Magic.

Here are the binary time constants, which I have provided for you. Each one is the time for one bit, as a fraction of Tc;

* BTc2 = 0.6931 x Tc (voltage changes 1/2 each step)
* BTc4 = 0.2877 x Tc (voltage changes 1/4 each step)
* BTc8 = 0.1335 x Tc (changes 1/8th etc)
* BTc16 = 0.0645 x Tc
* BTc32 = 0.0317 x Tc
* BTc64 = 0.0157 x Tc

Real world example?

So if we choose the BTc8 system, we know that the charge (voltage) on the C of our RC filter rises or falls by 1/8th of the remaining voltage with every bit in our playback bitstream. To do this we just need to tune the RC filter values to our required bitrate.

Don't worry too much about the math, at the bottom of this page I have provided software that will display your sound wave, encode it in the way you choose, and display the encoded "math model" waveform so you can see the actual waveform you will get on your RC filter. Then it will save the bitstream as a data file or even as complete PIC code with the sound data as RETLW table, ready to program directly into a PIC chip and play the sound back.

The math stuff

Imagine a PIC with 16MHz crystal, giving 4,000,000 instructions/sec. Now using prescaler=0 the timer0 interrupt occurs every 256 instructions, which is 15625 Hz. That is an ideal rate for our sound playback on a PIC.

Bitrate: 15625 Hz
(We choose BTc8) = 0.1335
So bit time = 0.1335 x Tc @ 15625 Hz
Therefore 1 / Tc = 0.1335 x 15625
1 / Tc = 2086 Hz
Tc = 1 / 2086 = 0.000479 seconds
Tc = RC
R = Tc / C (let's assume 0.1uF for convenience)
R = 0.000479 / 0.0000001
R = 4790 ohms

Hope that doesn't look too scary, but what we have done is found that for our desired bitrate at 15625Hz, we can use an RC filter of about 4790 ohms and 0.1uF capacitor, and for every bit which occurs at 15625Hz the voltage on our RC filter will rise or fall by EXACTLY 1/8th of the remaining voltage.

Now we can encode in real time on a PIC, and the main filter math is a couple of add/subtracts and one divide by 8. This means we only have to do a handful of very simple calculations for each bit. We can do the entire encoding process and produce the encoded bitstream in real time on a PIC to be played back perfectly on any other PIC with just a resistor and capacitor. How cool.

Below is an example showing the BTc8 algorithm encoding a file which contains speech and background music. This is the BTc8 1bit algorithm on a 19.5kHz sound. 19.5kHz = 20MHz PIC interrupted every 256 instructions.

The RED wave is the encoded waveform, as you can see it is a decent reproduction of the original sound waveform (green). The "spiky" points of the wave are not really important as they will be filtered by the speaker inductance, or a simple post-filter if needed. The main thing is that the AVERAGE of the wave is a decent reproduction of the original, as that is how your ear will hear it. :o)

You can still see the non-linearity in the encoded waveform, look anywhere there are a few 1s or 0s in a sequence. However this does not matter as the encoding process allows for all errors and still produces an acceptable reproduction waveform.
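Going back to the math stuff, the binary time constants and the resistor value can be checked with a few lines of Python. This is a minimal sketch and not part of my encoder software. It assumes an ideal RC filter, where the factor for a 1/n step comes from solving 1 - exp(-t/Tc) = 1/n, and the function names `btc_factor` and `resistor_for` are made up for illustration.

```python
import math

def btc_factor(n):
    """Bit time as a fraction of Tc so that the voltage moves 1/n of the
    remaining distance: solve 1 - exp(-t/Tc) = 1/n for t/Tc."""
    return -math.log(1.0 - 1.0 / n)

def resistor_for(bitrate_hz, n, cap_farads):
    """R in ohms such that one bit period equals btc_factor(n) x Tc."""
    tc = 1.0 / (btc_factor(n) * bitrate_hz)  # Tc in seconds
    return tc / cap_farads                   # Tc = R * C

for n in (2, 4, 8, 16, 32, 64):
    print(f"BTc{n} = {btc_factor(n):.4f} x Tc")

# The worked example: 15625 Hz bitrate, BTc8, 0.1uF capacitor.
print(round(resistor_for(15625, 8, 0.1e-6)), "ohms")
```

The same two functions can be used for any other interrupt rate or capacitor value, for example `resistor_for(19531, 16, 0.1e-6)` for BTc16 on a 20MHz PIC.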

The BTc8 1bit algorithm:

* sound data is in "samples", either in a file or coming directly from the ADC if we are recording in real time.
* we start the RC filter "model" voltage at 50%, or half voltage, on the PIC that would be 2.5v.
* for every new sound sample, do 3 steps:

1. predict a HI bit;
   * get the max rise voltage available, Vrise which is simply Vmax - Vmodel
   * divide Vrise by 8 (we are using BTc8)
   * add this amount to the existing model voltage Vmodel, to give the predicted voltage "if" we were to generate a HI bit.
   * we now have predicted the result of a HI bit.
2. predict a LO bit;
   * get the max drop voltage available, Vdrop which is simply Vmodel - Vmin
   * divide Vdrop by 8 (we are using BTc8)
   * subtract this amount from the existing model voltage Vmodel, to give the predicted voltage "if" we were to generate a LO bit.
   * we now have predicted the result of a LO bit.
3. Now test which of the two predictions is closest to our actual sound sample. This takes 2 or 3 simple subtractions.
   * If HI bit was the closest, we generate a HI bit and keep that voltage as the new Vmodel.
   * If LO bit was the closest, we generate a LO bit and keep that voltage as the new Vmodel.

That's it!! Just a few additions/subtractions, and dividing by 8 which is all very quick. If the total operation is kept to under 64 PIC instructions we can record and encode sound as a bitstream at 78 kbit/sec on a 20MHz PIC. I believe it can be done within 32 PIC instructions, which would be 156 kbit/sec. Of course that is probably much faster than needed in most PIC applications, remember at 156 kbit/sec the PIC could happily record and encode CD quality sound in real time. I don't think anyone has achieved that before with a $2 PIC. :o)

Obviously the BTc4 and BTc16 algorithms will work using almost the same software, just changing to /4 or /16 etc. This gives the advantage where your device can record and encode in a number of formats to give the best result, just by changing the division code in your software.
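The three steps above can be sketched in Python like this. It is a rough model of the idea, not the code inside ENCODER.EXE: the 8-bit sample range, the name `encode_1bit`, the use of a right shift for the division, and the tie-break toward HI are all assumptions.

```python
import math

VMIN, VMAX = 0, 255  # assumed 8-bit sample range (stands in for 0v..5v)

def encode_1bit(samples, shift=3):
    """Encode samples (0..255) into bits. shift=3 is BTc8, 2 is BTc4, 4 is BTc16.
    Returns (bits, model_trace)."""
    model = (VMIN + VMAX + 1) // 2              # start the model at half voltage
    bits, trace = [], []
    for s in samples:
        hi = model + ((VMAX - model) >> shift)  # step 1: predict a HI bit
        lo = model - ((model - VMIN) >> shift)  # step 2: predict a LO bit
        if abs(hi - s) <= abs(lo - s):          # step 3: keep the closest one
            bits.append(1)
            model = hi
        else:
            bits.append(0)
            model = lo
        trace.append(model)
    return bits, trace

# Usage: one cycle of a sine wave, 64 samples long.
wave = [128 + round(100 * math.sin(2 * math.pi * i / 64)) for i in range(64)]
bits, trace = encode_1bit(wave)
print("".join(map(str, bits)))
```

One thing this sketch shows up: with only 8 bits in the model, the truncating shift makes the step shrink to zero when the model gets within 8 counts of either rail, so the model stalls there. Keeping a few extra fractional bits in the model would be one way around that.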

Decoding (playback) 1bit BTc

* standard PIC interrupt at 15625Hz (or whatever)
* output one bit to the PIC output pin.
* every 8 bits load a new data byte

That's it! The entire decode (playback) system is only a handful of instructions and an RC filter.

I'm quite proud of this system. The whole thing, designing the algorithm and writing the PIC Sound software utility, took me a weekend, and came out pretty good. It's a useful tool. I may consider updating the software to give some more features, one is to play the bitstream back on an RC filter connected to your PC parallel port. This would be handy to HEAR your encoded waveform as well as see it, before committing the sound to your PIC. Another feature I may add is the ability to convert one freq to another.
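The three playback steps listed above can be simulated on a PC with a short Python sketch. It is only a model of what the PIC interrupt and the RC filter do. The bit order (most significant bit first) is an assumption, as are the function names, and the filter is taken to be ideal and tuned to the chosen BTc.

```python
def playback_bits(data):
    """Yield one bit per 'interrupt', loading a new byte every 8 bits.
    MSB-first order is an assumption; the page does not specify it."""
    for byte in data:
        for i in range(7, -1, -1):
            yield (byte >> i) & 1

def rc_response(bits, n=8, v=0.5):
    """Ideal RC filter tuned for BTc-n: each bit moves the voltage 1/n of
    the way toward its rail (1.0 for a HI bit, 0.0 for a LO bit)."""
    out = []
    for b in bits:
        v += ((1.0 if b else 0.0) - v) / n
        out.append(v)
    return out

bits = list(playback_bits(bytes([0xF0])))
print(bits)  # [1, 1, 1, 1, 0, 0, 0, 0]
print([round(v, 3) for v in rc_response(bits)])
```

The printed voltages rise in shrinking steps for the four 1s and then fall in larger steps for the four 0s, which is the non-linearity the encoder has already allowed for.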

New BTc 1.5bit algorithm!

During writing the software and testing it with many sound files, especially speech, I found a nice way to improve and smooth the sound for almost zero cost. This new algorithm was added to my software, so you can now choose 1bit or 1.5bit BTc systems for your PIC sound device.

1.5bit ??

This still only requires the same 1bit bitstream, so recorded sounds only take the same amount of bits and sound files are the same size. But cleverly this system smoothes the bitstream at the playback end, for very little cost, it only takes one more resistor and one more digital output pin.

What we do is MIX every new bit with the LAST bit. So instead of playing back:

011110000111100001111

we play back:

011110000111100001111
 011110000111100001111

basically giving:

012221000122210001222

See, it is the same bitstream but delayed one bit. This is extremely easy to do on the playback PIC and has no cost in data storage as we use the same data size per second. Normally this delayed bit system would not be worth much, UNLESS of course you ENCODED the bitstream in a special way so that the reproduced sound would work with the delayed bit system and still produce the correct waveform. Which is exactly what my 1.5bit encoding algorithm does:
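The mixing of each bit with the last bit is easy to check in Python. This is just a sketch of the arithmetic, with a made-up function name, and it assumes the bit before the start of the stream was a 0.

```python
def mix_with_last(bits, first_last=0):
    """Add each bit to the previous one, giving 0, 1 or 2 (three levels).
    first_last is the assumed bit before the stream starts."""
    out, last = [], first_last
    for b in bits:
        out.append(b + last)
        last = b
    return out

stream = [int(c) for c in "011110000111100001111"]
print("".join(str(x) for x in mix_with_last(stream)))  # 012221000122210001222
```

The 1 level only ever shows up on the first bit after a change, which is where the smoothing comes from.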

Decoding 1.5bit BTc bitstream

* standard PIC interrupt at 15625Hz (or whatever)
* compare new bit to the LAST bit;
* if bits are the same, both PIC pins are set to output the bit
* if bits are different, only the NEW bit is output, the other pin is turned to "high impedance input" and has no effect. This gives only "half the effect" when the two bits are different.

Again playback is VERY easy, and the difficulty is in encoding, as the encoder needs to model the two pins and two halves of the RC filter. Lucky for us this is quite easy!
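The pin handling described above can be written out as a small Python sketch. It only models the decision made in each interrupt, not real port registers. The names, the "Z" marker for a high impedance input, and the assumed starting bit of 0 are all for illustration.

```python
def pin_states(bits, last=0):
    """For each bit return (pin_a, pin_b). pin_a always drives the new bit.
    pin_b drives the same level when the bit repeats, otherwise it is 'Z'
    (switched to a high impedance input). last is the assumed previous bit."""
    states = []
    for b in bits:
        states.append((b, b if b == last else "Z"))
        last = b
    return states

for state in pin_states([0, 1, 1, 0]):
    print(state)
```

On a real PIC the "Z" state would be made by setting that pin's direction bit to input, and the two pins would each feed the capacitor through their own resistor.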

Encoding 1.5bit BTc bitstream

The same encoding system with binary time constants is used, and we still only need to do the predictive model in two calculations, ie, predict HI bit and LO bit. So encoding is just as fast as the 1bit method, and still PIC-able. :o)

Encoding 1.5bit algorithm:

* sound data is in "samples", either in a file or coming directly from the ADC if we are recording in real time.
* we start the RC filter math model at 50%, or half voltage, on the PIC that would be 2.5v.
* for every new sound sample, do 4 steps:

1. check what the last bit was
2. predict a HI bit;
   * get the max rise voltage available, Vrise which is simply Vmax - Vmodel
   * if last bit was a HI, this HI is full Vrise, else if last bit was LO we divide Vrise by 2 to give only half the rise
   * divide Vrise by 8 (we are using BTc8)
   * add this amount to the existing model voltage Vmodel, to give the predicted voltage "if" we were to generate a HI bit.
   * we now have predicted the result of a HI bit.
3. predict a LO bit;
   * get the max drop voltage available, Vdrop which is simply Vmodel - Vmin
   * if last bit was a LO, this LO is full Vdrop, else if last bit was HI we divide Vdrop by 2 to give only half the drop
   * divide Vdrop by 8 (we are using BTc8)
   * subtract this amount from the existing model voltage Vmodel, to give the predicted voltage "if" we were to generate a LO bit.
   * we now have predicted the result of a LO bit.
4. Now test which of the two predictions is closest to our actual sound sample. This takes 2 or 3 simple subtractions.
   * If HI bit was the closest, we generate a HI bit and keep that voltage as the new Vmodel.
   * If LO bit was the closest, we generate a LO bit and keep that voltage as the new Vmodel.

As you can see this is only fractionally different to the 1bit encoding, and again the encoder device can be switchable between 1bit and 1.5bit formats with very little effort. Using the 1.5bit encoder gives MUCH better sound reproduction for the same size data file and the same bitrate.

Here is a picture from my encoder software, showing the difference between the 1bit system (shown in red) and the 1.5bit system (shown in blue). The 1.5bit system is very obviously superior, and provided you have 2 PIC pins available for sound playback it would definitely be the best choice.
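The four steps of the 1.5bit encoder can be sketched in Python as below. As with any sketch it rests on assumptions that are not on this page: an 8-bit sample range, a starting "last bit" of 0, right shifts for the divisions, ties going to HI, and the name `encode_15bit`.

```python
VMIN, VMAX = 0, 255  # assumed 8-bit sample range (stands in for 0v..5v)

def encode_15bit(samples, shift=3, last=0):
    """1.5bit BTc encoder sketch. shift=3 is BTc8. last is the assumed bit
    before the stream starts. Returns (bits, model_trace)."""
    model = (VMIN + VMAX + 1) // 2       # start the model at half voltage
    bits, trace = [], []
    for s in samples:
        rise = VMAX - model
        drop = model - VMIN
        if last:                         # step 1: last bit was HI,
            drop >>= 1                   # so a LO now has only half effect
        else:                            # last bit was LO,
            rise >>= 1                   # so a HI now has only half effect
        hi = model + (rise >> shift)     # step 2: predict a HI bit
        lo = model - (drop >> shift)     # step 3: predict a LO bit
        if abs(hi - s) <= abs(lo - s):   # step 4: keep the closest one
            last, model = 1, hi
        else:
            last, model = 0, lo
        bits.append(last)
        trace.append(model)
    return bits, trace

print(encode_15bit([255, 255]))  # ([1, 1], [135, 150])
```

The first HI bit only lifts the model from 128 to 135 because it follows a LO, while the second lifts it a full step to 150. The only change from the 1bit version is the one halving before the divide.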

Conclusion?

I think this software and algorithm may be very useful as it allows a convenient conversion from a sound file into bitstream data or PIC assembler code. I hope if you need to add sound to your project or product you will make use of this work and that it may save you some effort.

I know many PIC (and other embedded micro) developers make products for the handicapped and visually impaired. It should be very easy to add speech or sound to your products. I particularly like the idea of PIC based products that verbally confirm button pushes, saying "on" "off" "up" "down" etc. With an 8k PIC and 7k for sound samples, running at 15625Hz you can get 3.7 seconds of decent speech. About 6 or 8 short words, enough for 6 or 8 buttons.
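The playing time for a given amount of storage is simple arithmetic, shown here as a tiny Python sketch. It assumes that "7k" means 7 x 1024 table entries and that each entry holds 8 bits of the bitstream. The function name is made up.

```python
def seconds_of_sound(storage_bytes, bitrate_hz):
    """Playing time when every stored byte carries 8 bits of the bitstream."""
    return storage_bytes * 8 / bitrate_hz

print(round(seconds_of_sound(7 * 1024, 15625), 1))   # 3.7
print(round(seconds_of_sound(32 * 1024, 15625), 1))  # 16.8
```

The second line is the same sum for a 32k byte serial EEPROM at the same bitrate.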

Here it is!

Click here to download my v0.01 encoder software.

(picsound.zip file size is 655kb)

Click here to see a screenshot 1024x768 (32kb)

The file PICSOUND.ZIP contains ENCODER.EXE and a few wave files, or actually the same speech file recorded at different sample rates from 8kHz through to 44.1kHz.

NEW!! Christian Dorner has sent me code for a PIC chip and I2C serial eeprom. This allows playback of a LOT of sound, with a 24LC256 eeprom of 32k bytes you can get 17 seconds of sound at 15625Hz. He gave me 3 new files which I have added to the .ZIP file above. They are:

* PS_I2C.ASM PIC assembler code
* PS_I2C.INC (serial eeprom routines, needed by above code)
* PS_I2C.GIF .GIF of the circuit using PIC 16F84 and 24C65 eeprom

I have not had time yet to test his code, and any feedback is appreciated.

If you like my encoder software or have suggestions for changes to it please e-mail me here.

-Roman

PS. A fantastic tool is the sound wave editor shareware Acoustica made by http://www.aconas.com/ Their 30-day shareware allows re-sampling at any bitrate which is very handy when you have PIC interrupts running at speeds like 15625Hz and 19531Hz. It also does some very nice dynamic shaping which can help with speech files.