The ITU defines the acceptable delay for speech in G.114. The aim in any network is to minimize this delay, but that can be difficult because network designers have no idea who or where you will be calling. You may be calling from your IP phone in New York and talking to your buddy on his cell phone somewhere in Asia; who knows how many different networks your call is being routed over. One of the fixed delays that can be controlled is the delay through the vocoder (the voice compression algorithm).
As a rule, the higher the compression rate, the larger the block of data the compression algorithm needs. Take one of the popular algorithms, ITU-T G.728 LD-CELP (Low-Delay Code Excited Linear Prediction), which compresses a 64 kbit/s voice stream down to 16 kbit/s. Let’s assume it requires 160 bytes of voice samples before it can start compressing the data. At one 8-bit sample every 125 µs, it takes 20 ms just to store enough samples for the vocoder to start. It typically takes another 5 ms to process those 160 bytes and produce the compressed data to be sent. The 160 bytes have now been compressed to just 40 bytes. By the time we add framing information, Forward Error Correction (FEC) and 24 bytes for the IP packet header, we have to send on average about 80 bytes every 20 ms. That’s the same as 4,000 bytes a second, or 32 kbit/s.
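The arithmetic above can be checked with a short Python sketch. The 160-sample block, 5 ms processing time and 4:1 compression come from the text; the 40-byte total overhead is an assumption inferred from the "about 80 bytes" packet size (24 bytes of IP header plus roughly 16 bytes of framing and FEC), and the function names are purely illustrative.

```python
SAMPLE_PERIOD_US = 125    # one 8-bit sample every 125 us (8 kHz sampling)
SAMPLES_PER_BLOCK = 160   # block size assumed in the text
COMPRESSION_RATIO = 4     # 64 kbit/s compressed to 16 kbit/s
PROCESSING_MS = 5         # typical vocoder processing time from the text
OVERHEAD_BYTES = 40       # assumption: IP header + framing + FEC, giving ~80-byte packets


def buffering_delay_ms(samples, sample_period_us=SAMPLE_PERIOD_US):
    """Time spent collecting enough samples before compression can start."""
    return samples * sample_period_us / 1000


def packet_bytes(samples, ratio=COMPRESSION_RATIO, overhead=OVERHEAD_BYTES):
    """Compressed payload plus per-packet overhead (one byte per sample in)."""
    return samples // ratio + overhead


def bandwidth_kbps(bytes_per_packet, interval_ms):
    """Bits per millisecond is numerically the same as kbit/s."""
    return bytes_per_packet * 8 / interval_ms


delay = buffering_delay_ms(SAMPLES_PER_BLOCK)    # 20.0 ms of buffering
size = packet_bytes(SAMPLES_PER_BLOCK)           # 80 bytes per packet
print(f"vocoder delay: {delay + PROCESSING_MS} ms")
print(f"bandwidth: {bandwidth_kbps(size, delay)} kbit/s")
```

Note that the vocoder alone contributes about 25 ms (20 ms of buffering plus 5 ms of processing) before the packet even reaches the network.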
If we increased the time between packets from 20 ms to 40 ms, the bandwidth required would drop to about 24 kbit/s (120 bytes every 40 ms), because the same per-packet overhead is now spread over twice as much voice data. However, the delay, or latency, has increased by 20 ms.
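To see how this trade-off behaves at other packet intervals, here is a rough sketch. It assumes a 16 kbit/s codec and a fixed 40 bytes of overhead per packet, which is the figure implied by 80 bytes at 20 ms and 120 bytes at 40 ms; `per_call_kbps` is an illustrative name, not part of any library.

```python
CODEC_KBPS = 16       # compressed voice rate from the text
OVERHEAD_BYTES = 40   # assumption: fixed per-packet overhead implied by the text's figures
BASE_INTERVAL_MS = 20


def per_call_kbps(interval_ms):
    """Bandwidth of one call when a packet is sent every interval_ms."""
    payload_bytes = CODEC_KBPS * interval_ms / 8   # kbit/s * ms = bits, / 8 = bytes
    return (payload_bytes + OVERHEAD_BYTES) * 8 / interval_ms


for interval_ms in (20, 40, 60, 80):
    extra_delay = interval_ms - BASE_INTERVAL_MS
    print(f"{interval_ms} ms packets: {per_call_kbps(interval_ms):.1f} kbit/s, "
          f"+{extra_delay} ms delay")
```

Under these assumptions each further step saves less bandwidth than the one before, since the rate can never fall below the codec's own 16 kbit/s, while every step adds another 20 ms of delay.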
For most residential users, whether a call uses 24 kbit/s or 32 kbit/s is not a big deal; it won’t affect surfing the web or downloading videos, songs, etc. For businesses, however, this 25% reduction in bandwidth per call means that roughly a third more simultaneous phone calls fit on each T1, and hence fewer T1s are required. This could be a substantial saving.
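As a rough illustration of the saving, the sketch below counts calls per T1 at each rate. It assumes the whole T1 payload of 1536 kbit/s (24 channels of 64 kbit/s) is available for voice and ignores signalling and other traffic; the 200-call office is a hypothetical example.

```python
T1_PAYLOAD_KBPS = 24 * 64   # 24 channels of 64 kbit/s = 1536 kbit/s


def calls_per_t1(per_call_kbps, link_kbps=T1_PAYLOAD_KBPS):
    """Whole calls that fit on one T1, assuming it carries nothing else."""
    return link_kbps // per_call_kbps


def t1s_needed(calls, per_call_kbps):
    """Number of T1s required to carry the given number of simultaneous calls."""
    per_t1 = calls_per_t1(per_call_kbps)
    return -(-calls // per_t1)   # ceiling division


for rate in (32, 24):
    print(f"{rate} kbit/s per call: {calls_per_t1(rate)} calls per T1, "
          f"{t1s_needed(200, rate)} T1s for 200 calls")
```

With these assumptions a T1 carries 48 calls at 32 kbit/s and 64 calls at 24 kbit/s, so the hypothetical 200-call site drops from five T1s to four.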
Various new voice compression algorithms are being developed. One of the most common now in use compresses voice to 8 kbit/s, giving even more calls per T1 installed. That is the real goal.