Sample rate and bit rate
A digital audio segment's sample rate specifies the number of samples taken from each second of the source material; a higher sample rate improves the ability of digital audio to faithfully represent high frequencies. The highest frequency that can be accurately represented is half the sampling rate (the Nyquist frequency). Since the human voice typically spans the 40 Hz to 4 kHz range, the audio in a typical phone call is sampled at 8 kHz, or 8000 samples per second. This is a preferred sampling rate that will result in good transcription.
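The half-the-sampling-rate rule can be checked with a little arithmetic. The following is a minimal Python sketch, not part of any particular transcription API; the function names nyquist_frequency and apparent_frequency are hypothetical and chosen only for illustration. It also shows what happens to a tone above the limit: it folds back (aliases) to a lower frequency.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency that a given sample rate can represent accurately."""
    return sample_rate_hz / 2


def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency a pure tone appears to have after sampling.

    Tones at or below the Nyquist frequency are unchanged; tones above it
    fold back into the representable range (aliasing).
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    rate = 8000  # typical phone-call sample rate, in Hz
    print(nyquist_frequency(rate))          # 4000.0
    print(apparent_frequency(3000, rate))   # 3000 (below the limit, preserved)
    print(apparent_frequency(5000, rate))   # 3000 (above the limit, aliased)
```

With an 8 kHz sample rate the limit is 4 kHz, which matches the upper end of the voice range quoted above.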
Bit depth affects the dynamic range of a given audio sample. A higher bit depth allows you to represent amplitudes more precisely. If you have a mix of loud and soft sounds within the same audio sample, you will need more bit depth to represent those sounds correctly. A typical phone call uses 8-bit depth. For comparison, audio CDs use 16-bit depth, whereas DVD/HD audio uses 24-bit depth.
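To make the effect of bit depth concrete, here is a rough Python sketch that counts the amplitude levels available at each depth and estimates the resulting dynamic range. It assumes uniform (linear PCM) quantization, which is an assumption of this example rather than something stated above; companded telephone encodings behave differently. The function names are hypothetical.

```python
import math


def quantization_levels(bit_depth):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Approximate dynamic range in dB, assuming linear quantization.

    Each extra bit doubles the number of levels, adding about 6 dB.
    """
    return 20 * math.log10(quantization_levels(bit_depth))


if __name__ == "__main__":
    for label, bits in [("phone call", 8), ("audio CD", 16), ("DVD/HD audio", 24)]:
        levels = quantization_levels(bits)
        print(f"{label}: {bits}-bit, {levels} levels, "
              f"about {dynamic_range_db(bits):.1f} dB")
```

Under that assumption, 8-bit audio has 256 levels (about 48.2 dB), 16-bit has 65536 levels (about 96.3 dB), and 24-bit has 16777216 levels (about 144.5 dB).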
Most digital audio processing uses these two parameters, sampling rate and bit depth, which together determine the bit rate of a single audio channel (sampling rate × bit depth). So a typical phone conversation has a bit rate of 8 kHz sample rate × 8 bits of depth = 64 kbit/s, which is acceptable to produce a good transcription. The optimal bit rate is 8 kHz × 16 bits of depth = 128 kbit/s.
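The bit rate arithmetic can be reproduced with a short Python sketch. The function name bit_rate_bps and its channels parameter are hypothetical additions for illustration; the figures above correspond to a single channel of uncompressed audio.

```python
def bit_rate_bps(sample_rate_hz, bit_depth, channels=1):
    """Uncompressed bit rate in bits per second.

    channels defaults to 1 (mono), matching the phone-call examples.
    """
    return sample_rate_hz * bit_depth * channels


if __name__ == "__main__":
    phone = bit_rate_bps(8000, 8)      # typical phone call
    optimal = bit_rate_bps(8000, 16)   # 16-bit depth at the same sample rate
    print(phone // 1000, "kbit/s")     # 64 kbit/s
    print(optimal // 1000, "kbit/s")   # 128 kbit/s
```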