Introduction to Digital Encoding

 

This page will explain how analog signals are translated into digital code.
It will demonstrate some commonly used digital codes.

 

This project was done as part of the "Protocols and Computer Networks" course given by Dr. Debby Koren at Tel Aviv University.


The Site was created by:

Ofer Yarom, Eyal Grubner, Igal Perelman and Amos Barak.


 

 

 

 

 

Introduction

The main goal of this web page is to describe, and help the reader understand, the complete process of signal encoding.

We will answer questions such as: why use Digital Encoding at all, how a signal is transmitted and received, how analog signals are converted to digital signals, etc.
You will also learn about some common Digital Encoding methods.

If the reader is interested (or just bored), s/he is invited to test her/himself in the tests chapter and try to decode some digital encodings.

 

 

 

Basic definitions and abbreviations

Analog signal – A function of time that has an infinite set of values.

Digital signal – a function of time that has a finite set of values.

PCM – Pulse Code Modulation, is the most common technique to convert an analog signal to a digital signal.

PAM – Pulse Amplitude Modulation, which may be used individually as analog modulation or as the first stage of PCM.

Quantization – a method to confine an infinite set of values to a finite set of M values.

LPF – Low Pass Filter, a filter that eliminates the high frequencies of the input signal (i.e. the output signal contains only the low frequencies of the input signal).

Channel – any medium between the transmitter and the receiver, such as a radio channel or a fiber optic channel.

Tx – Transmitting. 

Rx – Receiving. 

Baseband – a signal with frequencies close to zero. 

RF – Radio Frequency. This is the transmission frequency, which is relatively high (its exact value depends on the transmission method).

 

 

Overview

 

As can be understood from the above figure, converting an analog signal to digital words (usually in binary representation) is composed of 4 main stages:

  1. The analog signal is filtered by an LPF and then sampled every Ts seconds.
  2. The samples (PAM signal), which are distributed over an infinite set of values, are converted to a finite set of M values. This process is called Quantization.
  3. Each of the M quantized values is converted to a digital word, usually in binary representation.
  4. The last, but not least, stage is encoding the digital words, thus preparing them for transmission.

      The PCM method is composed of the first 3 stages, and is covered in the chapter dealing with converting analog signals to digital signals.

      The last stage is covered in the chapter on Digital Encoding methods.

 

 Why should we use these methods? 

The answer is very simple: communication systems that use these methods gain many advantages. Some of these advantages also exist in analog systems, but there the cost is much higher and the performance is usually worse. The advantages are described below:

PCM   

Error correction – It is very easy to detect errors in received digital data. We can then either request retransmission of the damaged data (as in TCP) or try to correct the data ourselves. Of course, correcting the data ourselves requires an appropriate coding scheme at both the transmitter and the receiver.

Encryption – Digital data can be encrypted very easily. This is a very important advantage, especially for business / military purposes. 

Multiplexing – Many digital data sources can be interleaved into one digital data stream, then transmitted together through a channel (radio, fiber optic etc.) and finally separated to the original sources (this process is called de-multiplexing). 

Compression – Digital data can be compressed, thus demanding less memory space for storage. 

Storage – Digital data can be stored and retrieved more flexibly than analog data, using cheaper peripheral equipment.

Transmission – In long-distance transmission (i.e. when the transmitter and the receiver are relatively far apart) the signal picks up additive noise. To overcome this problem, repeaters are used; they regenerate the digital signal without the additive noise.

 

Line Encoding  

The PCM signal is indeed digital, but it isn’t ready to be transmitted along the channel.

Line coding overcomes some typical problems, such as:

  1. The frequency range of a Baseband signal is very low (close to zero, including DC). Such a frequency range isn’t suitable for transmission.

  2. Many applications require synchronization, therefore the signal should indicate when a bit (or a block of bits) starts and ends.

  3. We always desire to achieve the narrowest bandwidth possible (due to the cost of filters and signal processing).

 




 

 

Analog  => Digital

As mentioned in the introduction, the most common technique to convert an analog signal to a digital signal is PCM. This conversion is composed of 3 main stages:

 

  1. Passing the analog signal through an LPF and sampling it.
  2. Passing the sampled signal through a quantizer.
  3. Converting the quantized value to a binary representation.

 

The PCM technique is described in the following figure. Then, each of the 3 PCM stages is described in detail.

 

 

LPF and Sampling 

From the Nyquist theorem, we know that an analog signal can be reconstructed from a sequence of samples if the sampling rate is at least twice the highest frequency of the signal. Therefore, when this condition holds, we don’t lose any part of the original data during the sampling or the reconstruction.

The LPF must come before the sampling. Its task is to filter out frequencies that are higher than half of the sampling rate, thus eliminating a phenomenon called Aliasing (which may appear at the reconstruction of the signal, when part of the signal frequencies overlap other frequencies). In effect, the LPF ensures that the prerequisite of the Nyquist theorem is satisfied.

From the sampling rate fs we can calculate the sampling period Ts = 1/fs, which means that the analog signal is sampled every Ts seconds. These samples are distributed over an infinite set of values, which we obviously can’t transmit. For this reason we need to use Quantization.
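The role of the LPF can be illustrated with a small sketch. It is only an illustration under assumed numbers: the sampling rate of 10 Hz, the two test tones and the helper names `sample` and `tone` are hypothetical and do not come from the course material.

```python
import math

def sample(signal, fs, num_samples):
    """Return num_samples samples of signal(t), taken every Ts = 1/fs seconds."""
    ts = 1.0 / fs  # sampling period Ts
    return [signal(k * ts) for k in range(num_samples)]

def tone(freq_hz):
    """A cosine of the given frequency, as a function of time in seconds."""
    return lambda t: math.cos(2 * math.pi * freq_hz * t)

fs = 10.0  # assumed sampling rate in Hz, so half the sampling rate is 5 Hz

ok = sample(tone(3.0), fs, 20)       # 3 Hz is below fs/2
aliased = sample(tone(7.0), fs, 20)  # 7 Hz is above fs/2

# The two sample sequences agree up to rounding error, so after sampling
# the 7 Hz tone can no longer be told apart from a 3 Hz tone.
max_diff = max(abs(a - b) for a, b in zip(ok, aliased))
print(max_diff < 1e-9)
```

If the 7 Hz component is removed by the LPF before sampling, this ambiguity cannot occur.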

 
Quantization 

Quantization is used to confine the infinite set of sampled values to a finite set of values that can be transmitted later. The letter M is used to designate the number of values in that finite set. Usually M is chosen as a power of 2, i.e. M = 2^n (this will be used later, for the binary representation of the quantized values).

The quantization is implemented simply by rounding. Each sampled value is rounded to the closest legal value of the quantizer.

A simple quantizer is described in the following table:

X, the input voltage [Volt]     Output voltage [Volt]

X >= 6                           7
6 > X >= 4                       5
4 > X >= 2                       3
2 > X >= 0                       1
0 > X >= -2                     -1
-2 > X >= -4                    -3
-4 > X >= -6                    -5
-6 > X                          -7

 

It can be easily derived from the above table that this quantizer has 8 levels (M=8). Thus we confined the infinite values of the sampled signal to a finite set of 8 possible values.  This is the same quantizer used in the graph appearing at the beginning of this chapter; therefore the sampled value of 4.29 Volt is rounded to 5 Volt. 
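The quantizer of the table above can be sketched in a few lines. This is a minimal sketch rather than a reference implementation; the function name `quantize` and the constants `STEP` and `LEVELS` are our own choices for the illustration.

```python
import math

STEP = 2.0    # interval between neighboring output values [Volt]
LEVELS = 8    # M, the number of quantization levels

def quantize(x, step=STEP, levels=LEVELS):
    """Map an input voltage to one of the 8 output values of the table above."""
    half = levels // 2
    index = math.floor(x / step)              # which 2-Volt interval x falls in
    index = max(-half, min(half - 1, index))  # clip inputs beyond the outer levels
    return index * step + step / 2            # midpoint of that interval

print(quantize(4.29))   # 5.0
print(quantize(-6.5))   # -7.0
```

The clipping step reproduces the open-ended first and last rows of the table (X >= 6 and -6 > X).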

The quantizer used in this example is a linear quantizer, i.e. the interval between every two neighboring output values is the same (2 Volt). We used such a quantizer only for simplicity of explanation. In reality, we use quantizers that are matched to the distribution of their input signal. Most analog signals, especially those carrying speech, spend much more time at low amplitudes than at high amplitudes, which means that the extreme quantization levels (for example, in our quantizer, X >= 4 and X < -4) are rarely used. For this reason, most quantizers have more quantization levels in the range of low amplitudes and fewer quantization levels in the range of high amplitudes (the density of the levels follows the distribution of the input signal).

 

 

Binary Representation 

The last stage of PCM is converting the decimal value of the quantization to a binary representation. This is the reason that M was chosen as a power of 2. In our quantizer example we used M=8 => the number of bits needed for binary representation is n=3.

In principle we could use any other representation, such as octal or hexadecimal. Such representations reduce the required bandwidth; nevertheless, the low cost and simplicity of the modules that support binary representation are much more significant here.

 

The binary word assigned to each quantization level should also be chosen with care. Gray code can be very useful here. In Gray code, every two neighboring words differ in only one bit. Thus a possible error (caused by additive noise or some other problem) will often cause only a minor shift to a neighboring quantization level in the reconstruction of the analog signal. This way the impact of such an error is decreased significantly. Using Gray code on our quantizer gives the following values:

  

Quantizer output voltage [Volt]     PCM output [binary representation]

 7                                  110
 5                                  111
 3                                  101
 1                                  100
-1                                  000
-3                                  001
-5                                  011
-7                                  010

 

This is the reason that the quantized value of 5 Volt (appearing in the graph at the beginning of this chapter) is converted to 111. Finally, after all this tiring work we achieve a PCM signal.
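The Gray property of the table above can be checked with a short sketch. The dictionary is copied from the table; the helper names `hamming_distance` and `pcm_encode` are hypothetical and serve only this illustration.

```python
# Quantizer output [Volt] -> PCM word, copied from the table above.
GRAY_WORDS = {
    7: "110", 5: "111", 3: "101", 1: "100",
    -1: "000", -3: "001", -5: "011", -7: "010",
}

def hamming_distance(a, b):
    """Number of bit positions in which two equal-length words differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(a, b))

def pcm_encode(quantized_samples):
    """Concatenate the 3-bit words of a sequence of quantized samples."""
    return "".join(GRAY_WORDS[int(v)] for v in quantized_samples)

levels = sorted(GRAY_WORDS)  # -7, -5, ..., 7
neighbor_distances = [
    hamming_distance(GRAY_WORDS[low], GRAY_WORDS[high])
    for low, high in zip(levels, levels[1:])
]
print(neighbor_distances)       # [1, 1, 1, 1, 1, 1, 1]
print(pcm_encode([5, 1, -3]))   # 111100001
```

Every pair of neighboring levels differs in exactly one bit, which is what makes the assignment a Gray code.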

 

Typical problems 

We hope that the reader won’t be too disappointed, but we don’t want to deceive anyone. We do gain many advantages from our analog to digital conversion, but we also create some new problems. Nevertheless, it is known how to decrease the impact of these problems, so on the whole the PCM method still has many more advantages than disadvantages. Some of the typical problems are:

Quantization noise – The difference between the original samples and their quantized values is called Quantization noise. This noise will appear at the output, after the reconstruction of the analog signal.

Bandwidth – Each sample is represented by n bits, therefore the required bandwidth is multiplied by a factor of at least n.

ISI – Intersymbol Interference. Each binary representation of the samples will eventually be transformed into some shape, usually a pulse, called a symbol. It is very likely that neighboring symbols will interfere with each other, thus making the reconstruction of the analog signal more difficult. This phenomenon is called ISI.
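The first two problems can be put in numbers with a small sketch. It reuses the 8-level, 2 Volt quantizer of the previous chapter; the sample values and the sampling rate of 8000 samples per second are assumptions chosen only for the illustration.

```python
import math

def quantize(x, step=2.0, levels=8):
    """Uniform quantizer: 8 levels, 2 Volt apart, as in the example table."""
    half = levels // 2
    index = max(-half, min(half - 1, math.floor(x / step)))
    return index * step + step / 2

def bit_rate(fs, levels):
    """Bits per second produced by PCM: n bits per sample, fs samples per second."""
    n = math.ceil(math.log2(levels))
    return n * fs

samples = [-7.9, -3.2, -0.4, 0.0, 1.7, 4.29, 7.5]   # hypothetical sample values
noise = [x - quantize(x) for x in samples]           # quantization noise per sample

# Inside the quantizer's range the error never exceeds half a step (1 Volt).
print(max(abs(e) for e in noise) <= 1.0)   # True
print(bit_rate(fs=8000, levels=8))         # 24000
```

With M = 8 each sample costs n = 3 bits, so the bit stream is three times denser than the sample stream.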

 

 

 




 

 

Common Codes

In this section we will describe some commonly used line codes. The role of line codes in a communication system is described in the introduction section.

We will go from the easiest and most trivial code towards some more complex codes.

Enjoy.

 

 

NRZ-L: Non Return to Zero Level

Description:

Zero is represented as no voltage, and one by high voltage level.

Notes:

This is the simplest representation of a digital signal.

It has two major shortcomings. First, it has a DC component, meaning that its average voltage is not 0 but some positive constant. Some electrical components (e.g. a capacitor) need a constantly changing voltage, and with a long sequence of ones that is not the case. Second, it cannot carry synchronization information. Again, if we receive a long series of ones, we won’t be able to know how many we got.

Example:

 

Polar NRZ-L: Polar Non Return to Zero Level

Description:

Zero is represented as negative voltage level, and one by positive voltage level.

Notes:

This code is similar to the previous one. It handles the DC component issue, meaning the average voltage level is 0. It still has the synchronization problem. 
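The two NRZ-L variants described so far can be compared with a small sketch. The function names and the bit string are our own choices for the illustration; note that the polar average comes out as exactly 0 only because the assumed data contains as many ones as zeroes.

```python
def nrz_l(bits, high=1.0):
    """Unipolar NRZ-L: 0 -> no voltage, 1 -> high level, held for the whole bit."""
    return [high if b == "1" else 0.0 for b in bits]

def polar_nrz_l(bits, level=1.0):
    """Polar NRZ-L: 0 -> negative level, 1 -> positive level."""
    return [level if b == "1" else -level for b in bits]

def average(levels):
    """Average voltage of a sequence of per-bit levels (the DC component)."""
    return sum(levels) / len(levels)

bits = "10110100"   # hypothetical data with as many ones as zeroes
print(nrz_l(bits))                  # [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0]
print(average(nrz_l(bits)))         # 0.5
print(average(polar_nrz_l(bits)))   # 0.0
```

Both encoders produce a constant level for a run of identical bits, which is the synchronization problem mentioned in the notes.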

Example:

 

 

NRZ-I: Non Return to Zero Inverted

Description:

A transition occurs on one only: the voltage level changes at the start of every one and stays unchanged on every zero.

Notes:

It’s easier for electrical components to detect a change in voltage than an absolute voltage level.

This line code has the same two shortcomings as the previous code: no change in voltage during a sequence of zeroes, and no synchronization information.

This code doesn’t handle the DC component (the average is not 0).
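A minimal sketch of the rule follows. The starting level of the line (`start_level`, low) is an assumption, as are the function names.

```python
def nrz_i(bits, start_level=0, high=1):
    """NRZ-I: toggle the line level at the start of every 1, hold it on every 0."""
    level = start_level
    out = []
    for b in bits:
        if b == "1":
            level = high - level   # transition on one only
        out.append(level)
    return out

def nrz_i_decode(levels, start_level=0):
    """Recover the bits: a change of level means 1, no change means 0."""
    previous = start_level
    bits = []
    for level in levels:
        bits.append("1" if level != previous else "0")
        previous = level
    return "".join(bits)

print(nrz_i("1011000"))                 # [1, 1, 0, 1, 1, 1, 1]
print(nrz_i_decode(nrz_i("1011000")))   # 1011000
```

The three trailing zeroes produce a flat line, which shows why a long sequence of zeroes carries no timing information.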

Example:

Bipolar - AMI

Description:

No voltage on zero, the first one is a positive voltage, the second one is a negative voltage, and the voltage values of subsequent ones alternate.

Notes:

Here the problem of the DC component (average not 0) is solved by introducing a negative voltage level. The code is not sensitive to polarity, but we can lose synchronization on a long sequence of zeroes.
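The alternation rule can be sketched as follows. The function names are hypothetical, and levels are written as +1, 0 and -1 for convenience.

```python
def ami(bits, first_pulse=1):
    """Bipolar AMI: 0 -> no voltage, 1 -> pulses of alternating polarity."""
    polarity = first_pulse   # the first one is sent as a positive pulse
    out = []
    for b in bits:
        if b == "1":
            out.append(polarity)
            polarity = -polarity
        else:
            out.append(0)
    return out

def ami_decode(levels):
    """Any non-zero level is a one, so swapping the two wires changes nothing."""
    return "".join("1" if v != 0 else "0" for v in levels)

signal = ami("1011001")
print(signal)                             # [1, 0, -1, 1, 0, 0, -1]
print(ami_decode([-v for v in signal]))   # 1011001
```

Decoding the inverted signal gives the same bits, which is what "not sensitive to polarity" means here.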

Example:

 

Manchester

Description:

Zero is represented as a transition from high to low voltage level in the middle of the bit, while one is represented by the transition from low to high.

Notes:

Good for timing, as we have a transition in every bit; the code is fully self-synchronizing.

No DC component, the average voltage is 0.

The drawback of this line code is the fact that we move to half bit transitions, causing our bandwidth to double (the faster we change our levels, the bigger the bandwidth we use).
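A minimal sketch of this code, using the convention of this page (zero is high to low, one is low to high), is shown below. Each bit is written as two half-bit levels, +1 and -1; the function names are our own.

```python
def manchester(bits):
    """Manchester, using the convention of this page:
    0 -> high then low, 1 -> low then high (two half-bit levels per bit)."""
    halves = {"0": (1, -1), "1": (-1, 1)}
    out = []
    for b in bits:
        out.extend(halves[b])
    return out

def manchester_decode(levels):
    """Read each pair of half-bit levels back into a bit."""
    pairs = zip(levels[0::2], levels[1::2])
    return "".join("1" if first < second else "0" for first, second in pairs)

signal = manchester("1100")
print(signal)                      # [-1, 1, -1, 1, 1, -1, 1, -1]
print(sum(signal))                 # 0
print(manchester_decode(signal))   # 1100
```

The output has twice as many levels as there are input bits, which is the doubled bandwidth mentioned in the notes.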

Example:

 

Differential Manchester

Description:

There is always a transition in the middle of a bit; there is a transition at the beginning of a bit only for zero.

Notes:

As in the regular Manchester code, fully self synchronizing, no DC component, the average voltage is 0.

Another advantage here – polarity is not significant.

The drawback of this line code is the same as for the previous one – double bandwidth.
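The two transition rules can be sketched as follows. The level of the line just before the first bit (`start_level`) is an assumption of this sketch, as are the function names.

```python
def differential_manchester(bits, start_level=1):
    """start_level is the assumed line level just before the first bit."""
    level = start_level
    out = []
    for b in bits:
        if b == "0":
            level = -level   # extra transition at the beginning of a zero
        out.append(level)    # first half of the bit
        level = -level       # mandatory transition in the middle of the bit
        out.append(level)    # second half of the bit
    return out

def differential_manchester_decode(levels, start_level=1):
    """A level change at the bit boundary means 0, no change means 1."""
    previous = start_level
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        bits.append("0" if first != previous else "1")
        previous = second
    return "".join(bits)

signal = differential_manchester("0110")
print(signal)   # [-1, 1, 1, -1, -1, 1, -1, 1]

inverted = [-v for v in signal]
print(differential_manchester_decode(inverted, start_level=-1))   # 0110
```

Because only the presence or absence of a transition matters, the inverted line decodes to the same bits.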

Example:

 

B8ZS: Bipolar with 8 Zero Substitution

Description:

Acts like Bipolar – AMI except on 8 consecutive zeroes. In that case it inserts ones in place of the fourth, fifth, seventh and eighth zeroes, with the polarity of the first and third inserted ones deliberately violating the alternation rule.

Notes:

This line code is an improvement to the Bipolar – AMI code. It handles the drawback of losing the synchronization on a sequence of zeroes. The decoder recognizes the intentional violations and concludes that there is in fact a sequence of 8 zeroes. This way we ensure that the transmitted signal never contains more than 7 consecutive zeroes.
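The substitution can be sketched as follows, writing the inserted pattern as 000VB0VB, where V is a violating pulse and B a pulse that obeys the alternation rule. This is a minimal sketch: the polarity assumed before the first bit (`last_pulse`) and the function name are our own choices.

```python
def b8zs(bits, last_pulse=-1):
    """B8ZS: bipolar AMI, except that every run of 8 zeroes is replaced by
    000VB0VB. last_pulse is the assumed polarity of the pulse sent before
    the first bit; -1 makes the first one a positive pulse."""
    out = []
    i = 0
    while i < len(bits):
        if bits[i:i + 8] == "00000000":
            p = last_pulse
            # V repeats the previous polarity (violation), B alternates normally.
            out.extend([0, 0, 0, p, -p, 0, -p, p])
            i += 8   # the last pulse sent is p again, so last_pulse is unchanged
        elif bits[i] == "1":
            last_pulse = -last_pulse
            out.append(last_pulse)
            i += 1
        else:
            out.append(0)
            i += 1
    return out

print(b8zs("1000000001"))   # [1, 0, 0, 0, 1, -1, 0, -1, 1, -1]
```

The four inserted pulses cancel each other out, so the substitution does not add a DC component.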

Example:

 

HDB3: High Density Bipolar order 3

Description:

Acts like Bipolar – AMI except on 4 consecutive zeroes. It places a one in place of the fourth zero, but without alternating the level (the violation). Each run of 4 zeroes is therefore encoded as “000V” or as “B00V”, where V is the violation and B is a balancing pulse. B is inserted, with polarity + or –, whenever it is needed to make successive Vs alternate in polarity.

Notes:

This line code is an improvement to the Bipolar – AMI code. It handles the drawback of losing the synchronization on a sequence of zeroes: the transmitted signal never contains more than 3 consecutive zeroes. The B pulse handles the DC component that many consecutive zeroes would otherwise create (without it, the violations would always keep the same polarity).
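The choice between 000V and B00V can be sketched as follows. The sketch uses the usual rule that B is needed when an even number of pulses has been sent since the last violation. The state assumed before the first bit (`last_pulse`, and a pulse count starting at zero) is our own assumption, and other starting conventions are possible.

```python
def hdb3(bits, last_pulse=-1):
    """HDB3: bipolar AMI, except that every run of 4 zeroes becomes 000V or B00V.
    last_pulse is the assumed polarity of the pulse sent before the first bit."""
    out = []
    pulses_since_violation = 0   # assumption: counted as even at the start
    i = 0
    while i < len(bits):
        if bits[i:i + 4] == "0000":
            if pulses_since_violation % 2 == 0:
                # B00V: B obeys the alternation rule, V repeats B's polarity.
                last_pulse = -last_pulse
                out.extend([last_pulse, 0, 0, last_pulse])
            else:
                # 000V: V repeats the polarity of the previous pulse.
                out.extend([0, 0, 0, last_pulse])
            pulses_since_violation = 0
            i += 4
        elif bits[i] == "1":
            last_pulse = -last_pulse
            out.append(last_pulse)
            pulses_since_violation += 1
            i += 1
        else:
            out.append(0)
            i += 1
    return out

print(hdb3("1000000001"))   # [1, 0, 0, 0, 1, -1, 0, 0, -1, 1]
```

In the printed example the first run of zeroes becomes 000V with a positive V and the second becomes B00V with a negative V, so the two violations have opposite polarity.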

Example:

 

Is it all?

Fortunately, our work here is over. However, a complete communication system that is capable of transmitting and receiving data is composed of many additional modules.

We will describe, briefly, a basic block diagram of a complete communication system for analog signals. If the reader is interested, she (or he) is invited to find more details in the Bibliography section.

 

 

Modulation – taking the input bits (called Baseband) and “loading” them onto the transmission carrier (RF carrier).

Detection – mainly, receiving only a predefined frequency range.

Matched filter – a filter that is matched to the transmitted signal, thus enabling the best possible reception.

Decision – for every digital value received, we must decide what the originally transmitted value was.

D/A – Digital to Analog signal converter.

 

Bibliography: 

[1] Communication Systems, Prof. Nadav Levanon. Tel Aviv University 2002, Edition 2A.

[2] Digital and Analog Communication Systems, L. W. Couch. Prentice Hall 2001, 6th Edition.