Lecture

Huffman code.

Consider the Huffman code. It is built by an algorithm that constructs a code tree. The construction of the graph begins with the pendant (leaf) vertices; their number corresponds to the number of messages, and the weight of each vertex is the probability of the corresponding message appearing. Before the work starts, the pendant vertices are ordered by increasing weight. Step 1: determine the number of subtrees in the graph. If it is less than 2, the tree is complete; otherwise go to step 2. Step 2: select the 2 vertices with minimum weights and merge them. This produces a new vertex and 2 edges; the weight of the new vertex equals the sum of the weights of the original vertices. The left edge is labelled 1, the right edge 0. Return to step 1. When the tree is complete, the weight of the root always equals 1. To obtain the code combination of the i-th message, descend along the edges from the root to the i-th pendant vertex, writing out the labels of the edges on this path.
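This procedure can be sketched in code. The following is a minimal Python sketch; the name huffman_codes and the heap-based bookkeeping are assumptions of this sketch, not part of the lecture:

```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build a Huffman code by repeatedly merging the two subtrees of
    smallest weight until a single tree (root weight 1) remains."""
    # Heap entries are (weight, tiebreaker, tree); a tree is either a
    # message index (pendant vertex) or a (left, right) pair.
    tie = count()
    heap = [(p, next(tie), i) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    while len(heap) > 1:                 # step 1: more than one subtree?
        w1, _, t1 = heapq.heappop(heap)  # step 2: take the two vertices
        w2, _, t2 = heapq.heappop(heap)  #         of minimum weight
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, int):        # pendant vertex: record the path
            codes[tree] = prefix or "0"
        else:
            left, right = tree
            walk(left, prefix + "1")     # left edge labelled 1
            walk(right, prefix + "0")    # right edge labelled 0
    walk(heap[0][2], "")
    return codes
```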

EXAMPLE: 8 messages, X = {x1, ..., x8}, code alphabet B = {0, 1}, P1 = 0.19, P2 = 0.16, P3 = 0.16, P4 = 0.15, P5 = 0.12, P6 = 0.11, P7 = 0.09, P8 = 0.02.

Running ahead (the more frequent the message, the shorter its code combination): x1: 01, x2: 111, x3: 110, x4: 101, x5: 100, x6: 001, x7: 0001, x8: 0000. At each step the two vertices of smallest weight are merged.
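Running the sketch above on this example reproduces the code lengths; the concrete 0/1 labels depend on how ties between equal weights (such as the two probabilities 0.16) are broken:

```python
probs = [0.19, 0.16, 0.16, 0.15, 0.12, 0.11, 0.09, 0.02]
codes = huffman_codes(probs)
for i, p in enumerate(probs):
    print(f"x{i + 1}: p = {p}, code {codes[i]} (length {len(codes[i])})")
# The lengths come out as 2, 3, 3, 3, 3, 3, 4, 4.
```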

Let us evaluate the efficiency of the resulting code. For this, we build the following table (**):

x_i    code combination    n_i    n_i * P_i    -P_i * log2(P_i)
x1     11                  2      0.38         -0.19 * log2(0.19)
x2     011                 3      0.48         -0.16 * log2(0.16)
x3     010                 3      0.48         -0.16 * log2(0.16)
x4     001                 3      0.45         -0.15 * log2(0.15)
x5     000                 3      0.36         -0.12 * log2(0.12)
x6     101                 3      0.33         -0.11 * log2(0.11)
x7     1001                4      0.36         -0.09 * log2(0.09)
x8     1000                4      0.08         -0.02 * log2(0.02)

Calculating from the table: n_avg = Σ[i = 1..8] n_i * P_i = 2.92 bits, H(x) = -Σ[i = 1..8] P_i * log2(P_i) = 2.85 bits.

n_min = H(x) / log2(m) = 2.85 bits, and the redundancy coefficient K = (n_avg - n_min) / n_avg = (2.92 - 2.85) / 2.92 ≈ 0.024; that is, 2.4% is the redundancy of the Huffman code for this example.
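These figures can be checked in code; code_efficiency below is an illustrative helper, not lecture material:

```python
from math import log2

def code_efficiency(probabilities, lengths, m=2):
    """Average length, entropy, and redundancy coefficient of a code
    over an m-symbol code alphabet."""
    n_avg = sum(p * n for p, n in zip(probabilities, lengths))
    entropy = -sum(p * log2(p) for p in probabilities)
    n_min = entropy / log2(m)
    return n_avg, entropy, (n_avg - n_min) / n_avg

probs = [0.19, 0.16, 0.16, 0.15, 0.12, 0.11, 0.09, 0.02]
lengths = [2, 3, 3, 3, 3, 3, 4, 4]
n_avg, h, k = code_efficiency(probs, lengths)
print(f"n_avg = {n_avg:.2f} bits, H(x) = {h:.2f} bits, K = {k:.3f}")
# Prints n_avg = 2.92 bits, H(x) = 2.85 bits, K = 0.022; the lecture's
# 0.024 comes from rounding H(x) to 2.85 before dividing.
```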

Shannon-Fano code.

1) the entire sequence of messages is ordered in descending order of probability;

2) the sequence is divided into 2 groups of as nearly equal total probability as possible;

3) the symbol "0" is assigned to the first group, "1" to the second;

4) each resulting group is again divided as equiprobably as possible, and so on, while further division into groups is possible.

Example: 5 events with p1 = 0.4, p2 = 0.3, p3 = 0.15, p4 = 0.1, p5 = 0.05.

n_avg = Σ[i = 1..5] P_i * n_i = 2.05 bits,

H(x) = -Σ[i = 1..5] P_i * log2(P_i) = 2.01 bits,

log2(m) = 1, K = (n_avg - n_min) / n_avg = (2.05 - 2.01) / 2.05 ≈ 0.02, i.e. 2%.
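A minimal sketch of the splitting procedure above, again with illustrative names (shannon_fano and the balance criterion for choosing the cut point are assumptions of this sketch); it reproduces the figures of this example:

```python
from math import log2

def shannon_fano(probabilities):
    """Sort by descending probability, then recursively split each
    group into two sub-groups of (nearly) equal total probability,
    appending '0' for the first group and '1' for the second."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    codes = {}

    def split(group, prefix):
        if len(group) == 1:
            codes[group[0]] = prefix or "0"
            return
        total = sum(probabilities[i] for i in group)
        running, best_k, best_gap = 0.0, 1, float("inf")
        for k in range(1, len(group)):   # find the most balanced cut
            running += probabilities[group[k - 1]]
            gap = abs(total - 2 * running)
            if gap < best_gap:
                best_gap, best_k = gap, k
        split(group[:best_k], prefix + "0")
        split(group[best_k:], prefix + "1")

    split(order, "")
    return codes

probs = [0.4, 0.3, 0.15, 0.1, 0.05]
codes = shannon_fano(probs)
n_avg = sum(p * len(codes[i]) for i, p in enumerate(probs))
entropy = -sum(p * log2(p) for p in probs)
print(codes)  # {0: '0', 1: '10', 2: '110', 3: '1110', 4: '1111'}
print(f"n_avg = {n_avg:.2f} bits, H(x) = {entropy:.2f} bits")  # 2.05, 2.01
```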

Knowledge of effective codes allows us to answer the question: what is the minimum length of code combinations needed to convey the information contained in the message source in the absence of interference? The code length should not fall below this level. On top of it one can build a superstructure, i.e. introduce additional digits, to solve the problems of combating interference. As a rule, the code combinations of effective codes have different lengths for different messages. This leads to the need to mark the beginning and end of a message transmission with special symbols, which complicates and slows down transmission. Codes whose code combinations have different lengths are called non-uniform (variable-length) codes.

