Efficient coding.

Lecture



Redundancy is understood as the use of a larger number of characters than the minimum necessary. Natural language has great redundancy. The question arises: how can the coding rule G be constructed so that the redundancy is minimal? The requirement for building non-redundant (efficient) codes is: the greater the probability of a message, the shorter its code combination should be (Morse code follows this principle: the most frequent English letter, E, is encoded by a single dot). Let us obtain a mathematical relationship connecting the probability of occurrence of a message and the length of the corresponding code combination.

Let the coding mapping take the set of messages $X = \{x_1, x_2, \dots, x_r\}$ into the code alphabet $B$ of size $|B| = m$.

The coding mapping G assigns to each message $x_i$ some code combination of length $n_i$ over the alphabet $B$. Task: estimate the minimum average length of a code combination of the code G. Knowing this value and comparing it with the average code-word length of a real code, one can determine how close the real code is to the optimal one. The probabilities $P(x_i)$ are known, the entropy is

$H(X) = -\sum_{i=1}^{r} P(x_i) \log_2 P(x_i)$,

and the average code-word length is

$\bar{n} = \sum_{i=1}^{r} P(x_i)\, n_i$.

The maximum entropy that a sequence of $\bar{n}$ characters of the code alphabet can carry is $H_{\max} = \bar{n} \log_2 m$. Since $H_{\max} \ge H(X)$,

$\bar{n} \log_2 m \ge -\sum_{i=1}^{r} P(x_i) \log_2 P(x_i)$.
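As a minimal sketch of these definitions (the distribution and code-word lengths below are hypothetical, not taken from the lecture), the following Python snippet computes $H(X)$, $\bar{n}$ and $H_{\max}$ for a binary code alphabet ($m = 2$) and checks that $H_{\max} \ge H(X)$:

```python
import math

# Hypothetical message probabilities P(x_i) and assigned code-word lengths n_i (m = 2).
P = [0.5, 0.25, 0.125, 0.125]
n = [1, 2, 3, 3]
m = 2

H = -sum(p * math.log2(p) for p in P)          # entropy H(X)
n_avg = sum(p * ni for p, ni in zip(P, n))     # average code-word length
H_max = n_avg * math.log2(m)                   # maximum entropy the code words could carry

print(f"H(X) = {H:.3f} bits, n_avg = {n_avg:.3f}, H_max = {H_max:.3f}")
print("H_max >= H(X):", H_max >= H)            # True; here they are equal, so redundancy is zero
```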

Consider the special case in which the redundancy equals zero, i.e. $H(X) = H_{\max}$. In general, the redundancy coefficient is

$k = \dfrac{H_{\max} - H(X)}{H_{\max}}$, or equivalently $k = \dfrac{\bar{n} - n_{\min}}{\bar{n}}$, where $n_{\min} = \dfrac{H(X)}{\log_2 m}$.

In the zero-redundancy case,

$\sum_{i=1}^{r} n_i\, P(x_i) \log_2 m = -\sum_{i=1}^{r} P(x_i) \log_2 P(x_i)$,

that is,

$\sum_{i=1}^{r} P(x_i)\,\bigl(n_i \log_2 m + \log_2 P(x_i)\bigr) = 0$.

This holds, in particular, when every term vanishes: $n_i \log_2 m + \log_2 P(x_i) = 0$, hence

$n_i = -\dfrac{\log_2 P(x_i)}{\log_2 m}$.  (*)

From (*) it follows that the length of the i-th code combination is proportional to $-\log_2$ of the probability of the i-th message. The left-hand side of (*) is an integer, while the right-hand side is usually not, so (*) cannot be applied directly when building a real effective code.
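To see what (*) produces in practice, here is a small sketch (the second distribution is hypothetical, not from the lecture): for a dyadic distribution (*) yields exact integer lengths, while for a non-dyadic one the ideal lengths come out non-integer.

```python
import math

m = 2  # binary code alphabet

def ideal_lengths(P):
    """Ideal code-word lengths given by (*): n_i = -log2 P(x_i) / log2 m."""
    return [-math.log2(p) / math.log2(m) for p in P]

# Dyadic distribution: (*) gives exact integers, so zero redundancy is achievable.
print(ideal_lengths([0.5, 0.25, 0.125, 0.125]))   # [1.0, 2.0, 3.0, 3.0]

# Hypothetical non-dyadic distribution: the ideal lengths are non-integer.
print(ideal_lengths([0.4, 0.3, 0.2, 0.1]))        # ~[1.32, 1.74, 2.32, 3.32]
```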

Exact equality is generally unattainable, but one can require that $n_i$ lie in a range around the value given by (*) that is guaranteed to contain an integer:

$-\dfrac{\log_2 P(x_i)}{\log_2 m} \le n_i \le -\dfrac{\log_2 P(x_i)}{\log_2 m} + 1$.

Multiplying all parts of this inequality by $P(x_i)$ and summing over $i$ gives

$-\sum_{i=1}^{r} \dfrac{P(x_i) \log_2 P(x_i)}{\log_2 m} \le \sum_{i=1}^{r} P(x_i)\, n_i \le -\sum_{i=1}^{r} \dfrac{P(x_i) \log_2 P(x_i)}{\log_2 m} + \sum_{i=1}^{r} P(x_i)$.

Since $\sum_{i=1}^{r} P(x_i) = 1$, this reduces to

$\dfrac{H(X)}{\log_2 m} \le \bar{n} \le \dfrac{H(X)}{\log_2 m} + 1$.  (**)
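A quick numerical check of (**) under the same assumptions (hypothetical probabilities, $m = 2$), taking each $n_i$ as the integer from the unit-wide range above:

```python
import math

# Hypothetical probabilities; binary code alphabet (m = 2).
P = [0.4, 0.3, 0.2, 0.1]
m = 2

H = -sum(p * math.log2(p) for p in P)
n = [math.ceil(-math.log2(p) / math.log2(m)) for p in P]   # integer lengths from the range
n_avg = sum(p * ni for p, ni in zip(P, n))

lower = H / math.log2(m)
print(f"{lower:.3f} <= {n_avg:.3f} <= {lower + 1:.3f}")    # inequality (**) holds
```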

Using this inequality, one can numerically assess how good the structure of a constructed code is. Several methods exist for obtaining efficient codes with redundancy close to zero (the Huffman code, the Shannon-Fano code).
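As an illustration of how close such codes come to the lower bound, here is a minimal sketch of binary Huffman coding (not the lecture's own example; the distribution and the huffman_lengths helper are assumed for the illustration). It derives only the code-word lengths, which is enough to compare $\bar{n}$ with $H(X)$:

```python
import heapq
import math

def huffman_lengths(probs):
    """Return binary Huffman code-word lengths for the given probabilities."""
    # Heap entries: (subtree probability, tie-breaker, indices of symbols in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, ids1 = heapq.heappop(heap)     # two least probable subtrees
        p2, _, ids2 = heapq.heappop(heap)
        for idx in ids1 + ids2:
            lengths[idx] += 1                 # each merge adds one code symbol
        heapq.heappush(heap, (p1 + p2, tie, ids1 + ids2))
        tie += 1
    return lengths

# Hypothetical source distribution.
P = [0.4, 0.3, 0.2, 0.1]
n = huffman_lengths(P)
H = -sum(p * math.log2(p) for p in P)
n_avg = sum(p * ni for p, ni in zip(P, n))
print(f"lengths = {n}, n_avg = {n_avg:.2f}, H(X) = {H:.3f}")
```

For this distribution the resulting lengths are [1, 2, 3, 3], so $\bar{n} = 1.9$ against $H(X) \approx 1.85$, which satisfies (**).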



