Cryptanalysis. Cryptanalysis methods

Lecture



Cryptanalysis (from the Greek for "hidden" and "analysis") is the science of recovering the original content of encrypted information without access to the secret information (the key) required to do so. In most cases this means finding the key. In non-technical usage, cryptanalysis is the breaking of a cipher (code). The term was introduced by the American cryptographer William F. Friedman in 1920.

The term "cryptanalysis" also refers to attempts to find a vulnerability in a cryptographic algorithm or protocol. Although the main goal has remained unchanged over time, the methods of cryptanalysis have changed significantly, evolving from pen and paper to the widespread use of computing power. Whereas cryptanalysts were once mostly linguists, today the field belongs largely to "pure" mathematicians.

The result of the cryptanalysis of a specific cipher is called a cryptographic attack on that cipher. A successful cryptographic attack that completely discredits the attacked cipher is called a break.

Cryptanalysis methods

1.1. Terminology and assumptions.
Cryptology is quite clearly divided into two parts: cryptography (encryption) and cryptanalysis. The cryptographer tries to find methods of ensuring the secrecy and/or authenticity of messages. The cryptanalyst tries to perform the inverse task, breaking a cipher or forging coded signals so that they are accepted as authentic. The original message to which the cryptographer applies his art is called the plaintext, and the result of his work is called the ciphertext, or cryptogram. The cryptographer always uses a secret key to control the encryption process. Often (but not always) he sends this secret key in some reliable way (for example, in a briefcase handcuffed to a courier's wrist) to the person (or machine) to whom he intends later to send cryptograms composed using this key.
An almost universally accepted assumption in cryptography is that the enemy's cryptanalyst has the full text of the cryptogram. In addition, the cryptographer almost always follows the rule first formulated by the Dutchman Kerckhoffs: the strength of a cipher must be determined only by the secrecy of the key. In other words, the Kerckhoffs rule is that the entire encryption mechanism, except for the value of the secret key, is known to the enemy's cryptanalyst. If the cryptographer accepts only these two assumptions, he develops a system that is resistant to ciphertext-only analysis. If, moreover, the cryptographer assumes that the enemy's cryptanalyst can obtain several passages of plaintext together with the corresponding ciphertext formed using the secret key, a system resistant to known-plaintext analysis is developed. The cryptographer may even assume that the enemy's cryptanalyst can submit plaintext of his own choosing and obtain the correct cryptogram formed with the secret key (chosen-plaintext analysis), or that he can substitute fictitious cryptograms and obtain the texts into which they decrypt (chosen-ciphertext analysis), or allow both possibilities (chosen-text analysis). The developers of most modern ciphers ensure resistance to chosen-plaintext analysis, even when it is assumed that the enemy's cryptanalyst will only be able to mount ciphertext-only analysis.
1.2. The tasks of cryptography. The concept of the strength of a cryptographic algorithm.
Cryptology is a "special" area of research. The achievements of this science are reported increasingly often not only in scientific but also in popular-science journals and the ordinary press. In recent years there has been an unprecedented boom in the field of cryptology abroad, owing to the fact that its achievements have come to be applied not only in narrow departmental circles but in the lives of millions of citizens.
The widespread introduction of computing systems has made them attractive targets for various kinds of information attacks. This is facilitated by the fact that information has been deprived of the physical embodiment it used to have (for example, text written on paper and signed by the author). The absence of such a physical embodiment, coupled with the impossibility of authenticating its author, has opened the way to all kinds of violations. Consequently there is a need not only to ensure confidentiality, but also to control the authenticity and integrity of information. In addition, the growing value of information and the informatization of society raise the issues of controlling access to information (for example, if the user has not paid for working with a knowledge base) and of protection against computer terrorism. To date, such protection is carried out effectively by means of cryptography.
Systems and means of information protection differ from "ordinary" systems and means in that for them there are no simple and unambiguous tests that establish that information is securely protected. Moreover, the effectiveness of a protection system, and even its mere presence, is not tied to the performance of the main system, so the problem of its effectiveness cannot be solved by ordinary testing. For example, to check the operability of a communication system it is sufficient to test it; but successful completion of such tests does not allow one to conclude that the information protection subsystem built into it is operable as well. The task of determining the effectiveness of a protection system (especially when cryptographic methods of protection are used) is often more laborious than developing it, requires special knowledge and, as a rule, higher qualifications than the development task. Often the analysis of a new cipher is a new scientific challenge, not an engineering one.

These circumstances lead to the appearance on the market of many means of cryptographic protection of information about which nobody can say anything definite. At the same time, developers keep their cryptoalgorithms (which, as practice shows, are often weak) secret. However, the task of precisely determining a cryptoalgorithm cannot be guaranteed to be hard, if only because the algorithm is known to its developers. Furthermore, if an offender has found a way to overcome the protection, it is not in his interest to announce it. As a result, users of such protection systems become dependent, at the very least, on the developer. It should therefore be beneficial to society to discuss the security of mass-market protection systems openly, and concealment of the cryptoalgorithm by its developers should be considered unacceptable.
We will call the algorithm itself (encryption, imitation protection, and other cryptographic functions) a cryptoscheme or cryptoalgorithm. A cryptographic protocol is a set of rules and procedures defining how a cryptoalgorithm is used. A cryptosystem is a set of cryptoschemes and key-management protocols and procedures, including key manufacture and distribution. These definitions do not claim rigor and do not draw a clear boundary between a cryptoalgorithm and a protocol. Thus the hash function y = F(z, x) + x, where F is a cryptographic transformation with a known key z, can be considered both as a stand-alone cryptoalgorithm and as a protocol using the transformation F. For the further presentation, however, these definitions are sufficient.
It is customary to distinguish cryptoalgorithms by the degree of provability of their security: unconditionally secure, provably secure, and presumably secure cryptoalgorithms. The security of unconditionally secure cryptoalgorithms rests on proven theorems about the impossibility of recovering the key. Examples are the system with one-time keys (the Vernam cipher) and quantum cryptography systems based on the quantum-mechanical uncertainty principle. The strength of provably secure cryptoalgorithms is determined by the complexity of solving a well-known mathematical problem which many mathematicians have tried to solve and which is generally acknowledged to be hard. Examples are the Diffie-Hellman and Rivest-Shamir-Adleman systems, based on the difficulty of the discrete logarithm and of factoring an integer, respectively. Presumably secure cryptoalgorithms rest on the complexity of solving a particular mathematical problem that does not reduce to well-known problems and which only one or a few people have tried to solve. Examples include the cryptoalgorithms GOST 28147-89, DES, FEAL.

Unfortunately, unconditionally secure cryptosystems are inconvenient in practice (one-time keys require a large protected memory for key storage; quantum cryptography systems require fiber-optic communication channels and are expensive, and, besides, the proof of their security leaves the realm of mathematics for that of physics).
The advantage of provably secure algorithms is that the problems underlying them are well studied. Their disadvantage is the impossibility of quickly adapting the cryptoalgorithm should the need arise, that is, their rigidity. An increase in strength can be achieved by increasing the size of the mathematical problem or by replacing it, which, as a rule, entails a chain of changes not only in the encryption equipment but in adjacent hardware as well. Presumably secure cryptoalgorithms are characterized by comparatively little study of their underlying mathematical problem, but they are very flexible, which makes it possible not to abandon an algorithm in which weaknesses are found, but to refine it.
The task of providing secure communication includes a whole range of problems: ensuring secrecy and imitation protection (protection against the imposition of false messages), identification (authentication) of subscribers, and key management, including key generation, distribution and delivery to users, as well as prompt key replacement when necessary.

The general scheme of the organization of secure communication is shown in Figure 2.1. The same scheme also describes subscriber-level encryption, when a user encrypts information stored in computer memory; in that case the source and the recipient of messages coincide, and the communication channel may have arbitrary delay.

The message source generates arbitrary information (plaintext) with some probability distribution. The encryptor encrypts this message on a confidential key Z (known only to the sender and recipient) and translates the plaintext into ciphertext (a cryptogram). Keys are generated by a key source and sent to the network subscribers over secure channels. The decryptor recovers the received ciphertext and passes the result to the recipient.

The diagram in Figure 2.1 also includes a randomizer and a decision device. The randomizer makes every cryptogram different from the others, even if the input messages are identical; the purpose of this will be explained below. The decision device decides whether a received message is authentic, that is, it performs the imitation-protection function.
Encryption and decryption can be described as follows: Y = E(X), X = D(Y). For this correspondence to be one-to-one, it is necessary that DE be the identity transformation. In this section we will assume that the sender and the recipient share a secret key Z. (In fact, their keys need not be identical, but knowledge of one key, for example the encryption key, must make it easy to compute the other. Therefore the cryptoalgorithms considered here are sometimes called symmetric, or single-key, and the branch of cryptography studying them is called single-key cryptography.) This scheme is used if the network subscribers trust each other.
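As an illustration, here is a minimal sketch of a single-key system in Python, using a toy XOR transformation (an assumption of this example, not a scheme from the lecture); for XOR, E and D coincide, so DE is the identity, as required. All names are illustrative.

```python
# A minimal sketch of a single-key (symmetric) cipher: Y = E(X), X = D(Y).
# Toy XOR transformation for illustration only; a real cipher would use a
# strong keyed permutation.

def encrypt(z: bytes, x: bytes) -> bytes:
    """E: XOR the plaintext with the key repeated to the needed length."""
    return bytes(b ^ z[i % len(z)] for i, b in enumerate(x))

def decrypt(z: bytes, y: bytes) -> bytes:
    """D: for XOR, decryption is the same operation, so D(E(x)) = x."""
    return encrypt(z, y)

z = b"secret key Z"
x = b"plaintext message"
y = encrypt(z, x)          # the ciphertext (cryptogram)
assert decrypt(z, y) == x  # DE is the identity transformation
```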
Let us now show how single-key cryptography solves the problems of imitation protection and identification (secrecy is ensured in the obvious way). The authenticity and integrity of a message are ensured by a cryptographic transformation of it performed with the secret key. For example, if the sender transmits at once both the plaintext (which requires no secrecy) and the ciphertext, a recipient who knows the key can assert that the text was not altered during transmission over the communication channel if the result of decrypting the ciphertext coincides with the plaintext. Indeed, an accidental coincidence of corresponding plaintext and ciphertext is a practically impossible event; only a sender who knows the secret key could produce such a pair. Hence the authenticity of the message follows (the sender is identified with the owner of the key). In fact there is no need to transmit the entire ciphertext: it suffices to transmit only a part of it, called the cryptographic check value (in modern terms, a message authentication code), which must depend on the entire plaintext. The recipient can then compute the check value from the received text and the key and verify that it matches the one received.
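A sketch of this check-value idea, using the standard library's HMAC construction in place of the keyed transformation described above (the choice of HMAC and all parameters are assumptions of this example, not the lecture's construction):

```python
import hmac, hashlib

def make_check_value(key: bytes, text: bytes) -> bytes:
    # The check value depends on the whole plaintext and on the secret key.
    return hmac.new(key, text, hashlib.sha256).digest()[:8]  # truncated tag

key = b"shared secret key Z"
text = b"open text, transmitted in the clear"
tag = make_check_value(key, text)

# Recipient side: recompute the check value and compare in constant time.
assert hmac.compare_digest(tag, make_check_value(key, text))
# Any change to `text` in transit would make this comparison fail.
```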
To identify a user, the following dialog is used. The verifier generates a random text and sends it to the party being identified for encryption. That party encrypts the text and returns it to the verifier, who checks that the cryptogram corresponds to the text. Only the owner of the secret key, who is identified with the legitimate user, can answer correctly. Obviously, an intercepting offender cannot correctly encrypt a new text and so cannot pass himself off as the user. (The exception is the analogue of the well-known swindle used in correspondence chess, where the offender simply relays the genuine challenges and responses between the verifier and the party being identified, conducting a dialog with each of them simultaneously.) The principal difference between this system and identification by password, where eavesdropping allows the secret password to be learned and used later, is that here no secret information is transmitted over the channel. It is clear that encryption strength, imitation resistance and identification strength can all be ensured if the underlying cryptographic transformation is strong in the sense of key recovery.
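A sketch of the identification dialog under the same assumption (HMAC standing in for keyed encryption of the challenge; names are illustrative):

```python
import hmac, hashlib, secrets

key = b"shared secret key Z"   # known to the verifier and the legitimate user

# Verifier: generate a random text (challenge) and send it out.
challenge = secrets.token_bytes(16)

# Identified party: transform the challenge with the secret key and return it.
response = hmac.new(key, challenge, hashlib.sha256).digest()

# Verifier: check that the response corresponds to the challenge.
expected = hmac.new(key, challenge, hashlib.sha256).digest()
assert hmac.compare_digest(response, expected)
# An eavesdropper sees (challenge, response) but not the key, and a fresh
# random challenge is used each time, so replaying old responses is useless.
```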
Cryptoalgorithms are usually built from simple, quickly executed operators of several types. The set of invertible operators transforming a text of length n bits into a text of length n bits forms a group under composition (the group of permutations of n-bit words). Let f, g, h be invertible operators, that is, f^-1, g^-1, h^-1 exist. Then hgf, the sequential application of f, g, h (operators are applied from right to left), is also an invertible operator, with inverse f^-1 g^-1 h^-1. Hence the decryptor performs the same operations as the encryptor but in reverse order, each decryption operator being the inverse of the corresponding encryption operator. Some operators are their own inverses: applying the operation to a text twice returns the original text. In the language of group theory this is written as the equation f^2 = e, where e is the identity operator. Such an operator is called an involution; one may say that an involution is a square root of the identity. An example of an involution is bitwise addition of the text with the key modulo 2.
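A small sketch of these group-theoretic facts on 8-bit words: composing invertible operators, undoing them in reverse order, and XOR-with-key as an involution (f^2 = e). The particular operators are arbitrary choices for illustration.

```python
# Invertible operators on 8-bit words, with their inverses.
f = lambda x: (x + 17) % 256                  # addition of a constant
f_inv = lambda x: (x - 17) % 256
g = lambda x: x ^ 0b10110101                  # XOR with a key: an involution
h = lambda x: ((x << 3) | (x >> 5)) & 0xFF    # rotation left by 3 bits
h_inv = lambda x: ((x >> 3) | (x << 5)) & 0xFF

x = 0xA7
y = h(g(f(x)))                  # encryptor: hgf, applied right to left
assert f_inv(g(h_inv(y))) == x  # decryptor: inverses in reverse order (g = g^-1)
assert g(g(x)) == x             # involution: applying g twice is the identity
```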
An offender may attempt the following tasks: to recover the encrypted information, to organize the selective transmission of particular information, or, finally, to alter genuine information or impose false information. The fundamental difference between the tasks of secrecy and of imitation protection is that the secrecy key must remain unavailable to the offender for the entire period of secrecy of the information, which is usually much longer than the validity period of the key and may amount to decades. The imitation-protection key, by contrast, is of interest to the intruder only while it is in effect, so the requirements on it are less stringent than those on the secrecy key.
There is another important application of single-key cryptography: the implementation of one-way computable transformations of information. Such a transformation is called a hash function. Its characteristic feature is that the direct transformation y = h(x) is easy to compute, while the inverse x = h^-1(y) is hard. Generally speaking, the inverse transformation is not a function, so it is more correct to speak of finding one of the preimages of a given hash value. In this case there is no key in the sense of confidential information. However, strong hash functions, for which a preimage of a given value is hard to find, are implemented by cryptographic methods, and substantiating their strength requires cryptographic research. A typical application of hash functions is the creation of a compressed image of a source text such that finding another text with the same image is computationally infeasible. The task of constructing a strong hash function arises, for example, in the digital signing of texts.
One possible independent use of hash functions is user identification by means of a chain of the form x, h(x), h(h(x)) = h^2(x), h^3(x), ..., h^k(x). The last value of the chain, h^k(x), serves as the verifier's control information, while the user knows h^(k-1)(x) and presents it at the verifier's request. The verifier computes h(h^(k-1)(x)) and compares it with the control value. The next time, the user must present h^(k-2)(x), the control information now being h^(k-1)(x), and so on. This interesting solution, proposed by A. Konheim, has several disadvantages, however. First, the user needs to store the entire chain h^i(x), which requires a large amount of memory if the number of identifications can be large. Second, if each user has several verifiers, the question arises of synchronizing the verifiers with respect to the index of the value h^i(x) currently in use, that is, of communication channels between every pair of verifiers.
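A sketch of the hash-chain scheme, with SHA-256 standing in for h (the hash choice and the chain length are assumptions of this example):

```python
import hashlib, secrets

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

k = 10
x = secrets.token_bytes(32)      # the user's secret seed
chain = [x]
for _ in range(k):               # x, h(x), h^2(x), ..., h^k(x)
    chain.append(h(chain[-1]))

control = chain[k]               # the verifier stores h^k(x)

# Identification: the user presents h^(k-1)(x); the verifier hashes and compares.
presented = chain[k - 1]
assert h(presented) == control

# Next time the control value becomes h^(k-1)(x), the user shows h^(k-2)(x).
control = presented
assert h(chain[k - 2]) == control
```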
The ability of a cryptosystem to withstand the attacks (active or passive) of a cryptanalyst is called its strength. Quantitatively, strength is measured as the complexity of the best algorithm that leads the cryptanalyst to success with acceptable probability. Depending on the goals and capabilities of the cryptanalyst, the strength changes as well. One distinguishes key strength (the complexity of recovering the key by the best known algorithm), strength against keyless reading, imitation resistance (the complexity of imposing false information by the best known algorithm), and the probability of imposing false information. These are sometimes entirely different notions, unrelated to one another; some cryptosystems, such as RSA, allow false information to be imposed with a complexity almost independent of the key strength. Similarly, one can distinguish the strength of the cryptoalgorithm itself, the strength of the protocol, and the strength of the key generation and distribution algorithms.

The level of strength depends on the capabilities of the cryptanalyst and of the user. Thus one distinguishes cryptanalysis based on ciphertext only, when the cryptanalyst has only a set of cryptograms and does not know the plaintexts, from cryptanalysis based on known plaintext, when the cryptanalyst knows both the plaintexts and the corresponding ciphertexts.
Since a cryptoalgorithm must usually be quite universal, it seems natural to require that its key strength not depend on the probability distribution of the message source. In the general case, the message source may produce messages "convenient" for the offender, which may become known to him. In this case one speaks of cryptanalysis based on specially chosen plaintexts. Obviously, key strength with respect to chosen-text analysis cannot exceed strength with respect to known-plaintext analysis, which in turn cannot exceed strength with respect to ciphertext-only analysis. Sometimes the developer of a protection system even allows for the hostile cryptanalyst having access to the cryptosystem itself, that is, being an "insider".

Typically, cryptoalgorithms are designed to be resistant to cryptanalysis based on specially chosen plaintexts.
The notion of the "best algorithm" for key recovery in the definition of strength is non-constructive and allows subjective interpretation (for some developers, the best algorithm may be simple exhaustive key search). Apparently, for none of the cryptoalgorithms in use has the best key-recovery algorithm been identified; that is, the task of finding the best algorithm is extremely difficult. Therefore, in practice, strength is assessed using the best known recovery algorithm, or one found in the course of research. Thus, in practice, nothing prevents a capable cryptanalyst from lowering the strength estimate by inventing a new, more effective method of analysis.

The creation of new effective methods of key recovery, or other ways of weakening a cryptoalgorithm, can give knowledgeable people great opportunities to harm the users of that cryptoalgorithm. Whether such information is published or kept silent is determined by the degree of openness of the society. An ordinary user of a system is powerless to prevent an offender from recovering his keys.
From the above it follows that the notion of the "best known" algorithm is not absolute: tomorrow a new, more efficient recovery algorithm may appear and lead to an unacceptable drop in the strength of the cryptoalgorithm. With the development of mathematics and computing technology, the strength of a cryptoalgorithm can only decrease. To reduce the possible damage caused by an untimely replacement of a cryptoalgorithm that has lost its strength, periodic rechecking of the cryptoalgorithm's strength is desirable. To reduce the likelihood of an unexpected "collapse" of a newly developed cryptoalgorithm, cryptographic studies must be conducted.

It also follows from the above that the notion of cryptosystem strength is multifaceted. Strength depends not only on the developer but also on the peculiarities of using the cryptoalgorithm in the control or communication system, on the physical implementation of the cryptoalgorithm, and on the future successes of mathematics and computing. After all, a cryptosystem may be operated for many years, and the need to keep previously transmitted information secret for a long time may make it necessary to forecast the development of science and technology decades ahead.
1.3. The need for cryptanalysis.
The last decade has been characterized by a sharp increase in the number of open publications on all questions of cryptology, and cryptanalysis is becoming one of the most actively developing areas of research. Many cryptosystems whose strength raised no particular doubts have been successfully broken. At the same time, a large arsenal of mathematical methods of direct interest to the cryptanalyst has been developed.

In the early 1970s only classical single-key cryptography was known, and the number of open works on the subject was very modest. The lack of interest can be explained by a number of reasons. First, there was apparently no acute need for commercial cryptosystems. Second, the large volume of closed cryptographic research discouraged many scientists, who naturally wanted new results. Finally, and perhaps most importantly, cryptanalysis as a scientific discipline still amounted, in fact, to a large collection of disparate "tricks" not united by a coherent mathematical concept.
In the 1970s the situation changed radically. First, with the development of communication networks and the pervasive spread of computers, the need for cryptographic data protection came to be recognized by ever wider sections of society. Second, the invention of public-key cryptography by Diffie and Hellman created fertile ground for meeting the commercial demand for secrecy, eliminating such a significant drawback of classical cryptography as the complexity of key distribution. In essence, this invention galvanized the scientific community by opening a qualitatively new, unexplored area, which moreover promised the possibility of broadly applying the fresh results of the rapidly developing theory of computational complexity to the design of concrete systems with a simple mathematical description. It was expected that the strength of such systems would rest reliably on the practical unsolvability of many well-known problems, and that perhaps in time it would become possible to prove the fundamental unbreakability of some cryptosystems.

But the hopes of achieving provable strength by reducing cryptographic tasks to well-known mathematical problems did not materialize; rather the opposite. It is precisely the fact that any problem of finding a way to break a particular cryptosystem can be reformulated as an attractive mathematical problem, amenable to the many methods of complexity theory, number theory and algebra, that led to the breaking of many cryptosystems. To this day, the classical one-time pad remains the only unconditionally strong encryption system.
An ideal proof of the strength of a public-key cryptosystem would consist in proving that any algorithm for breaking the system that has a non-negligible probability of success entails an unacceptably large amount of computation. Although none of the well-known public-key systems satisfies this strong criterion of strength, the situation should not be considered hopeless. Many systems have been developed for which it has been proved that their strength is equivalent to the complexity of solving certain important problems considered by almost everyone to be extremely hard, such as the well-known problem of factoring integers. (Many of the cryptosystems that were broken had been obtained by weakening these presumably strong systems in order to achieve high speed.) In addition, thanks to the results of the extensive research of the past decade, both in cryptography itself and in the general theory of computational complexity, modern cryptanalysts have a much deeper understanding of what makes systems weak.
Conducting cryptanalysis of long-existing and newly emerging cryptoalgorithms is very important: it makes it possible to say in time that a given cryptoalgorithm is weak and should be improved or replaced. In order to identify weak cryptoalgorithms, it is necessary to constantly improve the known methods of cryptanalysis and to find new ones.
2. Review of the known methods of cryptanalysis of classical ciphers.
The directions of cryptanalysis may differ, in accordance with the directions of cryptography: recovery of the key, imposition of false information by finding weaknesses in the cryptoalgorithm or protocol, the possibility of keyless reading of encrypted information, and so on.

A cryptoalgorithm, as a rule, must be resistant to cryptanalysis based on chosen plaintexts. The first requirement essentially means that the declassification of some information transmitted over the communication channel in encrypted form must not lead to the declassification of other information encrypted on the same key. The second requirement takes into account the specifics of operating the equipment and allows certain liberties on the part of the operator or of persons with access to the formation of classified information.
Cryptanalysis can rely both on general mathematical results (for example, methods of factoring large numbers, for the RSA cryptosystem) and on particular results obtained for a specific cryptoalgorithm. As a rule, cryptanalysis algorithms are probabilistic.
2.1. Typical methods of cryptanalysis of classical algorithms
2.1.1. The meet-in-the-middle method.
If the key set of a cryptoalgorithm is closed under composition, that is, if for any keys z_i and z_j there exists a key z_k such that the result of encrypting any text successively on z_i and then on z_j equals the cryptogram of the same text on z_k, i.e. F(z_j, F(z_i, x)) = F(z_k, x), then this property can be exploited. Suppose we need to find a key z. Then it suffices to find a pair of keys z_i, z_j equivalent to it. This method of cryptanalysis is based on the "birthday paradox". It is known that if birthdays are assumed to be uniformly distributed, then in a group of 23 people, with probability about 0.5, two people share a birthday. In general form the paradox is formulated as follows: if a·√n objects are selected with replacement from a set of size n, then the probability that two of them coincide is approximately 1 - exp(-a^2/2).
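A quick numeric check of both statements (a sketch: the 23-person case computed exactly, the general case via the approximation 1 - exp(-a^2/2); the values of n and a are arbitrary):

```python
import math

# Exact probability that among 23 people at least two share a birthday.
p_all_distinct = 1.0
for i in range(23):
    p_all_distinct *= (365 - i) / 365
print(1 - p_all_distinct)       # ~0.507

# General form: drawing a*sqrt(n) elements with replacement from a set of
# size n, the collision probability is about 1 - exp(-a**2 / 2).
a = 1.2
print(1 - math.exp(-a**2 / 2))  # ~0.51 for a = 1.2, independent of n
```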
Let a plaintext x and its cryptogram y be known. For the text x we build a database containing a random set of keys z' and the corresponding cryptograms w = F(z', x), and order it by the cryptograms w. The database size is chosen as O(√#), where # denotes the number of keys. Then we select keys z'' at random to decrypt the text y and compare each result v = F(z'', y) with the database. If some v turns out to equal one of the cryptograms w, then the composition of the keys z' and z'' is equivalent to the sought key z. The time complexity of the method is O(√# · log #); the factor log # accounts for the sorting. The required memory is O(√# · log #) bits, or O(√#) blocks (it is assumed that the block length and the key length differ by a bounded constant).
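A sketch of the method on a toy cipher whose key set is closed under composition (XOR with a one-byte key, so that F(z_j, F(z_i, x)) = F(z_i XOR z_j, x)); the cipher, table size and all names are assumptions of this example:

```python
import secrets

def F(z: int, x: int) -> int:       # toy cipher: XOR with a one-byte key
    return x ^ z

x = 0x3C
z = secrets.randbelow(256)          # the unknown key
y = F(z, x)                         # known plaintext/cryptogram pair

# Database: cryptograms F(z1, x) for a random set of keys z1 (kept in a dict,
# which plays the role of the sorted table).
table = {F(z1, x): z1 for z1 in (secrets.randbelow(256) for _ in range(16))}

# Pick random keys z2 on the ciphertext side until v = F(z2, y) hits the table.
while True:
    z2 = secrets.randbelow(256)
    v = F(z2, y)                    # for XOR, decryption is F itself
    if v in table:
        z1 = table[v]
        break

# The composition of z1 and z2 (here z1 XOR z2) is equivalent to the key z.
assert F(z1 ^ z2, x) == y
```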
The same method is applicable if the key set contains a sufficiently large subset forming a semigroup.

Another application of this method, to a set that is not a semigroup, can be demonstrated on hash functions. For example, to forge a signature one needs to find two texts with the same hash image; the signed message can then be replaced by the other message, which has the same image. The search for two such messages can also be performed by the meet-in-the-middle method, with complexity O(√#), where # is the number of distinct hash images.

This algorithm is probabilistic. However, there exists a deterministic analogue of the same complexity, the "giant-step baby-step" algorithm proposed by the American mathematician D. Shanks.
2.1.2. The Pollard method.
This probabilistic method is based on the following fact. Define a random mapping f on a finite set M and apply it in turn to all elements of M, drawing an edge from x to y = f(x) for x, y ∈ M. Since M is finite, the resulting graph must consist of trees whose roots are joined into cycles. For a random mapping f, the cycle length is O(√#M) and the average tree height is also O(√#M).

To find a key, for example in a cryptoalgorithm based on the logarithm problem in some group, it suffices to solve the problem of meeting on the graph of a random mapping. To do this, trajectories are built from two different starting points x0', x0'' until they enter the cycle. Then one of the two endpoints lying in the cycle is fixed, and the trajectory of the other is continued until it meets the fixed point. For a function f and an entry point x0, the trajectory length is O(√#M) steps: a typical trajectory contains a limit cycle of length O(√#M) and a tail of approximately the same length leading into the cycle. As an indicator that the trajectory has closed, Pollard proposed using the equality x_i = x_2i, where x_i is the i-th point of the trajectory for the input x0. This equality will necessarily hold for some i not exceeding the sum of the tail length and the cycle length.

On average, the complexity of finding the equality x_i = x_2i is 3·√((π/8)·#M); the complexity of the meeting, when both points lie in the cycle, is 0.5·√((π/8)·#M). Thus the total complexity is 6.5·√((π/8)·#M).
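A sketch of Pollard's trajectory-closure test x_i = x_2i (Floyd's two-pointer cycle finding) on a pseudo-random mapping of a small set M; the mapping, set size and starting point are arbitrary choices for illustration:

```python
import hashlib

M = 2**20                                      # size of the finite set M

def f(x: int) -> int:
    """A 'random-looking' mapping of {0, ..., M-1} into itself."""
    digest = hashlib.sha256(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") % M

x0 = 12345
slow, fast, i = f(x0), f(f(x0)), 1
while slow != fast:                            # look for x_i == x_2i
    slow, fast, i = f(slow), f(f(fast)), i + 1
print("trajectory closed at i =", i)           # i <= tail length + cycle length
# Expected i is O(sqrt(#M)), using constant memory: only two points are kept.
```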
The Pollard method is used for solving the logarithm problem on a cyclic group, for finding partially equivalent keys (collisions), for finding two arguments yielding the same hash value, and in some other cases. Applied to the logarithm problem, this method eliminates the large memory required by the meet-in-the-middle method, and no database sorting is needed; as a result, the time complexity is reduced by a factor of O(log #M) compared with meet-in-the-middle. The complexity of the method in this case is O(√#M) steps, and it requires memory of O(1) blocks. However, small memory is not always required.
As an illustration of the Pollard method, consider an algorithm for finding a collision (two arguments giving the same hash value) in a computational model with memory O(v). The sought arguments are those elements of M whose arrows, under the action of the hash function f, lead to the entry point of the cycle. In practice the algorithm reduces to finding this entry point.
1. Enter the cycle using the equality x_i = x_2i = t.
2. Measure the length m of the cycle by successively applying the mapping f to the element t until the equality f^m(t) = t is obtained.
3. Split the cycle into v intervals of equal or nearly equal length and create a database, storing and ordering the starting points of the intervals.
4. From the starting vertex of step 1, perform single steps until meeting any point from the database of step 3. Note the start and end points of the interval in which the meeting occurred.
5. Delete the previous database and create a new one of v points by dividing the interval in which the meeting occurred into equal parts, storing and ordering the starting points of the new intervals.
6. Repeat steps 4-5 until the interval length becomes 1. Compute the meeting point in the cycle and obtain the collision as a pair of vertices, one of which lies in the cycle while the other does not.
This algorithm requires O(√#M) steps, repeated for entering the cycle and for sorting the databases. At each stage, when a new database is created, the interval length decreases by a factor of v, so the number of repetitions is O(log_v #M). If v << √#M, the complexity of sorting the database can be neglected, and the complexity of the algorithm is O(√#M · log_v #M).
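For comparison, here is a constant-memory sketch of collision finding on the same kind of mapping graph: after the closure test, one pointer restarts from x0 and both advance in step; the pair of points just before they meet are two distinct preimages of the cycle's entry point. This Floyd-style variant is an assumption of the example and replaces the interval database of steps 3-5 above.

```python
import hashlib

M = 2**20
def f(x: int) -> int:              # the same 'random' mapping as sketched above
    digest = hashlib.sha256(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") % M

x0 = 12345
slow, fast = f(x0), f(f(x0))
while slow != fast:                # step 1: detect closure via x_i = x_2i
    slow, fast = f(slow), f(f(fast))

# Locate the entry point with two in-step pointers; the last pair of points
# before the meeting is the collision (one on the tail, one in the cycle).
a, b = x0, slow
prev_a = prev_b = None
while a != b:
    prev_a, prev_b = a, b
    a, b = f(a), f(b)
if prev_a is not None:             # tail is non-empty, so a collision exists
    assert f(prev_a) == f(prev_b) == a
    print("collision:", prev_a, prev_b, "-> entry point", a)
```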
Consider the possibility of improving this complexity estimate by increasing the available memory. Recall that almost all vertices of a random mapping graph lie on a single tree. We will therefore solve the meeting problem on a random oriented tree. As the depth h increases, the width b(h) of the tree decreases; that is, a random mapping has contractive properties. Hence, as the depth of a vertex grows, the complexity of a meeting grows because of the repeated application of the mapping, while on the other hand the probability of a meeting grows because of the contractive properties. The proportion of vertices of depth at least h is O(1/h).

Consider two algorithms for solving the meeting problem. The first selects a set of m random initial vertices, applies the mapping to them k times, sorts the set of final vertices, and searches for equal elements among them. The second memorizes the entire descent trajectory of each vertex, sorts the set of all vertices on the trajectories, and searches for equal elements. The second algorithm is no less complex than the first, owing to the contractive properties of a random mapping.

The complexity of the first algorithm is k·m·log m, where the factor log m accounts for the sorting. By the birthday paradox, m = O((#M/h)^0.5). For a single step the complexity of the algorithm is O(√#M · log #M); that is, this algorithm is no more efficient than the meet-in-the-middle algorithm.
After the first step from a leaf, the depth becomes O(log #M). However, further growth of the depth slows with each step. This is explained by the essentially discontinuous character of the conditional probability p(h|H) of reaching depth h from initial depth H. Indeed, from the definition of depth it follows that every vertex of depth H+1 is joined by an edge to a vertex of depth H, and from each vertex there is a single outgoing edge. Therefore, by the quantitative estimates of the width of the graph,

p(H+1 | H) = O(H^-2 - (H+1)^-2) / O(H^-2) = 1 - O(H^-1).

The numerator of this expression equals the difference of the graph widths at depths H and H+1; the denominator accounts for the fact that the initial depth equals H. The probability of reaching a depth h > H+1 is determined by the probability of not landing at depth H+1 and by the width of the graph: p(h | H) = O(H^-1 · h^-2).

Let us compare the probability p_k(h) of reaching depth h after k steps of descent from a leaf with the probability p_k(h | M_(k-1)) of depth h after k steps, given that at step k-1 the depth equaled its mathematical expectation. The estimate p_k(h) > p_k(h | M_(k-1)) holds. Therefore, in place of the probability of the complex event p_k(h), one may consider the probability of the simple event p_k(h | M_(k-1)).
Let the starting vertex lie at depth H = O(log #M). What will the depth be after the next step? Direct calculation shows that the mathematical expectation of the depth is H + 1 + O(1); that is, the second and subsequent steps increase the depth of descent by O(1). Substituting the expected descent depth of the previous step as the initial depth gives an estimate of the expected descent depth after k steps: h = O(log #M) + O(k).

Since the optimal number of descent steps is determined by equating the descent cost m·k with the sorting cost m·log m, we obtain k_opt = O(log #M). It follows that the meeting problem on a random oriented tree is no easier than the meeting problem in a cycle, and hence the Pollard algorithm cannot be improved by increasing the available memory.
2.1.3. The differential cryptanalysis method.
The method of differential cryptanalysis (DCA) was proposed by E. Biham and A. Shamir in 1990. Differential cryptanalysis is an attempt to recover the secret key of block ciphers based on the r-fold repetition of a cryptographically weak round encryption operation. The analysis assumes that each round uses its own encryption subkey. DCA can use both chosen and known plaintexts.

The success of such attempts to break an r-round cipher depends on the existence of differentials of the (r-1)-th round that have high probability. A differential of the i-th round is defined as a pair (a, b)_i such that a pair of distinct plaintexts x, x* with difference a can lead to a pair of output texts y, y* with difference b after the i-th round (for an appropriate notion of difference). The probability of the i-round differential (a, b)_i is the conditional probability P(Δy(i) = b | Δx = a) that the difference Δy(i) of the ciphertext pair (y, y*) after the i-th round equals b, given that the plaintext pair (x, x*) has difference Δx = a; the plaintext x and the round subkeys k(1), k(2), ..., k(i) are assumed independent and equiprobable.
The basic procedure of DCA for an r-round cipher using chosen plaintexts can be as follows:

1. At the precomputation stage, find a set of (r-1)-round differentials (a_1, b_1)_(r-1), (a_2, b_2)_(r-1), ..., (a_s, b_s)_(r-1), and order this set by differential probability (a sketch of this precomputation at the scale of a single S-box follows the list).
2. Choose a plaintext x arbitrarily and compute x* so that the difference between x and x* equals a_1. Encrypt x and x* on the real key; after r rounds this yields the ciphertext pair y(r), y*(r). Assume that at the output of the penultimate, (r-1)-th, round the ciphertext difference is the most probable one: Δy(r-1) = b_1. For the triple (Δy(r-1), y(r), y*(r)), find every possible value (if there are several) of the last-round subkey k(r), and increment the occurrence count of each such value of k(r).
3. Repeat step 2 until one or several values of the subkey k(r) appear more often than the others. Take this subkey, or this small set of subkeys, as the cryptanalytic solution for k(r).
4. Repeat steps 1-3 for the penultimate round, the values y(r-1) being computed by decrypting the ciphertexts on the recovered last-round subkey k(r). Proceed in the same way until the subkeys of all encryption rounds are recovered.
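A sketch of the precomputation idea at the scale of a single S-box: the difference distribution table of a 4-bit S-box, from which high-probability one-round differentials (a, b) are read off. The S-box values (the first row of the DES S1 table) are used here simply as a convenient example.

```python
# Difference distribution table (DDT): entry [a][b] counts inputs x with
# S[x ^ a] ^ S[x] == b. Large entries give high-probability one-round
# differentials, the raw material of step 1 above.
S = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
     0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

ddt = [[0] * 16 for _ in range(16)]
for a in range(16):
    for x in range(16):
        b = S[x ^ a] ^ S[x]
        ddt[a][b] += 1

# The best non-trivial differential and its probability P(dy = b | dx = a).
count, a, b = max((ddt[a][b], a, b) for a in range(1, 16) for b in range(16))
print(f"best differential: a={a:#x} -> b={b:#x}, probability {count}/16")
```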
First proposed for the analysis of a specific cipher, DCA turned out to be applicable to the analysis of a wide class of Markov ciphers. A cipher is called Markov if its one-round encryption equation satisfies the condition that the probability of a differential does not depend on the choice of plaintexts. Then, if the round subkeys are independent, the sequence of differences after each round forms a Markov chain, in which the next state is determined only by the previous one. Examples of Markov ciphers are DES and FEAL.

It can be shown that a Markov r-round cipher with independent subkeys is vulnerable to DCA if and only if the round key can be easily computed from a known triple (y, y*, Δx) for one round of encryption, and there exists an (r-1)-round differential (a, b)_(r-1) whose probability satisfies P(Δy(r-1) = b | Δx = a) >> 2^-n, where n is the number of bits in a block of encrypted text.
The complexity Q(r) of recovering the key of an r-round cipher is defined as the number of encryptions used, followed by the computation of the key:

Q(r) >= 2 / (P_max - 1/(2^n - 1)),

where P_max = max over a of max over b of P(Δy(r-1) = b | Δx = a). In particular, if P_max does not exceed 1/(2^n - 1), the attack cannot succeed. Since computing a subkey is a more laborious operation than encryption, the unit of measurement of complexity is the complexity of finding the possible subkeys of one round from the known triple (Δy(r-1), y(r), y*(r)).
A distinctive feature of differential analysis is that it makes practically no use of the algebraic properties of the cipher (linearity, affinity, transitivity, closure, etc.), relying only on the non-uniformity of the probability distribution of the differentials.
2.1.4. Linear cryptanalysis.
The linear method of cryptanalysis was first proposed in the open press by the Japanese mathematician Matsui. The method assumes that the cryptanalyst knows plaintexts and the corresponding ciphertexts.

Typically, encryption uses modulo-2 addition of the text with the key, together with scattering and mixing operations. The task of the cryptanalyst is to find the best linear approximation (after all encryption rounds) of the form

x_i1 + ... + x_ir + y_j1 + ... + y_js = z_k1 + ... + z_kt,     (3.1)

where addition is modulo 2, the x's are plaintext bits, the y's ciphertext bits, and the z's key bits. Let p_L be the probability that (3.1) holds; it is required that p_L ≠ 1/2, and the value |p_L - 1/2| should be maximal. If |p_L - 1/2| is large enough and the cryptanalyst knows a sufficient number of pairs of plaintexts and corresponding ciphertexts, then the modulo-2 sum of the key bits at the positions on the right-hand side of (3.1) equals the most likely value of the modulo-2 sum of the corresponding plaintext and ciphertext bits on the left-hand side. If p_L > 1/2, the sum of the key bits on the right-hand side of (3.1) equals zero if the left-hand sum equals zero for more than half of the text pairs, and equals one if the left-hand sum equals one for more than half of the texts. If p_L < 1/2, the opposite holds: the right-hand key-bit sum equals zero if the left-hand sum equals one for more than half of the pairs, and equals one if the left-hand sum equals zero for more than half of the texts. To find the individual key bits, it then remains to solve a system of linear equations for the known linear combinations of these bits; this procedure presents no difficulty, since the complexity of solving a system of linear equations is described by a polynomial of degree at most three in the key length.

The number N of pairs of plaintexts and ciphertexts (blocks) required for key recovery is estimated as N ≈ |p_L - 1/2|^-2. To recover a DES key by this method, 2^47 pairs of known plaintexts and ciphertexts are needed.
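An analogous sketch for the linear method: the bias |p_L - 1/2| of every linear approximation of the same toy 4-bit S-box. This is a per-S-box computation; chaining approximations across all rounds, as Matsui does, is beyond the scope of the sketch.

```python
# Linear approximation table of a 4-bit S-box: for input mask a and output
# mask b, count the inputs where parity(a & x) == parity(b & S[x]); the bias
# |count/16 - 1/2| plays the role of |pL - 1/2| for a single S-box.
S = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
     0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def parity(v: int) -> int:
    return bin(v).count("1") % 2

best = (0.0, 0, 0)
for a in range(1, 16):
    for b in range(1, 16):
        hits = sum(parity(a & x) == parity(b & S[x]) for x in range(16))
        best = max(best, (abs(hits / 16 - 0.5), a, b))
bias, a, b = best
print(f"best approximation: masks a={a:#x}, b={b:#x}, |pL - 1/2| = {bias}")
print("texts needed ~", bias ** -2)     # the estimate N ~ |pL - 1/2|**-2
```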
2.2. Cryptanalysis tools.
Various mathematical tools can be used for the analysis of ciphers. There are, however, general principles for solving complex computational problems.

Usually the key-computation problem can be reformulated as the problem of searching a large finite set M for an element m possessing the required properties. One approach to this problem is called "divide and conquer". Its essence is that the initial complex search task is divided into two subtasks: the set of elements is partitioned into non-intersecting or weakly intersecting enumerable subsets that are recognizable with respect to the properties possessed by the sought element. At the first stage, the subset containing the required element is found (the first subtask is solved); then the required element is found within that subset (the second subtask is solved). Such partitioning can be applied repeatedly. An example of this approach is the well-known game of guessing an arbitrary word from a multi-volume encyclopedia, where the guesser may ask 20 questions and receives "yes" or "no" answers to them; a sketch follows.
The divide-and-conquer approach can also be used in cipher analysis; naturally, its application must be tailored to each cryptoalgorithm. For example, if the set M admits a partition into subsets recognizable with respect to some property A, and there is a contractive mapping acting on these subsets and preserving property A, then the Pollard method can be applied not to the elements of M but to the subsets containing a given element.
Another effective technique for solving computational problems is to order the set M by decreasing probability that a given element has the required property, and then to test the elements of M (exhaustive search) starting with the most likely. This technique is used in the differential method of analysis. If the probabilities are distributed substantially non-uniformly, a large gain in complexity is obtained. In particular, if the probabilities form a geometric progression, the complexity of finding the desired element is linear in the size of the problem (the logarithm of the cardinality of the original set).
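A sketch of probability-ordered testing: when the candidate probabilities form a geometric progression, the average number of trials is a small constant, versus n/2 for testing in arbitrary order. The candidate count and the distribution are assumptions of the example.

```python
import random

# Candidates ordered by decreasing a priori probability; the probabilities
# form a geometric progression p_i ~ 2**-(i+1) (truncated and padded with
# zeros, since terms beyond ~1000 underflow anyway).
n = 2**16
weights = [2.0 ** -(i + 1) for i in range(1000)] + [0.0] * (n - 1000)

random.seed(1)
costs = []
for _ in range(1000):
    true_key = random.choices(range(n), weights=weights)[0]
    cost = true_key + 1        # trials when testing candidates 0, 1, 2, ... in order
    costs.append(cost)
print("average trials:", sum(costs) / len(costs))    # close to 2, not n/2
```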
A common tool is also linearization of the task. This is often possible because affine approximations of transformations form a semigroup under composition and have simple descriptions. Such a semigroup is formed not only by affine transformations, however, but also by other objects, such as symmetric polynomials, lattices, and certain classes of forms (homogeneous polynomials, all of whose terms have the same degree). By composition of polynomials is meant the substitution of one polynomial as a variable into another polynomial.
For the analysis of ciphers, Andelman and Reeds suggested passing from the original discrete encryptor to a "continuous" encryptor that coincides with the original one on the vertices of the n-dimensional unit cube, and then searching for a continuous key using techniques for finding extrema of continuous mappings. Note that a certain difficulty lurks here. It stems from the fact that all elements of the ring of Zhegalkin polynomials, or of the ring of Boolean functions with the operations AND, OR, NOT, are idempotent. Suppose the variables take values from some continuous subset A of the real numbers. For the numerical values of real analogues of Boolean formulas, one must ensure x^2 = x, and hence x^r = x for any rational number r. Thus all the real numbers of A are identified; that is, the elements of A are elements of the factor group R/Q. It is easy to see that in any computational model of finite precision the numbers from A are not representable, so this method does not work (at least over the real and, consequently, the complex numbers).
Each new method of cryptanalysis leads to a reassessment of the security of the ciphers to which it is applicable. If the cryptanalyst's goal is to break as many ciphers as possible (regardless of whether he wants to harm society, warn it of a possible danger, or simply become famous), the best strategy for him is the development of universal methods of analysis. But this task is also the most difficult.
Encryption algorithms (standards) change periodically (as the examples of the LUCIFER, DES and FEAL ciphers and of the Clipper chip show), while secret information often tends to age, that is, some time after its encrypted transmission over communication channels it ceases to be of much interest to the offender. Therefore the payoff from finding a method of recovering the keys of a given cipher, as a function of time, has a maximum: at the beginning of its lifetime the cryptoalgorithm is not yet widely used, and towards the end of its lifetime it ceases to be popular and the bulk of the encrypted information is no longer of interest.

Terms: Cryptanalysis, Types of Vulnerability and Information Protection