From algorithm to data structure

Lecture

"It seems that with examples of sorting you can build a whole course of programming."
Niklaus Wirth

“I'm afraid I can't explain all this to you,” said Alice politely. “I myself don’t understand anything. So many transformations in one day can be confusing to anyone.”
Lewis Carroll

The formulation of the problem, as a rule, leaves the programmer the opportunity to choose a path for solving it. Evidence of this is a variety of algorithmic mechanisms, including not only those mentioned above, but also some others. Are there any doubts that they should belong to the programmer’s arsenal?

In fact, a program written outright may well be ineffective. But does every programmer bother to find out? Unfortunately, we repeatedly had to deal with typical examples of “hidden” inefficiency when discussing algorithmic and software solutions with young coders. The most popular arguments are that the finished program, firstly, works, secondly, it works quickly, and in general, they say, “what do you want from me (or from it, I mean, programs)”.

Indeed, the presented product looks workable, and it is difficult to convince the encoder that he should think more. The fact is that the level of complexity of learning tasks that novice programmers have to deal with is not very high. If you have to build an algorithm of polynomial complexity, then it usually does not exceed O (n ² ) , in extreme cases, O (n ³ ) . And we already know that with small values of n, the real time of the execution of such a computational process does not give reason to search for a better solution. At the same time, the preparation of any significant data set, in order to check the behavior of the program with large n , the same programmers do not bother themselves. As for the complexity of the exponential, then such tasks in the curriculum - we mean an ordinary school course of computer science - are quite few, if they are presented at all.

So, we will proceed from the need to include in the "gentleman's set" of the programmer as many basic algorithms as possible. Saying “as much as possible”, we do not plan to simply renumber them, and then, in order of priority, discuss them. Nevertheless, it is advisable to carry out some of their classification.

With one of the options for such a classification, you had to face at an early stage of acquaintance with the discipline of programming. There is a view of the classification of software mechanisms by types that precede the algorithmic - linear algorithm , branching and loop .

Of course, the discussion of efficiency only begins with the advent of cyclic processing. And we need to go further. Tempting idea Wirth, rendered in the epigraph. But the goal that we pursue in this section is much more modest: we do not aim at the "whole" course, we just want to demonstrate the differences in the algorithmic content of the tasks that arise before the programmer. What prevents us from doing something similar — or at least trying — based on an object familiar to any reader — the multiplication table !? So, we propose to treat the further reasoning as a kind of detailed exercise in the design of algorithms.

Naturally, before turning to the table, you should set the task. Let’s accept the following wording: “ calculate the corresponding output value — their product” from the input values of a pair of natural numbers. On the keyword "calculate" note: its presence is reflected in how the following is built

Algorithm 1. Multiplication 1

Having returned some years ago, each of us will remember how he was taught to solve this problem in the first grade. Suppose, by way of discussion, that the input test includes the numbers 5 and 6 (obviously, any other pair of single-digit numbers is suitable).

Take a handful of 5 matches; add to them another 5 from the second pile; then five more from the third; and so on, until we collect matches from all 6 piles together. How many matches did we get? Perhaps, to multiply 5 × 6, you were offered a childhood interpretation of “5 piles of 6 matches,” but the author was taught exactly this way - however, this does not change the essence (Fig. A4-1).

heap number	one	2	3	four	five	6
number of matches in the heap	five	five	five	five	five	five

Fig. one

Such an explanation, or something of the same kind, describes an algorithm for multiplying two factors — first by second — in the class of natural numbers . The complexity of this algorithm is O (n) , where n is the second factor.

Algorithm 2. Multiplication 2

A few more problems will be caused by the situation of a large number of heaps (even if in the new example there are 26 of them) and matches in them (15 each). Now it is appropriate to recall the multiplication algorithm "in a column . "

Fig. 2

Multiplying here 6 by 15, in the course of the calculations we must multiply 6 by 5 (or 5 by 6), which allows us to reduce many steps of the A4-2 algorithm to the calls of Algorithm 1. We can assume that at one time this version of the algorithm was one of the first reader's experiences in applying the method of reducing the task to subtasks.

Algorithm 3. Multiplication 3

What does not suit us - or it may not, if there is a better solution - Algorithm 2? Naturally, the repeated challenges of Algorithm 1, whose effectiveness is questionable. Here, finally, the multiplication table will be useful to us.

The value that the statement of the problem suggests to calculate is provided by the table directly - at the intersection of the 5th row and the 6th column. Everything would be fine, but the remaining rows and columns turn out to be “redundant”, creating the problem of choosing the right pair. So there are two subtasks of the same type (again: decomposition!) - search for the 5th, top, line and, after this, search in it for the 6th number from the left (Fig. A4-3).

	one	2	3	four	five	6	7	eight	9
one	one	2	3	four	five	6	7	eight	9
2	2	four	6	eight	ten	12	14	sixteen	18
3	3	6	9	12	15	18	21	24	27
four	four	eight	12	sixteen	20	24	28	32	36
five	five	ten	15	20	25	thirty	35	40	45
6	6	12	18	24	thirty	36	42	48	54
7	7	14	21	28	35	42	49	56	63
eight	eight	sixteen	24	32	40	48	56	64	72
9	9	18	27	36	45	54	63	72	81

Fig. 3

Such a search is not a primitive operation. It requires cyclic processing of candidates in a certain sequence - in our case, enumeration and verification of rows 1 through 5, then columns 1 through 6. It is easy to see that the order of candidates in the sequence can be set differently: for example, from the bottom line - upwards and, then, from the rightmost column - to the left. But the nature of these differences is only technical, they do not affect the assessment of the complexity of the search for the elements in the sequence.

We are, of course, interested in the algorithmic difference. In principle , the divide-and-rule method works differently, although here we only mention it, leaving the discussion of the mechanism for the future.

Thus, in our classification, it is necessary to provide a place not for a unique mechanism — a sequential search , but for a whole family of search algorithms , in which, by the way, there are other representatives besides the two already mentioned. So the table search technology - in this case - is split into two independent subtasks of the search in a linear array .

Let us return once more to the initial formulation of the problem. It turns out that we changed it "imperceptibly" as soon as the table appeared. In fact, modification of the wording is associated with the replacement of the keyword "calculate" with the word "find" . That is, the corresponding formulation of the problem, as soon as it was supplemented with a suitable data structure , immediately suggested to us another algorithmic solution. We will discuss the data structures below, but for now we will try to develop this consideration further.

Algorithm 4. Multiplication 4

So, if we expand the standard multiplication table to 15 rows and 26 columns, then the product of interest will be found in its lower right corner (Fig. A4-4).

	one	2	...	26
one	one	2	...	26
2	2	four	...	52
...	...	...	...	...
15	15	thirty	...	390

Fig. four

Essentially, the statement of the problem and familiarity with Algorithm 3 dictated the choice of structure, and the input data determined the values of the parameters of the structure.

Of course, not every algorithmic decision is worth making, - but we are only reasoning for now, and have not rushed to write the program code right away. And here a simple argument shows that Algorithm 4 is unlikely to suit us. Indeed, with an increase in the number of rows m and columns n - that is, the values of input parameters - the size of the table should be no less than the product of the specified values. We have to recall the capacitive complexity of the algorithm, which is associated with the storage of information in the selected data structure, and will be O (m · n) .

Arguing further, we note that problems arise with temporal complexity. It is not enough to allocate memory for the structure; to store data, you first need to place them there, and this will also take time on the order of O (m · n) . By the way, the initial placement of data in the selected structure is often called its initialization . So, with a single calculation of the 15 × 26 product, the latter algorithm does not suit us at once for two reasons: due to the inefficiency of the time spent at the initialization stage and because of the capacitive non-effectiveness of the selected structure.

Naturally, the same reasoning is relevant for any pair of values of the factors. How to be: after all, the problem of multiplying natural numbers has to be solved regularly? In non- computer practice (microcalculators - not counted), we often use two alternative algorithms, switching at the right moments. The mechanism is to multiply numbers with a bit more than 1 in a column, according to Algorithm 2, and for bitwise operations, refer to the multiplication table of Algorithm 3. With this approach, stored in the mind - but initialized in childhood! - the multiplication table has the usual sizes 10 × 10 (or 9 × 9), and it does not reach Algorithm 4.

However, the size of the table 10 × 10 light wedge did not converge. The reader is probably curious to know that Australian schoolchildren are "cramming" a table of 13 × 13. So, and there is a place for choice. Well, and at computer processing when the binary numeral system is used , the table becomes absolutely small - 2 × 2. Does it make sense to initialize it in RAM, or use other algorithms, we will discuss in Chapter B, where we will talk about computer arithmetic .

However, we will not be distracted. I hope the reader has not forgotten that our goal is to outline the range of tasks and algorithms encountered in programming practice and, of course, in our course. And the multiplication table serves us as “only” a model object. But in our classification is added another method for solving problems - this is computer simulation . More precisely, numerical simulation , in which a computer serves as a high-speed performer and a demonstrator of processing results. We do not intend to deeply intrude into this independent field of computer science, we confine ourselves to just one example. We mean the visualizers of algorithms , which are many in our course. In fact, these are just models, and their application for the realization — in full — of the possibilities and goals of the corresponding algorithms is by no means envisaged.

Let us now note that in the considered algorithms we take into account, albeit sometimes implicitly, but when encoding, of course, explicitly, the form of presenting information . So, speaking of data, we everywhere typified them somehow, for example, in Algorithm 1, referring to natural numbers. What can we say about machine arithmetic, where form, to a large extent, dictates processing algorithms.

We also used the term “structure” to present the data, without specifying the content, since the reader came across this concept in a programming language. In the algorithms, data structures are understood somewhat more broadly than in a specific programming language, but the general characteristics of the concepts are the same. In any case, there is a finite set of elements related — within the structure — by a certain order . The established order makes it possible to distinguish the elements of the structure by location in it or relative to each other .

A programming language may or may not allow the manipulation of the entire structure as a whole, but it always allows the processing of its constituent elements separately. Both possibilities are relevant in algorithms. Each processed structure is assigned its own set of operations , and in this sense, the capabilities of the language are usually already. Thus, individual algorithmic operations often have to be emulated by a sequence of operations allowed by the framework of the language. In detail, these questions are also waiting to be discussed, - we will proceed to it, starting with Chapter D.

In this sense, when processing the multiplication table, the operations are allowed - and, accordingly, the mechanisms of their implementation are regulated - direct access to a row or column by a given number, writing the value of a table element during its initialization, reading the element.

Exercise 1
Perhaps your acquaintance with the computer has not yet required knowledge of the device of RAM. In this case, find out what is the RAM as a structure (allocation) of data.

Exercise 2
If we think about the method of storing the multiplication table, we will probably decide that we need some more operations. What kind?

Finally (and perhaps this should have been started), combining the elements into a structure requires that they have a certain - at least one - common property. In our model problem, this attribute is the belonging of elements to the class of natural numbers.

Exercise 3
Is it possible to consider a series of natural numbers as a data structure?

Exercise 4
What data structure would you choose when describing a military unit in a parade?

Imagine that, in addition to the digital value, the cells of the multiplication table also placed its verbal notation, for example, "30, thirty". This option could be useful to us, say, as a methodological guide for teaching the numeral English. Then the value of the structure element is the entire record "30, thirty", the key of the record is "30", and the information part - "thirty" - is accessible by key .

For the chosen representation of the concept, the key and the number (or index ) of the element are not interchangeable. The key assignment is in the identification of the structure element, the index assignment is to provide access to the element. So, in the table in fig. A4-3 keys are present in an explicit form: the row keys are located in the left column, and the column keys in the top row. The numbers of the rows and columns are related to the way the structure is displayed on a particular carrier. For example, it is possible to remove the left column and the top row from the same table, leaving it without keys.

Here, an attentive reader may accuse us of the different uses of the concept element when considering examples. Indeed, then we understand the element row (or column), the element lies at their intersection. In fact, a table of rows , a table of columns, and a “just” multiplication table look the same only on the book page, but they look different from an access point of view. The difference is already determined by the type of constituent elements of the structure. But not only them. Accordingly, we are talking about different structures, and these three tables will also have to be manipulated differently.

Exercise 5
Which of the three options is more expedient from the point of view of the use of "as intended" when placing the table in the computer memory?

Exercise 6
Can you suggest other options for the structure of the "multiplication table", except for the three proposed above?

Exercise 7
Since we will be equally interested in the possibility of displaying other data structures in the computer memory, it is useful to recall how the data described in the programming language as a component is stored in the computer's memory.

Exercise 8
Write a program for learning English numerals on the multiplication table. Use a programming language, not spreadsheets. But your program can work on the principle of a spreadsheet.

Comments

To leave a comment

If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.

To reply

Comment

To confirm that you are not a bot, answer:

Name

Email(not published)

Vote

From algorithm to data structure

Comments

To leave a comment

Algorithms

Terms: Algorithms