Laboratory work number 10. "RESEARCH OF METHODS OF LINEAR AND BINARY SEARCH"

Lecture





Objective: to study the methods of linear and binary search.


The task of the work: to master the skills of writing programs for the methods of linear and binary search in the PASCAL programming language.


Operating procedure :

  • study the description of the laboratory work;

  • according to the task given by the teacher, to develop an algorithm for the program for solving the problem;

  • write a program in PASCAL;

  • debug the program;

  • to solve a task;

  • issue a report.


Brief theory



SEARCH

One of the most common programming actions is search. It is also an ideal task, where you can test various data structures as they appear. There are several basic "variations of this theme", and many different algorithms have been created for them. Upon further consideration, we proceed from such a fundamental assumption: the group of data in which it is necessary to find a given element is fixed. We assume that the set of N elements is given, say, in the form of such an array

a: ARRAY [0..N-1] OF item

Typically, the item type describes an entry with some field acting as a key. The task is to search for an element whose key is equal to the specified "search argument" x. The resulting index i, satisfying the condition a [i] .key = x, provides access to other fields of the detected element. Since we are primarily interested in the search process itself, rather than the detected data, we will assume that the item type includes only the key, i.e. he is the key.

Algorithm

Linear search


If there is no additional information about the data being searched for, then the obvious approach is to simply sequentially scan the array, step by step, in the part of the array where the desired element is not found. This method is called linear search. The search termination conditions are:

1. Element found, i.e. ai = x.

2. The entire array was scanned and no matches were found.


This gives us a linear algorithm:

i: = 0;

WHILE (i x) DO

i: = i + 1;

END;

Note that the order of the elements in a logical expression is significant. The cycle invariant, i.e. The condition that runs before each increase in the index i looks like this:

(0  i k : 0  k k  x)

He says that for all values ​​of k smaller than i, there was no match. From here and from the fact that the search ends only if the condition in the loop header is false, the final condition can be derived:

((i = N) OR (a i = x)) AND (A k : 0  k k  x)

This condition not only indicates the desired result, but it also follows that if an element is found, then it is found together with the minimum possible index, i.e. This is the first of these elements. The equality i = N indicates that there is no match.

It is obvious that the end of the cycle is guaranteed, since at each step the value of i increases, and, therefore, it will certainly reach the limit N in a finite number of steps; in fact, if there was no match, it will happen after N steps.

It is clear that at each step it is required to increase the index and calculate the logical expression. Is it possible to simplify this work and thus speed up the search?

The only possibility is to try to simplify the logical expression itself, because it consists of two members. Consequently, the only chance on the path to a simpler solution is to formulate a simple condition equivalent to our complex one. This can be done if we guarantee that a match will always occur. To do this, it is enough to place an additional element with the value x at the end of the array. Let's call such an auxiliary element a “barrier”, because it protects us from going beyond the array. Now the array will be described as:

a: ARRAY [0..N] OF INTEGER

and the linear search algorithm with a barrier looks like this:

a [N]: = x;

i: = 0;

WHILE a [i] <> x DO

i: = i + 1;

END;

The resulting condition derived from the same invariant as before:

(a i = x) AND (A k : 0  k k  x)

It is clear that the equality i = N indicates that there was no coincidence (except for a coincidence with the barrier).

Search by division in half (binary search).


It is clear that there are no other ways to speed up the search, unless, of course, there is still any information about the data, among which there is a search. It is well known that a search can be made significantly more efficient if the data is streamlined. Imagine a telephone directory in which the names will not be arranged in order. This is something completely useless! Therefore, we present an algorithm based on the knowledge that array a is ordered, i.e. satisfies the condition

A k : 1 k k-1  a k

The basic idea is to randomly select some element, suppose am, and compare it with the search argument x. If it is equal to x, then the search ends, if it is less than x, then we conclude that all elements with indices less than or equal to m can be excluded from further search; if it is greater than x, then indices greater than and equal to m are excluded. This consideration leads us to the following algorithm (it is called “search by division in half”). Here, the two index variables L and R mark the left and right ends of the array section a, respectively, where the required element can still be found.

L: = 0;

R: = N-1;

found: = FALSE;

WHILE (L Ј R) AND NOT found DO

m: = any value between L and R;

IF a [m] = x THEN found: = TRUE;

IF a [m]
ELSE R: = m-1;

END;

END;

The cycle invariant, i.e. The condition that runs before each step is:

(L  R) AND (A k : 0  k k k : R k > x)

what the result is derived from

found OR ((L> R) AND (A k : 0  k k k : R k > x))

where it comes from

(a m = x) OR (A k : 0  k
The choice of m is completely arbitrary in the sense that the correctness of the algorithm does not depend on it. However, the choice affects its effectiveness. It is clear that our task is to exclude at every step from further search, whatever the result of the comparison, as many elements as possible. The best solution would be to select the middle element, since in this case half of the array will be excluded. As a result, the maximum number of comparisons is log N, rounded to the nearest integer. Thus, the above algorithm significantly benefits compared with a linear search, because there the expected number of comparisons is N / 2.

Efficiency can be somewhat improved by reversing the headings of conditional statements. The equality test can be performed secondarily, since it occurs only once and leads to the end of the work. But more importantly, the following consideration: is it possible, like in a linear search, to find such a solution, which would again simplify the termination condition. And we really find such a fast algorithm as soon as we abandon the naive desire to end the search while fixing a match. At first glance, this seems strange, but upon careful examination it turns out that the efficiency gain at each step exceeds the loss from comparison with several additional elements. Recall that the number of steps in the worst case is log N. The fast algorithm is based on the following invariant:

(A k : 0  k k k : R  k k  x)

the search continues until both sections "cover" the entire array.

L: = 0;

R: = N;

WHILE L
m: = (L + R) DIV 2;

IF a [k]
ELSE R: = m;

END

END

The termination condition is L i R, but is it attainable? To prove this, we need to show that in all circumstances the difference RL decreases at every step. At the beginning of each step L

Tasks


Options:

1. Find the smallest element in array A using a linear search.

2. Search for elements in array A that are greater than 30.

3. Display on screen all the numbers in array A multiples of 3 (3,6,9, ...) using a linear search.

4. Find all items whose module is greater than 20 and less than 50, using linear search.

5. To display on the screen all the numbers in array A are multiples of 4 (4.8, ...) using a linear search.

6. To display a message, which numbers are more relative to 50, using a linear search.

7. Find the element in array A and find the number of comparisons using linear search.

8. Search for items randomly using a binary search.

9. It is given a list of car numbers (345, 368, 876, 945, 564, 387, 230), to find where the car with the specified number is located, binary search.

10. Search every second element in the list and the number of comparisons.

11. Find an element with a given key using a binary search.


Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Structures and data processing algorithms.

Terms: Structures and data processing algorithms.