Chess Program Algorithms

Lecture



Minimax

To begin, consider a simpler game, for example, tic-tac-toe on a 3x3 board. Consider the starting position. Here the first player (let it be crosses) has 9 moves (ignore symmetry). For each of them, the opponent has 8 answers, in turn, for each of them - 7 answers for crosses, etc. The maximum length of the game is 9 half moves (i.e., the moves of one of the opponents), but it may end quickly if someone wins.

Take a piece of paper and draw the top position from the top. Since there are 9 moves out of it, we draw 9 lines (branches) out of it. At the end of each branch we will draw a position that is obtained after this move, and in turn we will draw from it branches - moves that are possible from this position. And so on. The resulting structure is called a tree, because (1) it has a root — the starting position, (2) it has leaves — positions from which there are no branches, for one of the players won or a draw happened, and (3) it is remotely like a tree if a sheet of paper is turned over. Tic-tac-toe is a simple game, its tree should fit on a (large) piece of paper, but you can also imagine a painted tree for chess (there seems to be only 10 ** 50 or so different positions, and the 50-move rule limits the maximum length batches ~ 4000 moves). Of course, for such a sheet of paper there will not be enough all the atoms in the universe, but you can still imagine such a sheet (I, at least, can).

Now we assign to each sheet its assessment. Let the winning of the crosses be -1, the zeros - plus 1, and the draw will be zero. We start the “raise” estimate up the tree - consider the vertex immediately standing above the leaf (terminal position). From it leads several branches (exactly one in the tic-tac-toe, but this is not important). Imagine that the move now is the crosses. He is interested in obtaining -1, so from several options he will choose the option with the minimum value (-1, if possible, and if not, then at least 0). Thus, he will choose the minimum estimate of those that his have and assign it to this vertex.

A level above is the turn of . He, in turn, is interested in getting +1, so he will maximize the estimate obtained from the bottom.

In the end, we will raise the estimate to the root. In the case of tic-tac-toe, this will of course be 0, because the theoretical result of the tic-tac-toe in the optimal game of both players is a draw. What is the theoretical result of chess science is unknown.

Of course, having a painted game tree, it is very easy to play - the crosses must choose the move leading from the current position to the position with the minimum score; zeroes - in a position with the maximum. As it is easy to understand, the described algorithm is the promised minimax.

Alpha beta

Attentive readers should have guessed that the minimax will work not only on trees with plus / minus one marks, but also on trees whose leaves are simply real numbers. In the future, we will consider just such trees (to explain the basic algorithms, I will talk about trees, forgetting for the time being about the actual games). In addition, I will focus on determining the theoretical result of the game, i.e. on the rise grades up the tree.

Minimax is a very time consuming algorithm. To determine the theoretical result of the game, absolutely all the tree is marked up. Question: Is it possible, by keeping the mathematical correctness of the minimax, to bypass less?

It turns out you can. The algorithm is called alpha beta. It is easy to prove that he gives the same results (do not be afraid, I will not do this). The idea of ​​it, if explained on the fingers, is:

Let there be a root, it has 2 descendants, each of them has 3 descendants-leaves. At the root is the turn of the mine, respectively, in the tops of the second level - the move of max. The sheets that are descendants of the left subtree are assigned the estimates 0, 5, 10. Since we maximize at the vertices of the second sub-level, we assign the estimate 10 to the left subtree.

We start to consider the right subtree. Suppose that at the very left sheet a score of 100. And now - attention! We know that the level above we will maximize. This means that the score * of the entire right subtree * will be at least 100, since the remaining leaves can only raise this assessment. In addition, we know that in the initial position we will minimize, and therefore we choose the left subtree, whose score is 10.

So, we got an estimate of the whole tree, having considered only 4 positions out of 6. Chess analogy - suppose that I considered one of my moves, and found that after all the answers of the opponent, the position is equal. I looked at what happened after my other move, and found that one of my answers was opposing the opponent who won the knight. Considering his other answers no longer makes sense - I will not make this move anymore.

Please note that we are lucky that in the right subtree, grade 100 has the leftmost sheet. If he were right, and if both left-handers have a rating of 7, we would have to bypass all 6 leaves. In the worst case (we always consider the subtree / sheet with the best rating last), alpha-beta degenerates into a minimax. In the best case (the best subtree is always considered first) is considered (very roughly) 2 * (sqrt (the number of vertices considered by the minimax)).

Alpha-beta is an algorithm with - the branch is cut off only after at least one descendant has been evaluated, and they are convinced that the value of this branch will not affect the final result.

Application of theory to practice

Of course, for real games (chess, checkers, go) the whole tree will not be able to go around the machine. Therefore, go to a scam. First, the whole tree is not built - no machine will have enough memory. They store only the current position and the sequence of moves that led to it. If you go deeper, make a move on the board (more precisely, on its analogue in memory). You come back - take a step back (this is not a scam yet, from a theoretical point of view that we obviously do not build a tree, does not change anything).

Secondly and mainly, the exact estimate is replaced by an approximate one. Namely, after going to a certain depth, if a mate or a forced draw did not occur, instead of going further, we call an evaluation function that assigns a certain position to the current position (or rather, the entire subtree starting from the current position). How this function is arranged - a little later, but for now the main thing: here we have irrevocably broken with the theory, because the evaluation function is something approximate. This is the main scam - alpha-beta was scientifically substantiated under completely different conditions, and why, after the introduction of the evaluation function into it, it works in real games, it is not entirely clear (more precisely, there are as many works on this topic, but I’m seem inconclusive). Moreover, an entire class of games has been artificially created, where alpha-beta does not work in such conditions.

How is the evaluation function? First, just count the material. The simplest option: pawn - 1, knight and bishop - 3, rook - 5, queen - 9.

Secondly, it is necessary to add a positional assessment, and this is the most difficult. There are many factors, in the chess books are far from all, and most importantly - it says , but we need . Accordingly, it is necessary to first attribute the factors to the weight from the ceiling, and then debug, manually or trying to optimize the function in a space of several thousand.

Here are a few factors taken into account, with their approximate weights (weights, of course, taken from the ceiling, but vouch for the order of magnitude):

  • a pair of elephants is a plus third of a pawn
  • the king is covered with his pawns - plus half a pawn,
  • the knight or queen is close to the king (his or someone else’s) - plus the hundredth pawn,
  • a weak or lagged pawn - minus a quarter of a pawn,
  • the rook on the half-open vertical is plus five hundredths of the pawn,
  • an open rook is plus one tenth of the pawn,
  • a pair of rooks on the seventh rank - plus half rolls,
  • a hanging piece - minus a quarter of a pawn, two hanging pieces - minus a pawn, three or more - minus three pawns.

And so on and so forth. In addition, the weight of many factors may depend on the situation on the board - for example, while the queens are not exchanged, the king's cover can be estimated much higher than after the exchange. Thus, we obtain the positional estimate of whites, add up with the material of whites, do the same for black, subtract one from the other, and everything is reduced to one number. Beauty!

Unfortunately, not everything is so simple. The first programs used the algorithm just described — a search to a fixed depth with a call to the evaluation function there (if the mate had not happened before). And it immediately became clear that we could stop busting, for example, in the middle of the exchange, or with the “hanging” queen, or when a checkmate is on the board in one move, and the fact that the attacking side lacks a rook does not matter. There were attempts to take all this into account in the evaluation function, but then it became insanely complex and slow. For each case repaired in this way, ten others began to fall in, where a program deceleration due to a slower evaluation function led to an insufficient depth of searching and an even worse game.

In addition, the program with manic persistence is trying to postpone their troubles, getting in return the worst. . Of course, with a deeper search, the program would find that in the end it will have minus 4, but there is no time to go deeper. This phenomenon is called the "horizon effect", because the program tends to push the trouble beyond the horizon of visibility (or event horizon).

Attempts to use selective methods failed with an even bigger bang. It is not possible to understand how a person plays chess, and attempts to make programs with "forward pruning" - cutting off tree branches without first considering them, based on purely chess principles, fail. There are always positions where the program cuts off the desired course ().

In the end, it became clear that the evaluation function can only be called in calm positions. Therefore, at the end of the main search, another one was inserted, simplified, where only captures, transformations, and possibly shahs and answers to the shah are considered. Since there are few such moves, (as it is called) slows down the program by only 20-50%, and the strength of the game increases dramatically, since The program does not attempt to assess sharp positions.

Then we noticed that a series of exchanges with exchanges at the end would be examined to a greater depth than with exchanges in the middle, and realized that some options with exchanges in the middle would also be nice to consider deeper. An idea arose (I don’t know how to speak Russian, let it be ) - when going through some moves, do not decrease the remaining search depth. Typical grooves - after the exchange, when turning a pawn, when leaving the check. The idea is the same as in the forced version - the situation on the board can change dramatically, this option should be studied more deeply. That was the situation in the early and mid 70s.

Improvements

As already mentioned, for 40 years no alternative to alpha-beta has been found. In this sense, the most sad story of Botvinnik. For 20 years he was engaged in the creation of a chess program. He firmly believed that being a world chess champion and at the same time well aware of programming, he would be able to create a program that would play at the world champion level. The result is sad - the program never began to play like this (on this occasion, someone said well ). Worse, Botvinnik published an article describing the algorithm in the International Computer Chess Association. The article contained the results of the analysis of several positions allegedly performed by the program. Independent analysis showed that these results are falsifications — they could not be obtained using the Botvinnik algorithm.

However, all these years the quality of the game of chess programs has improved. Partially, the progress is explained by the rapid development of hardware - personal computers in some settings reached the level of supercomputers of the late 70s (although in a number of parameters they are still seriously lagging behind). The development of algorithms was partially - there was no breakthrough, but there were a lot of evolutionary improvements. What are the main ones?

(1) Use debut library. The debut theory in chess is well developed, and it is only natural that the programs began to use it. Moreover, they use it better than grandmasters, because remembering a few million machine positions is a snap.

(2) Using databases with fully calculated endgames. Now the real limit for a personal computer is 5-figured endgames (counting kings). Using such databases, programs play simple endgames flawlessly. Moreover, modern programs can, while going through the variants, to the database from the late middlegame, when there are 10-15 figures on the board. I will modestly note that I had a hand in creating the most widespread endgame bases.

(3) Thought during the opponent's turn. We have made a move and now the enemy is thinking. Why waste time? We will choose the most reasonable move from our point of view, which he can make, and we will start thinking about the answer to it. If he makes another move — well, and the devil is with him, it was still not our time. This is the simplest way to use the time of the enemy, and it is also the most effective, because good programs correctly predict ~ 70% of enemy moves.

(4) Various heuristics that help sort the moves in the brute force process. As noted earlier, alpha-beta is very sensitive to the order in which moves are analyzed. With a good order, it is faster, and faster in principle. At the same time, the algorithm still remains exponential — the search for N + 1 half-moves works several times slower than on N half-moves, but at least the power raised to the power decreases significantly. In a typical chess position, it is possible to have about 40 moves, and in order to go through N half-moves, a minimax (also alpha-beta with the worst ordering of moves) needs to consider 40 ** N positions. Alpha-beta with the best ordering of the moves should be considered approximately 6 ** N positions - the difference is fundamental. Heuristics are used differently, for example: first of all, try to eat the enemy figure that just walked (it works, since most of the moves under consideration are idiotic). Another useful heuristic (with a lower priority) is to try to make a move that was best for the neighbor on the left (it is based on the fact that in similar positions the best move is often the same). There are a few more heuristics, but I hope that the whole idea is clear.

(5) Iterative deepening. In a real game, the program has some time for each move, and they need to be wisely managed. It’s unreasonable to just start a search at a fixed depth, as earlier programs did, in a difficult position with a lot of exchanges and / or checks there will be many options that will be calculated very deeply due to , and the program may not have time to do a bust. And one of the features of alpha-beta is that it is impossible to use the result of incomplete busting - there is no guarantee that the best move found at the time of the interruption is any reasonable. On the other hand, in any closed position with a small number of possible moves (branching factor) the search may end unexpectedly quickly. The idea of ​​an iterative deepening is that the program first performs a search to a depth of 1. There is time left - we iterate over a depth of 2. There is time left - a depth of 3. And so on. Of course, in this case one has to do some of the work anew, but here he understands the exponential nature of the algorithm. The next iteration takes (roughly) 6 times more time, so we re-do a maximum of 1/6 work. But we always have a move from the previous iteration, which we can use if we have to interrupt the current one. Further optimizations are possible - for example, not to start a new iteration, if we have eaten more than half of the time allotted to think about this move, for all the same we almost certainly do not have time.

(6) Hash tables (transposition tables). In chess, we do not have a game tree, but a graph - very often after rearrangement of moves, we get the same position. There was a simple idea - to take all the memory of a machine that would otherwise not be used, for memorizing the positions already reviewed. For each position, it is necessary to store its assessment (more precisely, the assessment of the subtree under this position), the depth of the search, the best move, and some service information. Now, starting to disassemble the position, you need to take a look - haven't we already seen it? If you have not met, then we act as before. If you met, then we look at how deep we have examined it before. If the same as we need now, or a deeper one, then you can use the old estimate and save time. Если же на меньшую, то мы всё равно можем воспользоваться частью информации, а именно наилучшим ходом. Лучший ход для глубины N может оказаться лучшим и для глубины N+1. То есть, кроме своего изначального предназначения, хэаш-таблица оказывается полезна и для упорядочения ходов. Тут ещё неожиданно помогает итеративное углубление - когда мы начинаем следующую итерацию, хэш-таблица оказывается заполнена информацией от предыдущей, и до какого-то момента (до глубины 1) все позиции просто есть в ней, с наилучшим ходом на глубину N-1. Программа, использующая итеративное углубление и хэш-таблицы, часто выполняет все итерации от 1 до N в несколько раз быстрее, чем если бы она сразу начала итерацию N, т.к. с вероятностью 75% она всегда первым выбирает наилучший ход, а с вероятностью ~90% наилучший ход оказывается в числе первых трёх рассмотренных.

(7) Пустой ход (null move). В шахматах очерёдность хода очень важна, и тот игрок, который мог бы сделать два хода подряд, имел бы огромное преимущество. Возникла очень простая идея - пусть игрок A сделал свой ход и теперь очередь B. У него есть преимущество - например, лишний конь. Надо рассмотреть поддерево на глубину N. Игрок A говорит < я не буду делать ход, пусть B делает два хода подряд, зато поддерево после этого мы посмотрим до глубины N-2 а не N-1, как было бы после моего хода>. Мы выполняем этот перебор на меньшую глубину, и если у A всё равно преимущество, мы решаем <так как на самом деле он будет делать ход, наверное на самом деле его преимущество будет ещё больше>. То есть мы сэкономили, разобрав не настоящее поддерево до глубины N, а другое до глубины N-1 - напоминаю ещё раз, что перебор экспоненциален, так что выигрыш очень значителен. Иногда оказывается, что игрок B, имея два хода подряд, может восстановить баланс - пустой ход не сработал, придётся перебирать всё честно до глубины N. Опять-таки, мы потеряли не очень много времени, т.к. перебирали до глубины N-1. Разумеется, пустой ход, являясь эвристикой, может и наврать - например, в цугцванге право хода лишь мешает. Но в реальной игре цугцванг встречается лишь в эндшпиле, а в эндшпиле пустой ход можно и не использовать. Пустой ход оказался очень серьёзным улучшением, т.к. он позволяет сократить время перебора на глубину N с 6**N до (25-3)**N.

(8) Единственный ответ (singular extension). Стандартная альфа-бета прекращает анализ позиции, как только найден ход, который вызовет отсечение. Так как у нас не полное дерево игры, а мы используем оценочную функцию, может оказаться, что мы проврались, и никакого отсечения нет. Соответственно, мы думали, что в этой позиции у нас лучше, а на самом деле мы теряем ферзя. Идея единственного ответа состоит в том, что найдя один хороший ответ, мы не отсекаем сразу всё поддерево, а продолжаем перебор, пока не найдём второй хороший ход. Если же мы его не нашли, то позиция анализируется заново, на сей раз на глубину N+1. Очень помогает в тактических позициях - программа начинает видеть комбинации на несколько ходов раньше, но резко замедляет перебор и поэтому используется только программами, работающими на суперкомпьютерах.

Современные шахматные программы

Сразу оговорюсь, что про наиболее известную программу (а точнее, машину+программу) - Deep Blue - я напишу в другой раз. Здесь же речь пойдёт о типичных современных программах, работающих в основном на обычных компьютерах.

Итак, как они устроены внутри, мы уже разобрались. Тем не менее, продукты, построенные на одинаковых принципах, могут очень сильно друг от друга отличаться, и шахматные программы здесь не исключение. Большую часть современных программ можно разбить на 3 категории:

Категория Fast searchers - идея состоит в том, что, упростив до предела оценочную функцию, и тщательно с оптимизировав всю программу в целом (что обычно достигается путём написания программы на ассемблере), можно довести количество позиций, рассматриваемых программой (nps - nodes per second) до астрономического числа, например, до 150-200k nps на P/200. То есть на каждую позицию программа тратит порядка одной-двух тысяч машинных команд. В это число входит делание хода из предыдущей позиции в эту, оценка позиции, генерация ходов из данной позиции, управляющая логика и т.д. На собственно оценочную функцию остаются вообще крохи - порядка сотни команд. Соответственно, она проста, как топор - перед началом перебора программа строит таблицы, где на каждой клетке нарисовано, какая фигура стоит хорошо, а какая - плохо. Например, вражеский король стоит на E8 - на клетках непосредственно рядом с ним нашему ферзю припишем +0.5 пешки, чуть подальше +0.2, ещё дальше +0.1. Слабая пешка - на диагонали, откуда мы можем на неё давить, слону припишем +0.15, а ладье (разумеется, на вертикали или диагонали) - +0.2. And so on. Для получения оценки достаточно для каждой фигуры слазить в её таблицу, добавить разницу в материале - и оценка готова.

Проще всего, кстати, не выписывать оценочную функцию явно, а посчитать её один раз перед началом перебора, а потом обновлять - сделали ход, вычли из неё оценку фигуры на старом месте и добавили на новом. Если кого-то сожрали, надо вычесть материальную и позиционную оценку для него.

Программы получаются безумно быстрыми и превосходно ведут себя в сложных тактических позициях, а также отлично решают комбинационные задачи. Разумеется, за всё надо платить. Платой за тактическое умение становится слабая позиционная игра, особенно при нынешних скоростях машин. Длина рассматриваемого варианта может достичь 20 полуходов, и принятые в начале перебора решения оказываются далеки от истины. Например, подтянул я свои фигуры к E8, а вражеский король давно уже убежал и стоит на A3. Или появилась новая слабость в пешечной структуре, про которую программа ничего не знает. And so on.

Соответственно, вторая категория - это так называемые knowledge-based программы. Тут все силы брошены на написание сложной оценочной функции. Учитывается и взаимодействие фигур друг с другом, и прикрытость короля, и контроль позиций, и чуть ли не фаза луны. В терминах nps программа работает в 10-100 раз медленнее, чем fast searches, но играет в хорошие позиционные шахматы. Точнее, хорошими эти шахматы являются только тогда, когда на доске нет глубокой тактики, или же контроль времени таков, что у программы есть достаточно времени, чтобы эту тактику просчитать. А ещё одна стандартная беда - что-то не учтено в оценочной функции. Простейший (хотя и искусственный) пример - предположим, что программа не знает, что отставшая пешка - это плохо. Быстрая программа будет избегать позиций с отставшей пешкой, так как глубоко в дереве увидит *тактические* возможности противника - он подтянет свои фигуры и слопает несчастную пешку, т.е. сумеет перевести позиционное преимущество (о котором программа не знает) в материальное. Медленная же программа без такого знания обречена - что это позиционно плохо она не знает, а глубины перебора для нахождения выигрыша пешки противником не хватит.

Соответственно, большая часть современных программ есть некий гибрид между двумя крайностями. Они работают в 2-5 раза медленнее быстрых программ и имеют уже достаточно сложную оценочную функцию. Тем не менее они регулярно попадают впросак из-за слабой позиционной игры - впрочем, это же делают все программы, даже с самой лучшей оценочной функцией, ибо всегда оказывается, что какой-то фактор создатели программы просмотрели. Утверждается, что имеется очень много факторов, которые в явном виде нигде не записаны, но которые любой мастер (не говоря уж о гроссмейстере) считает самоочевидными.

Вокруг того, как должна быть устроена программа, идут непрекращающиеся споры. Все шахматисты, ровно как и большинство потребителей, очень любят knowledge-based программы. Стало даже модно писать <программа рассматривает всего 10,000 позиций в секунду>. Почему-то из этого делается вывод, что программа обладает немыслимо сложной оценочной функцией, как будто не бывает просто плохо написанных программ.

Программисты любят быстрые программы - хотя бы потому, что можно написать быструю программу, не будучи сильным шахматистом. Эта программа будет играть очень неплохо, во всяком случае, она сумеет обыграть своего автора, равно как и 99% игроков на Земле.

В какую силу играют сейчас лучшие программы на коммерческих машинах (например, на PIII/500)? Точного ответа вам никто не даст, ибо имеется очень мало статистики (гроссмейстеры не любят играть с программами в <серьёзные> шахматы, в основном из-за того, что им за это не платят). Так что ниже есть моя личная точка зрения (хотя я могу подтвердить её некоторой статистикой).

Всё зависит от контроля времени:

  1. В блиц программы играют лучше любого игрока. Позиционная слабость программ более чем компенсируется тактическими ошибками человека, ибо в блице регулярно ошибаются самые лучшие гроссмейстеры.
  2. В активные шахматы (30 минут на партию) программы играют в силу лучших гроссмейстеров. За счёт большего количества времени на обдумывание гроссмейстеры не делают особо грубых тактических ошибок и могут переиграть программу позиционно.
  3. In standard chess (2 hours for 40 moves), the programs are played by a weak grandmaster.

Of course, much depends on the style of the game that is loved by a person. So, Kasparov is especially hard to play with computers - he loves a sharp tactical game, and here computers are especially strong. Replay the computer in tactics is very difficult. But Karpov’s style of play, on the contrary, is hard for a computer - he doesn’t know how to play in a closed position, while Karpov slowly strengthens his position. There is a so-called "anti-computer" style - the idea is that modern programs underestimate the danger of an attack on a king while this attack is beyond their . A good anti-computer player is slowly preparing an attack, dragging figures, and in the meantime gives the program the opportunity to frolic (engage in pawncrest) on the other flank. When the program sees the danger - it is too late, she no longer has time to pull up the queen stuck there (or someone else). Very often, good anti-computer players play badly with people, because a person sees danger earlier.

Deep Blue Story

Probably, there is no person who would not hear about Deep Blue. And I would have heard it exactly, at the level “this is such a program that plays better than Kasparov”. Without any details. The lack of details is related to the marketing policy of IBM. The advertising department there needs an average American to know that "IBM has made an invincible chess computer." He knows it. Therefore, Deep Blue will (probably) never play chess - IBM will not gain anything from this, but there is something to lose (if a person wins).

So, it is possible to judge Deep Blue (more precisely, Deep Blue II - the last car from the family) only by 6 games played in the second match with Kasparov, by rare publications in technical journals, and by lectures that are sometimes arranged by developers. One of these lectures was a year and a half ago at MS, I attended it; it was read by Murray Campbell, who in Deep Blue was responsible for the evaluation function. Very interesting is the article Feng Hsiung Hsu (the author of specialized chess processors) in IEEE Micro, N3 for this year. According to rumors, Joel Benjamin (grandmaster who worked with the team of Deep Blue; champion of the USA) is preparing a book about Deep Blue, but I have not seen it. Interesting information sometimes slips into rec.games.chess.computer, but there is very little signal-to-noise ratio.

Deep Blue is not the first attempt to create a specialized "hardware" for playing chess. We can recall, for example, that in the late 60s - early 70s one of the domestic M-20 machines was slightly modernized, special chess teams were inserted into it (Russia is the birthplace of elephants; what elephants I saw at the zoo yesterday .. but we digress from the topic). Real specialized chess processors appeared in the West, at the end of the 70s - the beginning of the 80s. The most famous developments were made by Berliner - HiTech and Ken Thompson (one of the authors of Unix) - Belle. In these machines, a hardware progress generator, an evaluation function, and an alpha-beta brute force are implemented. Developments were academic, and despite the fact that specialized hardware is usually an order of magnitude faster than a comparable general purpose, there was no comparability of speech. The Cray Blitz program on the ultramodern then Cray machine (worth $ 60 million) still played better.

Deep Blue began as a student development - it was interesting for a group of capable students to try their hand, and the topic for a diploma is excellent. The progress of the technology allowed the first version of the processors (called ChipTest) to be made very fast. A follow-up version, called Deep Thought, followed. At this point, the group was noticed by the marketing department of the IBM firm and addressed it with an offer that could not be refused. The result was Deep Blue and Deep Blue II. Thus, Deep Blue II is the result of more than 10 years of work of a very capable group, in which there were both programmers / hardware workers and strong grandmasters (except for the already mentioned Benjamin other grandmasters were involved, with good pay). The whole work was funded by IBM, so the group had resources that were not dreamed of by academic organizations (“God is on the side of the big battalions”).

How does the Deep Blue

Deep Blue II is made on the basis of IBM's powerful RS / 6000 server. The server has 31 regular processors; one declared the main, obey 30 others. 16 specialized chess processors are connected to each “working” processor, so there are 480 chess processors in total. They are the “heart” of the system, thus the statement of IBM “We are selling an RS / 6000 server that beat Kasparov to chess”, how to say this ... not quite right.

The first few half-moves (4) get over on the “main” RS / 6000 processor. After 4 half-moves, he transfers the work to the remaining 30 processors, and they consider the next 4 half-moves. The program here is written in ordinary C, and is not very different from traditional parallel chess programs. The most significant difference is that when writing a program it was known that the system has tremendous speed (since specialized chess processors perform the main work in it), respectively, it is possible not to save much. Let me remind you that when writing a chess program there is always a great desire to consider some options deeper (“go deeper”), but you always have to remember that if we go deeper here, then there will be less time to consider other options. Accordingly, it is usually necessary to find a compromise between a deeper consideration of “tactical” options (in which only classical recesses work) and “silent” options (which from a human point of view can also be tactical, but only the moves do not fall under the too narrow machine definition of tactical ). Here you can also consider longer options, not caring that others will not have time - there will be enough resources for everyone. So the formal 4 half moves here can easily turn into 30 half moves.

After that, the work is intercepted by specialized processors, and they do noticeably more than 99% of the entire work of the system - namely, they go over another 4 half-runs. Due to the fact that they do not have a program as such, and everything that needs to be done is hardwired in hardware (not firmware!), When designing them, it was necessary to seriously think about what to do and what they do not. In particular, they provide almost no recess. But they provide a rather complicated evaluation function - after all, what is hard to program is often very easy to do at the hardware level. For example, a check detector works one clock, because all possible options are checked simultaneously. The control of X-rays is very simple (I don’t know how it will be in Russian - this is when one long-range figure stands behind the other), forks, overloading of figures, etc.

In general, the evaluation function in the Deep Blue chess processor is very complex - it is argued that it corresponds to approximately 40,000 commands of a conventional processor. In particular, there are about 8,000 coefficients. These coefficients are not stitched tightly, and loaded into the processor after switching on. Thus, it was possible to tune the processor after its creation, and not to run to the factory for each change. How the optimization problem was solved in the space of dimension 8000 is a separate story (and also published).

Chess processors were made using outdated 0.6 micron technology and operate at a speed of 2-2.5 million positions per second. Thus, the entire monster processes approximately one billion chess positions per second. Please note that the general-purpose system for this would require a speed of 40 trillion operations per second.

(The processors were seriously reworked after the first match with Kasparov, when it became clear that the previous version lacks many factors that are necessary for games at this level; most of the group did not believe that this could be done in less than a year, but Hsu did wonders and managed in time)

Of course, efficiency parallel brute force is always lower than 100% (it has already been written before that alpha-beta is parallelized very hard). In this case, an efficiency of about 25% was achieved, which in itself is a very great achievement. That is, the system works as a single-processor, non-vector general-purpose machine at a speed of 10 trillion operations per second. There are no such machines and will not appear in the coming years (or maybe it will never appear, because technology will rest on the size of the atom and quantum effects).
created: 2014-09-23
updated: 2023-05-19
132518



Rating 9 of 10. count vote: 2
Are you satisfied?:



Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Computer games developming

Terms: Computer games developming