Meta reasoning

Lecture



Since computing optimal decisions in games is usually intractable, every algorithm must rely on assumptions and approximations. The standard approach, based on minimax values, evaluation functions and alpha-beta pruning, is only one way of doing this. Probably because it was proposed so long ago, the standard approach has been developed intensively and has come to dominate other competing methods.

Some experts in the field believe that this state of affairs has divorced game playing from the mainstream of artificial intelligence research, because the standard approach no longer offers much room for new insight into general questions of decision making. This section describes some alternatives.

First, consider minimax search. Minimax algorithms select the optimal move in a given search tree, provided that the evaluations of the leaf nodes are exactly correct. In reality, evaluations are usually rough estimates of the value of a position, so they can be assumed to carry significant errors.

The figure shows a game tree with two half-moves for which minimax search seems inappropriate. Minimax suggests choosing the right branch, whereas it is quite likely that the true value of the left branch is higher. The minimax choice rests on the assumption that the nodes labeled 100, 101, 102 and 100 really are better than the node labeled 99. However, the fact that the node labeled 99 has sibling nodes valued at 1000 suggests that its own true value may in fact be higher.
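To make the example concrete, here is a minimal Python sketch that applies plain minimax to the two branches. The exact leaf layout is an assumption made for illustration; only the values 99, 1000 and 100, 101, 102, 100 come from the figure.

def minimax_value(node, is_max):
    # A leaf holds an evaluation estimate; an internal node is a list of children.
    if isinstance(node, (int, float)):
        return node
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

left_branch = [99, 1000, 1000, 1000]    # one low estimate among very high ones (assumed layout)
right_branch = [100, 101, 102, 100]     # uniformly mediocre estimates

branches = {"left": left_branch, "right": right_branch}
choice = max(branches, key=lambda name: minimax_value(branches[name], is_max=False))
print(choice)   # -> 'right': minimax takes the 99 estimate at face value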

One way to deal with this problem is to use an evaluation that returns a probability distribution over possible values. Standard statistical methods can then be used to compute a probability distribution for the value of the parent node. Unfortunately, the values of sibling nodes are usually highly correlated, so such a calculation can be expensive and requires correlation information that is hard to obtain.
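As an illustration of the basic idea (not of the expensive correlated case), the following sketch backs up discrete value distributions through a MIN node under an independence assumption between siblings, which is exactly the assumption the previous paragraph warns is usually violated. The leaf distributions here are hypothetical.

from itertools import product

def min_distribution(children):
    # Distribution of min(child values), assuming the children are independent.
    dist = {}
    for combo in product(*(c.items() for c in children)):
        value = min(v for v, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        dist[value] = dist.get(value, 0.0) + prob
    return dist

leaf_a = {99: 0.5, 500: 0.5}    # the "99" leaf might really be worth much more
leaf_b = {1000: 1.0}
print(min_distribution([leaf_a, leaf_b]))   # {99: 0.5, 500: 0.5}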

Next, consider the search algorithm that generates the tree in the first place. The algorithm designer's goal is a computation that runs quickly and produces a good move. The most obvious problem with the alpha-beta algorithm is that it is designed not just to select a good move, but also to compute bounds on the values of all legal moves. To see why this extra information is often unnecessary, consider a position in which there is only one legal move.

Figure. A game tree with two half-moves for which minimax search may be unsuitable.

Alpha-beta search will still generate and evaluate a large, entirely useless search tree. Of course, a check for this situation could be added to the algorithm, but that would merely hide the underlying problem: many of the computations that alpha-beta performs contribute little to the decision. A position with only one legal move is not much different from a position with several legal moves, one of which is clearly right and the rest of which are obviously catastrophic.
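The kind of special-case check mentioned above might look like the following sketch, in which legal_moves and alpha_beta_search are hypothetical helpers; as the paragraph notes, such a patch covers only the degenerate case and does nothing for near-degenerate ones.

def choose_move(state, legal_moves, alpha_beta_search):
    moves = legal_moves(state)
    if len(moves) == 1:
        return moves[0]          # forced move: any search here is wasted work
    return alpha_beta_search(state, moves)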

In such a "clear favorite" situation, it would be better to reach a decision quickly after a small amount of search than to waste time that could be used more productively later in a more problematic position. This leads to the idea of the utility of a node expansion.

A good search algorithm should choose node expansions of high utility, that is, expansions that are likely to lead to the discovery of a significantly better move. If no node expansion has a utility that exceeds its cost (measured in time), the algorithm should stop searching and make a move. Note that this improvement applies not only to clear-favorite situations, but also to cases with symmetric moves, for which no amount of search will show that one move is better than another.
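A minimal sketch of this expansion-control loop is shown below; frontier, expected_gain, expansion_cost, expand and best_move are all assumed, hypothetical interfaces standing in for the estimates a real implementation would need.

def metareasoning_search(frontier, expected_gain, expansion_cost, expand, best_move):
    while frontier:
        node = max(frontier, key=expected_gain)
        # Stop as soon as the most promising expansion no longer pays for itself.
        if expected_gain(node) <= expansion_cost(node):
            break
        frontier.remove(node)
        frontier.extend(expand(node))   # grow the tree only where it matters
    return best_move()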

Reasoning about which computations to perform and which to skip is called metareasoning (reasoning about reasoning). It applies not only to game playing but to reasoning of any kind. All computations are carried out in the service of better decisions; all have costs, and all have some likelihood of producing a particular improvement in decision quality. Alpha-beta search implements the simplest form of metareasoning, namely a theorem stating that certain branches of the tree can be ignored without affecting the final result. Metareasoning can be done much better than that.

Finally, consider once more the nature of search itself. Algorithms for heuristic search and game playing work by generating sequences of concrete states, starting from the initial state, and then applying an evaluation function. People, of course, play games differently. In chess a player is often guided by a particular goal (for example, trapping the opponent's queen) and can use that goal to selectively generate plausible plans for achieving it. This kind of goal-directed reasoning, or planning, can sometimes eliminate combinatorial search altogether.

David Wilkins's Paradise program is the only program to have used goal-directed reasoning successfully in chess: it was able to solve some chess problems requiring combinations of 18 moves. Nevertheless, there is still no good understanding of how to combine the two kinds of algorithm into a robust and efficient system, although the Bridge Baron program can be regarded as a step in the right direction.

A fully integrated system would be a significant achievement not only for game-playing research but for artificial intelligence research as a whole, because it would provide a good basis for a general-purpose intelligent agent.

