Artificial intelligence

Lecture



The key to solving problems with the help of artificial intelligence lies in reducing the number of options that must be examined in the search for a solution. To do this, programs must implement the same principles that people use in the process of thinking.

Suppose that one summer evening, on your way to San Francisco, you find yourself at a crossroads in Nebraska. The road you are driving on continues straight ahead. The highway going off to the left loses itself in endless corn fields; shielding your eyes from the sun, you see the same picture on the right. Since you do not have a map with you, you decide at random to go left. Soon you arrive at another intersection, then another, and you have to make a whole series of random choices. Eventually you reach a dead end and have to go back to the last intersection and try another way. If you are long-lived, and very lucky, you will someday get to San Francisco, but your chances of that are very small: about one in 10^30. You know something about the world you live in, however, and you do not have to choose your way at random through the maze of roads: at the first intersection you turn right.

Most problems, including ones far more interesting than this one, can be represented in the same form: as a search for a path leading from some initial state to the desired final state. What most interesting problems have in common is that they are too complex to be solved by random search, since the number of options grows exponentially as you move further from the first intersection, or the point of the first decision. A classic example in this sense is the game of chess, where the number of possible positions on the board is estimated at 10^120. A good player, however, reduces the task of choosing the next move to an acceptable size and considers only about 100 positions corresponding to the most promising continuations of the game. Here, I think, is where the whole essence of intelligence in problem solving shows itself: the ability to cope with problems that are intractable "head-on" by shortening the search.
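
To get a feel for how quickly the options multiply, here is a minimal sketch; the branching factor and depth are assumed for illustration and are not taken from the text:

```python
# Hypothetical illustration: with b choices at each of d successive decision
# points, a blind search faces b**d complete paths.
b, d = 3, 20                      # assumed values: 3-way forks, 20 forks deep
print(f"{b**d:,} paths")          # 3,486,784,401 -- billions after only 20 forks
```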

For about 30 years a relatively small group of researchers has been trying, sometimes with some success and sometimes with less, to create programs that allow computers to solve problems "intelligently". In the mid-1970s, after two decades of slow and barely noticeable progress in this new field of artificial intelligence, the researchers reached the following fundamental conclusion about rational behavior in general: it requires an enormous amount of knowledge, which people possess as a matter of course and which must be gradually "fed" into the machine. Facing the crossroads in Nebraska, the traveler remembers that San Francisco lies to the west of Nebraska, that the evening sun sets in the west and that, by moving in the direction of the setting sun, he will be on the right path. Thus he does not need to check where the other two roads lead.

The simplicity of the decision in this problem is not typical of the many everyday tasks that people solve instantly, without thinking. Understanding even the simplest fragments of the English language, for example, requires knowledge of the context, some familiarity with the speaker and, finally, knowledge of the world in general that far exceeds the capabilities of modern computer programs. The central role of knowledge in rational behavior explains why the greatest success so far has been achieved by programs that act as "experts" in narrowly specialized fields, such as the diagnosis of meningitis, and by some game-playing programs. In contrast, the early attempts to create "universal problem-solving systems" assumed that the main component of intelligence is the ability to reason. These attempts were less fruitful, and most of them have now been abandoned.

When solving complex problems, people use various methods or, figuratively speaking, sources of power that can shorten the search for a solution on the basis of knowledge of certain regularities. They can use mathematical theorems or less formal rules based on acquired experience, break a difficult problem into easier subproblems, or reason by analogy with problems that were solved earlier. The properties of intelligence already exhibited by computer programs are explained by the use of these same principles or sources. The task of artificial intelligence is to learn to use effectively those sources of power that it so far exploits only weakly.

Many programs created during the first two decades of research in artificial intelligence were based largely on methods of formal logical reasoning. When a task is defined in a sufficiently limited area, such methods can provide powerful tools for pruning the search tree or even eliminate the search procedure entirely. For example, there is no need to search anew each time for a way to trisect an angle or for the best method of computing the trajectory of a body in a field, since theorems have already been proved and algorithms developed that settle these problems once and for all.

One of the most popular methods of formal reasoning was and remains logical inference based on the proof technique called resolution, which works by refuting a negation (proof "by contradiction"). In order to apply the resolution method, you must first express the statement to be proved within the framework of a logical formalism called the predicate calculus. The statement is then negated, and its negation is "resolved" together with a set of axioms: statements that are obviously true in the particular area or situation under consideration. If combining the negation of the statement with the axioms leads to a contradiction, the negation must be false and therefore the original statement is true.
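
As a rough illustration of proof by refutation, here is a minimal propositional resolution sketch; the clauses and helper names are hypothetical and chosen only to show the mechanics, whereas real resolution provers work on the full predicate calculus:

```python
# Literals are strings; a leading "~" marks negation. Clauses are frozensets.

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def refutes(axioms, goal):
    """True if the goal follows: derive the empty clause from axioms + not-goal."""
    clauses = set(axioms) | {frozenset([negate(goal)])}
    while True:
        new = set()
        for a in list(clauses):
            for b in list(clauses):
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:          # empty clause: contradiction found
                        return True
                    new.add(r)
        if new <= clauses:             # nothing new: no refutation found
            return False
        clauses |= new

# "P" and "P implies Q" (written ~P v Q) together entail "Q".
axioms = [frozenset(["P"]), frozenset(["~P", "Q"])]
print(refutes(axioms, "Q"))            # True
```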

In 1964 J. A. Robinson proved that the resolution method has the property of "completeness": if the initial statement is true, then sooner or later the method will lead to a contradiction. (If the initial statement is false, there is no guarantee that the inference process based on resolution will ever terminate.) Robinson's result marked the beginning of active research into the application of resolution and related methods to the automatic proving of theorems. The resolution method has one major drawback, however: it is subject to the phenomenon of "combinatorial explosion", in which the number of resolutions carried out by the program grows exponentially with the complexity of the problem. Programs that successfully apply the resolution method to small sample tasks, as a rule, cannot cope with more interesting real-world problems, whose scale is much larger.

Programs based on another logical method, called structural induction, run into the same difficulty. Such programs receive as input a large amount of data about objects belonging to the problem area and on this basis must build a decision tree that distinguishes among the objects. The trouble with structural-induction algorithms, however, is that the data contain no information that makes it possible to decide which variables are important and which are not, or that indicates what to do with data containing "noise" or with exceptional cases. If the number of objects and of properties associated with them is large, the tree generated by the program becomes so cumbersome that it is almost impossible to use.
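
The basic step of such an induction program can be sketched as follows; this is a toy example with made-up attributes, whereas real systems must handle many attributes, noise and exceptions, which is exactly where the trouble described above begins:

```python
from collections import Counter

# Toy training data: each object is a dict of attributes plus a class label.
examples = [
    ({"size": "small", "color": "red"},   "toy"),
    ({"size": "small", "color": "blue"},  "toy"),
    ({"size": "large", "color": "red"},   "tool"),
    ({"size": "large", "color": "blue"},  "tool"),
]

def best_attribute(examples, attributes):
    """Pick the attribute whose values split the examples most purely
    (measured naively here by the size of the majority class in each split)."""
    def purity(attr):
        score = 0
        for value in {feat[attr] for feat, _ in examples}:
            labels = [lab for feat, lab in examples if feat[attr] == value]
            score += Counter(labels).most_common(1)[0][1]
        return score
    return max(attributes, key=purity)

print(best_attribute(examples, ["size", "color"]))   # -> "size"
```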

In order to create an effective program based only on some method of formal logical reasoning, the problem must be sufficiently small. One promising application of formal methods appears to be the modeling of qualitative physical reasoning. J. S. Brown and J. de Kleer of the Xerox Palo Alto Research Center have developed a program that simulates the processes in a pressure-regulating valve by means of qualitative equations. If, for example, the program is told that the pressure on the left side of the valve has increased, the equations change accordingly and the program predicts the change in pressure on the other side of the valve; in the end the system reaches a state of equilibrium. This is, of course, a very simple example, but a similar approach is used in the analysis and design of electrical circuits.
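
A heavily simplified sketch of qualitative reasoning of this kind follows; it uses a purely illustrative sign algebra and an assumed toy model of the valve, and should not be read as the actual de Kleer-Brown formalism, which is far richer:

```python
# Qualitative values are just signs: -1 (decreasing), 0 (steady), +1 (increasing).
def qualitative_sum(*signs):
    """Sign of a sum: ambiguous (None) if opposite influences compete."""
    if +1 in signs and -1 in signs:
        return None          # cannot be decided qualitatively
    if +1 in signs:
        return +1
    if -1 in signs:
        return -1
    return 0

# Assumed toy model: the pressure difference across the valve rises if the
# inlet pressure rises or the outlet pressure falls.
d_inlet, d_outlet = +1, 0            # "pressure on the left side has increased"
d_difference = qualitative_sum(d_inlet, -d_outlet)
d_flow = d_difference                # assumed: flow follows the pressure difference
print(d_difference, d_flow)          # +1 +1: the program predicts increased flow
```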

The tasks of greatest interest, however, cannot be solved by methods of formal logical reasoning alone. The power of logical methods lies in the fact that they make it possible to represent objects and the relations between them as symbols, which can easily be manipulated by well-studied techniques (such as the resolution method), thereby carrying out logical reasoning. The weakness of logical methods is the same as their strength: many kinds of knowledge, including the inexact and incomplete knowledge so characteristic of most real-world tasks, cannot be represented within strict logical formalisms. Programs based solely on logic reflect only one component of the understanding that helps an intelligent being trying to solve a difficult problem.

Today there are dozens of large programs that solve difficult problems in various fields, such as medical diagnosis, the planning of genetic experiments, geological exploration and automatic design. The main tool in these expert systems is informal reasoning based on broad knowledge carefully collected from human experts. In most of these programs the knowledge is encoded in the form of hundreds of empirical "if-then" rules. Such rules are called heuristics. The rules limit the search by directing the program's attention to the most likely solutions. Moreover, and this is what distinguishes programs guided by heuristics from those based on formal methods, these expert systems can explain the course of their reasoning in a form understandable to people.

Such explanations are possible because the decisions made by the program rest on rules obtained from human experts rather than on abstract rules of formal logic.

Consider the MYCIN program, developed by E. Shortliffe at Stanford University for diagnosing bacterial infections of the blood. The problem solved by this system is to determine which of the possible microorganisms caused the disease and to recommend a course of treatment based on the diagnosis. The MYCIN system has a knowledge base consisting of 500 heuristic rules. Here is a typical example: "If (1) the stain of the organism is gram-positive, and (2) the morphology of the organism is coccus, and (3) the organism grows in clumps, then there is suggestive evidence (0.7) that the organism is staphylococcus."
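
Such a rule can be represented quite directly as data. Here is a minimal sketch with hypothetical field names; it is not the actual MYCIN encoding:

```python
# One heuristic rule: if all premises hold, conclude the hypothesis
# with the stated certainty factor.
rule = {
    "if": [("stain", "gram-positive"),
           ("morphology", "coccus"),
           ("growth", "clumps")],
    "then": ("organism", "staphylococcus"),
    "cf": 0.7,
}

def apply_rule(rule, findings):
    """Fire the rule if every premise is among the known findings."""
    if all(premise in findings for premise in rule["if"]):
        return rule["then"], rule["cf"]
    return None

findings = {("stain", "gram-positive"), ("morphology", "coccus"), ("growth", "clumps")}
print(apply_rule(rule, findings))    # (('organism', 'staphylococcus'), 0.7)
```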

In the course of execution the program conducts a dialogue with the user, requesting additional information about the patient that will allow particular rules to be applied; sometimes the program suggests laboratory tests. At any time the user can ask the MYCIN system to clarify a question or a conclusion. The program explains its reasoning by citing the specific rules it has used. In tests the MYCIN system proved to be no worse than medical specialists.

In addition to heuristic rules, expert systems draw power from other sources, such as considerations dictated simply by common sense, which people rarely even think about. In many programs, for example, the search is narrowed by focusing on more or less specific goals. The MYCIN system again provides a good example. Given initial general information about the patient, the system reasons backward from the ultimate goal: identifying the pathogenic organism. It asks questions that can reveal specific symptoms supporting the hypothesis of a particular diagnosis.

Having established, say, that the stain of the organism is gram-positive, MYCIN, without conducting any other search, will inquire about the morphology of the organism, which will make it possible to decide whether the bacteria are staphylococci. Such a focus on the goal does not mean that the sequence of decisions is simply hard-wired into the program, as is sometimes claimed by those who do not believe that expert systems actually exhibit any properties of intelligence. Programs like the MYCIN system do in fact adapt to situations not foreseen by the programmer, who cannot know in advance exactly how the program will use its knowledge.
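
Goal-directed use of such rules amounts to backward chaining: to establish a goal, look for rules that conclude it and try to establish their premises, asking the user only when no rule applies. A minimal sketch under those assumptions, with made-up rules and a made-up question mechanism, might look like this:

```python
rules = [
    {"if": ["gram-positive", "coccus", "grows-in-clumps"], "then": "staphylococcus"},
    {"if": ["staphylococcus"], "then": "treat-with-antibiotic-X"},   # invented rule
]

def prove(goal, known, ask):
    """Backward chaining: try rules that conclude the goal, else ask the user."""
    if goal in known:
        return True
    for rule in rules:
        if rule["then"] == goal and all(prove(p, known, ask) for p in rule["if"]):
            known.add(goal)
            return True
    if ask(goal):                      # no rule concludes it: query the user
        known.add(goal)
        return True
    return False

answers = {"gram-positive": True, "coccus": True, "grows-in-clumps": True}
print(prove("treat-with-antibiotic-X", set(), lambda g: answers.get(g, False)))  # True
```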

Another effective strategy used by intelligent beings, humans (software developers among them), is to break a complex task down into simpler subtasks: divide and conquer, as they say. A team of researchers at Carnegie Mellon University has created four programs, each controlled by heuristics, designed to "rediscover" well-known physical and chemical laws. These programs allow the researchers to understand more deeply, and to mechanize, the main aspects of the formation of scientific theories. In the end the researchers want to combine all the solutions obtained within a single model.

In a relatively narrow sense the divide-and-conquer strategy is implicitly built into the programs themselves. Programs oriented toward a particular goal decompose the search into more or less independent subgoals (represented by the nodes of the search tree). At a higher level, programs guided by heuristics must distinguish between the task and the meta-task. The difficulty here is to decide which of the hundreds of different rules should fire at any given moment. The meta-problem is solved separately, often by a complex process (sometimes requiring heuristics of its own) of matching the current state against the conditions contained in the "if" parts of the if-then rules.

Formal methods, even if they cannot serve as the main means of reasoning, can play a useful role in the control of expert systems. In some systems, for example, logical or statistical procedures are applied to such questions as whether the cost of further search is justified. Moreover, since the heuristic "if-then" rules of expert systems do not, generally speaking, express relations that are always true, each rule can be assigned a value characterizing the degree of confidence in its validity (the "0.7" in the example given above in describing the MYCIN system). These estimates, attached at each step of the chain of inference, are combined to characterize the credibility of the final conclusion. This is done by means of Bayes' formula or some other formal procedure from probability theory.
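
For instance, in MYCIN-style certainty factors two independent pieces of positive evidence for the same conclusion are commonly combined as cf1 + cf2*(1 - cf1); the sketch below shows only that combination rule, and the details of any particular system may differ:

```python
def combine(cf1, cf2):
    """Combine two positive certainty factors for the same conclusion."""
    return cf1 + cf2 * (1 - cf1)

print(combine(0.7, 0.5))   # 0.85: more confident than either piece of evidence alone
```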

Each rule used by an expert system may be simple in itself. Sometimes the rules are poorly organized, or not organized at all. Yet the system of rules as a whole is capable of solving technically complex tasks, demonstrating the competence of an expert. A kind of synergy is observed here: the whole turns out to be greater than the simple sum of its parts. The phenomenon is common enough to be taken for granted, yet in almost all expert systems it is one of the main sources of power.

One of the most successful programs with the properties of artificial intelligence was also, historically, the first expert system: the DENDRAL program, created by E. Feigenbaum and his colleagues at Stanford University in the late 1960s. Like its successor, the GENOA system, it is widely used in organic chemistry laboratories in many countries. The DENDRAL system establishes the structure of organic molecules on the basis of mass-spectrometry data, nuclear magnetic resonance and other kinds of information. Like MYCIN, the DENDRAL system is essentially diagnostic.

A fundamentally different kind of expert system is one that is able to discover new knowledge or, starting from fundamental principles, to arrive at knowledge that is already known. An example of a system of this kind is the EURISKO program, developed by the author together with students at Stanford University. Having supplied the program with a relatively small amount of information about the domains in which it was to work, we "launched" it into such dissimilar areas as set theory, war games, computer programming and the neutralization of chemical pollution.

In the course of its search, which consists in synthesizing, analyzing and evaluating new concepts, EURISKO is guided by hundreds of rather general heuristics. One of them, for example, is to "consider extreme cases". When the program "pondered" the function "divisors" in set theory, this heuristic led it to consider only those numbers that have few divisors. In this way the system rediscovered the prime numbers, i.e. numbers that have only two divisors, and also established the fact that any number can be uniquely decomposed into factors that are prime numbers.
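
The step from the "consider extreme cases" heuristic to the primes can be sketched directly; this is an illustrative reconstruction, not EURISKO's actual representation:

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

# Extreme case of "numbers with few divisors": exactly two divisors.
few_divisor_numbers = [n for n in range(1, 30) if len(divisors(n)) == 2]
print(few_divisor_numbers)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29] -- the primes
```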

This same simple heuristic proved invaluable when EURISKO and I took up the war game Traveller, the object of which is to assemble an optimal fleet to fight the fleets of opponents according to numerous strict rules. Having familiarized itself with the rules of the game, EURISKO put together a fleet consisting almost entirely of small, fast attack ships, similar to torpedo boats. The fleet also included one ship so fast and so small that it was practically impossible to hit. People devoted to the game ridiculed EURISKO's strategy and fielded against it fleets with a more traditional, balanced mix of ship sizes. The program won.

Another heuristic widely used in the system is "coalesce". It advises the program to consider what happens to a function of two variables x and y when they are given the same value. After EURISKO had derived the functions of addition and multiplication on the basis of set theory, the "coalesce" rule helped it discover the functions of doubling (x plus x) and squaring (x times x). Applying the "coalesce" heuristic to the game Traveller, the EURISKO system worked out a new strategy: a ship damaged by the enemy should be blown up and sunk. Since the rules of the game define the overall mobility of a fleet by its slowest ship, this strategy helped to raise the fighting efficiency of the whole fleet. Finally, while studying programming, EURISKO analyzed the function "x calls y", where x is a program section that activates another section y. The "coalesce" heuristic led EURISKO to define an important concept: recursive calls, in which a program can call (activate) itself.
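
The "coalesce" heuristic itself is easy to sketch; the function names below are invented for illustration only:

```python
def coalesce(f):
    """Turn a function of two arguments into a function of one by setting x = y."""
    return lambda x: f(x, x)

add = lambda x, y: x + y
mul = lambda x, y: x * y

double = coalesce(add)        # x + x
square = coalesce(mul)        # x * x
print(double(7), square(7))   # 14 49
```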

"Сблизить" и "рассмотреть экстремальные случаи" - примеры эвристик, помогающих программе - исследователю определять новые понятия. Однако, если цель этой программы в том, чтобы находить действительно интересные идеи, она должна также располагать эвристиками второго типа, помогающими ей решить, какое из многочисленных порождаемых ею понятий представляет больший интерес. Правила синтеза понятий направляют программу на начальном этапе поиска, оценивающие эвристики направляют поиск вдоль наиболее перспективных ветвей. СистемаEURISKO, например, содержит такие правила, как "если все элементы множества неожиданно удовлетворяют какому-то редкому свойству, то следует повысить оценку "интересности" этого множества и эвристики, которая привела к его определению".

Another rule provides the program with a criterion for deciding which of two very similar concepts should be studied first: choose the concept that requires less machine time and fewer questions to the user.

From the use of heuristics for the discovery (or "rediscovery") of new concepts or facts it is, in theory, a short step to the generation of new heuristics on the basis of old ones. This step is related to what has long been considered a major goal of research in artificial intelligence: the creation of programs that can learn from the experience they acquire as they run. In recent years a number of researchers have indeed managed to develop programs that generate general rules on the basis of the experience gained in solving particular problems. The process of generalization is controlled by metaheuristics.

The success of the DENDRAL system prompted its authors to create a new program, META-DENDRAL, which derives general rules for mass spectrometry by observing how individual compounds break into fragments in a mass spectrometer. An example of a metaheuristic in this case is the simple assertion that the most important properties of a molecule in determining its fragmentation are those observed near the points where bonds break. Applying this heuristic, META-DENDRAL can derive a rule according to which organic molecules tend to break apart at positions where carbon and oxygen atoms are joined by single bonds.

A heuristic obtained in this way can prove useful in determining the structure of molecules from their spectra. Similar principles guided T. Mitchell and P. Utgoff of Rutgers University, who wrote a program called LEX2 that derives heuristic rules for solving problems in integral calculus on the basis of the experience it gains in evaluating individual integrals.

The prospect of creating more "intelligent" learning programs depends to a large extent on whether ways can be found to use a source of power that is crucial for human intelligence: the ability to understand and reason by analogy. If one reflects carefully on how people reason, it is easy to conclude that in explaining, understanding or creating new concepts they constantly make use of analogies. In artificial intelligence programs the use of this source of power is only beginning, but it will undoubtedly be a focus of attention in future research.

I do not mean to say that no progress at all has been made in this area. Twenty years ago T. Evans of the Massachusetts Institute of Technology wrote a program capable of grasping analogies between geometric figures; abilities of this kind are tested in human intelligence tests. Teaching a program to find conceptual analogies is a harder problem, and a number of researchers are now working on it. J. Carbonell of Carnegie Mellon University has a program that grasps the similarity between algorithms written in different programming languages but serving the same purpose. The EURISKO program, on the other hand, does not so much seek out analogies as make use of reasoning by analogy, and at a rather low level at that.

While working on the design of electronic integrated circuits, for example, EURISKO came upon the fact that symmetry is a desirable property for them, although exactly why remained unclear to the program. When it was later given the task of assembling a fleet for the war game Traveller, EURISKO decided to make the fleet symmetrical, justifying its decision by reference to the experience it had gained in designing integrated circuits.

Compared with human abilities, however, these successes look extremely modest. The weakness computer programs show in finding and using analogies is explained by the narrowness of their knowledge bases rather than by any inability of researchers to devise suitable algorithms. In their reasoning people draw on an enormous stock of concepts from which they can take all kinds of analogues. Each person probably has about a million distinct memories of individual objects, actions, feelings, situations and so on. Such an abundant store of knowledge cannot be built into today's programs, nor can programs accumulate such vast experience in the course of their operation.

Programs that run for a long time do not, when restarted, retain good enough records of the results of their past searches; when they stop, most of the lessons they have learned are lost. Even the EURISKO program, which runs continuously for several weeks at a time and preserves most of its records across stops and restarts, has had only a very short total span of "intelligent" life; its experience is less varied even than that of a small child.

In the light of the above, the recipe for improving programs' ability to reason by analogy, and for general progress in artificial intelligence programs, is to expand the knowledge base. Ideally one could record an entire encyclopedia in machine-readable form, not as text but as a system of structured, multiply indexed fragments. Preliminary work in this direction by several researchers has shown that the task is much harder than it seemed: understanding encyclopedia articles already requires an extensive stock of "common sense" knowledge that today's programs do not possess.

On the one hand, computer programs must have far more knowledge before they can reason effectively by analogy. On the other hand, to be able to acquire knowledge in such enormous quantities, computers presumably must at least "understand" the analogies presented to them, a factor of exceptional importance in the human capacity for learning. Thus we arrive at a "chicken and egg" problem. Fortunately, for a computer as for a person, it is easier to understand an analogy that is presented to it than to find one on its own, and current research gives some grounds for hoping that the dilemma just described will not turn out to be an insurmountable obstacle.

The key to machine understanding of analogy lies in representing information about the objects being compared in a convenient form, for example as data structures called frames, consisting of sets of slots, each of which contains the value of some attribute (property) of the object. When the machine is told that there is an analogy between two objects ("Fred is like a bear"), the empty slots of one frame are filled with values taken from the corresponding slots of the other. The hardest thing for an "ignorant" program, of course, is to decide which property values can be transferred and which cannot. (Which attributes of Fred are like the attributes of a bear?) Such decisions can be guided in the program by special heuristics. Here, for example, the heuristic "consider extreme cases" again turns out to be useful: if an analogy is apt, it is more often because some unusual properties of one object are also characteristic of the other.
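
A minimal frame sketch along these lines; the objects, slots and transfer rule are invented for illustration:

```python
# Frames are just slot/value mappings here.
bear = {"covering": "fur", "diet": "omnivore", "temperament": "grumpy", "legs": 4}
fred = {"legs": 2}                                   # mostly empty frame for Fred

def apply_analogy(target, source, transferable):
    """Copy values for selected slots from the source frame into empty slots
    of the target frame ("Fred is like a bear")."""
    for slot in transferable:
        if slot not in target and slot in source:
            target[slot] = source[slot]
    return target

# A heuristic (not the only possible one) has decided which slots may transfer.
print(apply_analogy(fred, bear, ["temperament", "covering"]))
# {'legs': 2, 'temperament': 'grumpy', 'covering': 'fur'}
```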

The use of frames to automate the understanding of analogies illustrates a fact of a general nature, namely that the way knowledge is represented can itself be a source of power in artificial intelligence systems. Items of knowledge can be represented in a program in many ways, and I do not intend to consider them all in this article. The point is simply that each method of representing knowledge is effective for some operations and ineffective for others. If, for example, each attribute of each object were represented in a program's knowledge base as a separate sentence of formal logic, then drawing analogies would involve a very long and awkward search. Choosing the right representation for a particular task makes it possible to reduce or eliminate the search.

People, on the other hand, are not bound once and for all by such a choice; we are able to switch repeatedly among several forms of knowledge representation - words, symbols, images - and, in the course of solving a problem, to consider it from different points of view. In programs this flexibility is difficult to model. In 1962 H. Gelernter created a program that solves school problems in plane geometry (planimetry). Each problem had a dual representation: axiomatic and in the form of a diagram. The logical representation allowed the program to construct formal proofs. The diagrams, on the other hand, helped in choosing methods of proof and made it possible for the program to check its conjectures.

It could, for example, recognize that lines were parallel or that two angles were equal or complementary. And although coincidences of this kind might simply be a consequence of how a particular diagram had been drawn, the probability that they were accidental was very small, and the method of multiple representation limited the search very effectively.

Unfortunately Gelernter's "prophetic" ideas about multiple representations have not yet been applied in other areas, although recently several researchers have taken up the classification of forms of representation and have begun working on methods that allow a program to translate information from one form of representation into another. It should be noted that the diagrams in Gelernter's program were effective not only because they provided another form of representation but also because they were analog: their elements corresponded to real objects, and the distances between those elements corresponded to real distances, as on a road map. This is something a logical representation cannot provide, and a number of specialists are now looking for ways to exploit the potentially powerful means of analog representation.

One of the directions of this research deserves a closer look. The "blackboard" is not a way of representing individual items of knowledge but a way of organizing them within a large program: the blackboard represents the space of the problem itself. In the speech-recognition system where this approach was first applied, the horizontal axis of the blackboard represents time, measured from left to right from the beginning of the spoken phrase. The vertical axis represents the level of abstraction, which rises from sound waveforms to syllables and ultimately to the sentence as the program moves toward understanding the utterance.

Each if-then rule (or set of such rules) watches a particular part of the blackboard and fires only when information appears in that region of the space. The blackboard thus helps to solve the meta-problem of choosing which rule should fire at any given moment. Moreover, the knowledge modules that operate independently of one another do not necessarily have to be if-then rules. The blackboard structure therefore provides a natural way of exploiting synergy among different kinds of knowledge within a single system.
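
The organizing idea can be sketched as a minimal blackboard loop; the levels and knowledge sources below are hypothetical, and a real system such as the speech-understanding one is of course far more elaborate:

```python
# The blackboard is a shared store of hypotheses keyed by abstraction level.
blackboard = {"waveform": ["w1", "w2"], "syllable": [], "word": []}

def syllable_source(bb):
    """Knowledge source: watches the waveform level, posts syllable hypotheses."""
    if bb["waveform"] and not bb["syllable"]:
        bb["syllable"].append("syl(" + "+".join(bb["waveform"]) + ")")
        return True
    return False

def word_source(bb):
    """Knowledge source: watches the syllable level, posts word hypotheses."""
    if bb["syllable"] and not bb["word"]:
        bb["word"].append("word(" + bb["syllable"][0] + ")")
        return True
    return False

# Control loop: fire whichever knowledge source sees new data in its region.
sources = [syllable_source, word_source]
while any(src(blackboard) for src in sources):
    pass
print(blackboard["word"])   # ['word(syl(w1+w2))']
```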

Recently the word "parallelism" has been heard more and more often in artificial intelligence circles. It denotes another potential source of power for artificial intelligence systems. At present most computers process information sequentially, one operation after another. Some groups of computer specialists, however, including those working in Japan on the "fifth-generation computer" and in the United States on the "strategic computing" project, are developing machines consisting of about a million processors operating "in parallel", that is, simultaneously. The prospect of a millionfold increase in the speed of information processing has so inspired some researchers that they predict revolutionary changes in the software of artificial intelligence systems.

