Database AI

Lecture

On the prospects for the use of object-oriented database systems and knowledge in artificial intelligence systems

I believe that practical experience with relational database management systems has revealed significant limitations in the relational model of data representation. There is now a need to abandon the relational model and turn to the undeservedly forgotten network and object-oriented models of data representation. This should make it possible, in the near future, to achieve significant success in applying artificial intelligence to the real problems of modern business.

AI - artificial intelligence;

DB - database;

GZ - knowledge base;

RDB - relational (tabular) database;

OBD - object database;

RMD - relational model of data representation;

OMD - object model of data representation;

DBMS - database management system;

SUBZ - knowledge management system;

RDBMS - relational database management system;

MSSE - network object-oriented knowledge base.

As is well known, database management systems based on the relational data model are currently the most widespread. Most software developers and systems analysts hold the firm preconception that the relational model has won the competition for good and driven other data models out of the market. I believe that this situation is temporary and that in the near future we will all witness this stereotype being broken. Database management systems are not valuable to users in themselves. Even the data stored in the databases has no special value on its own. The real value lies in complete applications that allow users to model some aspects of their activities and business using computing technology. Today's business processes are highly complex, and they tend to become more complex as integration and globalization advance. Accordingly, the requirements placed on the model used to represent business data are becoming stricter.

Despite its advantages, the relational model of data representation has several disadvantages. When the domain model is normalized, the database becomes a set of related tables. Besides the trivial need to perform multiple joins across multiple tables, such a structure makes the database very difficult to maintain. When the business changes, you have to design a new version of the normalized database and discard the work that developers put into the previous version of the database and the applications built on it. Designing the relational table structure specifically so that the business model can change while the application is in operation does not save the situation either. For that purpose, developers of relational databases use three approaches.

1. Dynamic modification of the database structure by the application. This approach preserves the normalization of the database structure. It has the following disadvantages: A. Changing the structure of the database also changes the structure of the data entered at earlier stages of its operation, although in some cases the law or the business process requires historical data to be kept intact. B. The absence of inheritance in the classical RMD makes it impossible to model the inheritance of concepts from the business model in a natural way. C. Most modern RDBMS implementations do not allow the database structure to be modified within a transaction. When the table structure is modified, open transactions are committed, so an erroneous modification of the structure cannot be rolled back.
2. Rotation of the database structure by 90 degrees. This approach eliminates the disadvantages of the previous one. For the sake of conceptual purity, all objects stored in the database have to be placed in just a few tables. Most often there are two: a table of objects and a table of the values of their attributes. The number of rows in the attribute table becomes the product of the number of objects and the number of their attributes, so it grows significantly. This approach also increases the number of joins. Combined with the larger number of rows and the larger indexes, this leads to a catastrophic drop in database performance. Additional disadvantages of this approach are: the loss of information about the type of an object's attribute, which can lead to loss of data integrity in the database; and the difficulty of implementing attributes that store collections of links to other objects in the database (in the previous approach this is solved with one-to-one or one-to-many relationships modeled by separate tables).
3. Using template tables. In this case, template tables are created in the database with a set of spare columns that are not tied to the attributes of business objects. Each template table gets several columns of each type, for example int1, int2, int3, ..., int99, float1, float2, float3, ..., float99, varchar1, varchar2, varchar3, ..., varchar99, ... In addition, a model of the business object is stored in the database that describes which column holds each attribute of an object. As a result, the same column of a template table contains the values of completely different attributes of objects of different types. Compared with the approaches described above, this one eases the situation, but it is not without significant drawbacks. The developer has to foresee all attribute types of the modeled objects and create, as spares, a sufficient number of columns of each potentially useful type. Along with the obvious limit on the number of attributes of the same type per object, this approach requires dynamic SQL queries to be built and executed. This adds the cost of implementing and running a dynamic query builder driven by the data model, which again reduces the performance and reliability of the solution.
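
The second approach (the structure "rotated by 90 degrees") can be illustrated with a minimal sketch. It assumes SQLite through Python's standard sqlite3 module; the table names objects and attr_values, the helper functions, and the sample customers are all hypothetical and chosen only for illustration. It shows the two effects described above: every attribute becomes a row, and a query over three attributes needs three joins of the same attribute table.

```python
import sqlite3

def build_schema(conn):
    # Two generic tables hold every object, whatever its business type.
    conn.executescript("""
        CREATE TABLE objects (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
        CREATE TABLE attr_values (
            object_id INTEGER NOT NULL REFERENCES objects(id),
            name      TEXT NOT NULL,
            value     TEXT  -- all values stored as text: attribute type is lost
        );
    """)

def add_object(conn, obj_type, attrs):
    # One row in objects plus one row in attr_values per attribute.
    oid = conn.execute("INSERT INTO objects (type) VALUES (?)", (obj_type,)).lastrowid
    conn.executemany(
        "INSERT INTO attr_values (object_id, name, value) VALUES (?, ?, ?)",
        [(oid, name, str(value)) for name, value in attrs.items()],
    )
    return oid

def find_customers(conn, city, min_age):
    # Each attribute used in the query costs one more join of attr_values.
    return conn.execute("""
        SELECT o.id, n.value
        FROM objects o
        JOIN attr_values n ON n.object_id = o.id AND n.name = 'name'
        JOIN attr_values c ON c.object_id = o.id AND c.name = 'city'
        JOIN attr_values a ON a.object_id = o.id AND a.name = 'age'
        WHERE o.type = 'customer'
          AND c.value = ?
          AND CAST(a.value AS INTEGER) >= ?  -- type must be restored by hand
        ORDER BY o.id
    """, (city, min_age)).fetchall()

conn = sqlite3.connect(":memory:")
build_schema(conn)
add_object(conn, "customer", {"name": "Ann", "city": "Kyiv", "age": 41})
add_object(conn, "customer", {"name": "Bob", "city": "Kyiv", "age": 19})
add_object(conn, "customer", {"name": "Eve", "city": "Lviv", "age": 52})

rows = find_customers(conn, "Kyiv", 30)
attr_row_count = conn.execute("SELECT COUNT(*) FROM attr_values").fetchone()[0]
print(rows, attr_row_count)
```

Three objects with three attributes each already produce nine attribute rows, and the explicit CAST shows where the type information has to be reconstructed by the application.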

So the deplorable situation surrounding the use of relational database management systems keeps getting more complicated. We can all observe how RDBMS vendors, in desperation, add object extensions (Oracle, Informix, ...), XML processing capabilities, and server-side implementation of business applications in procedural languages (Oracle, Microsoft, ...) to their products. In fact, this situation points to the defeat of the relational data model as a universal means of modeling modern business processes.

The owners of existing relational database management systems, who have invested many millions of dollars in product development, the marketers whose main task is to promote the product, and simply naive users are all trying to convince us that evolving the RMD can preserve those investments and solve all the problems described. Is that so?

To answer this question, let us step back from database management systems and consider the main task that is solved today with the help of a DBMS: modeling the business process. Obviously, the goal of the industrialization of our society is to replace human labor with machine labor. The informatization of society leads to the replacement of human intelligence with machine intelligence. Business process automation taken to its limit implies completely freeing people from routine intellectual tasks. We can therefore conclude that a business process modeling system taken to its limit must possess artificial intelligence. Introducing such a system should leave people only creative tasks, fully automating the routine operations of modern enterprise management. Such a system should have knowledge and skills comparable to those of a mid-level business analyst. This means that the knowledge management system (knowledge, not data) must be able to represent and process a business process model comparable in complexity to the model held in the human mind. Systems that do not meet this requirement will sooner or later become obsolete and be replaced by systems that have artificial intelligence. Admittedly, this is not the near future. But business is already challenging developers by setting tasks that require artificial intelligence. The absence of AI systems in widespread use is not at all due to a lack of tasks that need AI-based models. Business automation tasks were not set yesterday or the day before; they date back to the era of the first invoice and the adding machine. The current level of information systems is determined by the current level of achievement in modeling and processing business data and business knowledge. Our users do not receive what they need at all, but only what we are able to develop with today's design tools.

As the earlier analysis shows, the RMD is not the ultimate dream. RDBMS vendors themselves understand the current situation and, under public pressure, are slowly evolving towards “post-relational data representation models”. RDBMS vendors and the research groups they fund are concerned primarily with preserving multi-million investments. Most developers try at all costs to keep database development on an evolutionary path. That is why studies devoted to “post-relational models” say not a word about artificial intelligence as a means of modeling business processes. The analysis makes it clear that, in the long run, attempts to revive the RMD are doomed to failure. In trying to solve the modern problems of business, a knowingly dead-end path was chosen: extending the RMD in no particular direction. The challenge posed by business can be answered only by the well-proven method of going “from the general to the particular”. Instead of improving the RMD haphazardly under the pressure of circumstances, layering one change on another, one should look at the problem from the top down and determine the general path of development from the current state of DBMS / SUBZ to AI systems. Seen this way, it is far from certain that the RMD is the starting point from which the holy grail of the modern IT industry, artificial intelligence, can be reached at the lowest total cost.

Industry and customers need knowledge management systems. Specialists working in the field of AI know many knowledge representation models that are no less, and possibly more, flexible and versatile than the RMD. Fairly common and well-known knowledge representation models include hierarchical semantic networks, active semantic networks, semantic networks of frames, hidden Markov models, ... Neural networks have recently experienced a rebirth, and the history of their development deserves special attention. Progress in this direction stalled for decades after critical publications declared neurocomputing a dead-end branch of scientific and technological progress. The authors later admitted that they had been too categorical, but the time was lost. Fortunately, science is a more democratic community than the software industry: the opinion of an authority weighs less on ordinary scientists than corporate pressure weighs on an ordinary developer. Today research on artificial neural networks has resumed, and we learn about progress in this area not only from scientific publications but also from the mass media. In the field of database management systems, however, the relational model retains its “unshakable” monopoly. As with artificial neural networks, the data models competing with the RMD were declared a dead end, and research on hierarchical, network, and object-oriented database management systems was significantly held back. RDBMS conquered the commercial DBMS market. At first, the euphoria these DBMSs caused was largely justified. A significant share of business consists of financial transactions, and financial data in its natural form is presented as a variety of tables. The relational model therefore proved to be exactly what was needed there and took over the sector of financial applications.
By now this sector has largely been mastered and automated, and RDBMS are gradually moving into adjacent areas. The areas of business that remain unautomated are characterized by data representation models far from tabular form. As a result, the spread of the RMD is noticeably slowing. DBMS developers are now forced to extend the RMD with means of storing and processing complex data structures.

Business requires the automation of all activities, not just financial ones. Modeling an arbitrary business model requires an artificial intelligence system and a knowledge management system. In the long run, therefore, all attempts to improve the RMD one way or another are bound to end either in failure or in the creation of AI. There are two ways to get there. The evolutionary way: through a random walk driven by whichever business tasks are not yet automated, the RMD is gradually transformed into an AI knowledge representation model. And the revolutionary way: developers who clearly understand their goal immediately begin to create knowledge management systems, using the best of what has been accumulated over years of research in artificial intelligence. In either case there will be no place for the RMD in its current state or for the current attempts to revive it. A data representation model suitable for AI systems will be very far from the RMD. Spending enormous resources on the evolutionary development of the RMD in such a situation is too expensive. Moreover, even if it succeeds, only memories of the RMD will remain. I propose abandoning the evolutionary approach. In developing the next generation of DBMS / SUBZ, the RMD should be treated as one among equals. The data and knowledge representation model of the next-generation DBMS / SUBZ should be chosen not on the basis of the millions of dollars spent on current products, but on how effective the solutions are in advanced AI systems. This will make it possible to move towards the goal of complete business process automation in a straight line, rather than under the pressure of random fluctuations caused by the latest limitation of the RMD encountered when modeling a business process.

Which of the data representation models known today most adequately reflects the world and the reality in which we all live? I believe it is the network object-oriented model of data and knowledge representation. Current advances in object-oriented software development support this view.

The disadvantages of object databases are usually said to be the difficulty of implementing views over objects, the difficulty of implementing ad hoc database queries, and the need to iterate over collections of objects when searching for objects by the values of their attributes. If we compare the pure RMD with the pure OMD, we can agree that views in the RMD can themselves be treated as relations, whereas views of objects in the OMD are heavier-weight objects. In practice, however, surrogate keys are introduced into RDB tables to satisfy the requirement that “there is no entity without an identifier”. In that case an RDB becomes fully equivalent in complexity to an OBD as far as implementing views is concerned, as the many restrictions on updatable views show. Consequently, in practice the OBD and the RDB can be considered almost equal from the point of view of implementing views. The difficulties of implementing ad hoc queries against a database of objects are imaginary: ad hoc queries over an object tree can be implemented, for example, on the basis of the OQL or XPath languages. To optimize the search for objects by attribute value, indexes can be created and used in an OBD just as in an RDB. So, with respect to the capabilities considered here, the OMD is not inferior to the RMD.
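
The point about indexes can be made concrete with a small sketch. It is not taken from any particular object database; the classes ObjectStore, AttributeIndex and Employee are hypothetical names introduced here, and the index is a plain in-memory hash index. The sketch shows that a search by attribute value falls back to iterating over the collection only when no index exists for that attribute.

```python
from collections import defaultdict
from dataclasses import dataclass

class AttributeIndex:
    """Hash index: attribute value -> ids of the objects holding that value."""
    def __init__(self, attr):
        self.attr = attr
        self._by_value = defaultdict(set)

    def add(self, oid, obj):
        if hasattr(obj, self.attr):
            self._by_value[getattr(obj, self.attr)].add(oid)

    def lookup(self, value):
        return set(self._by_value.get(value, ()))

class ObjectStore:
    """Toy in-memory object store (hypothetical, for illustration only)."""
    def __init__(self):
        self._objects = {}
        self._indexes = {}
        self._next_id = 1

    def create_index(self, attr):
        # Index the objects that are already stored.
        index = AttributeIndex(attr)
        for oid, obj in self._objects.items():
            index.add(oid, obj)
        self._indexes[attr] = index

    def put(self, obj):
        oid = self._next_id
        self._next_id += 1
        self._objects[oid] = obj
        for index in self._indexes.values():  # keep every index up to date
            index.add(oid, obj)
        return oid

    def find(self, attr, value):
        if attr in self._indexes:
            ids = self._indexes[attr].lookup(value)  # indexed lookup
        else:
            # No index: iterate over the whole collection.
            ids = {oid for oid, obj in self._objects.items()
                   if getattr(obj, attr, None) == value}
        return [self._objects[oid] for oid in sorted(ids)]

@dataclass
class Employee:
    name: str
    dept: str

store = ObjectStore()
store.put(Employee("Ann", "sales"))
store.put(Employee("Bob", "ops"))
store.create_index("dept")
store.put(Employee("Eve", "sales"))  # stored after the index was created

found = [e.name for e in store.find("dept", "sales")]  # uses the index
scanned = [e.name for e in store.find("name", "Bob")]  # falls back to a scan
print(found, scanned)
```

A production object database would persist such indexes and support range queries; a hash index is used here only because it is the shortest structure that demonstrates the idea.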

I believe that the most promising direction that could lead to the creation of AI is artificial neural networks capable of self-modification and self-analysis. The semantic neural networks developed by the author have these abilities. As follows from the analysis, network object-oriented knowledge management systems are a promising basis for implementing the neural network model. I believe that the neural network model should be implemented as an application running in the context of a network object-oriented database management system, and the knowledge management component should be implemented in the context of a self-modifying neural network. At present, RDB technology has too much influence on the developers of OBDs. Existing OBD implementations are built with RDB compatibility in mind. This has strongly shaped the model of existing OBD implementations and, in my view, not for the better. A knowledge base management system must make it possible to model neural networks with a free topology and a universal model of the behavior of an individual neuron. I believe that the management system of an object knowledge base should be functionally sufficient to simulate a semantic neural network. For this reason, I decided to develop my own research model of a network object-oriented knowledge base, Cerebrum, whose object representation model is geared towards further use in a semantic neural network.

The developed system has the following features:

It saves the current state of the object graph or neural network, including the current topology of the network of objects, in the object knowledge base between user sessions. When the application is restarted, the network of objects does not need to be re-created.

With a larger number of object instances, it limits the amount of memory used by the object graph or neural network. The most frequently used objects remain in RAM; the rest are evicted to file storage and loaded into RAM as needed. When an instance is loaded into RAM, it displaces other, rarely used objects. Limiting memory use makes it possible to avoid the operating system's swap file, which significantly increases performance when modeling networks with a large number of object instances (when the total size of all instances exceeds the free memory currently available in the system).

If the size of the network of objects is less than the free memory currently available in the system, the entire network stays in RAM and there is no performance loss from serialization and deserialization.

Using the object knowledge base imposes no restrictions on the business logic of an object or on the mathematical model of a neuron, which can be implemented as methods of the objects stored in the knowledge base.
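
The memory-limiting behaviour in the feature list above can be sketched as a least-recently-used cache that spills objects to files. This is not the Cerebrum implementation, which the text does not describe in detail. The class SpillingObjectCache and its methods are hypothetical; the limit is counted in objects rather than bytes; pickle from the standard library stands in for the serializer; and objects are assumed to refer to one another by identifier, so that evicting one object does not serialize the whole graph.

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class SpillingObjectCache:
    """Keeps at most max_in_memory objects in RAM; the rest live in files."""
    def __init__(self, max_in_memory, directory):
        self.max_in_memory = max_in_memory
        self.directory = directory
        self._hot = OrderedDict()  # oid -> object, least recently used first

    def _path(self, oid):
        return os.path.join(self.directory, f"{oid}.pkl")

    def put(self, oid, obj):
        self._hot[oid] = obj
        self._hot.move_to_end(oid)  # mark as most recently used
        self._evict_if_needed()

    def get(self, oid):
        if oid in self._hot:  # already in RAM: no deserialization cost
            self._hot.move_to_end(oid)
            return self._hot[oid]
        with open(self._path(oid), "rb") as f:  # load on demand
            obj = pickle.load(f)
        self.put(oid, obj)  # loading may push out a rarely used object
        return obj

    def _evict_if_needed(self):
        while len(self._hot) > self.max_in_memory:
            old_id, old_obj = self._hot.popitem(last=False)
            with open(self._path(old_id), "wb") as f:
                pickle.dump(old_obj, f)

    def in_memory_ids(self):
        return list(self._hot)

with tempfile.TemporaryDirectory() as directory:
    cache = SpillingObjectCache(max_in_memory=2, directory=directory)
    cache.put("a", {"weight": 1})
    cache.put("b", {"weight": 2})
    cache.put("c", {"weight": 3})  # "a" is evicted to file storage
    resident_after_puts = cache.in_memory_ids()
    reloaded = cache.get("a")  # "a" comes back, "b" is evicted
    resident_after_reload = cache.in_memory_ids()

print(resident_after_puts, reloaded, resident_after_reload)
```

When the whole network fits within the limit, nothing is ever evicted and every get is served from memory, which corresponds to the case with no serialization overhead.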

The object-oriented data representation model used in Cerebrum is free of the RDB shortcomings listed earlier. The ability to model objects with complex structure makes it possible to combine several object instances into a single whole called a component. Unlike in the RMD, such a component can be stored in the database as a single unit, which significantly increases the efficiency of the system. But that is not the main point. Because a component is an aggregation of several object instances, the object model makes it possible to change the internal structure of a component dynamically without affecting the structure of other components of the same type stored in the database. The object model supports class inheritance and multiple inheritance of interfaces. Unlike the first approach discussed earlier, an object-oriented database allows the internal structure of an individual component to be changed without affecting the other components in the database. This solves the problem of representing and processing versions of objects. Rich type information makes it possible to access the internal structure of such a component in the same way as individual fields of a table in the RMD. Unlike the second approach, object-oriented databases retain the ability to work with the internal structure of a component while avoiding the growth of indexes and the need for joins when accessing the component's attributes. The problems of losing attribute types and the difficulties of implementing collections of objects also disappear. The object model is free of the third approach's limit on the number of attributes of the same type. An additional advantage of the object data representation model is that any entity represented in the database can be treated as an object. This makes it possible to store in an object's attributes not only simple values but also components with a complex internal structure.
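
A minimal sketch of the component idea follows, under the assumption that a component can be modeled as a named aggregation of parts. The Component class, its describe method and the sample orders are hypothetical and are not the Cerebrum API. The sketch shows one component changing its internal structure while another component of the same type is left untouched, with type information still available for each part.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Component:
    """An aggregation of parts stored and handled as a single unit."""
    type_name: str
    parts: Dict[str, Any] = field(default_factory=dict)

    def describe(self):
        # Runtime type information: part name -> type name of the stored value.
        return {name: type(value).__name__ for name, value in self.parts.items()}

order_a = Component("Order", {"number": 1001, "total": 250.0})
order_b = Component("Order", {"number": 1002, "total": 99.5})

# Change the internal structure of one component only. The new attribute
# holds a nested component rather than a simple value.
order_b.parts["delivery"] = Component("Address", {"city": "Kyiv", "zip": "01001"})

structure_a = order_a.describe()
structure_b = order_b.describe()
print(structure_a)
print(structure_b)
```

Attribute types are preserved per part, and the nested Address is reached directly through order_b.parts["delivery"] without a join.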

Even now, the results of theoretical research and practical experiments make it possible to implement a network object-oriented knowledge management system successfully. Such a system will be useful not only for forward-looking tasks but also for the pressing business problems traditionally solved with RDBMS. I believe that, given the need to move to AI-based systems, we must give up the dogma of the superiority of the relational data model and concentrate the main research and development effort on alternative models. I hope that the network object-oriented knowledge base Cerebrum will help determine the further development path of data and knowledge management systems and bring the creation of industrial systems with artificial intelligence closer.
