Automaton grammar

Lecture

One promising line of research toward artificial intelligence systems is the use of object-oriented database management systems, provided that such systems are designed specifically for artificial intelligence applications. One of the pressing problems in the field is the understanding and processing of natural language texts, and such texts are conveniently analyzed using grammars. Inflection is one of the tasks of text analysis. Because it is well studied, it can serve as a test problem for checking whether an object-oriented DBMS is applicable to natural language analysis. Russian inflection can be implemented with an automaton grammar. This paper considers whether the Cerebrum object-oriented database / knowledge management system [2, 3] can be used to implement an automaton grammar. Another promising direction in artificial intelligence research is the use of artificial neural networks for analyzing and processing natural language texts. The automaton grammar is therefore implemented in a form close to the structure of the synchronized linear tree of a semantic neural network [4, 5].

Architecture of the Cerebrum knowledge base system

An important feature of Cerebrum is that it addresses objects differently from the ODMG object database standard [6]. In natural language, the same word can have different meanings in different contexts, and different words can have the same meaning in one context. By analogy, Cerebrum allows the same identifier to address different object instances in different contexts, and allows different identifiers to address the same instance within one context. Each instance of a persistent object defines its own resolution context. Within a single instance only synonymy is allowed: different identifiers may address the same entity external to the current instance (or the instance itself, if it holds a link to itself). Homonymy within one resolution context is not allowed. On the other hand, the database contains as many contexts as it contains instances. Across the whole database, each instance can therefore have a practically unlimited number of different identifiers without any contradiction or conflict.

As in ODMG, each Cerebrum object is addressed by a soft pointer, an object identifier of type Cerebrum.Runtime.NativeHandle. Pointers to instances external to the current object are obtained within the context of the current object. An object can address only those objects with which it has established a link, and each linked object has an identifier. Within the current object, each identifier addresses exactly one instance. Unlike ODMG, where every object has a single unique identifier, Cerebrum allows several different NativeHandle values within one object to address the same instance. Considering the whole database rather than a single instance, object identifiers exhibit synonymy and homonymy, by analogy with natural language. Homonymy of identifiers: in different contexts, defined by different object instances, the same NativeHandle can address either different instances or the same instance. Synonymy of identifiers: both within one context and across contexts, different identifiers can refer to the same instance, so one instance can have many identifier synonyms within the database. This feature is convenient for building semantic networks and hierarchical semantic networks.
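The addressing rules above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: ContextNode, Engage, Attach and the use of plain int values as identifiers are hypothetical stand-ins, not the Cerebrum API. Each node acts as its own resolution context, so two identifiers may point at one instance (synonymy) and the same identifier may mean different instances in different contexts (homonymy).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a persistent object: every instance is its own
// resolution context that maps local identifiers to linked instances.
public sealed class ContextNode
{
    private readonly Dictionary<int, ContextNode> links = new Dictionary<int, ContextNode>();

    public string Name { get; }

    public ContextNode(string name) { Name = name; }

    // Bind a local identifier to a target instance. Inside one context an
    // identifier addresses exactly one instance, so rebinding is rejected.
    public void Engage(int handle, ContextNode target)
    {
        if (links.ContainsKey(handle))
            throw new InvalidOperationException("identifier already bound in this context");
        links.Add(handle, target);
    }

    // Resolve a local identifier; returns null when it is not bound here.
    public ContextNode Attach(int handle)
    {
        ContextNode target;
        return links.TryGetValue(handle, out target) ? target : null;
    }
}

public static class IdentifierDemo
{
    public static bool Run()
    {
        var a = new ContextNode("a");
        var b = new ContextNode("b");
        var x = new ContextNode("x");
        var y = new ContextNode("y");

        a.Engage(1, x);
        a.Engage(2, x); // synonymy: identifiers 1 and 2 address x inside a
        b.Engage(1, y); // homonymy: identifier 1 means y inside b, x inside a

        bool synonymy = object.ReferenceEquals(a.Attach(1), a.Attach(2));
        bool homonymy = !object.ReferenceEquals(a.Attach(1), b.Attach(1));
        return synonymy && homonymy;
    }
}
```

Calling IdentifierDemo.Run() checks both properties on the toy model. Storing the mapping inside each node, rather than in one global table, is what makes the same identifier reusable across contexts.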

The least frequently used objects are automatically evicted to long-term storage, freeing RAM. Consequently, any unlocked user object can be evicted and destroyed at any moment. If the user obtained a pointer to an object without locking it, the Cerebrum garbage collector could consider the object unused and evict it from RAM, leaving the pointer referring to a destroyed instance. To prevent this, Cerebrum returns a pointer to an IConnector shell object instead of a pointer to the user instance itself. The constructor of the shell object increments the lock counter of the user object, and its destructor decrements it. As long as at least one shell is in memory, the lock counter is non-zero and the object behind the shell is protected from destruction, so the user instance cannot be evicted while it is in use. A pointer to a user instance must not be used without keeping a pointer to its IConnector shell. An important consequence is that Dispose must be called on every shell object to prevent excessive memory consumption.

The IConnector shell object has a Component property through which a user instance is accessible. In most cases, the pattern of working with an object is as follows:

using (IComposite composite = this.m_Connector.Workspace.AttachConnector(h))
{
    // Component exposes the user instance held by the shell.
    ISomeInterface node = composite.Component as ISomeInterface;
    result = node.SomeMethod(composite, ...);
    ...
}

When the identifier h is resolved by the this.m_Connector.Workspace.AttachConnector method, the IConnector shell of the object is returned. After work with the object is finished, the shell must be destroyed by calling Dispose. The C# using statement keeps the object in memory while its methods are being called and guarantees the call to Dispose afterwards. On exit from the using block, Dispose is called automatically and destroys the shell object, which decrements the lock counter. Once all shells are destroyed and the lock counter reaches 0, the user instance becomes eligible for eviction from RAM. If using is not applicable for some reason and the programmer forgot or was unable to call Dispose, the .NET Framework itself releases the shell during garbage collection. As a result, no memory leak occurs even in the presence of programmer errors.
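The lock-counter mechanism described above can be sketched as follows. The types CachedInstance and Shell are hypothetical and only model the idea; the real IConnector implementation is not shown in this article and may differ. Creating a shell takes a lock, Dispose releases it, and the instance may be evicted only when no locks remain.

```csharp
using System;

// Hypothetical cache entry: the instance may be evicted only while Locks == 0.
public sealed class CachedInstance
{
    public object Component;
    public int Locks;
    public bool CanEvict { get { return Locks == 0; } }
}

// Hypothetical shell: the constructor takes a lock, Dispose releases it.
public sealed class Shell : IDisposable
{
    private CachedInstance entry;

    public Shell(CachedInstance entry)
    {
        this.entry = entry;
        entry.Locks++;
    }

    // The user instance is reachable only while the shell is alive.
    public object Component { get { return entry != null ? entry.Component : null; } }

    public void Dispose()
    {
        if (entry == null) return; // a second Dispose call is a no-op
        entry.Locks--;
        entry = null;
    }
}

public static class LockDemo
{
    public static bool Run()
    {
        var entry = new CachedInstance { Component = "payload" };
        bool before = entry.CanEvict;       // no shells yet
        bool during;
        using (var outer = new Shell(entry))
        using (var inner = new Shell(entry))
        {
            during = entry.CanEvict;        // two shells hold locks
        }
        bool after = entry.CanEvict;        // both shells disposed
        return before && !during && after;
    }
}
```

Making Dispose idempotent keeps the counter correct even if a shell is disposed explicitly and then again at the end of a using block.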

From the point of view of the developer, Cerebrum is a collection of objects for various purposes. In one process, there may be several open databases. Each database is available to the developer as an object that implements the IWorkspace interface. An instance that implements this interface can be considered the root of the database.

public interface INativeDomain: System.IDisposable
{
    IActivator Activator { get; }
    IContainerEx GetSector();
}

public interface IWorkspace: INativeDomain, IContainer
{
    IActivator Activator { set; }
    NativeHandle CurrSequence();
    NativeHandle NextSequence();
}

where

Activator is a property that sets the factory of user objects;

GetSector is a method that returns the shell of the root index of user instances;

NextSequence is a method that returns the next unique object identifier;

CurrSequence is a method that returns the last generated identifier.

This object also has a context in which identifiers can be resolved into pointers to objects. Resolution is performed through the IContainer interface, which is a base interface of IWorkspace. IWorkspace is also a wrapper around the DomainContext object that represents the database at the Cerebrum.Integrator level.

The IContainer interface contains the following methods:

public interface IContainer: IDisposable
{
    IComposite AttachConnector(NativeHandle plasma);
    IComposite CreateConnector(NativeHandle plasma, NativeHandle typeID);
    void EngageConnector(NativeHandle plasma, IConnector connector);
    void RemoveConnector(NativeHandle plasma);
}

where

AttachConnector is a method that returns the shell of a user object given the identifier of that object;

CreateConnector is a method that creates an instance of a user object and returns its shell;

RemoveConnector is a method that removes an instance of a user object from the current container. Once a user object has been removed from all containers, it is completely removed from the database;

EngageConnector is a method that connects an existing instance of a user object to the current container under a given identifier.
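The container semantics listed above, in particular the rule that an object disappears from the database only after it has been removed from every container, can be modeled with a small sketch. InstanceRegistry and SketchContainer are hypothetical in-memory stand-ins that borrow the method names for readability; they use int identifiers and plain objects, and do not reproduce the real IContainer signatures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry that counts how many container entries point at each instance.
public sealed class InstanceRegistry
{
    private readonly Dictionary<object, int> owners = new Dictionary<object, int>();

    public bool Exists(object instance) { return owners.ContainsKey(instance); }

    public void AddOwner(object instance)
    {
        int count;
        owners.TryGetValue(instance, out count); // count stays 0 when absent
        owners[instance] = count + 1;
    }

    public void DropOwner(object instance)
    {
        int count = owners[instance] - 1;
        if (count == 0) owners.Remove(instance); // last link gone: delete the object
        else owners[instance] = count;
    }
}

// Hypothetical container: maps local identifiers to instances.
public sealed class SketchContainer
{
    private readonly InstanceRegistry registry;
    private readonly Dictionary<int, object> items = new Dictionary<int, object>();

    public SketchContainer(InstanceRegistry registry) { this.registry = registry; }

    public object CreateConnector(int handle)
    {
        object instance = new object();
        EngageConnector(handle, instance);
        return instance;
    }

    public void EngageConnector(int handle, object instance)
    {
        items.Add(handle, instance); // throws if the identifier is already bound here
        registry.AddOwner(instance);
    }

    public object AttachConnector(int handle)
    {
        object instance;
        return items.TryGetValue(handle, out instance) ? instance : null;
    }

    public void RemoveConnector(int handle)
    {
        object instance;
        if (!items.TryGetValue(handle, out instance)) return;
        items.Remove(handle);
        registry.DropOwner(instance);
    }
}

public static class ContainerDemo
{
    public static bool Run()
    {
        var registry = new InstanceRegistry();
        var first = new SketchContainer(registry);
        var second = new SketchContainer(registry);

        object instance = first.CreateConnector(1);
        second.EngageConnector(7, instance); // same instance, different identifier
        first.RemoveConnector(1);
        bool stillAlive = registry.Exists(instance)
            && object.ReferenceEquals(second.AttachConnector(7), instance);

        second.RemoveConnector(7);           // removed from the last container
        return stillAlive && !registry.Exists(instance);
    }
}
```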

AttachConnector and CreateConnector return the shell of the instance as an IComposite. The IComposite interface inherits from IConnector. The IConnector interface has the following properties:

public interface IConnector: IDisposable
{
    object Component { get; set; }
    IWorkspace Workspace { get; }
}

Where

Component - a property that returns an instance of a user object;

Workspace is a property that returns the root database interface.

IComposite has the following properties:

public interface IComposite: IConnector
{
    NativeHandle GroundHandle { get; }
    NativeHandle LiquidHandle { get; }
    NativeHandle PlasmaHandle { get; }
    NativeHandle TypeIdHandle { get; }
    IConnector GetPrecedent();
}

where

PlasmaHandle is a property that returns the logical identifier of the instance;

TypeIdHandle is a property that returns the identifier of the instance type in the Types table;

GetPrecedent is a method that returns the shell object of the parent object;

GroundHandle is a property that returns the object identifier in the repository;

LiquidHandle is a property that returns the object identifier in the cache.

The context of an instance is determined by its IConnector shell object. If the instance is able to establish links with other objects, its IConnector shell can be cast to IContainer and used instead of IWorkspace to resolve a NativeHandle identifier.

A persistent instance of a user class can be created in various ways. The standard way is as follows. First, register the type of the user class in the Types table. The QualifiedTypeName attribute must contain the full .NET name of the user class. The KernelObjectClass attribute contains the type of the kernel object: 8 for Scalar, 9 for Warden. Then determine the identifier of the object being created. An identifier that is unique within the current database can be obtained with the NextSequence function.

Cerebrum.Runtime.NativeHandle h =
this.m_Connector.Workspace.NextSequence ();

Then use the CreateConnector function to get a pointer to the shell object of the created instance, passing as parameters the identifier h of the instance being created and the identifier typeId of the user type from the Types table. A pointer to the created instance is available through the Component property of the shell object.

using (IComposite composite = this.m_Connector.Workspace.CreateConnector(h, typeId))
{
    // Component exposes the newly created user instance.
    ISomeInterface node = composite.Component as ISomeInterface;
    result = node.SomeMethod(composite, ...);
    ...
}

User classes are conveniently derived from the GenericComponent class. Every GenericComponent has a DomainContext property, which returns the instance of the DomainContext class that represents the database in the Cerebrum.Integrator module. The GetChildComponents method is accessible from within the user object and returns an instance of a class that implements the IContainer interface. This makes it possible to resolve identifiers of child objects from within the user instance without access to the object shell returned by AttachConnector / CreateConnector.

Implementation of automaton grammar

Create a new solution (.sln) in Microsoft Visual Studio 2003 and call it Cerebrum.Samples.Objects-01.sln. Add references to the assemblies Cerebrum.Runtime.dll, Cerebrum.Integrator.dll and Cerebrum.Windows.Forms.dll. Rename the existing form to PrimaryWindow and remove the default Main function from it. Then add the Application class and create the static function Main in it:

[STAThread]
static void Main()
{
    Application app = new Application();
    Cerebrum.Windows.Forms.Application.Instance = app;
    app.Initialize();
    app.Show();
    System.Windows.Forms.Application.Run(app.PrimaryWindow);
    app.Shutdown();
    app.Dispose();
}

Override the CreatePrimaryWindow and CreateContextService functions.

protected override System.Windows.Forms.Form CreatePrimaryWindow()
{
    return new PrimaryWindow();
}

protected override IContextService CreateContextService()
{
    // Use the master context's directory when available,
    // otherwise the directory of the current assembly.
    string primaryDirectory = m_MasterContext != null
        ? m_MasterContext.SystemDirectory
        : Path.GetDirectoryName(this.GetType().Assembly.Location);
    string databaseFileName = Path.Combine(primaryDirectory, "Cerebrum.Database.Master.bin");
    string activityFileName = Path.Combine(primaryDirectory, "Cerebrum.Database.Master.log");
    string mergeFileName = Path.Combine(primaryDirectory, "Cerebrum.Database.Master.xdb");
    return new Cerebrum.Integrator.SimpleContextService(
        this, databaseFileName, activityFileName, 4,
        true, mergeFileName, false);
}

The given implementation of Application is the minimum required for SDI Windows.Forms applications that use Cerebrum.

For the neuron we create the base class Cerebrum.Samples.Objects01.NeuronBase. This class has two attributes: the list of dependent neurons (DependentObjectsList) and the list of neurons on which the current neuron depends (PrecedentObjectsList). The INeuronLinks interface is used to manipulate these lists.

public interface INeuronLinks
{
    void AddPrecedent(NativeHandle h);
    void AddDependent(NativeHandle h);
    void DelPrecedent(NativeHandle h);
    void DelDependent(NativeHandle h);
}

We implement this interface in the NeuronBase class. From this base class we derive SymbolNeuron (SymbolNeuron: NeuronBase), a neuron that recognizes one character of the input character sequence. This neuron implements the ILineralTreeNode interface. The interface needs two methods, GenerateDependents for training dependent neurons and RestoreStrings for obtaining the strings of symbols that correspond to the neuron, as well as the Symbol property, which returns the symbol recognized by the neuron.

public interface ILineralTreeNode
{
    Cerebrum.Runtime.NativeHandle[] GenerateDependents(IComposite outer, string text);
    string[] RestoreStrings();
    char Symbol { get; set; }
}

Note the method that receives a pointer to the shell object of the neuron itself. Because an object can have many different identifiers in different contexts, as described above, an object cannot determine its own identifier from the inside unless a separate attribute is set aside to store it. The identifier through which a shell object was obtained, however, is always available from that shell. It is therefore much more efficient to pass a pointer to the object's shell when calling a method of the user object.

We require SymbolNeuron to implement the ILineralTreeNodeEx interface, which adds the FindAllLeafs method for finding all leaf nodes of the synchronized linear tree.

public interface ILineralTreeNodeEx: INeuronLinks, ILineralTreeNode
{
    void FindAllLeafs(System.Collections.ArrayList leafs);
}

Thus, the SymbolNeuron object will have 3 attributes - DependentObjectsList, PrecedentObjectsList and Symbol. We will create the Specialized folder in the project and the class Concepts in it. This class will not have instances - only static fields. Therefore, we will make a static constructor for it, and declare the instance constructor as private. Create in this class several fields corresponding to the attributes of the neuron.

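// Static fields: they are assigned in the static constructor and read as Concepts.X elsewhere.
public static Cerebrum.Runtime.NativeHandle DependentObjectsListAttribute;
public static Cerebrum.Runtime.NativeHandle PrecedentObjectsListAttribute;
public static Cerebrum.Runtime.NativeHandle SymbolAttribute;
public static Cerebrum.Runtime.NativeHandle SymbolNeuronType;
// Assigned in the static constructor alongside SymbolNeuronType.
public static Cerebrum.Runtime.NativeHandle ObjectCollectionType;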

In the static constructor, add the code to initialize them.

static Concepts()
{
    // Look up the attribute handles by name in the Attributes table.
    using (IConnector connector = Application.Instance.MasterContext.GetTable("Attributes"))
    {
        using (TableView view = (connector.Component as TableDescriptor).GetTableView())
        {
            PropertyDescriptor descriptor = view.GetItemProperties(null)["Name"];
            DependentObjectsListAttribute = Tools.FindHandle(view, descriptor, "DependentObjectsList");
            PrecedentObjectsListAttribute = Tools.FindHandle(view, descriptor, "PrecedentObjectsList");
            SymbolAttribute = Tools.FindHandle(view, descriptor, "Symbol");
        }
    }
    // Look up the type handles by name in the Types table.
    using (IConnector connector = Application.Instance.MasterContext.GetTable("Types"))
    {
        using (TableView view = (connector.Component as TableDescriptor).GetTableView())
        {
            PropertyDescriptor descriptor = view.GetItemProperties(null)["Name"];
            SymbolNeuronType = Tools.FindHandle(view, descriptor, "SymbolNeuron");
            ObjectCollectionType = Tools.FindHandle(view, descriptor, "ObjectCollection");
        }
    }
}

Now we create the NeuronBase class inherited from Cerebrum.Integrator.GenericComponent. We add two functions to it, which correspond to its attributes, and override the SetConnector function.

protected override void SetConnector(SerializeDirection direction, IConnector connector)
{
    base.SetConnector(direction, connector);
    switch (direction)
    {
        case Cerebrum.Runtime.SerializeDirection.Init:
        {
            // On first initialization, create the child attributes that hold the link lists.
            using (NativeWarden warden = (this.GetChildComponents() as NativeWarden))
            {
                warden.Newobj(Concepts.PrecedentObjectsListAttribute, KernelObjectClass.Warden);
                warden.Newobj(Concepts.DependentObjectsListAttribute, KernelObjectClass.Warden);
            }
            break;
        }
    }
}

Overriding SetConnector is necessary to force the creation of the child attributes that hold the lists of identifiers of related neurons, Concepts.PrecedentObjectsListAttribute and Concepts.DependentObjectsListAttribute.

The following auxiliary functions make it easy to get these lists:

public NativeWarden GetPrecedentObjectsListVector()
{
    return this.GetAttributeContainer(Concepts.PrecedentObjectsListAttribute) as NativeWarden;
}

public NativeWarden GetDependentObjectsListVector()
{
    return this.GetAttributeContainer(Concepts.DependentObjectsListAttribute) as NativeWarden;
}

The following functions implement the INeuronLinks interface methods:

public void AddPrecedent(NativeHandle h)
{
    using (NativeWarden v = GetPrecedentObjectsListVector())
    {
        v.SetMap(h, MapAccess.Scalar, new NativeHandle(1));
    }
}

public void AddDependent(NativeHandle h)
{
    using (NativeWarden v = GetDependentObjectsListVector())
    {
        v.SetMap(h, MapAccess.Scalar, new NativeHandle(1));
    }
}

public void DelPrecedent(NativeHandle h)
{
    using (NativeWarden v = GetPrecedentObjectsListVector())
    {
        v.SetMap(h, MapAccess.Scalar, NativeHandle.Null);
    }
}

public void DelDependent(NativeHandle h)
{
    using (NativeWarden v = GetDependentObjectsListVector())
    {
        v.SetMap(h, MapAccess.Scalar, NativeHandle.Null);
    }
}

Now create the SymbolNeuron class, derived from NeuronBase and implementing ILineralTreeNodeEx. In this class, we implement the Symbol property.

public char Symbol
{
    get
    {
        return Convert.ToChar(GetAttributeComponent(Concepts.SymbolAttribute));
    }
    set
    {
        SetAttributeComponent(Concepts.SymbolAttribute, Convert.ToInt32(value));
    }
}

We also override the SetConnector method:

protected override void SetConnector(SerializeDirection direction, IConnector connector)
{
    base.SetConnector(direction, connector);
    switch (direction)
    {
        case Cerebrum.Runtime.SerializeDirection.Init:
        {
            // On first initialization, create the scalar attribute that stores the symbol.
            using (NativeWarden warden = (this.GetChildComponents() as NativeWarden))
            {
                warden.Newobj(Concepts.SymbolAttribute, KernelObjectClass.Scalar);
            }
            break;
        }
    }
}

In addition to what the base class does, SetConnector here creates a scalar attribute for storing the symbol recognized by the neuron. We now implement the methods of the ILineralTreeNode interface.

public Cerebrum.Runtime.NativeHandle[] GenerateDependents(IComposite outer, string text)
{
    // check the input parameters
    if (text == null || text.Length < 1) return null;
    // take the first character of the string
    char symbol = text[0];
    // the string is ours only if its first character is the one this neuron recognizes
    if (this.Symbol != symbol) return null;
    // get the rest of the string
    string tail = text.Substring(1);
    if (tail.Length > 0)
    {
        // try to find neurons that recognize the rest of the string
        using (NativeWarden vector = this.GetDependentObjectsListVector())
        {
            foreach (DictionaryEntry de in vector)
            {
                NativeHandle h = (NativeHandle)de.Key;
                using (IComposite connector = this.m_Connector.Workspace.AttachConnector(h))
                {
                    ILineralTreeNode node = connector.Component as ILineralTreeNode;
                    if (node != null)
                    {
                        // recursive call of the same function on another neuron
                        NativeHandle[] hs = node.GenerateDependents(connector, tail);
                        if (hs != null)
                        {
                            // the tail was recognized: put our own handle in front of it
                            NativeHandle[] hs2 = new NativeHandle[hs.Length + 1];
                            hs.CopyTo(hs2, 1);
                            hs2[0] = outer.PlasmaHandle;
                            return hs2;
                        }
                    }
                }
            }
        }
        // if we are here, nothing was found: create what is missing (train the network)
        // create an identifier for the new neuron
        NativeHandle h0 = this.m_Connector.Workspace.NextSequence();
        // create a new neuron instance
        using (IComposite connector = this.m_Connector.Workspace.CreateConnector(h0, Concepts.SymbolNeuronType))
        {
            // cast to ILineralTreeNodeEx to reach AddPrecedent,
            // which is not declared in ILineralTreeNode
            ILineralTreeNodeEx node = connector.Component as ILineralTreeNodeEx;
            if (node != null)
            {
                // set the character the new neuron recognizes
                node.Symbol = tail[0];
                // link the new neuron with this one: add ourselves to its Precedents
                // and add it to our Dependents
                node.AddPrecedent(outer.PlasmaHandle);
                this.AddDependent(connector.PlasmaHandle);
                // recursive call of the same function on the new neuron
                NativeHandle[] hs = node.GenerateDependents(connector, tail);
                if (hs != null)
                {
                    // the tail was recognized: put our own handle in front of it
                    NativeHandle[] hs2 = new NativeHandle[hs.Length + 1];
                    hs.CopyTo(hs2, 1);
                    hs2[0] = outer.PlasmaHandle;
                    return hs2;
                }
            }
        }
    }
    // we are the last in the chain: nothing more to recognize, return ourselves
    return new Cerebrum.Runtime.NativeHandle[] { outer.PlasmaHandle };
}

public string[] RestoreStrings()
{
    // poll all predecessors and collect the strings they recognize
    ArrayList result = new ArrayList();
    using (NativeWarden vector = this.GetPrecedentObjectsListVector())
    {
        foreach (System.Collections.DictionaryEntry de in vector)
        {
            NativeHandle h = (NativeHandle)de.Key;
            using (IConnector connector = this.m_Connector.Workspace.AttachConnector(h))
            {
                ILineralTreeNode node = connector.Component as ILineralTreeNode;
                if (node != null)
                {
                    // recursive call of the same function on another neuron
                    string[] heads = node.RestoreStrings();
                    // heads usually holds a single string, because a tree node
                    // usually has one predecessor; the loop also covers the more
                    // complex case of multiple predecessors, where the excitation
                    // wave propagates along several paths
                    foreach (string head in heads)
                    {
                        result.Add(head + this.Symbol.ToString());
                    }
                }
            }
        }
    }
    // root node without predecessors: the string starts with our own symbol
    if (result.Count < 1)
    {
        result.Add(this.Symbol.ToString());
    }
    return (string[])result.ToArray(typeof(string));
}

The neuron class is now complete.
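The training and restoration logic of the linear tree can be tried out without a database. The sketch below is a simplified in-memory model, not the Cerebrum implementation: TreeWorkspace, its int handles and its internal Node type are hypothetical, and the shells, locking and persistence are left out. It keeps the same two operations: GenerateDependents walks the chain for a string and creates missing neurons, and RestoreStrings rebuilds the string by following predecessors.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the workspace and its symbol neurons.
public sealed class TreeWorkspace
{
    private sealed class Node
    {
        public char Symbol;
        public readonly List<int> Precedents = new List<int>();
        public readonly List<int> Dependents = new List<int>();
    }

    private readonly Dictionary<int, Node> nodes = new Dictionary<int, Node>();
    private int sequence; // plays the role of NextSequence

    // Create a neuron for the given symbol and return its handle.
    public int CreateRoot(char symbol)
    {
        int h = ++sequence;
        nodes[h] = new Node { Symbol = symbol };
        return h;
    }

    // Walk the chain that recognizes text starting at neuron h, creating
    // missing neurons on the way. Returns the handles along the path,
    // or null when the first character does not match.
    public int[] GenerateDependents(int h, string text)
    {
        if (string.IsNullOrEmpty(text)) return null;
        Node self = nodes[h];
        if (self.Symbol != text[0]) return null;

        var path = new List<int> { h };
        string tail = text.Substring(1);
        if (tail.Length > 0)
        {
            int[] rest = null;
            foreach (int d in self.Dependents)
            {
                rest = GenerateDependents(d, tail);
                if (rest != null) break;
            }
            if (rest == null)
            {
                // nothing recognizes the tail yet: train the network
                int created = CreateRoot(tail[0]);
                nodes[created].Precedents.Add(h);
                self.Dependents.Add(created);
                rest = GenerateDependents(created, tail);
            }
            path.AddRange(rest);
        }
        return path.ToArray();
    }

    // Rebuild the strings that end at neuron h by following predecessors.
    public string[] RestoreStrings(int h)
    {
        Node self = nodes[h];
        var result = new List<string>();
        foreach (int p in self.Precedents)
            foreach (string head in RestoreStrings(p))
                result.Add(head + self.Symbol);
        if (result.Count < 1) result.Add(self.Symbol.ToString()); // root node
        return result.ToArray();
    }
}

public static class TreeDemo
{
    public static string Run()
    {
        var ws = new TreeWorkspace();
        int root = ws.CreateRoot('c');
        int[] cat = ws.GenerateDependents(root, "cat");
        int[] car = ws.GenerateDependents(root, "car");
        string a = ws.RestoreStrings(cat[cat.Length - 1])[0];
        string b = ws.RestoreStrings(car[car.Length - 1])[0];
        // the two words share the neuron for 'a'
        return a + "," + b + "," + (cat[1] == car[1]);
    }
}
```

TreeDemo.Run() returns "cat,car,True": both words are restored from their leaf neurons, and the neuron for the shared prefix is reused rather than duplicated.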

Conclusions

The proposed method of identifying object instances includes the one generally accepted in object-oriented databases as a special case and eliminates some of the problems inherent in existing object-oriented databases. Support for the network data model in the Cerebrum kernel makes it possible to build on it such models as hierarchical semantic networks and semantic networks of frames. Support for methods on persistent objects makes it possible to implement active semantic networks and artificial neural networks. Thanks to automatic garbage collection, the knowledge base management system takes over the management of object lifetimes and resources. The developer only has to initialize the database and can then concentrate on the task at hand. This makes the Cerebrum network object-oriented knowledge base / database environment a very convenient and promising tool for developing artificial intelligence systems.

The example of building an automaton grammar considered here has demonstrated some of the capabilities of the Cerebrum network object-oriented knowledge base. As the example shows, the source code of an application that implements an automaton grammar in the Cerebrum environment is relatively small.
