Start. WordNet Start. Scheme of work

Lecture



WordNet

l Semantic lexicon of the English language

l It consists of synsets (meanings)

l Synset:

l a few synonymous words

l description of meaning

l One word - several synsets (meanings)

l 150 000 words, 115 000 synsets, 207 000 "word - synset" pairs
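
The word-to-synset structure described above can be sketched as a small in-memory lexicon. The entries and glosses are illustrative stand-ins rather than actual WordNet data, and the names `Synset`, `LEXICON`, `synsets_for`, and `word_synset_pairs` are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """One meaning: a few synonymous words plus a description of the meaning."""
    words: tuple
    gloss: str


# Illustrative entries only; the real lexicon holds far more synsets.
LEXICON = [
    Synset(("bank", "depository"), "a financial institution that accepts deposits"),
    Synset(("bank", "riverside"), "sloping land beside a body of water"),
    Synset(("canary",), "a small yellow finch kept as a cage bird"),
]


def synsets_for(word):
    """Return every synset (meaning) that contains the given word."""
    return [s for s in LEXICON if word in s.words]


def word_synset_pairs():
    """Enumerate the "word - synset" pairs, one per word in each synset."""
    return [(w, s) for s in LEXICON for w in s.words]
```

Here `synsets_for("bank")` returns two synsets, which is the "one word, several synsets" case from the slide.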

WordNet. Semantic relations between synsets

l Nouns

l Hyperonyms: Y is a hyperonym of X if X is a kind of Y

l Hyponyms: Y is a hyponym of X if Y is a kind of X

l Equal in rank (coordinate terms): X and Y are equal in rank if they share a hyperonym

l Holonyms: Y is a holonym of X if X is part of Y

l Meronyms: Y is a meronym of X if Y is part of X
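
A minimal sketch of how these noun relations could be stored and queried follows. The two relation tables hold illustrative toy entries, not WordNet data, and all function names are hypothetical.

```python
# Toy relation tables; the entries are illustrative, not taken from WordNet.
HYPERONYM = {"canary": "bird", "sparrow": "bird", "bird": "animal"}  # X -> Y: X is a kind of Y
HOLONYM = {"wing": "bird", "wheel": "car"}                           # X -> Y: X is part of Y


def hyperonyms(x):
    """All Y such that X is, directly or transitively, a kind of Y."""
    chain = []
    while x in HYPERONYM:
        x = HYPERONYM[x]
        chain.append(x)
    return chain


def hyponyms(y):
    """All X that are directly a kind of Y."""
    return sorted(x for x, parent in HYPERONYM.items() if parent == y)


def equal_in_rank(x, y):
    """X and Y are equal in rank if they share a direct hyperonym."""
    return x != y and x in HYPERONYM and HYPERONYM.get(x) == HYPERONYM.get(y)


def meronyms(y):
    """All X that are part of Y (the inverse of the holonym relation)."""
    return sorted(x for x, whole in HOLONYM.items() if whole == y)
```

Storing only the "kind of" and "part of" directions is a design choice of this sketch: hyponyms and meronyms are derived as inverses instead of being stored twice.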

WordNet. Semantic relations between synsets

l Verbs

l Move is a hyperonym of Run

l Whisper is a hyponym of Speak

l Sleep is entailed by Snore

l Walk is equal in rank with Run

Start. WordNet

l WordNet is used in the Start system when searching for matches with T-expressions.

l Suppose the knowledge base contains the T-expression <bird can fly>

l Canary is a hyponym of Bird

l Asked “Can a canary fly?”, Start answers “Yes”
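
The canary example can be sketched as a lookup that climbs the hyperonym chain until a stored T-expression matches. This is only an illustration under simplifying assumptions: T-expressions are modelled as plain 3-tuples, the hyperonym table is a toy, and `generalizations` and `answer` are hypothetical names, not part of Start.

```python
# Toy data: one stored T-expression and a small hyperonym table (both illustrative).
KNOWLEDGE_BASE = {("bird", "can", "fly")}
HYPERONYM = {"canary": "bird", "bird": "animal"}


def generalizations(word):
    """Yield the word itself, then each hyperonym up the chain."""
    while word is not None:
        yield word
        word = HYPERONYM.get(word)


def answer(subject, relation, obj):
    """Answer "Yes" if the query, or a generalization of its subject, is stored."""
    for candidate in generalizations(subject):
        if (candidate, relation, obj) in KNOWLEDGE_BASE:
            return "Yes"
    return "I don't know"
```

Note that the search only generalizes the subject upward: a fact stored about birds applies to canaries, but not the other way round.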

Start. Omnibase

l "Universal Base"

l Used to query for facts

l Model "object-property-value"

l Example: “Federico Fellini is the director of La Strada”

l Object: La Strada

l Property: director

l Value: Federico Fellini

l Each object is associated with a data source:

l star wars       imdb-movie
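
A minimal sketch of the "object-property-value" model with per-object data sources might look as follows. The `star wars` to `imdb-movie` association comes from the slide; routing `la strada` to the same source, the dictionary layout, and the function name `get` are assumptions made for illustration.

```python
# Object -> data source. Only the "star wars" entry comes from the slide;
# the "la strada" entry is an assumption for this sketch.
SOURCE_OF = {"star wars": "imdb-movie", "la strada": "imdb-movie"}

# Stand-in for the contents of each source: (object, property) -> value.
SOURCES = {
    "imdb-movie": {
        ("la strada", "director"): "Federico Fellini",
        ("star wars", "composer"): "John Williams",
    },
}


def get(obj, prop):
    """Look up a property value by routing the object to its data source."""
    key = obj.lower()
    source = SOURCE_OF.get(key)
    if source is None:
        return None  # no data source is known for this object
    return SOURCES[source].get((key, prop))
```

With this layout, `get("La Strada", "director")` is the lookup behind the Fellini example.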


Start. Omnibase. Examples


Question                               | Object     | Property   | Value
Who wrote the music for Star Wars?     | Star Wars  | Composer   | John Williams
Who invented dynamite?                 | Dynamite   | Inventor   | Alfred Nobel
How big is Costa Rica?                 | Costa Rica | Area       | 51,100 sq. km
How many people live in Kiribati?      | Kiribati   | Population | 94,149
What languages are spoken in Guernsey? | Guernsey   | Languages  | English, French
Show me paintings by Monet             | Monet      | Works      | [images]


“Victor Fleming directed Gone with the Wind”

Start. Omnibase

l Benefits:

l Uniform database query format

l Natural use of the "object-property-value" model

l Disadvantages:

l Need to write a "wrapper" for each data source
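
The trade-off above (one uniform query format, at the cost of one wrapper per source) can be sketched with a small abstract interface. The class names, the row and text layouts of the two sources, and the `query` helper are all hypothetical; only the facts themselves appear elsewhere in the lecture.

```python
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Adapts one data source to the uniform (object, property) -> value query."""

    @abstractmethod
    def get(self, obj, prop):
        ...


class MovieWrapper(Wrapper):
    """Wraps a hypothetical record-oriented movie source."""
    _rows = [{"title": "Gone with the Wind", "director": "Victor Fleming"}]

    def get(self, obj, prop):
        for row in self._rows:
            if row["title"] == obj:
                return row.get(prop)
        return None


class FactbookWrapper(Wrapper):
    """Wraps a hypothetical line-oriented text source."""
    _text = "Kiribati;population;94,149\nCosta Rica;area;51,100 sq. km"

    def get(self, obj, prop):
        for line in self._text.splitlines():
            name, key, value = line.split(";")
            if name == obj and key == prop:
                return value
        return None


def query(wrappers, obj, prop):
    """Ask each wrapper in turn using the same query format."""
    for wrapper in wrappers:
        value = wrapper.get(obj, prop)
        if value is not None:
            return value
    return None
```

The caller never sees how a source stores its data; each new source needs only a new `Wrapper` subclass.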

Start. List of external data sources

l Wikipedia

l The World Factbook 2006

l Google

l Yahoo

l The Internet Movie Database

l Internet Public Library

l The Poetry Archives

l Biography.com

l Merriam-Webster Dictionary

l WorldBook

l Infoplease.com

l Metropla.net

l weather.com

Semantic Web

l A new concept for the development of the Internet

l Addresses the problem of machine analysis of information published on the web

l All information on the web should be published in two forms:

l Human-readable

l Machine-readable

l Machine-readable resource descriptions use the RDF (Resource Description Framework) format, which is based on:

l XML format

l Triples "subject - predicate - object"
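
To make the triple model concrete, here is a minimal sketch that stores RDF-style statements as plain tuples and retrieves them by pattern. It deliberately avoids any RDF library; the `Triple` type, the `GRAPH` contents, and the `match` helper are assumptions for illustration, with values borrowed from the Omnibase examples.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# A tiny graph; the values are reused from the Omnibase example table.
GRAPH = [
    Triple("Kiribati", "rdf:type", "Country"),
    Triple("Kiribati", "population", "94,149"),
    Triple("Costa Rica", "rdf:type", "Country"),
    Triple("Costa Rica", "area", "51,100 sq. km"),
]


def match(subject=None, predicate=None, obj=None):
    """Return triples that agree with every field given; None acts as a wildcard."""
    return [
        t for t in GRAPH
        if subject in (None, t.subject)
        and predicate in (None, t.predicate)
        and obj in (None, t.obj)
    ]
```

For example, `match(predicate="rdf:type", obj="Country")` lists every resource declared to be a country.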

Start. Natural Language Annotations

l Each block of information is given an annotation in natural language

l A compromise between a machine-readable and a natural-language description of information

l The knowledge base stores only annotations with source links attached.

l Effective organization of access to information of any type:

l Texts

l Pictures

l Multimedia

l Databases

l Procedures

l Annotations can be parameterized.

Start. Natural Language Annotations

l Ways to introduce annotations:

l Adding annotations to RDF document descriptions

l Using parameterized annotations (information access schemes)

l Using answer-finding schemes

Start. Add annotations

l How many people live in Kiribati?

l What is the population of Bahamas?

l Tell me Guam's population.

Start. Add annotations

<rdfs:Class ID="Country">
  <rdfs:comment>A Country in the CIA Factbook</rdfs:comment>
</rdfs:Class>
<rdf:Property ID="population">
  <rdfs:domain rdf:resource="#Country"/>
  <rdfs:range rdf:resource="xsd:string"/>
  <nl:ann text="Many people live in ?s"/>
  <nl:ann text="population of ?s"/>
  <nl:gen text="The population of ?s is ?o"/>
</rdf:Property>
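
How the `nl:ann` and `nl:gen` texts might be used can be sketched with simple pattern matching. This is a rough approximation: Start matches annotations after linguistic analysis, whereas the sketch below only compares surface strings, so a rephrasing such as "Tell me Guam's population." is not recognized. The function names and the `POPULATION` table are hypothetical.

```python
import re

ANNOTATIONS = ["many people live in ?s", "population of ?s"]
GENERATOR = "The population of ?s is ?o"
POPULATION = {"Kiribati": "94,149"}  # value reused from the Omnibase example table


def bind_subject(question):
    """Return the text bound to ?s if some annotation matches the question."""
    text = question.rstrip("?.! ")
    for ann in ANNOTATIONS:
        # Turn the annotation into a regex, with ?s as a named capture group.
        pattern = re.escape(ann).replace(re.escape("?s"), r"(?P<s>.+)")
        m = re.search(pattern + "$", text, flags=re.IGNORECASE)
        if m:
            return m.group("s")
    return None


def answer(question):
    """Fill the generation template once ?s and ?o are both known."""
    s = bind_subject(question)
    if s is None or s not in POPULATION:
        return None
    return GENERATOR.replace("?s", s).replace("?o", POPULATION[s])
```

A single pair of annotations thus covers the same question for every country in the source, which is the point of parameterizing by `?s`.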

 

Start. Parameterized annotations

l What is the largest area in Africa?

l Tell me what Asian country has the highest population density.

l What is the lowest infant mortality rate?

l What is the most populated South American country?

Start. Parameterized annotations

<nl:InformationAccessSchema>
  <nl:ann>country of $region $attribute</nl:ann>
  <nl:pattern>?x a :Country</nl:pattern>
  <nl:pattern>?x map($attribute) ?val</nl:pattern>
  <nl:pattern>?x :location $region</nl:pattern>
  <nl:action>display(boundto(?x, max(?val)))</nl:action>
  <nl:mapping>
    <nl:hash variable="$attribute">
      <nl:map value="population">:population</nl:map>
      <nl:map value="area">:area</nl:map>
      ...
    </nl:hash>
  </nl:mapping>
</nl:InformationAccessSchema>
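
The effect of this schema (bind `?x` to countries in `$region`, map `$attribute` to a property, display the `?x` with the maximal `?val`) can be sketched as follows. The country names and figures are placeholders, not real statistics, and `ATTRIBUTE_MAP` and `country_with_max` are hypothetical names.

```python
# Placeholder data; names and figures are invented for illustration only.
COUNTRIES = [
    {"name": "Country A", "location": "Region 1", "population": 10, "area": 500},
    {"name": "Country B", "location": "Region 1", "population": 40, "area": 200},
    {"name": "Country C", "location": "Region 2", "population": 90, "area": 900},
]

# Plays the role of the nl:hash mapping: question word -> stored property.
ATTRIBUTE_MAP = {"population": "population", "area": "area"}


def country_with_max(region, attribute):
    """Return the country in the region whose mapped attribute is largest."""
    key = ATTRIBUTE_MAP[attribute]
    candidates = [c for c in COUNTRIES if c["location"] == region]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[key])["name"]
```

Questions about a minimum (such as the lowest infant mortality rate) would need a second schema or an extra parameter; the sketch covers only the `max` action shown in the listing.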

Start. Parameterized annotations

l Is Canada's coastline longer than Russia's coastline?

l Which country has the larger population, Germany or Japan?

l Is Nigeria's population bigger than that of South Africa?

Start. Parameterized annotations

<nl:InformationAccessSchema>
  <nl:ann>$country-1's $att is larger than $country-2's $att</nl:ann>
  <nl:pattern>?x a :Country</nl:pattern>
  <nl:pattern>?x map($att) ?val-1</nl:pattern>
  <nl:pattern>?y a :Country</nl:pattern>
  <nl:pattern>?y map($att) ?val-2</nl:pattern>
  <nl:action>display(gt(?val-1, ?val-2))</nl:action>
  <nl:mapping>
    <nl:hash variable="$att">
      <nl:map value="population">:population</nl:map>
      <nl:map value="area">:area</nl:map>
      ...
    </nl:hash>
  </nl:mapping>
</nl:InformationAccessSchema>
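
A rough sketch of the comparison schema follows: the annotation template binds two countries and one attribute, and the action compares the two values. The sketch assumes the question has already been normalized to the statement form of the annotation, which in Start is the job of the language analysis; the regex, the `VALUES` table (placeholder figures), and `compare` are all hypothetical.

```python
import re

# "$country-1's $att is larger than $country-2's $att" as a regex;
# the back-reference (?P=att) forces both sides to name the same attribute.
TEMPLATE = re.compile(
    r"(?P<c1>.+)'s (?P<att>\w+) is larger than (?P<c2>.+)'s (?P=att)$"
)

# Placeholder figures, invented for illustration only.
VALUES = {("Country A", "area"): 300, ("Country B", "area"): 120}


def compare(statement):
    """Return True/False for a recognized comparison, None otherwise."""
    m = TEMPLATE.match(statement)
    if m is None:
        return None
    v1 = VALUES.get((m.group("c1"), m.group("att")))
    v2 = VALUES.get((m.group("c2"), m.group("att")))
    if v1 is None or v2 is None:
        return None
    return v1 > v2  # corresponds to the gt(...) action
```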

 

Start. Answer-finding schemes

l What is the distance from Japan to South Korea?

l How far is the United States from Russia?

l What's the distance between Germany and England?

l Plan for answering such questions:

l Find the capital of one country

l Find the capital of another country

l Calculate the distance between them.
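
The three-step plan can be sketched as a composition of two simpler lookups. The capital table, the approximate coordinates, and the use of the haversine great-circle formula are assumptions of this sketch; the lecture does not say how Start computes the distance.

```python
import math

CAPITAL = {"Japan": "Tokyo", "South Korea": "Seoul"}
# Approximate (latitude, longitude) in degrees.
COORDS = {"Tokyo": (35.68, 139.69), "Seoul": (37.57, 126.98)}
EARTH_RADIUS_KM = 6371.0


def capital_of(country):
    """Steps 1 and 2 of the plan: find the capital of a country."""
    return CAPITAL[country]


def distance_km(city1, city2):
    """Step 3: great-circle distance between two cities (haversine formula)."""
    lat1, lon1 = map(math.radians, COORDS[city1])
    lat2, lon2 = map(math.radians, COORDS[city2])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_between(country1, country2):
    """Run the plan: two capital lookups, then one distance computation."""
    capital1 = capital_of(country1)
    capital2 = capital_of(country2)
    return distance_km(capital1, capital2)
```

Each step is itself an answerable question, so the plan reuses existing knowledge instead of storing a distance for every pair of countries.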

Start. Answer-finding schemes

<nl:InformationPlanningSchema>
  <nl:ann>distance between $country1 and $country2</nl:ann>
  <nl:plan>
    <rdf:Seq>
      <rdf:li>what is the capital of $country1 := ?capital1</rdf:li>
      <rdf:li>what is the capital of $country2 := ?capital2</rdf:li>
      <rdf:li>what is the distance between ?capital1 and ?capital2 := ?distance</rdf:li>
    </rdf:Seq>
  </nl:plan>
  <nl:action>display(?distance)</nl:action>
</nl:InformationPlanningSchema>

Start. Natural Language Annotations

l Benefits:

l Information is described in a universal, natural language

l One parameterized annotation can handle hundreds of questions

l The annotation process can be automated

l Requests to Omnibase are also described using annotations

l Disadvantages:

l Implementation complexity


Start. Current development

l Processing composite queries and their automatic decomposition

l Expanding the knowledge base to increase information redundancy

l Automation of the analysis of semantic links in documents

l Introduction of annotations

l Specific answer-finding strategies for different subject areas.

