Statistical techniques for natural language analysis

Lecture

Example

• The dog ate.

Problem

• Salespeople sold the dog biscuits.

The principle of choosing a part of speech

Accuracy

• Naive tagger (most frequent tag for each word): 90%

• Modern taggers: 97%

• Human: 98%

Hidden Markov Models

Another approach (transformational tagging)

• Start by applying the naive algorithm.

• Then use a set of rules of the form:

• Change a word's tag from X to Y if the tag of the previous word is Z.

• Apply these rules a number of times.

• Runs faster.

• HMM training vs. transformational tagging (TT) training

(No starting base)
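The tagging steps above can be sketched as follows. This is a minimal illustration, assuming the "dumb" starting algorithm assigns each word its most frequent tag in a tagged training set. The function names, the toy training data and the single rule are hypothetical, and learning the rules themselves is not shown.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def apply_rules(tags, rules):
    """Apply each (from_tag, to_tag, prev_tag) rule in order, left to right."""
    tags = list(tags)
    for from_tag, to_tag, prev_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return tags

def tag(words, baseline, rules, default="noun"):
    """Tag with the baseline first, then correct the result with the rules."""
    initial = [baseline.get(word, default) for word in words]
    return apply_rules(initial, rules)

# Toy training data (hypothetical): "can" is usually a modal.
training = [
    [("the", "det"), ("dog", "noun"), ("can", "modal"), ("run", "verb")],
    [("you", "pron"), ("can", "modal"), ("go", "verb")],
    [("the", "det"), ("can", "noun"), ("rusted", "verb")],
]
baseline = train_baseline(training)

# One rule: change modal to noun if the previous word's tag is det.
rules = [("modal", "noun", "det")]
result = tag(["the", "can", "rusted"], baseline, rules)
```

Here the baseline alone tags "can" as a modal, and the rule corrects it to a noun because the previous tag is a determiner.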

Treebank

• Build a tree for each sentence, using the existing grammatical rules.

• Example:

(s (np (det the) (noun stranger))
   (vp (verb ate)
       (np (det the) (noun donut)
           (pp (prep with) (np (det a) (noun fork))))))

Your own statistical parser

• Checking the parser:

• Ready-made parses are available in the Penn Treebank.

• Compare the parser's output with them.

• Find the rules to apply.

• Assign probabilities to the rules.

• Find the most likely parse.

PCFG (probabilistic context-free grammars)

• s → np vp             (1.0)
• vp → verb np          (0.8)
• vp → verb np np       (0.2)
• np → det noun         (0.5)
• np → noun             (0.3)
• np → det noun noun    (0.15)
• np → np np            (0.05)

Computing the probability of a parse tree
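In a PCFG, the probability of a parse tree is the product of the probabilities of the rules applied at its nodes. The sketch below uses the rule probabilities listed above, with the start symbol written as s. The nested-tuple tree encoding and the function name are assumptions made for illustration, and word probabilities are treated as 1 to keep the example small.

```python
import math

# Rule probabilities from the grammar above: (lhs, rhs) -> probability.
PCFG = {
    ("s", ("np", "vp")): 1.0,
    ("vp", ("verb", "np")): 0.8,
    ("vp", ("verb", "np", "np")): 0.2,
    ("np", ("det", "noun")): 0.5,
    ("np", ("noun",)): 0.3,
    ("np", ("det", "noun", "noun")): 0.15,
    ("np", ("np", "np")): 0.05,
}

def tree_probability(tree, grammar):
    """Multiply the probabilities of all rules used in a nested-tuple tree."""
    label, *children = tree
    # A preterminal looks like ("noun", "dog"); words contribute 1 here
    # (assumption: lexical probabilities are ignored in this sketch).
    if len(children) == 1 and isinstance(children[0], str):
        return 1.0
    rule = (label, tuple(child[0] for child in children))
    probability = grammar.get(rule, 0.0)  # unknown rules get probability 0
    for child in children:
        probability *= tree_probability(child, grammar)
    return probability

# Two readings of "Salespeople sold the dog biscuits".
biscuits_for_dogs = (
    "s",
    ("np", ("noun", "salespeople")),
    ("vp", ("verb", "sold"),
     ("np", ("det", "the"), ("noun", "dog"), ("noun", "biscuits"))),
)
biscuits_to_the_dog = (
    "s",
    ("np", ("noun", "salespeople")),
    ("vp", ("verb", "sold"),
     ("np", ("det", "the"), ("noun", "dog")),
     ("np", ("noun", "biscuits"))),
)

p_first = tree_probability(biscuits_for_dogs, PCFG)
p_second = tree_probability(biscuits_to_the_dog, PCFG)
```

Under these numbers the first reading scores about 0.036 and the second about 0.009, so a parser that picks the most likely tree would prefer the first.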

Build your own PCFG: the simple option

• Take the ready-made Penn Treebank.

• Read all the trees from it.

• Go through each tree.

• Add every new rule.

• P(rule) = number of occurrences of the rule divided by the total number of occurrences of rules with the same left-hand side.
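The counting procedure above can be sketched as follows. The sketch assumes that "total" means the number of occurrences of all rules with the same left-hand side, which is the usual relative-frequency estimate. The nested-tuple tree format, the function names and the two-tree toy treebank are hypothetical, and reading the actual Penn Treebank files is not shown.

```python
from collections import Counter

def rules_in(tree):
    """Yield (lhs, rhs) for every internal node of a nested-tuple tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return  # preterminal such as ("noun", "dog"): no grammar rule
    yield label, tuple(child[0] for child in children)
    for child in children:
        yield from rules_in(child)

def estimate_pcfg(trees):
    """Relative-frequency estimate: count(rule) / count(rules with same lhs)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for tree in trees:
        for lhs, rhs in rules_in(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Toy treebank (hypothetical) standing in for the real one.
toy_treebank = [
    ("s",
     ("np", ("det", "the"), ("noun", "dog")),
     ("vp", ("verb", "ate"), ("np", ("noun", "biscuits")))),
    ("s",
     ("np", ("noun", "salespeople")),
     ("vp", ("verb", "sold"), ("np", ("det", "the"), ("noun", "dog")))),
]
grammar = estimate_pcfg(toy_treebank)
```

In this toy treebank the np node is expanded four times, twice as det noun and twice as noun, so each of those two rules gets probability 0.5.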

Two state-of-the-art statistical parsers. Markov grammars

• They solve the problem of very rare rules.

• Idea: instead of storing rules, we store probabilities from which rules can be assembled, for example the probability of an expansion of the form

• np = prep + ...
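One way to picture the idea is a first-order model over the children of a node: the probability of an expansion is the product of the probabilities of each child given the previous one. This is a simplified sketch under that assumption, not the model used by any particular parser. The start and stop markers, the function names and the toy counts are hypothetical.

```python
import math
from collections import Counter, defaultdict

START, STOP = "<start>", "<stop>"

def train_markov_grammar(expansions):
    """Count child-to-child transitions for each left-hand side."""
    counts = defaultdict(Counter)
    for lhs, rhs in expansions:
        sequence = (START, *rhs, STOP)
        for prev, nxt in zip(sequence, sequence[1:]):
            counts[(lhs, prev)][nxt] += 1
    return counts

def expansion_probability(counts, lhs, rhs):
    """Probability of lhs -> rhs as a product of transition probabilities."""
    sequence = (START, *rhs, STOP)
    probability = 1.0
    for prev, nxt in zip(sequence, sequence[1:]):
        context = counts.get((lhs, prev))
        if not context:
            return 0.0  # this context was never observed
        probability *= context[nxt] / sum(context.values())
    return probability

# Toy observations (hypothetical): three np expansions seen in a treebank.
observed = [
    ("np", ("det", "noun")),
    ("np", ("det", "adj", "noun")),
    ("np", ("det", "noun", "noun")),
]
model = train_markov_grammar(observed)

# This expansion never occurs in the observations, yet it is not impossible.
p_unseen = expansion_probability(model, "np", ("det", "adj", "noun", "noun"))
```

The expansion det adj noun noun was never observed as a whole, but every step in it was, so it receives a nonzero probability instead of being ruled out.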

Lexicalized parsing p ( s , )  p ( h ( c ) m ( c ), t ( c ))  p ( r ( c ) h ( c ))

• Assign to each node of the tree a word (its head) that characterizes it.

• p(r | h) is the probability that rule r is applied at a node with head h.

• p(h | m, t) is the probability that a node has head h, given that its parent's head is m and the node's own tag is t.

Lexicalized parsing

• Example

(S (NP The (ADJP most troublesome) report)
   (VP may
       (VP be
           (NP (NP the August merchandise trade deficit)
               (ADJP due (ADVP out) (NP tomorrow))))))

• p(h | m, t) = p(be | may, vp)

• p(r | h) = p(posvp → aux np | be)

Lexicalized parsing

• “the August merchandise trade deficit”

• rule = np → det propernoun noun noun noun

Conditioning events    p(“August”)    p(rule)
Nothing                2.7 × 10^-4    3.8 × 10^-5
Part of speech         2.8 × 10^-3    9.4 × 10^-5
h(c) = “deficit”       1.9 × 10^-1    6.3 × 10^-3
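To see how much the head word helps, the figures in the table can be compared directly. The short calculation below only divides the values in the last row by those in the first row; the variable names are chosen for illustration.

```python
# Values copied from the table: probability with no conditioning
# and with conditioning on h(c) = "deficit".
p_august_nothing, p_august_head = 2.7e-4, 1.9e-1
p_rule_nothing, p_rule_head = 3.8e-5, 6.3e-3

# How many times more likely each event becomes once the head is known.
gain_august = p_august_head / p_august_nothing
gain_rule = p_rule_head / p_rule_nothing
```

Knowing the head makes the word "August" roughly 700 times more likely and the rule roughly 170 times more likely than with no conditioning at all.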
