On the Mathematical Properties of Linguistic Theories
정영임 (Jeong Yeong-im), master's student, Interdisciplinary Program in Cognitive Science

Upload: wilfred-mason

Post on 13-Dec-2015

1. Introduction

The development of new formalisms for metatheories of linguistic theories
- Decidability
- Generative capacity
- Recognition complexity

Linguistic theories
- Context-Free Grammar (CFG)
- Transformational Grammar (TG)
- Lexical-Functional Grammar (LFG)
- Generalized Phrase Structure Grammar (GPSG)
- Tree Adjunct Grammar (TAG)
- Stratificational Grammar (SG)

2. Preliminary Definition

Elementary definitions from complexity theory

If cw(n) is O(g), then the worst-case time complexity of M is O(g)
-> every input to M of size n > n0 can be processed in time at most K*g(n)

A1 and A2 are available algorithms for f, with worst-case complexities O(g1) and O(g2), where g1 ≤ g2
-> A2 may still be the preferable algorithm on the inputs of interest (∵ the constants may satisfy K1 > K2)

- CS: context-sensitive
- r.e.: recursively enumerable

- f(x): a recognition function of a language L
- M: an algorithm for f
- c(x): the cost (time and space) of executing M on a specific input x
- cw: a function whose argument is n (the size of the input to M)
- cw(n): the maximum of c(x) over all inputs of length n, the worst-case complexity function for M
- ce(n): the average of c(x) over all inputs of length n, the expected complexity function for M
- f is O(g): there are constants K and n0 such that f(n) < K*g(n) for all n > n0
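To make cw(n) and ce(n) concrete, here is a minimal Python sketch for a toy recognizer. The language (strings over {a, b} that contain at least one b), the left-to-right scan, the cost measure (characters examined), and the uniform distribution over inputs are all illustrative assumptions, not taken from the slides.

```python
from itertools import product

def cost(x: str) -> int:
    """c(x): characters examined by a left-to-right scan that stops at the first 'b'."""
    for i, ch in enumerate(x, start=1):
        if ch == "b":
            return i
    return len(x)  # no 'b' found: the whole input was read

def all_inputs(n: int):
    """Every string of length n over the assumed alphabet {a, b}."""
    return ["".join(chars) for chars in product("ab", repeat=n)]

def worst_case(n: int) -> int:
    """cw(n): the maximum of c(x) over all inputs of length n."""
    return max(cost(x) for x in all_inputs(n))

def expected(n: int) -> float:
    """ce(n): the average of c(x) over all inputs of length n (uniform distribution assumed)."""
    inputs = all_inputs(n)
    return sum(cost(x) for x in inputs) / len(inputs)
```

For this toy recognizer cw(n) = n, so the worst case is O(n), while ce(n) stays below 2 for every n, so the expected cost is O(1). The two measures can differ sharply for the same algorithm.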

Two machine models

Sequential models (Aho et al. 1974)
- Single-tape and multitape Turing machines (TM), random-access machines (RAM), random-access stored-program machines (RASP)
- Polynomially related
- Transforming a sequential algorithm into a parallel one yields at most a factor K improvement in speed

Parallel models
- Polynomial number of processors and circuit depth O(s²)

3. Context-Free Languages

Recognition techniques for CFLs

CKY or dynamic programming (Hays, J. Cocke, Kasami, Younger)
- Requires the grammar in Chomsky Normal Form
- Converting a grammar to Chomsky Normal Form may square its size

Earley’s algorithm
- Recognizes CFGs in time O(n³) and space O(n²), and unambiguous CFGs in time O(n²)

Ruzzo (1979)
- Boolean circuits of depth O(log² n)
- Parallel recognition is accomplished in O(log² n) time

Cf. the possible number of parses of some grammatical sentences of length n: 2ⁿ (Church and Patil 1982)
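The dynamic-programming idea behind CKY can be sketched in a few lines of Python. This is a minimal recognizer, not a parser. The dictionary encoding of the grammar and the example grammar for aⁿbⁿ (S -> A B | A X, X -> S B, A -> a, B -> b) are assumptions made for illustration.

```python
def cky_recognize(words, unary, binary, start="S"):
    """Return True if `words` is derivable from `start` in a grammar in Chomsky Normal Form.

    unary:  dict mapping a terminal to the set of nonterminals A with A -> terminal
    binary: dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    """
    n = len(words)
    if n == 0:
        return False  # the empty string is not handled in this sketch
    # table[i][j] holds the nonterminals that derive words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(unary.get(w, ()))
    for span in range(2, n + 1):            # substring length
        for i in range(n - span + 1):       # start position
            j = i + span
            for k in range(i + 1, j):       # split point
                for b in table[i][k]:
                    for c in table[k][j]:
                        table[i][j] |= binary.get((b, c), set())
    return start in table[0][n]

# Hypothetical CNF grammar for a^n b^n with n >= 1.
ANBN_UNARY = {"a": {"A"}, "b": {"B"}}
ANBN_BINARY = {("A", "B"): {"S"}, ("A", "X"): {"S"}, ("S", "B"): {"X"}}
```

For a fixed grammar, the three nested loops over span, start position, and split point account for the cubic time bound, and the table accounts for the quadratic space bound.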

4. Transformational Grammar

Peters and Ritchie (1973a)
- Their formalization reflects transformations that move, add, and delete constituents, where deletions are recoverable.
- Every r.e. set can be generated by applying a set of transformations to a CS base grammar.
- The base grammar can be independent of the language being generated. The universal base hypothesis is empirically vacuous.
- If S is recursive in the CF base, then L is predictably enumerable and exponentially bounded.
- If all recursion in the base grammar passes through S and all derivations satisfy the terminal-length-increasing condition, then the generated language is recursive.

Rounds (1975)
- Under the terminal-length-nondecreasing condition and recoverability of deletions, the languages generated by TGs are exactly the languages recognizable in exponential time.
- This class includes the NP-complete problems.

Berwick
- His formalization is unusual in that it reduces grammaticality to well-formedness conditions on the surface structure.
- For a GB grammar G there is a constant K such that, for every surface structure s with yield w, the number of nodes in s is at most K*length(w).
- GB languages have the linear growth (arithmetic growth) property.

Problems with Berwick’s formalization
- The formalization is a radical simplification.
- Recognition complexity under other constraints is not settled.
- There are no immediate consequences for recognition complexity or for weak generative capacity.

5. Lexical-Functional Grammar

Kaplan and Bresnan (1982)
- Makes no use of transformations
- Two levels of syntactic structure: constituent structure and functional structure

Berwick (1982)
- There is a set of strings whose recognition problem is NP-complete and which is an LFL (lexical-functional language).
- The complexity of LFG comes from finding the assignment of truth-values to the variables.

6. Generalized Phrase Structure Grammar

Gerald Gazdar (1982)
- Suited to the systematic description and processing of natural language, English in particular
- A theory of syntactic features and rules based on unification
- Universal principles and formal constraints

6.1. Node admissibility

Interpretation of context-free rules
- As rewriting rules
- As constraints (node admissibility conditions)

Rounds (1970): finite-state tree automata (FSTA)
- Top-down FSTA: (q, a, n) => (q1, ..., qn)
- Bottom-up FSTA: (a, n, (q1, ..., qn)) => q
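The bottom-up transition format (a, n, (q1, ..., qn)) => q can be illustrated with a small Python sketch. It is a deterministic simplification, and the tree encoding, the state names, and the three-rule grammar (S -> NP VP, NP -> Lee, VP -> sleeps) are hypothetical choices made only for this example.

```python
def run_bottom_up(tree, delta):
    """Return the state assigned to the root of `tree`, or None if some node is not admitted.

    tree:  (label, [subtrees])
    delta: dict mapping (a, n, (q1, ..., qn)) to q, mirroring (a, n, (q1, ..., qn)) => q
    """
    label, children = tree
    child_states = []
    for child in children:
        q = run_bottom_up(child, delta)
        if q is None:
            return None  # a subtree was rejected, so the whole tree is rejected
        child_states.append(q)
    return delta.get((label, len(children), tuple(child_states)))

# Hypothetical transitions encoding S -> NP VP, NP -> Lee, VP -> sleeps.
DELTA = {
    ("Lee", 0, ()): "lee",
    ("sleeps", 0, ()): "sleeps",
    ("NP", 1, ("lee",)): "NP",
    ("VP", 1, ("sleeps",)): "VP",
    ("S", 2, ("NP", "VP")): "S",
}
GOOD_TREE = ("S", [("NP", [("Lee", [])]), ("VP", [("sleeps", [])])])
BAD_TREE = ("S", [("VP", [("sleeps", [])]), ("NP", [("Lee", [])])])
```

A tree is accepted when the root state is a final state, here "S". Each transition checks one node against the states of its daughters, which is the node-admissibility reading of a context-free rule.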

6.2. Metarules

Gazdar (1982)
- Rules that apply to rules to produce other rules

E.g. the passive metarule

W: multiset variable

VP -> H[2], NP                        The beast ate the meat.
VP -> H[3], NP, PP[to]                Lee gave this to Sandy.
VP[PAS] -> H[2], (PP[by])             The meat was eaten by the beast.
VP[PAS] -> H[3], PP[to], (PP[by])     This was given to Sandy by Lee.

VP -> W, NP  =>  VP[PAS] -> W, (PP[by])
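As a rough illustration of a rule that maps rules to rules, the passive metarule can be sketched in Python. Rules are encoded as (left-hand side, right-hand side) pairs, which is an assumption of this sketch. The intransitive rule VP -> H[1] is a hypothetical addition included only to show a rule the metarule does not match.

```python
def passive_metarule(rules):
    """Apply VP -> W, NP  =>  VP[PAS] -> W, (PP[by]) to every matching rule."""
    derived = []
    for lhs, rhs in rules:
        if lhs == "VP" and "NP" in rhs:
            w = list(rhs)
            w.remove("NP")  # W is whatever remains once one NP is taken out
            derived.append(("VP[PAS]", tuple(w) + ("(PP[by])",)))
    return derived

BASE_RULES = [
    ("VP", ("H[2]", "NP")),
    ("VP", ("H[3]", "NP", "PP[to]")),
    ("VP", ("H[1]",)),  # hypothetical intransitive rule: no NP, so no passive counterpart
]
```

Applied to BASE_RULES, the function yields the two VP[PAS] rules listed above and nothing for the intransitive rule. Tuples are used for convenience; the order of categories on the right-hand side carries no meaning in this sketch.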

Two devices (or constraints) in metarules
- Essential variables
- Phantom categories

7. Tree Adjunct Grammar

Joshi (1982, 1984)
- A TAG consists of two finite sets of finite trees, the center trees and the adjunct trees.
- Adjunction operation
- CFLs ⊂ TALs ⊂ indexed languages ⊂ CSLs

[Figure: the adjunction operation, center tree c + adjunct tree a => derived tree, with adjunction at a node labeled A]
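The adjunction operation can be sketched in Python under a few assumptions: trees are (label, [children]) pairs, the foot node of an adjunct tree is marked with a trailing "*", and the adjunction site is given as a path of child indices. None of these encodings come from the slides, and the example trees are hypothetical.

```python
def substitute_foot(adjunct, subtree):
    """Return a copy of `adjunct` with its foot node (label ending in '*') replaced by `subtree`."""
    label, children = adjunct
    if label.endswith("*"):
        return subtree
    return (label, [substitute_foot(child, subtree) for child in children])

def adjoin(tree, path, adjunct):
    """Adjoin `adjunct` at the node of `tree` reached by following `path` (a tuple of child indices)."""
    label, children = tree
    if not path:
        if adjunct[0] != label:
            raise ValueError("root of the adjunct tree must match the adjunction site")
        # The subtree at the site is excised and re-attached at the foot of the adjunct tree.
        return substitute_foot(adjunct, tree)
    i = path[0]
    new_children = list(children)
    new_children[i] = adjoin(children[i], path[1:], adjunct)
    return (label, new_children)

def yield_of(tree):
    """Left-to-right sequence of leaf labels."""
    label, children = tree
    if not children:
        return [label]
    return [leaf for child in children for leaf in yield_of(child)]

# Hypothetical center tree S(c) and adjunct tree S(a, S*, b).
CENTER = ("S", [("c", [])])
ADJUNCT = ("S", [("a", []), ("S*", []), ("b", [])])
```

Adjoining ADJUNCT once at the root of CENTER gives the yield a c b, and adjoining it again at the inner S node gives a a c b b.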

8. Stratificational Grammar

Stratificational Grammar (Lamb 1966, Gleason 1964)

Strata
- Linearly ordered and constrained by a realization relation

Realization relation
- Pairs the application of specific productions in the different grammars (e.g. the pairing of syntactic and semantic rules (Montague))

Two-level stratificational grammar
- Rewriting grammars G1 and G2
- Relation R: a finite set of pairs (P1, P2) of strings of productions
- A derivation D1 in G1 is realized by a derivation D2 in G2 if the strings of productions s1 and s2 used in D1 and D2 can be decomposed into substrings s1 = u1...un, s2 = v1...vn such that R(ui, vi) for every i
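The decomposition condition on s1 and s2 can be checked by a memoized search. The Python sketch below assumes that productions are represented by names and that R is given as a list of pairs of name sequences. The production names p1, p2, p3, q1, q2, q3 are hypothetical.

```python
from functools import lru_cache

def realizes(s1, s2, R):
    """True if s1 = u1...un and s2 = v1...vn with (ui, vi) in R for every i."""
    s1, s2 = tuple(s1), tuple(s2)
    pairs = [(tuple(u), tuple(v)) for u, v in R]

    @lru_cache(maxsize=None)
    def match(i, j):
        # Can the suffixes s1[i:] and s2[j:] be decomposed into pairs from R?
        if i == len(s1) and j == len(s2):
            return True
        for u, v in pairs:
            if not u and not v:
                continue  # skip the empty pair, which would never make progress
            if s1[i:i + len(u)] == u and s2[j:j + len(v)] == v:
                if match(i + len(u), j + len(v)):
                    return True
        return False

    return match(0, 0)

# Hypothetical realization relation over production names.
R_EXAMPLE = [
    (("p1",), ("q1", "q2")),
    (("p2", "p3"), ("q3",)),
]
```

With R_EXAMPLE, the production string p1 p2 p3 is realized by q1 q2 q3 through the decomposition (p1)(p2 p3) and (q1 q2)(q3), whereas p2 p3 p1 is not realized by q1 q2 q3.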

9. Seeking Significance

How to select the most useful metatheoretical results among syntactic theories?

=> Claim that the computationally most restrictive theory is preferable!

9.1. Coverage

Scope (Linebarger 1980)
- An item is in the immediate scope of NOT if
  (1) it occurs only in the proposition which is the entire scope of NOT
  (2) within the proposition there are no logical elements intervening between it and NOT

Polarity reverser (Ladusaw 1979)

1. A negative polarity item will be acceptable only if it is in the scope of a polarity-reversing expression.

2. For any two expressions α and β, constituents of a sentence S, α is in the scope of β with respect to a composition structure of S, S’, iff the interpretation of α is used in the formulation of the argument of β’s interpretation in S’.

3. An expression D is a polarity reverser with respect to an interpretation function Φ if and only if, for all expressions X and Y,

Φ(X) ⊆ Φ(Y) => Φ(D(Y)) ⊆ Φ(D(X))
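The inclusion-reversing condition in item 3 can be tested by brute force on a small model. In the Python sketch below, denotations are subsets of a three-element universe and an expression is modeled directly as a function on denotations. Both are simplifying assumptions, as is the use of set complement as a toy model of negation.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_polarity_reverser(d, universe):
    """Check that X ⊆ Y implies d(Y) ⊆ d(X) for all denotations X, Y drawn from `universe`."""
    all_sets = subsets(universe)
    return all(d(y) <= d(x) for x in all_sets for y in all_sets if x <= y)

UNIVERSE = frozenset({1, 2, 3})

def negation(x):
    """Toy model of NOT: the complement of a denotation within UNIVERSE."""
    return UNIVERSE - x

def identity(x):
    """An operator that leaves the denotation unchanged."""
    return x
```

On this model the complement operator reverses inclusions and so counts as a polarity reverser, while the identity operator preserves them and does not.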

Constraint separation
- Across the syntax-semantics boundary (e.g. polarity sensitivity)
- Within syntax (e.g. GB, LFG)
- Separation sometimes has a beneficial computational effect.
  e.g. separating the constraints imposed by CFGs from the constraints imposed by indexed grammars => recognition complexity remains low-order polynomial

9.2. Metatheoretical results as lower bounds

What are the minimal generative capacity and recognition complexity of actual languages?

9.3. Metatheoretical results as upper bounds

The class of possible languages could contain languages that are not recursive.

Putnam (1961)
- Languages might just happen to be recursive.

Peters and Ritchie (1973)
1. Every TG has an exponentially bounded cycling function, and thus generates only recursive languages.
2. Every natural language has a descriptively adequate TG.
3. The complexity of the languages investigated so far is typical of the class.

O(g) results
- Asymptotic worst-case measures
- Depend on the machine model (e.g. RAMs)