Learning of Primitive Formal Systems Defining Labeled Ordered Tree Languages via Queries

Tomoyuki Uchida 1, Satoshi Matsumoto 2, Takayoshi Shoudai 3, Yusuke Suzuki 1, Tetsuhiro Miyahara 1
1 Hiroshima City University, Japan, 2 Tokai University, Japan, 3 Kyushu International University, Japan

ILP2017



Contents

1. Introduction
2. Primitive Formal Systems Defining Labeled Ordered Tree Languages
   a. Term Tree Patterns
   b. Primitive Formal Ordered Tree Systems (pFOTS) and pFOTS Languages (New!)
3. Contributions (New!)
4. Conclusions and Future Work

Learning of Formal Systems Defining Graph Languages

Learning string grammars: given a language 𝐿 = {ab, aabb, aaabbb, …}, learn a string grammar Δ such that 𝐿 = 𝐿(Δ, 𝑝), where Δ = {𝑝(ab) ←, 𝑝(a𝑥b) ← 𝑝(𝑥)}.
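The string grammar above can be run directly as a tiny program. The sketch below illustrates the grammar Δ = {𝑝(ab) ←, 𝑝(a𝑥b) ← 𝑝(𝑥)} as a recursive Python predicate; the function name `p` and the string encoding are our own choices for illustration, not part of the original formalism.

```python
def p(w: str) -> bool:
    """Decide membership of w in L(Delta, p) = {a^n b^n | n >= 1}."""
    # Base clause p(ab) <-
    if w == "ab":
        return True
    # Recursive clause p(a x b) <- p(x): strip one 'a' and one 'b'
    if len(w) > 2 and w[0] == "a" and w[-1] == "b":
        return p(w[1:-1])
    return False
```

For example, p("aaabbb") holds while p("abab") does not, matching 𝐿 = {ab, aabb, aaabbb, …}.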

Learning graph grammars: given a graph language 𝐿 (a set of graphs, shown as figures on the slide), learn a graph grammar Γ such that 𝐿 = 𝐿(Γ, 𝑝), where Γ consists of a base rule 𝑝(·) ← and a recursive rule 𝑝(·) ← 𝑝(·) over the graphs drawn in the figure.

1. Introduction

Formal Graph System (FGS) [a first-order logic programming system; Uchida+ 1995]
Elementary Formal System (EFS) [a first-order logic programming system; Smullyan 1961; Arikawa+ 1992; Miyano+ 2000; Sakamoto+ 2003]


We consider the efficient learnability, via queries, of formal systems defining ordered tree languages by using background knowledge.

[Figure: glycan data represented as a labeled ordered tree, with vertex labels such as Gal, GlcNAc, Man and edge labels such as a3, a6, b2, b4; the root and leaves are marked.]

1. A machine learning strategy for large graph data is to create a new rule representing replacements of subgraphs, provable from background knowledge, with appropriate variables.
2. The subgraph isomorphism problem for ordered trees is solvable in polynomial time.
3. Ordered trees can represent much graph data, such as XML data, glycan data, and parse structures of natural languages.

A logic program is suitable for representing such data!

[Figure: parse tree of the sentence "Uchida gives a talk in Orléans", with nonterminals S, NP, VP, PP.]


Query Learning

The learner (a computer) proposes a tree 𝑇 from its hypothesis; the oracle (experts) answers yes if the target contains 𝑇, and no otherwise.
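In code, the query learning protocol above is just an interface between a learner and an oracle. The sketch below is a minimal illustration of our own, assuming trees are encoded as hashable nested tuples and the target language is a finite set; the class and method names are not from the paper.

```python
class Oracle:
    """Experts answering queries about an unknown target tree language."""

    def __init__(self, target_language):
        self._target = set(target_language)  # hidden from the learner

    def one_positive_example(self):
        """Hand the learner a single tree belonging to the target."""
        return next(iter(self._target))

    def member(self, tree) -> bool:
        """Membership query: is `tree` in the target language?"""
        return tree in self._target


# Trees encoded as (node_label, (children...)) tuples
oracle = Oracle({("a", ()), ("a", (("b", ()),))})
```

The learner only ever sees the query interface, never the set itself.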


[Figure: within the set of all ordered trees, the target set 𝐿(Γ ∪ {𝛼*}, 𝑟) and the target class of languages 𝐿(Γ ∪ {𝛼1}, 𝑟), 𝐿(Γ ∪ {𝛼2}, 𝑟), 𝐿(Γ ∪ {𝛼3}, 𝑟), …, with sample trees 𝑇1, 𝑇2, 𝑇3, …, 𝑇𝑛.]

Γ is a formal system defining an ordered tree language, used as background knowledge.

2(a). Term Tree Patterns

A term tree pattern 𝑡 = (𝑉𝑡, 𝐸𝑡) over (Σ, Λ ∪ 𝜒) is a node- and edge-labeled ordered tree having variables, which are edges labeled with variable labels.
• 𝑉𝑡: vertex set, with a vertex labeling 𝜓𝑡 : 𝑉𝑡 → Σ
• 𝐸𝑡 ⊆ 𝑉𝑡 × 𝑉𝑡: edge set, with an edge labeling 𝜑𝑡 : 𝐸𝑡 → Λ ∪ 𝜒
• Σ = {𝑎, 𝑏, 𝑐, 𝑑, …}: finite set of vertex labels
• Λ = {𝒂, 𝒃, 𝒄, …}: finite or infinite set of edge labels
• 𝜒 = {𝑥, 𝑦, 𝑧, …}: infinite set of variable labels

A term tree pattern over (Σ, Λ ∪ 𝜒) is linear if all variables have mutually distinct variable labels.

[Figure: a term tree pattern with vertex labels a, b, c, variable edges labeled x, y, z, and child order indicated by the numbers 1, 2 at each node.]

A primitive term tree pattern is a term tree pattern over (Σ, 𝜒) consisting of two nodes and one variable.

A substitution is a finite set of bindings for variable labels.

2(a). Term Tree Patterns – Substitution –

𝜃 = {𝑥 := [𝑡1, (𝑣1, 𝑣2)], 𝑦 := [𝑡2, (𝑤1, 𝑤2)], 𝑧 := [𝑡3, (𝑠1, 𝑠2)]} (one binding for each of 𝑥, 𝑦, 𝑧)

For a term tree pattern 𝑡 and a substitution 𝜃, a new ordered tree 𝑡𝜃 is obtained by replacing the variables of 𝑡 with the ordered trees of the bindings in 𝜃.
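The replacement 𝑡𝜃 can be sketched concretely. The encoding below is a simplification of our own, not the paper's definition: a variable occurrence carries the subtree below it, and a binding supplies an edge label plus a one-hole tree whose HOLE leaf marks where that subtree re-attaches. The actual definition instead identifies the two endpoints of the variable with the designated node pair (e.g. 𝑣1, 𝑣2) of the bound tree.

```python
# Simplified encoding (ours, not the paper's): an ordered tree is
# (node_label, children), children a list of (edge_label, subtree).
# A variable occurrence is the entry ("VAR", x, subtree): a variable
# edge labeled x above `subtree`.  We assume "VAR" is not itself used
# as an edge label.
HOLE = ("HOLE", [])  # attachment point inside a bound tree


def splice(tree, sub):
    """Plug `sub` into the unique HOLE leaf of `tree`."""
    if tree == HOLE:
        return sub
    label, children = tree
    return (label, [(e, splice(c, sub)) for e, c in children])


def substitute(pattern, theta):
    """Compute t.theta, where theta maps a variable label to a
    binding (edge_label, one-hole ordered tree)."""
    label, children = pattern
    out = []
    for c in children:
        if c[0] == "VAR":  # variable occurrence: replace it
            _, x, inner = c
            edge, ctx = theta[x]
            out.append((edge, splice(ctx, substitute(inner, theta))))
        else:  # ordinary labeled edge: recurse
            edge, sub = c
            out.append((edge, substitute(sub, theta)))
    return (label, out)
```

Applying theta = {"x": ("a", ("t", [("c", HOLE)]))} to the pattern ("s", [("VAR", "x", ("b", []))]) yields ("s", [("a", ("t", [("c", ("b", []))]))]): the bound tree is hung below the parent and the original subtree fills the hole.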

[Figure: the term tree pattern 𝑡; the bound trees 𝑡1 (with designated nodes 𝑣1, 𝑣2), 𝑡2 (with 𝑤1, 𝑤2), and 𝑡3 (with 𝑠1, 𝑠2); and the resulting ordered tree 𝑡𝜃. The endpoints of each variable are identified with the designated pair of nodes of the corresponding bound tree.]

2(b). Primitive Formal Ordered Tree Systems (pFOTS)

A primitive Formal Ordered Tree System (pFOTS) Γ is a finite set of primitive graph rewriting rules (definite clauses) over (Π, Σ, Λ ∪ 𝜒) of the form
𝑝(𝑡) ← 𝑞1(𝑓1), …, 𝑞𝑛(𝑓𝑛)  (𝑛 ≥ 1), or
𝑝(𝑡′) ←  (𝑛 = 0),
where 𝑡 is a linear term tree pattern over (Σ, Λ ∪ 𝜒), 𝑡′ is an ordered tree over (Σ, Λ), 𝑓1, …, 𝑓𝑛 are primitive term tree patterns over (Σ, Λ ∪ 𝜒), and 𝑝, 𝑞1, …, 𝑞𝑛 are unary predicate symbols in Π.

pFOTS Languages

For a pFOTS Γ over (Π, Σ, Λ ∪ 𝜒) and a predicate symbol 𝑝 in Π, the pFOTS language 𝐿(Γ, 𝑝) is the set of all ordered trees 𝑡 such that the graph rewriting rule '𝑝(𝑡) ←' is provable from Γ, that is,
𝐿(Γ, 𝑝) = {ordered tree 𝑡 | Γ ⊢ 𝑝(𝑡) ←}.

2(b). Example of a pFOTS and its pFOTS Language

[Figure: a pFOTS Γ_OT with base rules 𝑝(·) ← for small ordered trees and recursive rules 𝑝(·) ← 𝑝(·), 𝑝(·) over variable edges x, y; the language 𝐿(Γ_OT, 𝑝) is the set of ordered trees generated by these rules.]

3(b). Target Classes of the Query Learning Model

For a pFOTS Γ and a predicate symbol 𝑟 that does not appear in Γ, let pGRR(Γ, 𝑟) be the set of all primitive graph rewriting rules 𝛼 such that
• the predicate symbol in the head of 𝛼 is 𝑟, and
• every predicate symbol in the body of 𝛼 appears in the head of a primitive graph rewriting rule in Γ.

[Figure: an example pFOTS Γ with predicates 𝑝 and 𝑞, and a rule 𝑟(·) ← 𝑞(·), 𝑝(·), 𝑞(·) belonging to pGRR(Γ, 𝑟).]

We consider the learnability via queries of the classes
1. {𝐿(Γ ∪ {𝛼}, 𝑟) | 𝛼 ∈ pGRR(Γ, 𝑟)}
2. {𝐿(Γ1 ∪ ⋯ ∪ Γ𝐾 ∪ {𝛼}, 𝑟) | 𝛼 ∈ pGRR(Γ1 ∪ ⋯ ∪ Γ𝐾, 𝑟)}
of sets of pFOTS languages for fixed pFOTSs Γ, Γ1, …, Γ𝐾, which are known in advance as background knowledge.

Π(Γ) denotes the set of all predicate symbols in Γ.

3. Contributions and Related Work

(Π(Γ) denotes the set of all predicate symbols in Γ.)

• BK Γ_OT, target: a rule 𝛼*: for |Λ| = 1, polynomial-time inductive inference from positive data [Suzuki+ 2006 (TCS)].
• BK Γ with |Π(Γ)| = 1, target: a rule 𝛼*: open for |Λ| = 1; for 2 ≤ |Λ| ≤ ∞, one positive example and membership queries [this work, Th. 1].
• BK Γ with |Π(Γ)| ≥ 2, target: a rule 𝛼*: open for |Λ| = 1; for 2 ≤ |Λ| ≤ ∞, one positive example and membership queries if Condition 2 holds [this work, Cor. 1].
• BK Γ1 ∪ ⋯ ∪ Γ𝐾, where |Π(Γ𝑘)| = 1 (1 ≤ 𝑘 ≤ 𝐾) and Π(Γ𝑖) ∩ Π(Γ𝑗) = ∅ (1 ≤ 𝑖 < 𝑗 ≤ 𝐾), target: rules 𝛼1*, …, 𝛼𝑁*: open for |Λ| = 1 (𝐾 > 1, 𝑁 = 1); for 2 ≤ |Λ| < ∞, one positive example and membership queries if Condition 2 holds [this work, Cor. 2].
• BK: none, target: a pFOTS Γ*: for |Λ| = 1, identification in the limit by polynomial-time update from positive data and membership queries if Condition 1 holds [Shoudai+ 2016 (ILP)]; for |Λ| = ∞, equivalence queries and restricted subset queries [Matsumoto+ 2008; Okada+ 2007].


3(a). Contributions and Related Work

[Figure: background knowledge Γ_OT defining the set of all ordered trees over ({𝑝}, {𝑎, 𝑏, 𝑐}, {𝒂, 𝒃}), together with a target rule 𝛼*; Γ_OT ∪ {𝛼*} defines the target language. The task is polynomial-time inductive inference of the rule 𝛼* from positive data (identification in the limit) [Suzuki+ 2006 (TCS)].]

3(a). Contributions and Related Work

[Figure: with no background knowledge, the target is a whole pFOTS Γ* in Chomsky normal form; the task is identification in the limit by polynomial-time update from positive data with membership queries [Shoudai+ 2016 (ILP)].]


3(b). Contributions and Related Work

[Figure: background knowledge Γ (a pFOTS with a single predicate 𝑝) and a target rule 𝛼* with head predicate 𝑟; Γ ∪ {𝛼*} defines the target language. Query learning model: one positive example and membership queries.]

3(b). Main Result: learnability of the class {𝐿(Γ ∪ {𝛼}, 𝑟) | 𝛼 ∈ pGRR(Γ, 𝑟)} via queries

Theorem 1. Let Γ be a fixed pFOTS over (Π, Σ, Λ ∪ 𝜒) with |Λ| ≥ 2 and |Π(Γ)| = 1, where Π(Γ) is the set of predicate symbols appearing in Γ. Let 𝑟 be a predicate symbol in Π that does not appear in Γ, i.e., 𝑟 ∈ Π ∖ Π(Γ), and let 𝛼* be the target rule in pGRR(Γ, 𝑟). Then a primitive graph rewriting rule 𝛼 ∈ pGRR(Γ, 𝑟) such that 𝐿(Γ ∪ {𝛼}, 𝑟) = 𝐿(Γ ∪ {𝛼*}, 𝑟) holds is exactly identified using 𝑂(𝑁²) membership queries and one positive example 𝑡, where 𝑁 is the number of edges of 𝑡.
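The flavor of this identification from one positive example plus membership queries can be conveyed on a string analogue. The sketch below is not the paper's algorithm: it generalizes one positive example position by position, asking one membership query per position, and it relies on having at least two labels to probe with, mirroring the |Λ| ≥ 2 assumption of Theorem 1. The name `learn_pattern` and the single-flip probe are our simplifications.

```python
def learn_pattern(example: str, member, alphabet="ab") -> str:
    """Generalize one positive example, one membership query per
    position (a string stand-in for generalizing each tree edge)."""
    pattern = []
    for i, ch in enumerate(example):
        # |alphabet| >= 2 guarantees a different probe label exists
        other = next(a for a in alphabet if a != ch)
        probe = example[:i] + other + example[i + 1:]
        # if changing this position keeps the string in the target
        # language, the position behaves like a variable ('*')
        pattern.append("*" if member(probe) else ch)
    return "".join(pattern)
```

With a target language of all length-3 strings that start with a and end with b, learn_pattern("aab", member) returns "a*b": only the middle position survives the probe, so only it is generalized.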


3(b). Contributions and Related Work

[Figure: background knowledge Γ with two predicates 𝑝 and 𝑞 (|Π(Γ)| = 2) and a target rule 𝛼* with head predicate 𝑟; Γ ∪ {𝛼*} defines the target language. Query learning model: one positive example and membership queries. BK Γ satisfies Condition 2 (explained in the poster session).]

3(b). Contributions and Related Work

[Figure: background knowledge Γ_OT ∪ Γ, pFOTSs with disjoint single predicates 𝑝 and 𝑞, and a target rule 𝛼* with head predicate 𝑟; Γ_OT ∪ Γ ∪ {𝛼*} defines the target language. Query learning model: one positive example and membership queries. BK Γ_OT and Γ satisfy Condition 2 [this work, Cor. 2].]

4. Conclusions

[The slide shows again the summary table of contributions and related work from Section 3: Th. 1 and Cor. 1 for BK Γ with 2 ≤ |Λ| ≤ ∞, Cor. 2 for BK Γ1 ∪ ⋯ ∪ Γ𝐾 with 2 ≤ |Λ| < ∞, the |Λ| = 1 cases open, alongside [Suzuki+ 2006 (TCS)], [Shoudai+ 2016 (ILP)], and [Matsumoto+ 2008; Okada+ 2007].]

4. Future Work

[The slide shows the same summary table; the cells marked Open, i.e., the |Λ| = 1 cases with background knowledge, indicate the remaining problems.]

THANK YOU FOR YOUR ATTENTION!

If you would like to know more, please ask us at the poster session.

Appendix!!

Motivation!

1. Efficient learning of graph classes using a logic programming system, in order to design efficient graph mining algorithms for graph data.
2. Formalization of our previous work using formal graph systems, which directly manipulate graphs.
3. and so on

Why the query learning model?

Query learning models the process of data mining through interaction between a computer and experts. One application of our results is therefore the design of efficient graph mining algorithms for tree-structured data such as XML files, glycan data, and so on.

[Figure: a computer proposes a hypothesis; experts answer yes if the hypothesis has the desired features, and no otherwise.]

Graph mining process

The learning algorithm Learning_pFOTS runs in two stages.

First Stage of Learning_pFOTS

[Slides: a walkthrough of the first stage. The learning algorithm receives one positive example in 𝐿(Γ ∪ {𝛼*}, 𝑟), matches the rules of the background knowledge Γ against it, and asks membership queries of the form '∈ 𝐿(Γ ∪ {𝛼*}, 𝑟)?' about candidate trees; the yes/no answers are used to update the hypothesis. The first stage ends when the hypothesis can no longer be updated.]

Second Stage of Learning_pFOTS

[Slides: the hypothesis tree is generalized step by step. An edge of the hypothesis is replaced by a variable and a membership query '∈ 𝐿(Γ ∪ {𝛼*}, 𝑟)?' is asked about the generalized tree; if the answer is yes, the generalization is kept. This process applies to all edges.]