Post on 17-Feb-2017
-
Harnessing Deep Neural Networks with Logic Rules
Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, and Eric P. Xing
ACL2016
16/09/12 8NLP
[Hu+ 16]
-
Example rules:
- Sentiment: a sentence of the form "A but B" takes the sentiment of clause B.
- NER: the tag I-ORG cannot follow B-PER:
  equal(y_{i-1}, B-PER) ⇒ ¬equal(y_i, I-ORG)
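Such a rule grounding can be scored as a soft truth value in [0, 1]; a minimal sketch (the function name and return convention are illustrative, not from the paper):

```python
def transition_rule(prev_label, cur_label):
    """Grounding of: equal(y_{i-1}, B-PER) => not equal(y_i, I-ORG).

    Returns a soft truth value in [0, 1]; 1.0 means the rule is satisfied.
    """
    if prev_label == "B-PER" and cur_label == "I-ORG":
        return 0.0
    return 1.0
```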
-
The student network p(y|x) (a CNN) is distilled toward a teacher network q(y|x) that encodes the logic rules; q(y|x) is built by projecting p(y|x) into the rule-constrained subspace.
Posterior regularization [Ganchev+ 10]
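The whole teacher-student loop alternates building q from p and moving the student toward a mix of gold labels and teacher outputs; a toy end-to-end sketch for a linear softmax student (the gradient-descent student, the single rule per example, and all names are illustrative assumptions, not the authors' code):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def distill(X, y_onehot, r, C=6.0, lam=1.0, pi=0.5, lr=1.0, steps=200):
    """Toy rule-distillation loop for a linear softmax student.

    X:        (N, D) input features
    y_onehot: (N, K) gold labels
    r:        (N, K) soft rule truth value for each candidate label
    """
    W = np.zeros((X.shape[1], y_onehot.shape[1]))
    for _ in range(steps):
        p = softmax(X @ W)                     # student predictions
        q = p * np.exp(-C * lam * (1.0 - r))   # project onto the rule
        q /= q.sum(axis=1, keepdims=True)      # teacher q(y|x)
        target = (1 - pi) * y_onehot + pi * q  # mixed training signal
        W -= lr * X.T @ (p - target) / len(X)  # cross-entropy gradient
    return W
```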
-
Student update: fit both the true labels and the teacher's soft predictions q(y|x):

min_θ  (1/N) Σ_{n=1}^{N} [ (1 − π) loss(y_n, σ_θ(x_n)) + π loss(q(y|x_n), σ_θ(x_n)) ]

where σ_θ(x_n) is the student's prediction on input x_n and π balances the two loss terms.
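The objective above can be written directly in code; a minimal NumPy sketch (shapes and names are assumptions, not the authors' implementation):

```python
import numpy as np

def student_loss(p_student, y_true, q_teacher, pi):
    """(1 - pi) * CE against hard labels + pi * CE against teacher soft labels.

    p_student: (N, K) student predictive probabilities sigma_theta(x_n)
    y_true:    (N,)   gold label indices y_n
    q_teacher: (N, K) teacher soft predictions q(y | x_n)
    pi:        imitation weight in [0, 1]
    """
    n = len(y_true)
    ce_hard = -np.log(p_student[np.arange(n), y_true])
    ce_soft = -np.sum(q_teacher * np.log(p_student), axis=1)
    return float(np.mean((1 - pi) * ce_hard + pi * ce_soft))
```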
-
p(y|x)
weights instead of relying on explicit rule representations, we can use p for predicting new examples at test time when the rule assessment is expensive or even unavailable (i.e., the privileged information setting (Lopez-Paz et al., 2016)) while still enjoying the benefit of integration. Besides, the second loss term in Eq.(2) can be augmented with rich unlabeled data in addition to the labeled examples, which enables semi-supervised learning for better absorbing the rule knowledge.
3.3 Teacher Network Construction

We now proceed to construct the teacher network q(y|x) at each iteration from p(y|x). The iteration index t is omitted for clarity. We adapt the posterior regularization principle in our logic constraint setting. Our formulation ensures a closed-form solution for q and thus avoids any significant increases in computational overhead.
Recall the set of FOL rules R = {(R_l, λ_l)}_{l=1}^{L}. Our goal is to find the optimal q that fits the rules while at the same time staying close to p. For the first property, we apply a commonly-used strategy that imposes the rule constraints on q through an expectation operator. That is, for each rule (indexed by l) and each of its groundings (indexed by g) on (X, Y), we expect E_{q(Y|X)}[r_{l,g}(X, Y)] = 1, with confidence λ_l. The constraints define a rule-regularized space of all valid distributions. For the second property, we measure the closeness between q and p with KL-divergence, and wish to minimize it. Combining the two factors together and further allowing slackness for the constraints, we finally get the following optimization problem:
min_{q, ξ ≥ 0}  KL(q(Y|X) || p(Y|X)) + C Σ_{l,g_l} ξ_{l,g_l}
s.t.  λ_l (1 − E_q[r_{l,g_l}(X, Y)]) ≤ ξ_{l,g_l},
      g_l = 1, ..., G_l,  l = 1, ..., L,
(3)
where ξ_{l,g_l} ≥ 0 is the slack variable for the respective logic constraint; and C is the regularization parameter. The problem can be seen as projecting p into the constrained subspace. The problem is convex and can be efficiently solved in its dual form with closed-form solutions. We provide the detailed derivation in the supplementary materials and directly give the solution here:
q(Y |X) / p(Y |X) exp
8
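The closed-form projection amounts to exponentiating the rule penalties and renormalizing; a toy sketch for one rule and one grounding per candidate label (names are illustrative, not from the paper):

```python
import numpy as np

def project_teacher(p, rule_values, C, lam):
    """q(y|x) proportional to p(y|x) * exp(-C * lam * (1 - r(x, y))).

    p:           (K,) student distribution over K candidate labels
    rule_values: (K,) soft truth value r(x, y) for each candidate label
    C:           regularization strength
    lam:         rule confidence lambda_l
    """
    logq = np.log(p) - C * lam * (1.0 - rule_values)
    q = np.exp(logq - logq.max())  # stabilize before normalizing
    return q / q.sum()
```

With C = 0 the constraint is switched off and q reduces to p; larger C pushes mass away from labels that violate the rule.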