Chapter 11: Structural SVM

SVM Book Reading Group, Chapter 11: Structural SVM. Waseda Univ., Hamada Lab., Taikai Takeda. Twitter: @bigsea_t

Upload: taikai-takeda

Post on 13-Apr-2017

TRANSCRIPT

  • SVM Book Reading Group, Chapter 11: Structural SVM

    Waseda Univ. Hamada Lab. Taikai Takeda

    Twitter: @bigsea_t

  • Structural SVM (SSVM)

    An extension of SVM to structured outputs

    Applications: NLP [Yue et al. 2007], bioinformatics [Yu et al. 2006]

    Training: cutting plane training

  • SSVM implementation [John Yu et al. 2009]: http://www.cs.cornell.edu/~cnyu/latentssvm/ (written in C). Cornell Univ., Prof. Thorsten Joachims: http://www.cs.cornell.edu/People/tj/

    "This is an implementation of latent structural SVM accompanying the ICML '09 paper "Learning Structural SVMs with Latent Variables". It was developed under Linux and compiles under gcc, built upon the SVM^light software by Thorsten Joachims. There are two versions available. The standalone version using the SVM^light QP solver is available below. Another version using the Mosek quadratic program solver is also available. It has been developed and tested for a longer period of time but requires the separate installation of the solver."

  • Formulate SSVM

    An ordinary SVM maps an input x to a scalar label y; in SSVM the output y is a structured object.

  • Formulate SSVM: Notations

    Decision function: $f(x, y) = w^\top \Psi(x, y)$

    $x \in \mathcal{X}$: space of input, $\Psi(x, y)$: feature vector, $w$: parameter vector

    Classifier: $h(x) = \arg\max_{y \in \mathcal{Y}} f(x, y)$

    $\mathcal{Y}$: space of (structural) output
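
As a small illustration of the decision function and the argmax classifier, here is a minimal sketch for the simplest structured output, a multiclass label. The block-wise `joint_feature` map and the toy numbers are assumptions made for this example, not something taken from the slides; for real structured outputs the enumeration over all `y` is replaced by a problem-specific search.

```python
def joint_feature(x, y, n_classes):
    """Hypothetical joint feature map Psi(x, y): copy x into the block of label y."""
    psi = [0.0] * (n_classes * len(x))
    start = y * len(x)
    psi[start:start + len(x)] = x
    return psi


def score(w, x, y, n_classes):
    """Decision function f(x, y) = w^T Psi(x, y)."""
    return sum(wi * pi for wi, pi in zip(w, joint_feature(x, y, n_classes)))


def predict(w, x, n_classes):
    """Classifier h(x) = argmax_y f(x, y), by brute-force enumeration of y."""
    return max(range(n_classes), key=lambda y: score(w, x, y, n_classes))


# Toy parameters: 3 classes, 2 input features, so w has 6 entries.
w = [1.0, 0.0, 0.0, 1.0, -1.0, -1.0]
x = [0.2, 0.9]
prediction = predict(w, x, n_classes=3)
```

Here the scores of the three labels are 0.2, 0.9 and -1.1, so the classifier picks label 1.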

  • Formulate SSVM: Hard-margin problem

    Constraint

    Max-Margin
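
For reference, the hard-margin problem can be written as follows in the form used by [Tsochantaridis et al. 2005]. The notation is an assumption: $w$ is the parameter vector, $\Psi$ the joint feature map, and $(x_i, y_i)$, $i = 1, \dots, n$, the training pairs. The constraint asks the correct output to outscore every other output by a margin of at least 1, and the objective maximizes that margin by minimizing $\|w\|$.

```latex
\min_{w} \ \frac{1}{2}\|w\|^2
\quad \text{s.t.} \quad
w^\top \Psi(x_i, y_i) - w^\top \Psi(x_i, y) \ \ge\ 1
\qquad \forall i \in \{1,\dots,n\},\ \forall y \in \mathcal{Y} \setminus \{y_i\}
```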

  • Formulate SSVM: Soft-margin problem
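
A sketch of the soft-margin (N-slack) problem, assuming the notation $w$, $\Psi$, $(x_i, y_i)$ and one slack variable $\xi_i$ per training example. The $C/n$ scaling of the slack term follows [Joachims et al. 2009] and is an assumption here; other presentations use $C\sum_i \xi_i$, which only rescales $C$.

```latex
\min_{w,\ \xi \ge 0} \ \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i
\quad \text{s.t.} \quad
w^\top\left[\Psi(x_i,y_i) - \Psi(x_i,y)\right] \ \ge\ 1 - \xi_i
\qquad \forall i,\ \forall y \in \mathcal{Y}\setminus\{y_i\}
```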

  • Formulate SSVM: Lagrange function

    Dual problem
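
A sketch of the standard derivation, assuming the soft-margin problem with objective $\frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_i \xi_i$ and constraints $w^\top \delta\Psi_i(y) \ge 1 - \xi_i$, $\xi_i \ge 0$. The shorthand $\delta\Psi_i(y)$ and the multipliers $\alpha_{iy}$, $\mu_i$ are notation introduced for this example.

```latex
\begin{aligned}
\delta\Psi_i(y) &:= \Psi(x_i, y_i) - \Psi(x_i, y) \\[4pt]
L(w,\xi,\alpha,\mu) &= \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i}\xi_i
  - \sum_{i}\sum_{y \ne y_i}\alpha_{iy}\left[w^\top \delta\Psi_i(y) - 1 + \xi_i\right]
  - \sum_{i}\mu_i \xi_i \\[4pt]
\frac{\partial L}{\partial w} = 0 \ &\Rightarrow\ w = \sum_{i}\sum_{y \ne y_i}\alpha_{iy}\,\delta\Psi_i(y),
\qquad
\frac{\partial L}{\partial \xi_i} = 0 \ \Rightarrow\ \sum_{y \ne y_i}\alpha_{iy} = \frac{C}{n} - \mu_i \\[4pt]
\max_{\alpha \ge 0}\ & \sum_{i}\sum_{y \ne y_i}\alpha_{iy}
  - \frac{1}{2}\sum_{i}\sum_{y \ne y_i}\sum_{j}\sum_{y' \ne y_j}
    \alpha_{iy}\,\alpha_{jy'}\,\delta\Psi_i(y)^\top \delta\Psi_j(y')
\quad \text{s.t.} \quad \sum_{y \ne y_i}\alpha_{iy} \le \frac{C}{n} \quad \forall i
\end{aligned}
```

The box constraint on $\alpha$ comes from $\mu_i \ge 0$, exactly as in the ordinary soft-margin SVM.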

  • Formulate SSVM: Kernel function
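
A sketch of how a kernel enters, assuming the dual variables are written $\alpha_{iy}$ and $w = \sum_{i}\sum_{y \ne y_i} \alpha_{iy}\,[\Psi(x_i,y_i) - \Psi(x_i,y)]$. Since $\Psi$ then appears only inside inner products, it can be replaced by a joint kernel $K$ over input-output pairs. The symbol $K$ is notation chosen for this example.

```latex
\begin{aligned}
K\big((x,y),(x',y')\big) &:= \Psi(x,y)^\top \Psi(x',y') \\[4pt]
\left[\Psi(x_i,y_i) - \Psi(x_i,y)\right]^\top \left[\Psi(x_j,y_j) - \Psi(x_j,y')\right]
 &= K\big((x_i,y_i),(x_j,y_j)\big) - K\big((x_i,y_i),(x_j,y')\big) \\
 &\quad - K\big((x_i,y),(x_j,y_j)\big) + K\big((x_i,y),(x_j,y')\big) \\[4pt]
f(x,y) = w^\top \Psi(x,y)
 &= \sum_{i}\sum_{y' \ne y_i}\alpha_{iy'}\left[K\big((x_i,y_i),(x,y)\big) - K\big((x_i,y'),(x,y)\big)\right]
\end{aligned}
```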

  • Optimize SSVM

    cutting plane training

    [Joachims et al. 2009]

  • 1-Slack Formulation

    1-slack OP

    N-slack OP (previous one)

    1-slack OP and N-slack OP are equivalent
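
For comparison, the 1-slack problem in the form given by [Joachims et al. 2009] (margin rescaling) is sketched below. The notation is an assumption: $\Delta(y_i, \bar{y}_i)$ is the loss of predicting $\bar{y}_i$ instead of $y_i$, and with the 0/1 loss it reduces to the constant margin of 1. There is a single slack variable $\xi$ shared by all constraints, and one constraint per joint labeling $(\bar{y}_1, \dots, \bar{y}_n)$ of the whole training set.

```latex
\min_{w,\ \xi \ge 0} \ \frac{1}{2}\|w\|^2 + C\,\xi
\quad \text{s.t.} \quad
\frac{1}{n}\, w^\top \sum_{i=1}^{n}\left[\Psi(x_i,y_i) - \Psi(x_i,\bar{y}_i)\right]
\ \ge\ \frac{1}{n}\sum_{i=1}^{n}\Delta(y_i,\bar{y}_i) - \xi
\qquad \forall (\bar{y}_1,\dots,\bar{y}_n) \in \mathcal{Y}^n
```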

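  • 1-Slack Formulation: Theorem 1

    Theorem 1. Any solution $w^*$ of the 1-slack OP is also a solution of the N-slack OP (and vice versa), with $\xi^* = \frac{1}{n}\sum_{i=1}^{n}\xi_i^*$. (Proved later.)

    Proof sketch.

    Optimal slack of the N-slack OP for a fixed $w$

    Optimal slack of the 1-slack OP for a fixed $w$

    Therefore, the objective functions are equal for any $w$.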
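
The key step of the proof sketch can be checked numerically. For a fixed $w$, the average of the per-example optimal slacks (N-slack) equals the optimal single slack (1-slack), because a maximum over joint labelings of a sum decomposes into a sum of per-example maxima. The sketch below uses a hypothetical violation table with made-up numbers; the entry for the true label is 0, which plays the role of the constraint $\xi_i \ge 0$.

```python
from itertools import product

# Hypothetical table for a fixed w:
# violation[i][y] = 1 - w^T [Psi(x_i, y_i) - Psi(x_i, y)] for y != y_i,
# and 0 for y == y_i (choosing the true label violates nothing).
violation = [
    [0.0, 0.7, -0.4],   # example 0, true label 0
    [1.3, 0.0, 0.2],    # example 1, true label 1
    [-0.5, -1.0, 0.0],  # example 2, true label 2
]


def n_slack_average(violation):
    """(1/n) * sum_i xi_i, where xi_i = max_y violation[i][y]."""
    return sum(max(row) for row in violation) / len(violation)


def one_slack(violation):
    """xi = max over all joint labelings of the averaged violation."""
    n = len(violation)
    return max(
        sum(violation[i][y] for i, y in enumerate(labeling)) / n
        for labeling in product(*(range(len(row)) for row in violation))
    )


lhs = n_slack_average(violation)
rhs = one_slack(violation)
```

Both quantities equal (0.7 + 1.3 + 0) / 3, so the two objective functions agree at this $w$. The brute-force enumeration of all joint labelings is only feasible for toy sizes and is used here purely as a check.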

  • 1-Slack Formulation: Dual problem

    Lagrange function

    Dual Problem

  • Cutting Plane Training M(J)

    [Joachims et al. 2009]

  • Cutting Plane Training: Algorithm
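
One step of the cutting plane algorithm of [Joachims et al. 2009] is the search for the most violated constraint of the 1-slack problem, followed by the test that decides whether that constraint is added to the working set. The sketch below covers only this step, for a multiclass toy problem with 0/1 loss. The block-wise `joint_feature` map, the function names, and the tolerance `eps` are assumptions for illustration; the QP solve over the working set is deliberately left out.

```python
def joint_feature(x, y, n_classes):
    """Hypothetical joint feature map Psi(x, y): copy x into the block of label y."""
    psi = [0.0] * (n_classes * len(x))
    start = y * len(x)
    psi[start:start + len(x)] = x
    return psi


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def most_violated_constraint(w, data, n_classes):
    """For each example pick y_hat maximizing loss - w^T [Psi(x, y_true) - Psi(x, y)].

    Returns the joint labeling and its averaged violation.
    """
    labeling, total = [], 0.0
    for x, y_true in data:
        psi_true = joint_feature(x, y_true, n_classes)

        def violation(y):
            loss = 0.0 if y == y_true else 1.0  # 0/1 loss
            psi_y = joint_feature(x, y, n_classes)
            diff = [a - b for a, b in zip(psi_true, psi_y)]
            return loss - dot(w, diff)

        y_hat = max(range(n_classes), key=violation)
        labeling.append(y_hat)
        total += violation(y_hat)
    return labeling, total / len(data)


def needs_new_cut(w, xi, data, n_classes, eps=1e-3):
    """True if the most violated constraint exceeds the current slack xi by more than eps."""
    labeling, value = most_violated_constraint(w, data, n_classes)
    return value > xi + eps, labeling


data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
start = needs_new_cut([0.0, 0.0, 0.0, 0.0], 0.0, data, n_classes=2)
trained = needs_new_cut([2.0, 0.0, 0.0, 2.0], 0.0, data, n_classes=2)
```

With `w = 0` every wrong label violates its constraint by 1, so a cut is added; with the second `w` both examples are separated with margin 2 and the loop would stop.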

  • Cutting Plane Training

    ; = ;M = 0

  • Loss functions

    SSVM hinge loss

    For example, in natural language parsing, a parse tree that is almost correct and differs from the correct parse in only one or a few nodes should be treated differently from a parse tree that is completely different. [Tsochantaridis et al. 2005]

    Margin rescaling and slack rescaling incorporate a loss function into the constraints

    [Tsochantaridis et al. 2005]

  • Loss functions: n-slack formulation

    Margin rescaling

    Slack rescaling
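
For reference, the two ways of putting a loss $\Delta(y_i, y)$ into the n-slack constraints can be written as follows, in the form given by [Tsochantaridis et al. 2005]. The notation $w$, $\Psi$, $\xi_i$ is assumed; both sets of constraints hold for all $i$ and all $y \in \mathcal{Y}\setminus\{y_i\}$, with $\xi_i \ge 0$.

```latex
\begin{aligned}
\text{Margin rescaling:}\quad &
w^\top\left[\Psi(x_i,y_i) - \Psi(x_i,y)\right] \ \ge\ \Delta(y_i,y) - \xi_i \\[4pt]
\text{Slack rescaling:}\quad &
w^\top\left[\Psi(x_i,y_i) - \Psi(x_i,y)\right] \ \ge\ 1 - \frac{\xi_i}{\Delta(y_i,y)}
\end{aligned}
```

Margin rescaling demands a larger margin for outputs with a larger loss, while slack rescaling keeps the margin at 1 and penalizes violations in proportion to the loss.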

  • Application: learning to rank

    IR (Information Retrieval): given a query, produce a ranking of documents

    relevant documents

  • Application: learning to rank: Evaluation measure

    Average Precision (AP)

    Loss Function
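
A minimal sketch of Average Precision for a single query, together with the loss $\Delta = 1 - \mathrm{AP}$ used when optimizing AP with an SSVM [Yue et al. 2007]. The input format (a list of 0/1 relevance flags in ranked order) and the function names are assumptions for this example, as is the convention of returning 0 when no document is relevant.

```python
def average_precision(ranked_relevance):
    """AP = mean of precision@k over the ranks k that hold a relevant document."""
    n_relevant = sum(ranked_relevance)
    if n_relevant == 0:
        return 0.0  # assumed convention for queries without relevant documents
    hits, precision_sum = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision@rank
    return precision_sum / n_relevant


def ap_loss(ranked_relevance):
    """Loss of a ranking: 1 - AP, so a perfect ranking has loss 0."""
    return 1.0 - average_precision(ranked_relevance)


ap = average_precision([1, 0, 1, 0])
```

For the ranking `[1, 0, 1, 0]` the relevant documents sit at ranks 1 and 3, giving precisions 1 and 2/3 and an AP of 5/6.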

  • Application: learning to rank: Notations

    = Q, , |T|: :

    :

    ;< = _1 ; >