Improved Semidefinite Programming Hierarchy for hSep
with tools from Algebraic Geometry
Aram W. Harrow 1 Anand Natarajan 1 Xiaodi Wu 2
1MIT
2University of Oregon
QMA(2) Workshop, QuICS, Aug 3rd 2016
arXiv:1506.08834
The problem hSep
Definition (Separable states): A bipartite state ρ ∈ D(X ⊗ Y) is separable if
ρ ∈ conv{σ_X ⊗ σ_Y : σ_X ∈ D(X), σ_Y ∈ D(Y)}.
Let Sep := {separable states}.
Definition (hSep): Given a Hermitian operator M over X ⊗ Y,
hSep(M) := max_{ρ ∈ Sep} Tr[Mρ].
The problem WOpt(M, ε) is to compute an ε-additive approximation of hSep(M).
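As a small illustration of the definition (a sketch, not part of the talk): Tr[Mρ] is linear in ρ, so the maximum over Sep is attained at a pure product state, and evaluating ⟨x⊗y|M|x⊗y⟩ on any unit vectors x, y gives a lower bound on hSep(M). The helper names `kron_vec` and `product_value` are hypothetical, and the sketch assumes real vectors and a real matrix M. The two-qubit swap operator serves as a test case, since ⟨x⊗y|SWAP|x⊗y⟩ = ⟨x|y⟩².

```python
def kron_vec(x, y):
    """Coordinates of the product vector x ⊗ y (row-major ordering)."""
    return [a * b for a in x for b in y]


def product_value(M, x, y):
    """Return <x⊗y| M |x⊗y> for real vectors x, y and a real matrix M."""
    v = kron_vec(x, y)
    Mv = [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(v))]
    return sum(v[r] * Mv[r] for r in range(len(v)))


# Swap operator on two qubits in the basis |00>, |01>, |10>, |11>.
SWAP = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]

same = product_value(SWAP, [1, 0], [1, 0])        # x = y
orthogonal = product_value(SWAP, [1, 0], [0, 1])  # x orthogonal to y
```

The value at x = y is 1, which matches λmax(SWAP) = 1, so for this particular M the product-state lower bound is already tight.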
Entanglement Testing
Definition (Weak Membership): WMem(ε, ‖·‖): for any ρ ∈ D(X ⊗ Y), decide whether ρ ∈ Sep or ‖ρ − Sep‖ ≥ ε.
- By the “GLS theorems” in convex optimization, one can solve WMem(ε, ‖·‖1) efficiently given an oracle for WOpt(M, ε), using the ellipsoid method.
- Any M with hSep(M) < λmax(M) is an entanglement witness.
Applications
- Finding the maximum acceptance probability of a QMA(2) machine ⇔ solving WMem(ε, ‖·‖1) for M arising from a poly-time quantum circuit.
- Many more (mean-field approximations, output entropies of channels, etc.)
Algorithms and Hardness
- Complexity is a function of dimension n and accuracy ε.
- Algorithms based on Sum-of-Squares [DPS] and ε-nets [SW, BH, ...].
- When ε = Ω(1):
  - Need time n^{Ω(log n)} assuming the Exponential-Time Hypothesis [HM]. (Unconditionally for the SoS hierarchy [HNW].)
  - There exists an n^{O(log(n)/ε^2)} algorithm for 1-LOCC M [BCY, BH].
- When ε = 1/poly(n):
  - WOpt(·, ε) is NP-hard. [Gur, Ioa, Gha]
  - There exist (n/√ε)^{O(n)} algorithms based on SDPs. [NOP]
- What about scaling with ε?
Why care about ε scaling?
- Surprising jumps in complexity in the high-accuracy regime:
  - QIP jumps from PSPACE to EXP with inverse doubly exponential ε. [IKW]
  - QMA equals PSPACE for inverse-exponential ε. [FL]
  - QMA(2) equals NEXP for inverse-exponential ε. [Per]
- Better understand “brute-force algorithms” for quantum problems
  - Ex: no known algorithm to approximate the entangled value of a non-local game to a constant factor.
- Could lead to algorithms that perform better in practice
Results
Theorem (Main): There exists an algorithm that estimates hSep(n)(M) to error ε in time exp(poly(n)) poly log(1/ε). A similar result holds for the multipartite case.
- One algorithm is based on quantifier elimination; it is less conceptually interesting, so it is skipped in this talk.
- The other algorithm is based on the SoS hierarchy with added constraints, inspired by [Nie, NR].
- The algorithm is exact: the only approximation comes from numerical error in the SDP solver.
- As a complexity result:
QMA_log(2)[c, s = c − 1/exp(exp(n))] ⊆ DTIME(exp(poly(n))).
Comparison with DPS
Advantages
- Primal of SDP leads to new monogamy relations relative to particular observables.
- Dual of SDP leads to a new class of entanglement witnesses.
- Analog of the exact convergence achievable for discrete optimization, e.g., SoS for Boolean CSPs. (Note: DPS does not converge at any finite level.)
Disadvantages
- Unlike [DPS], the set of entanglement witnesses could be non-convex.
- Hence, cannot solve WMem directly (need to use the GLS reduction).
Proof Outline
- Treat WOpt as a constrained polynomial optimization problem.
- Use Karush-Kuhn-Tucker conditions to restrict to the critical set (technique from [Nie, NR]).
- Claim 1: For generic M, the critical set consists of exp(n)-many discrete points. (From Bertini’s theorem.)
- Algorithm: use SoS to approximately optimize over critical points by searching for certificates of positivity.
- Claim 2: For generic M, the optimum has an SoS certificate of degree exp(n). (From Claim 1 and Gröbner basis theory.)
- (Continuity argument to handle non-generic M.)
Simplification: the symmetric real case
By simple reductions [HM], we can restrict to states of the form |ψ⟩ ⊗ |ψ⟩, |ψ⟩ ∈ R^n.
\[
h_{\mathrm{ProdSym}(n)}(M) := \max_{x \in \mathbb{R}^n} f_0(x) := \sum_{i_1,i_2,j_1,j_2} M_{(i_1,i_2),(j_1,j_2)} \, x_{i_1} x_{i_2} x_{j_1} x_{j_2}
\quad \text{subject to } f_1(x) := \|x\|^2 - 1 = 0. \tag{1}
\]
Remark: this is a polynomial optimization problem with a homogeneous degree-4 objective and a degree-2 constraint.
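To make the index convention in (1) concrete, here is a minimal sketch that evaluates the objective and the constraint directly from their definitions. It assumes M is stored as an n²×n² nested list with the pair (i1, i2) flattened to row i1·n + i2; the function names and the test instance M = A ⊗ A are hypothetical choices for illustration, not taken from the talk.

```python
def f0(M, x):
    """f0(x) = sum over i1,i2,j1,j2 of M[(i1,i2),(j1,j2)] x_i1 x_i2 x_j1 x_j2."""
    n = len(x)
    total = 0.0
    for i1 in range(n):
        for i2 in range(n):
            for j1 in range(n):
                for j2 in range(n):
                    total += M[i1 * n + i2][j1 * n + j2] * x[i1] * x[i2] * x[j1] * x[j2]
    return total


def f1(x):
    """Sphere constraint f1(x) = ||x||^2 - 1."""
    return sum(t * t for t in x) - 1.0


# Hypothetical test instance: M = A ⊗ A with A = diag(3, 1),
# for which f0(x) = (x^T A x)^2.
M = [
    [9, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 3, 0],
    [0, 0, 0, 1],
]
x = [0.6, 0.8]  # a feasible point: ||x|| = 1
```

For this instance x^T A x = 3·0.36 + 0.64 = 1.72, so `f0(M, x)` should agree with 1.72² up to rounding.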
Principle of Sum-of-Squares
One way to show that a polynomial f(x) is nonnegative is to write it as
\[ f(x) = \sum_i a_i(x)^2 \ge 0. \]
Example
\[ f(x) = 2x^2 - 6x + 5 = (x^2 - 2x + 1) + (x^2 - 4x + 4) = (x-1)^2 + (x-2)^2 \ge 0. \]
Such a decomposition is called a sum of squares (SoS) certificate for the non-negativity of f.
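The example certificate can be checked mechanically by expanding the squares and comparing coefficients. The sketch below does this with plain coefficient lists; `poly_mul` and `poly_add` are hypothetical helpers written for this illustration.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, ...] (c_k multiplies x^k)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two coefficient lists, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


a1 = [-1, 1]  # x - 1
a2 = [-2, 1]  # x - 2
sos = poly_add(poly_mul(a1, a1), poly_mul(a2, a2))
f = [5, -6, 2]  # 2x^2 - 6x + 5
```

Comparing `sos` with `f` coefficient by coefficient confirms the identity.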
Principle of SoS : constrained domain
Definition (Variety): A set V ⊆ C^n is called an algebraic variety if V = {x ∈ C^n : g1(x) = · · · = gk(x) = 0}.
Non-negativity of f(x) on V can be shown by
\[ f(x) = \sum_i a_i(x)^2 + \sum_j b_j(x) g_j(x) \ge 0. \]
Question: do all nonnegative polynomials on a given variety have an SoS certificate?
Putinar’s Positivstellensatz
Definition (Ideal): The polynomial ideal I generated by g1, . . . , gk ∈ C[x1, . . . , xn] is
\[ I = \Big\{ \sum_i a_i g_i : a_i \in \mathbb{C}[x_1, \ldots, x_n] \Big\} =: \langle g_1, \ldots, g_k \rangle. \]
Theorem (Putinar’s Positivstellensatz): Under the Archimedean condition, if f(x) > 0 on V(I) ∩ R^n, then
f(x) = σ(x) + g(x),
for some choice of SoS polynomial σ(x) and g(x) ∈ I.
SoS in Optimization
\[ \max f(x) \quad \text{subject to } g_i(x) = 0 \ \ \forall i \tag{2} \]
is equivalent to (under the Archimedean condition)
\[ \min \nu \quad \text{such that } \nu - f(x) = \sigma(x) + \sum_i b_i(x) g_i(x), \tag{3} \]
where σ(x) is SoS and each bi(x) is any polynomial.
SoS hierarchy
- If σ(x) and bi(x) can have arbitrarily high degrees, then the optimization problem (3) is equivalent to problem (2).
- By bounding the degrees, i.e., deg(σ(x)), deg(bi(x)gi(x)) ≤ 2D for some integer D, we get the SoS hierarchy at level D:
\[ \min \nu \quad \text{such that } \nu - f(x) = \sigma(x) + \sum_i b_i(x) g_i(x), \tag{4} \]
where σ(x) is SoS, each bi(x) is any polynomial, and deg(σ(x)), deg(bi(x)gi(x)) ≤ 2D.
The DPS algorithm is SoS applied to hSep.
Why is it an SDP?
Observation
- Any p(x) of degree 2D can be written as p(x) = m^T Q m, where m is the vector of monomials of degree up to D and Q is a symmetric matrix of coefficients.
- p(x) is SoS iff it admits such a representation with Q ⪰ 0.
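A worked instance of this observation, as a sketch under stated assumptions: for f(x) = 2x² − 6x + 5 and m = (1, x), one valid Gram matrix is Q = [[5, −3], [−3, 2]]. If Q is positive definite, a Cholesky factor Q = L Lᵀ turns mᵀQm into an explicit sum of squares, one square per column of L. The helpers `cholesky` and `sos_value` are hypothetical names, and the factorization routine only handles the positive definite case.

```python
import math


def cholesky(Q):
    """Lower-triangular L with Q = L L^T; math.sqrt raises ValueError if Q is not positive definite."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(Q[i][i] - s)
            else:
                L[i][j] = (Q[i][j] - s) / L[j][j]
    return L


def sos_value(L, x):
    """Evaluate sum_k a_k(x)^2, where a_k(x) = sum_i L[i][k] * x^i (column k of L)."""
    m = [x ** i for i in range(len(L))]  # monomial vector (1, x, ...)
    return sum(sum(L[i][k] * m[i] for i in range(len(L))) ** 2 for k in range(len(L)))


# Gram matrix of f(x) = 2x^2 - 6x + 5 in the monomial basis m = (1, x):
# m^T Q m = 5 - 3x - 3x + 2x^2.
Q = [[5.0, -3.0], [-3.0, 2.0]]
L = cholesky(Q)
```

The squares obtained this way differ from (x − 1)² + (x − 2)², which is expected: SoS decompositions are generally not unique, and searching over all admissible Q is what makes the problem a semidefinite program.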
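\[ \min_{\nu,\, b_{i\alpha} \in \mathbb{R}} \nu \quad \text{such that } \nu A_0 - F - \sum_{i\alpha} b_{i\alpha} G_{i\alpha} \succeq 0. \tag{5} \]
Complexity: poly(m) poly log(1/ε), where
\[ m = \binom{n+D}{D}. \]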
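To get a feel for how the SDP size m grows with the level D, the following sketch tabulates the binomial coefficient for a fixed number of variables. The function name `gram_size` and the choice n = 4 are illustrative assumptions.

```python
from math import comb


def gram_size(n, D):
    """Number of monomials of degree at most D in n variables: C(n + D, D)."""
    return comb(n + D, D)


# Size of the monomial basis for n = 4 variables at a few levels D.
sizes = {D: gram_size(4, D) for D in (1, 2, 4, 8)}
# sizes == {1: 5, 2: 15, 4: 70, 8: 495}
```

For fixed n the size grows polynomially in D, which is why a level D = exp(poly(n)) still yields an SDP of size exp(poly(n)).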
Karush-Kuhn-Tucker Conditions
For any optimization problem
\[ \max f(x) \quad \text{s.t. } g_i(x) \le 0,\ h_j(x) = 0,\ \forall i, j, \]
if x* is a local optimizer, then there exist μi, λj such that
\[ \nabla f(x^*) = \sum_i \mu_i \nabla g_i(x^*) + \sum_j \lambda_j \nabla h_j(x^*), \]
\[ g_i(x^*) \le 0,\quad h_j(x^*) = 0,\quad \mu_i \ge 0,\quad \mu_i g_i(x^*) = 0. \]
Remark: for convex optimization (our case), any global optimizer satisfies KKT.
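Our case
Recall our optimization problem is
max f0(x) s.t. f1(x) = 0.
The KKT condition is ∇f0(x) = λ∇f1(x), which is equivalent to
\[
\operatorname{rank}
\begin{pmatrix}
\frac{\partial f_0(x)}{\partial x_1} & \frac{\partial f_1(x)}{\partial x_1} \\
\vdots & \vdots \\
\frac{\partial f_0(x)}{\partial x_{2n}} & \frac{\partial f_1(x)}{\partial x_{2n}}
\end{pmatrix} < 2,
\]
i.e., all 2×2 minors vanish:
\[ g_{ij}(x) := \frac{\partial f_0(x)}{\partial x_i}\frac{\partial f_1(x)}{\partial x_j} - \frac{\partial f_0(x)}{\partial x_j}\frac{\partial f_1(x)}{\partial x_i} = 0, \quad \forall i, j. \]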
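The polynomials gij can be evaluated numerically to check whether a point is a KKT point. The sketch below assumes the real symmetric setting x ∈ R^n of problem (1), stores M as an n²×n² nested list, and uses ∇f1(x) = 2x; the names `grad_f0` and `kkt_residuals` and the test instance M = A ⊗ A with A = diag(3, 1) are hypothetical.

```python
def grad_f0(M, x):
    """Gradient of f0(x) = (x⊗x)^T M (x⊗x), via the chain rule through v = x⊗x."""
    n = len(x)
    N = n * n
    v = [a * b for a in x for b in x]
    # w = (M + M^T) v is the gradient of v^T M v with respect to v.
    w = [sum((M[r][c] + M[c][r]) * v[c] for c in range(N)) for r in range(N)]
    # d(x_a x_b)/dx_k contributes w[k,b] x_b + w[a,k] x_a.
    return [sum((w[k * n + b] + w[b * n + k]) * x[b] for b in range(n)) for k in range(n)]


def kkt_residuals(M, x):
    """Matrix of g_ij(x) = d_i f0 * d_j f1 - d_j f0 * d_i f1, with grad f1 = 2x."""
    g0 = grad_f0(M, x)
    g1 = [2.0 * t for t in x]
    n = len(x)
    return [[g0[i] * g1[j] - g0[j] * g1[i] for j in range(n)] for i in range(n)]


# Hypothetical instance: M = A ⊗ A with A = diag(3, 1), so f0(x) = (x^T A x)^2.
M = [
    [9, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 3, 0],
    [0, 0, 0, 1],
]
at_critical = kkt_residuals(M, [1.0, 0.0])   # eigenvector of A: a KKT point
at_generic = kkt_residuals(M, [0.6, 0.8])    # feasible but not critical
```

At the eigenvector (1, 0) the gradients are parallel and every gij vanishes, while at (0.6, 0.8) the minor g01 is nonzero, so that point is excluded by the KKT constraints.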
DPS+: Optimization with KKT constraints
min ν
such that ν − f0(x) ≥ 0,
f1(x) = 0,
KKT: gij(x) = 0 ∀ 1 ≤ i ≠ j ≤ 2n.
- DPS+ hierarchy: SoS applied to the above optimization problem.
- Will show finite convergence when D = exp(poly(n)). Then
\[ m = \binom{n+D}{D} = \exp(\mathrm{poly}(n)). \]
Thus the final time is exp(poly(n)) poly log(1/ε).
KKT Ideal
Definition (KKT Ideal & Variety)
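\[ I_K = \Big\{ v(x) f_1(x) + \sum_{i,j} h_{ij}(x) g_{ij}(x) \Big\} = \langle f_1(x), g_{ij}(x) \rangle, \]
\[ V(I_K) = \{ x \in \mathbb{C}^{2n} : \forall p(x) \in I_K,\ p(x) = 0 \}. \]
Definition (KKT Ideal to degree m)
\[ I_K^m = \Big\{ v(x) f_1(x) + \sum_{i,j} h_{ij}(x) g_{ij}(x) : \deg(v(x) f_1(x)) \le m,\ \forall i, j,\ \deg(h_{ij} g_{ij}) \le m \Big\}. \]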
Main Technical Theorems
Theorem (Zero-dimensionality of generic I_K): For a generic M, |V(I_K)| < ∞ and I_K is zero-dimensional.
Theorem (Degree bound): There exists m = O(exp(poly(n))) such that for a generic M and any ε > 0,
ν − f0(x) + ε = σ(x) + g(x),
where σ(x) is SoS, deg(σ(x)) ≤ m, and g(x) ∈ I_K^m.
Corollary (SDP solution): For a generic M, hProdSym(n)(M) can be estimated to error ε in time exp(poly(n)) poly log(1/ε).
From generic to arbitrary M
Observations
- Generic Ms are dense, and the value of the SDP should be a continuous function of M.
- Issue: the SDP might be infeasible for non-generic M.
Solution
- Switch to the dual SDP: it satisfies Slater’s condition, i.e., it is strictly feasible.
- For a generic M, by strong duality, hProdSym(n)(M) = OPTmom(M).
- For non-generic inputs M, use the continuity of the dual SDP.
Proof of Theorem 1
Let U = {f1(x) = 0} and W = {∀ i, j: gij(x) = 0}. Then V(I_K) ⊆ U ∩ W.
It suffices to show |U ∩ W| < ∞. Construct A = X ∩ U such that A ∩ W = ∅ and dim(X) = n − 1. Note W ∩ A = (W ∩ U) ∩ X.
By Bézout’s theorem, two varieties whose dimensions sum to ≥ n must intersect. Since (W ∩ U) ∩ X = ∅, we get
dim(W ∩ U) + dim(X) = dim(W ∩ U) + n − 1 < n.
This implies dim(W ∩ U) = 0 and thus |V(I_K)| < ∞.
Proof of Theorem 1: construct X
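Let X = {f0(x) = μ} for generic (μ, M). Then dim(X) = n − 1.
By Bertini’s theorem, dim(A) = dim(U ∩ X) = n − 2.
The Jacobian matrix
\[
J_A =
\begin{pmatrix}
\frac{\partial f_0}{\partial x_1} & \frac{\partial f_1}{\partial x_1} \\
\vdots & \vdots \\
\frac{\partial f_0}{\partial x_n} & \frac{\partial f_1}{\partial x_n}
\end{pmatrix}
\]
has rank(J_A) = 2.
W by definition says rank(J_A) = 1. Thus no intersection!
Subtleties: the varieties live in projective space, so we need to consider points at infinity.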
Proof of Theorem 2
- Goal: given an SoS certificate
ν − f0(x) = σ(x) + g(x), σ(x) ∈ SoS, g(x) ∈ I_K^m,
upper bound deg(σ(x)), deg(g(x)).
- Idea: start from any SoS certificate and lower the degree of σ(x) using a Gröbner basis.
- Then, show that the remaining high-degree terms in g(x) vanish.
- Gröbner basis: a set of polynomials generating I_K such that the leading term of any polynomial in I_K is divisible by the leading term of a basis element.
- [MR]:
|V(I_K)| < ∞ ⟹ max deg γ_i ≤ D = exp(poly(n)).
Proof of Theorem 2: SoS term
- Let {γ_i} be a Gröbner basis for I_K.
- Take any SoS certificate:
ν − f0(x) = σ(x) + g(x), where σ(x) is SoS and g(x) ∈ I_K^m.
- Let σ(x) = ∑_a s_a(x)^2. By properties of Gröbner bases,
s_a(x) = g_a(x) + u_a(x), where g_a(x) ∈ I_K and deg(u_a(x)) ≤ nD.
- Thus
ν − f0(x) = σ′(x) + g′(x), deg(σ′(x)) ≤ exp(poly(n)), g′ ∈ I_K.
Proof of Theorem 2: Ideal term
It suffices to show g′ ∈ I_K^m with m = exp(poly(n)).
- Since f0(x) and σ′(x) have degree at most m, deg(g′(x)) ≤ m. It remains to show that g′(x) can be obtained as a sum of degree-m multiples of the original generators gij.
- First step: decompose g′(x) in the Gröbner basis: g′(x) = ∑_k t_k γ_k(x) with deg(t_k γ_k(x)) ≤ m.
- Second step: decompose the Gröbner basis in terms of the original generators: γ_k(x) = ∑_{i,j} u_ij(x) g_ij(x) with deg(u_ij) ≤ m.
- Thus
g′(x) = ∑ t_k u_ij g_ij(x), deg(t_k u_ij) ≤ m ⟹ g′(x) ∈ I_K^m.
Numerical Results
DPS+ beats DPS at very high levels, but what about low levelsof the hierarchy?
- DPS+ beats DPS at level 2, n = 3 for a family of measurements introduced by [DPS]:
Mγ = Id − (Aγ ⊗ Id) Z (Aγ ⊗ Id),
Aγ = diag(1, 1/γ, . . . , 1/γ).
- For all γ ∈ [0,1], hSep(Mγ) = 1.
Open Questions
DPS+
- Analyze the low levels of DPS+.
- Advantages of adding KKT conditions other than those presented here.
Nonlocal games
- What is the correct version of the KKT conditions for non-commutative polynomials?
- Can we have finite convergence for the commuting-operator game value? This would imply an upper bound on the commuting-operator version of MIP* (none known so far).
SoS hierarchy
- Any other applications to quantum information?
Thank you!
Q & A