Statistical Performance of Convex Tensor Decomposition
Ryota Tomioka
2012/01/26 @ Kyoto University
Perspectives in Informatics 4B
Collaborators: Taiji Suzuki, Kohei Hayashi, Hisashi Kashima
Slides available: http://www.ibis.t.u-tokyo.ac.jp/ryotat/tensor12kyoto.pdf
Netflix challenge (2006-2009)
• $1,000,000 prize
• Goal: improve the performance of a video recommendation system (predict who likes which movies)
• Example: a user likes “Star Wars” and “E.T.” but doesn’t like “Minority Report”. Does he like “Blade Runner”?
Matrix completion view
+1 +1 ‐1 ? ?
+1 ? ? +1 ?
? +1 ‐1 ? +1
+1 ? ? ? +1
. . . . .
. . . . .
User A User B User C User D
Goal: fill the missing entries!
Matrix completion
• Impossible without an assumption: the missing entries can be arbitrary, so the problem is ill-posed.
• Most common assumption: low-rank decomposition
$Y \approx U V^\top$, where $Y$ is the Users × Movies matrix, $U$ holds the users’ features, and $V$ holds the movies’ features.
Matrix completion
• Most common assumption: low-rank decomposition (rank $r$)
$Y \approx U V^\top$ (Users × Movies), with users’ features $u_i$ and movies’ features $v_j$
$y_{ij} = u_i^\top v_j$ (dot product in $r$-dimensional space)
Geometric Intuition
$r$-dimensional space ($r$: the rank of the decomposition)
[Figure labels: $u_a$; movies he likes; movies he doesn’t like; $U$; $V^\top$]
Geometric Intuition
$r$-dimensional space ($r$: the rank of the decomposition)
[Figure labels: $u_a$; $u_b$; $U$; $V^\top$]
Tensor data completion
• Tensor = multi-dimensional array
• Beyond 2D
Example: movie preference (figure axes: Users, Movies) + time / context / action
Tensor data completion
• Tensor = multi-dimensional array
• Beyond 2D
Example: climate monitoring (figure axes: Sensors, Location): temperature, humidity, rainfall
Tensor data completion
• Tensor = multi-dimensional array
• Beyond 2D
Example: neuroscience (brain imaging) (figure axes: Sensors, Time)
Rest of this talk
• Computing low-rank matrix decomposition
• Generalizing from matrix to tensor
• Analyzing the performance
– Statistical learning theory
Computing low-rank matrix decomposition
Computing low-rank decomposition
• If all entries are observed (no missing entries)
– Given $Y$, compute the singular value decomposition (SVD)
$Y \approx U \Sigma V^\top$, with $Y \in \mathbb{R}^{m \times n}$, $U \in \mathbb{R}^{m \times r}$, $\Sigma \in \mathbb{R}^{r \times r}$, $V \in \mathbb{R}^{n \times r}$
where $U$, $V$ are orthogonal ($U^\top U = I$, $V^\top V = I$) and
$\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$, $\sigma_j$: $j$th largest singular value
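The fully observed case can be sketched in a few lines of NumPy. This is only an illustration: the helper name `truncated_svd`, the matrix sizes, and the random test matrix are assumptions, not part of the talk.

```python
import numpy as np

def truncated_svd(Y, r):
    """Keep the r largest singular values/vectors of a fully observed matrix Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)  # s is sorted in descending order
    return U[:, :r], s[:r], Vt[:r, :]

rng = np.random.default_rng(0)
m, n, r = 8, 6, 2
# Hypothetical data: an exactly rank-r matrix built from random factors.
Y = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

U, s, Vt = truncated_svd(Y, r)
Y_hat = U @ np.diag(s) @ Vt                  # U * Sigma * V^T
rel_err = np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y)
```

Because `Y` here is exactly rank `r`, the rank-`r` reconstruction matches it up to rounding error.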
Tolerating missing entries
Optimization problem
$\underset{U, V}{\text{minimize}} \; \sum_{(i,j) \in \Omega} (y_{ij} - u_i^\top v_j)^2$
$U \in \mathbb{R}^{m \times r}$: users’ features, $V \in \mathbb{R}^{n \times r}$: movies’ features
$\Omega$: set of observed index pairs
Tolerating missing entries
Optimization problem
$\underset{U, V}{\text{minimize}} \; \sum_{(i,j) \in \Omega} (y_{ij} - u_i^\top v_j)^2$
$U \in \mathbb{R}^{m \times r}$: users’ features, $V \in \mathbb{R}^{n \times r}$: movies’ features
Non-convex!
[Figure: the $(u, v)$ plane with the points $(-1,-1)$ and $(1,1)$ marked]
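A quick numerical check of the non-convexity, under a simplifying assumption that is not spelled out in the slide: a single observed entry $y = 1$ and scalar factors $u$, $v$ (rank 1). Both $(1,1)$ and $(-1,-1)$ then fit the entry perfectly, while their midpoint does not.

```python
def f(u, v, y=1.0):
    """Squared error for one observed entry y with scalar factors u and v."""
    return (y - u * v) ** 2

a = (1.0, 1.0)
b = (-1.0, -1.0)
mid = tuple((p + q) / 2 for p, q in zip(a, b))   # midpoint of a and b

f_a, f_b, f_mid = f(*a), f(*b), f(*mid)
# A convex function would satisfy f(mid) <= (f(a) + f(b)) / 2.
violates_convexity = f_mid > 0.5 * (f_a + f_b)
```

Here `f_a` and `f_b` are both zero while `f_mid` is one, so the midpoint inequality required by convexity fails.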
Tolerating missing entries
Optimization problem
$\underset{W}{\text{minimize}} \; \sum_{(i,j) \in \Omega} (y_{ij} - w_{ij})^2$, subject to $\mathrm{rank}(W) \le r$
Still non-convex! The rank constraint is NP hard.
Convex relaxation of rank
Schatten $p$-norm (to the $p$th power)
$\|W\|_{S_p}^p := \sum_{j=1}^{r} \sigma_j^p(W)$, where $\sigma_j(W)$ is the $j$th largest singular value
$\|W\|_{S_p}^p \xrightarrow{p \to 0} \mathrm{rank}(W)$
[Figure: penalty curves for several values of $p$]
$p = 1$ is the tightest convex relaxation (also known as trace norm / nuclear norm)
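The limit $p \to 0$ can be checked numerically. The sketch below is illustrative only: `schatten_p_power` and the cutoff `tol` (used to ignore numerically zero singular values) are assumptions introduced here.

```python
import numpy as np

def schatten_p_power(W, p, tol=1e-10):
    """Sum of the p-th powers of the singular values of W that exceed tol."""
    s = np.linalg.svd(W, compute_uv=False)
    s = s[s > tol]                     # drop numerically zero singular values
    return float(np.sum(s ** p))

rng = np.random.default_rng(0)
# Hypothetical test matrix of rank 2.
W = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
rank_W = int(np.linalg.matrix_rank(W))

nuclear = schatten_p_power(W, 1.0)     # p = 1: trace / nuclear norm
near_rank = schatten_p_power(W, 1e-6)  # p close to 0: approaches rank(W)
```

With `p = 1` the function agrees with NumPy's nuclear norm, and for very small `p` each nonzero singular value contributes almost exactly one.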
Tolerating missing entries
Optimization problem (convex relaxation)
$\underset{W}{\text{minimize}} \; \sum_{(i,j) \in \Omega} (y_{ij} - w_{ij})^2$, subject to $\|W\|_{S_1} \le \tau$
Schatten 1-norm (nuclear norm, trace norm): $\|W\|_{S_1} = \sum_{j=1}^{r} \sigma_j(W)$, where $\sigma_j(W)$ is the $j$th largest singular value
Cf. Lasso ($L_1$ norm) for variable selection = linear sum of absolute coefficients
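One common way to attack this kind of problem is proximal gradient descent on the penalized form (a Schatten 1-norm penalty with weight `lam`, instead of the constrained form shown in the slide). The sketch below is that swapped-in variant, not necessarily the solver used by the author; `svt`, `complete`, `lam`, and the synthetic data are all assumptions.

```python
import numpy as np

def svt(W, lam):
    """Singular value soft-thresholding: shrink every singular value by lam."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - lam, 0.0)) @ Vt

def complete(Y, mask, lam, n_iter=200):
    """Proximal gradient on 0.5 * sum_observed (y - w)^2 + lam * ||W||_S1."""
    W = np.zeros_like(Y)
    for _ in range(n_iter):
        grad = mask * (W - Y)      # gradient of the data-fit term (observed entries only)
        W = svt(W - grad, lam)     # unit step size, then the prox of the penalty
    return W

rng = np.random.default_rng(0)
Y_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))  # rank 2
mask = rng.random(Y_true.shape) < 0.6                                 # observed entries
W_hat = complete(Y_true * mask, mask, lam=0.1)
rel_err = np.linalg.norm(W_hat - Y_true) / np.linalg.norm(Y_true)
```

The unit step size is valid because the data-fit gradient is 1-Lipschitz (the mask only zeroes entries), so each iteration does not increase the penalized objective.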
Take home messages
• Rank-constrained minimization is hard to solve (non-convex and NP hard)
• It can be relaxed into a tractable convex problem using the Schatten 1-norm.
How about tensors?
‐ How to define tensor rank?
‐ How is it related to matrix rank?
Rank of a tensor
Definition. Let $\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_K}$ ($K$th-order tensor). The rank of $\mathcal{X}$ is the smallest number $R$ such that $\mathcal{X}$ can be written as
$\mathcal{X} = \sum_{r=1}^{R} \mathcal{A}_r$,
where each $\mathcal{A}_r$ is rank one (can be written as an outer product of $K$ vectors).
• Called CP (CANDECOMP/PARAFAC) decomposition
• Bad news: NP hard to compute the rank $R$ even for a fully observed $\mathcal{X}$.
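To make the definition concrete, here is a small sketch that builds a third-order tensor as a sum of $R$ rank-one terms. The function name `cp_tensor`, the factor matrices `A`, `B`, `C` (one column per term), and the sizes are illustrative assumptions.

```python
import numpy as np

def cp_tensor(A, B, C):
    """Sum over r of the outer products a_r o b_r o c_r (columns index r)."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

rng = np.random.default_rng(0)
n1, n2, n3, R = 4, 5, 6, 2
A, B, C = (rng.standard_normal((n, R)) for n in (n1, n2, n3))
X = cp_tensor(A, B, C)

# The same tensor, built term by term from outer products of three vectors.
X_terms = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(R)
)
```

Constructing such a tensor is easy; the hard direction is going the other way and finding the smallest $R$ for a given tensor.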
Bad news 2: Tensor rank is not closed
$\mathcal{X} = a_1 \circ b_1 \circ c_2 + a_1 \circ b_2 \circ c_1 + a_2 \circ b_1 \circ c_1$ ($\mathcal{X}$ is rank 3)
$\mathcal{Y} = \alpha (a_1 + \tfrac{1}{\alpha} a_2) \circ (b_1 + \tfrac{1}{\alpha} b_2) \circ (c_1 + \tfrac{1}{\alpha} c_2) - \alpha\, a_1 \circ b_1 \circ c_1$ ($\mathcal{Y}$ is rank 2)
$\|\mathcal{X} - \mathcal{Y}\|_F \to 0$ ($\alpha \to \infty$)
Kolda & Bader 2009
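The approximation can be verified numerically. In this sketch the vectors are taken to be standard basis vectors of $\mathbb{R}^2$ (an assumption made for simplicity; any linearly independent pairs would do), and `outer3` and `Y` are helper names introduced here.

```python
import numpy as np

def outer3(a, b, c):
    """Outer product of three vectors, giving a third-order tensor."""
    return np.einsum('i,j,k->ijk', a, b, c)

# Assumed choice: standard basis vectors of R^2 for each mode.
a1, a2 = np.eye(2)
b1, b2 = np.eye(2)
c1, c2 = np.eye(2)

X = outer3(a1, b1, c2) + outer3(a1, b2, c1) + outer3(a2, b1, c1)

def Y(alpha):
    """Two rank-one terms whose difference approaches X as alpha grows."""
    return (alpha * outer3(a1 + a2 / alpha, b1 + b2 / alpha, c1 + c2 / alpha)
            - alpha * outer3(a1, b1, c1))

# Frobenius distance between X and Y(alpha) for growing alpha.
errors = {alpha: np.linalg.norm(X - Y(alpha)) for alpha in (10.0, 100.0, 1000.0)}
```

Expanding the product shows that the difference consists of terms of order $1/\alpha$ and $1/\alpha^2$, so the error shrinks roughly in proportion to $1/\alpha$.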
Tucker decomposition [Tucker 66]
$\mathcal{X}_{ijk} = \sum_{a=1}^{r_1} \sum_{b=1}^{r_2} \sum_{c=1}^{r_3} \mathcal{C}_{abc}\, U^{(1)}_{ia} U^{(2)}_{jb} U^{(3)}_{kc}$
$\mathcal{X} = \mathcal{C} \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$
Core $\mathcal{C}$: $r_1 \times r_2 \times r_3$; factors $U^{(k)}$: $n_k \times r_k$ ($k = 1, 2, 3$); $\mathcal{X}$: $n_1 \times n_2 \times n_3$
• Also known as higher-order SVD [De Lathauwer+00]
• Rank $(r_1, r_2, r_3)$ can be computed in polynomial time using unfolding operations.
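The elementwise formula maps directly onto `numpy.einsum`. The sketch below is illustrative; `tucker_to_tensor` and the chosen sizes are assumptions, and the factors are random rather than orthogonal.

```python
import numpy as np

def tucker_to_tensor(core, U1, U2, U3):
    """X_ijk = sum over a, b, c of core_abc * U1_ia * U2_jb * U3_kc."""
    return np.einsum('abc,ia,jb,kc->ijk', core, U1, U2, U3)

rng = np.random.default_rng(0)
n, r = (6, 5, 4), (3, 2, 2)            # hypothetical sizes and ranks
core = rng.standard_normal(r)
U1, U2, U3 = (rng.standard_normal((nk, rk)) for nk, rk in zip(n, r))
X = tucker_to_tensor(core, U1, U2, U3)
```

The einsum subscripts mirror the indices of the triple sum, which makes the code easy to check against the formula.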
Mode-k unfoldings (matricization)
Computing Tucker rank
• For each $k = 1, \ldots, K$
– Compute the mode-$k$ unfolding $X_{(k)}$
– Compute the (matrix) rank of $X_{(k)}$: $r_k = \mathrm{rank}(X_{(k)})$
Tensor $\mathcal{X}$ is low-rank in the $k$th mode ⇔ matrix $X_{(k)}$ is low-rank (as a matrix)
(Unfolding maps the tensor to the matrix; folding maps it back.)
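A minimal sketch of the procedure, assuming a particular column ordering for the unfolding (the ordering differs between references, but it does not affect the rank). The names `unfold`, `fold`, and `tucker_rank` are introduced here for illustration.

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: mode k indexes the rows, all other modes the columns."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

def fold(M, k, shape):
    """Inverse of unfold for a tensor with the given shape."""
    rest = [s for i, s in enumerate(shape) if i != k]
    return np.moveaxis(M.reshape([shape[k]] + rest), 0, k)

def tucker_rank(X):
    """Matrix rank of every mode-k unfolding."""
    return tuple(int(np.linalg.matrix_rank(unfold(X, k))) for k in range(X.ndim))

rng = np.random.default_rng(0)
core = rng.standard_normal((3, 2, 2))
U = [rng.standard_normal((n, r)) for n, r in zip((6, 5, 4), core.shape)]
X = np.einsum('abc,ia,jb,kc->ijk', core, *U)   # Tucker tensor with a 3 x 2 x 2 core
ranks = tucker_rank(X)
```

For generic random factors and core, the computed ranks coincide with the core dimensions.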
Computing Tucker rank
• For each $k = 1, \ldots, K$
– Compute the mode-$k$ unfolding $X_{(k)}$
– Compute the (matrix) rank of $X_{(k)}$: $r_k = \mathrm{rank}(X_{(k)})$
• Difference between tensor rank and Tucker rank
– Tensor rank is a single number $R$ (may not be easy to compute)
– Tucker rank is defined for each mode (easy to compute)
• CP decomposition is a special case of Tucker decomposition with $R = r_1 = r_2 = \cdots = r_K$ and a diagonal core $\mathcal{C}$ (of size $R \times R \times R$ in the third-order case)
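The special-case relationship can be checked directly. In this sketch the diagonal core is taken to have ones on its superdiagonal, which assumes the CP weights are absorbed into the factor columns; the sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
R = 3
A, B, C = (rng.standard_normal((n, R)) for n in (4, 5, 6))

# CP form: a sum of R rank-one terms.
X_cp = np.einsum('ir,jr,kr->ijk', A, B, C)

# Tucker form with an R x R x R core that is 1 on the superdiagonal and 0 elsewhere.
core = np.zeros((R, R, R))
core[np.arange(R), np.arange(R), np.arange(R)] = 1.0
X_tucker = np.einsum('abc,ia,jb,kc->ijk', core, A, B, C)
```

Both constructions give the same tensor, since the diagonal core keeps only the terms with $a = b = c$.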
Basic idea
• We know how to do matrix completion with the Schatten 1-norm (tractable convex optimization)
• We know how to compute the Tucker rank (= the rank of the mode-$k$ unfolding)
Schatten 1-norm + unfolding = convex, tractable tensor completion
Overlapped Schatten 1-norm for Tensors
$|||\mathcal{W}|||_{S_1} := \frac{1}{K} \sum_{k=1}^{K} \|W_{(k)}\|_{S_1}$
where $\|W_{(k)}\|_{S_1}$ is the Schatten 1-norm of the mode-$k$ unfolding
Measures the overall low-rank-ness (not just a single mode)
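The definition translates into a few lines of NumPy. This is a sketch under the assumption that any consistent column ordering is used for the unfolding (it does not change the singular values); `unfold` and `overlapped_schatten1` are names introduced here.

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: mode k indexes the rows, all other modes the columns."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

def overlapped_schatten1(W):
    """Average over all modes of the nuclear norm of the mode-k unfolding."""
    K = W.ndim
    return sum(np.linalg.norm(unfold(W, k), 'nuc') for k in range(K)) / K

rng = np.random.default_rng(0)
a, b, c = rng.standard_normal(4), rng.standard_normal(5), rng.standard_normal(6)
W = np.einsum('i,j,k->ijk', a, b, c)   # hypothetical rank-one tensor
value = overlapped_schatten1(W)
fro = np.linalg.norm(W)                # Frobenius norm of the tensor
```

For a rank-one tensor every unfolding has a single nonzero singular value, so the overlapped norm equals the Frobenius norm; for a matrix ($K = 2$) it reduces to the usual nuclear norm.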
Convex Tensor Estimation
Matrix: estimation of low-rank matrix (hard) → convex relaxation → Schatten 1-norm minimization (tractable) [Fazel, Hindi, Boyd 01]
Tensor: estimation of low-rank tensor (hard) → convex relaxation → overlapped Schatten 1-norm minimization [Liu+09, Signoretto+10, Tomioka+10, Gandy+11]
Generalize: from the matrix case to the tensor case
Empirical performance
[Figure: error vs. fraction of observed elements $M/(n_1 n_2 n_3)$, no noise; size = [50 50 20], rank = [7 8 9]; curves for the convex method and a nonconvex method, with the optimization tolerance marked]
Tensor completion result [Tomioka et al. 2010]
Phase transition!! Can we predict this theoretically?
Analyzing the performance of convex tensor decomposition
Problem setting
Observation model
$y_i = \langle \mathcal{X}_i, \mathcal{W}^* \rangle + \epsilon_i$ ($i = 1, \ldots, M$)
$\mathcal{W}^*$: true tensor of rank $(r_1, \ldots, r_K)$; $\epsilon_i$: Gaussian noise
Example (tensor completion): $\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3, \mathcal{X}_4$, and so on are $n_1 \times n_2 \times n_3$ tensors, each with a single nonzero entry at the observed position
Problem setting
Observation model: $y_i = \langle \mathcal{X}_i, \mathcal{W}^* \rangle + \epsilon_i$ ($i = 1, \ldots, M$)
$\mathcal{W}^*$: true tensor of rank $(r_1, \ldots, r_K)$; $\epsilon_i$: Gaussian noise $N(0, \sigma^2)$
Optimization
$\hat{\mathcal{W}} = \underset{\mathcal{W} \in \mathbb{R}^{n_1 \times \cdots \times n_K}}{\mathrm{argmin}} \left( \frac{1}{2M} \|y - \mathfrak{X}(\mathcal{W})\|_2^2 + \lambda_M |||\mathcal{W}|||_{S_1} \right)$
(first term: empirical error; second term: regularization; $\lambda_M$: reg. const.)
Observation model in vector form: $\mathfrak{X}(\mathcal{W}) = (\langle \mathcal{X}_1, \mathcal{W} \rangle, \ldots, \langle \mathcal{X}_M, \mathcal{W} \rangle)^\top$, $\mathfrak{X} : \mathbb{R}^N \to \mathbb{R}^M$ ($N = \prod_{k=1}^{K} n_k$)
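The observation model and the objective can be written down directly. The sketch below only evaluates the objective (it is not a solver), uses a random Gaussian design for the observation tensors, and introduces the names `objective`, `Xs`, `lam`, and the sizes as assumptions.

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: mode k indexes the rows, all other modes the columns."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

def overlapped_schatten1(W):
    """Average over modes of the nuclear norm of the mode-k unfolding."""
    return sum(np.linalg.norm(unfold(W, k), 'nuc') for k in range(W.ndim)) / W.ndim

def objective(W, Xs, y, lam):
    """(1/2M) * ||y - X(W)||_2^2 + lam * |||W|||_S1; Xs has shape (M, n1, ..., nK)."""
    M = len(y)
    pred = Xs.reshape(M, -1) @ W.ravel()     # inner products <X_i, W>, i = 1..M
    return 0.5 / M * np.sum((y - pred) ** 2) + lam * overlapped_schatten1(W)

rng = np.random.default_rng(0)
shape, M, sigma = (4, 3, 2), 30, 0.1         # hypothetical problem size and noise level
W_true = np.einsum('i,j,k->ijk', *(rng.standard_normal(n) for n in shape))
Xs = rng.standard_normal((M,) + shape)       # random Gaussian design
y = Xs.reshape(M, -1) @ W_true.ravel() + sigma * rng.standard_normal(M)
obj_true = objective(W_true, Xs, y, lam=0.05)
```

Flattening each observation tensor turns the operator $\mathfrak{X}$ into an ordinary $M \times N$ matrix, which is how the inner products are computed here.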
Analysis objective
• We would like to show something like
$\frac{|||\hat{\mathcal{W}} - \mathcal{W}^*|||_F^2}{N} \le O_p\!\left( \frac{c(n, r)}{M} \right)$
(left-hand side: mean squared error between the estimated tensor $\hat{\mathcal{W}}$ and the true low-rank tensor $\mathcal{W}^*$)
$n = (n_1, \ldots, n_K)$: the size; $r = (r_1, \ldots, r_K)$: the rank; $M$: number of samples
Theorem: random Gauss design
Assume the elements of $\mathcal{X}_i$ are drawn iid from the standard normal distribution. Moreover, assume
$\frac{\#\text{samples } (M)}{\#\text{variables } (N)} \ge c_1 \|n^{-1}\|_{1/2} \|r\|_{1/2}$,
where
$\|n^{-1}\|_{1/2} := \left( \frac{1}{K} \sum_{k=1}^{K} \sqrt{1/n_k} \right)^2$, $\|r\|_{1/2} := \left( \frac{1}{K} \sum_{k=1}^{K} \sqrt{r_k} \right)^2$.
The product $\|n^{-1}\|_{1/2} \|r\|_{1/2}$ is the normalized rank; it equals $r/n$ when $n_k = n$ and $r_k = r$ for all $k$.
Then
$\frac{|||\hat{\mathcal{W}} - \mathcal{W}^*|||_F^2}{N} \le O_p\!\left( \frac{\sigma^2 \|n^{-1}\|_{1/2} \|r\|_{1/2}}{M} \right)$ (Convergence!)
Proof idea
What we want to derive: $\frac{|||\hat{\mathcal{W}} - \mathcal{W}^*|||_F^2}{N} \le O_p\!\left( \frac{c(n, r)}{M} \right)$ ($\hat{\mathcal{W}}$: estimated tensor; $\mathcal{W}^*$: true low-rank tensor)
Since $\hat{\mathcal{W}}$ minimizes the objective, $\mathrm{Obj}(\hat{\mathcal{W}}) \le \mathrm{Obj}(\mathcal{W}^*)$
It is not so hard to see:
$\frac{1}{2M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2 \le \langle \mathfrak{X}^*(\epsilon)/M, \hat{\mathcal{W}} - \mathcal{W}^* \rangle + \lambda_M |||\hat{\mathcal{W}} - \mathcal{W}^*|||_{S_1}$, where $\mathfrak{X}^*(\epsilon) = \sum_{i=1}^{M} \epsilon_i \mathcal{X}_i$
Proof outline (1/3)
$\frac{1}{2M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2 \le \langle \mathfrak{X}^*(\epsilon)/M, \hat{\mathcal{W}} - \mathcal{W}^* \rangle + \lambda_M |||\hat{\mathcal{W}} - \mathcal{W}^*|||_{S_1}$
Inequality 1: upper-bound the dot product (optimization duality / random matrix theory)
$\langle \mathfrak{X}^*(\epsilon)/M, \hat{\mathcal{W}} - \mathcal{W}^* \rangle \le O_p\!\left( \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2}}{M}} \; |||\hat{\mathcal{W}} - \mathcal{W}^*|||_{S_1} \right)$
Hence
$\frac{1}{2M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2 \le \left( \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2}}{M}} + \lambda_M \right) |||\hat{\mathcal{W}} - \mathcal{W}^*|||_{S_1}$
Trade-off between the two terms in the parentheses.
Optimal reg. const: $\lambda_M$ of the order $O_p\!\left( \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2}}{M}} \right)$
Proof outline (2/3)
$\frac{1}{2M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2 \le \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2}}{M}} \; |||\hat{\mathcal{W}} - \mathcal{W}^*|||_{S_1}$
Inequality 2: relate the Schatten 1-norm with the Frobenius norm (relation between $L_1$- and $L_2$-norm)
$|||\hat{\mathcal{W}} - \mathcal{W}^*|||_{S_1} \le \sqrt{\|r\|_{1/2}} \; |||\hat{\mathcal{W}} - \mathcal{W}^*|||_F$
Hence
$\frac{1}{2M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2 \le \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2} \|r\|_{1/2}}{M}} \; |||\hat{\mathcal{W}} - \mathcal{W}^*|||_F$
Proof outline (3/3)
$\frac{1}{2M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2 \le \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2} \|r\|_{1/2}}{M}} \; |||\hat{\mathcal{W}} - \mathcal{W}^*|||_F$
Inequality 3: lower-bound the left-hand side (Gordon-Slepian theorem in Gaussian process theory)
$|||\hat{\mathcal{W}} - \mathcal{W}^*|||_F^2 \lesssim \frac{1}{M} \|\mathfrak{X}(\hat{\mathcal{W}} - \mathcal{W}^*)\|_2^2$ (up to a constant factor)
if $\frac{\#\text{samples } (M)}{\#\text{variables } (N)} \ge c_1 \|n^{-1}\|_{1/2} \|r\|_{1/2}$
Hence
$|||\hat{\mathcal{W}} - \mathcal{W}^*|||_F^2 \lesssim \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2} \|r\|_{1/2}}{M}} \; |||\hat{\mathcal{W}} - \mathcal{W}^*|||_F$
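The last step from the final inequality of the outline to the form of the theorem is short; a sketch, writing $\Delta := \hat{\mathcal{W}} - \mathcal{W}^*$ (a shorthand introduced here), ignoring constants, and assuming $\Delta \ne 0$:

```latex
\begin{aligned}
|||\Delta|||_F^2 &\lesssim \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2}\|r\|_{1/2}}{M}}\;|||\Delta|||_F \\
\Rightarrow\quad |||\Delta|||_F &\lesssim \sqrt{\frac{\sigma^2 N \|n^{-1}\|_{1/2}\|r\|_{1/2}}{M}}
  && \text{(divide by } |||\Delta|||_F \text{)} \\
\Rightarrow\quad \frac{|||\Delta|||_F^2}{N} &\lesssim \frac{\sigma^2 \|n^{-1}\|_{1/2}\|r\|_{1/2}}{M}
  && \text{(square, then divide by } N \text{)}
\end{aligned}
```

The factor $N$ under the square root cancels against the normalization by $N$, which is why the final rate depends only on the normalized rank, the noise level, and $M$.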
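Back to the theorem statement
Assume the elements of $\mathcal{X}_i$ are drawn iid from the standard normal distribution. Moreover, assume
$\frac{\#\text{samples } (M)}{\#\text{variables } (N)} \ge c_1 \|n^{-1}\|_{1/2} \|r\|_{1/2}$ (normalized rank),
where $\|n^{-1}\|_{1/2} := \left( \frac{1}{K} \sum_{k=1}^{K} \sqrt{1/n_k} \right)^2$, $\|r\|_{1/2} := \left( \frac{1}{K} \sum_{k=1}^{K} \sqrt{r_k} \right)^2$.
Then
$\frac{|||\hat{\mathcal{W}} - \mathcal{W}^*|||_F^2}{N} \le O_p\!\left( \frac{\sigma^2 \|n^{-1}\|_{1/2} \|r\|_{1/2}}{M} \right)$ (Convergence!)
Notice:
• The sample-size condition is independent of the noise $\sigma^2$.
• The RHS of the bound is proportional to $\sigma^2$.
⇒ Threshold behavior in the limit $\sigma^2 \to 0$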
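The normalized rank that appears in the sample-size condition is easy to compute. The sketch below uses the sizes and ranks quoted for the experiments in this talk as inputs; the function names `half_norm` and `normalized_rank` are assumptions, and the unknown constant $c_1$ is left out.

```python
import numpy as np

def half_norm(v):
    """((1/K) * sum_k sqrt(v_k))^2, the 1/2-'norm' used in the theorem."""
    v = np.asarray(v, dtype=float)
    return float(np.mean(np.sqrt(v)) ** 2)

def normalized_rank(n, r):
    """Product of the 1/2-norm of the inverse sizes and the 1/2-norm of the ranks."""
    return half_norm(1.0 / np.asarray(n, dtype=float)) * half_norm(r)

nr_a = normalized_rank((50, 50, 20), (7, 8, 9))
nr_b = normalized_rank((50, 50, 20), (40, 9, 7))
nr_equal = normalized_rank((30, 30, 30), (3, 3, 3))   # equal sizes and ranks: r / n
```

A larger normalized rank means the condition asks for a larger fraction $M/N$ of samples, so under this measure the rank (40, 9, 7) case is harder than the rank (7, 8, 9) case for the same size.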
Tensor completion results
No observation noise; size = 50x50x20, true rank 7x8x9 or 40x9x7
[Figure, left: estimation error vs. fraction of observed elements (#samples $M$ / #variables $N$); curves: Convex [7 8 9], Convex [40 9 7], optimization tolerance; the error level 0.01 is marked for rank=[7,8,9] and rank=[40,9,7]]
[Figure, right: fraction $M/N$ at error<=0.01 vs. normalized rank $\|n^{-1}\|_{1/2} \|r\|_{1/2}$; legend: size=[50 50 20], size=[100 100 50]]
Including 4th order tensors
[Figure: fraction $M/N$ at error<=0.01 vs. normalized rank $\|n^{-1}\|_{1/2} \|r\|_{1/2}$; legend: size=[50 50 20], size=[100 100 50], size=[50 50 20 10], size=[100 100 20 10]]
Gap between necessity & sufficiency?
Matrix / tensor completion
[Figure, left: matrix completion ($K = 2$); fraction $M/N$ at error<=0.01 vs. normalized rank; legend: size=[50 20], size=[100 40], size=[250 200]]
[Figure, right: tensor completion ($K \ge 3$); fraction $M/N$ at error<=0.01 vs. normalized rank; legend: size=[50 50 20], size=[100 100 50], size=[50 50 20 10], size=[100 100 20 10]]
Tensor completion easier than matrix completion!?
Conclusion
• Many real-world problems can be cast into the form of tensor data analysis.
• Convex optimization is a useful tool also for the analysis of higher-order tensors.
• Proposed a convex tensor decomposition algorithm with a performance guarantee
• Normalized rank predicts empirical scaling behavior well
Issues
• Why is matrix completion more difficult than tensor completion?
• How big is the gap between necessity and sufficiency?
• Random Gaussian design ≠ tensor completion
⇒ Incoherence (Candes & Recht 09)
⇒ Spikiness (Negahban et al 10)
• When only some modes are low-rank
– Schatten 1-norm is too strong ⇒ Mixture approach
– E.g. modes 1 and 4 are low-rank but the rest are not (combinatorial problem)
Thank you!