From Matrix to Tensor:
The Transition to Numerical Multilinear Algebra
Lecture 8. Multilinear Rayleigh Quotients
Charles F. Van Loan
Cornell University
The Gene Golub SIAM Summer School 2010
Selva di Fasano, Brindisi, Italy
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Where We Are
Lecture 1. Introduction to Tensor Computations
Lecture 2. Tensor Unfoldings
Lecture 3. Transpositions, Kronecker Products, Contractions
Lecture 4. Tensor-Related Singular Value Decompositions
Lecture 5. The CP Representation and Tensor Rank
Lecture 6. The Tucker Representation
Lecture 7. Other Decompositions and Nearness Problems
Lecture 8. Multilinear Rayleigh Quotients
Lecture 9. The Curse of Dimensionality
Lecture 10. Special Topics
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
What is this Lecture About?
The Goal is Quite Simple to State...
Extend the Notions of Eigenvalues and Singular Values to Tensors
An Avenue Not Open...
Forget about revealing decompositions analogous to
Q−1AQ = diag(λ1, . . . , λn)
Hyperdeterminants?
Generalizing the idea of a characteristic polynomial...
det(A− λI ) = 0?
We’ll check out something more obvious...
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
What is this Lecture About?
Rayleigh Quotients!
The eigenvalues and eigenvectors of a symmetric matrix C can be
defined by the quotient
xTCx
xT x
and the singular values and singular vectors of a general matrix A
can be defined by the quotient
uTAv
‖ u ‖‖ v ‖
These are called Rayleigh Quotients and they can be generalized to
tensors.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
Definition
If C ∈ IRN×N is symmetric, then
φC(x) =
xTC x
xT x
=
N∑
i1=1
N∑
i2=1
C (i1, i2)x(i1)x(i2)
/(xT x)
is a Rayleigh quotient.
Gradient
If x ∈ IRN is a unit vector, then
∇φC = 2 (C x − φC(x) · x) .
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
The Eigenvalue Connection
If ∇φC(x) = 0 then
C x = φC(x) · x
Thus, the stationary values (vectors) of φC define the eigenvalues
(eigenvectors) of C .
This has many implications for both analysis and algorithms...
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
Largest Eigenvalue...
The power method is a way of maximizing |φC(x)|.
Rayleigh Quotient Iteration for Specific Interior Eigenvalues...
x = good initial eigenvector guess.
Repeat:
λ = φC(x)
Solve (C − λI )z = x and set x = z/‖ z ‖
Lanczos for Extremal Eigenvalues...
Maximize/minimize φC over subspaces S1, S2, . . . where
Sk = span{ x0, Cx0,C 2x0, . . . ,C k−1x0 }
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
For Singular Values...
ψA(u, v) =
uTAv
‖ u ‖2‖ v ‖2
=
n1∑
i1=1
n2∑
i2=1
A(i1, i2)u(i1)v(i2)
/ (‖ u ‖2‖ v ‖2)
Gradient
If u and v are unit vectors, then
∇ψA(u, v) =
[
Av − ψA(u, v)u
ATu − ψA(u, v)v
]
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
The SVD Connection
If ∇ψA(u, v) = 0, then
Av = ψA(u, v) · u
ATu = ψA(u, v) · v
Thus, the stationary values (vectors) for ψA define the singular
values (singular vectors) for A.
Find a u that is a multiple of Av and (by the way) that same v had
better be the same multiple of ATu.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
Power Method
To minimize ‖ Av − σu ‖22 + ‖ ATu − σv ‖
2
2 we
(a) Look at the first term and use the current v to improve the
current u this way...
u˜ = Av , σ = ‖ u˜ ‖, u ← u˜/σ
(b) Look at the second term and use the current u to improve the
current v this way...
v˜ = ATu, σ = ‖ v˜ ‖, v ← v˜/σ
Repeat ’til happy
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
The “Sym” of a Matrix
sym(A) =
[
0 A
AT 0
]
∈ IR(n1+n2)×(n1+n2)
The SVD of A Relates to the EVD of sym(A)
If A = U · diag(σi ) · V T is the SVD of A ∈ IRn1×n2 , then for
k = 1:rank(A)[
0 A
AT 0
] [
uk
±vk
]
= ±σk
[
uk
±vk
]
where uk = U(:, k) and vk = V (:, k).
The above SVD-related power method was essentially traditional power
method applied to finding the largest eigenvector of sym(A).
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Rayleigh Quotients: Matrix Case
The Stationary Values of ψA and φsym(A)
Suppose C = sym(A) where A ∈ IRn1×n2 . If
x =
[
u
v
]
u ∈ IRn1 , v ∈ IRn2
is a stationary vector for φC , then u and v render a pair of
stationary values for φA, i.e.,
ψ(u, v) = φ(x)
ψ(u,−v) = −φ(x)
Now let’s graduate to tensors...
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Two Preliminaries
Tensor Symmetry
For tensors, there are many types of symmetry, e.g.,
A(i1, i2, i3, i4) =
A(i2, i1, i3, i4)
A(i3, i4, i1, i2)
A(i4, i3, i2, i1)
What do we need?
sym-of-a-Tensor
Anticipating connections between tensor eigenvalues and tensor
singular values, what is the tensor analog of sym-of-a-matrix?
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Tensors
Definition
We say that an order-d tensor C ∈ IRN×···×N is symmetric if for
any permutation p of 1:d we have
C(i) = C(i(p)) 1 ≤ i ≤ N
E.g.,
C(i1, i2, i3) =
C(i1, i3, i2)
C(i2, i1, i3)
C(i2, i3, i1)
C(i3, i1, i2)
C(i3, i2, i1)
Many authors refer to this as supersymmetry but that terminology is
falling out of favor.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Tensors
Problem 8.1. How many distinct values can there be in a symmetric
2-by-2-by-2 tensor?
Problem 8.2. Write a Matlab function A = RandSymTensor3(N) that
generates a random N-by-N-by-N symmetric tensor. Make effective use of
the Tensor Toolbox function permute.
Problem 8.3. Write a Matlab function A = RandSymTensor(N,d) that
generates a random order-d symmetric tensor with dimension N.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Tensors
Rank-1 Symmetric Tensors
If x ∈ IRN , then
C = x ◦ x ◦ x ◦ x
is a symmetric rank-1 tensor. This is obvious since
C(i1, i2, i3, i4) = x(i1)x(i2)x(i3)x(i4).
Note that
vec(x ◦ x ◦ · · · ◦ x) = x ⊗ x ⊗ · · · ⊗ x
Problem 8.4. If C ∈ IRN×···×N is an order-d symmetric tensor, then can we
find a unit vector x ∈ IRN and a scalar σ so that
C = σ · x ◦ · · · ◦ x?
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Tensors
Unfoldings
Symmetries show up when a symmetric tensor is unfolded. For
example, if C ∈ IRN×N×N×N is symmetric, then
M = tenmat(C, [3 1], [4 2]) =
C(1, 1, :, :) · · · C(1,N, :, :)... . . . ...
C(N, 1, :, :) · · · C(N,N, :, :)
This matrix is block symmetric and its blocks are themselves
symmetric. In addition, STN,NMSN,N = M where SN,N is the perfect
shuffle.
Problem 8.5. What can you say about M’s eigensystem? Is it possible to
write C as a sum of symmetric rank-1 tensors using subvectors of M’s
eigenvectors?
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Tensors
Symmetric Rank
An order-d symmetric tensor C ∈ IRN×···×N has symmetric rank r if
there exists x1, . . . , xr ∈ IRN and σ ∈ IRr such that
C =
r∑
k=1
σk · xk ◦ · · · ◦ xk
and no shorter sum of symmetric rank-1 tensors exists. Symmetric
rank is denoted by rankS(C).
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Tensors
Fact
If C ∈ CN×···×N is an order-d symmetric tensor, then with
probability 1
rankS(C) =
f (d ,N) + 1 if (d ,N) = (3,5),(4,3),(4,4), or (4,5)
f (d ,N) otherwise
where
f (d ,N) = ceil
(
N + d − 1
d
)
N
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Embedding of a Tensor
Problem Statement
Given A ∈ IRn1×···×nd , construct a symmetric tensor C that has A
as a subtensor. In other words, we want the analog of
A −→
[
0 A
AT 0
]
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Embedding of a Tensor
An Order-3 Example...
The careful placement of A’s six transposes
C(:, :, 1)
�
�
�
�
�
�
�
�
�
�
�
�
�
�
�
�A<[321]>
A<[231]>
C(:, :, 2)
�
�
�
�
�
�
�
�
�
�
�
�
�
�
�
�A<[312]>
A<[132]>
C(:, :, 3)
�
�
�
�
�
�
�
�
�
�
�
�
�
�
�
�
A<[123]>
A<[213]>
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Embedding of a Tensor
For Order-3 Tensors...
If C = sym(A) then C is a 3-by-3-by-3 block tensor and here is the
recipe for block Cijk :
Cijk =
A< [i j k]> if [i j k] is a permutation of [1 2 3]
zeros(ni , nj , nk) otherwise.
Problem 8.6. Write a Matlab function C = Sym3(A) that computes the
sym of an order-3 tensor A. Make effective use of the Tensor Toolbox
function permute.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Symmetric Embedding of a Tensor
Interesting Possible Connection
Easy:
d! rank(A) ≤ rankS(sym(A))
Equality is hard or perhaps not true. But if it could be established,
then we would have new insight into the tensor rank problem.
Problem 8.7. The rank of
C = sym(A) =
»
0 A
AT 0
–
is exactly twice the rank of A. This is easy to show using the SVD, but can
you prove the result without making use of any matrix factorizations? This
previews the difficulty of showing d! rank(A) = rankS(sym(A)) since we
have no rank-revealing decompositions to help us in the tensor case.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Singular Value Rayleigh Quotients For General Tensors
Definition
ψA(u1, u2, u3) =
n∑
i=1
A(i) · u1(i1) u2(i2) u3(i3)
‖ u1 ‖ ‖ u2 ‖ ‖ u3 ‖
where A ∈ IRn1×n2×n3 ,u1 ∈ IRn1 , u2 ∈ IRn2 , and u3 ∈ IRn3 .
Alternative Formulations
ψA(u1, u2, u3) =
uT1 A(1)(u3 ⊗ u2) / (‖ u1 ‖‖ u2 ‖‖ u3 ‖)
uT2 A(2)(u3 ⊗ u1) / (‖ u1 ‖‖ u2 ‖‖ u3 ‖)
uT3 A(3)(u2 ⊗ u1) / (‖ u1 ‖‖ u2 ‖‖ u3 ‖)
Easy to get the gradient of ψA...
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Singular Value Rayleigh Quotients For General Tensors
Gradient of ψA
If u1, u2, and u3 are unit vectors then
∇ψA =
A(1)(u3 ⊗ u2)
A(2)(u3 ⊗ u1)
A(3)(u2 ⊗ u1)
− ψA(u1, u2, u3)
u1
u2
u3
If we can satisfy this equation, then we will call ψA(u1, u2, u3) a singular
value of the tensor A.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Singular Value Rayleigh Quotients For General Tensors
Higher-Order Power Method
Initialize unit vectors u1, u2, and u3.
Repeat
u˜1 = A(1)(u3 ⊗ u2), u1 = u˜1/‖ u˜1 ‖
u˜2 = A(2)(u3 ⊗ u1), u2 = u˜2/‖ u˜2 ‖
u˜3 = A(3)(u2 ⊗ u1), u3 = u˜3/‖ u˜3 ‖
σ = ψ(u1, u2, u3)
This algorithm converges and it approximately solves the nearest rank-1
tensor problem
‖ A − σ · u1 ◦ u2 ◦ u3 ‖ = min
Forget about iterating to get the best rank-r approximation.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Singular Value Rayleigh Quotients For General Tensors
Problem 8.8. Implement this procedure using the Tensor Toolbox tenmat
function and taking steps to exploit structure in the products
(Matrix)-times-(Kronecker product of vectors). Terminate the iteration
when the gradient is sufficiently small. What is the behavior of the
iteration? Is it sensitive to the choice of the initial ui?
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Singular Value Rayleigh Quotients For General Tensors
2nd-Order Power Method
Initialize unit vectors u1 and u2.
Repeat
u˜1 = A(1)u2, u1 = u˜1/‖ u˜1 ‖
u˜2 = A(2)u1, u2 = u˜2/‖ u˜2 ‖
new sigma = ψ(u1, u2)
Glossary: A(1) = A, A(2) = A
T , u1 = v, u2 = u.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Eigenvalue Rayleigh Quotients For Symmetric Tensors
Definition
φC(x) =
N∑
i=1
C(i) · x(i1) x(i2) x(i3)
‖ x ‖3
where C ∈ IRN×N×N is symmetric and x ∈ IRN .
Alternative Formulation
φC(x) =
xTC(k)(x ⊗ x)
‖ x ‖3 k doesn’t matter
Easy to get the gradient of φC ...
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Eigenvalue Rayleigh Quotients For Symmetric Tensors
Gradient of φC
If x is a unit vector, then
∇φC(x) = C(1)(x ⊗ x) − φC(x) · x
Idea for improving x :
x˜ = C(1)(x ⊗ x), λ = ‖ x˜ ‖, x ← x˜/λ
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Eigenvalue Rayleigh Quotients For Symmetric Tensors
Symmetric Higher-Order Power Method
Initialize unit vector x .
Repeat
x˜ = C(1)(x ⊗ x)
λ = ‖ x˜ ‖
x = x˜/λ
Sample Convergence Result
If the order of C is even and M is a square unfolding, then the
iteration converges if M is positive definite.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Eigenvalue Rayleigh Quotients For Symmetric Tensors
Equivalence
If A ∈ IRn1×···×nd , then applying the Higher-Order Power Method
to A, is “equivalent” to applying the Symmetric Higher order
Power Method to sym(A).
Exiting the former with σ, u1, u2 and u3 is essentially the same as
exiting the latter with σ and
x =
u1u2
u3
.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Eigenvalue Rayleigh Quotients For Symmetric Tensors
Theorem
If σ,
u1u2
u3
is a stationary pair for sym(C) then so are σ,
u1−u2
−u3
,
−σ,
u1−u2
u3
,
−σ,
u1u2
−u3
Recall that for matrices...[
0 A
AT 0
] [
uk
±vk
]
= ±σk
[
uk
±vk
]
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
A Lanczos Procedure for the HOSVD
Lanczos SVD
Suppose UTAV = B is bidiagonal. We appreciate this as an SVD
precomputation.
But if we compare columns in
A
[
v1 v2 v3 v4
]
=
[
u1 u2 u3 u4
]
α1 β2 0 0
0 α2 β3 0
0 0 α3 β4
0 0 0 α4
then we get... and
AT
[
u1 u2 u3 u4
]
=
[
v1 v2 v3 v4
]
α1 0 0 0
β2 α2 0 0
0 β3 α3 0
0 0 β4 α4
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
A Lanczos Procedure for the HOSVD
Lanczos SVD
v1 Given unit vector
Av1 = α1u1 Defines α1 and u1
ATu1 = α1v1 + β2v2 Defines β2 and v2
Av2 = β1u1 + α2u2 Defines α2 and u2
ATu2 = α2v2 + β3v3 Defines β3 and v3
etc
The Miracle of Lanczos..
After k-steps
A
[
v1 · · · vk
]
=
[
u1 · · · uk
]
B(1:k, 1:k) + rank-1
and the u-vectors do a great job of approximating the dominant
left singular vector subspace.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
A Lanczos Procedure for the HOSVD
The Order-3 Case
A u-vector, a v -vector, and a w -vector are generated each step.
Modal tensor-matrix products are involved.
After k steps we have a length-k Tucker approximation:
A ≈ Sk ×1 U(k) ×2 V (k) ×3 W (k)
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients
Summary of Lecture 8.
Key Words
Symmetric tensors have the property that the value of A(i)
does not depend upon the order the indices. They arise in
many settings and special properties.
Setting to zero the gradient of a multilinear Rayleigh quotient
leads to the idea of eigenvalues and singular value of tensors.
Higher-order power methods based on the Rayleigh quotient
can be used to find nearest rank-1 approximations to either
general or symmetric tensors
A Lanczos SVD procedure for tensors can be formulated and
used to construct low rank approximations to a sparse tensor.
Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients