8 Lecture8-37下载_在线阅读_37

is_399148

暂无简介

8 Lecture8-37 From Matrix to Tensor: The Transition to Numerical Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Charles F. Van Loan Cornell University The Gene Golub SIAM Summer School 2010 Selva di Fasano, Brindisi, Italy Transition to Computational Multilin...

From Matrix to Tensor: The Transition to Numerical Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Charles F. Van Loan Cornell University The Gene Golub SIAM Summer School 2010 Selva di Fasano, Brindisi, Italy Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Where We Are Lecture 1. Introduction to Tensor Computations Lecture 2. Tensor Unfoldings Lecture 3. Transpositions, Kronecker Products, Contractions Lecture 4. Tensor-Related Singular Value Decompositions Lecture 5. The CP Representation and Tensor Rank Lecture 6. The Tucker Representation Lecture 7. Other Decompositions and Nearness Problems Lecture 8. Multilinear Rayleigh Quotients Lecture 9. The Curse of Dimensionality Lecture 10. Special Topics Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients What is this Lecture About? The Goal is Quite Simple to State... Extend the Notions of Eigenvalues and Singular Values to Tensors An Avenue Not Open... Forget about revealing decompositions analogous to Q−1AQ = diag(λ1, . . . , λn) Hyperdeterminants? Generalizing the idea of a characteristic polynomial... det(A− λI ) = 0? We’ll check out something more obvious... Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients What is this Lecture About? Rayleigh Quotients! The eigenvalues and eigenvectors of a symmetric matrix C can be defined by the quotient xTCx xT x and the singular values and singular vectors of a general matrix A can be defined by the quotient uTAv ‖ u ‖‖ v ‖ These are called Rayleigh Quotients and they can be generalized to tensors. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case Definition If C ∈ IRN×N is symmetric, then φC(x) = xTC x xT x =  N∑ i1=1 N∑ i2=1 C (i1, i2)x(i1)x(i2) /(xT x) is a Rayleigh quotient. Gradient If x ∈ IRN is a unit vector, then ∇φC = 2 (C x − φC(x) · x) . Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case The Eigenvalue Connection If ∇φC(x) = 0 then C x = φC(x) · x Thus, the stationary values (vectors) of φC define the eigenvalues (eigenvectors) of C . This has many implications for both analysis and algorithms... Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case Largest Eigenvalue... The power method is a way of maximizing |φC(x)|. Rayleigh Quotient Iteration for Specific Interior Eigenvalues... x = good initial eigenvector guess. Repeat: λ = φC(x) Solve (C − λI )z = x and set x = z/‖ z ‖ Lanczos for Extremal Eigenvalues... Maximize/minimize φC over subspaces S1, S2, . . . where Sk = span{ x0, Cx0,C 2x0, . . . ,C k−1x0 } Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case For Singular Values... ψA(u, v) = uTAv ‖ u ‖2‖ v ‖2 =  n1∑ i1=1 n2∑ i2=1 A(i1, i2)u(i1)v(i2) / (‖ u ‖2‖ v ‖2) Gradient If u and v are unit vectors, then ∇ψA(u, v) = [ Av − ψA(u, v)u ATu − ψA(u, v)v ] Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case The SVD Connection If ∇ψA(u, v) = 0, then Av = ψA(u, v) · u ATu = ψA(u, v) · v Thus, the stationary values (vectors) for ψA define the singular values (singular vectors) for A. Find a u that is a multiple of Av and (by the way) that same v had better be the same multiple of ATu. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case Power Method To minimize ‖ Av − σu ‖22 + ‖ ATu − σv ‖ 2 2 we (a) Look at the first term and use the current v to improve the current u this way... u˜ = Av , σ = ‖ u˜ ‖, u ← u˜/σ (b) Look at the second term and use the current u to improve the current v this way... v˜ = ATu, σ = ‖ v˜ ‖, v ← v˜/σ Repeat ’til happy Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case The “Sym” of a Matrix sym(A) = [ 0 A AT 0 ] ∈ IR(n1+n2)×(n1+n2) The SVD of A Relates to the EVD of sym(A) If A = U · diag(σi ) · V T is the SVD of A ∈ IRn1×n2 , then for k = 1:rank(A)[ 0 A AT 0 ] [ uk ±vk ] = ±σk [ uk ±vk ] where uk = U(:, k) and vk = V (:, k). The above SVD-related power method was essentially traditional power method applied to finding the largest eigenvector of sym(A). Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Rayleigh Quotients: Matrix Case The Stationary Values of ψA and φsym(A) Suppose C = sym(A) where A ∈ IRn1×n2 . If x = [ u v ] u ∈ IRn1 , v ∈ IRn2 is a stationary vector for φC , then u and v render a pair of stationary values for φA, i.e., ψ(u, v) = φ(x) ψ(u,−v) = −φ(x) Now let’s graduate to tensors... Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Two Preliminaries Tensor Symmetry For tensors, there are many types of symmetry, e.g., A(i1, i2, i3, i4) =  A(i2, i1, i3, i4) A(i3, i4, i1, i2) A(i4, i3, i2, i1) What do we need? sym-of-a-Tensor Anticipating connections between tensor eigenvalues and tensor singular values, what is the tensor analog of sym-of-a-matrix? Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Tensors Definition We say that an order-d tensor C ∈ IRN×···×N is symmetric if for any permutation p of 1:d we have C(i) = C(i(p)) 1 ≤ i ≤ N E.g., C(i1, i2, i3) =  C(i1, i3, i2) C(i2, i1, i3) C(i2, i3, i1) C(i3, i1, i2) C(i3, i2, i1) Many authors refer to this as supersymmetry but that terminology is falling out of favor. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Tensors Problem 8.1. How many distinct values can there be in a symmetric 2-by-2-by-2 tensor? Problem 8.2. Write a Matlab function A = RandSymTensor3(N) that generates a random N-by-N-by-N symmetric tensor. Make effective use of the Tensor Toolbox function permute. Problem 8.3. Write a Matlab function A = RandSymTensor(N,d) that generates a random order-d symmetric tensor with dimension N. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Tensors Rank-1 Symmetric Tensors If x ∈ IRN , then C = x ◦ x ◦ x ◦ x is a symmetric rank-1 tensor. This is obvious since C(i1, i2, i3, i4) = x(i1)x(i2)x(i3)x(i4). Note that vec(x ◦ x ◦ · · · ◦ x) = x ⊗ x ⊗ · · · ⊗ x Problem 8.4. If C ∈ IRN×···×N is an order-d symmetric tensor, then can we find a unit vector x ∈ IRN and a scalar σ so that C = σ · x ◦ · · · ◦ x? Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Tensors Unfoldings Symmetries show up when a symmetric tensor is unfolded. For example, if C ∈ IRN×N×N×N is symmetric, then M = tenmat(C, [3 1], [4 2]) =  C(1, 1, :, :) · · · C(1,N, :, :)... . . . ... C(N, 1, :, :) · · · C(N,N, :, :)  This matrix is block symmetric and its blocks are themselves symmetric. In addition, STN,NMSN,N = M where SN,N is the perfect shuﬄe. Problem 8.5. What can you say about M’s eigensystem? Is it possible to write C as a sum of symmetric rank-1 tensors using subvectors of M’s eigenvectors? Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Tensors Symmetric Rank An order-d symmetric tensor C ∈ IRN×···×N has symmetric rank r if there exists x1, . . . , xr ∈ IRN and σ ∈ IRr such that C = r∑ k=1 σk · xk ◦ · · · ◦ xk and no shorter sum of symmetric rank-1 tensors exists. Symmetric rank is denoted by rankS(C). Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Tensors Fact If C ∈ CN×···×N is an order-d symmetric tensor, then with probability 1 rankS(C) =  f (d ,N) + 1 if (d ,N) = (3,5),(4,3),(4,4), or (4,5) f (d ,N) otherwise where f (d ,N) = ceil  ( N + d − 1 d ) N  Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Embedding of a Tensor Problem Statement Given A ∈ IRn1×···×nd , construct a symmetric tensor C that has A as a subtensor. In other words, we want the analog of A −→ [ 0 A AT 0 ] Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Embedding of a Tensor An Order-3 Example... The careful placement of A’s six transposes C(:, :, 1) � � � � � � � � � � � � � � � �A<[321]> A<[231]> C(:, :, 2) � � � � � � � � � � � � � � � �A<[312]> A<[132]> C(:, :, 3) � � � � � � � � � � � � � � � � A<[123]> A<[213]> Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Embedding of a Tensor For Order-3 Tensors... If C = sym(A) then C is a 3-by-3-by-3 block tensor and here is the recipe for block Cijk : Cijk =  A< [i j k]> if [i j k] is a permutation of [1 2 3] zeros(ni , nj , nk) otherwise. Problem 8.6. Write a Matlab function C = Sym3(A) that computes the sym of an order-3 tensor A. Make effective use of the Tensor Toolbox function permute. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Symmetric Embedding of a Tensor Interesting Possible Connection Easy: d! rank(A) ≤ rankS(sym(A)) Equality is hard or perhaps not true. But if it could be established, then we would have new insight into the tensor rank problem. Problem 8.7. The rank of C = sym(A) = » 0 A AT 0 – is exactly twice the rank of A. This is easy to show using the SVD, but can you prove the result without making use of any matrix factorizations? This previews the difficulty of showing d! rank(A) = rankS(sym(A)) since we have no rank-revealing decompositions to help us in the tensor case. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Singular Value Rayleigh Quotients For General Tensors Definition ψA(u1, u2, u3) = n∑ i=1 A(i) · u1(i1) u2(i2) u3(i3) ‖ u1 ‖ ‖ u2 ‖ ‖ u3 ‖ where A ∈ IRn1×n2×n3 ,u1 ∈ IRn1 , u2 ∈ IRn2 , and u3 ∈ IRn3 . Alternative Formulations ψA(u1, u2, u3) =  uT1 A(1)(u3 ⊗ u2) / (‖ u1 ‖‖ u2 ‖‖ u3 ‖) uT2 A(2)(u3 ⊗ u1) / (‖ u1 ‖‖ u2 ‖‖ u3 ‖) uT3 A(3)(u2 ⊗ u1) / (‖ u1 ‖‖ u2 ‖‖ u3 ‖) Easy to get the gradient of ψA... Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Singular Value Rayleigh Quotients For General Tensors Gradient of ψA If u1, u2, and u3 are unit vectors then ∇ψA =  A(1)(u3 ⊗ u2) A(2)(u3 ⊗ u1) A(3)(u2 ⊗ u1)  − ψA(u1, u2, u3)  u1 u2 u3  If we can satisfy this equation, then we will call ψA(u1, u2, u3) a singular value of the tensor A. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Singular Value Rayleigh Quotients For General Tensors Higher-Order Power Method Initialize unit vectors u1, u2, and u3. Repeat u˜1 = A(1)(u3 ⊗ u2), u1 = u˜1/‖ u˜1 ‖ u˜2 = A(2)(u3 ⊗ u1), u2 = u˜2/‖ u˜2 ‖ u˜3 = A(3)(u2 ⊗ u1), u3 = u˜3/‖ u˜3 ‖ σ = ψ(u1, u2, u3) This algorithm converges and it approximately solves the nearest rank-1 tensor problem ‖ A − σ · u1 ◦ u2 ◦ u3 ‖ = min Forget about iterating to get the best rank-r approximation. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Singular Value Rayleigh Quotients For General Tensors Problem 8.8. Implement this procedure using the Tensor Toolbox tenmat function and taking steps to exploit structure in the products (Matrix)-times-(Kronecker product of vectors). Terminate the iteration when the gradient is sufficiently small. What is the behavior of the iteration? Is it sensitive to the choice of the initial ui? Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Singular Value Rayleigh Quotients For General Tensors 2nd-Order Power Method Initialize unit vectors u1 and u2. Repeat u˜1 = A(1)u2, u1 = u˜1/‖ u˜1 ‖ u˜2 = A(2)u1, u2 = u˜2/‖ u˜2 ‖ new sigma = ψ(u1, u2) Glossary: A(1) = A, A(2) = A T , u1 = v, u2 = u. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Eigenvalue Rayleigh Quotients For Symmetric Tensors Definition φC(x) = N∑ i=1 C(i) · x(i1) x(i2) x(i3) ‖ x ‖3 where C ∈ IRN×N×N is symmetric and x ∈ IRN . Alternative Formulation φC(x) = xTC(k)(x ⊗ x) ‖ x ‖3 k doesn’t matter Easy to get the gradient of φC ... Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Eigenvalue Rayleigh Quotients For Symmetric Tensors Gradient of φC If x is a unit vector, then ∇φC(x) = C(1)(x ⊗ x) − φC(x) · x Idea for improving x : x˜ = C(1)(x ⊗ x), λ = ‖ x˜ ‖, x ← x˜/λ Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Eigenvalue Rayleigh Quotients For Symmetric Tensors Symmetric Higher-Order Power Method Initialize unit vector x . Repeat x˜ = C(1)(x ⊗ x) λ = ‖ x˜ ‖ x = x˜/λ Sample Convergence Result If the order of C is even and M is a square unfolding, then the iteration converges if M is positive definite. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Eigenvalue Rayleigh Quotients For Symmetric Tensors Equivalence If A ∈ IRn1×···×nd , then applying the Higher-Order Power Method to A, is “equivalent” to applying the Symmetric Higher order Power Method to sym(A). Exiting the former with σ, u1, u2 and u3 is essentially the same as exiting the latter with σ and x =  u1u2 u3  . Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Eigenvalue Rayleigh Quotients For Symmetric Tensors Theorem If  σ,  u1u2 u3  is a stationary pair for sym(C) then so are σ,  u1−u2 −u3  ,  −σ,  u1−u2 u3  ,  −σ,  u1u2 −u3  Recall that for matrices...[ 0 A AT 0 ] [ uk ±vk ] = ±σk [ uk ±vk ] Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients A Lanczos Procedure for the HOSVD Lanczos SVD Suppose UTAV = B is bidiagonal. We appreciate this as an SVD precomputation. But if we compare columns in A [ v1 v2 v3 v4 ] = [ u1 u2 u3 u4 ]  α1 β2 0 0 0 α2 β3 0 0 0 α3 β4 0 0 0 α4  then we get... and AT [ u1 u2 u3 u4 ] = [ v1 v2 v3 v4 ]  α1 0 0 0 β2 α2 0 0 0 β3 α3 0 0 0 β4 α4  Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients A Lanczos Procedure for the HOSVD Lanczos SVD v1 Given unit vector Av1 = α1u1 Defines α1 and u1 ATu1 = α1v1 + β2v2 Defines β2 and v2 Av2 = β1u1 + α2u2 Defines α2 and u2 ATu2 = α2v2 + β3v3 Defines β3 and v3 etc The Miracle of Lanczos.. After k-steps A [ v1 · · · vk ] = [ u1 · · · uk ] B(1:k, 1:k) + rank-1 and the u-vectors do a great job of approximating the dominant left singular vector subspace. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients A Lanczos Procedure for the HOSVD The Order-3 Case A u-vector, a v -vector, and a w -vector are generated each step. Modal tensor-matrix products are involved. After k steps we have a length-k Tucker approximation: A ≈ Sk ×1 U(k) ×2 V (k) ×3 W (k) Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients Summary of Lecture 8. Key Words Symmetric tensors have the property that the value of A(i) does not depend upon the order the indices. They arise in many settings and special properties. Setting to zero the gradient of a multilinear Rayleigh quotient leads to the idea of eigenvalues and singular value of tensors. Higher-order power methods based on the Rayleigh quotient can be used to find nearest rank-1 approximations to either general or symmetric tensors A Lanczos SVD procedure for tensors can be formulated and used to construct low rank approximations to a sparse tensor. Transition to Computational Multilinear Algebra Lecture 8. Multilinear Rayleigh Quotients

本文档为【8 Lecture8-37】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。

8 Lecture8-37

热门搜索

历史搜索