From Matrix to Tensor:
The Transition to Numerical Multilinear Algebra
Lecture 9. The Curse of Dimensionality
Charles F. Van Loan
Cornell University
The Gene Golub SIAM Summer School 2010
Selva di Fasano, Brindisi, Italy
Where We Are
Lecture 1. Introduction to Tensor Computations
Lecture 2. Tensor Unfoldings
Lecture 3. Transpositions, Kronecker Products, Contractions
Lecture 4. Tensor-Related Singular Value Decompositions
Lecture 5. The CP Representation and Tensor Rank
Lecture 6. The Tucker Representation
Lecture 7. Other Decompositions and Nearness Problems
Lecture 8. Multilinear Rayleigh Quotients
Lecture 9. The Curse of Dimensionality
Lecture 10. Special Topics
What is this Lecture About?

Big Problems

1. A single N-by-N matrix problem is big if N is big.
2. A problem that involves N small p-by-p problems is big if N is big.
3. A problem that involves a tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times \cdots \times n_d}$ is big if $N = n_1 \cdots n_d$ is big, and that can happen rather easily if d is big.
What is this Lecture About?

Data Sparse Representation

We are used to solving big matrix problems when the matrix is data sparse, i.e., when $A \in \mathbb{R}^{N \times N}$ can be represented with many fewer than $N^2$ numbers.

What if N is so big that we cannot even store length-N vectors? How could we apply (for example) the Rayleigh quotient procedure in such a situation?
What is this Lecture About?

A Very Large Eigenvalue Problem

We will look at a problem where $A \in \mathbb{R}^{2^d \times 2^d}$ is data sparse but where d is sufficiently big to make the storage of length-$2^d$ vectors impossible.

Vectors will be approximated by data-sparse tensors of high order.
A Very Large Eigenvalue Problem

A Problem From Quantum Chemistry

Given a $2^d$-by-$2^d$ symmetric matrix H, find a vector a that minimizes
$$r(a) = \frac{a^T H a}{a^T a}$$
Of course: $a = a_{\min},\ \lambda = r(a_{\min}) \;\Rightarrow\; Ha = \lambda a$.

What if d = 100?
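For perspective, a quick storage estimate (assuming 8-byte reals):
$$2^{100} \approx 1.27 \times 10^{30} \quad\Longrightarrow\quad 8 \cdot 2^{100} \ \text{bytes} \;\approx\; 10^{31}\ \text{bytes} \;\approx\; 10^{19}\ \text{terabytes per vector},$$
so even a single length-$2^{100}$ vector is far beyond any conceivable memory.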
Perspective

[Figure: the H-matrix alongside the Google matrix.]

Both N-by-N matrices are data sparse, i.e., defined by many fewer than $N^2$ numbers.
A Very Large Eigenvalue Problem

The H-Matrix

$$H \;=\; \sum_{i,j=1}^{d} t_{ij}\, H_i^T H_j \;+\; \sum_{i,j,k,l=1}^{d} v_{ijkl}\, H_i^T H_j^T H_k H_l, \qquad H_i = I_{2^{i-1}} \otimes \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \otimes I_{2^{d-i}}$$

$T \in \mathbb{R}^{d \times d}$ is symmetric and $\mathcal{V} \in \mathbb{R}^{d \times d \times d \times d}$ has symmetries.

Sparsity

$$\mathrm{nzeros} \;=\; \left(\frac{1}{64}\,d^4 - \frac{3}{32}\,d^3 + \frac{27}{64}\,d^2 - \frac{11}{32}\,d + 1\right)2^d \;-\; 1$$
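To make the definition concrete, here is a minimal NumPy sketch (our own helper names, dense matrices, feasible only for tiny d since H is $2^d$-by-$2^d$ when formed explicitly) that builds the $H_i$ and assembles H from given weight arrays T and V:

```python
import numpy as np

def H_op(i, d):
    """H_i = I_{2^(i-1)} (x) [[0,1],[0,0]] (x) I_{2^(d-i)},  1 <= i <= d."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])
    return np.kron(np.kron(np.eye(2 ** (i - 1)), a), np.eye(2 ** (d - i)))

def build_H(T, V):
    """Assemble H = sum t_ij Hi' Hj + sum v_ijkl Hi' Hj' Hk Hl (tiny d only)."""
    d = T.shape[0]
    Hs = [H_op(i, d) for i in range(1, d + 1)]
    H = np.zeros((2 ** d, 2 ** d))
    for i in range(d):
        for j in range(d):
            H += T[i, j] * Hs[i].T @ Hs[j]
            for k in range(d):
                for l in range(d):
                    H += V[i, j, k, l] * Hs[i].T @ Hs[j].T @ Hs[k] @ Hs[l]
    return H

d = 3
T = np.random.rand(d, d); T = 0.5 * (T + T.T)
V = np.random.rand(d, d, d, d)
H = build_H(T, V)          # 8-by-8 here; 2^d-by-2^d in general
```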
A Very Large Eigenvalue Problem

Modeling Electron Interactions

We have d "sites" (grid points) in physical space.

The goal is to compute a wave function, an element of a $2^d$-dimensional Hilbert space. The Hilbert space is a product of d two-dimensional Hilbert spaces. (A site is either occupied or not occupied.)

A (discretized) wave function is a d-tensor, 2-by-2-by-2-by-2...

It is the vector that minimizes $a^T H a / a^T a$ where...
A Very Large Eigenvalue Problem

The H-Matrix

$$H \;=\; \sum_{i,j=1}^{d} \underbrace{t_{ij}}_{\substack{\text{kinetic}\\ \text{energy}\\ \text{weights}}} H_i^T H_j \;+\; \sum_{i,j,k,l=1}^{d} \underbrace{v_{ijkl}}_{\substack{\text{potential}\\ \text{energy}\\ \text{weights}}} H_i^T H_j^T H_k H_l$$

$$H_i = I_{2^{i-1}} \otimes \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \otimes I_{2^{d-i}}$$
A Very Large Eigenvalue Problem

Dealing with $N = 2^d \approx 2^{100}$

Intractable:
$$\min_{a \,\in\, \mathbb{R}^N} \frac{a^T H a}{a^T a}$$

Tractable:
$$\min_{\substack{a \,\in\, \mathbb{R}^N \\ a \text{ data sparse}}} \frac{a^T H a}{a^T a}$$
Tensor Networks

Definition

A tensor network is a tensor of high dimension that is built up from many sparsely connected tensors of low dimension.
[Figure: a tensor network of ten tensors $A^{(1)}, \ldots, A^{(10)}$ connected by edges.]

Nodes are tensors and the edges are contractions.
At each site there is a tensor. Its dimension is k + 1 where k is the number of site neighbors. E.g.,

$$A^{(2)} = A^{(2)}(1{:}m,\, 1{:}m,\, 1{:}2), \qquad A^{(4)} = A^{(4)}(1{:}m,\, 1{:}m,\, 1{:}m,\, 1{:}m,\, 1{:}2)$$
Each edge represents a contraction, e.g.,

$$\sum_{i_{2,3}=1}^{m} A^{(2)}(i_{2,3},\, i_{2,5},\, n_2) \cdot A^{(3)}(i_{1,3},\, i_{2,3},\, i_{3,4},\, n_3)$$

Contracting over all of the edges defines the order-10 tensor:

$$A(n_1, n_2, \ldots, n_{10}) \;=\; \sum_{\substack{i_{1,3},\, i_{1,8},\, i_{2,3},\, i_{2,5},\, i_{3,4},\, i_{4,5},\, i_{4,6},\\ i_{4,10},\, i_{6,7},\, i_{7,10},\, i_{8,9},\, i_{8,10},\, i_{9,10}}}
\begin{aligned}[t]
&A^{(1)}(i_{1,3}, i_{1,8}, n_1)\, A^{(2)}(i_{2,3}, i_{2,5}, n_2)\, A^{(3)}(i_{1,3}, i_{2,3}, i_{3,4}, n_3)\\
&A^{(4)}(i_{3,4}, i_{4,5}, i_{4,6}, i_{4,10}, n_4)\, A^{(5)}(i_{2,5}, i_{4,5}, n_5)\, A^{(6)}(i_{4,6}, i_{6,7}, n_6)\\
&A^{(7)}(i_{6,7}, i_{7,10}, n_7)\, A^{(8)}(i_{1,8}, i_{8,9}, n_8)\, A^{(9)}(i_{8,9}, i_{9,10}, n_9)\\
&A^{(10)}(i_{4,10}, i_{7,10}, i_{9,10}, n_{10})
\end{aligned}$$
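In NumPy, one edge contraction is one repeated letter in an einsum; a sketch of the $A^{(2)}$–$A^{(3)}$ contraction above, with shapes as assumed in the figure:

```python
import numpy as np

m = 3
A2 = np.random.rand(m, m, 2)       # A(2)(i23, i25, n2)
A3 = np.random.rand(m, m, m, 2)    # A(3)(i13, i23, i34, n3)

# Sum over the shared edge index i23 (letter 'a'); the free edge indices
# i25, i13, i34 and the "physical" indices n2, n3 all survive.
B = np.einsum('abn,cadp->bcdnp', A2, A3)   # B(i25, i13, i34, n2, n3)
assert B.shape == (m, m, m, 2, 2)
```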
Star Network

[Figure: a star network on nine tensors $A^{(1)}, \ldots, A^{(9)}$.]

Torus Network

[Figure: five tensors $A^{(1)}, \ldots, A^{(5)}$ connected in a ring.]

Grid Network

[Figure: a 4-by-5 grid of tensors $A^{(i,j)}$, $i = 1{:}4$, $j = 1{:}5$.]
A 5-Site Linear Tensor Network

[Figure: tensors $A^{(1)}, \ldots, A^{(5)}$ connected in a chain.]

$$A^{(1)}: 2 \times m \qquad A^{(2)}: m \times m \times 2 \qquad A^{(3)}: m \times m \times 2 \qquad A^{(4)}: m \times m \times 2 \qquad A^{(5)}: m \times 2$$

m is a parameter, typically around 100. This defines a length-32 ($= 2^5$) vector which we index using five subscripts...
Scalar Definition

$$a(n_1, n_2, n_3, n_4, n_5) \;=\; \sum_{i_1=1}^{m} \sum_{i_2=1}^{m} \sum_{i_3=1}^{m} \sum_{i_4=1}^{m} A^{(1)}(n_1, i_1)\, A^{(2)}(i_1, i_2, n_2)\, A^{(3)}(i_2, i_3, n_3)\, A^{(4)}(i_3, i_4, n_4)\, A^{(5)}(i_4, n_5)$$

In general, a length-$2^d$ vector is represented by $O(dm^2)$ numbers.
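A minimal NumPy sketch of this evaluation (core layouts and zero-based indexing are our assumptions): each entry costs a handful of m-by-m matrix-vector products.

```python
import numpy as np

def ltn_entry(A, n):
    """a(n1,...,n5) for cores A = [A1,...,A5] with shapes
    (2, m), (m, m, 2), (m, m, 2), (m, m, 2), (m, 2); n is zero-based."""
    v = A[0][n[0], :]                 # row n1 of A(1): length-m vector
    for k in range(1, 4):
        v = v @ A[k][:, :, n[k]]      # hit the n_k-th m-by-m slice of A(k+1)
    return v @ A[4][:, n[4]]          # finish with column n5 of A(5)

m = 5
A = [np.random.rand(2, m)] + [np.random.rand(m, m, 2) for _ in range(3)] \
    + [np.random.rand(m, 2)]
print(ltn_entry(A, (0, 1, 0, 1, 1)))  # one of the 32 entries
```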
Computing a(1:2, 1:2, 1:2, 1:2, 1:2)

[Figure: five snapshots of the chain $A^{(1)}, \ldots, A^{(5)}$, one per entry, e.g., $a(1,1,1,1,1)$, $a(2,1,1,1,1)$, $a(1,2,1,1,1)$, $a(1,1,2,1,2)$, $a(2,2,2,2,2)$: each subscript $n_k$ selects a slice of $A^{(k)}$ and the chain of products is evaluated.]
Linear Tensor Network

Recall the Block Vec Operation

$$\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1 \\ G_2 \\ G_3 \end{bmatrix} \otimes \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \;=\; \begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1H_1 \\ G_1H_2 \\ G_2H_1 \\ G_2H_2 \\ G_3H_1 \\ G_3H_2 \end{bmatrix} \;=\; \begin{bmatrix} F_1G_1H_1 \\ F_1G_1H_2 \\ F_1G_2H_1 \\ F_1G_2H_2 \\ F_1G_3H_1 \\ F_1G_3H_2 \\ F_2G_1H_1 \\ F_2G_1H_2 \\ F_2G_2H_1 \\ F_2G_2H_2 \\ F_2G_3H_1 \\ F_2G_3H_2 \end{bmatrix}$$
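A small NumPy sketch of the block vec operation (our own helper, not library code); note the left factor's block index is outermost, matching the expansion above:

```python
import numpy as np

def blockvec_kron(X, Y):
    """[X1;...;Xp] (x) [Y1;...;Yq] -> [Xi @ Yj], i outermost, j innermost."""
    return [Xi @ Yj for Xi in X for Yj in Y]

rng = np.random.default_rng(0)
F = [rng.standard_normal((2, 3)) for _ in range(2)]
G = [rng.standard_normal((3, 3)) for _ in range(3)]
H = [rng.standard_normal((3, 1)) for _ in range(2)]

blocks = blockvec_kron(blockvec_kron(F, G), H)  # [F1G1H1, F1G1H2, ..., F2G3H2]
assert len(blocks) == 12
```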
Linear Tensor Network

In the "Language" of Block Vec Products...

$$a \;=\; \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} \otimes \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} A_{1,d-1} \\ A_{2,d-1} \end{bmatrix} \otimes \begin{bmatrix} A_{1d} \\ A_{2d} \end{bmatrix}$$

where

$$\begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} = \begin{bmatrix} w_1^T \\ w_2^T \end{bmatrix} \quad \text{(2 row vectors)}$$

$$\begin{bmatrix} A_{1k} \\ A_{2k} \end{bmatrix} = \begin{bmatrix} m\text{-by-}m \\ m\text{-by-}m \end{bmatrix} \quad k = 2{:}d-1$$

$$\begin{bmatrix} A_{1d} \\ A_{2d} \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \quad \text{(2 column vectors)}$$

a is a length-$2^d$ vector that depends on $O(dm^2)$ numbers.
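For small d this representation can be expanded entry by entry as a sanity check; a sketch under assumed array layouts (w and z stored as 2-by-m arrays, middle cores as pairs of m-by-m matrices):

```python
import numpy as np

def ltn_full_vector(w, A_mid, z):
    """Expand a = [A11;A21](x)...(x)[A1d;A2d] into the full length-2^d vector.
    w, z: (2, m) arrays holding w1, w2 and z1, z2; A_mid: list of (A1k, A2k)."""
    blocks = [w[0], w[1]]                       # the two length-m row vectors
    for A1, A2 in A_mid:                        # k = 2:d-1, outer index first
        blocks = [b @ A for b in blocks for A in (A1, A2)]
    return np.array([b @ zc for b in blocks for zc in (z[0], z[1])])

m, d = 3, 5
w = np.random.rand(2, m)
A_mid = [(np.random.rand(m, m), np.random.rand(m, m)) for _ in range(d - 2)]
z = np.random.rand(2, m)
a = ltn_full_vector(w, A_mid, z)                # length 2^5 = 32
assert a.shape == (2 ** d,)
```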
Back to the Main Problem...

Constrained Minimization

Minimize
$$r(a) = \frac{a^T H a}{a^T a}$$
subject to the constraint that
$$a \;=\; \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} \otimes \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} A_{1,d-1} \\ A_{2,d-1} \end{bmatrix} \otimes \begin{bmatrix} A_{1d} \\ A_{2d} \end{bmatrix}$$

Let us look at both the denominator and the numerator in light of the fact that $N = 2^d$.
Avoiding $O(2^d)$

2-Norm of a Linear Tensor Network...

If
$$a \;=\; \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} \otimes \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} A_{1,d-1} \\ A_{2,d-1} \end{bmatrix} \otimes \begin{bmatrix} A_{1d} \\ A_{2d} \end{bmatrix}$$
then
$$a^T a \;=\; w^T \left( \prod_{k=2}^{d-1} \Big[ (A_{1k} \otimes A_{1k}) + (A_{2k} \otimes A_{2k}) \Big] \right) z$$
where
$$w = A_{11} \otimes A_{11} + A_{21} \otimes A_{21} = w_1 \otimes w_1 + w_2 \otimes w_2$$
$$z = A_{1d} \otimes A_{1d} + A_{2d} \otimes A_{2d} = z_1 \otimes z_1 + z_2 \otimes z_2$$

$A_{1k}$ and $A_{2k}$ are m-by-m, k = 2:d−1. Overall work is $O(dm^3)$.
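A hedged NumPy sketch of this computation (layouts as in the earlier sketches): carrying w as an m-by-m matrix W and using the identity $(A \otimes A)^T \mathrm{vec}(W) = \mathrm{vec}(A^T W A)$, each factor costs $O(m^3)$ and no $m^2$-by-$m^2$ Kronecker product is ever formed.

```python
import numpy as np

def ltn_norm_sq(w, A_mid, z):
    """a^T a in O(d m^3): never form A1k (x) A1k explicitly."""
    W = np.outer(w[0], w[0]) + np.outer(w[1], w[1])   # w1(x)w1 + w2(x)w2
    for A1, A2 in A_mid:                              # k = 2:d-1
        W = A1.T @ W @ A1 + A2.T @ W @ A2             # O(m^3) per factor
    Z = np.outer(z[0], z[0]) + np.outer(z[1], z[1])   # z1(x)z1 + z2(x)z2
    return float(np.sum(W * Z))                       # the final w^T( ... )z

m, d = 3, 5
w = np.random.rand(2, m)
A_mid = [(np.random.rand(m, m), np.random.rand(m, m)) for _ in range(d - 2)]
z = np.random.rand(2, m)
print(ltn_norm_sq(w, A_mid, z))
# Equals a @ a for the vector a expanded by ltn_full_vector in the earlier sketch.
```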
Avoiding $O(d^4)$

Recall...

$$H \;=\; \sum_{i,j=1}^{d} t_{ij}\, H_i^T H_j \;+\; \sum_{i,j,k,l=1}^{d} v_{ijkl}\, H_i^T H_j^T H_k H_l, \qquad H_i = I_{2^{i-1}} \otimes \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \otimes I_{2^{d-i}}$$

The V-Tensor Has Familiar Symmetries

$$\mathcal{V}(i,j,k,\ell) \;=\; \mathcal{V}(j,i,k,\ell) \;=\; \mathcal{V}(i,j,\ell,k) \;=\; \mathcal{V}(k,\ell,i,j)$$

and so we can find symmetric matrices $B_1, \ldots, B_r$ so

$$\mathcal{V} = B_1 \circ B_1 + \cdots + B_r \circ B_r$$
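One way to compute such a splitting (a sketch, not necessarily the lecture's construction): the symmetry $\mathcal{V}(i,j,k,\ell) = \mathcal{V}(k,\ell,i,j)$ makes the $(i,j) \times (k,\ell)$ unfolding a symmetric $d^2$-by-$d^2$ matrix, so an eigendecomposition delivers $\mathcal{V} = \sum_r \lambda_r\, B_r \circ B_r$; each $B_r$ with $\lambda_r \neq 0$ is symmetric, and for $\lambda_r > 0$ the factor $\sqrt{\lambda_r}$ can be absorbed into $B_r$.

```python
import numpy as np

def sym_pair_decomposition(V):
    """Sketch: v_ijkl = sum_r lam_r * B_r(i,j) * B_r(k,l).
    Assumes V has the symmetries listed above."""
    d = V.shape[0]
    M = V.reshape(d * d, d * d)    # pair (i,j) against (k,l)
    M = 0.5 * (M + M.T)            # symmetric since V(i,j,k,l) = V(k,l,i,j)
    lam, Q = np.linalg.eigh(M)
    return [(lam[r], Q[:, r].reshape(d, d)) for r in range(d * d)]

d = 3
X = np.random.rand(d, d, d, d)     # symmetrize a random tensor for the test
V = (X + X.transpose(1, 0, 2, 3)) / 2
V = (V + V.transpose(0, 1, 3, 2)) / 2
V = (V + V.transpose(2, 3, 0, 1)) / 2
terms = sym_pair_decomposition(V)
V_rec = sum(l * np.einsum('ij,kl->ijkl', B, B) for l, B in terms)
assert np.allclose(V, V_rec)
```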
Avoiding $O(d^4)$

Idea

Approximate $\mathcal{V}$ with $B_1 \circ B_1$ (or some short sum of the B's) because then $v_{ijk\ell} = B_1(i,j)\,B_1(k,\ell)$ and

$$H \;=\; \sum_{i,j=1}^{d} t_{ij}\, H_i^T H_j + \sum_{i,j,k,l=1}^{d} v_{ijkl}\, H_i^T H_j^T H_k H_l \;=\; \sum_{i,j=1}^{d} t_{ij}\, H_i^T H_j + \left( \sum_{i,j} B_1(i,j)\, H_i H_j \right)^{\!T} \left( \sum_{i,j} B_1(i,j)\, H_i H_j \right)$$

Think about $a^T H a$ and note that we have reduced the cost of evaluation by a factor of $O(d^2)$.
Optimization Approach

For k = 1:d ...

Minimize
$$r(a) = \frac{a^T H a}{a^T a} = r(A_{1k}, A_{2k})$$
subject to the constraint that
$$a \;=\; \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix} \otimes \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} A_{1,d-1} \\ A_{2,d-1} \end{bmatrix} \otimes \begin{bmatrix} A_{1d} \\ A_{2d} \end{bmatrix}$$
and all but $A_{1k}$ and $A_{2k}$ are fixed.

This projected subproblem can be reshaped into a smaller, $2m^2$-by-$2m^2$ Rayleigh quotient minimization...
Optimization Approach

The Subproblem

Minimize
$$r(a_k) = \frac{a_k^T H_k a_k}{a_k^T a_k}$$
where
$$a_k = \begin{bmatrix} \mathrm{vec}(A_{1k}) \\ \mathrm{vec}(A_{2k}) \end{bmatrix}$$
and
$$H_k = T_k^T H T_k, \qquad T_k \in \mathbb{R}^{2^d \times 2m^2}$$
can be formed in time polynomial in m.
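Each subproblem is then an ordinary symmetric eigenvalue problem. A sketch with stand-in data (in the algorithm, $H_k$ would be assembled from H and the fixed cores):

```python
import numpy as np

m = 4
n = 2 * m * m
X = np.random.rand(n, n)
Hk = 0.5 * (X + X.T)                 # stand-in for Tk' H Tk (symmetric)

lam, W = np.linalg.eigh(Hk)          # eigenvalues in ascending order
ak = W[:, 0]                         # minimizer of the Rayleigh quotient
A1k = ak[: m * m].reshape(m, m)      # unpack vec(A1k)
A2k = ak[m * m :].reshape(m, m)      # unpack vec(A2k); cycle k = 1:d and repeat
```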
Tensor-Based Thinking

Key Attributes

1. An ability to reason at the index level about the constituent contractions and the order of their evaluation.
2. An ability to reason at the block matrix level in order to expose fast, underlying Kronecker product operations.
Data-Sparse Representations and Factorizations

How Could We Compute the QR Factorization of This?

$$\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1 \\ G_2 \\ G_3 \end{bmatrix} \otimes \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \;=\; \begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1H_1 \\ G_1H_2 \\ G_2H_1 \\ G_2H_2 \\ G_3H_1 \\ G_3H_2 \end{bmatrix} \;=\; \begin{bmatrix} F_1G_1H_1 \\ F_1G_1H_2 \\ F_1G_2H_1 \\ F_1G_2H_2 \\ F_1G_3H_1 \\ F_1G_3H_2 \\ F_2G_1H_1 \\ F_2G_1H_2 \\ F_2G_2H_1 \\ F_2G_2H_2 \\ F_2G_3H_1 \\ F_2G_3H_2 \end{bmatrix}$$

Without "Leaving" the Data Sparse Representation?
Data-Sparse Representations and Factorizations

QR Factorization and Block Vec Products

If
$$\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} = \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} R$$
then
$$\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1 \\ G_2 \\ G_3 \end{bmatrix} \otimes \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \;=\; \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} \otimes \begin{bmatrix} RG_1 \\ RG_2 \\ RG_3 \end{bmatrix} \otimes \begin{bmatrix} H_1 \\ H_2 \end{bmatrix}$$
If
$$\begin{bmatrix} RG_1 \\ RG_2 \\ RG_3 \end{bmatrix} = \begin{bmatrix} U_1 \\ U_2 \\ U_3 \end{bmatrix} S$$
then...
Data-Sparse Representations and Factorizations

QR Factorization and Block Vec Products

$$\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1 \\ G_2 \\ G_3 \end{bmatrix} \otimes \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \;=\; \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} \otimes \begin{bmatrix} U_1 \\ U_2 \\ U_3 \end{bmatrix} \otimes \begin{bmatrix} SH_1 \\ SH_2 \end{bmatrix}$$
If
$$\begin{bmatrix} SH_1 \\ SH_2 \end{bmatrix} = \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} T$$
then
$$\begin{bmatrix} F_1 \\ F_2 \end{bmatrix} \otimes \begin{bmatrix} G_1 \\ G_2 \\ G_3 \end{bmatrix} \otimes \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \;=\; \left( \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} \otimes \begin{bmatrix} U_1 \\ U_2 \\ U_3 \end{bmatrix} \otimes \begin{bmatrix} V_1 \\ V_2 \end{bmatrix} \right) T$$

The Matrix in Parentheses is Orthogonal
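A NumPy sketch of the whole three-stage, data-sparse QR (block sizes illustrative, helper ours); the assertions check the factorization and the orthonormality claimed above:

```python
import numpy as np

def blockvec_kron(X, Y):
    """[X1;...;Xp] (x) [Y1;...;Yq] -> [Xi @ Yj] (block vec product)."""
    return [Xi @ Yj for Xi in X for Yj in Y]

rng = np.random.default_rng(0)
F = [rng.standard_normal((3, 3)) for _ in range(2)]
G = [rng.standard_normal((3, 3)) for _ in range(3)]
H = [rng.standard_normal((3, 1)) for _ in range(2)]

Qf, R = np.linalg.qr(np.vstack(F)); Q = np.vsplit(Qf, 2)          # [F] = [Q]R
Qg, S = np.linalg.qr(np.vstack([R @ Gj for Gj in G])); U = np.vsplit(Qg, 3)
Qh, T = np.linalg.qr(np.vstack([S @ Hk for Hk in H])); V = np.vsplit(Qh, 2)

# [F](x)[G](x)[H] = ([Q](x)[U](x)[V]) T, and the factor in parentheses
# has orthonormal columns -- verify against the explicit vector:
a = np.vstack(blockvec_kron(blockvec_kron(F, G), H))
q = np.vstack(blockvec_kron(blockvec_kron(Q, U), V))
assert np.allclose(a, q @ T)
assert np.allclose(q.T @ q, np.eye(q.shape[1]))
```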
Summary of Lecture 9.
Key Words
The Curse of Dimensionality refers to the challenges that arise
when dimension increases.
Clever data-sparse representations are one way to address the
issues.
A tensor network is a way of combining low-order tensors to
obtain a high-order tensor.
Reliable methods that scale with dimension are the goal.