Latent Models: Literature Reading Report
March 10, 2014
Contents
1 PPCA
1.1 Introduction
1.2 The PPCA Model
1.2.1 Factor Analysis and PCA
1.2.2 Introducing the Probabilistic Model
1.2.3 Parameter Estimation
1.2.4 PPCA and PCA
1.3 Examples of PPCA's Advantages
1.3.1 Handling Missing Data
1.3.2 Handling Mixture Models
1.4 Summary
2 LDA
2.1 Introduction
2.2 The LDA Model
2.2.1 Terminology and Notation
2.2.2 Generative Models
2.2.3 The LDA Model
2.2.4 Inference and Parameter Estimation Based on LDA
2.2.5 Applications and Summary
3 Conclusion
1 PPCA
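1.1 Introduction
PCA (principal component analysis) is a widely used dimensionality-reduction technique. PCA finds low-dimensional principal components that retain the features accounting for most of the variance in the data, and discards the remaining directions, which carry little information. A common procedure is as follows. Suppose there are $N$ samples $t_n \in \mathbb{R}^d$, $n \in \{1, 2, \cdots, N\}$, with sample mean $\bar{t}$. The sample covariance matrix is
\[ S = \frac{1}{N} \sum_{n=1}^{N} (t_n - \bar{t})(t_n - \bar{t})^T . \]
Compute the eigenvalues and corresponding eigenvectors of the covariance matrix, $S w_i = \lambda_i w_i$, sort them by eigenvalue in descending order, and take the eigenvectors belonging to the $q$ largest eigenvalues, $W = (w_1, w_2, \cdots, w_q)$, as the $q$ principal components. The reduced (and simultaneously centered) representation of each sample in the $q$-dimensional space is
\[ x_n = W^T (t_n - \bar{t}), \quad n = 1, 2, \cdots, N . \]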
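The PCA procedure above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `pca_project` and the toy data are hypothetical, and the covariance uses the $1/N$ convention of the formula for $S$.

```python
import numpy as np

def pca_project(T, q):
    """Project the rows of T (N x d) onto the top-q principal axes."""
    t_bar = T.mean(axis=0)                    # sample mean
    centered = T - t_bar
    S = centered.T @ centered / T.shape[0]    # sample covariance, 1/N convention
    eigvals, eigvecs = np.linalg.eigh(S)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:q]     # indices of the q largest eigenvalues
    W = eigvecs[:, order]                     # d x q matrix (w_1, ..., w_q)
    X = centered @ W                          # row n is x_n = W^T (t_n - t_bar)
    return X, W, eigvals[order], t_bar

# Toy correlated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
T = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
X, W, lam, t_bar = pca_project(T, q=2)
```

In this sketch the projected coordinates are uncorrelated, and their variances equal the retained eigenvalues `lam`.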
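PPCA (probabilistic PCA) is based on Gaussian latent variables and describes PCA in probabilistic form as a generative model. This makes it possible to extend basic PCA with methods suited to probabilistic models, such as likelihood estimation, mixture models, and the EM algorithm.
The main reference for this part is [1].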
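1.2 The PPCA Model
1.2.1 Factor Analysis and PCA
This section covers the relationship between PCA and factor analysis (FA). The basic goal of factor analysis is to represent high-dimensional data by a small number of low-dimensional factors, which provide a more concise explanation of the high-dimensional data. Assume a linear relationship between the low-dimensional factors and the high-dimensional data, where $t$ is the high-dimensional data and $x$ is the low-dimensional factor:
\[ t = W x + \mu + \epsilon . \]
For comparison, reconstructing the data from the principal components in PCA gives
\[ x_n = W^T (t_n - \bar{t}), \qquad \hat{t}_n = W x_n + \bar{t} . \]
Factor analysis can therefore be viewed, roughly, as adding noise to the PCA data reconstruction. [2]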
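1.2.2 Introducing the Probabilistic Model
PPCA adds a parameterized probability model to PCA by borrowing the concepts of latent variables and FA; the latent variable is essentially equivalent to the low-dimensional factor in FA. Concretely, the latent variable is assumed to capture the principal axes of the high-dimensional data, and both the latent variable and the noise are assumed to be isotropic Gaussian:
\[ x \sim N(0, I), \qquad \epsilon \sim N(0, \sigma^2 I) . \]
This gives the conditional distribution of the observed variable $t$:
\[ f(t \mid x; \mu, \sigma, W) = (2\pi\sigma^2)^{-d/2} \exp\left\{ -\frac{1}{2\sigma^2} (t - Wx - \mu)^T (t - Wx - \mu) \right\}, \]
that is, $t \mid x \sim N(Wx + \mu, \sigma^2 I)$.
By the law of total probability, the marginal distribution of $t$ is
\[ f(t; \mu, \sigma, W) = \int_{-\infty}^{\infty} f(t \mid x; \mu, \sigma, W) f(x) \, dx = (2\pi)^{-d/2} |C|^{-1/2} \exp\left\{ -\frac{1}{2} (t - \mu)^T C^{-1} (t - \mu) \right\}, \]
that is, $t \sim N(\mu, C)$, where $C = \sigma^2 I + W W^T$.
In this way, introducing the latent variable yields a parameterized probability model with parameters $W$, $\mu$, $\sigma^2$.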
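1.2.3 Parameter Estimation
Because a Gaussian model is assumed, the parameters can be estimated by maximum likelihood. If the high-dimensional samples $t_n$ are independent and identically distributed according to the multivariate Gaussian above, maximizing the log-likelihood gives the following estimates:
\[ \mu^* = \frac{1}{N} \sum_{n=1}^{N} t_n , \]
\[ C^* = S = \frac{1}{N} \sum_{n=1}^{N} (t_n - \mu^*)(t_n - \mu^*)^T . \]
Further, from $C = W W^T + \sigma^2 I$ we obtain the estimates of $W$ and $\sigma$:
\[ W_{ML} = U_q (\Lambda_q - \sigma^2 I)^{1/2} R , \]
\[ \sigma^2_{ML} = \frac{1}{d - q} \sum_{j=q+1}^{d} \lambda_j , \]
where the columns of $U_q \in \mathbb{R}^{d \times q}$ are the first $q$ eigenvectors of the sample covariance matrix $S$, the corresponding eigenvalues form the diagonal matrix $\Lambda_q$, and $R$ is an arbitrary rotation matrix.
In practice, given the sample covariance matrix $S$, one first estimates $\sigma^2$ (from the $d - q$ smallest eigenvalues) and then obtains the estimate of $W$.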
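A minimal NumPy sketch of the closed-form estimates above, assuming $R = I$ and $q < d$. The function name `ppca_ml` and the toy data are hypothetical.

```python
import numpy as np

def ppca_ml(T, q):
    """Closed-form maximum-likelihood PPCA estimates (rotation R taken as I)."""
    N, d = T.shape
    mu = T.mean(axis=0)
    S = (T - mu).T @ (T - mu) / N                 # sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)          # ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort in descending order
    lam, U = eigvals[order], eigvecs[:, order]
    sigma2 = lam[q:].mean()                       # mean of the d - q smallest eigenvalues
    W = U[:, :q] @ np.diag(np.sqrt(lam[:q] - sigma2))
    return mu, W, sigma2, lam

# Toy data: 2 latent dimensions embedded in 5, plus isotropic noise (illustrative).
rng = np.random.default_rng(0)
T = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
mu, W, sigma2, lam = ppca_ml(T, q=2)
C = W @ W.T + sigma2 * np.eye(5)                  # model covariance C = W W^T + sigma^2 I
```

The model covariance `C` reproduces the top $q$ eigenvalues of $S$ exactly and replaces the remaining ones with their average $\sigma^2_{ML}$.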
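1.2.4 PPCA and PCA
The analysis above and the resulting parameter estimates show that PPCA provides a generative probabilistic description of PCA. The estimate of the noise variance $\sigma^2$ is determined by the eigenvalues associated with the non-principal components, and it measures the variance lost in the discarded directions. In the estimate of $W$, the columns of the final matrix are the eigenvectors of the principal components after scaling and rotation.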
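1.3 Examples of PPCA's Advantages
Introducing a probabilistic model into PCA allows many methods based on probabilistic models (likelihood functions), as well as Bayesian theory, to be combined with PCA more naturally. This gives PPCA advantages in many practical applications.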
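1.3.1 Handling Missing Data
Principal component analysis depends on the given data, and its results rely heavily on the accuracy of that data. PPCA can handle the case of missing data, where some components of a high-dimensional sample $t_n = (t_{n1}, \cdots, t_{nd})$ are missing. The idea of handling missing data with the EM algorithm is outlined below. When data are missing, the EM algorithm can estimate the missing part of $t$, the latent variable $x$, and the parameters $\sigma, \mu, W$. Only the parts stated explicitly in the original literature are given here; for the detailed estimation formulas, see the cited references. The procedure is given in Algorithm 1.
Estimation with the EM algorithm uses the following probability distributions:
\[ t \mid x; \mu, \sigma, W \sim N(Wx + \mu, \sigma^2 I) \]
\[ x \mid t; W, \sigma^2, \mu \sim N(M^{-1} W^T (t - \mu), \sigma^2 M^{-1}), \qquad M = W^T W + \sigma^2 I \]
\[ f(t, x; \mu, \sigma^2, W) = (2\pi\sigma^2)^{-d/2} \exp\left\{ -\frac{1}{2\sigma^2} (t - Wx - \mu)^T (t - Wx - \mu) \right\} (2\pi)^{-q/2} \exp\left\{ -\frac{1}{2} x^T x \right\} \]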
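The posterior $x \mid t$ above can be computed directly. The sketch below uses a hypothetical function name and random illustrative parameters. It also cross-checks the result against ordinary Gaussian conditioning on the marginal covariance $C = W W^T + \sigma^2 I$, which should give the same mean and covariance.

```python
import numpy as np

def posterior_x_given_t(t, W, mu, sigma2):
    """Mean and covariance of x | t under the PPCA model."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)              # q x q, cheap to invert when q << d
    mean = np.linalg.solve(M, W.T @ (t - mu))     # M^{-1} W^T (t - mu)
    cov = sigma2 * np.linalg.inv(M)               # sigma^2 M^{-1}
    return mean, cov

# Illustrative parameters (random, not fitted to any data).
rng = np.random.default_rng(0)
d, q = 6, 2
W = rng.normal(size=(d, q))
mu = rng.normal(size=d)
sigma2 = 0.3
t = rng.normal(size=d)
mean, cov = posterior_x_given_t(t, W, mu, sigma2)

# Same quantities via Gaussian conditioning on the d x d marginal covariance C.
C = W @ W.T + sigma2 * np.eye(d)
mean_alt = W.T @ np.linalg.solve(C, t - mu)
cov_alt = np.eye(q) - W.T @ np.linalg.solve(C, W)
```

Working with the $q \times q$ matrix $M$ instead of the $d \times d$ matrix $C$ is the usual reason for writing the posterior in this form.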
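1.3.2 Handling Mixture Models
Clearly, the standard PCA model cannot handle mixtures, where the data may come from different groups and different groups may correspond to different principal components. As with missing data, PPCA can handle mixture models well by means of the EM algorithm [4]. Figure 1 gives a good example.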
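1.4 Summary
Drawing on the concepts of factor analysis and latent variables, PPCA brings a probabilistic framework into PCA. By building a generative, parameterized probability model, it turns the PCA problem into a parameter-estimation problem. With this model, PPCA can handle many more problems than standard PCA.
1. Introducing a probability distribution allows many likelihood-based statistical techniques to be applied to processing and analyzing PCA results;
2. Bayesian methods can be used to add prior knowledge, for example a sparsity constraint; [3]
3. The generative model lets PPCA handle missing data and mixture models simply, through the EM algorithm.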
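Algorithm 1 PPCA: handling missing data with the EM algorithm
Input: samples with missing entries $t_n \in \mathbb{R}^d$, $n = 1, \cdots, N$
Output: completed samples $t_n \in \mathbb{R}^d$, latent variables $x_n \in \mathbb{R}^q$, parameters $\sigma, \mu, W$
1: Estimate the mean:
   \[ \mu^*_j = \frac{1}{n_j} \sum_{k=1}^{n_j} t_k(j) \]
   where $j$ denotes the $j$-th component and the sum runs over the $n_j$ samples in which that component is observed; each component of the mean vector is estimated from the known data.
2: Initialize: randomly initialize $\tilde{t}_i$, $\tilde{\sigma}$, $\tilde{W}$, and obtain an initial guess of the latent variables (based on the posterior distribution):
   \[ \tilde{x}_i = (\tilde{W}^T \tilde{W})^{-1} \tilde{W}^T (\tilde{t}_i - \mu^*) \]
3: while the convergence condition is not satisfied do
4:   E-step:
5:   Fill in the missing data from the conditional distribution of $t$:
     \[ t_i = W x_i + \mu^* \]
6:   Compute the expectations of the latent variable $x$ from the posterior distribution $x \mid t$:
     \[ \langle x_i \rangle = M^{-1} W^T (t_i - \mu^*) \]
     \[ \langle x_i x_i^T \rangle = \sigma^2 M^{-1} + \langle x_i \rangle \langle x_i \rangle^T \]
7:   M-step:
8:   Compute the parameters $W, \sigma$ by likelihood estimation from the joint distribution of $t, x$
9: end while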
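The following is a rough Python sketch of Algorithm 1, not a definitive implementation. Several details are assumptions: the algorithm does not spell out the M-step, so the standard complete-data EM updates for PPCA from [1] are applied to the filled-in data; missing entries are marked with `np.nan`; the initial fill uses the column means rather than random values; and a fixed iteration count replaces the convergence test. The function name `ppca_em_missing` is hypothetical.

```python
import numpy as np

def ppca_em_missing(T, q, n_iter=100, seed=0):
    """EM-style PPCA with imputation; T is (N, d) with np.nan at missing entries."""
    rng = np.random.default_rng(seed)
    N, d = T.shape
    missing = np.isnan(T)
    mu = np.nanmean(T, axis=0)                    # line 1: mean from observed entries
    T_hat = np.where(missing, mu, T)              # line 2: initial fill (assumption: column means)
    W = rng.normal(size=(d, q))
    sigma2 = 1.0
    # line 2: initial latent guess (W^T W)^{-1} W^T (t - mu), one row per sample
    Ex = np.linalg.solve(W.T @ W, W.T @ (T_hat - mu).T).T
    for _ in range(n_iter):                       # assumption: fixed number of iterations
        # E-step, line 5: fill the missing entries with W x_i + mu
        T_hat = np.where(missing, Ex @ W.T + mu, T)
        Tc = T_hat - mu
        # E-step, line 6: posterior moments of x given t
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Ex = Tc @ W @ M_inv                       # rows are <x_i>
        Sxx = N * sigma2 * M_inv + Ex.T @ Ex      # sum over i of <x_i x_i^T>
        # M-step (assumption: complete-data PPCA updates applied to filled data)
        W_new = (Tc.T @ Ex) @ np.linalg.inv(Sxx)
        sigma2 = (np.sum(Tc ** 2)
                  - 2.0 * np.sum((Tc @ W_new) * Ex)
                  + np.trace(Sxx @ W_new.T @ W_new)) / (N * d)
        W = W_new
    return T_hat, Ex, W, sigma2

# Toy example: rank-2 data in 6 dimensions with about 15% of entries removed.
rng = np.random.default_rng(1)
T_full = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(300, 6))
T_obs = T_full.copy()
T_obs[rng.random(T_full.shape) < 0.15] = np.nan
T_hat, Ex, W, sigma2 = ppca_em_missing(T_obs, q=2)
```

Observed entries are never overwritten; only the positions marked as missing are replaced by the model's reconstruction in each iteration.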
Figure 1: PPCA handling a mixture model [2]
2 LDA
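2.1 Introduction
LDA (latent Dirichlet allocation) is a generative probabilistic model. The original authors apply it mainly to the analysis of text corpora, but it can be used broadly on many kinds of discrete data.
The goal of LDA is to give a short description of a data set (a text corpus) while preserving its essential statistical information, so that the description can be used efficiently in later analysis such as classification, anomaly detection, and similarity judgment. Following the treatment in the original paper [5], text analysis is used here as the example to explain the basic idea of LDA. An important problem in text processing is judging similarity, that is, extracting features that are useful for text analysis. A relatively simple approach is to compute the similarity of the words in the documents, but in many cases very similar documents use different vocabulary. The class of methods represented by LDA uses a different feature: it extracts the topics of a document.
In the generative-model view, every document is produced by first fixing the topic of the document and then choosing the words to be used according to the topic, with certain probabilities. Compared with an ordinary generative model, LDA has a three-level structure: the topics of a document are themselves modeled generatively and are produced by a probability model, so a generated document can contain several topics. The figure below [6] gives an intuitive picture of how a corpus is produced by such a generative model.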
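Figure 2: Generative model of documents
Meaning of the matrices: the probability of each word appearing in each document, the probability of each word within each topic, and the probability of each topic appearing in each document.
The main reference for this part is [5].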
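2.2 The LDA Model
2.2.1 Terminology and Notation
The same terminology and notation as the original authors are used:
• word: a word in a document, or the basic unit of a discrete data set. All words are assumed to come from the same vocabulary $\{1, \cdots, V\}$. Each word is represented by a vector $w$: if the word is the $v$-th entry of the vocabulary, then $w^v = 1$ and $w^u = 0$ for $u \neq v$;
• document: a single text, i.e. a sequence of $N$ words, written as
$\mathbf{w} = (w_1, \cdots, w_N)$
• corpus: a collection of texts, consisting of $M$ documents
$D = (\mathbf{w}_1, \cdots, \mathbf{w}_M)$
The goal of LDA is to find a probability model that not only assigns high probability to the documents in the corpus, but also assigns high probability to similar documents.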
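2.2.2 Generative Models
First, two simpler generative models for documents are described.
1. Unigram model: the simplest generative model. All words in all documents are drawn independently and identically from the same multinomial distribution (see Figure 3):
\[ p(D) = \prod_{i=1}^{M} \prod_{j=1}^{N_i} p(w_{ij}) \]
2. Mixture of unigrams: a two-level model. Generating each document takes two steps: first choose a topic $z$, then draw the $N$ words independently and identically from a multinomial distribution that depends on that topic (see Figure 3).
\[ p(D) = \prod_{i=1}^{M} \sum_{z} p(z) \prod_{n=1}^{N_i} p(w_{in} \mid z) \]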
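The two corpus likelihoods above can be evaluated in log space as a quick illustration. In this sketch, documents are assumed to be lists of integer word indices, and all names and numbers are hypothetical.

```python
import numpy as np

def unigram_loglik(docs, p_word):
    """log p(D) under the unigram model; p_word has length V."""
    return float(sum(np.log(p_word[np.asarray(doc)]).sum() for doc in docs))

def mixture_of_unigrams_loglik(docs, p_topic, p_word_given_topic):
    """log p(D) under the mixture of unigrams; p_word_given_topic is (K, V)."""
    total = 0.0
    for doc in docs:
        doc = np.asarray(doc)
        # log p(z) + sum_n log p(w_n | z), one entry per topic z
        per_topic = np.log(p_topic) + np.log(p_word_given_topic[:, doc]).sum(axis=1)
        total += np.logaddexp.reduce(per_topic)   # log of the sum over z
    return float(total)

# Tiny illustrative corpus over a vocabulary of V = 3 words.
docs = [[0, 0, 1], [2, 2, 1, 2]]
p_word = np.array([0.5, 0.2, 0.3])
p_topic = np.array([0.4, 0.6])
p_word_given_topic = np.array([[0.7, 0.2, 0.1],
                               [0.1, 0.2, 0.7]])
ll_unigram = unigram_loglik(docs, p_word)
ll_mixture = mixture_of_unigrams_loglik(docs, p_topic, p_word_given_topic)
```

With a single topic, the mixture likelihood reduces to the unigram likelihood, which is a convenient sanity check.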
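Figure 3: Generative models. Left: unigram. Right: mixture of unigrams.
The LDA model has a more complex, three-level structure. It allows a document to contain several different topics to different degrees, and is a generative model that incorporates topics.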
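2.2.3 The LDA Model
The LDA model treats the topics of a document as latent variables and assumes that these latent variables are themselves generated with certain probabilities. This amounts to placing a Bayesian prior on the latent variables (the prior chosen is the Dirichlet distribution, the conjugate prior of the multinomial distribution).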
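Algorithm 2 The LDA model generating each document $\mathbf{w} = (w_1, \cdots, w_N)$ in the corpus $D$
1: for each document do
2:   Choose the document length $N \sim \mathrm{Poisson}(\xi)$
3:   Choose the topic distribution $\theta \sim \mathrm{Dir}(\alpha)$
4:   for each of the $N$ words $w_n$ of the document do
5:     (a) Choose a topic $z_n \sim \mathrm{Multinomial}(\theta)$
6:     (b) Choose a word $w_n$ from the multinomial distribution $p(w_n \mid z_n; \beta)$
7:   end for
8: end for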
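In the model above, $N$ is independent of all other parameters and does not affect the subsequent analysis. It is an ancillary variable, and its randomness is ignored in what follows.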
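The generative process of the LDA model can be simulated directly. In this sketch, `beta` is assumed to be a $V \times k$ array whose column $j$ holds the word distribution of topic $j$; the function name and all numeric values are hypothetical.

```python
import numpy as np

def sample_corpus(M, alpha, beta, xi, seed=0):
    """Draw M documents from the LDA generative process.

    beta[i, j] is assumed to be p(word i | topic j), so each column sums to 1.
    """
    rng = np.random.default_rng(seed)
    V, k = beta.shape
    corpus = []
    for _ in range(M):
        N = rng.poisson(xi)                       # document length
        theta = rng.dirichlet(alpha)              # per-document topic distribution
        z = rng.choice(k, size=N, p=theta)        # one topic per word
        w = np.array([rng.choice(V, p=beta[:, zn]) for zn in z], dtype=int)
        corpus.append((theta, z, w))
    return corpus

# Illustrative setup: k = 3 topics over a vocabulary of V = 8 words.
V, k = 8, 3
beta = np.random.default_rng(1).dirichlet(np.ones(V), size=k).T   # V x k
alpha = np.full(k, 0.5)
corpus = sample_corpus(M=5, alpha=alpha, beta=beta, xi=20)
```

Each document gets its own `theta`, while `alpha` and `beta` are shared by the whole corpus.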
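The parameters (variables) of LDA and its probability model are discussed below at three levels (shown in different colors in Figure 4).
Figure 4: The LDA model
1. Corpus-level parameters: $\alpha, \beta$
These are the same for all documents and are generated only once in the whole process. Assuming there are $k$ topics, $\alpha$ is a $k$-dimensional vector, the parameter of the Dirichlet distribution used to generate the topic distribution of each document; $\beta$ is the parameter of the multinomial distributions used when sampling words.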
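2. Document-level variable: $\theta$
This is the same for all words within a document. $\theta$ follows a Dirichlet distribution with parameter $\alpha$ and determines the probability of each topic being generated in a document. The $\theta$ of different documents are independent and identically distributed, with density
\[ p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \, \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1} \]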
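3. Word-level variables: $z, w$
Unlike in the mixture of unigrams above, $z$ is a word-level variable: the topic each word belongs to is determined by $\theta$, and the topic of each word is drawn independently and identically from the multinomial distribution given by $\theta$:
\[ p(z_n \mid \theta) = \prod_{i=1}^{k} \theta_i^{I(z_n = i)} \]
Once $z_n$ and $\beta$ are fixed, the word $w_n$ is generated from a multinomial distribution. Assuming the vocabulary size is $V$:
\[ p(w_n \mid z_n, \beta) = \prod_{i=1}^{V} \prod_{j=1}^{k} \beta_{ij}^{I(z_n = j, \, w_n = i)} \]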
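Therefore, given the corpus-level parameters $\alpha, \beta$, the joint distribution of the variables $\theta, z, w$ is
\[ p(\theta, z, w \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta) \]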
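As a small numerical illustration, the log of this joint distribution can be evaluated for one document. The sketch assumes `beta[i, j]` $= p(w = i \mid z = j)$, matching the index order of $\beta_{ij}$ in the word-level formula; the function names and numbers are hypothetical.

```python
import math
import numpy as np

def log_dirichlet_pdf(theta, alpha):
    """log p(theta | alpha) for a Dirichlet distribution."""
    log_norm = math.lgamma(float(np.sum(alpha))) - sum(math.lgamma(float(a)) for a in alpha)
    return log_norm + float(np.sum((alpha - 1.0) * np.log(theta)))

def log_joint(theta, z, w, alpha, beta):
    """log p(theta, z, w | alpha, beta) for a single document."""
    z, w = np.asarray(z), np.asarray(w)
    return (log_dirichlet_pdf(theta, alpha)
            + float(np.sum(np.log(theta[z])))        # sum_n log p(z_n | theta)
            + float(np.sum(np.log(beta[w, z]))))     # sum_n log p(w_n | z_n, beta)

# Two topics, two words, a two-word document (all values illustrative).
alpha = np.array([1.0, 1.0])              # flat Dirichlet, density 1 on the simplex
theta = np.array([0.25, 0.75])
beta = np.array([[0.5, 0.1],
                 [0.5, 0.9]])             # columns are topics
z = [0, 1]
w = [1, 0]
value = log_joint(theta, z, w, alpha, beta)
# By hand: 1 * (0.25 * 0.75) * (0.5 * 0.1) = 0.009375
```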
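The marginal distribution of the words of a document follows as
\[ p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta) \right) d\theta \]
and the probability of the whole corpus is
\[ P(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d) \, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d \]
The inference and learning procedures based on the LDA model are discussed next.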
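2.2.4 Inference and Parameter Estimation Based on LDA
This section briefly describes possible approaches to inference and parameter estimation, without going through the mathematical derivations.
The main inference problem is to compute the posterior distribution of the latent variables $\theta, z$ given a document. By Bayes' rule,
\[ p(\theta, z \mid \mathbf{w}, \alpha, \beta) = \frac{p(\theta, z, \mathbf{w} \mid \alpha, \beta)}{p(\mathbf{w} \mid \alpha, \beta)} \]
Exact inference is intractable in practice, so approximate inference is used. MCMC is one option, mainly in the form of Gibbs sampling based on Dirichlet-multinomial conjugacy [7]; [5] proposes a convexity-based variational method.
Parameter estimation mainly concerns $\alpha$ and $\beta$; [5] shows that it can be carried out with the EM algorithm.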
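To make the Gibbs sampling idea concrete, here is a simplified sketch for a single document in which $\alpha$ and $\beta$ are assumed known. It alternates between sampling $z \mid \theta, w$ and $\theta \mid z$, using Dirichlet-multinomial conjugacy for the second step. This is a plain (uncollapsed) sampler chosen for brevity, not necessarily the scheme described in [7]; the function name and all values are hypothetical.

```python
import numpy as np

def gibbs_single_document(w, alpha, beta, n_sweeps=300, burn_in=100, seed=0):
    """Estimate E[theta | w] for one document with alpha and beta held fixed.

    beta[i, j] is assumed to be p(word i | topic j).
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w)
    k = len(alpha)
    theta = rng.dirichlet(alpha)
    theta_sum = np.zeros(k)
    for sweep in range(n_sweeps):
        # z_n | theta, w_n is proportional to theta_j * beta[w_n, j]
        probs = theta * beta[w, :]                     # shape (N, k)
        probs /= probs.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(k, p=p) for p in probs])
        # theta | z ~ Dir(alpha + topic counts), by conjugacy
        counts = np.bincount(z, minlength=k)
        theta = rng.dirichlet(alpha + counts)
        if sweep >= burn_in:
            theta_sum += theta
    return theta_sum / (n_sweeps - burn_in)

# Two well-separated topics; the document uses only word 0 (illustrative).
alpha = np.array([1.0, 1.0])
beta = np.array([[0.99, 0.01],
                 [0.01, 0.99]])
w = np.zeros(20, dtype=int)
theta_mean = gibbs_single_document(w, alpha, beta)
```

Because word 0 is almost exclusive to topic 0 in this toy setup, the estimated posterior mean of $\theta$ puts most of its mass on topic 0.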
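2.2.5 Applications and Summary
LDA is a very good model for text modeling and has found good applications in data mining and text analysis. It can also be applied well to other discrete data sets.
The following basic points about the LDA model are worth noting:
1. Its strength is that it analyzes documents through a topic model, which has clear advantages over analysis based on word similarity. First, it can deal with polysemy and synonymy: the same word has different meanings in different contexts, and the same meaning can be expressed by several words. Second, it can suppress the influence of noise in a document, because noise tends to end up in minor topics. The technique plays a role similar to feature extraction in machine learning: good features make a good classifier possible, but good basic features are hard to extract. In the LDA model, topic extraction is unsupervised; given training samples, the probability distributions are learned automatically.
2. It uses a multi-level Bayesian generative model. Thanks to Dirichlet-multinomial conjugacy, inference and learning based on Bayesian theory are not overly complicated. Such a multi-level generative model may match the true generative process more closely, and if the model assumptions are correct, it can often achieve good learning and inference results with relatively few training samples.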
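3 Conclusion
When modeling observed data, generative models usually assume some hidden variables and specify the joint distribution of the latent variables and the observed data. Inference and learning are then carried out with Bayesian inference, using the conditional independence relations. If the designed generative model approximates the true distribution well, it tends to perform well.
This report has briefly reviewed two generative models based on latent variables.
1. PPCA (probabilistic PCA) introduces a Gaussian latent variable that maps between the high-dimensional and low-dimensional spaces and represents the minor components as noise. This brings a probabilistic model into principal component analysis and gives the joint distribution of the high-dimensional data and the latent variable, so that likelihood estimation, the EM algorithm, and similar tools can handle tasks that are difficult for basic PCA.
2. LDA (latent Dirichlet allocation) builds a three-level Bayesian model. In text analysis, each word corresponds to a latent variable $z$ and each document corresponds to a latent variable $\theta$. The document's latent variable determines the distribution of topics in the document, and the word's latent variable determines the topic of that word.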
References
[1] Michael E. Tipping and Christopher M. Bishop. Probabilistic principal component analysis. Journal of the Royal Statistical Society, Series B, 1999, 61(3): 611-622.
[2] http://www.cmlab.csie.ntu.edu.tw/ cyy/learning/tutorials/PCA.pdf
[3] Y. Guan and J. G. Dy. Sparse probabilistic principal component analysis. International Conference on Artificial Intelligence and Statistics, 2009: 185-192.
[4] Michael E. Tipping and Christopher M. Bishop. Mixtures of probabilistic principal component analyzers. Neural Computation, 1999, 11(2): 443-482.
[5] David M. Blei, Andrew Y. Ng and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 2003, 3: 993-1022.
[6] http://blog.csdn.net/huagong adu/article/details/7937616
[7] http://en.wikipedia.org/wiki/Latent Dirichlet allocation