A quick note for my own reference.

The definition on Wikipedia: http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation

The key point: it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics.

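In other words, a document is treated as a mixture of topics, and each word has some probability of being assigned to each of those topics.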
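
To make the "mixture of topics" idea concrete, here is a minimal sketch of how a single document would be generated under this view. The function name `generate_document`, the toy values of `alpha` and `beta`, and the 5-word vocabulary are all made up for illustration; this only samples from the model and does not do any inference.

```python
import numpy as np

def generate_document(n_words, alpha, beta, rng):
    """Sample one document from the LDA generative story (illustrative only).

    alpha: Dirichlet parameter over topics, shape (K,)
    beta:  per-topic word distributions, shape (K, V), each row sums to 1
    """
    # Per-document topic mixture; a small alpha tends to put most of the
    # mass on a few topics.
    theta = rng.dirichlet(alpha)
    # Each word position first picks one topic from the document's mixture...
    topics = rng.choice(len(alpha), size=n_words, p=theta)
    # ...and then a word from that topic's word distribution.
    words = np.array([rng.choice(beta.shape[1], p=beta[z]) for z in topics])
    return theta, topics, words

rng = np.random.default_rng(0)
alpha = np.array([0.1, 0.1, 0.1])          # assumed toy prior, 3 topics
beta = np.array([
    [0.5, 0.5, 0.0, 0.0, 0.0],             # topic 0 uses word ids 0 and 1
    [0.0, 0.0, 0.5, 0.5, 0.0],             # topic 1 uses word ids 2 and 3
    [0.2, 0.0, 0.0, 0.0, 0.8],             # topic 2 uses word ids 0 and 4
])
theta, topics, words = generate_document(20, alpha, beta, rng)
print("topic mixture:", np.round(theta, 3))
print("topic per word:", topics)
print("word ids:", words)
```

Each word is tied to exactly one sampled topic, which matches the "attributable to one of the document's topics" wording in the definition.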

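Probabilistic latent semantic analysis (PLSA): LDA can be seen as a Bayesian version of PLSA, that is, PLSA with a Dirichlet prior placed on each document's topic distribution.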
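
As a rough way to see the relationship, the two models can be written side by side. The symbols below ($\theta_d$, $\beta$, $\alpha$, $z_{dn}$, $w_{dn}$) follow common convention and are not taken from this note. PLSA treats the per-document topic weights $P(z \mid d)$ as free parameters, while LDA draws them from a Dirichlet prior:

```latex
% PLSA: topic weights P(z | d) are plain parameters fitted per document
P(w \mid d) = \sum_{z} P(w \mid z)\, P(z \mid d)

% LDA: the same mixture, but the topic weights theta_d are random,
% drawn from a Dirichlet prior with parameter alpha
\theta_d \sim \mathrm{Dirichlet}(\alpha), \qquad
z_{dn} \sim \mathrm{Multinomial}(\theta_d), \qquad
w_{dn} \sim \mathrm{Multinomial}(\beta_{z_{dn}})
```

With the prior in place, the topic mixture of an unseen document is handled by the same model, rather than needing a new set of fitted parameters for it.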