Title: DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification
Time: December 25, 10:30-11:30
Venue: Room 404, Mong Man Wai Building (蒙民伟楼)
Speaker: Fei Sha, Assistant Professor, Department of Computer Science, University of Southern California, USA

Abstract:
Probabilistic topic models have become popular as methods for dimensionality reduction in collections of text documents or images. These models are usually treated as generative models and trained using maximum likelihood or Bayesian methods. In this talk, we discuss an alternative: a discriminative framework in which we assume that supervised side information is present, and in which we wish to take that side information into account in finding a reduced dimensionality representation. Specifically, we present DiscLDA, a discriminative variation on Latent Dirichlet Allocation (LDA) in which a class-dependent linear transformation is introduced on the topic mixture proportions. This parameter is estimated by maximizing the conditional likelihood. By using the transformed topic mixture proportions as a new representation of documents, we obtain a supervised dimensionality reduction algorithm that uncovers the latent structure in a document collection while preserving predictive power for the task of classification. We compare the predictive power of the latent structure of DiscLDA with unsupervised LDA on the 20 Newsgroups document classification task and show how our model can identify shared topics across classes as well as class-dependent topics.

Joint work with Simon Lacoste-Julien (UC Berkeley) and Michael I. Jordan (UC Berkeley).

Biography:
Fei Sha received his Ph.D. in computer and information science from the University of Pennsylvania. Afterwards, he spent a year at UC Berkeley as a postdoc with Prof. Michael Jordan and Prof. Stuart Russell. He then joined Yahoo Research as a research scientist for a year. He has been a faculty member in USC's computer science department since August 2008. His research focuses on statistical machine learning. His work has won best student paper awards at NIPS and ICML.