I am an associate professor at The University of Texas at Austin, where I am a member of the Statistics Group at the McCombs School of Business, a core faculty member in the Department of Statistics and Data Sciences (SDS), and a core member of the Machine Learning Laboratory. I received my Ph.D. from Duke University in 2013, my Master's from the Chinese Academy of Sciences in 2008, and my B.Sc. from Nanjing University in 2005. My Ph.D. advisor was Dr. Lawrence Carin.

The research of my group lies at the intersection of machine learning and Bayesian statistics.
We are interested in probabilistic methods, approximate inference, generative models, deep neural networks, and reinforcement learning.
We are currently focused on advancing both statistical inference with deep learning and deep learning with probabilistic methods. The deep probabilistic models and inference algorithms developed in our group have been applied to tackle challenging problems in a wide variety of research areas, including computer vision, natural language processing, sequential decision making, text analysis, image processing, time series modeling, graph mining, inverse materials design, and bioinformatics. Our research has received funding/computing support from NSF, TACC, ByteDance, and Nvidia.

### Research Highlights

- [04/2021] Xinjie successfully defended his PhD thesis. Congratulations, Dr. Fan!
- [04/2021] Yuguang successfully defended his PhD thesis. Congratulations, Dr. Yue!
- [04/2021] I am joining the editorial board of the Journal of Machine Learning Research (JMLR) as an Action Editor.
- [03/2021] I will serve as an Area Chair for NeurIPS 2021.
- [02/2021] We are presenting two papers at CVPR 2021 (partition-guided GANs and adaptive standardization and rescaling normalization for single domain generalization).
- [01/2021] We are presenting one paper at ICLR 2021 (contextual dropout) and two papers at AISTATS 2021 (graph gamma process LDS and hyperbolic graph embedding with enhanced SIVI).
- [11/2020] I will serve as an Area Chair for ICML 2021.
- [10/2020] We are presenting four papers at NeurIPS 2020, including "Implicit Distributional Reinforcement Learning" with Yuguang Yue and Zhendong Wang, "Bayesian Attention Modules" with Xinjie Fan, Shujian Zhang, and Bo Chen, "Bidirectional Convolutional Poisson Gamma Dynamical Systems" with Wenchao Chen, Chaojie Wang, Bo Chen, Yicheng Liu, and Hao Zhang, and "Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network" with Chaojie Wang, Hao Zhang, Bo Chen, Dongsheng Wang, and Zhengjue Wang.
- [08/2020] Since July 27, 2020, we have been contributing forecasts of COVID-19 deaths and cases in the US, produced with our negative binomial dynamical system (NBDS), to The COVID-19 Forecast Hub. (Our participation was delayed because a team member fell sick and was later diagnosed with pneumonia, fortunately due to a bacterial rather than a COVID-19 viral infection!) For the one-week-ahead forecast given the data before July 27, NBDS forecasted the cumulative number of COVID-19 deaths in the US on August 1 to be 154,486 (95% prediction interval: [151,953, 158,057]; median: 154,249; mean: 154,486), while the observed number was 154,448 and the COVIDhub-ensemble forecast was 153,024 (95% prediction interval: [151,455, 154,376]; median: 153,024). For this particular week, NBDS performed very well: the observed number of cumulative deaths on August 1 lies within its 95% prediction interval and is very close to its forecasted mean. Note that its forecasted mean is larger than its forecasted median and is closer to the 2.5th percentile than to the 97.5th percentile, suggesting that its predictive distribution is right-skewed (further confirmed by our own visualization and the visualization in COVIDhub), as commonly observed in count data. By contrast, the observed number of cumulative deaths is above the 97.5th percentile of the COVIDhub-ensemble forecast, and the forecasted median of COVIDhub-ensemble is closer to its 97.5th percentile than to its 2.5th percentile, suggesting that some of the models included in the ensemble may not capture well the distributional characteristics commonly observed in count data, such as right-skewness and over-dispersion. We are sending NBDS forecasts to The COVID-19 Forecast Hub on a weekly basis and will keep track of its forecast performance, which we hope will help provide insights into how to further improve COVIDhub-ensemble.
- [07/2020] Quan successfully defended his PhD thesis. Congratulations, Dr. Zhang!
- [07/2020] I will serve as an Area Chair (senior meta reviewer) for AAAI 2021.
- [07/2020] I will serve as an Area Chair for ICLR 2021.
- [06/2020] "Deep autoencoding topic model with scalable hybrid Bayesian inference" is accepted by IEEE TPAMI.
- [06/2020] We are presenting three papers at ICML 2020, including "Thompson sampling via local uncertainty" with Zhendong Wang (UT SDS), "Recurrent hierarchical topic-guided RNN for language generation" with Dandan Guo, Bo Chen, and Ruiying Lu from Xidian University, and "Bayesian graph neural networks with adaptive connection sampling" with Arman Hasanzadeh, Ehsan Hajiramezanali, Shahin Boluki, Nick Duffield, Krishna Narayanan, and Xiaoning Qian from Texas A&M University.
- [05/2020] Mingzhang successfully defended his PhD thesis. Congratulations, Dr. Yin!
- [03/2020] I will serve as an Area Chair for NeurIPS 2020.
- [02/2020] Mingzhang Yin will join the Data Science Institute at Columbia University as a Postdoctoral Fellow in July 2020, working with Professors David Blei and Simon Tavaré.
- [01/2020] We are presenting four papers at AISTATS 2020, including "Discrete action on-policy learning with action-value critic" with Yuguang Yue (UT SDS), Yunhao Tang (Columbia), and Mingzhang Yin (UT SDS), "Deep generative models of sparse and overdispersed discrete data" with He Zhao (Monash), Piyush Rai (IIT-Kanpur), Lan Du (Monash), Wray Buntine (Monash), and Dinh Phung (Monash), "Learnable Bernoulli dropout for Bayesian deep learning" with Shahin Boluki, Randy Ardywibowo, Siamak Zamani, and Xiaoning Qian from Texas A&M University, and "Learning dynamic hierarchical topic graph with graph convolutional network for document classification" with Chaojie Wang, Zhengjue Wang, Hao Zhang, Zhibin Duan, and Bo Chen from Xidian University.
- [12/2019] Quan Zhang will join the Eli Broad College of Business at Michigan State University as a tenure-track Assistant Professor in Fall 2020.
- [12/2019] We are presenting four papers at ICLR 2020, including "Adaptive correlated Monte Carlo for contextual categorical sequence generation" with Xinjie Fan (UT SDS), Yizhe Zhang (Microsoft Research), and Zhendong Wang (Columbia), "Variational hetero-encoder randomized generative adversarial networks for joint image-text modeling" with Hao Zhang, Bo Chen, Long Tian, and Zhengjue Wang from Xidian University, "Meta-learning without memorization" with Mingzhang Yin (UT SDS), George Tucker (Google Brain), Sergey Levine (UC Berkeley), and Chelsea Finn (Stanford), and "Mutual information gradient estimation for representation learning" with Liangjian Wen, Yiji Zhou, Lirong He, and Zenglin Xu from University of Electronic Science and Technology of China.
- We are presenting three papers at NeurIPS 2019, including "Poisson-randomized gamma dynamical systems" with Aaron Schein (Columbia), Scott Linderman (Stanford), David Blei (Columbia), and Hanna Wallach (Microsoft Research), and "Semi-implicit graph variational auto-encoders" and "Variational graph recurrent neural networks" with Ehsan Hajiramezanali, Arman Hasanzadeh, Krishna Narayanan, Nick Duffield, and Xiaoning Qian from Texas A&M University.
- I will serve as an Area Chair for AAAI-20.
- We are presenting three papers at ICML 2019, including "ARSM: Augment-REINFORCE-swap-merge estimator for gradient backpropagation through categorical variables" with Mingzhang Yin and Yuguang Yue from UT SDS, "Convolutional Poisson gamma belief network" with Chaojie Wang, Bo Chen, and Sucheng Xiao from Xidian University, and "Locally private Bayesian inference for count modeling" with Aaron Schein (Columbia), Steven Wu (University of Minnesota), Alexandra Schofield (Cornell), and Hanna Wallach (Microsoft Research).
- I will serve as an Area Chair for NeurIPS 2019.
- I will be promoted to Associate Professor (with tenure), effective September 1, 2019.
- "ARM: Augment-REINFORCE-merge gradient for stochastic binary networks," co-authored with Mingzhang Yin, is accepted for publication at ICLR 2019.
- My collaborators and I are presenting six papers at NeurIPS 2018 that cover a diverse set of research topics, including "Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks" with Quan Zhang from UT McCombs, "Dirichlet belief networks as structured topic prior" with He Zhao, Lan Du, and Wray Buntine from Monash University, "Deep Poisson gamma dynamical systems" with Dandan Guo, Bo Chen, and Hao Zhang from Xidian University, "Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data" with Ehsan Hajiramezanali, Siamak Zamani Dadaneh, Alireza Karbalayghareh, and Xiaoning Qian from Texas A&M University, "Masking: A new perspective of noisy supervision" with Bo Han, Jiangchao Yao, Gang Niu, Ivor Tsang, Ya Zhang, and Masashi Sugiyama from University of Technology Sydney, Shanghai Jiao Tong University, and RIKEN, and "Parsimonious Bayesian deep networks" by myself.
- NSF proposal "Collaborative Research: Combinatorial Collaborative Clustering for Simultaneous Patient Stratification and Biomarker Identification" is funded by the Information Integration and Informatics (III) program.
- I will serve as an Area Chair for ICLR 2019.
- I will serve as a member of the Senior Program Committee (SPC) for AAAI-19.
- "Semi-implicit variational inference," co-authored with Mingzhang Yin, is accepted for publication at ICML 2018.
- "Inter and intra topic structure learning with word embeddings," co-authored with He Zhao, Lan Du, and Wray Buntine, is accepted for publication at ICML 2018.
- "A dual Markov chain topic model for dynamic environments," co-authored with Ayan Acharya and Joydeep Ghosh, is accepted for publication at KDD 2018.
- I am honored to receive the 2017-2018 CBA Foundations Research Excellence Award for Assistant Professors from the McCombs School of Business.
- I will serve as an Area Chair for NIPS 2018.
- "Permuted and augmented stick-breaking Bayesian multinomial regression," co-authored with Quan Zhang, is accepted for publication in the Journal of Machine Learning Research.
- "WHAI: Weibull hybrid autoencoding inference for deep topic modeling," co-authored with Hao Zhang, Bo Chen, and Dandan Guo, is accepted for publication at ICLR 2018.
- "Nonparametric Bayesian sparse graph linear dynamical systems," co-authored with Rahi Kalantari and Joydeep Ghosh, is to appear at AISTATS 2018.
- "BayCount: A Bayesian decomposition method for inferring tumor heterogeneity using RNA-Seq counts," co-authored with Fangzheng Xie and Yanxun Xu, is accepted for publication in the Annals of Applied Statistics.
- "Multimodal Poisson gamma belief network," co-authored with Chaojie Wang and Bo Chen, is accepted by AAAI 2018.
- "Nonparametric Bayesian negative binomial factor analysis" is accepted for publication in Bayesian Analysis.
- The paper "Deep latent Dirichlet allocation with topic-layer-adaptive stochastic gradient Riemannian MCMC," co-authored with Yulai Cong and Bo Chen, is accepted by ICML 2017.
- The paper "BNP-Seq: Bayesian nonparametric differential expression analysis of sequencing count data," co-authored with Siamak Z. Dadaneh and Xiaoning Qian, is accepted for publication in the Journal of the American Statistical Association. R code will follow soon. Using the gamma/beta negative binomial process, we remove sophisticated ad-hoc pre-processing steps commonly required in existing algorithms. Please take a look at our paper and code if you are interested in differential expression analysis of sequencing counts.
- I will serve as an Area Chair for NIPS 2017.
- Aaron Schein gave a full oral presentation of our collaborative work at NIPS 2016 in Barcelona: Aaron Schein (UMass-Amherst), Mingyuan Zhou (UT-Austin), and Hanna Wallach (MSR-NYC), "Poisson-Gamma Dynamical Systems," Neural Information Processing Systems, Barcelona, Spain, Dec. 2016.
- The paper "Augmentable Gamma Belief Networks," co-authored with Yulai Cong and Bo Chen, is now published in the Journal of Machine Learning Research.
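
As a side note on the right-skewness highlighted in the COVID-19 forecasting item above: for over-dispersed counts, a negative binomial predictive distribution typically has its mean above its median, with the median closer to the 2.5th percentile than to the 97.5th. A minimal sketch (the parameters are purely illustrative and unrelated to the actual NBDS model):

```python
# Illustrative negative binomial (NOT the NBDS model): count predictive
# distributions are typically right-skewed and over-dispersed.
from scipy import stats

r, p = 5, 0.05                 # hypothetical shape and success probability
nb = stats.nbinom(r, p)

mean = nb.mean()               # r * (1 - p) / p = 95
median = nb.ppf(0.5)
lo, hi = nb.ppf(0.025), nb.ppf(0.975)   # 95% prediction interval

print(f"mean={mean:.1f}, median={median:.0f}, 95% PI=[{lo:.0f}, {hi:.0f}]")
# Right-skewness: the mean exceeds the median, and the interval stretches
# farther above the median than below it.
assert mean > median and (median - lo) < (hi - median)
```

The same qualitative pattern (mean above median, interval stretched to the right) appears in the NBDS forecast numbers quoted above.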

Matlab & C code for "Gamma Belief Networks" is now available on GitHub; the accompanying graphical models illustrate the hierarchical model and inference via data augmentation and marginalization across multiple hidden layers.

- The paper "Frequency of frequencies distributions and size dependent exchangeable random partitions," co-authored with Stefano Favaro and Stephen G. Walker, is accepted for publication in the Journal of the American Statistical Association.
- Aaron Schein is presenting our collaborative work at ICML 2016 in New York City: Aaron Schein (UMass-Amherst), Mingyuan Zhou (UT-Austin), David M. Blei (Columbia), and Hanna Wallach (MSR-NYC), "Bayesian Poisson Tucker decomposition for learning the structure of international relations," International Conference on Machine Learning, New York City, NY, June 2016.
- Matlab code for our JASA paper "Priors for random count matrices derived from a family of negative binomial processes" is now available on GitHub.
- Our NIPS 2015 paper "The Poisson gamma belief network" presents a Poisson-gamma-gamma-gamma... generative model that can extract, in an unsupervised manner, a multilayer deep representation of high-dimensional count vectors, with the network structure automatically inferred from the data given a fixed budget on the width of the first layer. When applied to deep topic models, the Poisson gamma belief network (PGBN) extracts very specific topics at the first hidden layer and increasingly more general topics at deeper hidden layers. Jointly training all the layers is simple and the code is easy to implement, as a PGBN of T layers can be broken into T subproblems that are solved with the same subroutine, with the computation mainly spent on training the first hidden layer. The extracted deep network can also be used to simulate highly interpretable synthetic documents, which reflect various general aspects of the corpus on which the network is trained.
- The paper "Priors for random count matrices derived from a family of negative binomial processes" is accepted for publication in Journal of the American Statistical Association.
- Two papers are accepted for publication at AISTATS 2015:
  - "Infinite edge partition models for overlapping community detection and link prediction."
  - "Nonparametric Bayesian factor analysis for dynamic count matrices."
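
The Poisson-gamma-gamma... construction described in the NIPS 2015 item above can be illustrated with a toy two-layer simulation: each hidden layer draws gamma variables whose shape parameters are a linear mixture of the layer above, and the bottom layer emits Poisson counts. All matrix sizes, hyperparameters, and names below are illustrative assumptions, not the authors' implementation:

```python
# Toy sketch of a two-layer PGBN-like generative process (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

V, K1, K2 = 100, 20, 5        # vocabulary size and layer widths (illustrative)
Phi1 = rng.dirichlet(np.ones(V), size=K1).T    # V x K1 loading matrix
Phi2 = rng.dirichlet(np.ones(K1), size=K2).T   # K1 x K2 loading matrix
r = rng.gamma(1.0, 1.0, size=K2)               # top-layer gamma shape parameters

def simulate_document(c=1.0):
    """Draw one synthetic count vector: gamma shapes cascade down the layers."""
    theta2 = rng.gamma(r, 1.0 / c)              # top hidden layer
    theta1 = rng.gamma(Phi2 @ theta2, 1.0 / c)  # shapes mixed from the layer above
    return rng.poisson(Phi1 @ theta1)           # Poisson counts at the bottom

x = simulate_document()
print(x.shape, x.sum())       # a V-dimensional vector of nonnegative word counts
```

In the actual PGBN, the loading matrices and layer widths are inferred from data rather than fixed, as described in the paper.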

- My NIPS 2014 paper "Beta-negative binomial process and exchangeable random partitions for mixed-membership modeling" introduces a nonparametric Bayesian prior that describes how to partition a count vector into a latent column-exchangeable random count matrix, whose number of nonzero columns is random and each of whose rows sums to a fixed integer. Note that in topic modeling, one is essentially trying to partition a count vector, each element of which is the total length of a document, into a latent document-by-topic count matrix. A fully collapsed Gibbs sampling algorithm naturally arises from the latent count matrix prior governed by the beta-negative binomial process.
- The paper "Priors for random count matrices derived from a family of negative binomial processes," which I co-authored with O.-H. Madrid-Padilla and James Scott, defines a family of probability distributions that generate infinite random count matrices whose columns are independent and identically distributed (i.i.d.). A random count matrix, with a Poisson-distributed and potentially unbounded random number of columns, can be generated either by drawing all its i.i.d. columns at once or by adding one row at a time, with the new columns introduced by each row appended to the right of the matrix. Both the gamma- and beta-negative binomial process random count matrices can model row heterogeneity with row-wise parameters, whose conditional posteriors have closed-form expressions. Our models lead to nonparametric naive Bayes classifiers that do not need to predetermine a finite number of features shared across all the categories.
- The paper "Negative binomial process count and mixture modeling," co-authored with Dr. Larry Carin, will appear in IEEE Transactions on Pattern Analysis and Machine Intelligence: Special Issue on Bayesian Nonparametrics. The paper also introduces a useful bivariate count distribution.

© July 2020, Mingyuan Zhou