Variable selection for high-dimensional data using known and novel graph information

Source: 太阳成集团tyc411   Published: 2018-11-20

Title: Variable selection for high-dimensional data using known and novel graph information

Speaker: Professor Qi Long (University of Pennsylvania)

Time: Wednesday, December 26, 2018, 14:00

Venue: Lecture Hall 1417, 14th Floor, Administration Building, School of Management, Zijingang Campus

Abstract: Variable selection for structured high-dimensional covariates lying on an underlying graph has drawn considerable interest. However, most existing methods may not scale to high-dimensional settings involving tens of thousands of variables lying on known pathways, as is the case in genomics studies, and they assume that the graph information is fully known. This talk will focus on addressing these two challenges. In the first part, I will present an adaptive Bayesian shrinkage approach that incorporates known graph information through shrinkage parameters and is scalable to high-dimensional settings (e.g., p ~ 100,000 or even millions). We also establish theoretical properties of the proposed approach for fixed and diverging p. In the second part, I will tackle the issue that graph information is not fully known. For example, the role of miRNAs in regulating gene expression is not well understood, and the miRNA regulatory network is often not validated. We propose an approach that treats unknown graph information as missing data (i.e., missing edges), introduce the idea of imputing the unknown graph information, and define the imputed information as the novel graph information. In addition, we propose a hierarchical group penalty to encourage sparsity at both the pathway level and the within-pathway level, which, combined with the imputation step, allows for the incorporation of both known and novel graph information. The methods are assessed via simulation studies and applied to analyses of cancer data.

All faculty and students are welcome to attend!

Contact: 张立新 (stazlx@zju.edu.cn)
