Spectral Clustering: Methodology and Statistical Analysis
Title: Spectral Clustering: Methodology and Statistical Analysis
Speaker: Ye Zhang (章叶), University of Pennsylvania
Time: Friday, June 2, 2023, 16:00
Venue: Room 312, Building 2, Hainayuan (海纳苑)
Abstract: Spectral clustering is one of the most popular algorithms for grouping high-dimensional data. It is easy to implement, computationally efficient, and has achieved tremendous success in many applications. The idea behind spectral clustering is dimensionality reduction: it first performs a spectral decomposition of the dataset and keeps only the leading few spectral components to reduce the dimension of the data. It then applies a standard method such as k-means to the low-dimensional data to do the clustering. In this talk, we demystify the success of spectral clustering by providing a sharp statistical analysis of its performance under mixture models. For isotropic Gaussian mixture models, we show that spectral clustering is optimal. For sub-Gaussian mixture models, we derive exponential error rates for spectral clustering. To establish these results, we develop a new spectral perturbation analysis for singular subspaces.
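The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes that the "spectral decomposition" is a singular value decomposition of the n x d data matrix, that the number of retained components equals the number of clusters k, and that k-means is run as plain Lloyd iterations with a farthest-point initialization. The names `spectral_clustering` and `demo` are hypothetical and introduced only for illustration.

```python
import numpy as np


def spectral_clustering(X, k, n_iter=50, seed=0):
    """Cluster the rows of X (n x d) into k groups.

    Assumed recipe: (1) project the data onto its top-k singular
    subspace, (2) run Lloyd's k-means on the projected points.
    """
    rng = np.random.default_rng(seed)

    # Step 1: SVD of the data matrix; row i of Z is the k-dimensional
    # embedding of sample i (top-k left singular vectors, scaled).
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :k] * s[:k]

    # Step 2a: farthest-point initialization of the k centers.
    centers = [Z[rng.integers(len(Z))]]
    for _ in range(1, k):
        d2 = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d2.argmax()])
    centers = np.array(centers)

    # Step 2b: Lloyd iterations (assign to nearest center, then re-average).
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def demo(seed=0):
    """Misclustering rate on a synthetic two-component isotropic Gaussian mixture."""
    rng = np.random.default_rng(seed)
    n, d, k = 200, 50, 2
    truth = np.repeat([0, 1], n // 2)
    means = np.zeros((k, d))
    means[0, 0] = 5.0    # the two cluster means differ only in the
    means[1, 0] = -5.0   # first coordinate (an illustrative choice)
    X = means[truth] + rng.standard_normal((n, d))
    labels = spectral_clustering(X, k, seed=seed)
    err = np.mean(labels != truth)
    return min(err, 1.0 - err)  # cluster labels are only defined up to swapping
```

Calling `demo()` returns the fraction of misclustered points on the synthetic mixture. In this sketch the SVD is applied to the raw data matrix without centering, which is one possible convention rather than something stated in the abstract.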
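Speaker bio: Ye Zhang is an Assistant Professor in the Department of Statistics and Data Science at the University of Pennsylvania. He received his bachelor's degree in statistics in 2013 and was a recipient of Zhejiang University's Chu Kochen Scholarship. He received his Ph.D. in statistics from Yale University in 2018 and spent 2018-2019 as a visitor at the University of Chicago. In 2019 he joined the Department of Statistics and Data Science at the University of Pennsylvania as an Assistant Professor. His research covers network analysis, clustering and mixture model analysis, spectral analysis, mean field variational inference, and ranking and synchronization. He has published multiple papers in leading international journals in statistics and machine learning, including the Annals of Statistics, JASA, the Journal of Machine Learning Research, and IEEE Transactions on Information Theory. He received the Francis J. Anscombe Award from Yale University in 2018 and the ICSA New Researcher Award in 2019. (Homepage: https://statistics.wharton.upenn.edu/profile/ayz/)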
Contact: Tianxiao Pang (庞天晓), txpang@zju.edu.cn