REGENTS OF THE UNIVERSITY OF MINNESOTA
The proposed project aims to develop new statistical theory and methodology for high-dimensional structured data. The project is inspired by challenging problems that arise in two important biological applications, disease gene identification and gene function discovery, where a central issue is how to use problem structure effectively to deal with high statistical uncertainty in the discovery process. The project consists of two major components: subnetwork analysis and structured learning. For subnetwork analysis, the PI and his collaborators will develop new techniques for extracting low-dimensional subnetwork structure, where a network is described by a directed or an undirected graph. For structured learning, the PI and his collaborators will develop new large-margin techniques for partial multi-label hierarchical classification, with particular effort focused on accurate prediction under hierarchical constraints and various hierarchical loss functions. The goal is to achieve a substantial improvement in predictive accuracy over the current best techniques. In addition, computational tools will be developed to target real problems and to provide optimal or near-optimal solutions.
The proposed project will address fundamentally important issues in structured data analysis. It will stimulate research interest in emerging problems and will promote collaborations between statisticians and scientists from other fields such as computer science and biomedical science. The research program will have an impact in several areas of research, particularly document management and exploration, automatic machine processing, biomedical research, and social science. The educational program will integrate teaching with research to expose students to state-of-the-art research and to create an interdisciplinary environment for training and learning.