Month: July 2023

Please Join Us in Congratulating Caiwen Ding and Derek Aguiar on Their Grant Awards

Making AI More Secure with Privacy-Preserving Machine Learning

Congratulations to CSE Assistant Professor Caiwen Ding, who, in collaboration with Wujie Wen from Lehigh University and Xiaolin Xu from Northeastern University, was awarded a $1.2M NSF grant for “Accelerating Privacy-Preserving Machine Learning as a Service: From Algorithm to Hardware.” This research project focuses on designing efficient, algorithm-hardware co-optimized solutions that accelerate privacy-preserving machine learning on diverse hardware platforms.

New NSF CAREER Awardee: Algorithmic and Statistical Modeling of Haplotypes

Congratulations to CSE Assistant Professor Derek Aguiar, who was awarded an NSF CAREER award titled “Practical algorithms and high dimensional statistical methods for multimodal haplotype modelling.” This project addresses major challenges in computational biology and applied machine learning by developing robust mathematical models that make few assumptions, along with efficient training algorithms that leverage massive and complex cellular data.

Source: NSF

Massive and diverse datasets have been generated from human cells with the goal of explaining the many ways cellular differences affect the observed differences in traits between people. Mathematical models of the genetic differences between people can be used to explain, for example, why some individuals are predisposed to developing a particular disease. However, most mathematical models make overly simplistic assumptions about how genetic differences interact to influence an observed trait. This project addresses major challenges in computational biology and applied machine learning by developing robust mathematical models that make few assumptions, along with efficient training algorithms that leverage massive and complex cellular data. Specifically, the project considers: (a) methods for computing sequences of genetic differences by integrating different types of data, machine learning, and algorithmic techniques; (b) mathematical models for characterizing the genetic similarity between people; and (c) efficient algorithms that scale to large datasets. The results of this project include new methods that are broadly applicable to clustering massive and diverse sequential data, and specifically helpful for researchers trying to understand how genetic differences affect disease and other traits. Furthermore, the research supports the math and science high school and university communities by developing interactive learning modules and networking resources.

This project develops the statistical and algorithmic foundations for sequences of multimodal variation (i.e., multiomic haplotypes) in two research directions. The first direction introduces the multiomic haplotype data structure and develops new Bayesian nonparametric models and fast inference algorithms for clustering multiomic haplotypes from heterogeneous and high-dimensional biomolecular data. Computational tractability is achieved through novel and efficient inference algorithms that operate in data-space (Bayesian coresets), model-space (deep approximations), and algorithm-space (variational approximations). The second direction develops the first model that unifies the combinatorial domain of haplotype assembly with the probabilistic haplotype phasing domain to infer latent haplotypes. The investigator will accomplish this unification goal by combining directed and undirected graphical modeling techniques with efficient particle-based inference algorithms. The completion of these research tasks will result in new methods for developing deep approximations for high-dimensional Bayesian nonparametric models, models for multimodal sequential clustering, and methods to accelerate the training of high-dimensional statistical models. Additionally, the research addresses (a) the longstanding open problem of haplotype assembly and haplotype phasing unification; and (b) potential sources of missing heritability in association studies: phase-dependent genetic and haplotype-epigenetic interactions. Partnerships with the university and regional high school communities will translate the research findings into educational modules and resources to motivate, engage, and retain computer science students and teachers.
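For readers unfamiliar with Bayesian nonparametric clustering, the sketch below samples cluster assignments from a Chinese restaurant process, a standard prior in which the number of clusters is not fixed in advance. It is a generic illustration only: the abstract does not say which priors the project uses, and the function name `crp_assignments` and the parameter `alpha` are assumptions introduced here.

```python
import random

def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (an illustrative assumption):
    larger values make new clusters more likely.
    """
    rng = random.Random(seed)
    assignments = []  # cluster label of each item, in arrival order
    counts = []       # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts

assignments, counts = crp_assignments(n_items=20, alpha=1.0)
print(assignments, counts)
```

The number of clusters grows with the data rather than being chosen up front, which is the property that makes such priors attractive for clustering heterogeneous sequences.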

Making AI More Secure with Privacy-Preserving Machine Learning

Source: NSF

Machine learning (ML) as a service is driven overwhelmingly by clients’ ever-increasing need for intelligent data processing on cloud servers, where powerful ML models are hosted. Although pervasive, outsourced ML processing poses real threats to the data privacy of personal and business providers. For example, either the clients must share their sensitive data, such as healthcare records or financial information, with the server, or the server must disclose the model to the clients. To guarantee privacy, cryptographic protocols such as Homomorphic Encryption (HE) and Multi-Party Computation (MPC) enable ML analytics directly on encrypted data. While these protocols are enticing, a large gap remains between theory and practice, e.g., long latency due to the prohibitively expensive computation or communication overhead over ciphertext. This project aims to practically accelerate the private ML service by offering a full-fledged development of efficient, scalable and encryption-conscious computing paradigms. The project’s novelties lie in new ML-specific cryptographic operators, accuracy-preserving and crypto-friendly neural architectures, and pioneering algorithm-hardware co-design methodologies. The project’s broader significance and importance are: (1) to advance trustworthy artificial intelligence (AI), one of the national strategic pillars of the National AI Initiative; (2) to deepen the understanding of interactions among cryptography, machine learning and hardware acceleration; (3) to enrich the computer engineering curriculum and the training of students from diverse backgrounds through relevant programs at Lehigh University, Northeastern University, and the University of Connecticut.

The project will develop a multifaceted design paradigm for efficient, scalable and practical algorithm-hardware co-optimized solutions to significantly accelerate privacy-preserving machine learning on hardware platforms such as FPGAs. This project consists of three interrelated research thrusts: (1) to orchestrate information representation and model sparsity in the encryption domain to fundamentally decrease the memory and computation footprint of HE inference; (2) to overcome the ultra-high overhead associated with the MPC-based solution through techniques such as encryption-aware model truncation and partial hardware reconfiguration; (3) to search for crypto-friendly and accuracy-preserving neural architectures by jointly optimizing non-linear operation reduction and closed-loop “algorithm-hardware” design space exploration.
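To give a sense of how MPC lets parties compute on data that none of them sees in the clear, the sketch below uses additive secret sharing to evaluate a dot product. It is a simplified illustration, not the project's protocol: the modulus `PRIME`, the helper names, and the choice to treat the weights as public are all assumptions made here. In a real private ML service the model is also hidden, and multiplying two secret values requires extra machinery and communication, which is one source of the overhead the abstract describes.

```python
import secrets

PRIME = 2**61 - 1  # modulus for share arithmetic (illustrative assumption)

def share(value, n_parties=2):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    """Recombine shares; any strict subset of them looks uniformly random."""
    return sum(shares) % PRIME

def share_vector(vec, n_parties=2):
    """Return one vector of shares per party."""
    per_element = [share(v, n_parties) for v in vec]
    return [[elem[p] for elem in per_element] for p in range(n_parties)]

def local_dot(weights, share_vec):
    """Each party multiplies public weights by its own shares, locally."""
    return sum(w * s for w, s in zip(weights, share_vec)) % PRIME

features = [3, 1, 4]   # client's private input
weights = [2, 7, 1]    # public weights (simplifying assumption)

party_shares = share_vector(features)
partials = [local_dot(weights, sv) for sv in party_shares]
result = reconstruct(partials)
print(result)  # 17
```

Because the dot product is linear, each party can work on its shares independently and only the final partial results are combined. Non-linear operations such as activation functions do not decompose this way, which is why reducing them is a goal of the third research thrust.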