Author: Guilbeault, Jessica

Making AI More Secure with Privacy-Preserving Machine Learning

Congratulations to CSE Assistant Professor Caiwen Ding, who, in collaboration with Wujie Wen from Lehigh University and Xiaolin Xu from Northeastern University, was awarded a $1.2M NSF grant for “Accelerating Privacy-Preserving Machine Learning as a Service: From Algorithm to Hardware.” This research project focuses on the design of efficient algorithm-hardware co-optimized solutions to accelerate privacy-preserving machine learning on diverse hardware platforms.

Source: NSF

Machine learning (ML) as a service is increasingly driven by clients’ ever-growing needs for intelligent data processing, served from cloud platforms that host powerful ML models. Although pervasive, outsourced ML processing poses real threats to the data privacy of individuals and businesses. For example, clients must either share sensitive data, such as healthcare records or financial information, with the server, or the server must disclose its model to the clients. To guarantee privacy, cryptographic protocols such as Homomorphic Encryption (HE) and Multi-Party Computation (MPC) enable ML analytics directly on encrypted data. While enticing, a large gap remains between theory and practice, e.g., long latency due to the prohibitively expensive computation and communication overhead on ciphertext. This project aims to practically accelerate private ML services by offering a full-fledged development of efficient, scalable, and encryption-conscious computing paradigms. The project’s novelties lie in new ML-specific cryptographic operators, accuracy-preserving and crypto-friendly neural architectures, and pioneering algorithm-hardware co-design methodologies. The project’s broader significance and importance are: (1) to advance trustworthy artificial intelligence (AI), one of the national strategic pillars of the National AI Initiative; (2) to deepen the understanding of the interactions among cryptography, machine learning, and hardware acceleration; (3) to enrich the computer engineering curriculum and the training of students from diverse backgrounds through relevant programs at Lehigh University, Northeastern University, and the University of Connecticut.
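To make “ML analytics directly on encrypted data” concrete, here is a minimal sketch of additively homomorphic encryption (a toy Paillier scheme with insecure, hard-coded parameters) evaluating a single linear layer on a client’s encrypted features using the server’s plaintext weights. This is an illustration of the general idea only; practical private inference, including this project, typically builds on lattice-based schemes such as CKKS and on MPC protocols rather than on toy Paillier.

```python
# Toy additively homomorphic "encrypted inference" sketch (illustration only;
# the parameters are insecure and this is NOT the project's actual HE stack).
import math
import random

def lcm(a, b):
    return a * b // math.gcd(a, b)

def keygen(p=999_983, q=1_000_003):
    """Toy Paillier key pair from two small, hard-coded primes (insecure)."""
    n = p * q
    lam = lcm(p - 1, q - 1)
    g = n + 1                      # standard simple choice of generator
    mu = pow(lam, -1, n)           # with g = n + 1, mu = lam^(-1) mod n
    return (n, g), (lam, mu, n)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(1, n)     # assume gcd(r, n) == 1 (true w.h.p. here)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(sk, c):
    lam, mu, n = sk
    L = (pow(c, lam, n * n) - 1) // n
    return (L * mu) % n

def he_add(pk, c1, c2):
    """Adding plaintexts corresponds to multiplying ciphertexts."""
    n, _ = pk
    return (c1 * c2) % (n * n)

def he_scale(pk, c, k):
    """Scaling a plaintext by a public k corresponds to exponentiating its ciphertext."""
    n, _ = pk
    return pow(c, k, n * n)

# Encrypted linear layer: the client encrypts its features; the server applies
# its plaintext weights without ever seeing the features in the clear.
pk, sk = keygen()
features = [3, 7, 2]               # client-side sensitive inputs
weights = [5, 1, 4]                # server-side model parameters
enc = [encrypt(pk, x) for x in features]

acc = encrypt(pk, 0)
for c, w in zip(enc, weights):
    acc = he_add(pk, acc, he_scale(pk, c, w))

print(decrypt(sk, acc))            # 3*5 + 7*1 + 2*4 = 30
```

Even in this toy setting, every homomorphic operation is a big-integer modular exponentiation or multiplication, which hints at why ciphertext computation and communication overheads dominate latency and why algorithm-hardware co-design is central to the project.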

The project will develop a multifaceted design paradigm for efficient, scalable, and practical algorithm-hardware co-optimized solutions to significantly accelerate privacy-preserving machine learning on hardware platforms such as FPGAs. The project consists of three intertwined research thrusts: (1) to orchestrate information representation and model sparsity in the encryption domain to fundamentally decrease the memory and computation footprint of HE inference; (2) to overcome the ultra-high overhead associated with MPC-based solutions through techniques such as encryption-aware model truncation and partial hardware reconfiguration; (3) to search for crypto-friendly and accuracy-preserving neural architectures via joint optimization of non-linear operation reduction and closed-loop “algorithm-hardware” design space exploration.
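As one concrete flavor of thrust (3), a common trick in crypto-friendly architecture design is to replace comparison-based non-linearities such as ReLU, which are expensive under HE and MPC, with low-degree polynomials that use only additions and multiplications. The sketch below shows a generic learnable quadratic activation dropped into a small network; the module name PolyAct and its coefficients are illustrative assumptions, not drawn from the project’s actual search space (e.g., PASNet).

```python
# Sketch of "non-linear operation reduction": swap ReLU for a low-degree
# polynomial activation that maps naturally onto HE/MPC primitives.
import torch
import torch.nn as nn

class PolyAct(nn.Module):
    """Learnable quadratic activation a*x^2 + b*x + c (crypto-friendly)."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(0.25))
        self.b = nn.Parameter(torch.tensor(0.50))
        self.c = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return self.a * x * x + self.b * x + self.c

def make_mlp(crypto_friendly: bool) -> nn.Sequential:
    """Same architecture, with either ReLU or its polynomial substitute."""
    act = PolyAct if crypto_friendly else nn.ReLU
    return nn.Sequential(
        nn.Linear(16, 32), act(),
        nn.Linear(32, 10),
    )

x = torch.randn(4, 16)
print(make_mlp(crypto_friendly=True)(x).shape)   # torch.Size([4, 10])
```

The accuracy cost of such substitutions is exactly why the project couples them with accuracy-preserving architecture search and hardware-aware design space exploration.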

Journal and Conference Papers Published by Faculty

Journal and Conference Articles Produced by Professor Aguiar

M. Hosseini, A. Palmer, W. Manka, P. G. Grady, V. Patchigolla, J. Bi, R. O’Neill, Z. Chi, and D. Aguiar. Deep statistical modelling of nanopore sequencing translocation times reveals latent non-B DNA structures. Bioinformatics (Supplement_1) (2023). This work was published as part of Intelligent Systems for Molecular Biology (ISMB) 2023 and will be presented in Lyon, France, in late July. About ISMB: “The annual international conference on Intelligent Systems for Molecular Biology (ISMB) is the flagship meeting of the International Society for Computational Biology (ISCB). The 2023 conference is the 31st ISMB conference, which has grown to become the world’s largest bioinformatics and computational biology conference.”

Non-canonical DNA structures, which deviate from the canonical double helix, play an important role in cellular processes from genomic instability to oncogenesis. Professors Aguiar and Bi’s Ph.D. students Marjan Hosseini and Aaron Palmer led the development of a deep statistical model for predicting non-canonical DNA. This work represents the first computational pipeline to predict non-canonical DNA from nanopore sequencing and will be presented at ISMB 2023, a top conference in computational biology.

Aaron Palmer led the development of the Goodness-of-Fit Autoencoder (GoFAE), a deterministic generative autoencoder that optimizes goodness-of-fit test statistics, and the associated theory to justify its gradient-based optimization. The work will be presented at ICLR 2023, a top machine learning conference.
Palmer, A., Chi, Z., Aguiar, D., & Bi, J. (2023). Auto-Encoding Goodness of Fit. The Eleventh International Conference on Learning Representations.

Learn More @ https://openreview.net/forum?id=JjCAdMUlu9v
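For a rough sense of the idea behind goodness-of-fit regularized autoencoders, the sketch below trains a small deterministic autoencoder whose latent codes are pushed toward a standard normal prior by a simple differentiable quantile-matching penalty. This is a crude stand-in for a formal goodness-of-fit test statistic and is not the GoFAE objective itself; the names TinyAE and gof_penalty and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch: autoencoder with a differentiable normality penalty on the
# latent codes (a rough stand-in for a goodness-of-fit statistic; NOT GoFAE).
import torch
import torch.nn as nn

class TinyAE(nn.Module):
    def __init__(self, d_in=20, d_z=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 16), nn.ReLU(), nn.Linear(16, d_z))
        self.dec = nn.Sequential(nn.Linear(d_z, 16), nn.ReLU(), nn.Linear(16, d_in))

    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def gof_penalty(z):
    """Compare sorted latent values to standard-normal quantiles per dimension."""
    n = z.shape[0]
    probs = (torch.arange(1, n + 1, dtype=z.dtype) - 0.5) / n
    quantiles = torch.distributions.Normal(0.0, 1.0).icdf(probs)
    z_sorted, _ = torch.sort(z, dim=0)
    return ((z_sorted - quantiles.unsqueeze(1)) ** 2).mean()

model = TinyAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, 20)                       # stand-in data batch
for _ in range(100):
    z, x_hat = model(x)
    loss = nn.functional.mse_loss(x_hat, x) + 0.1 * gof_penalty(z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```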

Professor Caiwen Ding Has Produced 11 Conference Papers So Far

  1. [23’ICML] Ran Ran, Xinwei Luo, Wei Wang, Tao Liu, Gang Quan, Xiaolin Xu, Caiwen Ding, Wujie Wen. SpENCNN: Orchestrating Encoding and Sparsity for Fast Homomorphically Encrypted Neural Network Inference. In Proceedings of the 40th International Conference on Machine Learning (ICML 2023). Acceptance rate: 21.4%.
  2. [23’IJCAI] Bingbing Li, Zigeng Wang, Shaoyi Huang, Mikhail Bragin, Ji Li, Caiwen Ding. Towards Lossless Head Pruning through Automatic Peer Distillation for Large Language Models. In Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI), 2023. Acceptance rate: 15%.
  3. [23’Oakland] Ce Feng*, Nuo Xu*, Wujie Wen, Parv Venkitasubramaniam, Caiwen Ding, Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering, IEEE Symposium on Security and Privacy (IEEE S&P “Oakland”).
  4. [23’DAC] Hongwu Peng*, Shanglin Zhou*, Yukui Luo*, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding, PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment, In Proceedings of ACM/EDAC/IEEE Design Automation Conference (DAC).
  5. [23’DAC] Shanglin Zhou*, Yingjie Li*, Minhan Lou, Weilu Gao, Zhijie Shi, Cunxi Yu, Caiwen Ding, Physics-aware Roughness Optimization for Diffractive Optical Neural Networks, In Proceedings of ACM/EDAC/IEEE Design Automation Conference (DAC).
  6. [23’CVPR] Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu, You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model, the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  7. [23’CVPR] Lei Zhang*, Jie Zhang*, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu, Accelerating Dataset Distillation via Model Augmentation, the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  8. [23’DAC] Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding, Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off, In Proceedings of ACM/EDAC/IEEE Design Automation Conference (DAC).
  9. [23’DAC] Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen and Caiwen Ding, Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration, In Proceedings of ACM/EDAC/IEEE Design Automation Conference (DAC), 2023.
  10. [23’ISPASS] Mohsin Shan, Deniz Gurevin, Jared Nye, Caiwen Ding, Omer Khan, Workload Balancing to Unlock Extreme Parallelism for Graph Neural Network Acceleration, In Proceedings of the 2023 International Symposium on Performance Analysis of Systems and Software (ISPASS).
  11. [23’ICRA] Sanbao Su, Yiming Li, Sihong He, Songyang Han, Chen Feng, Caiwen Ding, Fei Miao, Uncertainty Quantification of Collaborative Detection for Self-Driving, In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA).