Congratulations to Tan Zhu on his paper "Polyhedron Attention Module: Learning Adaptive-order Interactions" being accepted for presentation at the Conference on Neural Information Processing Systems (NeurIPS).
Can you summarize your research area?
My research interests lie primarily in developing novel DNN architectures for recommendation systems, with applications to mental health disorder diagnosis and click-through rate prediction, and in reinforcement learning algorithms, focusing on the deep stochastic contextual bandit problem and Monte Carlo tree search.
What is the overarching goal of your graduate study?
My overarching goal is to improve the interpretability and performance of DNNs by conducting feature selection with deep reinforcement learning and by incorporating novel feature interactions with trainable complexity into the training process of DNNs.
How do you hope that you will have changed computing in five years?
In the next five years, in addition to developing interpretation methods for DNNs, I plan to explore feature selection and dataset distillation algorithms that utilize the model interpretations of DNNs. With the interpretable knowledge extracted from state-of-the-art DNN models, it is possible to efficiently and elegantly downscale large datasets and reduce the time and space complexity of training large DNNs. Over the past few years, Large Language Models (LLMs) have undergone significant development, marking a transformative period in artificial intelligence and natural language processing; training these models, however, requires enormous amounts of data and computation. Given these challenges, I am confident that my research can offer valuable contributions to both academic and industrial work in this area.
How does additional support allow you to more effectively complete your graduate study?
I'm really thankful for the support I've received during my graduate studies. Prof. Bi's guidance has been incredibly valuable, helping me grow academically and professionally. The support provided by the CACC and the Computer Science department has given me a collaborative environment, which has greatly enriched my learning experience. The availability of high-performance computing resources allows me to engage in advanced deep learning research, which demands substantial computational power.
What are you hoping to do upon graduation?
Upon graduation, I’m planning to transition into the industry.
(For papers) What is the major improvement made in this work? What consequences does this improvement have for the field in general?
Our Polyhedron Attention Module (PAM) can adaptively learn interactions of different complexity for different samples, and our theoretical analysis shows that PAM has stronger expressive capability than ReLU-activated networks. Extensive experimental results demonstrate PAM's state-of-the-art classification performance on massive click-through rate prediction datasets, and show that PAM can learn meaningful interaction effects in a medical problem. These improvements not only set new benchmarks in click-through rate prediction but also underscore the growing importance of model transparency in AI.
Tan Zhu
"The NeurIPS conference offered a comprehensive overview of current research trends, including developments in large language models, knowledge distillation, and reinforcement learning. The most notable aspect was the researchers' emphasis on applying large language models to a variety of research fields, demonstrating the significant potential of these models in addressing diverse challenges."