About
MS candidate specializing in AI Safety, Robustness, and Trustworthy Machine Learning, with a focus on Multimodal Large Language Models (MLLMs). Experienced in developing adversarial attacks, improving model generalization, and building data valuation tools for open-source libraries. Eager to apply these skills to research on safe and reliable AI systems.
Work
Summary
Provided grading and academic support for database and data management courses at the School of Information and Computer Sciences.
Highlights
Graded assignments and provided feedback for CS220P: Databases and Data Management in Fall 2024.
Supported CS122A: Introduction to Data Management in Spring 2024.
Assisted faculty in maintaining consistent grading standards across core data management courses.
Summary
Conducted advanced research on AI safety and robustness in Multimodal Large Language Models (MLLMs), developing novel adversarial attacks and enhancing model generalization techniques.
Highlights
Demonstrated that fine-tuning can break the safety alignment of MLLMs, exposing a critical fine-tuning risk for AI safety research.
Proposed the Dynamic Vision-Language Alignment (DynVLA) Attack, a transfer-based adversarial method that leverages Gaussian kernel perturbation to generate adversarial examples.
Showed that the resulting adversarial examples transfer to closed-source models such as Google Gemini, inducing them to generate attacker-specified target text and exposing real-world security vulnerabilities in MLLMs.
Identified vision-language connector architecture, LLM size, and LLM type as key factors in surrogate model selection for improving the transferability of adversarial attacks.
Applied a model merging method to Low-Rank Adaptation (LoRA) modules, outperforming model soups in both in-distribution (ID) and out-of-distribution (OOD) accuracy under few-shot learning settings.
Summary
Contributed to Cleanlab's open-source library by developing Out-of-Distribution (OOD) detection methods and integrating a data valuation module.
Highlights
Implemented the GEN Out-of-Distribution (OOD) detection method in Cleanlab, significantly improving OOD detection performance on high-resolution image datasets, including ImageNet.
Integrated a data valuation module that uses KNN-Shapley values to score each data point's contribution, establishing it as a core feature of the Cleanlab library.
Improved the utility and robustness of Cleanlab's offerings for data-centric AI applications.
Summary
Investigated the linearity of representation in backdoored models, focusing on data poisoning and stealthy trigger generation.
Highlights
Analyzed the Pearson correlation between the representations of clean and poisoned inputs to understand backdoor attack mechanisms.
Proposed a new training process and a method for generating stealthier triggers, significantly improving the covertness of backdoor attacks.
Contributed to research on data poisoning and backdoor attack mitigation strategies, enhancing the robustness of machine learning models.
Awards
Student Scholarship
Awarded By
Southeast University
Received a student scholarship from Southeast University in recognition of academic excellence and research potential.
Student Travel Support Award
Awarded By
IEEE Conference on Secure and Trustworthy ML (SaTML)
Awarded travel support to attend the inaugural IEEE Conference on Secure and Trustworthy ML (SaTML) in 2023, recognizing contributions to the field.
Publications
Published by
Under Review
Summary
Co-authored a paper proposing the Dynamic Vision-Language Alignment (DynVLA) Attack to enhance adversarial transferability in Multimodal Large Language Models; currently under review.
Skills
Programming Languages
Python, C/C++, Go, Java, Shell, JavaScript.
Frameworks & Libraries
PyTorch, Hugging Face Transformers, PEFT, TRL, Diffusers.
Machine Learning & AI
Fine-tuning CLIP, Diffusion Models, Multimodal Large Language Models (MLLMs), LLaVA, Llama-Vision, Large Language Models (LLMs), SFT (Supervised Fine-tuning), RLHF (Reinforcement Learning from Human Feedback), AI Safety, Robustness, Trustworthy ML, Alignment, Post-training, Data-Centric AI, Adversarial Attacks, Out-of-Distribution (OOD) Detection, Model Generalization, Low-Rank Adaptation (LoRA), Data Valuation, KNN-Shapley Value, Backdoor Attacks, Data Poisoning, Representation Analysis.
Interests
Research Interests
AI Safety, Robustness, Trustworthy ML, Alignment/Post-training, Data-Centric AI, Multimodal Large Language Models, Adversarial Examples, Backdoor Attacks.
References
Prof. Yao Qin
Assistant Professor and Senior Research Scientist, University of California, Santa Barbara; Google DeepMind
Dr. Jindong Gu
Senior Research Fellow and Faculty Researcher, University of Oxford; Google DeepMind