About
MS candidate specializing in AI Safety, Robustness, and Trustworthy Machine Learning, with a focus on Multimodal Large Language Models (MLLMs). Experienced in developing adversarial attacks, improving model generalization, and building data valuation tools for open-source libraries. Eager to apply these skills to research on safe and reliable AI systems.
Work
Summary
Provided grading and academic support for database and data management courses at the School of Information and Computer Sciences.
Highlights
Graded assignments and provided feedback for CS220P: Databases and Data Management in Fall 2024.
Supported CS122A: Introduction to Data Management in Spring 2024.
Assisted faculty in maintaining consistent grading standards across core data management courses.
Summary
Conducted advanced research on AI safety and robustness in Multimodal Large Language Models (MLLMs), developing novel adversarial attacks and enhancing model generalization techniques.
Highlights
Demonstrated that fine-tuning can break the safety alignment of MLLMs, exposing a critical fine-tuning risk for AI safety research.
Proposed the Dynamic Vision-Language Alignment (DynVLA) Attack, a transfer-based adversarial method that leverages Gaussian kernel perturbation to generate adversarial examples.
Showed that the resulting adversarial examples transfer to closed-source models such as Google Gemini, inducing them to generate attacker-specified target text and exposing real-world security vulnerabilities in MLLMs.
Identified vision-language connector architecture, LLM size, and LLM type as key factors in surrogate model selection for improving the transferability of adversarial attacks.
Applied a model merging method to Low-Rank Adaptation (LoRA) modules, outperforming model soups in both in-distribution (ID) and out-of-distribution (OOD) accuracy under few-shot learning settings.
Summary
Contributed to Cleanlab's open-source library by developing Out-of-Distribution (OOD) detection methods and integrating a data valuation module.
Highlights
Implemented the GEN Out-of-Distribution (OOD) detection method in Cleanlab, significantly improving OOD detection performance on high-resolution image datasets, including ImageNet.
Integrated a data valuation module that uses KNN-Shapley values to score each data point's contribution, establishing it as a core feature of the Cleanlab library.
Improved the utility and robustness of Cleanlab's offerings for data-centric AI applications.
Summary
Investigated the linearity of representation in backdoored models, focusing on data poisoning and stealthy trigger generation.
Highlights
Analyzed the Pearson correlation between the representations of clean and poisoned inputs to understand backdoor attack mechanisms.
Proposed a new training process and a method for generating stealthier triggers, significantly improving the covertness of backdoor attacks.
Contributed to research on data poisoning and backdoor attack mitigation strategies, enhancing the robustness of machine learning models.
Awards
Student Scholarship
Awarded By
Southeast University
Received a student scholarship from Southeast University in recognition of academic excellence and research potential.
Student Travel Support Award
Awarded By
IEEE Conference on Secure and Trustworthy ML (SaTML)
Awarded travel support to attend the inaugural IEEE Conference on Secure and Trustworthy ML (SaTML) in 2023, recognizing contributions to the field.
Publications
Published by
Under Review
Summary
Co-authored a paper proposing the Dynamic Vision-Language Alignment (DynVLA) Attack to enhance adversarial transferability in Multimodal Large Language Models; currently under review.
Skills
Programming Languages
Python, C/C++, Go, Java, Shell, JavaScript.
Frameworks & Libraries
PyTorch, Hugging Face Transformers, PEFT, TRL, Diffusers.
Machine Learning & AI
Fine-tuning CLIP, Diffusion Models, Multimodal Large Language Models (MLLMs), LLaVA, Llama-Vision, Large Language Models (LLMs), SFT (Supervised Fine-tuning), RLHF (Reinforcement Learning from Human Feedback), AI Safety, Robustness, Trustworthy ML, Alignment, Post-training, Data-Centric AI, Adversarial Attacks, Out-of-Distribution (OOD) Detection, Model Generalization, Low-Rank Adaptation (LoRA), Data Valuation, KNN-Shapley Value, Backdoor Attacks, Data Poisoning, Representation Analysis.
Interests
Research Interests
AI Safety, Robustness, Trustworthy ML, Alignment/Post-training, Data-Centric AI, Multimodal Large Language Models, Adversarial Examples, Backdoor Attacks.
References
Prof. Yao Qin
Assistant Professor and Senior Research Scientist, University of California, Santa Barbara; Google DeepMind
Dr. Jindong Gu
Senior Research Fellow and Faculty Researcher, University of Oxford; Google DeepMind