Internship Opportunity: Trustworthy ML & LLM Security (Attacks & Defenses)

Before sending an email, make sure you have some expertise in this area and already know about some attacks and defenses.

Start Date: As soon as possible

Duration: 3–6 months (extendable based on performance)

Description:

This internship focuses on evaluating the security, privacy, and robustness of machine learning models, including modern Large Language Models (LLMs). We are looking for highly motivated undergraduate or graduate students interested in studying and benchmarking attacks and defenses in ML systems.

The project involves hands-on experimentation with real-world models, where you will reproduce, analyze, and extend existing attack and defense techniques. The goal is to develop a deeper understanding of how ML systems fail and how to make them robust, reliable, and trustworthy.

Responsibilities:

Implement and benchmark attack strategies (e.g., poisoning, inference attacks, jailbreaks)
Study and evaluate defense mechanisms for ML and LLM systems
Design experimental pipelines for reproducible evaluation
Analyze robustness, privacy leakage, and system vulnerabilities
Contribute to research reports and potential publications

Requirements:

Strong programming skills in Python
Basic understanding of Machine Learning / Deep Learning
Familiarity with frameworks like PyTorch or TensorFlow
Interest in AI security, privacy, or trustworthy AI
Prior experience with LLMs, federated learning (FL), or security is a plus (not mandatory)

What You Will Gain:

Hands-on experience with state-of-the-art ML and LLM security research
Exposure to real research workflows (experiments, evaluation, writing)
Opportunity to contribute to publications or open-source tools

Application:

Send your CV and a brief motivation to hkasyap.cse@iitbhu.ac.in with the subject: “Intern – Trustworthy ML / LLM Security”