Security Red-Teaming & Adversarial Hardening for Frontier LLM Series

Security AI Engineer2021 - present timeVladyslav L.

Vladyslav L.

Senior AI Engineer

ML & Data Science

Key Expertise

LLM Security AuditingMLOps & InfrastructureBehavioral Data ScienceAnomalous Pattern DetectionPredictive Analytics

Experience

10+ years

Timezone

CST (UTC +8)

7-day risk-free trial

Response within 24 hours

View Full Profile

Overview

The project provides specialized, project-based Senior ML Security Engineering consultation for Alibaba Group's proprietary LLM division. Operating within an airgapped, SCIF-level laboratory, the focus is adversarial robustness and operational security redteaming for frontier models. Alibaba's Qwen series began with the beta release in April 2023 under the name Tongyi Qianwen, with Qwen-7B open-sourced in August 2023 and Qwen-72B released in December 2023. The Qwen2 series launched in June 2024 with 72B parameters, Qwen3 followed in April 2025, and Qwen3-Coder was open-sourced in July 2025.

Achievements

Identified and patched a Unicode Tag Sequence Injection vulnerability in a prerelease customer-support agent. Developed a fuzzing harness using Hypothesis (property-based testing library first released circa 2016) that revealed a 12% edge-case failure rate in the RLHF safety classifier. The implemented double-model supervision framework reduced false negatives in security audits by 41% compared to single-model filtering.

Responsibilities

Researched BPE tokenizer manipulation to construct prompts that appear benign visually but decode into malicious system instructions, exploiting ASCII smuggling and Unicode normalization gaps.
Engineered an air-gapped automated fuzzing framework to probe the safety refusal vector, focusing on multi-turn dialogue poisoning where an attacker gradually shifts the context window before injection.
Designed and validated a hierarchical supervision mechanism in which a smaller, older-generation model (e.g., Qwen-7B, released August 2023) acts as a policy sentinel, monitoring and evaluating the outputs of a newer, more capable model (e.g., Qwen2-72B, released June 2024). The sentinel model detects alignment drift and potential security bypasses by leveraging its simpler, more predictable decision boundaries, flagging anomalous responses for further inspection without requiring human oversight in the loop.
Quantified memorization rates using canary string injection methodology to assess the efficacy of differential privacy applied during fine-tuning.
Directed kernel-level isolation strategies using gVisor (open-sourced by Google in May 2018) and seccomp profiles to enforce air-gap integrity at the container runtime. Integrated eBPF (introduced with Linux 3.18 in 2014) for unauthorized syscall tracing.

Technologies Used

PythonAlibaba Qwen SeriesHypothesisRustJupyter LabgVisoreBPFseccomp

Vladyslav L.

Senior AI Engineer

ML & Data Science

Key Expertise

LLM Security AuditingMLOps & InfrastructureBehavioral Data ScienceAnomalous Pattern DetectionPredictive Analytics

Experience

10+ years

Timezone

CST (UTC +8)

7-day risk-free trial

Response within 24 hours

View Full Profile

This project was delivered by

Vladyslav L.

View Full Profile

More Projects by Vladyslav L.

2017–2020

Real-time sentiment & volatility forecasting system for fintech

Senior Machine Learning Engineer

This project involved building a proprietary predictive intelligence platform for a Belgian fintech firm focused on cryptocurrency risk mitigation. The system was engineered as a loss-prevention agent designed to forecast negative market shocks by triangulating over 400 heterogeneous data streams. LLMs were integrated into the sentiment pipeline using the most advanced models available during the project timeline: BERT (2018) for initial embedding work, followed by GPT-2 (staged release February–November 2019) and SentenceBERT (August 2019) for semantic similarity and panic classification

PythonasyncioWeb3.pyXGBoostScikit-learn+8

View Details

2023–2024

Behavioral analysis of personalization engines & dark patterns

Adversarial Machine Learning Engineer

The engagement involved a comprehensive adversarial audit of the TEMU mobile application's personalization engine, focusing on the intersection of privacy controls and behavioral psychology exploitation. The primary objective was to instrument the application to detect and classify a proprietary "Compulsive Spending Propensity Model" - an algorithmic layer designed to identify users exhibiting shopaholic or impulse-buying behavior patterns. LLMs were employed to semantically analyze ad copy and UI strings, leveraging models released during the project window: LLaMA 2 (July 2023) for initial prototyping, followed by Mistral 7B (September 2023) for production dark pattern classification.

PythonFridaMitmproxyHDBSCANXGBoost+6

View Details