DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

May 20, 2026 · 4:00 AM UTC · 3 min read

#adversarial attacks #machine learning #artificial intelligence

⚡ TL;DR · AI summary

The paper introduces DarkLLM, a framework that uses large language models to generate adversarial attacks. It translates natural-language attack instructions into visual perturbations that are effective across various models. The authors show that DarkLLM produces highly effective attacks with only 1B parameters, highlighting vulnerabilities in modern foundation models.

Key facts

▪ DarkLLM unifies various types of adversarial attacks within a single framework.
▪ The framework leverages natural-language instruction tuning for flexible adversarial generation.
▪ Extensive experiments show DarkLLM's effectiveness against multiple models and tasks.

Original article

arXiv cs.AI

Opening excerpt (first ~120 words)

Computer Science > Cryptography and Security

arXiv:2605.18868 (cs)

[Submitted on 15 May 2026]

Title: DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

Authors: Ye Sun, Xin Wang, Jiaming Zhang, Yifeng Gao, Yixu Wang, Yifan Ding, Qixian Zhang, Henghui Ding, Xingjun Ma, Yu-Gang Jiang

Abstract: While vision and multimodal foundation models underpin critical tasks from perception to complex reasoning, they remain highly vulnerable to adversarial attacks.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.
