How I Built a Unicode Sanitizer to Stop Hidden Prompt Injection Attacks
Jade Duan has developed an open-source tool called Velio to combat hidden prompt injection attacks by stripping invisible Unicode characters from text. The tool normalizes text and removes problematic Unicode characters before the text reaches a language model. Velio also returns structured findings on the characters removed, so users can inspect the hidden content in their inputs.
- Velio is a Python library and REST API designed to sanitize text before it reaches a language model.
- The tool targets four main categories of problematic Unicode characters, including zero-width spaces and bidirectional overrides.
- Users can choose between two output modes: strip, which removes characters, and mark, which indicates removed characters with tokens.
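To make the strip and mark modes concrete, here is a minimal sketch of the general technique using only the Python standard library. It is not Velio's actual API: the names `sanitize`, `Finding`, `ZERO_WIDTH`, and `BIDI_CONTROLS`, the `[U+XXXX]` token format, and the exact character sets are all assumptions made for illustration. It covers only the two categories named above, and it widens "bidirectional overrides" to the related embedding and isolate controls.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical character sets (assumption): only the two categories named in
# the summary are covered here, not the tool's full list of four.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings, pop, overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # isolates
}


@dataclass
class Finding:
    """One removed character: where it was and what it is."""
    index: int       # position in the original text
    codepoint: str   # e.g. "U+200B"
    name: str        # Unicode character name
    category: str    # "zero_width" or "bidi_control"


def classify(ch):
    """Return the category of a hidden character, or None if it is kept."""
    if ch in ZERO_WIDTH:
        return "zero_width"
    if ch in BIDI_CONTROLS:
        return "bidi_control"
    return None


def sanitize(text, mode="strip"):
    """Return (cleaned_text, findings).

    mode="strip" drops hidden characters; mode="mark" replaces each one
    with a visible token such as [U+200B].
    """
    if mode not in ("strip", "mark"):
        raise ValueError("mode must be 'strip' or 'mark'")
    out = []
    findings = []
    for index, ch in enumerate(text):
        category = classify(ch)
        if category is None:
            out.append(ch)
            continue
        codepoint = f"U+{ord(ch):04X}"
        name = unicodedata.name(ch, "UNKNOWN")
        findings.append(Finding(index, codepoint, name, category))
        if mode == "mark":
            out.append(f"[{codepoint}]")
    return "".join(out), findings


if __name__ == "__main__":
    sample = "ignore\u200b previous\u202e instructions"
    cleaned, found = sanitize(sample)
    print(cleaned)
    for f in found:
        print(f)
    print(sanitize(sample, mode="mark")[0])
```

Returning the findings alongside the cleaned text lets a caller log or reject suspicious input instead of silently altering it. One caveat of this sketch: removing U+200D (zero-width joiner) also splits legitimate emoji sequences, so a production tool would likely need a more careful policy.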
Opening excerpt (first ~120 words)
Jade Duan, posted on May 16. #llm #opensource #security #showdev
I recently shipped a small open-source tool called Velio that strips hidden Unicode characters from text before it reaches an LLM. This post explains why I built it, what it actually catches, and how to use it.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to.