GitHub - tldrsec/prompt-injection-defenses: Every practical and proposed defense against prompt injection.

Linking pages

GitHub - Tanq16/link-hub: A collection of helpful resources related to Cybersecurity and a lot more. https://github.com/Tanq16/link-hub 1 comment

Linked pages

The Dual LLM pattern for building AI assistants that can resist prompt injection https://simonwillison.net/2023/Apr/25/dual-llm-pattern/ 116 comments
Representation Engineering Mistral-7B an Acid Trip https://vgel.me/posts/representation-engineering/ 75 comments
[2402.16459] Defending LLMs against Jailbreaking Attacks via Backtranslation https://arxiv.org/abs/2402.16459 48 comments
You can’t solve AI security problems with more AI https://simonwillison.net/2022/Sep/17/prompt-injection-more-ai/ 31 comments
[2310.03684] SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks https://arxiv.org/abs/2310.03684 22 comments
doublespeak.chat https://doublespeak.chat 12 comments
GitHub - NVIDIA/NeMo-Guardrails: NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. https://github.com/NVIDIA/NeMo-Guardrails 1 comment
Securing LLM Systems Against Prompt Injection | NVIDIA Technical Blog https://developer.nvidia.com/blog/securing-llm-systems-against-prompt-injection/ 1 comment
OpenAI API https://platform.openai.com/docs/guides/safety-best-practices 0 comments
GitHub - whylabs/langkit: 🔍 LangKit: An open-source toolkit for monitoring Language Learning Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀 https://github.com/whylabs/langkit 0 comments
GitHub - protectai/rebuff: Rebuff.ai - Prompt Injection Detector https://github.com/protectai/rebuff 0 comments
GitHub - amoffat/HeimdaLLM: Use LLMs to construct trusted output from untrusted input https://github.com/amoffat/HeimdaLLM 0 comments
GitHub - jthack/PIPE: Prompt Injection Primer for Engineers https://github.com/jthack/PIPE 0 comments
[2308.14132] Detecting Language Model Attacks with Perplexity https://arxiv.org/abs/2308.14132 0 comments
Cloud & App Security Product Insights https://list.latio.tech/ 0 comments
Purple Llama CyberSecEval: A benchmark for evaluating the cybersecurity risks of large language models | Research - AI at Meta https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/ 0 comments
GitHub - guardrails-ai/guardrails: Adding guardrails to large language models. https://github.com/guardrails-ai/guardrails 0 comments
Improving LLM Security Against Prompt Injection: AppSec Guidance For Pentesters and Developers - Include Security Research Blog https://blog.includesecurity.com/2024/01/improving-llm-security-against-prompt-injection-appsec-guidance-for-pentesters-and-developers/ 0 comments
Recommendations to help mitigate prompt injection: limit the blast radius https://simonwillison.net/2023/Dec/20/mitigate-prompt-injection/ 0 comments