Linking pages
Linked pages
- The Dual LLM pattern for building AI assistants that can resist prompt injection https://simonwillison.net/2023/Apr/25/dual-llm-pattern/ 116 comments
- Representation Engineering Mistral-7B an Acid Trip https://vgel.me/posts/representation-engineering/ 75 comments
- [2402.16459] Defending LLMs against Jailbreaking Attacks via Backtranslation https://arxiv.org/abs/2402.16459 48 comments
- You can’t solve AI security problems with more AI https://simonwillison.net/2022/Sep/17/prompt-injection-more-ai/ 31 comments
- [2310.03684] SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks https://arxiv.org/abs/2310.03684 22 comments
- doublespeak.chat https://doublespeak.chat 12 comments
- GitHub - NVIDIA/NeMo-Guardrails: NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. https://github.com/NVIDIA/NeMo-Guardrails 1 comment
- Securing LLM Systems Against Prompt Injection | NVIDIA Technical Blog https://developer.nvidia.com/blog/securing-llm-systems-against-prompt-injection/ 1 comment
- OpenAI API https://platform.openai.com/docs/guides/safety-best-practices 0 comments
- GitHub - whylabs/langkit: 🔍 LangKit: An open-source toolkit for monitoring Language Learning Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀 https://github.com/whylabs/langkit 0 comments
- GitHub - protectai/rebuff: Rebuff.ai - Prompt Injection Detector https://github.com/protectai/rebuff 0 comments
- GitHub - amoffat/HeimdaLLM: Use LLMs to construct trusted output from untrusted input https://github.com/amoffat/HeimdaLLM 0 comments
- GitHub - jthack/PIPE: Prompt Injection Primer for Engineers https://github.com/jthack/PIPE 0 comments
- [2308.14132] Detecting Language Model Attacks with Perplexity https://arxiv.org/abs/2308.14132 0 comments
- Cloud & App Security Product Insights https://list.latio.tech/ 0 comments
- Purple Llama CyberSecEval: A benchmark for evaluating the cybersecurity risks of large language models | Research - AI at Meta https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/ 0 comments
- GitHub - guardrails-ai/guardrails: Adding guardrails to large language models. https://github.com/guardrails-ai/guardrails 0 comments
- Improving LLM Security Against Prompt Injection: AppSec Guidance For Pentesters and Developers - Include Security Research Blog https://blog.includesecurity.com/2024/01/improving-llm-security-against-prompt-injection-appsec-guidance-for-pentesters-and-developers/ 0 comments
- Recommendations to help mitigate prompt injection: limit the blast radius https://simonwillison.net/2023/Dec/20/mitigate-prompt-injection/ 0 comments
Related searches:
Search whole site: site:github.com
Search title: GitHub - tldrsec/prompt-injection-defenses: Every practical and proposed defense against prompt injection.
See how to search.