Tag: LLM safety


CAP: Efficient Knowledge Unlearning in Large Language Models

Discover CAP, a novel prompt-driven framework enabling precise, controllable unlearning in LLMs without modifying model parameters.

Principled LLM Safety Testing: Solving Jailbreak Oracle

Discover Boa, a novel system tackling the jailbreak oracle problem to improve LLM safety testing and prevent harmful jailbreak attacks.

Logic Jailbreak: Bypass LLM Safety with Formal Logic

Discover how Logic Jailbreak uses formal logical expressions to efficiently bypass LLM safety restrictions across multiple languages.

Detecting Harmful Intent in LLM Residual Streams Geometrically

Discover how harmful intent in large language models can be geometrically identified from residual streams, enhancing AI safety and alignment.

FineSteer: Advanced Inference-Time Steering for LLMs

Discover FineSteer, a unified framework that enhances fine-grained inference-time steering in large language models for safer, more accurate AI outputs.
