LLM Safety From Within: Detecting Harmful Content with Internal Representations (Paper 2604.18519)