Medical Malice: A Dataset for Context-Aware Safety in Healthcare LLMs Paper • 2511.21757 • Published Nov 24, 2025
HealthQA-BR: A System-Wide Benchmark Reveals Critical Knowledge Gaps in Large Language Models Paper • 2506.21578 • Published Jun 16, 2025