LAnoBERT: System Log Anomaly Detection based on BERT Masked Language Model Paper • 2111.09564 • Published Jul 23, 2023
CheckEval: Robust Evaluation Framework using Large Language Model via Checklist Paper • 2403.18771 • Published Mar 27, 2024
RExBench: Can coding agents autonomously implement AI research extensions? Paper • 2506.22598 • Published Jun 27, 2025
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models Paper • 2404.13919 • Published Feb 23, 2025
CIRF: Tokenizing Chain-of-Thoughts into Reusable Functional Units for Efficient Latent Reasoning in Large Language Models Paper • 2605.28292 • Published 12 days ago