- Diffusion Language Models Know the Answer Before Decoding
  Paper • 2508.19982 • Published • 27
- ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
  Paper • 2512.13586 • Published • 93
- LSRIF: Logic-Structured Reinforcement Learning for Instruction Following
  Paper • 2601.06431 • Published • 12
- Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
  Paper • 2601.09088 • Published • 63
Collections
Collections including paper arxiv:2601.22975
- MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
  Paper • 2601.21821 • Published • 62
- Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
  Paper • 2601.22975 • Published • 111
- Reinforced Attention Learning
  Paper • 2602.04884 • Published • 30
- LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
  Paper • 2510.19363 • Published • 63

- LongCat-Flash-Thinking-2601 Technical Report
  Paper • 2601.16725 • Published • 180
- DeepSeek-OCR 2: Visual Causal Flow
  Paper • 2601.20552 • Published • 68
- Linear representations in language models can change dramatically over a conversation
  Paper • 2601.20834 • Published • 21
- BMAM: Brain-inspired Multi-Agent Memory Framework
  Paper • 2601.20465 • Published • 5

- Agentic Reasoning for Large Language Models
  Paper • 2601.12538 • Published • 204
- ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
  Paper • 2601.21558 • Published • 60
- Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
  Paper • 2601.22975 • Published • 111

- V-Thinker: Interactive Thinking with Images
  Paper • 2511.04460 • Published • 98
- Visual Spatial Tuning
  Paper • 2511.05491 • Published • 53
- BabyVision: Visual Reasoning Beyond Language
  Paper • 2601.06521 • Published • 201
- Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
  Paper • 2601.22975 • Published • 111

- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 322
- Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
  Paper • 2512.23988 • Published • 19
- SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
  Paper • 2512.25075 • Published • 15
- Guiding a Diffusion Transformer with the Internal Dynamics of Itself
  Paper • 2512.24176 • Published • 8

- Behavior Knowledge Merge in Reinforced Agentic Models
  Paper • 2601.13572 • Published • 27
- Language of Thought Shapes Output Diversity in Large Language Models
  Paper • 2601.11227 • Published • 9
- Agentic-R: Learning to Retrieve for Agentic Search
  Paper • 2601.11888 • Published • 19
- RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
  Paper • 2602.02488 • Published • 36

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 628
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 302
- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 320
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
  Paper • 2509.03867 • Published • 213