xiaotong
xtongji
AI & ML interests
None yet
Recent Activity
upvoted a paper about 22 hours ago
Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers
upvoted a paper 18 days ago
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks
authored a paper 25 days ago
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
Organizations
None yet