arxiv:2510.23393
Farid Bagirov
kraalfar
AI & ML interests
None yet
Recent Activity
upvoted a paper 2 days ago
Mellum2 Technical Report
authored a paper 7 months ago
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation