arxiv:2511.13524
Xiaoji Zheng
Student-Xiaoji
AI & ML interests
None yet
Recent Activity
upvoted a paper about 14 hours ago: Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation
upvoted a paper about 14 hours ago: Unified Personalized Reward Model for Vision Generation
upvoted a paper about 14 hours ago: CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs
Organizations
None yet