John
johnhan00
AI & ML interests
Pluralistic value alignment
Recent Activity
upvoted a paper 4 days ago
KL for a KL: On-Policy Distillation with Control Variate Baseline
commented on a paper 23 days ago
ThinkBrake: Efficient Reasoning via Log-Probability Margin Guided Decoding