Ambroser53's Collections: Alignment
Understanding the performance gap between online and offline alignment algorithms
Paper
• 2405.08448
• Published
• 18
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
Paper
• 2405.19332
• Published
• 22
Offline Regularised Reinforcement Learning for Large Language Models Alignment
Paper
• 2405.19107
• Published
• 15
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper
• 2406.00888
• Published
• 33
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Paper
• 2406.02900
• Published
• 13
BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
Paper
• 2406.12168
• Published
• 7
Deep Bayesian Active Learning for Preference Modeling in Large Language Models
Paper
• 2406.10023
• Published
• 2
Bootstrapping Language Models with DPO Implicit Rewards
Paper
• 2406.09760
• Published
• 41
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
Paper
• 2407.13833
• Published
• 12
Baichuan Alignment Technical Report
Paper
• 2410.14940
• Published
• 51