Submitted by Maksim Afanasyev
SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization
Floating Point Sigma Lab