ELF: Embedded Language Flows
Trained checkpoints from an unofficial PyTorch reproduction of ELF: Embedded Language Flows (Hu et al., 2026). Code, training/eval scripts, and reproduction artifacts live at https://github.com/Ugness/ELF-pytorch.
Results for ELF are not directly comparable with baselines (MDLM, Duo, FLM, ...) due to tokenization and preprocessing differences.
| Path | Size | Role |
|---|---|---|
| last.ckpt | 1.4 GB | Final EMA-bearing checkpoint (== checkpoint_epoch05_step00228204.ckpt). This is the one used for the headline 1000-sample eval. |
| checkpoints/checkpoint_epoch00_step00038034.ckpt … checkpoint_epoch05_step00228204.ckpt | 6 × 1.4 GB | Per-epoch checkpoints for reproducing the per-epoch trajectory. Optional. |
| reproduction/config.yml | — | Resolved training-config snapshot from the actual run. |
| reproduction/eval1000/{all_generated,metrics}.jsonl | — | 1000 generated samples + final Gen. PPL/entropy. |
| reproduction/per_epoch/epoch_00{1..6}.jsonl + metrics.jsonl | — | 256 sanity samples per epoch + per-epoch metrics. |
| Metric | Paper (TPU v5p-64) | This reproduction (8× B200 DDP, Lightning) |
|---|---|---|
| Gen. PPL ↓ | 24.1 | 25.61 |
| Entropy ↑ | 5.15 | 5.20 |
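
To put the gap between the paper and this reproduction in relative terms, the arithmetic below uses only the two rows of the table above. The variable and function names are illustrative and are not part of the code repo.

```python
# Values copied from the table above (paper vs. this reproduction).
paper = {"gen_ppl": 24.1, "entropy": 5.15}
repro = {"gen_ppl": 25.61, "entropy": 5.20}


def relative_gap(ours: float, reference: float) -> float:
    """Signed relative difference of `ours` with respect to `reference`."""
    return (ours - reference) / reference


gaps = {name: relative_gap(repro[name], paper[name]) for name in paper}
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1%}")
```

With these numbers the reproduction sits roughly 6% above the paper on Gen. PPL and roughly 1% above on entropy.
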
| Epoch | Step | Gen. PPL | Entropy |
|---|---|---|---|
| 1 | 38 034 | 2.73¹ | 0.70¹ |
| 2 | 76 068 | 37.11 | 5.17 |
| 3 | 114 102 | 28.63 | 5.21 |
| 4 | 152 136 | 25.00 | 5.16 |
| 5 | 190 170 | 25.58 | 5.19 |
| 6 | 228 204 | 26.11 | 5.21 |
¹ Epoch 1 is degenerate (entropy ≈ 0.7); the run only begins producing fluent text from epoch 2 onward.
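
Because the degenerate epoch 1 also has the lowest Gen. PPL, ranking checkpoints by Gen. PPL alone would pick it. The sketch below filters on entropy first and then takes the lowest Gen. PPL. The rows are copied from the table above; the `MIN_ENTROPY` cutoff of 4.0 and the `best_epoch` helper are assumptions made for illustration, not values or functions from the paper or the code repo.

```python
# Per-epoch rows copied from the table above: (epoch, step, gen_ppl, entropy).
ROWS = [
    (1, 38_034, 2.73, 0.70),
    (2, 76_068, 37.11, 5.17),
    (3, 114_102, 28.63, 5.21),
    (4, 152_136, 25.00, 5.16),
    (5, 190_170, 25.58, 5.19),
    (6, 228_204, 26.11, 5.21),
]

# Hypothetical cutoff: entropy far below the ~5.2 plateau is treated as degenerate.
MIN_ENTROPY = 4.0


def best_epoch(rows, min_entropy=MIN_ENTROPY):
    """Return the row with the lowest Gen. PPL among non-degenerate epochs."""
    fluent = [row for row in rows if row[3] >= min_entropy]
    return min(fluent, key=lambda row: row[2])


naive = min(ROWS, key=lambda row: row[2])  # ignores entropy entirely
chosen = best_epoch(ROWS)
print("lowest Gen. PPL overall:", naive[0])
print("lowest Gen. PPL with entropy filter:", chosen[0])
```

On these rows the unfiltered minimum is epoch 1, while the filtered minimum is epoch 4 (Gen. PPL 25.00). These per-epoch figures come from 256 sanity samples each, so small differences between later epochs may not be meaningful.
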
```bash
pip install huggingface_hub

# Final EMA checkpoint only (recommended)
huggingface-cli download Ugness/elf-torch last.ckpt --local-dir ./elf-b/

# Then, from the code repo (https://github.com/Ugness/ELF-pytorch):
cd pytorch_lightning/
torchrun --nproc_per_node=8 --master_port=29510 eval_lightning.py \
  --config configs/training_configs/train_owt_ELF-B.yml \
  --checkpoint_path /path/to/elf-b/last.ckpt \
  --num_samples 1000
# Expected: Gen. PPL ≈ 25.6, sample entropy ≈ 5.20.
```
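
The same download can also be scripted from Python. This is a minimal sketch that assumes the `hf_hub_download` function from `huggingface_hub`; the repo id and filename are taken from the CLI command above, while the `fetch_final_checkpoint` helper and its default directory are illustrative names, not part of the code repo.

```python
from pathlib import Path

REPO_ID = "Ugness/elf-torch"  # same repo as in the CLI command above
FILENAME = "last.ckpt"        # final EMA-bearing checkpoint (about 1.4 GB)


def fetch_final_checkpoint(local_dir: str = "./elf-b") -> Path:
    """Download last.ckpt into `local_dir` and return its local path."""
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
    return Path(path)


# Example call (triggers the download):
#   ckpt = fetch_final_checkpoint()
#   print(ckpt.resolve())
```

The returned path can then be passed to `--checkpoint_path` in the `torchrun` command above.
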
License: MIT, same as the code repo. Please cite the original paper:
```bibtex
@article{elf2026,
  title={ELF: Embedded Language Flows},
  author={Hu, Keya and Qiu, Linlu and Lu, Yiyang and Zhao, Hanhong and Li, Tianhong and Kim, Yoon and Andreas, Jacob and He, Kaiming},
  journal={arXiv preprint arXiv:2605.10938},
  year={2026}
}
```
This reproduction was heavily developed with Claude Code.