CodeLeWM Transition Model Artifacts

This repository hosts CodeLeWM transition-model checkpoints and checkpoint manifests for code-edit and execution-world-model experiments. CodeLeWM is a research artifact for scoring and reranking candidate code states. It is not a code generator, and this repository is not an inference endpoint.

What Is Hosted Here

| Artifact family | Repository path | Dataset surface | Claim posture |
| --- | --- | --- | --- |
| Early scaled code-edit transition run | `checkpoints/codelewm-scaled-20260520-9699b53` | `abdelstark/codelewm-public-shard` | smoke/scaled diagnostic |
| Action-use margin run | `checkpoints/codelewm-action-use-20260520-6650183` | `abdelstark/codelewm-public-shard` | negative action-use result |
| Action-use retrieval run | `checkpoints/codelewm-action-use-retrieval-20260520-7895d18` | `abdelstark/codelewm-public-shard` | negative against no-action control |
| v0.2 action-swap/inverse-action run | `checkpoints/codelewm-v0-2-action-swap-rerun-20260520-7c7cb0b` | `abdelstark/codelewm-public-shard` | negative action-use and representation result |
| v0.8 execution checkpoints, seeds 42 and 1729 | `checkpoints/codelewm-v0-8-short-execution-20260605-1b737e4-seed-{42,1729}` | `abdelstark/codelewm-execution-pack` | mixed diagnostic execution evidence |
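
The last row uses one path template for both seeds. A minimal sketch of expanding that template into the two concrete repository paths is shown below. The constant and function names are illustrative and are not part of the CodeLeWM codebase.

```python
# Expand the `seed-{42,1729}` pattern from the table into one path per seed.
# V08_PREFIX, V08_SEEDS, and v08_checkpoint_paths are illustrative names only.
V08_PREFIX = "checkpoints/codelewm-v0-8-short-execution-20260605-1b737e4-seed-"
V08_SEEDS = (42, 1729)


def v08_checkpoint_paths(seeds=V08_SEEDS):
    """Return the repository path of each hosted v0.8 execution checkpoint."""
    return [f"{V08_PREFIX}{seed}" for seed in seeds]


for path in v08_checkpoint_paths():
    print(path)
```

Each resulting path can be passed to the `--include` option of the download command in the Verification section, followed by `/**`.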

The final v0.9 seed-42 and seed-1729 execution runs are published in `abdelstark/codelewm-runs`, not in this model repository. They are documented in the CodeLeWM release cards and the final public artifact index.

Dataset Information

CodeLeWM uses two public dataset surfaces:

| Dataset | Role |
| --- | --- |
| `abdelstark/codelewm-public-shard` | Historical public-safe Python code-edit transition shards used by the scaled/action-use runs. |
| `abdelstark/codelewm-execution-pack` | Current execution-substrate pack of 2,188 tokenized `(code, input, output)` records at revision `v0.9.0-rc1`. |

The execution pack records a deterministic sandbox policy, source provenance, split policy, checksums, and a claim boundary. It contains tokenized code, inputs, outputs, and metadata; training and scoring do not execute candidate code.
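
To illustrate how a recorded checksum can be checked locally, the sketch below hashes a downloaded file and compares the result with an expected digest. It assumes SHA-256 hex digests, which this card does not confirm, and `sha256_of` and `matches_recorded_checksum` are hypothetical helper names rather than CodeLeWM APIs. It is not a substitute for the project's own manifest verification.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large shards are never fully held in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_checksum(path: Path, recorded_hex: str) -> bool:
    """Compare a file's digest with a recorded one (ASSUMED: SHA-256, hex-encoded)."""
    return sha256_of(path) == recorded_hex.strip().lower()
```

A caller would read the expected digest from wherever the pack records it and refuse to use any file for which `matches_recorded_checksum` returns `False`.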

Intended Use

- Reproduce CodeLeWM checkpoint loading, retrieval, surprise, scorer-quality, and reranking diagnostics.
- Compare transition-model scores against no-action, shuffled-action, lexical, random, and LLM-order controls.
- Inspect checkpoint manifests and trust-gate metadata before using a checkpoint in local scoring.
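
The comparison against a random control can be sketched as follows. This is a toy illustration under stated assumptions, not the CodeLeWM evaluation code: `Candidate`, `top1_accuracy`, and `random_control_accuracy` are hypothetical names, higher scores are assumed to mean "preferred", and the scores and pass labels are made-up values chosen only to exercise the functions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Candidate:
    score: float  # hypothetical scorer output; higher is assumed to mean preferred
    passed: bool  # ground-truth label, used only for evaluation


def top1_accuracy(tasks):
    """Fraction of tasks whose highest-scored candidate passes."""
    return mean(
        1.0 if max(task, key=lambda c: c.score).passed else 0.0 for task in tasks
    )


def random_control_accuracy(tasks):
    """Expected top-1 accuracy when each task's candidates are ordered at random."""
    return mean(sum(c.passed for c in task) / len(task) for task in tasks)


# Toy data for illustration only; these are not CodeLeWM results.
tasks = [
    [Candidate(0.9, True), Candidate(0.2, False)],
    [Candidate(0.4, False), Candidate(0.7, False), Candidate(0.1, True)],
]

print(f"scorer top-1:   {top1_accuracy(tasks):.3f}")
print(f"random control: {random_control_accuracy(tasks):.3f}")
```

Computing the random control in closed form, rather than by sampling shuffles, keeps the baseline deterministic. A scorer is only informative on a slice where it clearly beats this baseline and the other controls.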

Out Of Scope

- Generating code.
- Claiming broad coding improvement or live patch utility.
- Treating a green manifest, demo, or checkpoint load as a model-quality claim.
- Loading checkpoints without the CodeLeWM checkpoint trust gates and manifest verification.

Claim Boundary

The tested code-edit action-use interventions are negative. The v0.9/v1.0 release evidence supports a narrow HumanEval WS-D diagnostic reranking slice, while the aggregate downstream claim remains closed because MBPP-Plus is saturated against no-action and lexical controls. This repository therefore supports reproducible diagnostic research, not a general claim that CodeLeWM improves coding.

Verification

Download a hosted checkpoint family and verify its manifests:

```shell
hf download abdelstark/codelewm-transition-model \
  --repo-type model \
  --include 'checkpoints/codelewm-v0-8-short-execution-20260605-1b737e4-seed-42/**' \
  --local-dir .artifacts/hf-download/codelewm-transition-model

uv run codelewm manifest verify \
  --manifest .artifacts/hf-download/codelewm-transition-model/checkpoints/codelewm-v0-8-short-execution-20260605-1b737e4-seed-42/manifest.json \
  --json

uv run codelewm secret-scan \
  .artifacts/hf-download/codelewm-transition-model/checkpoints/codelewm-v0-8-short-execution-20260605-1b737e4-seed-42 \
  --json
```

Expected result: manifest verification returns `ok=true`, and the secret scan returns `ok=true` with zero findings.
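
When these commands run in a script, the expected result can be checked programmatically. The sketch below parses a JSON report and applies the two conditions stated above. The exact output schema is an assumption: only the `ok` field is implied by this card, while the `findings` key, the sample payloads, and the `check_report` helper are hypothetical.

```python
import json


def check_report(raw_json: str, require_no_findings: bool = False) -> bool:
    """Return True only if the report says ok=true (and, optionally, lists no findings)."""
    report = json.loads(raw_json)
    if report.get("ok") is not True:
        return False
    # ASSUMPTION: findings, if reported, appear as a list under a "findings" key.
    if require_no_findings and report.get("findings"):
        return False
    return True


# Hypothetical payloads shaped like the expected result described above.
manifest_output = '{"ok": true}'
secret_scan_output = '{"ok": true, "findings": []}'

print(check_report(manifest_output))
print(check_report(secret_scan_output, require_no_findings=True))
```

Requiring `ok` to be exactly `True` makes a missing or malformed field fail the check.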

Primary References

Code repository: https://github.com/AbdelStark/CodeLeWM
Execution dataset: https://huggingface.co/datasets/abdelstark/codelewm-execution-pack
Historical code-edit shard: https://huggingface.co/datasets/abdelstark/codelewm-public-shard
Run artifacts: https://huggingface.co/datasets/abdelstark/codelewm-runs
Final artifact index: docs/benchmark/PUBLIC_ARTIFACT_INDEX_2026-06-08.md
