Exporters

New export backends can be added to Transformers by subclassing HfExporter.

Learn how to use the built-in exporters in the Exporters guide.

AutoHfExporter

class transformers.exporters.AutoHfExporter

( )

The Auto-HF exporter class that automatically instantiates the correct HfExporter for the given ExportConfig.

from_pretrained

( pretrained_model_name_or_path, **kwargs )

Load an exporter instance from a pretrained model/checkpoint that ships an export config.

Not implemented yet. This is a placeholder for a first-class “export recipe” workflow.

The idea: model owners publish an export_config.json (or an export_config field in config.json) alongside their weights on the Hub. That file captures the settings the owner has already validated for their architecture — the target format (dynamo / onnx / executorch), exact dynamic-shape specs (e.g. text_ids dynamic to 4096, image tiles fixed at 448, batch=1 for edge deployment), strict flag, ONNX opset, prefill vs. decode layout, ExecuTorch backend choice, and any other knob that today lives as tribal knowledge in a README or a private notebook.
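Because the recipe format is not implemented yet, the following is only a guess at what such a file might contain. Every key name is hypothetical, chosen to mirror the settings listed above, and the opset number is a placeholder rather than a recommendation.

```python
import json

# Hypothetical export_config.json contents. All key names are assumptions
# made for illustration; the real schema has not been defined.
EXPORT_CONFIG_JSON = """
{
  "export_format": "onnx",
  "strict": false,
  "opset_version": 18,
  "dynamic_shapes": {
    "text_ids": {"batch": 1, "sequence": {"min": 1, "max": 4096}},
    "image_tiles": {"batch": 1, "height": 448, "width": 448}
  }
}
"""

# Parse the recipe the way a consumer might before building an exporter.
export_config = json.loads(EXPORT_CONFIG_JSON)
max_text_len = export_config["dynamic_shapes"]["text_ids"]["sequence"]["max"]
print(export_config["export_format"], max_text_len)
```

The point of the sketch is that the shape contract (dynamic text length, fixed tile size, batch of 1) lives in data next to the weights instead of in a README.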

Consumers then get the owner-validated export in one call:

exporter = AutoHfExporter.from_pretrained("org/model-name")
program = exporter.export(model, inputs)

This composes with the register_export_input_preparer registry: the owner supplies the shape spec via export_config.json, and Transformers supplies the data-dependent precomputations (cu_seqlens, vision position ids, window indices, …) for that architecture. Together they cover the two hard parts of exporting new models, namely knowing the right shape contract and preparing the right inputs, so downstream users don’t have to re-derive either from scratch (and don’t break in production when they get it wrong).

supports_export_format

( export_config_dict: dict )

Return True if the provided dict describes an export_format that has both a registered config class and a registered exporter class. Warns with an actionable message when the format is missing entirely, unknown, or only half-registered.
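The three warning cases described above can be sketched with two plain dictionaries standing in for the registries. The registry names, their contents, and the warning texts are assumptions for illustration; the real registries live inside Transformers and are not shown in this reference.

```python
import warnings

# Hypothetical stand-in registries. "half_registered" has a config class
# but no exporter class, to exercise the half-registered case.
CONFIG_CLASSES = {"dynamo": object, "onnx": object, "half_registered": object}
EXPORTER_CLASSES = {"dynamo": object, "onnx": object}


def supports_export_format(export_config_dict: dict) -> bool:
    """Return True only if the format has both a config and an exporter."""
    export_format = export_config_dict.get("export_format")
    if export_format is None:
        warnings.warn("Export config has no 'export_format' key; add one, e.g. 'onnx'.")
        return False

    has_config = export_format in CONFIG_CLASSES
    has_exporter = export_format in EXPORTER_CLASSES
    if has_config and has_exporter:
        return True

    if not has_config and not has_exporter:
        known = sorted(set(CONFIG_CLASSES) & set(EXPORTER_CLASSES))
        warnings.warn(f"Unknown export_format {export_format!r}; supported formats: {known}.")
    else:
        missing = "exporter" if has_config else "config"
        warnings.warn(f"export_format {export_format!r} has no registered {missing} class.")
    return False
```

Returning False with a warning, rather than raising, lets a caller probe a config dict without wrapping the call in a try block.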

AutoExportConfig

class transformers.exporters.AutoExportConfig

( )

The Auto-HF export config class that automatically dispatches to the correct export config class given an export config stored in a dictionary.

HfExporter

class transformers.exporters.HfExporter

( )

Abstract base class for all Transformers exporters.

Subclass and implement HfExporter.export to add a new export backend.
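The subclassing pattern can be sketched without importing Transformers by using a stand-in abstract base class. HfExporterSketch and EchoExporter are hypothetical names; only the export(model, sample_inputs, config) signature is taken from this reference.

```python
from abc import ABC, abstractmethod
from typing import Any, Mapping


class HfExporterSketch(ABC):
    """Stand-in for HfExporter so the example runs without Transformers."""

    @abstractmethod
    def export(self, model: Any, sample_inputs: Mapping[str, Any], config: Any) -> Any:
        """Export the model and return a backend-specific program object."""


class EchoExporter(HfExporterSketch):
    """Hypothetical backend that only records what it was asked to export."""

    def export(self, model, sample_inputs, config):
        return {
            "model": type(model).__name__,
            "inputs": sorted(sample_inputs),
            "config": config,
        }


program = EchoExporter().export(
    object(), {"input_ids": [1, 2], "attention_mask": [1, 1]}, config=None
)
print(program)
```

A real backend would replace the body of export with the tracing or conversion call for its target format and return that backend's program object.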

export

( model: PreTrainedModel, sample_inputs: MutableMapping[str, torch.Tensor | Cache], config: ExportConfigMixin )

Parameters

model (PreTrainedModel) — The model to export.
sample_inputs (dict[str, torch.Tensor | Cache]) — Forward kwargs, i.e. what you’d pass to model(**sample_inputs). These are used directly as the example inputs during tracing. For an autoregressive decode-step export, this means you need to include past_key_values, cache_position, etc. If you only have generation-style inputs, use HfExporter.export_for_generation instead: it runs model.generate for you and exports each stage.
config (ExportConfigMixin) — Backend-specific configuration.

Export the model and return the backend-specific program object.

export_for_generation

( model: PreTrainedModel, sample_inputs: MutableMapping[str, torch.Tensor | Cache], config: ExportConfigMixin | dict[str, ExportConfigMixin] ) → dict[str, Any]

Parameters

model (PreTrainedModel) — The generative model to export. Must support model.generate(**sample_inputs).
sample_inputs (dict[str, torch.Tensor | Cache]) — Generate kwargs, i.e. what you’d pass to model.generate(**sample_inputs) (typically input_ids + attention_mask, plus any modality inputs like pixel_values / input_features for multi-modal models). Per-stage forward kwargs are captured internally.
config (ExportConfigMixin or dict[str, ExportConfigMixin]) — Backend-specific configuration. Pass a single config to apply to every component, or a dict keyed by component name (e.g. "image_encoder", "language_model", "lm_head", "decode") to override per component. When a dict is passed, all component names must be present in it.

Returns

dict[str, Any]

{component_name: backend_specific_artifact}, with the same keys as decompose_for_generation(). Values are whatever HfExporter.export returns for the concrete backend (ExportedProgram, ONNXProgram, ExecutorchProgramManager).

Decompose a generative model and export each component independently.

Thin wrapper around decompose_for_generation() that calls HfExporter.export on every returned (submodel, forward_inputs) pair. If you need the intermediate (submodel, forward_inputs) pairs (for verification, custom inputs, skipping a stage, …), call decompose_for_generation() directly.
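The wrapper logic, including the single-config versus per-component-dict rule, might look roughly like the sketch below. It assumes the decomposition is a dict mapping component names to (submodel, forward_inputs) pairs; resolve_component_configs and export_each are hypothetical helpers, not part of the library.

```python
def resolve_component_configs(component_names, config):
    """Map every component name to the config it should be exported with."""
    if isinstance(config, dict):
        # Per-component override: every component must have an entry.
        missing = [name for name in component_names if name not in config]
        if missing:
            raise ValueError(f"Missing configs for components: {missing}")
        return {name: config[name] for name in component_names}
    # A single config applies to every component.
    return {name: config for name in component_names}


def export_each(components, config, export_fn):
    """Call export_fn on each (submodel, forward_inputs) pair."""
    configs = resolve_component_configs(list(components), config)
    return {
        name: export_fn(submodel, forward_inputs, configs[name])
        for name, (submodel, forward_inputs) in components.items()
    }


# Toy stand-ins: strings instead of real submodels and tensors.
components = {
    "language_model": ("lm_module", {"input_ids": "prefill ids"}),
    "decode": ("lm_module", {"input_ids": "one token", "past_key_values": "cache"}),
}
artifacts = export_each(components, "shared_config", lambda m, inputs, c: (m, sorted(inputs), c))
print(artifacts)
```

Calling decompose_for_generation() directly corresponds to building the components dict yourself and choosing which entries to pass on.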

validate_environment

( *args, **kwargs )

Check that required_packages are installed and warn on version drift from tested_versions.
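A check of this kind can be sketched with importlib.metadata from the standard library. The function below is a standalone approximation, not the library's implementation: raising ImportError for a missing package, comparing versions by exact string match, and the injectable get_version parameter are all assumptions made for the example.

```python
import warnings
from importlib import metadata


def validate_environment(required_packages, tested_versions, get_version=metadata.version):
    """Raise if a required package is missing; warn on untested versions.

    required_packages: iterable of distribution names.
    tested_versions: dict mapping distribution name to the tested version.
    get_version: lookup function, injectable so the check can be tested.
    """
    for package in required_packages:
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError as error:
            raise ImportError(f"{package} is required for this exporter.") from error

        tested = tested_versions.get(package)
        if tested is not None and installed != tested:
            warnings.warn(
                f"{package} {installed} is installed, but this exporter "
                f"was tested with {tested}."
            )
```

A drift warning is deliberately non-fatal here: an untested version often still works, whereas a missing package never does.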

DynamoConfig

class transformers.exporters.DynamoConfig

( export_format: ExportFormat = <ExportFormat.DYNAMO: 'dynamo'>, dynamic: bool = False, strict: bool = False, dynamic_shapes: dict[str, typing.Any] | None = None, prefer_deferred_runtime_asserts_over_guards: bool = False )

Parameters

dynamic (bool, optional, defaults to False) — Whether to export with dynamic (symbolic) shapes. When True and dynamic_shapes is not set, all tensor dimensions are set to Dim.AUTO automatically.
strict (bool, optional, defaults to False) — Whether to enable strict mode in torch.export. Runs the full symbolic trace and catches more errors, but is slower and more likely to fail on complex models.
dynamic_shapes (dict[str, Any], optional) — Explicit per-input dynamic shape specifications passed to torch.export. Takes precedence over dynamic.
prefer_deferred_runtime_asserts_over_guards (bool, optional, defaults to False) — When True, data-dependent shape guards are emitted as runtime asserts in the exported graph instead of failing the export at trace time when a guard wouldn’t hold across the full symbolic shape range. Most transformer LLMs need this set to True when using fine-grained Dim(min=, max=) bounds. Not needed with dynamic=True / Dim.AUTO, where torch.export infers shape relations instead of verifying them against the user-stated bounds.

Configuration class for exporting models via torch.export.
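The interaction between dynamic and dynamic_shapes described in the parameters can be sketched as a small resolution function. This is a standalone illustration under stated assumptions: DynamoConfigSketch, resolve_dynamic_shapes, and the AUTO string are stand-ins (the real code would use torch.export.Dim.AUTO and actual tensors), and input shapes are given as plain tuples.

```python
from dataclasses import dataclass
from typing import Any, Optional

AUTO = "Dim.AUTO"  # stand-in for torch.export.Dim.AUTO, to avoid importing torch


@dataclass
class DynamoConfigSketch:
    """Hypothetical subset of DynamoConfig, for illustration only."""

    dynamic: bool = False
    dynamic_shapes: Optional[dict[str, Any]] = None


def resolve_dynamic_shapes(config, input_shapes):
    """Return the dynamic-shape spec that would be handed to the exporter."""
    if config.dynamic_shapes is not None:
        # An explicit spec takes precedence over the dynamic flag.
        return config.dynamic_shapes
    if config.dynamic:
        # Mark every dimension of every input as automatically inferred.
        return {
            name: {axis: AUTO for axis in range(len(shape))}
            for name, shape in input_shapes.items()
        }
    # Fully static export.
    return None


shapes = {"input_ids": (1, 16), "attention_mask": (1, 16)}
print(resolve_dynamic_shapes(DynamoConfigSketch(dynamic=True), shapes))
```

With fine-grained bounds, dynamic_shapes would instead carry explicit Dim(min=, max=) objects for selected axes, which is the case where prefer_deferred_runtime_asserts_over_guards becomes relevant.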

OnnxConfig

class transformers.exporters.OnnxConfig

( export_format: ExportFormat = <ExportFormat.ONNX: 'onnx'>, dynamic: bool = False, strict: bool = False, dynamic_shapes: dict[str, typing.Any] | None = None, prefer_deferred_runtime_asserts_over_guards: bool = False, output_path: str | os.PathLike | None = None, opset_version: int | None = None, external_data: bool = True, optimize: bool = True, export_params: bool = True, keep_initializers_as_inputs: bool = False )

Parameters

output_path (str or PathLike, optional) — Output path for the .onnx file. When None (default) the exported model is kept in memory as an ONNXProgram and not written to disk.
opset_version (int, optional) — ONNX opset version to target. Defaults to the latest opset supported by the installed onnxscript version.
external_data (bool, optional, defaults to True) — Store large weight tensors in a separate .onnx_data sidecar file instead of embedding them in the protobuf. Required for models whose weights exceed the 2 GB protobuf limit.
optimize (bool, optional, defaults to True) — Run onnxscript optimisation passes (constant folding, dead-code elimination, …) on the exported graph. Disable for models that hit upstream onnxscript optimiser bugs.
export_params (bool, optional, defaults to True) — Embed model weights in the ONNX graph. Set to False to export a weight-free graph (weights must be supplied at runtime).
keep_initializers_as_inputs (bool, optional, defaults to False) — Expose weight initializers as explicit graph inputs. Required by some older ONNX runtimes (opset < 9).

Configuration class for exporting models to ONNX via torch.onnx.export.

Inherits all fields from DynamoConfig (dynamic, strict, dynamic_shapes, prefer_deferred_runtime_asserts_over_guards).

ExecutorchConfig

class transformers.exporters.ExecutorchConfig

( export_format: ExportFormat = <ExportFormat.EXECUTORCH: 'executorch'>, dynamic: bool = False, strict: bool = False, dynamic_shapes: dict[str, typing.Any] | None = None, prefer_deferred_runtime_asserts_over_guards: bool = False, backend: str = 'xnnpack' )

Parameters

backend (str, optional, defaults to "xnnpack") — Target ExecuTorch backend. Supported values:
- "xnnpack" — CPU inference via the XNNPACK library (default; runs anywhere).
- "cuda" — GPU inference via the ExecuTorch CUDA backend.

Configuration class for exporting models to ExecuTorch format.

Inherits all fields from DynamoConfig (dynamic, strict, dynamic_shapes, prefer_deferred_runtime_asserts_over_guards).
