Skip to content

Parameterslink

Overviewlink

Parameters in IREE are externalized storage for resources that are asynchronously accessible and device-aware. Parameters offer efficient ways to store, manipulate, and load data for large resources like the weights in a machine learning model.

Without using parameters, compiled programs include both code and data:

graph LR
  accTitle: .vmfb file without using parameters
  accDescr {
    Without using parameters, .vmfb files contain host code, device code,
    small data, and large resources all in the same file.
  }

  subgraph VMFB[".vmfb file"]
    HostCode(Host code)
    DeviceCode(Device code)
    SmallData(Small data)
    LargeResources(Large resources)
  end

Using parameters, data can be stored, transmitted, and loaded from separate sources:

graph BT
  accTitle: .vmfb file using parameters
  accDescr {
    Using parameters, .vmfb files contain host code, device code, small
    constants, and parameters. External .irpa, .safetensors, and .gguf files
    can be linked to these parameters.
  }

  subgraph VMFB[".vmfb file using parameters"]
    HostCode(Host code)
    DeviceCode(Device code)
    SmallData(Small data)
    Parameters("Parameters
    • scope_1::key_1
    • scope_1::key_2
    • scope_2::key_1
    • scope_2::key_2")
  end

  subgraph IRPA[".irpa file"]
    key_1
    key_2
  end

  subgraph Safetensors[".safetensors file"]
    key_1a[key_1]
  end

  subgraph GGUF[".gguf file"]
    key_2a[key_2]
  end

  IRPA -. "scope_1" .-> Parameters
  Safetensors -. "scope_2" .-> Parameters
  GGUF -. "scope_2" .-> Parameters

Note

Notice that parameters are identified by a scope and a unique key within that scope, not strong references to specific file paths. Data from any supported file format or "parameter index provider" can be loaded.

Supported formatslink

IRPAlink

The IREE Parameter Archive (IRPA) file format (iree/schemas/parameter_archive.h) is IREE's own format optimized for deployment. Formats like GGUF and safetensors can be converted to IRPA.

  • Data is always aligned in IRPA files for efficient loading
  • IRPA files contain minimal metadata and are fully hermetic. Buffers are stored as opaque byte range blobs, not as tensors with explicit types and shapes
  • For testing and benchmarking workflows, IRPA files may include a mix of real data and splatted values (repeating patterns with no storage requirements on disk)

GGUFlink

The GGUF format is used by the GGML project and other projects in that ecosystem like llama.cpp.

  • GGUF files are non-hermetic - using them requires knowledge about the settings used to compile GGML in order to interpret the contents of each file (particularly for various quantization formats)
  • GGUF files are aligned, so they should have matching performance with IRPA files

🤗 Safetensorslink

The safetensors format is used by the Hugging Face community.

  • Safetensors files are not naturally aligned to support efficient loading, so using them across runtime devices comes with (possibly severe) performance penalties

Extensibility and other formatslink

The core IREE tools are written in C and aim to be simple and pragmatic, with minimal dependencies. Other formats could be converted into supported file types:

  • PyTorch .pt and .pth files (serialized state dictionaries produced with torch.save)
  • TensorFlow checkpoint (.ckpt, .h5) files or SavedModel / model.keras archives (see the TensorFlow guide)

In-tree formats for file-backed parameters are defined in the iree/io/formats/ folder. Additional formats could be defined out-of-tree to make use of external libraries as needed.

Parameter loading from memory (or a cache, or some other location) is possible by adding new providers implementing iree_io_parameter_provider_t. The default parameter index provider operates on files on local disk.

Working with parameter fileslink

Creating parameter fileslink

The iree-create-parameters tool can create IREE Parameter Archive (.irpa) files. Each parameter in the archive can be created with either a real data value (taking up storage space in the final archive) or a splatted value (zeroed contents or a repeated value, taking up no storage space on disk).

Tip: --help output

For a detailed list of options, pass --help:

$ iree-create-parameters --help

# ============================================================================
# 👻 IREE: iree-create-parameters
# ============================================================================

Creates IREE Parameter Archive (.irpa) files. Provide zero or more
parameter value declarations and an output file with
`--output=file.irpa` to produce a new file with zeroed or patterned
contents.

...
  • Example creating a file with two zeroed embedded parameters and one with a repeating pattern:

    $ iree-create-parameters \
        --data=my.zeroed_param_1=4096xf32 \
        --data=my.zeroed_param_2=2x4096xi16 \
        --data=my.pattern_param_2=8x2xf32=2.1 \
        --output=output_with_storage.irpa
    
  • Example creating a file with splatted values (no storage on disk):

    $ iree-create-parameters \
        --splat=my.splat_param_1=4096xf32=4.1 \
        --splat=my.splat_param_2=2x4096xi16=123 \
        --output=output_without_storage.irpa
    

Parameter archives can also be created using IREE's Python bindings:

import iree.runtime as rt
import numpy as np

parameter_index = rt.ParameterIndex()
parameter_index.add_buffer("weight", np.zeros([32, 16]) + 2.0)
parameter_index.add_buffer("bias", np.zeros([32, 16]) + 0.5)
parameter_index.create_archive_file("parameters.irpa")

See the runtime/bindings/python/tests/io_test.py file for more usage examples.

Converting to the IRPA formatlink

The iree-convert-parameters tool converts supported files into IREE Parameter Archives (.irpa) files.

Tip: --help output

For a detailed list of options, pass --help:

$ iree-convert-parameters --help

# ============================================================================
# 👻 IREE: iree-convert-parameters
# ============================================================================

Converts supported parameter file formats into IREE Parameter Archives
(.irpa) files. Provide one or more input parameter files in the same
form as expected by the iree-run-module tool (`--parameters=foo.gguf`)
and an output file with `--output=file.irpa`.

...
  • Example converting from safetensors to IRPA:

    $ iree-convert-parameters \
        --parameters=input.safetensors \
        --output=output.irpa
    
  • Example mutating parameters:

    $ iree-convert-parameters \
        --parameters=a.gguf \
        --parameters=b.safetensors \
        --exclude=unneeded_param \
        --rename=old_name=new_name \
        --splat=some_name=f32=4.2 \
        --output=ab.irpa
    
  • Example stripping parameters and replacing them with zeros except for one with special handling:

    $ iree-convert-parameters \
        --parameters=input.irpa \
        --strip \
        --splat=special_param=f32=1.0 \
        --output=output.irpa
    

Inspecting parameter fileslink

The iree-dump-parameters tool outputs information about parsed parameter files.

Tip: --help output

For a detailed list of options, pass --help:

$ iree-dump-parameters --help

# ============================================================================
# 👻 IREE: iree-dump-parameters
# ============================================================================

Dumps information about parsed parameter files.

...
  • Example listing all available parameters and their index information:

    $ iree-dump-parameters \
        --parameters=my_scope=my_file.gguf \
        [--parameters=...]
    
  • Example extracting parameter binary contents from a file:

    $ iree-dump-parameters ... \
        --extract=scope::key0=file0.bin \
        [--extract=...]
    

Loading parameters from fileslink

On the command linelink

IREE command line tooling can load parameter files alongside module files:

iree-run-module --module=program.vmfb --parameters=data.irpa ...

For concrete examples, see these test files:

From Pythonlink

See the runtime/bindings/python/tests/io_runtime_test.py file for usage examples.

Using the C APIlink

TODO: iree_io_parameters_module_create() sample code