
TFLite Integration

IREE supports compiling and running TensorFlow Lite programs stored as TFLite FlatBuffers. These files can be imported into an IREE-compatible format and then compiled for one or more backends.


Install the TensorFlow Lite-specific dependencies using pip:

python -m pip install \
  iree-compiler \
  iree-runtime \
  iree-tools-tflite

Importing and Compiling

IREE's tooling is divided into two components: import and compilation.

  1. The import tool converts the TFLite FlatBuffer to an IREE-compatible form containing a combination of TOSA and IREE operations, and validates that only IREE-compatible operations remain.
  2. The compilation stage generates a bytecode module for a list of targets, which IREE can then execute.

Using Command Line Tools

These two stages can be completed entirely via the command line.


# Fetch the sample model

# Import the sample model to an IREE compatible form
iree-import-tflite ${TFLITE_PATH} -o ${IMPORT_PATH}

# Compile for the CPU backend
iree-compile \
    --iree-input-type=tosa \
    --iree-hal-target-backends=llvm-cpu \
    ${IMPORT_PATH} \
    -o ${MODULE_PATH}
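
The same two commands can also be driven from a script. The sketch below is only one way to do that: it builds the argument lists for `iree-import-tflite` and `iree-compile` using exactly the flags shown above and runs them with `subprocess`. The function names (`import_command`, `compile_command`, `import_and_compile`) are hypothetical helpers introduced for illustration, and the code assumes both tools are available on `PATH`.

```python
import subprocess


def import_command(tflite_path, import_path):
    """Argument list for the import stage (mirrors the command above)."""
    return ["iree-import-tflite", tflite_path, "-o", import_path]


def compile_command(import_path, module_path, backend="llvm-cpu"):
    """Argument list for the compilation stage (mirrors the command above)."""
    return [
        "iree-compile",
        "--iree-input-type=tosa",
        f"--iree-hal-target-backends={backend}",
        import_path,
        "-o",
        module_path,
    ]


def import_and_compile(tflite_path, import_path, module_path, backend="llvm-cpu"):
    """Run both stages in order; raises CalledProcessError if either fails."""
    # Assumption: both command line tools are installed and on PATH.
    subprocess.run(import_command(tflite_path, import_path), check=True)
    subprocess.run(compile_command(import_path, module_path, backend), check=True)
```

Keeping the argument lists in separate functions makes it easy to log or inspect the exact command before running it.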

Using the Python API

The example below demonstrates downloading, compiling, and executing a TFLite model using the Python API. This includes some initial setup to declare global variables, download the sample model, and download the sample inputs.

First, import all required libraries and declare absolute paths for the working files. The default setup uses the CPU backend as the only target. This can be reconfigured to select alternative targets.

import iree.compiler.tflite as iree_tflite_compile
import iree.runtime as iree_rt
import numpy
import os
import urllib.request

from PIL import Image

workdir = "/tmp/workdir"
os.makedirs(workdir, exist_ok=True)

tfliteFile = "/".join([workdir, "model.tflite"])
jpgFile = "/".join([workdir, "input.jpg"])
tfliteIR = "/".join([workdir, "tflite.mlir"])
tosaIR = "/".join([workdir, "tosa.mlir"])
bytecodeModule = "/".join([workdir, "iree.vmfb"])

backends = ["llvm-cpu"]
config = "local-task"

The TFLite sample model and input are downloaded locally.

tfliteUrl = ""
jpgUrl = ""

urllib.request.urlretrieve(tfliteUrl, tfliteFile)
urllib.request.urlretrieve(jpgUrl, jpgFile)
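
Repeated runs do not need to download the files again. As a minimal sketch, the hypothetical helper below (`fetch` is not part of IREE or the sample) wraps `urllib.request.urlretrieve`, reuses a file that already exists, and fails early with a clear message when no URL has been configured.

```python
import os
import urllib.request


def fetch(url, dest):
    """Download url to dest unless dest already exists; return dest."""
    if os.path.exists(dest):
        # Reuse the cached copy rather than downloading again.
        return dest
    if not url:
        raise ValueError(f"no URL configured for {dest}")
    urllib.request.urlretrieve(url, dest)
    return dest
```

With this helper, the two download calls above could be written as `fetch(tfliteUrl, tfliteFile)` and `fetch(jpgUrl, jpgFile)`.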

Once downloaded, we can compile the model for the selected backends. Both the TFLite and TOSA representations of the model are saved for debugging purposes. Saving them is optional and can be omitted.
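
A minimal sketch of this compile step follows. It assumes that the `compile_file` entry point of `iree.compiler.tflite` accepts the keyword arguments shown (`input_type`, `output_file`, `target_backends`, `import_only`, `save_temp_tfl_input`, `save_temp_iree_input`); check these names against your installed version. The helper names `compile_options` and `compile_model` are hypothetical. The two `save_temp_*` entries are what keep the intermediate TFLite and TOSA files, so leaving them out skips that debugging output.

```python
def compile_options(bytecode_module, backends, tflite_ir=None, tosa_ir=None):
    """Keyword arguments for the compile call (option names are assumptions)."""
    options = {
        "input_type": "tosa",
        "output_file": bytecode_module,
        "target_backends": list(backends),
        "import_only": False,
    }
    # Optional: keep the intermediate representations for debugging.
    if tflite_ir is not None:
        options["save_temp_tfl_input"] = tflite_ir
    if tosa_ir is not None:
        options["save_temp_iree_input"] = tosa_ir
    return options


def compile_model(tflite_file, **options):
    """Compile a TFLite FlatBuffer with IREE's Python tooling."""
    # Imported lazily so this file can be loaded without IREE installed.
    import iree.compiler.tflite as iree_tflite_compile
    return iree_tflite_compile.compile_file(tflite_file, **options)
```

With the variables declared earlier, the call would look like `compile_model(tfliteFile, **compile_options(bytecodeModule, backends, tfliteIR, tosaIR))`.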


After compilation is completed, we configure the VmModule using the local-task configuration and the compiled IREE module.

config = iree_rt.Config("local-task")
context = iree_rt.SystemContext(config=config)
# Read the compiled bytecode and register it with the runtime context.
with open(bytecodeModule, 'rb') as f:
  vm_module = iree_rt.VmModule.from_flatbuffer(config.vm_instance, f.read())
  context.add_vm_module(vm_module)

Finally, the IREE module is loaded and ready for execution. Here we load the sample image, resize it to the expected input size, and execute the module. By default, TFLite models include a single function named 'main'. The final results are printed.

im = numpy.array(Image.open(jpgFile).resize((192, 192))).reshape((1, 192, 192, 3))
args = [im]

invoke = context.modules.module["main"]
iree_results = invoke(*args)
print(iree_results)


Failures during the import step usually indicate a failure to lower from TensorFlow Lite's operations to TOSA, the intermediate representation used by IREE. Many TensorFlow Lite operations are not fully supported, particularly those that use dynamic shapes. If you hit such a failure, file an issue with IREE's TFLite model support project.
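
When filing such an issue, it helps to attach the exact command and its error output. The sketch below is a generic, hypothetical helper (`run_and_report` is not part of IREE) that runs any command, such as the `iree-import-tflite` invocation shown earlier, and returns a short text report containing the command line, exit code, and stderr.

```python
import shlex
import subprocess


def run_and_report(argv):
    """Run argv and return (ok, report) with the command, exit code, and stderr."""
    result = subprocess.run(argv, capture_output=True, text=True)
    report = "\n".join([
        "command: " + shlex.join(argv),
        f"exit code: {result.returncode}",
        "stderr:",
        result.stderr.strip(),
    ])
    return result.returncode == 0, report
```

A call such as `run_and_report(["iree-import-tflite", tfliteFile, "-o", tosaIR])` would then give text that can be pasted into the issue when `ok` is false.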

Additional Samples

Colab notebooks
Text classification with TFLite and IREE


Issue #3954: Add documentation for an Android demo using the Java TFLite bindings, once it is complete at not-jenni/iree-android-tflite-demo.