Testing guide

Like the IREE project in general, IREE tests are divided into a few different components and use different tooling depending on the needs of that component.

Test type        Test                      Build system   Supported platforms
Compiler tests   iree_lit_test             Bazel/CMake    Host
Runtime tests    iree_cc_test              Bazel/CMake    Host/Device
                 iree_native_test          Bazel/CMake    Host/Device
                 iree_hal_cts_test_suite   CMake          Host/Device
Core E2E tests   iree_check_test           Bazel/CMake    Host/Device
                 iree_static_linker_test   CMake          Host/Device

There are also *_test_suite targets that group test targets with the same configuration together.

Compiler tests

Tests for the IREE compilation pipeline are written as lit tests in the same style as MLIR.

By convention, IREE includes tests for

  • printing and parsing of ops in .../IR/test/{OP_CATEGORY}_ops.mlir files
  • folding and canonicalization in .../IR/test/{OP_CATEGORY}_folding.mlir files
  • compiler passes and pipelines in other .../test/*.mlir files

Running a test

For the test iree/compiler/Dialect/VM/Conversion/MathToVM/test/arithmetic_ops.mlir:

With CMake, run this from the build directory:

ctest -R iree/compiler/Dialect/VM/Conversion/MathToVM/test/arithmetic_ops.mlir.test

With Bazel, run this from the repo root:

bazel test //compiler/src/iree/compiler/Dialect/VM/Conversion/MathToVM/test:arithmetic_ops.mlir.test

Writing a test

For advice on writing MLIR compiler tests, see the MLIR testing guide. Tests should be .mlir files in a test directory adjacent to the functionality they are testing. Instead of mlir-opt, use iree-opt, which registers IREE dialects and passes and doesn't register some unnecessary core ones.

As with most parts of the IREE compiler, these should not have a dependency on the runtime.

Configuring the build system

In the Bazel BUILD file, create an iree_lit_test_suite rule. We usually create a single suite that globs all .mlir files in the directory and is named "lit".

load("//build_tools/bazel:iree_lit_test.bzl", "iree_lit_test_suite")

iree_lit_test_suite(
    name = "lit",
    srcs = glob(["*.mlir"]),
    tools = [
        "@llvm-project//llvm:FileCheck",
        "//tools:iree-opt",
    ],
)

There is a corresponding CMake function, calls to which will be generated by our Bazel to CMake converter.

iree_lit_test_suite(
  NAME
    lit
  SRCS
    "arithmetic_ops.mlir"
  DATA
    FileCheck
    iree-opt
)

You can also create a test for a single file with iree_lit_test.

Runtime tests

Tests for the runtime C++ code use the GoogleTest testing framework. They should generally follow the style and best practices of that framework.

Running a test

For the test /runtime/src/iree/base/bitfield_test.cc:

With CMake, run this from the build directory:

ctest -R iree/base/bitfield_test

With Bazel, run this from the repo root:

bazel test //runtime/src/iree/base:bitfield_test

Setting test environments

Parallel testing for ctest can be enabled via the CTEST_PARALLEL_LEVEL environment variable. For example:

export CTEST_PARALLEL_LEVEL=$(nproc)

To use the Vulkan backend as the test driver, you may need to select between a Vulkan implementation from SwiftShader and multiple Vulkan-capable hardware devices. This can be done via environment variables. See the generic Vulkan setup page for details regarding these variables.

For Bazel, you can persist the configuration in user.bazelrc to save typing. For example:

test:vkswiftshader --test_env="LD_LIBRARY_PATH=..."
test:vkswiftshader --test_env="VK_LAYER_PATH=..."
test:vknative --test_env="LD_LIBRARY_PATH=..."
test:vknative --test_env="VK_LAYER_PATH=..."

Then you can use bazel test --config=vkswiftshader to select SwiftShader as the Vulkan implementation. Similarly for other implementations.

Writing a test

For advice on writing tests in the GoogleTest framework, see the GoogleTest primer. The test for a source file foo.cc with build target foo should live in the same directory as foo_test.cc with build target foo_test. You should #include "iree/testing/gtest.h" instead of any of the gtest or gmock headers.

As with all parts of the IREE runtime, these should not have a dependency on the compiler.

Configuring the build system

In the Bazel BUILD file, create a cc_test target with your test file as the source and any necessary dependencies. Usually, you can link in a standard gtest main function. Use iree/testing:gtest_main instead of the gtest_main that comes with gtest.

cc_test(
    name = "arena_test",
    srcs = ["arena_test.cc"],
    deps = [
        ":arena",
        "//iree/testing:gtest_main",
    ],
)

We have created a corresponding CMake function iree_cc_test that mirrors the Bazel rule's behavior. Our Bazel to CMake converter should generally derive the CMakeLists.txt file from the BUILD file:

iree_cc_test(
  NAME
    arena_test
  SRCS
    "arena_test.cc"
  DEPS
    ::arena
    iree::testing::gtest_main
)

There are also more specialized test targets, such as iree_hal_cts_test_suite, which is designed to test specific runtime support with template configurations; it is supported only by CMake, not by Bazel rules.

IREE core end-to-end (e2e) tests

Here "end-to-end" means from the input accepted by the IREE core compiler (dialects like TOSA, StableHLO, Linalg) to execution using the IREE runtime components. It does not include tests of the integrations with ML frameworks (e.g. TensorFlow, PyTorch) or bindings to other languages (e.g. Python).

We avoid using the more traditional lit tests used elsewhere in the compiler for runtime execution tests. Lit tests require running the compiler tools on the test platform through shell or python scripts that act on files from a local file system. On platforms like Android, the web, and embedded systems, each of these features is either not available or is severely limited.

Instead, to test these flows we use a custom framework called check. The check framework compiles test programs on the host machine into standalone test binary files that can be pushed to test devices (such as Android phones) where they run with gtest style assertions (e.g. check.expect_almost_eq(lhs, rhs)).

Building e2e tests

The files needed by these tests are not built by default with CMake. You'll need to build the special iree-test-deps target to generate test files prior to running CTest (from the build directory):

cmake --build . --target iree-test-deps

Because of their dependencies, the e2e model tests in generated_e2e_model_tests.cmake require configuring CMake with -DIREE_BUILD_E2E_TEST_ARTIFACTS=ON. Also see IREE Benchmark Suite Prerequisites for required packages.

Running a test

For the test tests/e2e/stablehlo_ops/floor.mlir compiled for the VMVX target backend and running on the local-task driver (in principle there's a many-to-many mapping from target backends to drivers):

With CMake, run this from the build directory:

ctest -R tests/e2e/stablehlo_ops/check_vmvx_local-task_floor.mlir

With Bazel, run this from the repo root:

bazel test tests/e2e/stablehlo_ops:check_vmvx_local-task_floor.mlir

Setting test environments

Similarly, you can use environment variables to select Vulkan implementations for running tests as explained in the Runtime tests section.

Writing a test

These tests live in tests/e2e. A single test consists of a .mlir source file specifying an IREE module where each exported function takes no inputs and returns no results and corresponds to a single test case.

As an example, here are some tests for the MHLO floor operation:

func.func @tensor() {
  %input = util.unfoldable_constant dense<[0.0, 1.1, 2.5, 4.9]> : tensor<4xf32>
  %result = "mhlo.floor"(%input) : (tensor<4xf32>) -> tensor<4xf32>
  check.expect_almost_eq_const(%result, dense<[0.0, 1.0, 2.0, 4.0]> : tensor<4xf32>): tensor<4xf32>
  return
}

func.func @scalar() {
  %input = util.unfoldable_constant dense<101.3> : tensor<f32>
  %result = "mhlo.floor"(%input) : (tensor<f32>) -> tensor<f32>
  check.expect_almost_eq_const(%result, dense<101.0> : tensor<f32>): tensor<f32>
  return
}

func.func @double() {
  %input = util.unfoldable_constant dense<11.2> : tensor<f64>
  %result = "mhlo.floor"(%input) : (tensor<f64>) -> tensor<f64>
  check.expect_almost_eq_const(%result, dense<11.0> : tensor<f64>): tensor<f64>
  return
}

func.func @negative() {
  %input = util.unfoldable_constant dense<-1.1> : tensor<f32>
  %result = "mhlo.floor"(%input) : (tensor<f32>) -> tensor<f32>
  check.expect_almost_eq_const(%result, dense<-2.0> : tensor<f32>): tensor<f32>
  return
}

Test cases are created in gtest for each public function exported by the module.

Note the use of util.unfoldable_constant to specify test constants. If we were to use a regular constant, the compiler would fold everything away at compile time and our test would not actually exercise the runtime. unfoldable_constant adds a barrier that prevents folding. To prevent folding/constant propagation of an arbitrary SSA value, you can use util.optimization_barrier.

Next, we use this input constant to exercise the runtime feature under test (in this case, just a single floor operation). Finally, we use a check dialect operation to make an assertion about the output. There are a few different assertion operations. Here we use the expect_almost_eq_const op: almost because we are comparing floats and want to allow for floating-point imprecision, and const because we want to compare against a constant value. This last part is just syntactic sugar around:

%expected = arith.constant dense<101.0> : tensor<f32>
check.expect_almost_eq(%result, %expected) : tensor<f32>
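
The "almost" comparison is the familiar tolerance-based float check. As a rough Python analogy (this is an illustration only, not the runtime's actual implementation, whose tolerance policy may differ):

```python
import math

# Rough analogy for expect_almost_eq: compare floats within a tolerance
# rather than exactly. The runtime's actual tolerances may differ.
def almost_eq(lhs: float, rhs: float, rel_tol: float = 1e-5) -> bool:
    return math.isclose(lhs, rhs, rel_tol=rel_tol)

print(almost_eq(101.0, 101.0000001))  # True: within tolerance
print(almost_eq(101.0, 102.0))        # False: clearly different
```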

The output of running this test looks like:

[==========] Running 4 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 4 tests from module
[ RUN      ] module.tensor
[       OK ] module.tensor (76 ms)
[ RUN      ] module.scalar
[       OK ] module.scalar (79 ms)
[ RUN      ] module.double
[       OK ] module.double (55 ms)
[ RUN      ] module.negative
[       OK ] module.negative (54 ms)
[----------] 4 tests from module (264 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 1 test suite ran. (264 ms total)
[  PASSED  ] 4 tests.

The "module" name for the test suite comes from the default name for an implicit MLIR module. To give the test suite a more descriptive name, use an explicit named top-level module in this file.

Configuring the build system

A single .mlir source file can be turned into a test target with the iree_check_test Bazel macro (and corresponding CMake function).

load("//build_tools/bazel:iree_check_test.bzl", "iree_check_test")

iree_check_test(
    name = "check_vmvx_local-task_floor.mlir",
    src = "floor.mlir",
    driver = "local-task",
    target_backend = "vmvx",
)

The target naming convention is "check_{target_backend}_{driver}_{src}". The generated test will automatically be tagged with a "driver=local-task" tag, which can help filter tests by driver (especially when many tests are generated, as below).
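
The naming convention can be read as a simple string template. A hypothetical helper (the real macros compute names internally; this is only an illustration):

```python
# Hypothetical illustration of the "check_{target_backend}_{driver}_{src}"
# naming convention described above.
def check_test_target(backend: str, driver: str, src: str) -> str:
    return f"check_{backend}_{driver}_{src}"

print(check_test_target("vmvx", "local-task", "floor.mlir"))
# check_vmvx_local-task_floor.mlir
```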

Usually we want to create a suite of tests across many backends and drivers. This can be accomplished with additional macros. For a single backend/driver pair:

load("//build_tools/bazel:iree_check_test.bzl", "iree_check_single_backend_test_suite")

iree_check_single_backend_test_suite(
    name = "check_vmvx_local-task",
    srcs = glob(["*.mlir"]),
    driver = "local-task",
    target_backend = "vmvx",
)

This will generate a separate test target for each file in srcs with a name following the convention above as well as a Bazel test_suite called "check_vmvx_local-task" that will run all the generated tests.

You can also generate suites across multiple pairs:

load("//build_tools/bazel:iree_check_test.bzl", "iree_check_test_suite")

iree_check_test_suite(
    name = "check",
    srcs = ["success.mlir"],
    # Leave this argument off to run on all supported backend/driver pairs.
    target_backends_and_drivers = [
        ("vmvx", "local-task"),
        ("vulkan-spirv", "vulkan"),
    ],
)

This will create a test per source file and backend/driver pair, a test suite per backend/driver pair, and a test suite, "check", that will run all the tests.

The CMake functions follow a similar pattern. The calls to them are generated in our CMakeLists.txt file by bazel_to_cmake.

There are other test targets that generate tests based on template configuration and platform detection, such as iree_static_linker_test. Those targets are not supported by Bazel rules at this point.

External test suite

An out-of-tree test suite is under development at nod-ai/SHARK-TestSuite for collections of generated tests and machine learning models that are too large to fit into the main git repository.

Testing these programs follows several stages:

Import (offline) → Compile → Run

This particular test suite treats importing (e.g. from ONNX, PyTorch, or TensorFlow) as an offline step and contains test cases organized into folders of programs, inputs, and expected outputs:

Sample test case directory:

test_case_name/
  model.mlir
  input_0.npy
  output_0.npy
  test_data_flags.txt

Sample test_data_flags.txt:

--input=@input_0.npy
--expected_output=@output_0.npy

Note: many model, input, and output files are too large to store directly in Git, so the external test suite also uses Git LFS and cloud storage.

Each test case can be run using a sequence of commands like:

iree-compile model.mlir {flags} -o model.vmfb
iree-run-module --module=model.vmfb --flagfile=test_data_flags.txt

To run slices of the test suite, a pytest runner is included that can be configured using JSON files. The JSON files tested in the IREE repo itself are stored in build_tools/pkgci/external_test_suite/.

For example, here is part of a config file for running ONNX tests on CPU:

build_tools/pkgci/external_test_suite/onnx_cpu_llvm_sync.json
{
  "config_name": "cpu_llvm_sync",
  "iree_compile_flags": [
    "--iree-hal-target-backends=llvm-cpu"
  ],
  "iree_run_module_flags": [
    "--device=local-sync"
  ],
  "skip_compile_tests": [
    "test_dequantizelinear",
    "test_slice_default_axes"
  ],
  "skip_run_tests": [],
  "expected_compile_failures": [
    "test_acos",
    "test_acos_example",
    "test_acosh",
    "test_acosh_example",
    "test_adagrad",
    "test_adagrad_multiple",
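
A minimal sketch of how a runner might consume such a config (hypothetical; the actual logic lives in the suite's pytest runner and conftest.py):

```python
import json

# Hypothetical, trimmed-down config in the same shape as the file above.
config = json.loads("""
{
  "config_name": "cpu_llvm_sync",
  "iree_compile_flags": ["--iree-hal-target-backends=llvm-cpu"],
  "iree_run_module_flags": ["--device=local-sync"],
  "skip_compile_tests": ["test_dequantizelinear"],
  "expected_compile_failures": ["test_acos"]
}
""")

def should_compile(test_name: str) -> bool:
    # Hypothetical policy: skip lists drop tests entirely, while tests in
    # expected_compile_failures still run and are expected to fail.
    return test_name not in config["skip_compile_tests"]

print(should_compile("test_acos"))              # True (runs, expected to fail)
print(should_compile("test_dequantizelinear"))  # False (skipped)
```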

Adding new test cases

To add new test cases to the external test suite:

  1. Import the programs you want to test into MLIR. This can be done manually or using automation. Prefer to automate, or at least document, the process so test cases can be regenerated later.
  2. Construct sets of inputs and expected outputs (as .npy or .bin files). These can be manually authored or imported by running the program through a reference backend.
  3. Group the program, inputs, and outputs together using a flagfile.
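
Steps 2 and 3 can be sketched with only the Python standard library (file names and tensor contents here are hypothetical; real cases typically produce .npy files with numpy.save, and the exact flag syntax is documented by iree-run-module --help):

```python
import struct

# Hypothetical step 2: serialize float32 input/output tensors as raw .bin
# files. (.npy files would normally be written with numpy.save instead.)
inputs = [0.0, 1.1, 2.5, 4.9]       # hypothetical input tensor contents
with open("input_0.bin", "wb") as f:
    f.write(struct.pack(f"<{len(inputs)}f", *inputs))

expected = [0.0, 1.0, 2.0, 4.0]     # hypothetical expected outputs
with open("output_0.bin", "wb") as f:
    f.write(struct.pack(f"<{len(expected)}f", *expected))

# Hypothetical step 3: group them in a flagfile. Raw binary files need an
# explicit shape/type prefix, unlike the self-describing .npy format.
with open("test_data_flags.txt", "w") as f:
    f.write("--input=4xf32=@input_0.bin\n")
    f.write("--expected_output=4xf32=@output_0.bin\n")
```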

To start running new test cases:

  1. Bump the commit of the test suite that is used in IREE's .github/workflows/ files.
  2. Add new pytest invocations and/or config files that run the new tests.

Usage from other projects

The external test suite only needs iree-compile and iree-run-module to run, so it is well suited for use in downstream projects that implement plugins for IREE. The conftest.py file can also be forked (or bypassed entirely) to further customize the test runner behavior.