One of the most useful features of pytest is its ability to generate test cases with the parametrize marker. Like table-driven tests in other languages and frameworks, it’s often really useful to run the same test over multiple test cases to make sure you’re doing things correctly.

from typing import Any

import pytest


def combine(*stuff: Any) -> Any:
    """Combine the things!"""
    # stand-in for whatever super-complicated function you actually want to test
    return super_complicated_algorithm(*stuff)



@pytest.mark.parametrize(
    "args,exp_result",
    [
        pytest.param([1, 2], 3, id="ints"),
        pytest.param([1, 2, 3], 6, id="more ints"),
        pytest.param([0.1, 0.2, 0.3], 0.6, id="floats"),
        pytest.param([i for i in range(10000)], 49995000, id="bunch of ints"),
        pytest.param(["hello", " ", "world", "!"], "hello world!", id="strings"),
    ],
)
def test_combine_stuff(args: list[Any], exp_result: Any) -> None:
    actual = combine(*args)
    match exp_result:
        case float():
            buffer = 0.0000000001
            assert abs(actual - exp_result) <= buffer
        case _:
            assert actual == exp_result

This gives us some really nice output, and we can rest easy knowing that we successfully created 5 different test cases:

$ poetry run pytest
...
collected 5 items

test.py::test_combine_stuff[ints] PASSED
test.py::test_combine_stuff[more ints] PASSED
test.py::test_combine_stuff[floats] PASSED
test.py::test_combine_stuff[bunch of ints] PASSED
test.py::test_combine_stuff[strings] PASSED

=========== 5 passed in 0.01s ====================

Let’s take a closer look at what’s going on here…

The pytest Metafunc object

First, let’s break this down into an incredibly simple example: a test parametrized over 10 different integers that does nothing but check each one’s type:

import pytest

numbers = [i for i in range(10)]


@pytest.mark.parametrize("number", numbers, ids=numbers)
def test_show_me_a_number(number: int) -> None:
    assert type(number) is int, "what did you expect?"

The output here is stupidly simple:

$ poetry run pytest
...
collected 10 items

test.py::test_show_me_a_number[0] PASSED
test.py::test_show_me_a_number[1] PASSED
test.py::test_show_me_a_number[2] PASSED
test.py::test_show_me_a_number[3] PASSED
test.py::test_show_me_a_number[4] PASSED
test.py::test_show_me_a_number[5] PASSED
test.py::test_show_me_a_number[6] PASSED
test.py::test_show_me_a_number[7] PASSED
test.py::test_show_me_a_number[8] PASSED
test.py::test_show_me_a_number[9] PASSED

========= 10 passed in 0.01s ===========

How might we recreate this pattern without using the decorator? Enter the Metafunc object baked into pytest. Ditching the decorator for a pytest_generate_tests hook function makes the code a little bit uglier, but as we peek under the hood, we start to see some of the magic that makes pytest such a powerful test driver.

import pytest


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "number"
    if marker in metafunc.fixturenames:
        numbers = [i for i in range(10)]
        metafunc.parametrize(marker, numbers, ids=numbers)

def test_show_me_a_number(number: int) -> None:
    assert type(number) is int, "what did you expect?"

And the output looks exactly the same!

$ poetry run pytest
...
collected 10 items

test.py::test_show_me_a_number[0] PASSED
test.py::test_show_me_a_number[1] PASSED
test.py::test_show_me_a_number[2] PASSED
test.py::test_show_me_a_number[3] PASSED
test.py::test_show_me_a_number[4] PASSED
test.py::test_show_me_a_number[5] PASSED
test.py::test_show_me_a_number[6] PASSED
test.py::test_show_me_a_number[7] PASSED
test.py::test_show_me_a_number[8] PASSED
test.py::test_show_me_a_number[9] PASSED

========== 10 passed in 0.01s ==========

What’s happening here is that pytest takes an iterable, in this case the numbers list, and generates a unique test case for each item as it iterates through. The marker string we check against metafunc.fixturenames is really just an argument name: any test function that accepts an argument with that name gets parametrized.

In addition, we pass another iterable (in this case the same as the first) as ids to keep track of individual test cases. If we replaced ids=numbers with ids=[f"#{n}" for n in numbers], the test ids would be test_show_me_a_number[#0], test_show_me_a_number[#1], test_show_me_a_number[#2], etc.
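A quick sketch of that variation:

import pytest


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "number"
    if marker in metafunc.fixturenames:
        numbers = [i for i in range(10)]
        # same values as before, but each test id gets a "#" prefix
        metafunc.parametrize(marker, numbers, ids=[f"#{n}" for n in numbers])


def test_show_me_a_number(number: int) -> None:
    assert type(number) is int, "what did you expect?"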

Now that we’re no longer limited to the decorator function, let’s start getting a little more creative. For example, what if instead of parametrizing static values, we start using this functionality to dynamically generate tests based on random inputs?

For example, we could generate tests based on the time of day, which is definitely a good idea and something we should try:

from datetime import datetime

import pytest


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "number"
    if marker in metafunc.fixturenames:
        numbers = [i for i in range(5) if datetime.now().microsecond % 2 == 0]
        metafunc.parametrize(marker, numbers, ids=numbers)


def test_show_me_a_number(number: int) -> None:
    assert type(number) is int, "what did you expect?"

Let’s run this one a couple of times.

$ poetry run pytest
...
collected 3 items

test.py::test_show_me_a_number[0] PASSED
test.py::test_show_me_a_number[1] PASSED
test.py::test_show_me_a_number[2] PASSED

========== 3 passed in 0.01s ===========

And again?

$ poetry run pytest
...
collected 2 items

test.py::test_show_me_a_number[3] PASSED
test.py::test_show_me_a_number[4] PASSED

========== 2 passed in 0.01s ===========

How about one more time?

$ poetry run pytest
...
collected 3 items

test.py::test_show_me_a_number[0] PASSED
test.py::test_show_me_a_number[2] PASSED
test.py::test_show_me_a_number[3] PASSED

========== 3 passed in 0.01s ===========

Okay, this is weird, but we’re starting to see the underlying power. With this pattern we now have the ability to generate any test we want, determined by arbitrary inputs. While the above scenario is pretty useless from a software engineering perspective, there are some genuinely valuable applications for this sort of pattern in production.
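As a slightly more production-flavored sketch, the same hook could parametrize against whatever entries happen to exist in a config file. The targets.json file, its contents, and the test_target_is_reachable test below are purely hypothetical:

import json
from pathlib import Path

import pytest


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "target"
    if marker in metafunc.fixturenames:
        # hypothetical config file containing a JSON list of target names
        targets = json.loads(Path("targets.json").read_text())
        metafunc.parametrize(marker, targets, ids=targets)


def test_target_is_reachable(target: str) -> None:
    # placeholder assertion; a real test would actually ping or query the target
    assert target, "target must be a non-empty string"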

Generate tests by command-line input

One fun application is gathering dynamic input from the command line and parametrizing the test against multiple targets. Say you’re testing a web API that serves a semantically-versioned C++ binary built for multiple platforms. Using metafunc, we can retrieve the semver string from the command line and run the same test against each platform.

This one gets a little more complicated since we need to register our command-line flag in a conftest.py file, but the code is still pretty straightforward.

Here’s the conftest.py command-line parser code:

# conftest.py

import pytest

# needs to be in conftest so the parser picks it up before test collection
def pytest_addoption(parser: pytest.Parser):
    parser.addoption(
        "--semver",
        action="store",
        required=False,
        help="specify the semver",
    )


# let's put this in conftest too while we're at it
def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "semver"
    platforms = ["linux_x86_32", "linux_x86_64", "win32", "win64", "osx-aarch"]
    if marker in metafunc.fixturenames:
        if version := metafunc.config.getoption("--semver"):
            binaries = [f"{version}-{platform}" for platform in platforms]
            metafunc.parametrize(marker, binaries, ids=binaries)

And the test itself:

from binary_getter import BinaryGetter


def test_get_binary(semver: str) -> None:
    assert BinaryGetter.get_binary(semver), f"unable to download {semver}"

Using the --semver flag, we can now verify that we can download the same binary version on a bunch of different platforms:

$ poetry run pytest --semver 3.3.7rc1
...
collected 5 items

test.py::test_get_binary[3.3.7rc1-linux_x86_32] PASSED
test.py::test_get_binary[3.3.7rc1-linux_x86_64] PASSED
test.py::test_get_binary[3.3.7rc1-win32] PASSED
test.py::test_get_binary[3.3.7rc1-win64] PASSED
test.py::test_get_binary[3.3.7rc1-osx-aarch] PASSED

================ 5 passed in 0.01s ====================


$ poetry run pytest --semver 1.2.3
...
collected 5 items

test.py::test_get_binary[1.2.3-linux_x86_32] PASSED
test.py::test_get_binary[1.2.3-linux_x86_64] PASSED
test.py::test_get_binary[1.2.3-win32] PASSED
test.py::test_get_binary[1.2.3-win64] PASSED
test.py::test_get_binary[1.2.3-osx-aarch] PASSED

================ 5 passed in 0.01s ====================

Using this pattern we can verify any release without needing to adjust the source code! We could even adjust the command line flags to accept multiple arguments or additional platforms to start really generating a lot of tests.
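A minimal sketch of what the multi-version variation of the conftest might look like, keeping the same platform list; the "append" action simply lets the flag be passed more than once (e.g. --semver 1.2.3 --semver 3.3.7rc1):

# conftest.py (variation)

import pytest


def pytest_addoption(parser: pytest.Parser):
    parser.addoption(
        "--semver",
        action="append",
        default=[],
        help="specify one or more semvers",
    )


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "semver"
    platforms = ["linux_x86_32", "linux_x86_64", "win32", "win64", "osx-aarch"]
    if marker in metafunc.fixturenames:
        if versions := metafunc.config.getoption("--semver"):
            # cross every requested version with every platform
            binaries = [f"{v}-{p}" for v in versions for p in platforms]
            metafunc.parametrize(marker, binaries, ids=binaries)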

This sort of pattern is super useful if the person running the test isn’t necessarily the developer who wrote or maintains the test suite, or for CI jobs where the output of a build job gets piped immediately into a test suite without needing to touch source or otherwise intervene manually.

Parametrize a fixture for cleaner, more atomic test definitions

One of the simplest and most obvious applications for metafunc is using it to process parametrized arguments so tests stay atomic and reusable. We’ve all written an e2e test that takes the “kitchen sink” approach and crams every check into a single function:

from pathlib import Path

import pytest

from utility_library import preprocess, validate
from main_library import FileProcessor, FilePublisher, FileRetriever
from reports import Datafile, Report, Redacted
from company_secrets import disallowed_words


@pytest.mark.parametrize(
    "raw_file",
    [
        "/path/to/data1.json",
        "/path/to/data2.json",
        "/path/to/data3.json",
        "/path/to/data4.json",
        "/path/to/data5.json",
        "/path/to/data6.json",
    ],
)
def test_file_processor_and_publisher_and_retriever(raw_file: str):
    # set up some preconditions
    test_data = Path(raw_file)
    assert test_data.exists(), "must provide valid test data file"
    starting_point: Datafile = preprocess(test_data)
    assert validate(starting_point), "data can't be validated"
    fp = FileProcessor(starting_point)

    # verify protobuf functionality
    proto = fp.convert_to_protobuf()
    assert proto, "unable to convert to protobuf"

    # verify report-making capability
    report: Report = fp.make_report()
    assert report, "unable to process the report"
    assert report.passes_validation(), "report is invalid"
    redacted: Redacted = report.redacted()
    assert redacted, "unable to redact the report"
    for word in disallowed_words:
        assert word not in str(redacted), "we leaked company secrets!"

    # verify we can publish the report
    publisher = FilePublisher()
    assert publisher.publish(report), "unable to publish report"
    assert publisher.publish(redacted), "unable to publish redacted report"

    retriever = FileRetriever()
    retrieved_report = retriever.retrieve(report.name)
    assert retrieved_report == report, "unable to download retrieved report"
    retrieved_redacted = retriever.retrieve(redacted.name)
    assert retrieved_redacted == redacted, "unable to download redacted report"

If you even managed to read the whole test, you would see that while this is probably a really thorough test that checks everything it needs to, it’s both fragile and hard to extend. What happens if publishing the report fails, for example? The test stops at that assertion, the FileRetriever code never runs, and we won’t learn about any bugs there until the FilePublisher code gets fixed first.

Without knowledge of metafunc, the “simple” solution is to split the test into separate functions and apply the parametrize decorator to each one, as sketched below. This helps avoid missing regressions in FileRetriever, but we can still do better.
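Here’s roughly what that looks like, sketched against the same hypothetical libraries; notice the parametrize list and the setup code getting duplicated on every test:

from pathlib import Path

import pytest

from utility_library import preprocess
from main_library import FileProcessor

RAW_FILES = [
    "/path/to/data1.json",
    "/path/to/data2.json",
    # ...and so on for the rest of the data files
]


@pytest.mark.parametrize("raw_file", RAW_FILES)
def test_protobuf_functionality(raw_file: str):
    fp = FileProcessor(preprocess(Path(raw_file)))
    assert fp.convert_to_protobuf(), "unable to convert to protobuf"


@pytest.mark.parametrize("raw_file", RAW_FILES)
def test_make_report(raw_file: str):
    report = FileProcessor(preprocess(Path(raw_file))).make_report()
    assert report, "unable to process the report"
    assert report.passes_validation(), "report is invalid"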

Instead, if we use metafunc in combination with other fixtures, we can DRY up our code and make it much easier to write new test cases as needed. In this case we may even want to split all of the fixtures into their own conftest to make the test file even cleaner to look at:

# conftest.py

from pathlib import Path

import pytest

from utility_library import preprocess, validate
from main_library import FileProcessor, FilePublisher, FileRetriever
from reports import Datafile, Report, Redacted


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "datafile"
    if marker in metafunc.fixturenames:
        raw_files = [
            "/path/to/data1.json",
            "/path/to/data2.json",
            "/path/to/data3.json",
            "/path/to/data4.json",
            "/path/to/data5.json",
            "/path/to/data6.json",
        ]
        # do pre-processing here
        test_data = [Path(raw_file) for raw_file in raw_files]
        datafiles: list[Datafile] = []
        for td in test_data:
            assert td.exists(), "must provide valid test data file"
            datafile = preprocess(td)
            assert validate(datafile), "data can't be validated"
            datafiles.append(datafile)

        # test ids need to be simple values like strings, so use the file names
        metafunc.parametrize(
            marker, datafiles, ids=[td.name for td in test_data], scope="class"
        )


@pytest.fixture(scope="class")
def file_processor(datafile: Datafile) -> FileProcessor:
    return FileProcessor(datafile)


@pytest.fixture(scope="class")
def report(file_processor: FileProcessor) -> Report:
    return file_processor.make_report()


@pytest.fixture(scope="class")
def redacted(report: Report) -> Redacted:
    return report.redacted()


@pytest.fixture(scope="class")
def publisher() -> FilePublisher:
    return FilePublisher()


@pytest.fixture(scope="class")
def retriever() -> FileRetriever:
    return FileRetriever()

Now, any test or fixture that uses the datafile fixture as an argument will be parametrized by whatever inputs we define in the conftest. Since the file_processor fixture calls datafile, it will get the parametrization treatment and get re-instantiated (and tested) for each test case. The same goes for report which calls file_processor, and by extension redacted which calls report. The final two fixtures publisher and retriever only get instantiated once because they aren’t involved in any of the parametrization.

We should take a minute to visit the scope="class" argument we add to the metafunc.parametrize() call. Just like with regular fixtures, this means that while the fixtures are re-instantiated for each item in datafiles, they aren’t re-instantiated more than once per class for a given item. This essentially lets us reuse the same objects (and any underlying file handles) for the entire test class to reduce file I/O and improve test speed. Of course we could pick any of the standard scopes ("function", "class", "module", "package", or "session"), and if we were concerned about side effects and leaking state, we would likely want to parametrize using scope="function".

Now that the conftest is complete, we’re left with a very clean and atomic test file. So clean, in fact, that we don’t even need to import pytest, since all of the test configuration has been pushed outside of the main test file.

# test.py

from main_library import FileProcessor, FilePublisher, FileRetriever
from reports import Report, Redacted
from company_secrets import disallowed_words

class TestFileProcessorAndPublisherAndRetriever:
    def test_protobuf_functionality(self, file_processor: FileProcessor):
        proto = file_processor.convert_to_protobuf()
        assert proto, "unable to convert to protobuf"

    def test_make_report(self, report: Report):
        assert report, "unable to process the report"
        assert report.passes_validation(), "report is invalid"

    def test_redacted(self, redacted: Redacted):
        assert redacted, "unable to redact the report"
        for word in disallowed_words:
            assert word not in str(redacted), "we leaked company secrets!"

    def test_publish_and_retrieve_report(
        self, publisher: FilePublisher, retriever: FileRetriever, report: Report
    ):
        retrieved_report = retriever.retrieve(report.name)
        assert publisher.publish(report), "unable to publish report"
        assert retrieved_report == report, "unable to download retrieved report"

    def test_publish_and_retrieve_redacted(
        self, publisher: FilePublisher, retriever: FileRetriever, redacted: Redacted
    ):
        assert publisher.publish(redacted), "unable to publish redacted report"
        retrieved_redacted = retriever.retrieve(redacted.name)
        assert retrieved_redacted == redacted, "unable to download redacted report"

The beauty of this approach is that pretty much anybody can see what these tests are doing and what they are validating. If we want to write more tests or tweak the acceptance criteria, it’s very easy to do without breaking other functionality. And if we want to change the list of data input files to test against, we can simply adjust the items in the list inside our conftest file.
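We could even stop hard-coding the list entirely. Here’s a minimal sketch of that conftest variation, assuming the data files live in a hypothetical tests/data directory:

# conftest.py (variation)

from pathlib import Path

import pytest

from utility_library import preprocess, validate
from reports import Datafile


def pytest_generate_tests(metafunc: pytest.Metafunc):
    marker = "datafile"
    if marker in metafunc.fixturenames:
        # pick up every JSON file in the (hypothetical) test data directory
        test_data = sorted(Path("tests/data").glob("*.json"))
        datafiles: list[Datafile] = []
        for td in test_data:
            datafile = preprocess(td)
            assert validate(datafile), "data can't be validated"
            datafiles.append(datafile)
        metafunc.parametrize(
            marker, datafiles, ids=[td.name for td in test_data], scope="class"
        )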

Looking for more?

We’re just starting to scratch the surface of pytest functionality, but this is a good look at some of its more powerful applications, and a reason that many people (like myself) use Python and pytest to test targets that aren’t even written in Python. As an integration/e2e test framework for exercising any API that Python can connect to, pytest is unrivaled in its extensibility and its ability to quickly generate many tests, thanks to Python’s flexibility and robust metaprogramming support.

Want more info about metafunc? Check out the source code yourself!