Making a Mockery of Coding

Lately I’ve been writing a lot of library code for working with streams of data to expand the esig package for Python. This is a Python C++ extension that acts as a bridge between the low-level C++ library libalgebra where computations happen and the Python world where people tend to work with data. The gap between static (compiled) types of libalgebra and C++ templates and the dynamic (runtime) types of Python is about as large as it can be, and ultimately this relies on a giant switch statement and some careful encapsulation.

I had a realisation a few weeks ago that this encapsulation could be made more general, and leave open the possibility of different computational backends other than libalgebra provided they have roughly the same interface. From a library designer’s point of view, this means I need to provide a template for what the interfaces for the various components should look like; this takes the form of abstract classes with (pure) virtual functions that must be implemented. Since I don’t know what the actual implementation will look like, it raises an interesting problem for testing the library.

Luckily, this is a common problem in software testing. The remedy is to mock the parts of the library that might have different implementations. This means inserting a mock class that implements a given interface while testing. This mock class can be interrogated during testing to check whether particular methods were called, and its methods can be given controllable behaviour and side effects.

Rather than jumping straight into mocking in the complicated world of C++, let’s look at what would happen in Python code. Since Python is a dynamic language - everything is a Python object, including the classes that define objects - mocking an object is as simple as injecting an object that “looks like” the object to mock. For the same reason, making an object that “looks like” another is also very easy. In fact, in Python one can make an object that generates its methods on demand, at the moment they are first accessed. This is how the MagicMock class from the Python standard library module unittest.mock works. Let’s see how this might work in an example.
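Before the main example, here is a toy sketch of the "methods generated on demand" idea. It is not how MagicMock is actually implemented; the Recorder class and its calls attribute are hypothetical names invented for illustration, and the only language feature it relies on is the `__getattr__` hook, which Python calls when normal attribute lookup fails.

```python
class Recorder:
    """Toy stand-in (hypothetical) that invents a method whenever one is requested."""

    def __init__(self):
        # (method name, positional args, keyword args), in call order
        self.calls = []

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so `calls` itself is unaffected.
        if name.startswith("__"):
            # Leave special-method lookups alone (copying, pickling, and so on).
            raise AttributeError(name)

        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))

        return method


shape = Recorder()
shape.draw("screen")               # no draw method was ever defined
shape.resize(2, keep_aspect=True)  # nor resize
print(shape.calls)
```

An object like this accepts any method call at all, which is convenient but also hides mistakes such as a misspelled method name; that weakness is what the spec mechanism discussed below addresses.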

Let’s pretend we’re writing some kind of graphics library, that is based on some abstract notion of a shape that can be drawn. To represent this concept, we have an abstract base class called Shape, and some function render that consumes a list of Shape derivatives to represent some kind of drawing operation.

import abc

class Shape(abc.ABC):

    @abc.abstractmethod
    def draw(self, screen):
        ...


def render(shapes, screen):
    for shape in shapes:
        shape.draw(screen)

To test this interface, we need to create a mock shape object that can be passed into the render function. If we simply passed in a plain MagicMock object and the render function called a method that is not part of the Shape interface - i.e. anything other than draw - nothing would fail, since the mock object would simply create the corresponding method. However, if we set a spec for the mock object using the abstract class, any attempt to use a method that doesn’t exist in the Shape interface will fail with an AttributeError. Once we’ve used the mock object, we can make assertions on the mock object and its methods to check that they were called with specific arguments and a specific number of times, and more generally that the mock was used as expected in the render function.

from unittest import mock
import unittest

class TestRender(unittest.TestCase):

    def test_render_function(self):
        mock_shape = mock.MagicMock(spec=Shape)
        
        screen_obj = object()  # stand-in for a real screen; render only passes it through
        render([mock_shape], screen_obj)

        # draw (not render) is the method defined by the Shape spec
        mock_shape.draw.assert_called_once_with(screen_obj)

This is a really simple unittest test case that checks that render calls the draw method of each shape with the screen object we created. Of course, this is the least sophisticated way to use mocks in tests, and doesn’t really do much to demonstrate just how amazing this class really is.
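To hint at what else the class can do, the sketch below exercises three features of unittest.mock: rejection of attributes outside the spec, independent call records for several mocks, and side_effect to make a mocked method raise. Shape and render are repeated so the block runs on its own; the paint method and the variable names are made up for the demonstration.

```python
import abc
from unittest import mock


class Shape(abc.ABC):

    @abc.abstractmethod
    def draw(self, screen):
        ...


def render(shapes, screen):
    for shape in shapes:
        shape.draw(screen)


screen = object()  # placeholder; any object can play the screen here

# 1. A spec'd mock rejects attributes that Shape does not define.
strict = mock.MagicMock(spec=Shape)
try:
    strict.paint(screen)  # paint is deliberately not part of Shape
except AttributeError as err:
    print("rejected:", err)

# 2. Each mock keeps its own record of calls.
first, second = mock.MagicMock(spec=Shape), mock.MagicMock(spec=Shape)
render([first, second], screen)
first.draw.assert_called_once_with(screen)
second.draw.assert_called_once_with(screen)

# 3. side_effect makes the mocked method raise, to exercise error paths.
broken = mock.MagicMock(spec=Shape)
broken.draw.side_effect = RuntimeError("cannot draw")
untouched = mock.MagicMock(spec=Shape)
try:
    render([broken, untouched], screen)
except RuntimeError:
    pass
untouched.draw.assert_not_called()  # render stopped at the first failure
```

The last case documents a behaviour of render that is easy to overlook: an exception in one shape prevents every later shape from being drawn.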

This process of mocking is rather more complicated in C++, where things need to be known to the compiler and satisfy the type system. Clearly the Python solution won’t work here, but actually the solution turns out to be remarkably similar. Let’s first look at exactly the same situation that we had in the Python world.

#include <vector>

// Just a stub for completeness
class Screen;

class Shape 
{
public:
    
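    // Virtual destructor so derived shapes can be destroyed through a Shape*
    virtual ~Shape() = default;

    virtual void draw(Screen& screen) const = 0;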
};


void render(const std::vector<const Shape*>& shapes, Screen& screen)
{
    for (auto shape : shapes) {
        shape->draw(screen);
    }
}

The Shape abstract base class looks very similar in C++ and in Python. The render function looks a bit different. It takes a (const reference to a) vector of pointers to const Shapes and a reference to the screen object. We need to use some indirection mechanism (here pointers) to allow for polymorphic objects (derived from Shape).

Unlike Python, we have to actually do some work to define our mock class. We’re going to use Google’s testing and mocking suite, because it has all the facilities we need, but there are alternatives. Our first job is to create a new class that derives from Shape. However, instead of providing an implementation for the draw method ourselves, we’re going to use the Googlemock macro MOCK_METHOD to define our method.

#include <gmock/gmock.h>  // bring in the MOCK_METHOD macro

class MockShape : public Shape
{
public:

    MOCK_METHOD(void, draw, (Screen&), (const, override));
};

The four arguments provided to the macro are the return type, name, function signature, and modifiers, respectively. What this macro does is generate a definition for the draw method, along with some additional metadata. This additional metadata is what allows us to interact with the method during our tests, for example to query the number of calls, and so make sure it is used as expected. It also allows us to change the behaviour of the mocked method in certain tests; for example, we might want it to throw an exception to test error handling, or return some particular value (our function returns nothing, so that isn’t interesting here).

Now that we have a mock class we can start putting our test together. We have to bring in the rest of the Google test suite gtest and use the TEST macro to generate a new unit test. The inside of the test looks very similar to the Python unit test above. We create a new Screen object, make a new MockShape object, call the render function, and then test that the draw method was called. Unlike the Python test, we use the EXPECT_CALL macro from gMock to generate the necessary code to query the metadata, and this expectation goes before we run the render function during the test.

#include <gtest/gtest.h>

TEST(TestRender, TestRenderFunction) {

    Screen screen;
    MockShape mock_shape;

    EXPECT_CALL(mock_shape, draw(screen)).Times(1);

    render({&mock_shape}, screen);
}

The code generated for this test will call the render function and check that draw was called exactly once with screen as the argument. Because the argument is given as a plain value, gMock compares it using operator== against a stored copy, so the Screen class needs to be copyable and equality comparable; the ::testing::Ref(screen) matcher would instead check that the very same object was passed. I expected this process to be much harder than it turned out to be. I think this is mostly thanks to the existence of the gMock framework and my familiarity with its cousin gTest. (I’ve been using gTest as my go-to testing framework for C++ for a little while now.)

The examples above cover a very simple case, but they say nothing about how to use mocks in anger to test a real interface. Before we get to writing some actual mock classes, it is important for us to understand exactly how the library uses abstract interfaces and what exactly I need to test with mocks.

The library I’ve been working on is designed to expose mathematical types, implemented in C++, in Python. More often than not, there are multiple implementations of a single mathematical object that needs to be passed into Python, and it’s quite important that all these implementations have the same interface. To enforce this - and to provide encapsulation so I can freely pass these objects around within the library - I first wrap the implementation in an interface, and then hide this using the pointer to implementation (pimpl) idiom. For example, the free tensors have two classes in the library: the interface free_tensor_interface, which is the abstract base for the class that wraps the implementation; and the free_tensor_wrapper, which is a consistent object used in the interface of other components of the library and that is exposed to Python. The interface looks something like this:


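class free_tensor_wrapper;  // forward declaration; the wrapper is defined below

class free_tensor_interface
{
public:

    // Needed because the wrapper destroys implementations through this base type
    virtual ~free_tensor_interface() = default;

    virtual size_t size() const noexcept = 0;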
    virtual unsigned degree() const noexcept = 0;
    virtual unsigned width() const noexcept = 0;
    virtual unsigned depth() const noexcept = 0;

    virtual free_tensor_wrapper uminus() const = 0;
    virtual free_tensor_wrapper add(const free_tensor_interface& other) const = 0;
    virtual free_tensor_wrapper sub(const free_tensor_interface& other) const = 0;
    virtual free_tensor_wrapper smul(double scal) const = 0;
    virtual free_tensor_wrapper sdiv(double scal) const = 0;
    virtual free_tensor_wrapper mul(const free_tensor_interface& other) const = 0;

    // In-place operations modify *this, so they are not const
    virtual free_tensor_interface& add_inplace(const free_tensor_interface& other) = 0;
    virtual free_tensor_interface& sub_inplace(const free_tensor_interface& other) = 0;
    virtual free_tensor_interface& smul_inplace(double scal) = 0;
    virtual free_tensor_interface& sdiv_inplace(double scal) = 0;
    virtual free_tensor_interface& mul_inplace(const free_tensor_interface& other) = 0;

    virtual free_tensor_wrapper exp() const = 0;
    virtual free_tensor_wrapper log() const = 0;
    
    virtual std::ostream& print(std::ostream& os) const = 0;
    virtual bool equals(const free_tensor_interface& other) const = 0;

};

I’ve omitted some of the methods for brevity, but you can see the overall picture here reasonably clearly. There are several methods for querying the make-up of the tensor - its size, the degree to which the elements are currently represented, the dimension of the underlying vector space (width), and the maximum allowable degree that can be represented. Next there are external arithmetic operations, which don’t modify the current tensor in any way and produce a new owned tensor. Notice that all of these return a free_tensor_wrapper rather than a free_tensor_interface. This is because the free_tensor_interface is an abstract class and cannot actually be created, whereas a free_tensor_wrapper is concrete, and contains a pointer to a class derived from free_tensor_interface. These methods implement operations such as -a, a+b, a-b, and so on. Next there are in-place arithmetic operations, which modify the existing tensor and correspond to the operators +=, -=, etc. Then there are special functions defined on free tensors: the exponential exp and the logarithm log. Finally, there are utility functions for printing the tensor to an output stream and for testing equality.

Most parts of the library will never know exactly which implementation of this interface they are working with, and will instead handle free_tensor_wrapper objects that encapsulate them. The wrapper is very simple: it contains a single member, a unique pointer to an instance of a free_tensor_interface, and all of the methods it implements simply defer to the interface class.

class free_tensor_wrapper
{
    std::unique_ptr<free_tensor_interface> p_impl;

public:

    size_t size() const noexcept { return p_impl->size(); }
    unsigned degree() const noexcept { return p_impl->degree(); }
    unsigned width() const noexcept { return p_impl->width(); }
    unsigned depth() const noexcept { return p_impl->depth(); }

    free_tensor_wrapper operator-() const { return p_impl->uminus(); }
    free_tensor_wrapper operator+(const free_tensor_wrapper& other) const
    { return p_impl->add(*other.p_impl); }

    // Remaining arithmetic omitted, they are the same as above

    free_tensor_wrapper exp() const { return p_impl->exp(); }
    free_tensor_wrapper log() const { return p_impl->log(); }

    friend std::ostream& 
    operator<<(std::ostream& os, const free_tensor_wrapper& arg)
    { return arg.p_impl->print(os); }

    bool operator==(const free_tensor_wrapper& other) const
    { return p_impl->equals(*other.p_impl); }
    bool operator!=(const free_tensor_wrapper& other) const
    { return !p_impl->equals(*other.p_impl); }

};

All this seems great, but you might be wondering what it is exactly that I aim to test here using mocks. The answer to that is the free_tensor_wrapper, to start with, and the other parts of the library that make use of free_tensor_wrapper. The wrapper itself might appear very simple - and it is - but I need to know that when I provide it with any free_tensor_interface pointer, it calls the methods appropriately and correctly in all circumstances, and properly propagates the results to the caller. (You might notice a rather obvious potential problem here in that I never test whether the pointer in p_impl is null. Rest assured that the necessary handling does exist, I’ve just left it out.) The first step is to create a test that simply runs each of the operations implemented on the wrapper and checks that the arguments are correctly forwarded to the implementation (the mock) and that the results are returned correctly.

First, we have to define our mock class.

class MockFreeTensor : public free_tensor_interface
{
public:

    MOCK_METHOD(size_t, size, (), (const, noexcept, override));
    MOCK_METHOD(unsigned, degree, (), (const, noexcept, override));
    MOCK_METHOD(unsigned, width, (), (const, noexcept, override));
    MOCK_METHOD(unsigned, depth, (), (const, noexcept, override));

    MOCK_METHOD(free_tensor_wrapper, uminus, (), (const, override));

    // Rest of methods omitted


};

Now we create test cases that check whether the methods on the mocked class are called appropriately by executing other parts of the code. In this case, we’re calling the methods on the free_tensor_wrapper class and checking that they call the appropriate method on the mocked free_tensor_interface.
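The shape of such a forwarding test can be sketched in Python, to stay close to the earlier MagicMock example. This is not the library's own test code: TensorInterface and TensorWrapper are hypothetical, heavily reduced stand-ins for free_tensor_interface and free_tensor_wrapper, and only width and addition are modelled.

```python
from unittest import mock


class TensorInterface:
    """Hypothetical stand-in for free_tensor_interface (two methods only)."""

    def width(self):
        raise NotImplementedError

    def add(self, other):
        raise NotImplementedError


class TensorWrapper:
    """Hypothetical stand-in for free_tensor_wrapper: defers to an implementation."""

    def __init__(self, impl):
        self._impl = impl

    def width(self):
        return self._impl.width()

    def __add__(self, other):
        # Unwrap the right-hand side before forwarding, mirroring
        # p_impl->add(*other.p_impl) in the C++ wrapper.
        return self._impl.add(other._impl)


def test_wrapper_forwards_to_implementation():
    lhs_impl = mock.MagicMock(spec=TensorInterface)
    rhs_impl = mock.MagicMock(spec=TensorInterface)
    lhs_impl.width.return_value = 5
    lhs_impl.add.return_value = mock.sentinel.total  # unique marker object

    lhs, rhs = TensorWrapper(lhs_impl), TensorWrapper(rhs_impl)

    # Results from the implementation must reach the caller unchanged.
    assert lhs.width() == 5
    assert (lhs + rhs) is mock.sentinel.total

    # The left operand's implementation receives the unwrapped right operand,
    # and the right operand's implementation is never asked to add anything.
    lhs_impl.add.assert_called_once_with(rhs_impl)
    rhs_impl.add.assert_not_called()


test_wrapper_forwards_to_implementation()
```

In the C++ version the same checks would be written with EXPECT_CALL on a MockFreeTensor, stated before the wrapper method is invoked rather than after it.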

It might appear as though these tests are infallible at first glance, but their main purpose is to test that I don’t break anything in the future by messing with the wrapper classes. For example, in the future I might wish to expand the wrapper class to handle possible errors or edge cases and in those instances I need to know that methods on the actual implementation are called (or not called) with the correct arguments in all cases. They also provide a check that I have implemented everything as I intended - large parts of this library still need to be implemented so tests are created to keep track of what parts are finished and which still need work.

Conclusions

I was surprised at how easy it is to use the Google mock library to construct and use mock classes for testing in C++. I had assumed that this would be far more difficult given the strong type system that C++ employs. However, at least for polymorphic classes such as these, the type system works perfectly with the idea of mocking. I imagine that mocking in a more static class structure, involving templates and complicated class hierarchies, is somewhat trickier to follow. That being said, the utility of mocking there is questionable, since the compiler will do a significant amount of work to make sure that appropriate types are used and functions called.

Testing in general has been something that I’ve always understood as important, but actually implementing tests beyond basic functionality is something that I find very difficult. I’m sure that many other developers experience similar dilemmas when working on their own code. The scenario I mentioned above - where tests appear to be infallible because what they test is a very simple function - would seem to be an obvious tripping point for newcomers to testing methodologies. Indeed, it has taken me a long time to come to the understanding that tests are not just for now, they are for the future, and believe me, future you will definitely thank you for well-constructed and comprehensive tests!