Automated C++ header wrapping

semiwrap can be told to parse C/C++ headers and automatically generate pybind11 wrappers around the functions and objects found in that header.

Note

We use cxxheaderparser to parse headers. It can handle common and complex C++ syntax, but if you run into problems, please file a bug on GitHub.

C++ Features

semiwrap uses a pure Python C++ parser and macro processor to parse header files. Because it does not build a full AST of the header files, semiwrap only receives names, not the actual type information, so particularly opaque code might confuse the parser.

However, most basic features typically work without needing to coerce the generator into working correctly, including:

  • functions/methods (overloads, static, etc)

  • public class variables

  • protected class variables are (when possible) exported with a _ prefix

  • inheritance - detects and sets up Python object hierarchy automatically

  • abstract classes - autogenerated code ensures they cannot be created directly

  • virtual functions - automatically generates trampoline classes as described in the pybind11 documentation so that Python classes can override them

  • final classes/methods - cannot be overridden from Python code

  • Enumerations

  • Global variables

  • Many other edge cases

Additionally, some features are supported but require manual configuration. These include class templates and function templates, which are covered in their own sections below.

To tell the autogenerator to parse headers, add a headers table for your extension module in pyproject.toml:

[tool.semiwrap.extension_modules."PACKAGE.NAME".headers]
demo = "demo.h"

That causes demo.h to be parsed and wrapped.

Note

If you’re importing a large number of headers, the semiwrap scan-headers tool can generate this list for you automatically.

Documentation

semiwrap will find doxygen documentation comments on many types of elements and use sphinxify to translate them into Python docstrings. If this is not sufficient, all elements that support documentation strings can have their docstrings set explicitly using a doc value in the YAML file.

classes:
  X:
    doc: Docstring for this class

Parameters

Most of the time semiwrap can infer the Python signature from the C++ signature. When it cannot, or when you want to make the Python API more idiomatic, use param_override on a function/method or overload to adjust individual parameters.

Parameters are keyed by their C++ name. If a parameter is unnamed in the header, semiwrap assigns it a name like param0 and you can use that in param_override.

Common overrides include:

  • name: rename the parameter as exposed to Python

  • x_type: override the C++ type string used in the binding

  • default / no_default: set or suppress a default value

  • disable_none: disallow implicit conversion from None for this param

  • array_size: force array length when a parameter is a raw array

  • ignore: remove the parameter from the Python signature

Example:

functions:
  fn:
    param_override:
      count:
        name: n
        default: "0"
      value:
        x_type: "myns::Value"
      opt:
        disable_none: true

Buffer protocol support

If a function takes a pointer + length pair, you can map it to Python buffer objects (bytes, bytearray, memoryview, etc.) using buffers. Each buffer entry names the data pointer parameter and the length parameter. The length parameter can be a pointer (treated as out) or a value (treated as temporary).

functions:
  read_data:
    buffers:
    - type: out
      src: data
      len: length
      minsz: 64
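
To make the pointer + length mapping concrete, here is a minimal standalone C++ sketch of the kind of function this configuration targets and of the work a wrapper has to do around it. It does not use pybind11 and is not semiwrap's generated code. The signature of read_data, the read_data_wrapped helper, and the use of std::vector as a stand-in for a writable Python buffer are all assumptions for illustration, as is the reading of minsz as a minimum buffer size in bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical C-style API: fills `data` with up to `*length` bytes and
// updates `*length` to the number of bytes actually written.
inline void read_data(std::uint8_t *data, std::size_t *length) {
    const std::size_t n = (*length < 4) ? *length : 4;
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = static_cast<std::uint8_t>(i + 1);
    }
    *length = n;
}

// Illustrative wrapper: std::vector stands in for a writable Python buffer.
// Assumes minsz means "reject buffers smaller than this many bytes".
inline std::size_t read_data_wrapped(std::vector<std::uint8_t> &buf,
                                     std::size_t minsz = 64) {
    if (buf.size() < minsz) {
        throw std::invalid_argument("buffer is smaller than minsz");
    }
    std::size_t length = buf.size(); // length is derived from the buffer
    read_data(buf.data(), &length);  // pointer + length pair passed to C++
    return length;                   // bytes written, reported to the caller
}
```

The caller only ever supplies one object (the buffer); the pointer and the length are both derived from it, which is the point of the buffers mapping.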

Out parameters

semiwrap detects C++ “out parameters” and converts them into Python return values. This keeps the Python signature clean while still allowing the C++ API to communicate multiple outputs.

Automatic detection:

  • Non-const pointers or references to fundamental types are treated as out.

  • Raw arrays are treated as out parameters.

  • If defaults.references_are_out_param is set to true, any non-const T& is treated as out.

You can force this behavior with param_override.force_out.

When out parameters are present, the generated wrapper returns:

  • the original return value (if any), followed by

  • any out parameters, in the order they appear

If there is only one value, it is returned directly; otherwise a tuple is returned. Out parameters are removed from the Python signature.

functions:
  read_value:
    param_override:
      out:
        force_out: true
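
The return-value rules above can be sketched in plain C++. This is only an illustration of the calling convention, not the code semiwrap generates: the signature of read_value and the read_value_wrapped helper are hypothetical, and the sketch assumes that out is a std::string reference, which is not a fundamental type and so would need force_out.

```cpp
#include <string>
#include <tuple>

// Hypothetical C++ API: returns a success flag and writes the result to `out`.
// std::string is not a fundamental type, so this sketch assumes `out` would
// need force_out to be treated as an out parameter.
inline bool read_value(std::string &out) {
    out = "sensor-1";
    return true;
}

// Illustrative equivalent of the Python-facing call: the out parameter is
// gone from the signature, and the result is the original return value
// followed by the out parameter.
inline std::tuple<bool, std::string> read_value_wrapped() {
    std::string out;
    const bool ok = read_value(out);
    return std::make_tuple(ok, out);
}
```

If read_value returned void instead, the same rules would yield a single value rather than a tuple.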

Conditional compilation

You can guard individual overloads with preprocessor macros. This is useful for platform-specific APIs or optional features.

Use ifdef or ifndef on the function/method (or specific overload) to wrap the generated binding in #ifdef/#ifndef blocks.

functions:
  platform_fn:
    ifdef: SOME_PLATFORM

classes:
  MyClass:
    methods:
      maybe:
        overloads:
          int:
            ifndef: SWGEN_DISABLE_MAYBE_INT
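
The effect of such a guard can be shown with a small standalone C++ sketch. The registered_names function is a hypothetical stand-in for binding registration (it is not part of semiwrap or pybind11), and the function bodies are made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical header content: platform_fn only exists on some platforms.
#ifdef SOME_PLATFORM
inline int platform_fn() { return 1; }
#endif

inline int portable_fn() { return 0; }

// Stand-in for binding registration: collects the names that would be exposed.
inline std::vector<std::string> registered_names() {
    std::vector<std::string> names;
    names.push_back("portable_fn");
#ifdef SOME_PLATFORM
    // Guarded with the same macro as the declaration, so this line is
    // compiled only when platform_fn actually exists.
    names.push_back("platform_fn");
#endif
    return names;
}
```

Without the guard around the registration, a build where SOME_PLATFORM is undefined would fail to compile because platform_fn would not be declared.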

Class templates

The code generator needs to be told which instantiations of the class template to create. For a given class:

template <typename T>
struct TBasic
{
    virtual ~TBasic() {}

    T getT() { return t; }
    virtual void setT(const T &t) { this->t = t; }

    T t;
};

You need to tell the code generator two things about your class:

  • Identify the template parameters in the class

  • Declare explicit instantiations that you wish to expose, and their name

To cause a Python class to be created called TBasicString which wraps TBasic<std::string>:

classes:
  TBasic:
    template_params:
    - T

templates:
  TBasicString:
    qualname: TBasic
    params:
    - std::string
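
In C++ terms, each templates entry names one concrete instantiation. The sketch below shows the C++ side of that idea using an explicit instantiation and a type alias. The alias is introduced here only to mirror the Python-side name and is an assumption; semiwrap's generated code may look different.

```cpp
#include <string>

template <typename T>
struct TBasic
{
    virtual ~TBasic() {}

    T getT() { return t; }
    virtual void setT(const T &t) { this->t = t; }

    T t;
};

// Explicit instantiation: forces the compiler to emit TBasic<std::string>,
// the concrete type that a Python class named TBasicString would wrap.
template struct TBasic<std::string>;

// Alias used here only to mirror the Python-side name.
using TBasicString = TBasic<std::string>;
```

A class template by itself is not a type, so there is nothing to wrap until a specific set of template arguments is chosen.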

Function templates

The code generator needs to be told which instantiations of the function template to create. For a given function:

struct TClassWithFn
{
    template <typename T>
    static T getT(T t)
    {
        return t;
    }
};

The following would go in your YAML to create overloads callable from Python that call bool getT(bool) and int getT(int).

classes:
  TClassWithFn:
    methods:
      getT:
        template_impls:
        - ["bool"]
        - ["int"]
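
The reason each instantiation has to be listed can be seen in plain C++: a function template has no address until its template arguments are fixed. In the sketch below, the pointer names getT_bool and getT_int are hypothetical and exist only to show one callable per template_impls entry.

```cpp
struct TClassWithFn
{
    template <typename T>
    static T getT(T t)
    {
        return t;
    }
};

// Taking the address of a specialization requires naming its template
// argument; each pointer corresponds to one template_impls entry.
bool (*const getT_bool)(bool) = &TClassWithFn::getT<bool>;
int (*const getT_int)(int) = &TClassWithFn::getT<int>;
```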

Differing Python and C++ function signatures

Custom configuration of your functions allows you to define a more Pythonic API for your C++ classes.

Python only

This often comes up when the Python type and the C++ type of a function parameter or return value differ, or when you want to omit a parameter. Define a lambda via cpp_code:

// original code
int foo(int param1);

functions:
  foo:
    cpp_code: |
      [](int param1) -> std::string {
        return std::to_string(param1);
      }
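
As a standalone illustration, the lambda can be compiled next to the original function to show that only the Python-facing callable changes. The body of foo and the name foo_for_python are hypothetical, and unlike the lambda above, this variant forwards to foo before converting the result.

```cpp
#include <string>

// original code (the body is a hypothetical stand-in)
inline int foo(int param1) { return param1 * 2; }

// Same shape as a cpp_code lambda: Python callers would receive a
// std::string, while C++ callers of foo() still receive an int.
const auto foo_for_python = [](int param1) -> std::string {
    return std::to_string(foo(param1));
};
```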

If you change the parameters, use param_override so that the Python signature matches your lambda. For example, if foo also took a parameter named param2 that you wanted to remove:

functions:
  foo:
    param_override:
      param2:
        ignore: true

Note

When you change things like this, these inline definitions are not callable from C++; you need virtual functions for that.

Python and C++

Let’s say that you have a C++ virtual function void MyClass::foo(std::iostream &s). Semantically, it just returns a string. Because you really don’t want to wrap std::iostream, you decide that the function should just return a string in Python.

Because this is a virtual function, you need to define a virtual_xform lambda that takes the original arguments, calls the Python API, and then returns the original return type. When C++ code calls that virtual function, it calls the xform function, which in turn calls your Python API.

classes:
  MyClass:
    methods:
      foo:
        param_override:
          s:
            ignore: true
        cpp_code: |
          // python API
          [](MyClass * self) -> std::string {
            std::stringstream ss;
            self->foo(ss);
            return ss.str();
          }
        virtual_xform: |
          // C++ virtual function transformer
          [&](py::function &overload) {
            // s is the std::iostream & argument of the original virtual function
            auto result = py::cast<std::string>(overload());
            s << result;
          }
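
The round trip between the two signatures can be sketched without pybind11. In the sketch below, MyClassTrampoline and its override_fn member are hypothetical stand-ins for the generated trampoline and a Python override, and the body of MyClass::foo is made up. It shows only the shape of the transformation, not semiwrap's actual output.

```cpp
#include <functional>
#include <iostream>
#include <sstream>
#include <string>

struct MyClass {
    virtual ~MyClass() {}
    // The C++ API writes its result into a stream.
    virtual void foo(std::iostream &s) { s << "from C++"; }
};

// Same shape as the cpp_code lambda: the Python-facing call returns a string.
inline std::string foo_for_python(MyClass *self) {
    std::stringstream ss;
    self->foo(ss);
    return ss.str();
}

// Stand-in for a trampoline: `override_fn` plays the role of a Python
// override that takes no stream and returns a string.
struct MyClassTrampoline : MyClass {
    std::function<std::string()> override_fn;

    void foo(std::iostream &s) override {
        if (!override_fn) {
            MyClass::foo(s); // no override: fall back to the C++ behaviour
            return;
        }
        // Same job as virtual_xform: call the override with the Python
        // signature, then translate its result back into the C++ signature.
        s << override_fn();
    }
};
```

C++ code that only knows about MyClass::foo(std::iostream &) keeps working, while the override itself never has to deal with a stream.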