Porting Example
HPy supports incrementally porting an existing C extension from the original Python C API to the HPy API, with the extension compiling and running at each step along the way.
Here we walk through porting a small C extension that implements a Point type
with some simple methods (a norm and a dot product). The Point type is minimal,
but does contain additional C attributes (the x and y values of the point)
and an attribute (obj) that contains a Python object (which we will need to
convert from a PyObject * to an HPyField).
There is a separate C file illustrating each step of the incremental port:
- step_00_c_api.c: The original C API version that we are going to port.
- step_01_hpy_legacy.c: A possible first step where all methods still receive PyObject * arguments and may still cast them to PyPointObject * if they are instances of Point.
- step_02_hpy_legacy.c: Shows how to transition some methods to HPy methods that receive HPy handles as arguments while still supporting legacy methods that receive PyObject * arguments.
- step_03_hpy_final.c: The completed port to HPy where all methods receive HPy handles and PyObject_HEAD has been removed.
Take a moment to read through step_00_c_api.c. Then, once you’re ready, keep reading.
Each section below corresponds to one of the three porting steps above.
Note
The steps used here are one approach to porting a module. The specific steps are not required. They’re just an example approach.
Step 01: Converting the module to a (legacy) HPy module
First for the easy bit: let’s include hpy.h:

    #include <hpy.h>

We’d like to differentiate between references to PyPointObject that have
been ported to HPy and those that haven’t, so let’s rename it to PointObject
and alias PyPointObject to PointObject. We’ll keep PyPointObject for
the instances that haven’t been ported yet (the legacy ones) and use
PointObject where we have ported the references:

    typedef struct {
        // PyObject_HEAD is required while legacy_slots are still used
        // but can (and should) be removed once the port to HPy is completed.
        PyObject_HEAD
        double x;
        double y;
        PyObject *obj;
    } PointObject;

    typedef PointObject PyPointObject;

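
To see why the alias is safe, here is a minimal plain C sketch. It assumes a stand-in struct without PyObject_HEAD or obj so that it compiles without Python or HPy headers, and the helpers point_x and point_y are hypothetical names used only for illustration. It shows that a typedef alias introduces a second name for the same type, so legacy-style and ported-style code can share objects without casts:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real struct: PyObject_HEAD and obj are omitted so the
   sketch compiles without Python or HPy headers (an assumption). */
typedef struct {
    double x;
    double y;
} PointObject;

/* The alias is a second name for the same type, not a new type. */
typedef PointObject PyPointObject;

/* Hypothetical legacy-style helper that still uses the old name. */
static double point_x(const PyPointObject *p)
{
    assert(p != NULL);
    return p->x;
}

/* Hypothetical ported-style helper that uses the new name. */
static double point_y(const PointObject *p)
{
    assert(p != NULL);
    return p->y;
}
```

Because both names refer to one type, a PointObject can be passed to point_x and a PyPointObject to point_y. The names then only document which call sites have been ported.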
For this step, all references will be to PyPointObject; we’ll only start
porting references in the next step.
Let’s also call HPyType_LEGACY_HELPERS to define some helper functions
for use with the PointObject struct:

    HPyType_LEGACY_HELPERS(PointObject)

Again, we won’t use these helpers in this step – we’re just setting things up for later.
Now for the big steps.
We need to replace PyType_Spec for the Point type with the equivalent
HPyType_Spec:

    // HPy type methods and slots (no methods or slots have been ported yet)
    static HPyDef *point_defines[] = {
        NULL
    };

    static HPyType_Spec Point_Type_spec = {
        .name = "point_hpy_legacy_1.Point",
        .basicsize = sizeof(PointObject),
        .itemsize = 0,
        .flags = HPy_TPFLAGS_DEFAULT,
        .builtin_shape = SHAPE(PointObject),
        .legacy_slots = Point_legacy_slots,
        .defines = point_defines,
    };

    // HPy supports only multiphase module initialization, so we must migrate the
    // single phase initialization by extracting the code that populates the module
    // object with attributes into a separate 'exec' slot. The module is not
    // created manually by calling API like PyModule_Create, but the runtime creates
    // the module for us from the specification in HPyModuleDef, and we can provide
    // additional slots to populate the module before its initialization is finalized
    HPyDef_SLOT(module_exec, HPy_mod_exec)
    static int module_exec_impl(HPyContext *ctx, HPy mod)
    {
        HPy point_type = HPyType_FromSpec(ctx, &Point_Type_spec, NULL);
        if (HPy_IsNull(point_type))
            return -1;
        // Propagate a failure from HPy_SetAttr_s instead of ignoring it
        int result = HPy_SetAttr_s(ctx, mod, "Point", point_type);
        // The handle returned by HPyType_FromSpec is ours to close
        HPy_Close(ctx, point_type);
        return result;
    }

Initially the list of ported methods in point_defines is empty and all of
the methods are still in Point_slots, which we have renamed to
Point_legacy_slots for clarity.
SHAPE(PointObject) is a macro that retrieves the shape of PointObject as it
was defined by the HPyType_LEGACY_HELPERS macro. It will be set to
HPyType_BuiltinShape_Legacy until we replace the legacy macro with the
HPyType_HELPERS one. Any type with legacy_slots or that still includes
PyObject_HEAD in its struct should have .builtin_shape set to
HPyType_BuiltinShape_Legacy.
Similarly we replace PyModuleDef with HPyModuleDef:

    // Legacy module methods (the "dot" method is still a PyCFunction)
    static PyMethodDef PointModuleLegacyMethods[] = {
        {"dot", (PyCFunction)dot, METH_VARARGS, "Dot product."},
        {NULL, NULL, 0, NULL}
    };

    // HPy module methods: no regular methods have been ported yet,
    // but we add the module execute slot
    static HPyDef *module_defines[] = {
        &module_exec,
        NULL
    };

    static HPyModuleDef moduledef = {
        // .name = "step_01_hpy_legacy",
        // ^-- .name is not needed for multiphase module initialization,
        // it is always taken from the ModuleSpec
        .doc = "Point module (Step 1; All legacy methods)",
        .size = 0,
        .legacy_methods = PointModuleLegacyMethods,
        .defines = module_defines,
    };

Like the type, the list of ported methods in module_defines is initially
almost empty: all the regular methods are still in PointModuleMethods, which has
been renamed to PointModuleLegacyMethods. However, because HPy supports only
multiphase module initialization, we must convert our module initialization code
to an “exec” slot on the module and add that slot to module_defines.
Now all that is left is to replace the module initialization function with
one that uses HPy_MODINIT. The first argument is the name of the extension,
i.e., what was XXX in PyInit_XXX, and the second argument
is the HPyModuleDef.

    HPy_MODINIT(step_01_hpy_legacy, moduledef)

And we’re done!
Instead of the PyInit_XXX function, we now have an “exec” slot on the module.
We implement it with a C function that takes an HPyContext *ctx and HPy mod
as arguments. The ctx must be forwarded as the first argument to calls to
HPy API functions. The mod argument is a handle for the module object. The runtime
creates the module for us from the provided HPyModuleDef. There is no need to
call an API like PyModule_Create explicitly.
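
The division of labour described above can be modelled in a few lines of plain C. This is only a toy sketch: ToyModule, ToyModuleDef, toy_set_attr, toy_module_exec and toy_runtime_import are hypothetical names, not part of HPy, and real module creation involves far more. It shows the ordering that matters here: the runtime creates the module object, and the extension's exec function only populates it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ATTRS 4

/* Hypothetical module object: just a list of attribute names. */
typedef struct {
    const char *attrs[TOY_MAX_ATTRS];
    size_t count;
} ToyModule;

typedef int (*ToyExecFunc)(ToyModule *mod);

/* Hypothetical module definition: a docstring plus an optional exec slot. */
typedef struct {
    const char *doc;
    ToyExecFunc exec;
} ToyModuleDef;

/* Returns 0 on success, -1 if the module has no room left. */
static int toy_set_attr(ToyModule *mod, const char *name)
{
    assert(mod != NULL && name != NULL);
    if (mod->count == TOY_MAX_ATTRS)
        return -1;
    mod->attrs[mod->count++] = name;
    return 0;
}

/* The extension's exec slot: it populates a module it did not create. */
static int toy_module_exec(ToyModule *mod)
{
    return toy_set_attr(mod, "Point");
}

/* Plays the role of the runtime: create the module, then run the slot. */
static int toy_runtime_import(const ToyModuleDef *def, ToyModule *out)
{
    assert(def != NULL && out != NULL);
    out->count = 0;                /* step 1: the runtime creates the module */
    if (def->exec != NULL)
        return def->exec(out);     /* step 2: the exec slot fills it in */
    return 0;
}
```

In this model the extension never allocates the module itself, which mirrors why no call like PyModule_Create appears in the ported code.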
The next step is to replace PyType_FromSpec with HPyType_FromSpec.
HPy_SetAttr_s is used to add the Point class to the module. HPy needs no
special equivalent of PyModule_AddObject.

    HPyDef_SLOT(module_exec, HPy_mod_exec)
    static int module_exec_impl(HPyContext *ctx, HPy mod)
    {
        HPy point_type = HPyType_FromSpec(ctx, &Point_Type_spec, NULL);
        if (HPy_IsNull(point_type))
            return -1;
        // Propagate a failure from HPy_SetAttr_s instead of ignoring it
        int result = HPy_SetAttr_s(ctx, mod, "Point", point_type);
        // The handle returned by HPyType_FromSpec is ours to close
        HPy_Close(ctx, point_type);
        return result;
    }

Step 02: Transition some methods to HPy
In the previous step we put in place the type and module definitions required to create an HPy extension module. In this step we will port some individual methods.
Let us start by migrating Point_traverse. First we need to change
PyObject *obj in the PointObject struct to HPyField obj:

    typedef struct {
        // PyObject_HEAD is required while legacy methods still access
        // PointObject and should be removed once the port to HPy is completed.
        PyObject_HEAD
        double x;
        double y;
        // HPy handles are shortlived to support all GC strategies
        // For that reason, PyObject* in C structs are replaced by HPyField
        HPyField obj;
    } PointObject;

HPy handles can only be short-lived, i.e. local variables, arguments to
functions or return values. HPyField is the way to store long-lived
references to Python objects. For more information, please refer to the
documentation of HPyField.
Now we can update Point_traverse:

    HPyDef_SLOT(Point_traverse, HPy_tp_traverse)
    int Point_traverse_impl(void *self, HPyFunc_visitproc visit, void *arg)
    {
        HPy_VISIT(&((PointObject*)self)->obj);
        return 0;
    }

In the first line we used the HPyDef_SLOT macro to define a small structure
that describes the slot being implemented. The first argument, Point_traverse,
is the name given to that structure. By convention, the HPyDef_SLOT macro
expects a function called Point_traverse_impl implementing the slot. The
second argument, HPy_tp_traverse, specifies the kind of slot.
This is a change from how slots are defined in the old C API. In the old API,
the kind of slot is only specified much lower down in Point_legacy_slots. In
HPy the implementation and kind are defined in one place using a syntax
reminiscent of Python decorators.
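
The naming convention can be illustrated with a toy macro in plain C. This is not how HPyDef_SLOT is actually implemented; TOY_DEF_SLOT, ToyDef, ToySlotKind and Counter_traverse are hypothetical names chosen for this sketch. It only shows the general technique of using token pasting so that one macro call ties a slot kind to a function named SYM_impl:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot kinds; the real HPy enumeration is different. */
typedef enum { TOY_tp_traverse, TOY_tp_init } ToySlotKind;

typedef int (*ToySlotFunc)(void *self);

/* A small structure describing one slot: its kind and its implementation. */
typedef struct {
    ToySlotKind kind;
    ToySlotFunc impl;
} ToyDef;

/* Toy analogue of a "define slot" macro: it forward-declares SYM_impl by
   token pasting and builds a ToyDef named SYM that points at it. */
#define TOY_DEF_SLOT(SYM, KIND) \
    static int SYM##_impl(void *self); \
    static ToyDef SYM = { (KIND), SYM##_impl };

/* The kind and the implementation now sit next to each other. */
TOY_DEF_SLOT(Counter_traverse, TOY_tp_traverse)
static int Counter_traverse_impl(void *self)
{
    assert(self != NULL);
    *(int *)self += 1;   /* record that the slot was called */
    return 0;
}
```

With this pattern, forgetting to write Counter_traverse_impl is caught when the program is built rather than at run time, which is one practical benefit of keeping the declaration and the implementation together.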
The implementation of traverse is now a bit simpler than in the old C API.
We no longer need to visit Py_TYPE(self) and need only HPy_VISIT
self->obj. HPy ensures that the interpreter knows that the type of the instance
is still referenced.
Only struct members of type HPyField can be visited with HPy_VISIT, which
is why we needed to convert obj to an HPyField before we implemented the
HPy traverse.
Next we must update Point_init to store the value of obj as an HPyField:

    HPyDef_SLOT(Point_init, HPy_tp_init)
    int Point_init_impl(HPyContext *ctx, HPy self, HPy *args,
                        HPy_ssize_t nargs, HPy kw)
    {
        static const char *kwlist[] = {"x", "y", "obj", NULL};
        PointObject *p = PointObject_AsStruct(ctx, self);
        p->x = 0.0;
        p->y = 0.0;
        HPy obj = HPy_NULL;
        HPyTracker ht;
        if (!HPyArg_ParseKeywords(ctx, &ht, args, nargs, kw, "|ddO", kwlist,
                                  &p->x, &p->y, &obj))
            return -1;
        if (HPy_IsNull(obj))
            obj = ctx->h_None;
        // INCREF not needed because HPyArg_ParseKeywords does not steal a reference
        HPyField_Store(ctx, self, &p->obj, obj);
        HPyTracker_Close(ctx, ht);
        return 0;
    }

There are a few new HPy constructs used here:
- The kind of the slot passed to HPyDef_SLOT is HPy_tp_init.
- PointObject_AsStruct is defined by HPyType_LEGACY_HELPERS and returns a pointer to the PointObject struct. Because we still include PyObject_HEAD at the start of the struct this is still a valid PyObject *, but once we finish the port the struct will no longer contain PyObject_HEAD and this will just be an ordinary C struct with no memory overhead!
- We use HPyTracker when parsing the arguments with HPyArg_ParseKeywords. The HPyTracker keeps track of open handles so that they can be closed easily at the end with HPyTracker_Close.
- HPyArg_ParseKeywords is the equivalent of PyArg_ParseTupleAndKeywords. Note that the HPy version does not steal a reference like the Python version.
- HPyField_Store is used to store a reference to obj in the struct. The arguments are the context (ctx), a handle to the object that owns the reference (self), the address of the HPyField (&p->obj), and the handle to the object (obj).
Note
An HPyTracker is not strictly needed for HPyArg_ParseKeywords in Point_init.
The arguments x and y are C doubles (so there are no handles to close) and
the handle stored in obj was passed in to Point_init as an argument and so
should not be closed.
We showed the tracker here to demonstrate its use. You can read more about argument parsing in the API docs.
If a tracker is needed and one is not provided, HPyArg_ParseKeywords
will return an error.
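
The idea behind a tracker can be sketched in plain C. The following is a toy model under stated assumptions, not the HPyTracker implementation: ToyHandle, ToyTracker and the toy_tracker_* functions are hypothetical, and the fixed capacity is chosen only to keep the sketch short. It shows the pattern of registering resources as they are opened and releasing all of them with a single call:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TRACKER_CAPACITY 8

/* Hypothetical resource standing in for an open handle. */
typedef struct {
    int is_open;
} ToyHandle;

/* Hypothetical tracker: remembers handles so they can be closed together. */
typedef struct {
    ToyHandle *items[TOY_TRACKER_CAPACITY];
    size_t count;
} ToyTracker;

static void toy_tracker_init(ToyTracker *t)
{
    assert(t != NULL);
    t->count = 0;
}

/* Returns 0 on success, -1 if the tracker is full. */
static int toy_tracker_add(ToyTracker *t, ToyHandle *h)
{
    assert(t != NULL && h != NULL);
    if (t->count == TOY_TRACKER_CAPACITY)
        return -1;
    t->items[t->count++] = h;
    return 0;
}

/* Close every tracked handle in one call and empty the tracker. */
static void toy_tracker_close(ToyTracker *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++)
        t->items[i]->is_open = 0;
    t->count = 0;
}
```

The benefit of this pattern is that every exit path needs only one cleanup call, however many handles were opened along the way.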
The last update we need to make for the change to HPyField is to migrate
Point_obj_get, which retrieves obj from the stored HPyField:

    HPyDef_GET(Point_obj, "obj", .doc="Associated object.")
    HPy Point_obj_get(HPyContext *ctx, HPy self, void* closure)
    {
        PointObject *p = PointObject_AsStruct(ctx, self);
        return HPyField_Load(ctx, self, p->obj);
    }

Above we have used PointObject_AsStruct again, and then HPyField_Load to
retrieve the value of obj from the HPyField.
We’ve now finished all of the changes needed to introduce HPyField. We
could stop here, but let’s migrate one ordinary method, Point_norm, to end
off this stage of the port:

    HPyDef_METH(Point_norm, "norm", HPyFunc_NOARGS, .doc="Distance from origin.")
    HPy Point_norm_impl(HPyContext *ctx, HPy self)
    {
        PointObject *p = PointObject_AsStruct(ctx, self);
        double norm;
        norm = sqrt(p->x * p->x + p->y * p->y);
        return HPyFloat_FromDouble(ctx, norm);
    }

To define a method we use HPyDef_METH instead of HPyDef_SLOT. HPyDef_METH
creates a small structure defining the method. The first argument is the name
to assign to the structure (Point_norm). The second is the Python name of
the method (norm). The third specifies the method signature (HPyFunc_NOARGS,
i.e. no additional arguments in this case). The last provides the docstring.
The macro then expects a function named Point_norm_impl implementing the
method.
The rest of the implementation remains similar, except that we use
HPyFloat_FromDouble to create a handle to a Python float containing the
result (i.e. the distance of the point from the origin).
Now we are done and just have to remove the old implementations from
Point_legacy_slots and add the new definitions to point_defines:

    static HPyDef *point_defines[] = {
        &Point_init,
        &Point_norm,
        &Point_obj,
        &Point_traverse,
        NULL
    };

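
The trailing NULL in point_defines is a sentinel that marks the end of the list. How the HPy runtime consumes the array is not shown in this example, so the following plain C sketch is only an illustration of the general sentinel technique, with hypothetical names (ToyDef, toy_defines, toy_empty, count_defines):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a definition record. */
typedef struct {
    const char *name;
} ToyDef;

static ToyDef toy_init = { "init" };
static ToyDef toy_norm = { "norm" };

/* Like point_defines, each array ends with a NULL sentinel. */
static ToyDef *toy_defines[] = { &toy_init, &toy_norm, NULL };
static ToyDef *toy_empty[] = { NULL };

/* Count entries by walking the array until the sentinel is reached. */
static size_t count_defines(ToyDef *const *defines)
{
    size_t n = 0;
    assert(defines != NULL);
    while (defines[n] != NULL)
        n++;
    return n;
}
```

Because the length is never stored separately, moving a definition from the legacy table to the HPy table is a one-line change, but omitting the final NULL would let a reader of the array run past its end.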
Step 03: Complete the port to HPy
In this step we’ll complete the port. We’ll stop including Python.h, remove
PyObject_HEAD from the PointObject struct, and port the remaining methods.
First, let’s remove the include of Python.h:

    // #include <Python.h>  // disallow use of the old C API

And PyObject_HEAD from the struct:

    typedef struct {
        // PyObject_HEAD is no longer available in PointObject. In CPython,
        // of course, it still exists but is inaccessible from HPy_AsStruct. In
        // other Python implementations (e.g. PyPy) it might no longer exist at
        // all.
        double x;
        double y;
        HPyField obj;
    } PointObject;

And the typedef that aliases PyPointObject to PointObject:

    // typedef PointObject PyPointObject;

Now any code that has not been ported should result in a compilation error.
We must also change the type helpers from HPyType_LEGACY_HELPERS to
HPyType_HELPERS so that PointObject_AsStruct knows that PyObject_HEAD
has been removed:

    HPyType_HELPERS(PointObject)

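
The effect of dropping the header on the struct that extension code sees can be shown in plain C. This is a sketch under stated assumptions: ToyHeader is a hypothetical two-field header standing in for PyObject_HEAD, and LegacyPoint, FinalPoint and header_overhead are illustrative names. It says nothing about how a particular Python implementation lays out objects internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header standing in for PyObject_HEAD. */
typedef struct {
    long refcount;
    void *type;
} ToyHeader;

/* Legacy-style layout: the header comes first, then the payload. */
typedef struct {
    ToyHeader head;
    double x;
    double y;
} LegacyPoint;

/* Final layout: only the payload remains. */
typedef struct {
    double x;
    double y;
} FinalPoint;

/* Bytes saved per struct by dropping the header. */
static size_t header_overhead(void)
{
    assert(sizeof(LegacyPoint) > sizeof(FinalPoint));
    return sizeof(LegacyPoint) - sizeof(FinalPoint);
}
```

In the final layout x sits at offset 0, so any code that still assumed a header in front of the payload would read the wrong bytes. That is why the helper macro has to change together with the struct.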
There is one more method to port, the dot method, which is a module method
that implements the dot product between two points:

    HPyDef_METH(dot, "dot", HPyFunc_VARARGS, .doc="Dot product.")
    HPy dot_impl(HPyContext *ctx, HPy self, HPy *args, HPy_ssize_t nargs)
    {
        HPy point1, point2;
        if (!HPyArg_Parse(ctx, NULL, args, nargs, "OO", &point1, &point2))
            return HPy_NULL;
        PointObject *p1 = PointObject_AsStruct(ctx, point1);
        PointObject *p2 = PointObject_AsStruct(ctx, point2);
        double dp;
        dp = p1->x * p2->x + p1->y * p2->y;
        return HPyFloat_FromDouble(ctx, dp);
    }

The changes are similar to those used in porting the norm method, except:
- We use HPyArg_Parse instead of HPyArg_ParseKeywords.
- We opted not to use an HPyTracker by passing NULL as the pointer to the tracker when calling HPyArg_Parse. There is no reason not to use a tracker here, but the handles to the two points are passed in as arguments to dot_impl and thus there is no need to close them (and they should not be closed).
We use PointObject_AsStruct and HPyFloat_FromDouble as before.
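
One way to keep such a port easy to check is to isolate the arithmetic from the binding code. The sketch below is an optional refactoring, not something the original example does: PointData and point_dot are hypothetical names, and the struct carries only the numeric fields so that the function compiles without any Python or HPy headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload-only view of a point. */
typedef struct {
    double x;
    double y;
} PointData;

/* Pure C dot product: no handles, no context, no reference counting. */
static double point_dot(const PointData *p1, const PointData *p2)
{
    assert(p1 != NULL && p2 != NULL);
    return p1->x * p2->x + p1->y * p2->y;
}
```

A function like this stays byte-for-byte identical across all porting steps, so only the thin wrapper that unpacks arguments and boxes the result has to change.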
Now that we have ported everything we can remove PointMethods,
Point_legacy_slots and PointModuleLegacyMethods. The resulting
type definition is much cleaner:

    static HPyDef *point_defines[] = {
        &Point_init,
        &Point_norm,
        &Point_obj,
        &Point_traverse,
        NULL
    };

    static HPyType_Spec Point_Type_spec = {
        .name = "point_hpy_final.Point",
        .doc = "Point (Step 03)",
        .basicsize = sizeof(PointObject),
        .itemsize = 0,
        .flags = HPy_TPFLAGS_DEFAULT,
        .defines = point_defines
    };

    HPyDef_SLOT(module_exec, HPy_mod_exec)
    static int module_exec_impl(HPyContext *ctx, HPy mod)
    {
        HPy point_type = HPyType_FromSpec(ctx, &Point_Type_spec, NULL);
        if (HPy_IsNull(point_type))
            return -1;
        // Propagate a failure from HPy_SetAttr_s instead of ignoring it
        int result = HPy_SetAttr_s(ctx, mod, "Point", point_type);
        // The handle returned by HPyType_FromSpec is ours to close
        HPy_Close(ctx, point_type);
        return result;
    }

and the module definition is simpler too:

    static HPyDef *module_defines[] = {
        &module_exec,
        &dot,
        NULL
    };

    static HPyModuleDef moduledef = {
        .doc = "Point module (Step 3; Porting complete)",
        .size = 0,
        .defines = module_defines,
    };

Now that the port is complete, when we compile our extension in HPy universal mode, we obtain a built extension that depends only on the HPy ABI and not on the CPython ABI at all!