
Building a binary into your Python package

First, let me acknowledge that this is a rare problem, so the lack of documentation I ran into is understandable. The problem: I have a Python package with a C++ component that isn't an extension module but a standalone binary. I wanted that binary to be on users' PATH, but it's not a Python entry point, so the usual way (listing it in scripts) failed while attempting to read the file as text.

So I did some hacking, and after a lot of attempts, I whittled the result back down to something that's actually quite reasonable. If I had ever seen a documentation page or blog post mentioning this is how you do it, I would have been able to just follow along and I would have been happy with the result. Sadly, there was none, so I'm writing it:

Let's say you have a package P that has a binary B with source file B.cpp in this layout:

P/
|---P.py
|---B.cpp
|---setup.py

You want to build B.cpp into a file B, and you want B to be on the PATH, so you would think you want to list it in the scripts argument of setup(). Something like this:

# Warning: NOT the solution.
from distutils import ccompiler
from distutils.command.build_scripts import build_scripts
from distutils.core import setup

class build_my_script(build_scripts):
  def run(self):
    compiler = ccompiler.new_compiler()
    compiler.add_include_dir(...)
    ...
    objects = compiler.compile(['B.cpp'])
    for obj in objects:
      compiler.add_link_object(obj)
    compiler.link_executable(
        [],  # add_link_object above makes this unnecessary.
        'B')

setup(
    ...,
    scripts=[
      'B'
    ],
    cmdclass={'build_scripts': build_my_script},
    ...
)

Sadly, that doesn't quite do it. Anything in 'scripts' will be parsed during a part of installation that we can't replace. Instead, we have to put it in the 'data_files' argument. But not as a bare list of file names, because that emits a warning and because we want it in the 'bin' folder; each entry is a (directory, files) pair. This may only work on Linux, so if any Windows or Mac users try this, please let me know in the comments.

setup(
   ...,
   data_files=[
       ('bin', ['B']),
   ],
   ...,
)
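
To see why the relative 'bin' directory is the part that may be platform-specific, here is a minimal sketch. It assumes the documented distutils behaviour that a relative data_files directory is resolved against the installation prefix (sys.prefix for a system or virtualenv install). data_files_target is a hypothetical helper written for illustration, not a distutils function.

```python
import os
import sys
import sysconfig

def data_files_target(directory, prefix=sys.prefix):
    """Hypothetical helper: where a data_files directory would end up.

    Assumes relative directories are joined onto the installation prefix
    and absolute ones are used unchanged.
    """
    if os.path.isabs(directory):
        return directory
    return os.path.join(prefix, directory)

if __name__ == '__main__':
    # Where ('bin', ['B']) would be installed for this interpreter.
    print('data_files bin:', data_files_target('bin'))
    # Where this interpreter installs real scripts, for comparison.
    print('scripts dir:   ', sysconfig.get_path('scripts'))
```

If the two printed paths match, B lands next to the other console scripts. On Windows the scripts directory is normally named Scripts rather than bin, which would be one reason the 'bin' trick might not carry over there.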

And lastly, we want to make our compiler class extend build instead of build_scripts, and register it as a sub-command of build so that it gets executed by the installation process.

class build_B(build):
  def run(self):
    ...

class package_build(build):
  sub_commands = build.sub_commands + [
      ('build_B', lambda _: True),
  ]

setup(
    ...,
    cmdclass={
        'build': package_build,
        'build_B': build_B,
    },
    ...,
)
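
The sub_commands list pairs each command name with a predicate that decides whether it runs, so lambda _: True means "always". The following is a simplified stand-alone model of that dispatch, not the distutils implementation: FakeBuild, FakePackageBuild and the log list are hypothetical names used only for illustration.

```python
class FakeBuild:
    """Toy stand-in for a distutils command with sub-commands."""

    # (command name, predicate); a predicate of None also means "always run".
    sub_commands = [
        ('build_py', lambda cmd: cmd.has_pure_modules),
        ('build_ext', lambda cmd: cmd.has_ext_modules),
    ]

    def __init__(self, has_pure_modules=True, has_ext_modules=False):
        self.has_pure_modules = has_pure_modules
        self.has_ext_modules = has_ext_modules
        self.log = []

    def get_sub_commands(self):
        # Keep only the sub-commands whose predicate accepts this command.
        return [name for name, predicate in self.sub_commands
                if predicate is None or predicate(self)]

    def run(self):
        for name in self.get_sub_commands():
            self.log.append(name)  # a real command would be executed here


class FakePackageBuild(FakeBuild):
    # Same pattern as package_build: append a step that always runs.
    sub_commands = FakeBuild.sub_commands + [
        ('build_B', lambda _: True),
    ]


cmd = FakePackageBuild()
cmd.run()
print(cmd.log)  # ['build_py', 'build_B']
```

Because the new entry is appended, the toy build_B step runs after the stock steps, which mirrors the ordering package_build sets up.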

Altogether, this will build your binary as part of the building process. This means if you put this on PyPI or another place as a source distribution, it will get built by the person who downloaded it. Hopefully, they have the necessary libraries and compiler that you use. If you use a binary distribution, it will get built by you and included in the output, so whoever installs it won't have to compile it.

Here's the result:

from distutils import ccompiler
from distutils.command.build import build
from distutils.core import setup

class build_B(build):
  def run(self):
    compiler = ccompiler.new_compiler()
    compiler.add_include_dir(...)
    ...
    objects = compiler.compile(['B.cpp'])
    # Register the object files so link_executable needs no explicit list.
    for obj in objects:
      compiler.add_link_object(obj)
    compiler.link_executable([], 'B')

class package_build(build):
  sub_commands = build.sub_commands + [
      ('build_B', lambda _: True),
  ]

setup(
    ...,
    cmdclass={
        'build': package_build,
        'build_B': build_B,
    },
    data_files=[
        ('bin', ['B']),
    ],
    ...,
)

There are many things I tried, such as package_data, overriding build_scripts and install_scripts, messing with install, and replacing build entirely. None of them did what I wanted, which was to build the file when it wasn't there and to put it in the bin/ folder of the resulting package (egg, wheel, etc.). Some search results only mentioned how to get already-existing files into the package, and others simply explained the situation, but there was enough documentation of it all to scrape this together, and the actual distutils source code was also useful.

Installing via this new setup.py gets my binary into my PATH (even when using a virtualenv).
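
A quick way to confirm that after installing is to ask Python where it finds B. This is a small sketch using shutil.which from the standard library; 'B' is the placeholder name from the layout above, and find_on_path is a hypothetical helper.

```python
import os
import shutil

def find_on_path(name, path=None):
    """Return the full path of an executable called `name`, or None.

    `path` is an optional os.pathsep-separated search path; by default
    the PATH environment variable is used.
    """
    return shutil.which(name, path=path)

if __name__ == '__main__':
    location = find_on_path('B')
    if location is None:
        print('B is not on PATH')
    else:
        print('B found at', location)
```

Run it with the interpreter of the environment you installed into, so that a virtualenv's own bin directory is the one being searched.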

© Fahrzin Hemmati. Built using Pelican. Theme by Giulio Fidente on github.