farmdev

importing modules from setup.py (chicken vs. egg!)

There's a chicken vs. egg problem when you want to import 3rd party packages in a setup.py file. I often want to import my main module so that I can use inspect.getdoc(my_module) for the description in my setup function (this avoids duplication). But if a user is running python setup.py install, then importing in setup.py might fail if my module is in a subdir or if one of its dependencies has not been installed yet. I'm ashamed to admit that I once added try ... except ImportError: pass to an __init__.py file to make this work.

Luckily, there is still an easy way to get a docstring without fragile regex hacks and without importing. Here's a little recipe for the setup problem above using the standard compiler module and the abstract syntax tree (AST):

from setuptools import setup, find_packages
import compiler, os, pydoc
from compiler import visitor

def split_doc_from_module(modfile):
    class ModuleVisitor(object):
        def __init__(self):
            self.mod_doc = None
        def visitModule(self, node):
            self.mod_doc = node.doc
            # if you didn't want to short circuit the visitation,
            # you would call self.default(node)
    ast = compiler.parseFile(modfile)
    modnode = ModuleVisitor()
    visitor.walk(ast, modnode)
    if modnode.mod_doc is None:
        raise RuntimeError(
            "could not parse doc string from %s" % modfile)
    return pydoc.splitdoc(modnode.mod_doc)

description, long_description = split_doc_from_module(os.path.join('my_module', '__init__.py'))

setup(
    name="MySpiffyModule",
    version="1.0",
    description=description,
    long_description=long_description,
    # ...
)
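
The compiler module was removed in Python 3, so on a current interpreter the same idea can be sketched with the standard ast module. This is a minimal sketch, not a drop-in replacement for the recipe above: the name split_doc_from_source and the choice to take source text instead of a file path are assumptions made here so the example is self-contained, and EXAMPLE_SOURCE is a made-up stand-in for my_module/__init__.py.

```python
import ast
import pydoc


def split_doc_from_source(source, filename="<unknown>"):
    """Return (description, long_description) without importing the module."""
    # ast.parse only builds a syntax tree; none of the module's code runs.
    tree = ast.parse(source, filename=filename)
    doc = ast.get_docstring(tree)  # None when there is no module docstring
    if doc is None:
        raise RuntimeError("could not parse doc string from %s" % filename)
    return pydoc.splitdoc(doc)


# Hypothetical module source. The import below would fail if the module
# were actually imported, but parsing never executes it.
EXAMPLE_SOURCE = '''"""Do spiffy things.

A longer explanation of the spiffy things
this module does.
"""
import some_dependency_that_is_not_installed_yet
'''

description, long_description = split_doc_from_source(EXAMPLE_SOURCE)
```

In a real setup.py you would read the file first, for example by passing open(path).read() as the source, and hand the two results to setup() exactly as in the recipe above.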

The handy splitdoc() function treats the first line of a docstring as its "short description" when that line is followed by a blank line (see PEP 257) and returns everything after the blank line as the long description. If the second line is not blank, the short description comes back empty and the whole docstring becomes the long description. Note that setuptools is not required; I just prefer it for distribution. You could also take this a step further and parse out a __version__ value, as long as it is hard-coded.
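
Parsing out a hard-coded __version__ might look something like the following. It is a rough sketch using Python 3's ast module rather than the compiler module from the recipe, and find_version is a hypothetical helper name, not part of setuptools or the standard library. It only looks at top-level assignments and gives up on anything that is not a plain literal.

```python
import ast


def find_version(source, filename="<unknown>"):
    """Return the literal assigned to a top-level __version__, or None."""
    tree = ast.parse(source, filename=filename)
    for node in tree.body:  # top-level statements only
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__version__":
                try:
                    # literal_eval accepts only literals, so a computed
                    # value such as a function call is rejected here.
                    return ast.literal_eval(node.value)
                except (ValueError, TypeError):
                    return None
    return None
```

For a module containing __version__ = "1.0" this returns the string "1.0"; for a version computed at import time it returns None, which is the "as long as it is hard-coded" caveat in practice.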

Introspection is often easier at runtime, but you have to contend with some side effects: a possible ImportError and unwanted code execution. That makes the AST approach a good one to know. I've been interested in it for various reasons lately, mainly for automatic discovery of tests for nose. I've even submitted a talk proposal for PyCon on it entitled Advanced Introspection Using Abstract Syntax Trees ... but I went a little crazy and submitted 2 more talk proposals and am co-authoring another!