a make example using doit

31 08 2009

Intro

This is a reply to a comment about doit on reddit. You should be familiar with doit to understand this article. The comment says:

"Much too verbose compared to make syntax." (nirs)

Well, that is actually true, but intentional. Here I will explain why it was designed this way, and show how easily it can be extended to be less (or more) verbose. The design of the syntax was based on the following principles:

  1. syntax must be easy to understand (even if a bit verbose).
  2. syntax must be easy to extend. different domains require different syntaxes to stay concise. Maybe I should say doit is a build-tool framework.

doit != make

doit tasks can be a bit more complex than make’s target-dependencies-command (dependencies = prerequisites). doit supports multiple targets, a setup object… and it is extensible, e.g. the next release will support a "clean" parameter. Once you have a fixed syntax it is hard to expand it and add new features. But let’s admit it: if all you want is target-dependencies-command, doit is too verbose. Right?

Let’s say you want to specify tasks in a way similar to make… but written directly as python code. (You could also define them in a text file and parse it.)
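
The text-file option is not pursued in this post, but as a rough sketch of what it could look like (this is not part of doit; the `parse_rules` name and the accepted format are assumptions made up for this illustration), a make-like description can be parsed into a dictionary that maps each target to its dependencies and commands:

```python
def parse_rules(text):
    """Parse a make-like description into {target: (dependencies, commands)}.

    Hypothetical helper, not part of doit. Assumed format: a
    'target : dep1 dep2' line followed by indented command lines.
    """
    rules = {}
    target = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t":
            # indented line: a command that belongs to the current target
            if target is None:
                raise ValueError("command without a target: %r" % line)
            rules[target][1].append(line.strip())
        else:
            # a 'target : dependencies' line starts a new rule
            name, _, deps = line.partition(":")
            target = name.strip()
            rules[target] = (deps.split(), [])
    return rules


EXAMPLE = """\
edit : main.o kbd.o
        cc -o edit main.o kbd.o

main.o : main.c defs.h
        cc -c main.c
"""

rules = parse_rules(EXAMPLE)
```

The resulting dictionary has the same shape of information as the python dictionaries used in the versions below, so the same kind of task generator could consume it.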

I will take the example from the GNU make manual. I removed part of it because this is a blog post, not a book :D

version 0

make example:

edit : main.o kbd.o command.o display.o insert.o files.o
        cc -o edit main.o kbd.o command.o display.o insert.o files.o

main.o : main.c defs.h
        cc -c main.c
kbd.o : kbd.c defs.h command.h
        cc -c kbd.c
command.o : command.c defs.h command.h
        cc -c command.c
display.o : display.c defs.h buffer.h
        cc -c display.c
insert.o : insert.c defs.h buffer.h
        cc -c insert.c
files.o : files.c defs.h buffer.h command.h
        cc -c files.c
clean :
        rm edit main.o kbd.o command.o display.o \
           insert.o files.o

A very naive translation to doit:

DEFAULT_TASKS = ['edit']

COMPILE = "cc -c %s"

def task_edit():
    return {'action': ('cc -o edit main.o kbd.o command.o display.o '+
                       'insert.o files.o'),
            'dependencies': ['main.o', 'kbd.o', 'command.o', 'display.o',
                             'insert.o', 'files.o'],
            'targets': ['edit']
            }

def task_main():
    return {'action': COMPILE % 'main.c',
            'dependencies': ['main.c', 'defs.h'],
            'targets': ['main.o']
            }

def task_kbd():
    return {'action': COMPILE % 'kbd.c',
            'dependencies': ['kbd.c', 'defs.h', 'command.h'],
            'targets': ['kbd.o']
            }

def task_command():
    return {'action': COMPILE % 'command.c',
            'dependencies': ['command.c', 'defs.h', 'command.h'],
            'targets': ['command.o']
            }

def task_display():
    return {'action': COMPILE % 'display.c',
            'dependencies': ['display.c', 'defs.h', 'buffer.h'],
            'targets': ['display.o']
            }

def task_insert():
    return {'action': COMPILE % 'insert.c',
            'dependencies': ['insert.c', 'defs.h', 'buffer.h'],
            'targets': ['insert.o']
            }

def task_files():
    return {'action': COMPILE % 'files.c',
            'dependencies': ['files.c', 'defs.h', 'buffer.h', 'command.h'],
            'targets': ['files.o']
            }

def task_clean():
    return "rm edit main.o kbd.o command.o display.o insert.o files.o"

Sure. This is too verbose. But that’s not the way things should be done…

version 1

a more reasonable "translation".

doit make1:

DEFAULT_TASKS = ['make1:edit']

TASKS = {
    'edit': ("main.o kbd.o command.o display.o insert.o files.o",
             "cc -o edit main.o kbd.o command.o display.o insert.o files.o"),
    'main.o': ("main.c defs.h",
               "cc -c main.c"),
    'kbd.o': ("kbd.c defs.h command.h",
              "cc -c kbd.c"),
    'command.o': ("command.c defs.h command.h",
                  "cc -c command.c"),
    'display.o': ("display.c defs.h buffer.h",
                  "cc -c display.c"),
    'insert.o': ("insert.c defs.h buffer.h",
                 "cc -c insert.c"),
    'files.o': ("files.c defs.h buffer.h command.h",
                "cc -c files.c"),
    'clean': ("",
              "rm edit main.o kbd.o command.o display.o insert.o files.o")
    }

def task_make1():
    for target, rule in TASKS.items():
        yield {'name': target,
               'targets': [target],
               'dependencies': rule[0].split(),
               'action': rule[1]}

This version is no more verbose than the make version (except for the "" and ()). The task function is all you need to "translate" from make syntax to doit syntax. But nobody writes makefiles like this… Let’s follow the improvements from the make manual.

Version 2

Is anyone impressed by variables?

make example 2:

objects = main.o kbd.o command.o display.o insert.o files.o

edit : $(objects)
        cc -o edit $(objects)
main.o : main.c defs.h
        cc -c main.c
kbd.o : kbd.c defs.h command.h
        cc -c kbd.c
command.o : command.c defs.h command.h
        cc -c command.c
display.o : display.c defs.h buffer.h
        cc -c display.c
insert.o : insert.c defs.h buffer.h
        cc -c insert.c
files.o : files.c defs.h buffer.h command.h
        cc -c files.c
clean :
        rm edit $(objects)

doit make2:

DEFAULT_TASKS = ['make2:edit']

objects = "main.o kbd.o command.o display.o insert.o files.o"
TASKS = {
    'edit': (objects,
             "cc -o edit " + objects),
    'main.o': ("main.c defs.h",
               "cc -c main.c"),
    'kbd.o': ("kbd.c defs.h command.h",
              "cc -c kbd.c"),
    'command.o': ("command.c defs.h command.h",
                  "cc -c command.c"),
    'display.o': ("display.c defs.h buffer.h",
                  "cc -c display.c"),
    'insert.o': ("insert.c defs.h buffer.h",
                 "cc -c insert.c"),
    'files.o': ("files.c defs.h buffer.h command.h",
                "cc -c files.c"),
    'clean': ("",
              "rm edit " + objects)
    }

def task_make2():
    for target, rule in TASKS.items():
        yield {'name': target,
               'targets': [target],
               'dependencies': rule[0].split(),
               'action': rule[1]}

Version 3

make has built-in support for some kinds of tasks, like compiling a C object file.

make example 3:

objects = main.o kbd.o command.o display.o insert.o files.o

edit : $(objects)
        cc -o edit $(objects)

main.o : defs.h
kbd.o : defs.h command.h
command.o : defs.h command.h
display.o : defs.h buffer.h
insert.o : defs.h buffer.h
files.o : defs.h buffer.h command.h

.PHONY : clean
clean :
        rm edit $(objects)

doit make3:

DEFAULT_TASKS = ['make3:edit']

objects = "main.o kbd.o command.o display.o insert.o files.o"
TASKS = {
    'edit': [objects, "cc -o edit " + objects],
    'main.o': ["defs.h"],
    'kbd.o': ["defs.h command.h"],
    'command.o': ["defs.h command.h"],
    'display.o': ["defs.h buffer.h"],
    'insert.o': ["defs.h buffer.h"],
    'files.o': ["defs.h buffer.h command.h"],
    'clean': ["", "rm edit " + objects]
    }

def task_make3():
    # letting doit deduce the commands
    for target, rule in TASKS.items():
        if target.endswith('.o') and len(rule)==1:
            c_file = "%s.c" % target[:-2]
            rule[0] += " %s" % c_file
            rule.append("cc -c %s" % c_file)

    for target, rule in TASKS.items():
        yield {'name': target,
               'targets': [target],
               'dependencies': rule[0].split(),
               'action': rule[1]}

When you have python do you really need built-in support for simple string manipulations?
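
For instance, the deduction of the compile command can be pulled out into a small standalone function that returns new values instead of modifying the rules in place. This is only a sketch; `implicit_c_rule` is a name invented here, not something doit provides:

```python
import os.path


def implicit_c_rule(target, dependencies):
    """Mimic make's implicit rule for building 'name.o' from 'name.c'.

    Hypothetical helper (not part of doit): returns a new
    (dependencies, command) pair and leaves its arguments untouched.
    """
    base, ext = os.path.splitext(target)
    if ext != ".o":
        raise ValueError("no implicit rule for %r" % target)
    c_file = base + ".c"
    # the C source becomes an extra dependency, as in the loop above
    return dependencies + [c_file], "cc -c %s" % c_file


deps, command = implicit_c_rule("kbd.o", ["defs.h", "command.h"])
```

Here `deps` ends up as the header files plus `kbd.c`, and `command` is the matching `cc -c` call.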

Version 4

There are different styles for specifying dependencies…

make example 4:

objects = main.o kbd.o command.o display.o insert.o files.o

edit : $(objects)
        cc -o edit $(objects)

$(objects) : defs.h
kbd.o command.o files.o : command.h
display.o insert.o files.o : buffer.h

doit make4:

DEFAULT_TASKS = ['make4:edit']

objects = "main.o kbd.o command.o display.o insert.o files.o"
RULES = {
    'edit': [objects, "cc -o edit " + objects],
    objects: ["defs.h"],
    'kbd.o command.o files.o': ["command.h"],
    'display.o insert.o files.o': ["buffer.h"],
    }

def task_make4():
    # another style of doit
    tasks = {}
    for targets, rule in RULES.items():
        for target in targets.split():
            if target not in tasks:
                tasks[target] = rule[:]
            else:
                tasks[target][0] += " %s" % rule[0]

    # letting doit deduce the commands
    for target, rule in tasks.items():
        if target.endswith('.o') and len(rule)==1:
            c_file = "%s.c" % target[:-2]
            rule[0] += " %s" % c_file
            rule.append("cc -c %s" % c_file)

    for target, rule in tasks.items():
        yield {'name': target,
               'targets': [target],
               'dependencies': rule[0].split(),
               'action': rule[1]}

Just a few more lines of code and voilà.

Conclusion

The provided syntax is the basic one. You can create your own syntax on top of it, one that suits your domain. You can extend it even though there is no API, nothing to import, nothing to subclass :)

In the future I will probably include some helper methods for well-known domains like building C programs. Feel free to contribute ;) I guess doit could support the styles used in make, SCons, XML, JSON, you name it!

Caveats

To be fair, there is a small problem. When generators are used to create tasks, the tasks are considered "sub-tasks" and get a compound name like "make4:edit". I will file a bug for that. It should be possible to generate tasks without having them considered "sub-tasks".





doit – a build-tool tale

14 04 2008

doit is a build-tool-like utility written in python. In this post I explain my motivation for writing yet another build tool. If you just want to use it, please check the website.

Build-tool

I started working on a web project… As a good TDD disciple I have lots of tests spawning everywhere. There are plain python unittest tests, twisted’s trial tests and Django-specific unit tests. That’s all for python, but I also have unit tests for javascript (using a home-grown unit test framework) and regression tests using Selenium. Running lint tools (JavaScriptLint and PyFlakes) is just as important.

So I have seven tools to help me keep the project healthy. But I need one more to control the seven tools! Actually there are more: I am not counting the javascript compression tool, the documentation generator…

I am not looking for a continuous integration system (at least not right now). I want to execute the tests in an efficient way and catch problems before committing the code to a VCS.

- What tool do we use to automate running tasks?
- GNU Make. Or any other build tool.

SCons

I had the misfortune to (try to) debug some Makefiles before. XML-based tools were never really an option for me. Since I work with python, SCons looked like a good bet.

SCons. Writing the rules/tasks in python helps a lot. But the configuration (construct) file is not as simple and intuitive as I would expect. Maybe it is too powerful for my needs. That’s OK, I don’t have to write new “Builders” that often.

Things went OK for a while… but then things started to get too slow. Normal python tests are fast enough not to bother about. But the execution time of the Django tests using postgres does bother me. The javascript tests run in the browser, so the tool needs to start the server, launch the browser, load and execute the tests… uuoooohhhh.

Most of the time I really need to execute just a subset of tests/tasks. The whole point of build tools is to keep track of dependencies and re-build only what is necessary, right? The problem with tests is that I am actually not building anything. I am executing tasks (in this case tests). Building something is a “task” with “target” file(s), but running a test is a “task” with no “target”. The problem is that build tools were designed to keep track of target/file dependencies, not task dependencies. Yes, I know you can use hacks to pretend that every task has a target file. But I was not really willing to do this…

I was not using any of the great SCons features. Actually, at some point I easily replaced it with a simple (but lengthy) python script using the subprocess module. Of course this didn’t solve the speed problem.

doit

doit. I want a tool to automatically execute any kind of task, with a target or not. It must keep track of the dependencies and re-do (or re-execute) tasks only if necessary (like every build tool does for target files). And of course it shouldn’t get in my way while I am specifying the tasks.

Requirements:

  1. keep track of dependencies. they must be specified by the user, no automatic dependency analysis. (i.e. nearly every build tool supports this)
  2. easy to create new task rules. (i.e. write them in python)
  3. get out of your way, avoid boilerplate code. (i.e. something like what nose does for unittest)
  4. dependencies are tracked by task, not by file/target.

The only distinctive requirement is item 4. I guess any tool that implements dependency on targets could support dependency on tasks without much effort.
You just need to save the signature of the dependency files on successful completion of the task. If none of the dependencies’ signatures has changed, the task doesn’t need to be executed again. Since a target is not required, tasks need to be uniquely identified. But that’s an implementation detail…
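
A minimal sketch of that idea, not doit’s actual implementation: the function names, the JSON file used as storage and the choice of SHA-256 as the signature are all assumptions made for this illustration. A task is identified by its name, and its action is skipped when the saved signatures of all its dependencies match the current ones.

```python
import hashlib
import json
import os
import tempfile


def file_signature(path):
    """Return a hex digest of the file content (SHA-256 is an assumption)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def is_up_to_date(task_name, dependencies, saved):
    """True if all dependencies still have the signatures saved last time."""
    if not dependencies:
        return False  # assumption: a task without dependencies always runs
    current = {dep: file_signature(dep) for dep in dependencies}
    return saved.get(task_name) == current


def run_task(task_name, dependencies, action, db_path):
    """Run `action` unless its dependencies are unchanged; return True if run."""
    saved = {}
    if os.path.exists(db_path):
        with open(db_path) as f:
            saved = json.load(f)
    if is_up_to_date(task_name, dependencies, saved):
        return False
    action()  # an exception means failure, so no signature gets saved
    saved[task_name] = {dep: file_signature(dep) for dep in dependencies}
    with open(db_path, "w") as f:
        json.dump(saved, f)
    return True


# usage: a fake "test" task that depends on a single source file
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "module.py")
db = os.path.join(workdir, "signatures.json")
with open(source, "w") as f:
    f.write("x = 1\n")

runs = []


def action():
    runs.append("executed")


first = run_task("test", [source], action, db)   # nothing saved yet: runs
second = run_task("test", [source], action, db)  # unchanged: skipped
with open(source, "w") as f:
    f.write("x = 2\n")
third = run_task("test", [source], action, db)   # content changed: runs again
```

Note that nothing here needs a target file: the task name alone is the key under which the signatures are stored.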

So what does it look like?

look at the tutorial:





how to execute tests on a bazaar pre-commit hook

20 01 2008
you should always execute a project’s test suite before committing the source code to a version control system. everybody knows this but sometimes we forget… so let’s have a pre-commit hook that will automatically execute the test suite. i am using bazaar.

in bazaar you write hooks as plugins.

how it works

the “hook” is a simple python function that is executed when bzr commit is called, before the commit really happens. it is very simple: it just runs an executable and checks its result. if it is successful the commit operation is executed. if it fails the commit operation is cancelled.

plugins in bazaar are not project specific, so you can’t control in which projects (branches) your plugin will be applied (it will be applied to all). so, first the plugin checks if the branch contains a “precommit” executable. it must be named “precommit” and must be in the root path of your branch.

the plugin file should be placed in ~/.bazaar/plugins/. the name of the module doesn’t matter (i called it “check_pre_commit.py”). putting the boilerplate code together i got this:

"""this is a plugin/hook for bazaar. just add this file to ~/.bazaar/plugins/"""

from bzrlib import branch

def pre_commit_hook(local, master, old_revno, old_revid, future_revno, future_revid, tree_delta, future_tree):
    """This hook will execute precommit script from root path of the bazaar
    branch. Commit will be canceled if precommit fails."""

    import os,subprocess
    from bzrlib import errors

    # this hook only makes sense if a precommit file exists.
    if not os.path.exists("precommit"):
        return
    try:
        subprocess.check_call(os.path.abspath("precommit"))
    # if precommit fails (the process returns non-zero) cancel the commit.
    except subprocess.CalledProcessError:
        raise errors.BzrError("pre commit check failed.")

branch.Branch.hooks.install_hook('pre_commit', pre_commit_hook)
branch.Branch.hooks.name_hook(pre_commit_hook, 'Check pre_commit hook')

that’s all for the hook.

now we just need to create an executable that will run the tests and return something different from zero if they fail, and put it in the root path of the branch.
in this example i execute my tests using nose.

precommit

#!/bin/sh
nosetests

ok. everything is working. but should those who remember to run the tests be “punished” by having to wait for the tests to run twice (once manually before committing and once automatically in the pre-commit hook)? no. if you remember to run the tests you can also remember to skip the plugin :)

bzr commit --no-plugins

Update: You can put your tests in a smart build-system (like doit) to keep track of the modifications in your source code and re-execute only the required tests (the ones that depend on the modified files).

caveats

  • the precommit executable is part of the branch but the hook is not. so just branching the project is not enough. every user has to install the plugin. but just once for all projects.

  • the hook is executed after you supply the commit message. it is annoying to write a message and then lose it because the commit failed. or maybe it is a good punishment for trying to commit untested code :)

  • the output gets a bit messed up. but not a big deal…

    eduardo@eduardo-laptop:~/work/doit$ bzr commit
    Committing to: /home/eduardo/work/doit/
    modified precommit
    [==========      ] Running pre_commit hooks [Check pre_commit hook] - Stage 3/5....
    ----------------------------------------------------------------------
    Ran 4 tests in 0.268s
    
    OK
    Committed revision 2.

that’s it.