How to auto-reload a server on changes with doit

5 03 2012

The other day a friend told me he was using wsgiref.simple_server and asked whether doit could be used to auto-reload the server. My first answer was: do not use wsgiref :)

But you might prefer to use wsgiref & doit for auto-reload for two reasons: doit can be bundled in a single file and included in the project, and you might want explicit control over when the server is reloaded.

server.py:

from wsgiref.simple_server import make_server, demo_app

httpd = make_server('', 8000, demo_app)
print("Serving HTTP on port 8000...")

# Respond to requests until process is killed
httpd.serve_forever()

dodo.py:

import subprocess
import glob
import os
import signal

def start_server(server_cmd, pid_filename, restart=False):
    # check if the server is already running
    if os.path.exists(pid_filename):
        if restart:
            stop_server(pid_filename)
        else:
            msg = "It seems the server is already running, check the file %s"
            print(msg % pid_filename)
            return False

    # start server
    process = subprocess.Popen(server_cmd.split())

    # create pid file
    with open(pid_filename, 'w') as pid_file:
        pid_file.write(str(process.pid))
    return True

def stop_server(pid_filename):
    # check if the server is running
    if not os.path.exists(pid_filename):
        return
    # try to terminate/stop server's process
    with open(pid_filename) as pid_file:
        pid = int(pid_file.read())
        try:
            os.kill(pid, signal.SIGTERM)
        except OSError:
            pass  # ignore errors if process does not exist
    # remove pid file
    os.unlink(pid_filename)


########################################

DOIT_CONFIG = {'default_tasks': ['restart']}

PID_FILENAME = 'pid.txt'
START_SERVER = 'python server.py'

def task_start():
    return {'actions': [(start_server, (START_SERVER, PID_FILENAME,))]}

def task_stop():
    return {'actions': [(stop_server, (PID_FILENAME,))]}

def task_restart():
    return {'actions': [(start_server, (START_SERVER, PID_FILENAME, True))],
            'file_dep': glob.glob('*.py'),
            'uptodate': [False],
            }

To auto-reload/restart the server we need to be able to start and stop it with two independent commands.
So when starting the server we create a text file containing the PID of the server process.

The start_server function takes a boolean parameter ‘restart’ that controls the behaviour when a pid file already exists.
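A pid file can be left behind if the server dies on its own. A minimal sketch of how one could check whether the recorded process is still alive before trusting the file. The helper name is_server_running is an assumption (it is not part of the dodo.py above), and the signal-0 trick is POSIX only:

```python
import errno
import os


def is_server_running(pid_filename):
    """Return True if the pid file exists and its process is still alive."""
    if not os.path.exists(pid_filename):
        return False
    with open(pid_filename) as pid_file:
        try:
            pid = int(pid_file.read())
        except ValueError:
            return False  # pid file is empty or corrupted
    try:
        os.kill(pid, 0)  # signal 0 only checks; nothing is delivered (POSIX)
    except OSError as error:
        # EPERM means the process exists but belongs to another user
        return error.errno == errno.EPERM
    return True
```

start_server could call something like this and silently replace a stale pid file instead of refusing to start.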

task_restart

Usually ‘file_dep’ is used to indicate when a task is up-to-date, but in this case we use it just to trigger a re-execution of the task in the ‘auto’ mode.

So apart from the action that restarts the server, the task’s file_dep controls which files to watch for modifications. Since we always want to start the server when the task is called, we need to set the ‘uptodate’ parameter to false.

To use it just type:

$ doit auto restart
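The task above only watches the *.py files in the current folder. If the application is split into packages, the list for file_dep could be collected recursively. A small sketch, where the helper name python_files is hypothetical:

```python
import fnmatch
import os


def python_files(root='.'):
    """Collect every *.py file below root, including sub-packages."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, '*.py'):
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

In task_restart one would then write 'file_dep': python_files() instead of glob.glob('*.py').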




A faster (but incomplete) implementation of SCons on top of doit

19 07 2011

Motivation

doit is an automation tool. It is a kind of build tool, but more generic…

My motivation was to demonstrate how to create a specialized interface for defining tasks. doit can be used for many different purposes, so its default interface can be quite verbose when compared to tools created to solve one specific problem.

Instead of creating an interface myself I decided to use an existing one. I picked SCons. So the goal was to be able to build a C/C++ project using an existing SConstruct file without any modification. And of course it should be as good as SCons at dependency tracking, always ensuring a correct result.

A secondary goal was to make it fast.

Note that this implementation is very far from complete. I only implemented the bare minimum to get some benchmarks running.

Implementation

I won’t go into the gory details… Just a few notes. You can check the code here.

In docons.py there is an implementation of the API available in SConstruct files. When a “Builder Method” (like Program or Object) is executed, a reference to the builder is saved in a global variable. These “Builder Methods” are actually implemented as classes that can generate dictionaries representing doit tasks.
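To give an idea of the shape of such a builder, here is a rough sketch. It is not the code from docons.py: the names BUILDERS, ObjectBuilder and gen_task are made up for illustration, and the compile command is hard-coded.

```python
# Hypothetical sketch; these names do not come from docons.py.
BUILDERS = []  # global registry filled while the SConstruct file runs


class ObjectBuilder(object):
    """Stand-in for a "Builder Method" that compiles one C source file."""

    def __init__(self, source):
        self.source = source
        self.target = source.rsplit('.', 1)[0] + '.o'
        BUILDERS.append(self)  # keep a reference for later task collection

    def gen_task(self):
        """Return a dictionary in the format doit uses for tasks."""
        return {
            'name': self.target,
            'actions': ['cc -c %s -o %s' % (self.source, self.target)],
            'file_dep': [self.source],
            'targets': [self.target],
        }


def Object(source):
    """Name exposed to the SConstruct file."""
    return ObjectBuilder(source)
```

Calling Object('hello.c') only records the builder; the task dictionary is produced later, when the tasks are collected.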

In doit the configuration file that defines your tasks is called dodo.py. In this case the end user won’t edit this file directly. dodo.py will import the SCons API namespace from docons, then it will execfile the SConstruct file and collect the tasks from the “Builder Methods”.
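The idea of running a SConstruct file inside a prepared namespace can be sketched as below. This is only an illustration: load_sconstruct and the recording Program function are hypothetical, and exec(compile(...)) is used in place of execfile so the snippet also runs on Python 3.

```python
def load_sconstruct(source_text, api_namespace):
    """Run SConstruct code with the given API names pre-defined."""
    namespace = dict(api_namespace)
    exec(compile(source_text, 'SConstruct', 'exec'), namespace)
    return namespace


# Hypothetical API: each call only records what was requested.
calls = []


def Program(target, sources):
    calls.append(('Program', target, list(sources)))


# The text of a tiny SConstruct file; normally it would be read from disk.
SCONSTRUCT = "Program('hello', ['hello.c', 'util.c'])\n"
load_sconstruct(SCONSTRUCT, {'Program': Program})
```

After the call, calls holds one entry describing the program to build, which is the information a dodo.py needs to generate its tasks.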

Creating tasks for compile/link is straightforward. The hard part is automatically finding out the dependencies in the source code and taking them into account in your tasks. To find out the dependencies (the #include for C code) in the source I am using the same C preprocessor as used by SCons.
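Just to illustrate what “finding the #include dependencies” means, here is a deliberately naive sketch based on a regular expression. This is not the preprocessor approach mentioned above: it ignores conditional compilation, macros and comments, and the name quoted_includes is made up.

```python
import re

# Matches lines such as:  #include "class_0.hpp"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def quoted_includes(source_text):
    """Return the headers named in #include "..." lines, in order."""
    return INCLUDE_RE.findall(source_text)
```

System headers written as #include <...> are skipped on purpose, since they are not part of the project being built.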

SCons uses the concept of a “Scanner” function associated with a Builder. In doit the implicit dependencies are “calculated” in a separate task. The dependencies are then put into the build (compile/link) tasks through calc_dep (calculated dependencies).
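A minimal sketch of how calc_dep ties the two tasks together in a dodo.py, as far as I understand the doit feature: the helper task’s action returns a dictionary with ‘file_dep’, and the build task names the helper task in ‘calc_dep’. The file names, the task names and the regex-based scanner are assumptions made for the example.

```python
import re


def find_includes(source):
    """Action of the helper task: return a dict of calculated dependencies."""
    with open(source) as handle:
        headers = re.findall(r'#include\s*"([^"]+)"', handle.read())
    return {'file_dep': headers}


def task_scan():
    """Helper task that calculates the implicit dependencies of hello.c."""
    return {
        'actions': [(find_includes, ['hello.c'])],
        'file_dep': ['hello.c'],
    }


def task_compile():
    """Build task; its file_dep is extended with the result of 'scan'."""
    return {
        'actions': ['cc -c hello.c -o hello.o'],
        'file_dep': ['hello.c'],
        'targets': ['hello.o'],
        'calc_dep': ['scan'],
    }
```

Because ‘scan’ has hello.c as its own file_dep, the headers are only re-scanned when the source file changes.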

A faster implementation

It seems SCons creates the whole dependency graph before starting to execute the tasks/builders. Because of this it doesn’t scale so well when the number of files increases.

doit creates the dependency graph dynamically during task execution. But even on a no-op build it will end up with a complete graph built because it will check all task dependencies.

tup is a build-tool that saves the dependency graph in a SQLite database. I decided to give its approach a try. So I created “dup” – sorry for the name :) . It will still read the build configuration from SConstruct files but it will keep a SQLite database mapping each target to all of its dependencies. This enables much faster no-op and incremental builds. The underlying dependency graph of a target will only be built if required.
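The kind of target-to-dependencies mapping described above could look roughly like this with the sqlite3 module. The table layout and the function names are assumptions for illustration, not the schema actually used by dup.

```python
import sqlite3


def open_db(path=':memory:'):
    """Create the (hypothetical) table mapping a target to its dependencies."""
    conn = sqlite3.connect(path)
    conn.execute('CREATE TABLE IF NOT EXISTS deps ('
                 'target TEXT, dependency TEXT, '
                 'PRIMARY KEY (target, dependency))')
    return conn


def save_deps(conn, target, dependencies):
    """Replace the stored dependency list of one target."""
    conn.execute('DELETE FROM deps WHERE target = ?', (target,))
    conn.executemany('INSERT INTO deps VALUES (?, ?)',
                     [(target, dep) for dep in dependencies])
    conn.commit()


def load_deps(conn, target):
    """Fetch the dependencies of one target without touching the others."""
    rows = conn.execute('SELECT dependency FROM deps WHERE target = ? '
                        'ORDER BY dependency', (target,))
    return [row[0] for row in rows]
```

The point is that load_deps answers “what does this target depend on?” with a single query, without rebuilding the graph for the rest of the project.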

Benchmarks

I did some very basic benchmarks of SCons and my two SCons implementations. The benchmarks were created using the gen-bench script from the wonderbuild benchmarks.

The gen-bench script was run with the arguments “50 100 15 5”. This generates 10000 tiny interdependent C++ source and header files, to be built into 50 static libs.

All benchmarks were run on an Intel i3 550 quad-core (3.20GHz), running Ubuntu 10.10, Python 2.6.6, doit 0.13.0, SCons 2.0.0. All benchmarks were run using a single process.

The graph doesn’t include the full build time, which was SCons=266 seconds, docons=249 seconds, dup=253 seconds.

  • no-op 1 lib -> no-operation build when all files are up-to-date and only one of the 50 libraries was selected to be built (scons build-scons/lib_0/lib0.a)
  • no-op -> no-operation build
  • partial build – cpp file -> only the file lib_1/class_0.cpp was modified. Rebuilt 1 object file and 1 lib.
  • partial build – hpp file -> only the file lib_0/class_0.hpp was modified. Rebuilt 30 object files and 14 libs.

Analysis

  1. comparing SCons and docons on a no-op build you can see how doit is considerably faster than SCons at creating the dependency graph and checking what to build.
  2. comparing “no-op 1 lib” with “no-op” you can see how both SCons and docons suffer a performance degradation from creating the dependency graph (5.2 and 3.6 times slower respectively), while for dup the size of the dependency graph has little influence on a no-op build.
  3. all 3 solutions show almost no difference between a no-op build and a partial/incremental build where a single source file is modified.
  4. as the number of built objects increases, the advantage of dup over docons is reduced because it requires the dependency graph of the affected tasks to be built.

Should you stop using SCons?

Probably not. The implementation is very incomplete and probably buggy in many ways. This code was written just as a proof-of-concept (and for fun) to check how powerful, flexible and fast doit can be.

I personally have no interest in developing a C/C++ build tool, and if I were to build one I would create a different interface from the one used by SCons.





appengine & virtualenv

21 11 2010

UPDATE: updated for appengine 1.6.1; now uses gaecustomize.py

This article explains how to set up Google AppEngine (GAE) with virtualenv.

GAE does not provide a “setup.py” to make the SDK “installable”; it is supposed to be used from a folder without being “installed”. GAE actually forbids the use of any python library in the site-packages folder. All included libraries must be in the same folder as your application; this allows GAE to automatically find and upload third-party libraries together with your application code when you upload it to the GAE servers.

So what would be the advantages of using virtualenv with GAE? The main reason is to have an environment to run unit tests and functional tests. It also allows us to use the interactive shell to perform operations on the DB, and it ensures you are using the correct python version.

Step 0 – install App Engine SDK

Install it as described in the official docs. Make sure you use 1.6.1 or later; an important bug was fixed in this release. Tested with virtualenv 1.6.4.

Step 1 – create and activate a virtualenv

Same as usual…


$ virtualenv --python python2.5 --no-site-packages gae-env
$ source gae-env/bin/activate

Step 2 – add google_appengine path

Add a path configuration file named “gae.pth” to the virtualenv site-packages with the path to google_appengine. This way google_appengine will be in sys.path enabling it to be imported by other modules.

You will need to adjust the content of the file according to where you created your virtualenv and google_appengine location. Mine looks like this:


$ cat gae-env/lib/python2.5/site-packages/gae.pth
../../../../google_appengine
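A relative entry in a .pth file is resolved against the folder that contains the .pth file. A quick way to double-check where the entry above ends up (resolve_pth_entry is just a throw-away helper, and the paths assume the layout used in this article):

```python
import os


def resolve_pth_entry(site_packages, entry):
    """Resolve a .pth line relative to the directory holding the .pth file."""
    return os.path.normpath(os.path.join(site_packages, entry))


resolved = resolve_pth_entry('gae-env/lib/python2.5/site-packages',
                             '../../../../google_appengine')
```

With this layout the four “..” climb out of gae-env, so google_appengine is expected to sit next to the gae-env folder.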

Simple test to make sure your gae.pth is correct:

(gae-env)$ python
>>> from google import appengine

If you did not get any exception you are good to go.

Step 3 – fix path for third-party libs

The AppEngine SDK comes with a few third-party libraries. They are not in the same path as google’s libraries. If you look at dev_appserver.py you will see a function called fix_sys_path; this function adds the path of the third-party libraries to python’s sys.path. One option would be to add these paths to gae.pth… But I prefer to use the function fix_sys_path so we have fewer chances of running into problems with future releases of the SDK.

Note that this will not look for your config in app.yaml. So you might need to add some extra imports. The example below uses webob version 1.1.1 instead of the default one.

Path configuration files can also execute python code if a line starts with import. Add a module gaecustomize.py to site-packages:

gae-env/lib/python2.5/site-packages/gaecustomize.py

def fix_sys_path():
    try:
        import sys, os
        from dev_appserver import fix_sys_path, DIR_PATH
        fix_sys_path()
        # must be after fix_sys_path
        # uses non-default version of webob
        webob_path = os.path.join(DIR_PATH, 'lib', 'webob_1_1_1')
        sys.path = [webob_path] + sys.path
    except ImportError:
        pass

And modify gae.pth so that it calls the above module:

gae-env/lib/python2.5/site-packages/gae.pth

../../../../google_appengine
import gaecustomize; gaecustomize.fix_sys_path()

For some unknown reason gae.pth is being processed twice, and the first time google_appengine is not added to sys.path. That’s why I explicitly call the function fix_sys_path.
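If you want to see the “import line” behaviour of .pth files in isolation, the sketch below builds a throw-away folder and lets the site module process it. The file names demo.pth and demo_customize.py and the DEMO_PTH_RAN variable are made up for this test; site.addsitedir is the standard library function that processes .pth files in a folder.

```python
import os
import site
import tempfile


def demo_pth_import_line():
    """Process a throw-away .pth file and report whether its code ran."""
    site_dir = tempfile.mkdtemp()
    # hypothetical module, standing in for gaecustomize.py
    with open(os.path.join(site_dir, 'demo_customize.py'), 'w') as module:
        module.write("import os\n"
                     "def mark():\n"
                     "    os.environ['DEMO_PTH_RAN'] = '1'\n")
    # a line starting with "import" is executed, not treated as a path
    with open(os.path.join(site_dir, 'demo.pth'), 'w') as pth:
        pth.write("import demo_customize; demo_customize.mark()\n")
    site.addsitedir(site_dir)  # adds the folder and processes its .pth files
    return os.environ.get('DEMO_PTH_RAN') == '1'
```

Note that errors raised by such a line are reported on stderr rather than propagated, so checking for a side effect like this is a simple way to confirm the line really ran.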

Check if it is working fine:


(gae-env)$ python
>>> import yaml

Again, you should not get any exceptions here…

Step 4 – add dev_appserver.py to bin

Not really required but handy.


gae-env/bin $ ln -s ../../google_appengine/dev_appserver.py .

Conclusion

Now you have an isolated environment running AppEngine! But pay attention: libraries used by your production code should not be installed in your virtualenv; you should do it the “GAE way” and link them from your application folder. Install in the virtualenv only what is used by your tests. Check the site.py docs for more details on using .pth files.





doit slides (pycon asia-pacific 2010)

29 06 2010

Here are the slides of my doit presentation at pycon apac 2010





doit @ Asia-Pacific pycon

16 05 2010

The first Asia-Pacific pycon will be on 9 – 11 June 2010 in Singapore. I will give a presentation on doit.

So see you in Singapore ;)





Slide presentations in reStructuredText -> S5 -> PDF

16 03 2010

Let’s say you want to create a slide presentation and you are not very much into presentation software.

S5 is a good enough HTML based alternative for a slideshow presentation. reStructuredText and rst2s5 can free you from writing HTML yourself…

S5 can generate a "printer-friendly" version of your slides. But I was really missing a way to create a PDF version of my slides to ease their distribution. I finally found a tool that could handle that: Prince.

Example (slides.rst):

.. include:: <s5defs.txt>

======================================================================
reStructuredText to PDF
======================================================================

in 2 easy steps

:Author: Eduardo Schettino


(1) rst2s5
=======================

rst => s5

::

  rst2s5 --theme=small-white slides.rst slides.html


(2) prince
=======================

s5 => PDF

::

  prince --media projection -s page.css slides.html -o slides.pdf

Where page.css controls the PDF page size:

@page { size: 1280px 800px }

Caveats

  • Prince is not open source, though it provides a free license for non-commercial use.
  • Page footer is displayed only on first page.
  • Some CSS tweaking might be necessary depending on your theme.




inotify & text editors (emacs, vim, …)

7 03 2010

This post describes how to get notifications on file modifications made through a text editor, using python on linux (ubuntu).

I am working on adding some inotify goodness to doit. For that I want to receive one, and only one, notification every time a file is modified. Inotify does the hard work of watching the file system and Pyinotify provides a python interface. But using it was not as straightforward as I expected. The problem is that editors manipulate files in their own ways…

I started with a modified example from the Pyinotify tutorial to watch for file modifications. The IN_MODIFY event looked perfect to me; it is the event raised when a “file was modified”.

It worked fine when I used “echo”. But then when I tried with Emacs I got 3 notifications. With VIM it was even worse: I got no notifications and an error message!

[Pyinotify ERROR] The pathname ‘notify.py’ of this watch <Watch wd=1 mask=4095 auto_add=False proc_fun=None path=notify.py exclude_filter=<function at 0x257e398> dir=False > has probably changed and couldn’t be updated, so it cannot be trusted anymore. To fix this error move directories/files only between watched parents directories, in this case e.g. put a watch on ‘.’.

The error message is pretty clear: I should watch the folder containing the file… In order to understand what was going on I came up with the following code. It watches for all inotify events on the file’s folder:

import os.path
import pyinotify

class EventHandler(pyinotify.ProcessEvent):
    def process_default(self, event):
        print("==> %s : %s" % (event.maskname, event.pathname))


wm = pyinotify.WatchManager()  # Watch Manager
#mask = pyinotify.IN_MODIFY
mask = pyinotify.ALL_EVENTS
ev = EventHandler()
notifier = pyinotify.Notifier(wm, ev)

for watch_file in ['notify.py']:
    watch_dir = os.path.dirname(os.path.abspath(watch_file))
    wm.add_watch(watch_dir, mask)

notifier.loop()

I got the following results doing modifications on the file with echo, emacs and vim:

  • echo a >> notify.py
    ==> IN_OPEN : /my_folder/notify.py
    ==> IN_MODIFY : /my_folder/notify.py
    ==> IN_CLOSE_WRITE : /my_folder/notify.py

    Exactly what I expected :)

  • emacs (C-x C-s)
    ==> IN_CREATE : /my_folder/.#notify.py
    ==> IN_MODIFY : /my_folder/notify.py
    ==> IN_OPEN : /my_folder/notify.py
    ==> IN_MODIFY : /my_folder/notify.py
    ==> IN_CLOSE_WRITE : /my_folder/notify.py
    ==> IN_DELETE : /my_folder/.#notify.py

    I am not trying to understand emacs internals… but I noticed that IN_CLOSE_WRITE was raised just once.

  • vim (:w)
    ==> IN_MODIFY : /my_folder/.notify.py.swp
    ==> IN_CREATE : /my_folder/4913
    ==> IN_OPEN : /my_folder/4913
    ==> IN_ATTRIB : /my_folder/4913
    ==> IN_CLOSE_WRITE : /my_folder/4913
    ==> IN_DELETE : /my_folder/4913
    ==> IN_MOVED_FROM : /my_folder/notify.py
    ==> IN_MOVED_TO : /my_folder/notify.py~
    ==> IN_CREATE : /my_folder/notify.py
    ==> IN_OPEN : /my_folder/notify.py
    ==> IN_MODIFY : /my_folder/notify.py
    ==> IN_CLOSE_WRITE : /my_folder/notify.py
    ==> IN_ATTRIB : /my_folder/notify.py
    ==> IN_MODIFY : /my_folder/.notify.py.swp
    ==> IN_DELETE : /my_folder/notify.py~
    ==> IN_CLOSE_WRITE : /my_folder/.notify.py.swp
    ==> IN_DELETE : /my_folder/.notify.py.swp

    It seems Vim does not modify the file directly. It just moves/replaces the original one with the edited one through a “swap file”. IN_CLOSE_WRITE was raised for the target file only once…

I played around a bit more and it is clear that IN_CLOSE_WRITE is what I was really looking for. In order to deal with VIM I should watch the folder and not the file itself. Putting all the pieces together I got this code:

import os.path
import pyinotify

class FileModifyWatcher(object):

    def __init__(self, file_list):
        self.file_list = set([os.path.abspath(f) for f in file_list])
        self.watch_dirs = set([os.path.dirname(f) for f in self.file_list])

    def handle_event(self, event):
        if event.pathname in self.file_list:
            print("==> %s : %s" % (event.maskname, event.pathname))

    def loop(self):
        handle_event = self.handle_event
        class EventHandler(pyinotify.ProcessEvent):
            def process_default(self, event):
                handle_event(event)

        wm = pyinotify.WatchManager()  # Watch Manager
        mask = pyinotify.IN_CLOSE_WRITE
        ev = EventHandler()
        notifier = pyinotify.Notifier(wm, ev)

        for watch_this in self.watch_dirs:
            wm.add_watch(watch_this, mask)

        notifier.loop()

if __name__ == "__main__":
    fw = FileModifyWatcher(['notify.py'])
    fw.loop()

To use this, subclass FileModifyWatcher and override handle_event. Just pass the list of files to watch to its constructor and that’s it ;)
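A sketch of what such a subclass could look like. To keep it runnable without pyinotify, the base class here is a cut-down stand-in that copies only the constructor from the code above, and FakeEvent is a hypothetical substitute for a real pyinotify event; ChangeCollector is just an example name.

```python
import os.path


class FileModifyWatcher(object):
    """Stand-in with only the parts of the class above that need no pyinotify."""

    def __init__(self, file_list):
        self.file_list = set([os.path.abspath(f) for f in file_list])
        self.watch_dirs = set([os.path.dirname(f) for f in self.file_list])

    def handle_event(self, event):
        raise NotImplementedError


class ChangeCollector(FileModifyWatcher):
    """Hypothetical subclass: remember which watched files were saved."""

    def __init__(self, file_list):
        FileModifyWatcher.__init__(self, file_list)
        self.changed = []

    def handle_event(self, event):
        # the folder is watched, so events for other files must be skipped
        if event.pathname in self.file_list:
            self.changed.append(event.pathname)


class FakeEvent(object):
    """Minimal substitute for a pyinotify event, for illustration only."""

    def __init__(self, pathname):
        self.pathname = os.path.abspath(pathname)


collector = ChangeCollector(['notify.py'])
collector.handle_event(FakeEvent('4913'))       # vim's temporary file
collector.handle_event(FakeEvent('notify.py'))  # the file we care about
```

With the real class, calling collector.loop() would feed handle_event from inotify instead of from the fake events.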







