ll.sisyphus

Writing cron jobs with Python

ll.sisyphus simplifies running Python code as cron jobs.

At most one sisyphus job with a given name runs at any given time. Each job has a maximum allowed runtime; if this limit is exceeded, the job kills itself.

In addition to that, job execution can be logged.

To use this module, you must derive your own class from Job and implement the execute method.

By default, logs are created in the ~/ll.sisyphus directory. To change this, derive a subclass and override the appropriate class attribute (see logfilename below).

To execute a job, use the module level function execute (or executewithargs when you want to support command line arguments).

Example

The following example illustrates the use of this module:

#!/usr/bin/env python

import os
import urllib
from ll import sisyphus

class Fetch(sisyphus.Job):
        projectname = "ACME.FooBar"
        jobname = "Fetch"
        argdescription = "fetch http://www.python.org/ and save it to a local file"
        maxtime = 180

        def __init__(self):
                self.url = "http://www.python.org/"
                self.tmpname = "Fetch_Tmp_{}.html".format(os.getpid())
                self.officialname = "Python.html"

        def execute(self):
                self.log("fetching data from {!r}".format(self.url))
                data = urllib.urlopen(self.url).read()
                datasize = len(data)
                self.log("writing file {!r} ({} bytes)".format(self.tmpname, datasize))
                # Close the file explicitly before it is renamed
                with open(self.tmpname, "wb") as f:
                        f.write(data)
                self.log("renaming file {!r} to {!r}".format(self.tmpname, self.officialname))
                os.rename(self.tmpname, self.officialname)
                return "cached {!r} as {!r} ({} bytes)".format(self.url, self.officialname, datasize)

if __name__=="__main__":
        sisyphus.executewithargs(Fetch())

You will find the log files for this job in ~/ll.sisyphus/ACME.FooBar/Fetch/.

def literaldecode(exc):

class MaximumRuntimeExceeded(Exception):

def __init__(self, maxtime):

def __str__(self):

class Job(object):

A Job object executes a task once.

To use this class, derive your own class from it and override the execute method.

Logging itself is done by calling self.log:

self.log("can't parse XML file {}".format(filename))

This logs the argument without tagging the line. To add tags to the logging call, simply access attributes of self.log:

self.log.xml.warning("can't parse XML file {}".format(filename))

This adds the tags "xml" and "warning" to the log line.

ll.sisyphus itself uses the following tags:

sisyphus

This tag will be added to all log lines produced by ll.sisyphus itself

init

This tag is used for the log lines output at the start of the job

result

This tag is used for the final line in the log file, which shows a summary of what the job did (or why it failed).

fail

This tag is used in the result line if the job failed with an exception.

kill

This tag is used in the result line if the job was killed because it exceeded the maximum allowed runtime.

The job can be configured in three ways: by class attributes of the Job subclass, by attributes of the Job instance (e.g. set in __init__), and by command line arguments (if executewithargs is used). The following attributes are supported:

projectname (-p or --projectname)

The name of the project this job belongs to. This may be a dot-separated hierarchical project name (e.g. one that includes a customer name).

jobname (-j or --jobname)

The name of the job itself (defaulting to the name of the class if none is given).

argdescription (No command line equivalent)

Description for help message of the command line argument parser.

maxtime (-m or --maxtime)

Maximum allowed runtime for the job (in seconds). If the job runs longer than that, it will kill itself.

fork (--fork)

Forks the process and does the work in the child process. The parent process is responsible for monitoring the maximum runtime (this is the default). In non-forking mode the single process does both the work and the runtime monitoring.

noisykills (--noisykills)

Should a message be printed when the maximum runtime is exceeded?

logfilename (--logfilename)

Path/name of the logfile for this job as a UL4 template. Variables available in the template include user_name, projectname, jobname and starttime.

loglinkname (--loglinkname)

A link that points to the currently active logfile (as a UL4 template). If this is None, no link will be created.

log2file (-f or --log2file)

Should a logfile be written at all?

formatlogline (--formatlogline)

A UL4 template for formatting each line in the logfile. Available variables are time (current time), starttime (start time of the job), tags (list of tags for the line) and line (the log line itself).

keepfilelogs (--keepfilelogs)

The number of days the logfiles are kept. Old logfiles (i.e. any file in the same directory as the current logfile that's more than keepfilelogs days old) will be removed at the end of the job.

inputencoding (--inputencoding)

The encoding to be used for data that is supposed to be unicode, but isn't (e.g. host/user/network info, lines passed to self.log etc.)

inputerrors (--inputerrors)

Decoding error handler name (goes with inputencoding)

outputencoding (--outputencoding)

The encoding to be used for the logfile.

outputerrors (--outputerrors)

Encoding error handler name (goes with outputencoding)

Command line arguments (if executewithargs is used) take precedence over instance attributes, which in turn take precedence over class attributes.
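The precedence rule can be illustrated without ll.sisyphus. The following is a minimal sketch using a plain class and argparse; SketchJob is a hypothetical stand-in and not the real Job class, and only two of the options listed above are modelled.

```python
import argparse


class SketchJob:
    # Class attributes: the lowest-priority defaults.
    jobname = "SketchJob"
    maxtime = 300

    def __init__(self):
        # Instance attribute: overrides the class attribute of the same name.
        self.maxtime = 180

    def parseargs(self, args=None):
        parser = argparse.ArgumentParser()
        # default=None lets us tell "option not given" apart from a real value.
        parser.add_argument("-j", "--jobname", default=None)
        parser.add_argument("-m", "--maxtime", type=int, default=None)
        namespace = parser.parse_args(args)
        for (name, value) in vars(namespace).items():
            if value is not None:
                # Command line value: overrides class and instance attributes.
                setattr(self, name, value)
        return namespace


job = SketchJob()
job.parseargs(["--maxtime", "60"])
print(job.jobname, job.maxtime)
```

Here jobname falls through to the class attribute, while maxtime comes from the command line. Without --maxtime the instance value 180 would win over the class value 300.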

def prefix(*args, **kwds):

prefix is a context manager. For the duration of the with block, the given prefix is prepended to all log lines. Calls to prefix can be nested.
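Nesting behaviour of such a context manager can be sketched with contextlib. This is an illustration under assumptions, not the module's code: PrefixLogger, its prefixes stack and its in-memory lines list are hypothetical names, and the real method may combine prefixes differently.

```python
import contextlib


class PrefixLogger:
    def __init__(self):
        self.prefixes = []  # stack of currently active prefixes
        self.lines = []     # log output, kept in memory for this sketch

    @contextlib.contextmanager
    def prefix(self, text):
        self.prefixes.append(text)
        try:
            yield
        finally:
            # Always drop the prefix again, even if the block raised.
            self.prefixes.pop()

    def log(self, line):
        self.lines.append("".join(self.prefixes) + line)


logger = PrefixLogger()
logger.log("start")
with logger.prefix("outer: "):
    logger.log("a")
    with logger.prefix("inner: "):
        logger.log("b")
    logger.log("c")
logger.log("done")
print(logger.lines)
```

In this sketch the line logged inside both blocks comes out as "outer: inner: b", and the prefixes disappear again once the respective with block is left.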

def execute(self):

Execute the job once. The return value is a one-line summary of what the job did. Override in subclasses.

def failed(self):

Called when running the job raised an exception. Override in subclasses, e.g. to roll back your database transactions.

def argparser(self):

Return an argparse parser for parsing the command line arguments. This can be overridden in subclasses to add more arguments.

def parseargs(self, args=None):

Use the parser returned by argparser to parse the argument sequence args, modify self accordingly and return the result of the parser's parse_args call.

def _alarm_fork(self, signum, frame):

def _alarm_nofork(self, signum, frame):
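The two alarm handlers above are not documented. Their (signum, frame) parameters match what Python's signal module passes to a signal handler, so they are presumably installed to enforce maxtime. The following POSIX-only sketch shows how a forking parent could enforce a runtime limit as described for the fork option; it is not the module's actual code, and run_with_maxtime and its return values are made up for illustration.

```python
import os
import signal
import time


def run_with_maxtime(work, maxtime):
    """Run work() in a forked child and kill it after maxtime seconds."""
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, then exit immediately.
        try:
            work()
        finally:
            os._exit(0)

    # Parent process: monitor the runtime of the child.
    def alarm(signum, frame):
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the child is already gone

    signal.signal(signal.SIGALRM, alarm)
    signal.setitimer(signal.ITIMER_REAL, maxtime)
    try:
        (_, status) = os.waitpid(pid, 0)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel a pending alarm
    return "killed" if os.WIFSIGNALED(status) else "finished"


if __name__ == "__main__":
    print(run_with_maxtime(lambda: None, 5))
    print(run_with_maxtime(lambda: time.sleep(30), 0.5))
```

The sketch must run in the main thread, because Python only allows signal handlers to be installed there.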

def _handleexecution(self):

Handle executing the job including handling of duplicate or hanging jobs.
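The text does not say how duplicate jobs are detected. One common technique on POSIX systems is a non-blocking exclusive file lock, sketched below with fcntl.flock. This is an assumption about a possible approach, not a description of ll.sisyphus; acquire_job_lock and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile


def acquire_job_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    lockfile = open(path, "a")
    try:
        fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another job instance holds the lock: give up instead of waiting.
        lockfile.close()
        return None
    return lockfile  # keep this object open for as long as the job runs


lockpath = os.path.join(tempfile.gettempdir(), "sketch_job_{}.lock".format(os.getpid()))
first = acquire_job_lock(lockpath)
second = acquire_job_lock(lockpath)  # same job name while the first is "running"
print(first is not None, second is None)
first.close()  # closing the file releases the lock
os.remove(lockpath)
```

A job that fails to get the lock would simply exit, so a slow run never overlaps with the next cron invocation of the same job.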

def _log(self, tags, *texts):

Log items in texts to the log file using tags as the list of tags.

def _getscriptsource(self):

Reads the source code of the script into self.source.

def _getcrontab(self):

Reads the current crontab into self.crontab.

def _createlog(self):

Create the logfile and the link to the logfile (if requested).

def _cleanupoldlogs(self):

Remove old logfiles.
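The cleanup rule described under keepfilelogs (remove any file in the logfile's directory that is more than keepfilelogs days old) can be sketched as follows. This is a standalone illustration that assumes "old" is judged by modification time; cleanup_old_logs and its parameters are hypothetical names, not part of the module.

```python
import os
import tempfile
import time


def cleanup_old_logs(directory, keepdays, now=None):
    """Delete regular files in directory last modified more than keepdays days ago."""
    now = time.time() if now is None else now
    threshold = now - keepdays * 24 * 60 * 60
    removed = []
    for entry in os.scandir(directory):
        # Symbolic links (such as a link to the current logfile) are skipped.
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < threshold:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


with tempfile.TemporaryDirectory() as directory:
    for name in ("old.log", "new.log"):
        open(os.path.join(directory, name), "w").close()
    # Pretend that old.log was last written 40 days ago.
    past = time.time() - 40 * 24 * 60 * 60
    os.utime(os.path.join(directory, "old.log"), (past, past))
    print(cleanup_old_logs(directory, keepdays=30))
```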

def _string(self, s):

Convert s to unicode if it's a str.

def _exc(self, exc):

Format an exception object for logging.

class Tag(object):

A Tag object can be used to call a function with an additional list of tags. Tags can be added via __getattr__ or __getitem__ calls.

def __init__(self, log, *tags):

def __getattr__(self, tag):

__getitem__ = __getattr__

def __call__(self, *texts, **kwargs):
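How attribute access can accumulate tags is sketched below. The class follows the signatures listed above but is a simplified reconstruction, not the module's source: keyword arguments to __call__ are left out, and the collect function stands in for whatever the real logging function does.

```python
class Tag:
    def __init__(self, log, *tags):
        self.log = log    # function called as log(tags, *texts)
        self.tags = tags  # tuple of tags collected so far

    def __getattr__(self, tag):
        # Only called for names that are not real attributes:
        # return a new Tag object that carries one more tag.
        return Tag(self.log, *(self.tags + (tag,)))

    __getitem__ = __getattr__  # log["xml"] behaves like log.xml

    def __call__(self, *texts):
        self.log(self.tags, *texts)


records = []


def collect(tags, *texts):
    records.append((tags, texts))


log = Tag(collect)
log("plain line")
log.xml.warning("can't parse XML file")
log["xml"]["warning"]("same tags via item access")
print(records)
```

Because every attribute access returns a fresh object, the untagged log object is never modified and can be reused for the next call.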

class AttrDict(dict):

dict subclass that makes keys available as attributes.

def __getattr__(self, name):

def __setattr__(self, name, value):
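A dict subclass with these two methods could look like the following minimal sketch. It is written from the one-line description above and may differ from the module's implementation, for example in how missing keys are reported.

```python
class AttrDict(dict):
    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Attribute assignment stores a dictionary item.
        self[name] = value


info = AttrDict(projectname="ACME.FooBar")
info.jobname = "Fetch"
print(info.projectname, info["jobname"])
```

Raising AttributeError for missing keys keeps hasattr and getattr with a default working as usual.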

def execute(job):

Execute the Job instance job once.

def executewithargs(job, args=None):

Execute the Job instance job once, taking command line arguments into account.

args is the sequence of command line arguments (None results in sys.argv being used).