ll.sisyphus simplifies running Python code as cron jobs.
At any given time, no more than one sisyphus job with a given name will be running. A job has a maximum allowed runtime. If this maximum is exceeded, the job will kill itself.
In addition to that, job execution can be logged.
To use this module, you must derive your own class from Job and
implement the execute method.
Logs will (by default) be created in the ~/ll.sisyphus directory.
This can be changed by deriving a new subclass and overwriting the appropriate
class attribute.
To execute a job, use the module level function execute (or
executewithargs when you want to support command line arguments).
Example
The following example illustrates the use of this module:
    #!/usr/bin/env python

    import os
    import urllib
    from ll import sisyphus

    class Fetch(sisyphus.Job):
        projectname = "ACME.FooBar"
        jobname = "Fetch"
        argdescription = "fetch http://www.python.org/ and save it to a local file"
        maxtime = 180

        def __init__(self):
            self.url = "http://www.python.org/"
            self.tmpname = "Fetch_Tmp_{}.html".format(os.getpid())
            self.officialname = "Python.html"

        def execute(self):
            self.log("fetching data from {!r}".format(self.url))
            data = urllib.urlopen(self.url).read()
            datasize = len(data)
            self.log("writing file {!r} ({} bytes)".format(self.tmpname, datasize))
            open(self.tmpname, "wb").write(data)
            self.log("renaming file {!r} to {!r}".format(self.tmpname, self.officialname))
            os.rename(self.tmpname, self.officialname)
            return "cached {!r} as {!r} ({} bytes)".format(self.url, self.officialname, datasize)

    if __name__ == "__main__":
        sisyphus.executewithargs(Fetch())

You will find the log files for this job in ~/ll.sisyphus/ACME.FooBar/Fetch/.
def literaldecode(exc):
class MaximumRuntimeExceeded(Exception):
def __init__(self, maxtime):
def __str__(self):
class Job(object):
A Job object executes a task once.
To use this class, derive your own class from it and overwrite the
execute method.
Logging itself is done by calling self.log:
    self.log("can't parse XML file {}".format(filename))
This logs the argument without tagging the line. To add tags to the logging
call, simply access attributes of self.log:
    self.log.xml.warning("can't parse XML file {}".format(filename))
This adds the tags "xml" and "warning" to the log line.
ll.sisyphus itself uses the following tags:
sisyphus
    This tag is added to all log lines produced by ll.sisyphus itself.
init
    This tag is used for the log lines output at the start of the job.
result
    This tag is used for the final line in the log file, which shows a summary of what the job did (or why it failed).
fail
    This tag is used in the result line if the job failed with an exception.
kill
    This tag is used in the result line if the job was killed because it exceeded the maximum allowed runtime.
The job can be configured in three ways: by class attributes in the
Job subclass, by attributes of the Job instance (e.g. set
in __init__) and by command line arguments (if executewithargs
is used). The following attributes are supported:
projectname (-p or --projectname)
    The name of the project this job belongs to. This might be a dot-separated hierarchical project name (e.g. including customer names or similar stuff).
jobname (-j or --jobname)
    The name of the job itself (defaulting to the name of the class if none is given).
argdescription (No command line equivalent)
    Description for help message of the command line argument parser.
maxtime (-m or --maxtime)
    Maximum allowed runtime for the job (as the number of seconds). If the job runs longer than that it will kill itself.
fork (--fork)
    Forks the process and does the work in the child process. The parent process is responsible for monitoring the maximum runtime (this is the default). In non-forking mode the single process does both the work and the runtime monitoring.
noisykills (--noisykills)
    Should a message be printed when the maximum runtime is exceeded?
logfilename (--logfilename)
    Path/name of the logfile for this job as an UL4 template. Variables available in the template include user_name, projectname, jobname and starttime.
loglinkname (--loglinkname)
    A link that points to the currently active logfile (as an UL4 template). If this is None no link will be created.
log2file (-f or --log2file)
    Should a logfile be written at all?
formatlogline (--formatlogline)
    An UL4 template for formatting each line in the logfile. Available variables are time (current time), starttime (start time of the job), tags (list of tags for the line) and line (the log line itself).
keepfilelogs (--keepfilelogs)
    The number of days the logfiles are kept. Old logfiles (i.e. any file in the same directory as the current logfile that's more than keepfilelogs days old) will be removed at the end of the job.
inputencoding (--inputencoding)
    The encoding to be used for data that is supposed to be unicode, but isn't (e.g. host/user/network info, lines passed to self.log etc.)
inputerrors (--inputerrors)
    Decoding error handler name (goes with inputencoding)
outputencoding (--outputencoding)
    The encoding to be used for the logfile.
outputerrors (--outputerrors)
    Encoding error handler name (goes with outputencoding)
Command line arguments take precedence over instance attributes (if
executewithargs is used) and those take precedence over class
attributes.
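The precedence rule can be sketched as follows. This is only an illustration, not the code ll.sisyphus actually uses: SketchJob is a hypothetical stand-in for Job, and the sketch assumes that an option missing from the command line parses to None so that it leaves the existing attribute untouched.

```python
import argparse


class SketchJob(object):
    # Hypothetical stand-in for sisyphus.Job (not the real class).
    projectname = "ACME.FooBar"   # class attribute: lowest precedence
    jobname = None
    maxtime = 300

    def __init__(self):
        self.jobname = "Fetch"    # instance attribute: overrides the class

    def argparser(self):
        p = argparse.ArgumentParser(description="precedence sketch")
        # default=None marks "not given on the command line" (an assumption).
        p.add_argument("-p", "--projectname", default=None)
        p.add_argument("-j", "--jobname", default=None)
        p.add_argument("-m", "--maxtime", type=int, default=None)
        return p

    def parseargs(self, args=None):
        namespace = self.argparser().parse_args(args)
        for (name, value) in vars(namespace).items():
            if value is not None:
                # Command line value: highest precedence.
                setattr(self, name, value)
        return namespace


job = SketchJob()
job.parseargs(["-m", "30"])
```

After the call, projectname still comes from the class, jobname from the instance, and maxtime from the command line.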
def prefix(*args, **kwds):
prefix is a context manager. For the duration of the with block
prefix will be prepended to all log lines. prefix calls can
be nested.
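A nested prefix context manager of this kind could be built on contextlib.contextmanager, roughly as sketched here. SketchLogger and its attributes are hypothetical names used only for this illustration; the real Job class may store and apply prefixes differently.

```python
import contextlib


class SketchLogger(object):
    # Hypothetical stand-in; only the prefix handling is modelled.
    def __init__(self):
        self.prefixes = []   # stack of currently active prefixes
        self.lines = []      # collected log lines

    @contextlib.contextmanager
    def prefix(self, text):
        self.prefixes.append(text)
        try:
            yield
        finally:
            # Always drop the prefix again, even if the block raised.
            self.prefixes.pop()

    def log(self, text):
        self.lines.append("".join(self.prefixes) + text)


logger = SketchLogger()
logger.log("start")
with logger.prefix("fetch: "):
    logger.log("connecting")
    with logger.prefix("retry: "):
        logger.log("second attempt")
logger.log("done")
```

Keeping the prefixes on a stack is what makes nesting work: each with block pushes one entry on entry and pops exactly that entry on exit.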
def execute(self):
Execute the job once. The return value is a one line summary of what the job did. Overwrite in subclasses.
def failed(self):
Called when running the job generated an exception. Overwrite in subclasses, e.g. to roll back your database transactions.
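The interplay of execute and failed might look like the following sketch. SketchJob and run_once are hypothetical names, and the returned tag tuples merely mirror the "result" and "fail" tags described above; how ll.sisyphus drives these hooks internally is not shown here.

```python
class SketchJob(object):
    # Hypothetical stand-in for sisyphus.Job.
    def execute(self):
        return "nothing to do"

    def failed(self):
        pass


def run_once(job):
    # Returns (tags, text) for the final log line of the run.
    try:
        summary = job.execute()
    except Exception as exc:
        job.failed()  # give the job a chance to clean up
        text = "{}: {}".format(type(exc).__name__, exc)
        return (("result", "fail"), text)
    return (("result",), summary)


class Import(SketchJob):
    def __init__(self):
        self.rolledback = False

    def execute(self):
        raise ValueError("bad record")

    def failed(self):
        # A real job would roll back its database transaction here.
        self.rolledback = True


job = Import()
outcome = run_once(job)
```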
def argparser(self):
Return an argparse parser for parsing the command line arguments.
This can be overwritten in subclasses to add more arguments.
def parseargs(self, args=None):
Use the parser returned by argparser to parse the argument
sequence args, modify self accordingly and return
the result of the parser's parse_args call.
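Adding an argument in a subclass could follow the pattern below. This is a sketch under assumptions: SketchJob is a hypothetical stand-in for Job with a single built-in option, and the --url option is invented for the example. The subclass extends the parser in argparser and picks up the extra value in parseargs.

```python
import argparse


class SketchJob(object):
    # Hypothetical stand-in for sisyphus.Job with one built-in option.
    maxtime = 300

    def argparser(self):
        p = argparse.ArgumentParser(description="argparser sketch")
        p.add_argument("-m", "--maxtime", type=int, default=self.maxtime)
        return p

    def parseargs(self, args=None):
        namespace = self.argparser().parse_args(args)
        self.maxtime = namespace.maxtime
        return namespace


class Fetch(SketchJob):
    url = "http://www.python.org/"

    def argparser(self):
        p = SketchJob.argparser(self)
        # --url is an invented option, used only for this example.
        p.add_argument("--url", default=self.url)
        return p

    def parseargs(self, args=None):
        namespace = SketchJob.parseargs(self, args)
        self.url = namespace.url
        return namespace


job = Fetch()
job.parseargs(["--url", "http://www.example.org/", "-m", "60"])
```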
def _alarm_fork(self, signum, frame):
def _alarm_nofork(self, signum, frame):
def _handleexecution(self):
Handle executing the job including handling of duplicate or hanging jobs.
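Runtime monitoring in non-forking mode can be approximated with the standard signal module, as in this sketch. It is not the module's actual implementation: run_with_limit is a hypothetical helper, the exception class is redefined locally so the example is self-contained, and the approach only works on Unix and in the main thread.

```python
import signal


class MaximumRuntimeExceeded(Exception):
    # Local re-definition for this sketch, not imported from ll.sisyphus.
    def __init__(self, maxtime):
        self.maxtime = maxtime

    def __str__(self):
        return "maximum runtime {} seconds exceeded".format(self.maxtime)


def run_with_limit(func, maxtime):
    # Call func(); raise MaximumRuntimeExceeded after maxtime seconds.
    def on_alarm(signum, frame):
        raise MaximumRuntimeExceeded(maxtime)

    oldhandler = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(maxtime)  # schedule SIGALRM in maxtime seconds
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel a pending alarm
        signal.signal(signal.SIGALRM, oldhandler)
```

In forking mode the parent process would watch the child instead of interrupting its own work, which keeps the monitoring independent of whatever the job code is doing.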
def _log(self, tags, *texts):
Log items in texts to the log file using tags as the list
of tags.
def _getscriptsource(self):
Reads the source code of the script into self.source.
def _getcrontab(self):
Reads the current crontab into self.crontab.
def _createlog(self):
Create the logfile and the link to the logfile (if requested).
def _cleanupoldlogs(self):
Remove old logfiles.
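Removing logfiles older than keepfilelogs days could be done along these lines. cleanup_old_logs is a hypothetical helper, and using the file modification time as the age criterion is an assumption of this sketch.

```python
import os
import time


def cleanup_old_logs(directory, keepdays, now=None):
    # Remove plain files in directory older than keepdays days.
    # Returns the sorted list of removed file names.
    if now is None:
        now = time.time()
    threshold = now - keepdays * 24 * 60 * 60
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Age is judged by modification time (an assumption).
        if os.path.isfile(path) and os.path.getmtime(path) < threshold:
            os.remove(path)
            removed.append(name)
    return removed
```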
def _string(self, s):
Convert s to unicode if it's a str.
def _exc(self, exc):
Format an exception object for logging.
class Tag(object):
A Tag object can be used to call a function with an additional list
of tags. Tags can be added via __getattr__ or __getitem__ calls.
def __init__(self, log, *tags):
def __getattr__(self, tag):
__getitem__ = __getattr__
def __call__(self, *texts, **kwargs):
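A tag-collecting callable of this kind could work as sketched below. SketchTag is a hypothetical reimplementation for illustration, not the library's Tag class. It assumes the wrapped function takes the tuple of tags as its first argument, in the way _log(tags, *texts) does.

```python
class SketchTag(object):
    # Hypothetical reimplementation; not the library's Tag class.
    def __init__(self, log, *tags):
        self._log = log
        self._tags = tags

    def __getattr__(self, tag):
        # Only reached for names that are not real attributes:
        # return a new object carrying one more tag.
        return SketchTag(self._log, *(self._tags + (tag,)))

    # Item access adds a tag in the same way as attribute access.
    __getitem__ = __getattr__

    def __call__(self, *texts, **kwargs):
        self._log(self._tags, *texts, **kwargs)


records = []


def sink(tags, *texts, **kwargs):
    records.append((tags, texts))


log = SketchTag(sink)
log("plain line")
log.xml.warning("can't parse XML file")
log["xml"]("tagged via item access")
```

Because every attribute access returns a fresh object, the base log object never accumulates tags from earlier calls.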
class AttrDict(dict):
dict subclass that makes keys available as attributes.
def __getattr__(self, name):
def __setattr__(self, name, value):
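Such a dict subclass can be written in a few lines. SketchAttrDict is a hypothetical name for this illustration, and the library's own AttrDict may differ in detail, for example in how it reports missing keys.

```python
class SketchAttrDict(dict):
    # Hypothetical illustration of a dict with attribute access.
    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dict item.
        self[name] = value


d = SketchAttrDict(projectname="ACME.FooBar")
d.jobname = "Fetch"
```

Translating KeyError into AttributeError keeps helpers such as hasattr and getattr with a default working as expected.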
def execute(job):
Execute the given job once.
def executewithargs(job, args=None):
Execute the given job once with command line arguments.
args are the command line arguments (None results in
sys.argv being used).