Two years ago or so we were looking into ways to generate various PDF documents from Python. I had used various ways of generating PDFs before, such as directly outputting PostScript (code generation!) and converting it to PDF, FPDF, ReportLab and HTML2PDF. I was looking for something more “what you see is what you get”. And I wanted integrated PDF generation.
It turned out JasperReports had all I needed. While Jasper is promoted as a reporting solution, it can be used as a simple PDF generator. Together with the iReport visual report designer we had the complete toolchain. Users could use iReport to “paint” the desired PDF output, using placeholders for all the variable data to be filled in. iReport would save this design in a .jrxml file. The Jasper library would then compile that file to a .jasper file for performance reasons.
To generate the actual report you would feed that .jasper file and a “datasource” to the JasperReports library. A datasource could be an SQL/JDBC query or an XML file. This datasource would then be used to fill in the placeholders you left in the .jrxml and generate a PDF.
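To make the XML datasource idea concrete, here is a minimal sketch of how such a file could be produced from Python data. The element names (`records`, `record`) and the helper `build_datasource` are assumptions chosen for illustration; the real layout depends on the XPath expressions used in the report design.

```python
import xml.etree.ElementTree as ET


def build_datasource(records, root_tag="records", row_tag="record"):
    """Serialize a list of flat dicts into a small XML document.

    The tag names are hypothetical; a real report design decides
    which elements it reads its fields from.
    """
    root = ET.Element(root_tag)
    for record in records:
        row = ET.SubElement(root, row_tag)
        for field, value in record.items():
            # One child element per field, holding the value as text.
            ET.SubElement(row, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_data = build_datasource([{"name": "Alice", "total": 12.5}])
```

Each `record` element would then correspond to one row of variable data for the placeholders in the design.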
All nice and smooth. The only problem is: JasperReports is written in Java and our application stack is written mostly in Python. I also avoid programming Java whenever possible. I feel incredibly clumsy when having to code in Java.
Jython to the rescue! Jython allows you to access Java libraries and still write Python code. This is done by having Python run inside a Java VM. With some help we were able to generate PDFs from Jython. But the whole beast was still running inside the Java VM.
So to call it from our regular Python code running inside mod_python on the Apache web server, we had to write the XML to disk, fork a process, fire up the Java VM with Jython and then call the Jasper library to generate the PDF from the XML and the .jrxml report source.
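The write-to-disk-and-fork flow can be sketched roughly as follows. This is not the original code: it is written in present-day Python, and the `jython` executable name, the `render_pdf.py` script and its argument order are all assumptions made for illustration.

```python
import os
import subprocess
import tempfile


def build_command(jrxml_path, xml_path, pdf_path,
                  jython="jython", script="render_pdf.py"):
    """Build the command line for a hypothetical Jython rendering script."""
    return [jython, script, jrxml_path, xml_path, pdf_path]


def render_pdf(jrxml_path, xml_data):
    """Write the datasource to disk, run Jython once, read the PDF back."""
    with tempfile.TemporaryDirectory() as workdir:
        xml_path = os.path.join(workdir, "datasource.xml")
        pdf_path = os.path.join(workdir, "output.pdf")
        with open(xml_path, "w", encoding="utf-8") as handle:
            handle.write(xml_data)
        # Every call pays the full Java VM start-up cost.
        subprocess.run(build_command(jrxml_path, xml_path, pdf_path),
                       check=True)
        with open(pdf_path, "rb") as handle:
            return handle.read()
```

Because a fresh Java VM is started for every document, the start-up time is paid on each call.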
This worked but took about 7 seconds per document. This was OK for low-volume production, but for some applications we needed much shorter turnaround times. It turned out that starting up the Java VM was the biggest time consumer. So we created a long-running Jython process which received PDF rendering jobs via a home-grown UDP protocol. It was messy. Sending absolute filesystem paths via UDP was probably not a good idea, but it worked. PDF generation times were down to 1.5 seconds or so. We have been running this setup for 21 months or so and have generated about 500,000 PDFs.
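For illustration, the client side of such a job protocol might look like the sketch below. The actual wire format is not described here, so the newline-separated layout and the names `encode_job`, `decode_job` and `send_job` are invented for this example.

```python
import socket


def encode_job(jrxml_path, xml_path, pdf_path):
    """Pack three absolute paths into one datagram (hypothetical format)."""
    return "\n".join([jrxml_path, xml_path, pdf_path]).encode("utf-8")


def decode_job(datagram):
    """Inverse of encode_job, as the long-running worker would use it."""
    jrxml_path, xml_path, pdf_path = datagram.decode("utf-8").split("\n")
    return jrxml_path, xml_path, pdf_path


def send_job(host, port, jrxml_path, xml_path, pdf_path):
    """Fire one rendering job at the worker; UDP gives no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_job(jrxml_path, xml_path, pdf_path), (host, port))
```

A scheme like this only works when sender and worker share a filesystem, and a lost datagram means a silently lost job, which is part of why it felt messy.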
But it was still a mess. In particular, it meant that every server which had to generate PDFs also needed Java installed. And Java on FreeBSD servers is very limited fun. So we came up with a servlet-based approach: there would be a web server; upon POSTing an XML datasource and a JRXML file to a specific URL you would get back a rendered PDF. As the Java-based web server we use Jetty, which drives a Jython servlet calling the JasperReports library. Works like a charm.
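From the Python side, talking to such a servlet could look roughly like this. It is a sketch under assumptions: the form field names `design` and `xmldata`, the form encoding, and the example URL are made up for illustration and are not taken from the published interface.

```python
import urllib.parse
import urllib.request


def build_render_request(url, jrxml_source, xml_data):
    """Build a form-encoded POST carrying the report design and the data.

    The field names "design" and "xmldata" are hypothetical.
    """
    body = urllib.parse.urlencode(
        {"design": jrxml_source, "xmldata": xml_data}
    ).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_pdf(url, jrxml_source, xml_data, timeout=30):
    """POST the job and return the response body, expected to be PDF bytes."""
    request = build_render_request(url, jrxml_source, xml_data)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With this design only the rendering server needs Java; every other machine just needs to be able to make an HTTP request, for example `fetch_pdf("http://localhost:8080/render", jrxml_source, xml_data)` against a hypothetical local instance.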
After using it for 9 months or so, I'm in the process of publishing this Jetty-based version real soon now(TM). It will be available at http://cybernetics.hudora.biz/projects/wiki/pyJasper