A job I often encounter, and have approached in many ways, is report creation. Put simply, it means presenting data in a format that's useful to the recipient. When looks don't matter much, this can be as easy as creating an HTML template and merging in the results of a database query, but when looks do matter, the best solution isn't always apparent.
When looks matter, I prefer PDF documents for the final output. The hard part is getting from raw data to PDF. As a Python programmer, I've used the ReportLab library for this with success, but it's often laborious. And because programming is required, non-programmers can't create these reports. Many tools for end-user report creation require proprietary software and/or non-standard template formats and processes.
So what I'm looking for is:
- template-based
- produces print-ready PDF files
- usable by non-programmers
- uses standard formats
The tools I found to meet these requirements are:
- Python (of course)
- cairo
- rsvg
- Inkscape (for SVG templates)
The templates are created and maintained using Inkscape in its native SVG format. In the SVG file, template fields are added as {FIELD1}, {FIELD2}, etc. A database query is run, returning field names that match those in the template; then Python, rsvg and cairo handle template substitution and conversion to PDF.
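To make the field matching concrete, here is a minimal sketch of the substitution step on its own. The table, the column names, and the one-line SVG snippet are all made up for illustration; the only thing carried over from the approach above is that the query's column names match the {FIELD} placeholders. Escaping the values is an added precaution, so that characters such as & don't produce invalid SVG.

```python
import sqlite3
from xml.sax.saxutils import escape

# Hypothetical one-line "template"; a real one would be an Inkscape SVG file.
TEMPLATE = '<text x="10" y="20">{NAME} owes {AMOUNT}</text>'


def fetch_rows(conn, query):
    """Run `query` and return one dict per row, keyed by column name."""
    cursor = conn.execute(query)
    names = [col[0] for col in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def fill_template(template, row):
    """Replace each {KEY} in `template` with the XML-escaped value from `row`."""
    for key, value in row.items():
        template = template.replace('{%s}' % key, escape(str(value)))
    return template


# Hypothetical data set; the column aliases match the template fields.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE invoices (customer TEXT, total TEXT)')
conn.execute("INSERT INTO invoices VALUES ('Smith & Sons', '12.50')")
rows = fetch_rows(conn, 'SELECT customer AS NAME, total AS AMOUNT FROM invoices')
pages = [fill_template(TEMPLATE, row) for row in rows]
```

Each entry in `pages` is the filled-in SVG for one record, which is the shape of data the PDF step needs: one set of field values per page.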
Sample Code
import os
from xml.sax.saxutils import escape

import cairo
import rsvg

# TEMPLATE_PATH and createNewUniqueFile() are defined elsewhere in the
# application (not shown here).


def processSVGTemplate(template, data):
    """Fill the {FIELD} placeholders in `template`; return the new file's path."""
    # no need to make it difficult with xml parsing
    # just do find/replace
    with open(template) as template_file:
        svg_data = template_file.read()
    for (key, value) in data.items():
        # values may not be strings, and must be escaped to keep the XML valid
        svg_data = svg_data.replace('{%s}' % key, escape('%s' % (value,)))
    filename = createNewUniqueFile()
    with open(filename, 'w') as new_svg_file:
        new_svg_file.write(svg_data)
    return filename


def mergeData(template_name, data, output_path=None):
    """Merge `data` (a sequence of dicts, one per page) with `template_name`.

    Returns None if `output_path` is given, or the PDF as a string otherwise.
    """
    template_path = os.path.join(TEMPLATE_PATH, template_name)
    # if not writing to a file, just send back a string.
    using_tempfile = False
    if output_path is None:
        using_tempfile = True
        output_path = createNewUniqueFile()
    try:
        # load the raw template only to read its size and resolution
        svgt = rsvg.Handle(template_path)
        props = svgt.props
        # convert to 72 ppi
        pdf_dpi = 72.0
        width = pdf_dpi / props.dpi_x * props.width
        height = pdf_dpi / props.dpi_y * props.height
        surf = cairo.PDFSurface(output_path, width, height)
        cr = cairo.Context(surf)
        # scale the context for the change in dpi
        cr.scale(pdf_dpi / props.dpi_x, pdf_dpi / props.dpi_y)
        for rset in data:
            # one filled-in copy of the template per record, one page each
            svg_filename = processSVGTemplate(template_path, rset)
            try:
                svg = rsvg.Handle(svg_filename)
                svg.render_cairo(cr)
                cr.show_page()
            finally:
                os.unlink(svg_filename)
        surf.finish()
        if using_tempfile:
            # PDF data is binary, so read it back in binary mode
            with open(output_path, 'rb') as pdf_file:
                return pdf_file.read()
    finally:
        if using_tempfile:
            os.unlink(output_path)
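The width and height calculation in mergeData is easier to follow with numbers plugged in. The sketch below repeats just that arithmetic; `page_size_in_points` is a hypothetical helper, and the 90 dpi, 765 x 990 pixel template is an assumed example (a US Letter page at 90 pixels per inch), not a value taken from my templates.

```python
PDF_DPI = 72.0  # PDF page sizes are given in points, 72 per inch


def page_size_in_points(width_px, height_px, dpi_x, dpi_y):
    """Convert pixel dimensions at the given resolution to PDF points."""
    return (PDF_DPI / dpi_x * width_px, PDF_DPI / dpi_y * height_px)


# Assumed example: 8.5 x 11 inches drawn at 90 pixels per inch.
width, height = page_size_in_points(765, 990, 90.0, 90.0)
print(width, height)
```

That works out to 612 by 792 points, i.e. 8.5 by 11 inches. The same ratio, 72 / dpi, is what gets passed to cr.scale() so the drawing shrinks to fit the page it was sized for.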
Caveats and Thoughts
- rsvg required installing python-gnome-desktop on Debian (45 MB), which makes no sense to me.
- I didn't do any XML parsing, but used string substitution instead. At first I used xml.dom.minidom, but it seemed needlessly complex for my needs. Maybe that will change.
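For comparison, here is roughly what the XML-parsing route could look like using the standard library's xml.etree.ElementTree rather than xml.dom.minidom. This is only a sketch and not what the sample code does: `fill_svg_fields` and the tiny template are made up for illustration, it only touches text nodes (not attributes), and it's written for Python 3. The one clear gain is that the serializer escapes special characters in the values on its own.

```python
import xml.etree.ElementTree as ET

SVG_NS = 'http://www.w3.org/2000/svg'
ET.register_namespace('', SVG_NS)  # keep <svg> unprefixed when serializing


def fill_svg_fields(svg_text, row):
    """Parse `svg_text`, replace {KEY} fields in text nodes, return new XML."""
    root = ET.fromstring(svg_text)
    for element in root.iter():
        # an element's own text and the text following it are stored separately
        for attr in ('text', 'tail'):
            content = getattr(element, attr)
            if content:
                for key, value in row.items():
                    content = content.replace('{%s}' % key, str(value))
                setattr(element, attr, content)
    return ET.tostring(root, encoding='unicode')


# Hypothetical template and data, for illustration only.
template = ('<svg xmlns="http://www.w3.org/2000/svg">'
            '<text x="10" y="20">{NAME}</text></svg>')
result = fill_svg_fields(template, {'NAME': 'Smith & Sons'})
```

It's more code than a plain find/replace for the same result on well-behaved data, which is why string substitution has been enough for me so far.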