Monday, June 17, 2013
XMLtoPDFBook now supports chapter numbers and names
By Vasudev Ram
I've added support for chapter numbers and names to XMLtoPDFBook, which I blogged about recently. XMLtoPDFBook enables you to create simple PDF ebooks from chapters stored as text in an XML file.
The chapter numbers and names are printed in the header of the PDF file created. Chapter numbers are added automatically, starting from 1, and incremented by 1 for each chapter. For chapter names, you have to change the chapter elements in the XML file from the earlier format, which had no attributes for the chapter element, to add an attribute called 'name', with its value being the chapter name.
Earlier format for the chapter element:
<chapter>
New format for the chapter element:
<chapter name="chapter_name">
where you replace "chapter_name" with the name of each chapter, as desired.
That is the only change needed. The (updated) XMLtoPDFBook program takes care of the rest.
Chapter names, though supported, are optional. If a chapter element has no name attribute, it is not an error. No chapter name will be printed in the header for that chapter.
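To make the header behaviour concrete, here is a minimal sketch of how a chapter number and an optional name could be combined into header text using ElementTree. The helper name chapter_headers and the exact header wording ("Chapter 1: ...") are my assumptions for illustration; they are not taken from XMLtoPDFBook.py.

```python
import xml.etree.ElementTree as ET


def chapter_headers(xml_text):
    """Return one header string per chapter element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    headers = []
    # Chapter numbers start at 1 and go up by 1 for each chapter.
    for number, chapter in enumerate(root.findall("chapter"), start=1):
        # get() returns None when the name attribute is absent,
        # so a missing name is not treated as an error.
        name = chapter.get("name")
        if name:
            headers.append("Chapter %d: %s" % (number, name))
        else:
            headers.append("Chapter %d" % number)
    return headers


sample = """<book>
<chapter name="Starting vi">Text of chapter 1.</chapter>
<chapter>Text of chapter 2.</chapter>
</book>"""

for header in chapter_headers(sample):
    print(header)
```

The second chapter in the sample has no name attribute, so only its number appears in its header.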
You can run XMLtoPDFBook the same way as described in my first post about it:
python XMLtoPDFBook.py vi_quickstart.xml vi_quickstart.pdf
For viewing the PDF file, you may want to try either Foxit PDF Reader or NitroReader. I've used Foxit Reader a lot, and it is fairly good. I have only just started trying NitroReader (*).
[Screenshot: page 1 of the generated PDF file, vi_quickstart.pdf, in NitroReader]
[Screenshot: page 5 of the same PDF file, vi_quickstart.pdf, in Foxit PDF Reader]
I also added some more error handling to the program.
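The post does not list the checks that were added, so the following is only a sketch of the kind of error handling that suits a tool like this: an unreadable file, XML that is not well-formed, and unexpected tags. The function name load_book and the error messages are assumptions for illustration, not the code in the xtopdf repository.

```python
import xml.etree.ElementTree as ET


def load_book(xml_filename):
    """Parse xml_filename and return its chapter elements.

    Raises ValueError with a readable message on any problem
    (hypothetical helper, not part of xtopdf).
    """
    try:
        tree = ET.parse(xml_filename)
    except IOError as e:
        # File is missing or cannot be read.
        raise ValueError("Cannot read %s: %s" % (xml_filename, e))
    except ET.ParseError as e:
        # File exists but is not well-formed XML.
        raise ValueError("Invalid XML in %s: %s" % (xml_filename, e))

    root = tree.getroot()
    if root.tag != "book":
        raise ValueError("Root tag is %r, expected 'book'" % root.tag)

    chapters = list(root)
    for chapter in chapters:
        if chapter.tag != "chapter":
            raise ValueError("Found tag %r, expected 'chapter'" % chapter.tag)
    return chapters
```

A caller could catch ValueError in one place, print the message to standard error, and exit with a non-zero status, instead of letting a traceback reach the user.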
I've uploaded XMLtoPDFBook to my Bitbucket repository for xtopdf, since it is now a part of my xtopdf toolkit. You can download it from here.
Incidentally, I saw on the NitroReader site that it was PDF's birthday this month; the PDF format is now 20 years old.
(*) And finally, I found it interesting to recall that NitroPDF (from the same company as NitroReader) was one of the topics of the second post I ever wrote on my earlier blog, jugad's Journal :-). I ran that blog for about 3 years before moving to this one (which you are reading now), on Blogger, after LiveJournal was taken over by another company.
- Vasudev Ram - Dancing Bison Enterprises
Contact me
Saturday, June 15, 2013
Create PDF books with XMLtoPDFBook
XMLtoPDFBook is a program that lets you create simple PDF books from XML text content. It requires Python, ReportLab and my xtopdf toolkit for PDF creation.
(Use ReportLab v1.21, not the 2.x series; though 2.x has more features, xtopdf has not been tested with it; also, those additional features are not required for xtopdf.)
XMLtoPDFBook.py is released as open source software under the BSD license, and I'll be adding it to the tools in my xtopdf toolkit.
Here's how to use XMLtoPDFBook:
In a text editor, create a simple XML template for the book, like this:
<?xml version="1.0"?>
<book>
<chapter>
Chapter 1 content here.
</chapter>
<chapter>
Chapter 2 content here.
</chapter>
</book>
Add as many chapter elements as you need.
Then write or paste the text of one chapter inside each chapter element, in sequence.
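If the chapter text is already available as strings, the same book file can be generated rather than pasted by hand. This is a minimal sketch using ElementTree; the function name write_book_xml and the file name my_book.xml are made up for the example.

```python
import xml.etree.ElementTree as ET


def write_book_xml(chapter_texts, xml_filename):
    """Write a <book> file with one <chapter> per string (hypothetical helper)."""
    root = ET.Element("book")
    for text in chapter_texts:
        chapter = ET.SubElement(root, "chapter")
        # Put the text on its own lines, as in the hand-written template.
        chapter.text = "\n" + text.strip("\n") + "\n"
    ET.ElementTree(root).write(
        xml_filename, encoding="utf-8", xml_declaration=True
    )


if __name__ == "__main__":
    write_book_xml(
        ["Chapter 1 content here.", "Chapter 2 content here."],
        "my_book.xml",
    )
```

One advantage of generating the file this way is that ElementTree escapes characters such as < and & in the chapter text, which you would otherwise have to escape by hand for the XML to stay well-formed.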
Now you can convert the book content to PDF using this program, XMLtoPDFBook:
#--------------------------------------------------
# XMLtoPDFBook.py
# A program to convert a book in XML text format to a PDF book.
# Uses xtopdf and ReportLab.
# Author: Vasudev Ram - http://www.dancingbison.com
# Version: v0.1
#--------------------------------------------------

# imports

import sys
import os
import string
import time
from PDFWriter import PDFWriter

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

#--------------------------------------------------
# global variables

sysargv = None

#--------------------------------------------------

def debug(message):
    sys.stderr.write(message + "\n")

#--------------------------------------------------

def get_xml_filename(sysargv):
    return sysargv[1]

#--------------------------------------------------

def get_pdf_filename(sysargv):
    return sysargv[2]

#--------------------------------------------------

def XMLtoPDFBook():

    debug("Entered XMLtoPDFBook()")

    global sysargv

    xml_filename = get_xml_filename(sysargv)
    debug("xml_filename: " + xml_filename)
    pdf_filename = get_pdf_filename(sysargv)
    debug("pdf_filename: " + pdf_filename)

    pw = PDFWriter(pdf_filename)
    pw.setFont("Courier", 12)
    pw.setHeader(xml_filename + " to " + pdf_filename)
    pw.setFooter("Generated by ElementTree and xtopdf")

    tree = ET.ElementTree(file=xml_filename)
    debug("tree = " + repr(tree))

    root = tree.getroot()
    debug("root.tag = " + root.tag)
    if root.tag != "book":
        debug("Error: Root tag is not 'book'")
        sys.exit(2)

    debug("=" * 60)
    for root_child in root:
        if root_child.tag != "chapter":
            debug("Error: root_child tag is not 'chapter'")
            sys.exit(3)
        debug(root_child.text)
        lines = root_child.text.split("\n")
        for line in lines:
            pw.writeLine(line)
        pw.savePage()
        debug("-" * 60)
    debug("=" * 60)
    pw.close()

    debug("Exiting XMLtoPDFBook()")

#--------------------------------------------------

def main():

    debug("Entered main()")

    global sysargv
    sysargv = sys.argv

    # Check for right number of arguments.
    if len(sysargv) != 3:
        sys.exit(1)

    XMLtoPDFBook()

    debug("Exiting main()")

#--------------------------------------------------

if __name__ == "__main__":
    main()

#--------------------------------------------------
Here is an example run of XMLtoPDFBook, using my vi quickstart article, which was previously published in Linux For You magazine:
python XMLtoPDFBook.py vi_quickstart.xml vi_quickstart.pdf
This results in the contents of the article being published to PDF in the file vi_quickstart.pdf.
- Vasudev Ram - Dancing Bison Enterprises
Friday, December 7, 2012
PyRSS2Gen, to create RSS feeds Pythonically
PyRSS2Gen is a Python library for generating RSS 2.0 feeds. It is by Andrew Dalke of Dalke Scientific, a company that develops Python tools for science:
http://www.dalkescientific.com/company.html
I saw PyRSS2Gen used on this blog:
http://blaag.haard.se/
- Vasudev Ram
www.dancingbison.com
Friday, September 16, 2011
dexml, simple object-XML mapper for Python
Saw this via a tweet by @raymondh - Raymond Hettinger, core Python developer/guru:
Ryan Kelly ( https://github.com/rfk ) has created dexml, a simple "ORM for XML" in Python; IOW, an object-XML mapper for Python. It works both ways - map a Python object to XML and vice versa. This is something like the Java API for XML (Data) Binding (JAXB), a pretty useful API which I had looked at some time ago.
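To illustrate what "works both ways" means for an object-XML mapper, here is a small hand-rolled sketch using only the Python standard library. It is not dexml's API (see the dexml README for that); the Person class and its to_xml and from_xml methods are invented for this example, which needs Python 3.7 or later for dataclasses.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    """A plain object mapped to <Person name="..." age="..."/> (illustrative only)."""
    name: str
    age: int

    def to_xml(self):
        # Object -> XML: each field becomes an attribute of the element.
        elem = ET.Element("Person", name=self.name, age=str(self.age))
        return ET.tostring(elem, encoding="unicode")

    @classmethod
    def from_xml(cls, xml_text):
        # XML -> object: read the attributes back and convert their types.
        elem = ET.fromstring(xml_text)
        if elem.tag != "Person":
            raise ValueError("Expected a Person element, got %r" % elem.tag)
        return cls(name=elem.get("name"), age=int(elem.get("age")))


if __name__ == "__main__":
    original = Person(name="Ada", age=36)
    xml_text = original.to_xml()
    print(xml_text)
    print(Person.from_xml(xml_text) == original)
```

A mapper library saves you from writing the to_xml and from_xml pair by hand for every class, by deriving both directions from a single declaration of the fields.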
Get dexml here: https://github.com/rfk/dexml
Examples of dexml usage are here: https://github.com/rfk/dexml#readme
Posted via email
- Vasudev Ram @ Dancing Bison