Graham King

Solvitas perambulum

Printing Word And PDF files from Python

Summary
To automate printing Word and PDF documents on Windows using Python, you can leverage the win32com bindings to interact with Microsoft Word and Internet Explorer COM interfaces. For Word documents, you can use the `client.Dispatch("Word.Application")` method to open, print, and close documents. For PDFs, since Acrobat lacks a COM interface, you can use Internet Explorer to handle the printing process by navigating to the file and using `ie.Document.printAll()`. Ensure you have at least Acrobat 7.0.7 to avoid script errors. Use JACOB to perform similar tasks in Java. Adjust sleep times in the script to handle potential dialog boxes during printing.

Recently I had to automate printing a whole bunch of CVs on Windows. Having successfully avoided VBA my whole programming life, it was time to think fast. Thankfully Python has some win32com bindings, which allow you to talk COM to various Windows applications and get them to print the documents.

First you need the Python win32com bindings, which are distributed as the pywin32 package.
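Before driving any application, it can help to confirm that the bindings are actually importable. The following is a minimal sketch, not part of the original scripts; the helper name `module_available` is made up for illustration. It only checks that the module can be found and does not load it.

```python
import importlib.util


def module_available(name="win32com"):
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when no such top-level module is installed.
    return importlib.util.find_spec(name) is not None


if not module_available():
    print("win32com not found; install it with: pip install pywin32")
```

On a machine without the bindings (any non-Windows system, for instance) this prints the install hint instead of failing later with an ImportError.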

Printing Word documents

Microsoft Word has a nice COM interface, which is well documented. The documentation is targeted at VBA, but the methods are the same.

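    from win32com import client
    import os
    import sys
    import time

    word = client.Dispatch("Word.Application")

    def printWordDocument(filename):
        # Word resolves relative paths against its own directory, so pass an absolute one.
        doc = word.Documents.Open(os.path.abspath(filename))
        doc.PrintOut()
        time.sleep(2)  # let the job spool (and any dialog be answered) before closing
        doc.Close(False)  # False: close without saving changes

    try:
        # Print every file named on the command line.
        for filename in sys.argv[1:]:
            printWordDocument(filename)
    finally:
        # Quit only after all documents are printed, even if one of them failed.
        word.Quit()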

This opens the Word application without making it visible to the user, opens the document, prints it on the user's default printer, closes the document, and then quits the application.

The only catch is that if printing throws up a dialog box (for example, a warning that the document extends outside the print margins), it is displayed to the user. You might want to extend the time.sleep(2) to 5 seconds, to give the user time to click OK before you close the document.
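When printing a batch, the delay is easier to tune if it is a parameter rather than a hard-coded number. A rough sketch of that idea follows. The function name `print_all` and the `delay` parameter are assumptions introduced here; the `word` argument is expected to be an already-dispatched Word application object exposing `Documents.Open`, `PrintOut` and `Close` as used above.

```python
import time


def print_all(word, paths, delay=5.0):
    """Print each path through `word`, pausing `delay` seconds per document.

    Returns the list of paths whose PrintOut call completed.
    """
    printed = []
    for path in paths:
        doc = word.Documents.Open(path)
        try:
            doc.PrintOut()
            # Pause so the job can spool and the user can dismiss any dialog.
            time.sleep(delay)
            printed.append(path)
        finally:
            # Close the document even if printing raised an error.
            doc.Close()
    return printed
```

On Windows this could be called with `client.Dispatch("Word.Application")` as the first argument, followed by `word.Quit()` once the batch is done. Taking the application object as a parameter also means the loop can be exercised with a stand-in object on a machine that has no Word installed.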

Printing PDF documents

Adobe Acrobat doesn’t have a COM interface. For that you have to buy Acrobat Writer. Thankfully Internet Explorer comes to the rescue (I never thought I would say that). IE has a COM interface, and via it you can control the embedded Acrobat.

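    from win32com import client
    import os
    import sys
    import time

    ie = client.Dispatch("InternetExplorer.Application")

    def printPDFDocument(filename):
        ie.Navigate(os.path.abspath(filename))

        # Poll until IE has finished loading; a single check can return too early.
        while ie.Busy:
            time.sleep(1)

        ie.Document.printAll()
        time.sleep(2)  # let the job spool before the next document or Quit

    try:
        # Print every file named on the command line.
        for filename in sys.argv[1:]:
            printPDFDocument(filename)
    finally:
        ie.Quit()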

The only catch is that you need at least Acrobat 7.0.7. Versions prior to 7 don’t have the printAll method, and 7.x versions prior to 7.0.7 have a bug. The first time your script tries to print, this dialog box appears:

WARNING! A script has requested to print an Acrobat file. This could print an entire document. Do you want to proceed printing?

Click the Don't ask me again tick box and say Yes. On versions prior to 7.0.7 it always remembers No, even when you said Yes. Further details on the bug, and how I found out about it, are here.

You can find documentation on the IE COM interface at MSDN.
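IE can take a while to load a large PDF, and a wait on `ie.Busy` with no upper bound can hang a script if the page never finishes loading. Below is a minimal sketch of a bounded wait, assuming only that the object passed in exposes a `Busy` property as the IE COM object does. The names `wait_until_idle`, `timeout` and `poll` are made up for illustration.

```python
import time


def wait_until_idle(browser, timeout=30.0, poll=0.5):
    """Wait while browser.Busy is true.

    Returns True once the browser is idle, or False if `timeout`
    seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while browser.Busy:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

A script could call `wait_until_idle(ie)` after `ie.Navigate(...)` and skip the document, rather than calling `printAll()`, when it returns False.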

What about from Java?

I have used JACOB to call COM objects from Java on Windows before, and it worked well. The application names and methods will be the same in any language.