Using Gimp to Alter PDF Files
Copyright (C) 2009 by Steve Litt, All rights reserved. Material provided as-is, use at your own risk.
By Steve Litt
It's nice of organizations and the government to provide online PDFs of their forms, but filling them out, especially in any size but US Letter, is a pain.
One way to
do it is to fill the form out online. Your mileage may vary, but
when filling out something important like a trademark application, I
like to take my time, repeatedly read over what I have, and ask
questions. On at least one occasion I've taken a week to fill out a
trademark application, even though it's probably 400 words at the most.
Such careful and slow work isn't compatible with web forms, especially
when a lot of government "fill it out online" forms won't let you save
partial work.
Another way may be to purchase Adobe Acrobat. I've
heard it through the grapevine that Adobe Acrobat lets you modify PDF
files to your heart's content, always assuming they're not DRM'ed. But
if you're a Free Software kinda guy, you're looking to do it with free
tools on your Linux machine. That's what this web page is all about.
Here's the basic process:
- Open the PDF as separate images in Gimp.
- Save each created image as an .xcf file, numbered consecutively. For instance, if you GIMP a 12 page PDF, you'll name the created images p01.xcf through p12.xcf.
- For each page that you'll be changing, add a transparent layer called "writing" right on top of the background. Then put any text on top of the "writing" layer, which protects the background (the scanned image) from alteration.
- If any pages need a signature in the finished PDF, see Scanning and Using a Signature below.
- When you're done editing all the pages, convert them all to TIFF files.
- Use tiffcp to put all the TIFF images into one multi-page TIFF file.
- Use the tiff2pdf program to create the new PDF file.
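The numbering convention above relies on zero-padded page numbers, so that p02 sorts before p10 when the files are listed or globbed later. A minimal sketch of generating those names, assuming a POSIX shell; the function name page_names is made up for illustration and is not part of Gimp or the TIFF tools:

```shell
#!/bin/sh
# page_names COUNT EXT
# Hypothetical helper: prints p01.EXT, p02.EXT, ... up to COUNT, one per line.
# Zero padding keeps the files in page order when listed or globbed.
page_names() {
    count=$1
    ext=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'p%02d.%s\n' "$i" "$ext"
        i=$((i + 1))
    done
}

# Example: the twelve names for a 12 page PDF
# page_names 12 xcf
```

Two digits cover PDFs of up to 99 pages; a longer document would need a wider pad.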
Scanning and Using a Signature
Here's how you scan it:
- Sign a piece of paper with a black pen of substantial weight.
- Scan the signature. Scan it dark.
- Open the scanned signature in Gimp.
- Remove any extraneous spots.
- Crop to leave a few pixels of white beyond the black on all four sides.
- Make the layer transparent: Layer->Transparency->Add_Alpha_Channel
- Doubleclick the "select by color" icon and set the threshold value to 15. Don't set it much lower or you risk white pixels overwriting darker background when you finally use the signature.
- Click the "select by color" icon, then click the whitest part of the signature image. You'll see selections on everything except the signature.
- Rightclick->Edit->Clear, and everything except the signature becomes transparent.
- Rightclick->Select->None to view the signature.
- If the signature contains significant whitish looking pixel blobs, use Undo to back out past the "select by color", then repeat the threshold step with the value increased by 10, and repeat the steps after it, from clicking the whitest part of the image through this check. You want to get to the point where you have plenty of mid-gray pixels, but no significant whitish pixel blobs.
- Once it's right, save the signature image file.
You use it by pasting it into the page images created by GIMPing a .pdf file. Never paste it directly on the background, but instead create a transparent layer above the background, called "writing", and paste the signature above the "writing" layer.
Exercise: Filling out a 1040 Tax Form
- Download a 1040 tax form: wget http://www.irs.gov/pub/irs-pdf/f1040.pdf
- Import the PDF into Gimp, as 1 .xcf file per page:
  - Run gimp f1040.pdf. A dialog box appears.
  - Change the "Open pages as" dropdown to "images" so that each PDF page becomes a separate image.
  - Click the Import button. An image opens for each page.
  - Save the files as p01.xcf and p02.xcf for pages 1 and 2 respectively.
- Modify the Pages
  - On each page:
    - Rightclick->Dialogs->Layers
    - Click the "new layer" button on the layers list and create a transparent layer called "writing" above the numbered layer, which is really the background. The number is the page number of the original.
  - Click on p01.xcf and zoom it until you have room to work.
  - Press T for the text tool. Type in your first name and initial. DO NOT close the text entry box.
  - Click the image window and then press Ctrl+B to bring the toolbox to the forefront.
  - Doubleclick the Text tool icon.
  - Select a TrueType font (you want something readable in the Windows world). I suggest Arial Bold unless the form has little boxes for each letter, in which case I'd recommend Courier New Bold. Then change the size until you have something appropriate for the form.
  - Type in the text, and readjust the size if necessary.
  - On page 2, in the box labeled "Federal income tax withheld from Forms W2 and 1099", type in $10,000.00. Make sure that you type that ABOVE the "writing" layer, not below it.
  - Save both pages.
- Convert back to a PDF
  - Save them again, this time as p01.tif and p02.tif.
  - Do this command: tiffcp p01.tif p02.tif p.tif
    - The preceding combines both pages into a single TIFF file.
  - Do this command: tiff2pdf -p letter -o p.pdf p.tif
    - The preceding converts the TIFF file to PDF, and specifies that the intended paper size is letter. Other paper size choices are legal and A4.
- Use a PDF reader to verify that you have a 2 page PDF 1040 tax form containing the changes you made on each page.
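For forms with more than two pages, the two conversion commands can be wrapped in a small script. The following is only a sketch under stated assumptions: the names run, pages_to_pdf and DRY_RUN are invented here for illustration, and the intermediate TIFF is simply named after the output PDF. The tiffcp and tiff2pdf invocations are the same ones used in the exercise.

```shell
#!/bin/sh
# run CMD ARGS...
# Hypothetical wrapper: with DRY_RUN=1 it prints the command instead of
# running it, so the command lines can be checked first.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# pages_to_pdf PAPER OUT.pdf PAGE.tif...
# Combines the page TIFFs into one multi-page TIFF, then converts it to PDF.
# PAPER is passed to tiff2pdf -p (letter, legal or A4).
pages_to_pdf() {
    paper=$1
    out=$2
    shift 2
    combined="${out%.pdf}.tif"
    run tiffcp "$@" "$combined" &&
    run tiff2pdf -p "$paper" -o "$out" "$combined"
}

# Example: every two-digit page file in the current directory, in name order
# pages_to_pdf letter p.pdf p??.tif
```

The p??.tif glob matches p01.tif, p02.tif and so on, but not the combined p.tif, so rerunning the script does not feed its own output back in.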
Printing Legal Size PDFs
There's a special place in the land of the devil reserved for organizations and governmental entities whose forms are legal sized (8.5 x 14 inches). Very few people have legal size paper on hand, and a single ream of legal sized paper costs nine bucks, almost double the price of letter size. Many printers can't print this size at all. Those that can often don't have a legal sized tray attached, so all trays must be pulled out and the legal paper fed manually. Normal image handling commands often don't work with legal sized paper. My experience tells me that using legal sized forms doubles the work, but nevertheless sometimes you have to do it. When you do, here's how...
Your mileage may vary, but I was not able to produce a PDF which could be printed with a simple lpr command:
lpr -P lp_myprinter mylegaldoc.pdf
The preceding command causes part of each page to be cut off. Instead, I had to specify the paper size in the lpr command:
lpr -P lp_myprinter -o PageSize=Legal mylegaldoc.pdf
The preceding command works well, but of course you can't expect the drones at the organization or governmental entity to run that command (even if they had Linux, they probably wouldn't know how to). So instead, let them know they can print the document from Acrobat Reader with these settings:
- Page scaling = none
- Auto rotate and center = yes (probably)
- Properties button->Page Size = legal
The preceding are all set from the dialog box that pops up after File->Print from within Acrobat Reader. Note that the recipient may be using a different version of Acrobat Reader, so the properties might be accessed a little differently, but scaling, center and size have long been a part of Acrobat Reader, so the recipient will probably know how to set these if he or she is in the business of receiving PDFs to print.
Cropping
I
Color Reduction
T