Paper document archival
For the past decade I’ve scanned any paper documents that might be worth saving into my laptop as PDFs, and then shredded them.
When I push the blue button on the scanner, the file gets scanned, run through optical character recognition (OCR), automatically renamed to YYYY-MM-DD - «Document Type».pdf, and moved to a Paper Records folder.
Setup
Here’s how I set it up on macOS.
My scanner is a Fujitsu ScanSnap (model S1300i; I’m not sure whether it’s still sold).
I created an Incoming Scans folder on my desktop and configured the ScanSnap software to save all new scanned PDFs directly into that folder.
I also created a Paper Records folder. The PDFs get moved here after they have been properly renamed and tagged.
I bought Hazel for automated file processing and PDFPen Pro (so I can do automated OCR on scans).
OCR
I created a Hazel rule for the Incoming Scans folder called “OCR all PDFs”.
- If all of the conditions are met:
  - Extension is pdf
  - Tags do not contain OCR
- Do the following:
  - Run AppleScript (embedded script)
  - Add tags: OCR
The AppleScript is below. I’ve stripped it down from other versions of it you might find online; it doesn’t attempt to detect if the program was already open or if the document was already OCR’d.
Note: Make sure you configure PDFPen not to prompt you to OCR when opening new documents. Otherwise it will interfere with the AppleScript automation and Hazel will throw errors.
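-- Hazel passes the matched file to an embedded script as theFile.
tell application "PDFpen"
    open theFile as alias
    tell document 1
        ocr
        -- Poll until PDFpen reports that OCR has finished.
        repeat while performing ocr
            delay 1
        end repeat
        delay 1
        close with saving
    end tell
    quit
end tell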
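The two conditions plus the final tagging step are what keep Hazel from running OCR on the same file twice. As a rough sketch of that logic in Python (this is not how Hazel is implemented; `needs_ocr`, `process`, and the `run_ocr` callback are hypothetical names used only for illustration):

```python
def needs_ocr(filename, tags):
    """Mirror the rule's conditions: a PDF that is not yet tagged OCR."""
    return filename.lower().endswith(".pdf") and "OCR" not in tags


def process(filename, tags, run_ocr):
    """Run OCR at most once per file and return the updated tag set.

    run_ocr is a hypothetical callback standing in for the AppleScript step.
    """
    if needs_ocr(filename, tags):
        run_ocr(filename)
        return set(tags) | {"OCR"}
    return set(tags)


if __name__ == "__main__":
    calls = []
    tags = process("scan.pdf", set(), calls.append)
    tags = process("scan.pdf", tags, calls.append)  # second pass is a no-op
    print(len(calls), sorted(tags))
```

Because the tag is added only after the script runs, a file that fails OCR stays untagged and is picked up again the next time the rule fires.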
Auto-rename and -tag common documents
Set up additional Hazel rules for the Incoming Scans folder, one rule for each type of document you want auto-filed.
Example rule:
- If all of the conditions are met:
  - Contents contain CITY OF SPRINGVILLE
  - Contents contain match BILLING DATE«…»«Billing Date»
- Do the following:
  - Move to folder Paper Records
  - Rename with pattern «Billing Date» - Springville utilities«extension»
  - Add tag: utilities
The «…» in the match pattern refers to Hazel’s “Anything” pattern element, and «Billing Date» refers to a custom date element.
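For readers who think in regular expressions, the example rule behaves roughly like the Python sketch below. It is only an approximation of what Hazel does. It assumes the bill prints the date as MM/DD/YYYY (the real format may differ) and that “Anything” may span line breaks; `springville_filename` is a hypothetical helper name.

```python
import re
from datetime import datetime

# "BILLING DATE", then anything (non-greedy), then a date.
# Assumption: dates appear as MM/DD/YYYY in the OCR text.
BILLING_DATE = re.compile(r"BILLING DATE.*?(\d{1,2}/\d{1,2}/\d{4})", re.DOTALL)


def springville_filename(ocr_text, extension=".pdf"):
    """Return the new file name, or None if the rule's conditions are not met."""
    if "CITY OF SPRINGVILLE" not in ocr_text:
        return None
    match = BILLING_DATE.search(ocr_text)
    if match is None:
        return None
    try:
        billing_date = datetime.strptime(match.group(1), "%m/%d/%Y").date()
    except ValueError:
        # Digits that look like a date but are not one (e.g. month 13).
        return None
    return f"{billing_date.isoformat()} - Springville utilities{extension}"


if __name__ == "__main__":
    text = "CITY OF SPRINGVILLE\nBILLING DATE: 03/15/2021\nAMOUNT DUE ..."
    print(springville_filename(text))
```

Converting the captured date to ISO format is what yields the YYYY-MM-DD prefix, so files sort chronologically by name.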
Manually renamed documents
I don’t have a rule for every type of document I scan; there will always be one-offs that Hazel can’t inspect and rename for me automatically. But I have an extra rule (“Autofile manually renamed files”) that files them into Paper Records once I’ve renamed them myself:
- If all of the conditions are met:
  - Name matches «Filename datestamp» - «…»
- Do the following:
  - Move to folder Paper Records
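The name check in this rule can be approximated in Python as follows. This is a sketch under assumptions: that «Filename datestamp» is a YYYY-MM-DD date element, and that the name is tested without its extension. `is_manually_renamed` is a hypothetical helper name.

```python
import re
from datetime import date
from pathlib import Path

# YYYY-MM-DD, then " - ", then at least one more character.
FILENAME_DATESTAMP = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) - .+$")


def is_manually_renamed(filename):
    """True if the name (without extension) looks like 'YYYY-MM-DD - Something'."""
    match = FILENAME_DATESTAMP.match(Path(filename).stem)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # reject impossible dates such as month 13
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for name in ("2023-04-01 - Car insurance.pdf", "Scan 001.pdf"):
        print(name, is_manually_renamed(name))
```

Anything still carrying the scanner's default name fails the check and stays in the incoming folder until it is renamed.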