I recently worked on a project that involved organizing PDF files. I’ve long known that there are countless open source PDF manipulation tools, but I had never used many of them myself, and certainly not from a Linux shell.
Specifically, I needed to:
- split a multi-page PDF into individual pages
- create a thumbnail image to preview each page
- extract all readable text from each page for searching
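
As a rough sketch of how those three steps might be scripted, the snippet below builds one command per task for a single page. It assumes the poppler-utils command-line tools (`pdfseparate`, `pdftoppm`, `pdftotext`) are installed, which is only one possible toolset and not necessarily the one this project ended up using. The helper names `page_commands` and `run_page`, the `out` directory, and the 200-pixel thumbnail size are all made up for illustration.

```python
import shlex
import subprocess
from pathlib import Path


def page_commands(pdf, page, out_dir="out"):
    """Build the three commands for one page (page numbers start at 1).

    Assumes poppler-utils; nothing is executed here, the commands are
    only returned as argument lists.
    """
    stem = f"{out_dir}/page-{page}"
    only_this_page = ["-f", str(page), "-l", str(page)]
    return [
        # 1. Split: write the page to its own PDF (%d becomes the page number).
        ["pdfseparate", *only_this_page, pdf, f"{out_dir}/page-%d.pdf"],
        # 2. Thumbnail: render the page as a PNG scaled to fit a 200 px box.
        ["pdftoppm", "-png", "-singlefile", "-scale-to", "200",
         *only_this_page, pdf, stem],
        # 3. Text: write the page's readable text to a .txt file.
        ["pdftotext", *only_this_page, pdf, f"{stem}.txt"],
    ]


def run_page(pdf, page, out_dir="out"):
    """Run the three commands for one page, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in page_commands(pdf, page, out_dir):
        print(shlex.join(cmd))  # show the equivalent shell command
        subprocess.run(cmd, check=True)
```

Calling `run_page("in.pdf", 3)` would print each command in shell form before running it, so the same lines can be copied straight into a terminal. Building the commands separately from running them keeps the sketch easy to inspect on a machine where the tools are not installed.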