
Splitting by bookmarks (undo merge -BookmarkAll)


I've just been handed several large PDFs that I need to split into individual files. Fortunately, it looks like someone used PDF-ShellTools to merge them, as each sub-file has a bookmark.

But as there's no way to undo a merge -BookmarkAll, I'm going to have to reach for PDFtk to extract the bookmark information and write an MS-DOS batch script to call PDF-ShellTools with the right start/end pages and titles.
Something like this (as typed at the command prompt; double the percent signs inside a .bat file):
for /F "eol=; tokens=1,2*" %i in (bookmarks.txt) do PDFShellTools Split -s SplitRules=%i-%j "OutputFilename=%k" BigMergedFile.pdf

If there isn't a way to do this with the API (which I haven't checked exhaustively), could you please consider it for a future release?
[edit = bug fixed in above - %k is automatically created]

Not possible with the current version, because the bookmarks script API object still lacks access to the target page reference. It's something I will try to add to the split tool and scripts API in an upcoming release. ;)

Thanks for mentioning the need of this functionality.

Thanks - PDF Labs have managed to work this out in PDFtk. It's GPL, so you may get some hints from how they've implemented it (I have no idea!)

I ran

pdftk "BigMergedFile.pdf" dump_data > info.txt

This generates a whole bunch of data like the following:

BookmarkTitle: TCDL-801198_A
BookmarkLevel: 1
BookmarkPageNumber: 84

plus the all-important total number of pages:

NumberOfPages: 1472

A bit of mucking about in Excel (or your favourite scripting language) can turn this into text like:
; 1st page   Last   Title
1   2   Table Of Contents
3   24   BE847-AK-BOM-100_1_Reviewed_AC
25   77   B8F47-AK-SPC-100_1_Reviewed_AC
78   83   B84G7-TK-BLD-101.001_1_IPX Reviewed_AC
84   89   TCDL-871198_A

which can be fed into the FOR loop above.
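
For anyone who would rather script the spreadsheet step, here is a minimal Python sketch of it. It assumes each bookmark record in the dump_data output lists BookmarkTitle, then BookmarkLevel, then BookmarkPageNumber (as in the sample above), and that the top-level bookmarks appear in increasing page order. The function names and the script name bookmarks.py are made up for illustration; they are not part of PDFtk or PDF-ShellTools.

```python
import re
import sys


def bookmark_ranges(dump_text):
    """Return (first_page, last_page, title) for each level-1 bookmark."""
    match = re.search(r"^NumberOfPages:\s*(\d+)", dump_text, re.MULTILINE)
    if match is None:
        raise ValueError("NumberOfPages not found in dump_data output")
    total_pages = int(match.group(1))

    # Collect (start_page, title) for top-level bookmarks.
    # Assumption: each record lists Title, then Level, then PageNumber.
    starts = []
    title = level = None
    for line in dump_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "BookmarkTitle":
            title = value
        elif key == "BookmarkLevel":
            level = int(value)
        elif key == "BookmarkPageNumber":
            page = int(value)
            # Skip nested bookmarks and bookmarks without a real page.
            if title is not None and level == 1 and page >= 1:
                starts.append((page, title))
            title = level = None

    # Each section ends on the page before the next one starts;
    # the last section runs to the end of the file.
    ranges = []
    for i, (first, name) in enumerate(starts):
        if i + 1 < len(starts):
            # max() avoids an inverted range if two bookmarks share a page.
            last = max(first, starts[i + 1][0] - 1)
        else:
            last = total_pages
        ranges.append((first, last, name))
    return ranges


def format_ranges(ranges):
    """Format ranges as 'first<TAB>last<TAB>title' lines for FOR /F."""
    lines = ["; 1st page\tLast\tTitle"]
    lines += [f"{first}\t{last}\t{name}" for first, last, name in ranges]
    return "\n".join(lines) + "\n"


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python bookmarks.py info.txt bookmarks.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as src:
        text = src.read()
    with open(sys.argv[2], "w", encoding="utf-8") as dst:
        dst.write(format_ranges(bookmark_ranges(text)))
```

The output is tab-separated, which suits the default FOR /F delimiters (space and tab). The header line starts with ";" so eol=; skips it, and the trailing * in tokens=1,2* puts the whole title, spaces included, into the third variable.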

With the 2.6.3 minor version release, it is now possible to split by top-level bookmarks directly from the split/extract pages tool. The release also adds a new DestPageIndex property to the scripting API Bookmark object, so any operation that needs a bookmark's destination page index can now read it.

Thanks!  Top level split works as advertised.

