RTTSoftware Support Forum

PDF Explorer => General => Topic started by: Padanges on November 25, 2016, 08:03:49 AM

Title: Problems with text indexation
Post by: Padanges on November 25, 2016, 08:03:49 AM
Hi,
It seems that the Index text words batch tool does not re-index already indexed documents, even if they were modified: after modifying a PDF and using the dbSearch tool I get the "bypasing already indexed text files" I/O message. Is that correct? How can we then re-index a document?
Also, I have noticed that indexed PDF documents don't get their bookmarks indexed. That's unfortunate: I have many PDF documents which are not OCRed, i.e. do not contain any text, but have a "quick-fix" for keywords/tags as a bookmark structure. Is it possible to expand dbSearch so that it could search the bookmarks as part of the text as well?


Thanks in advance
Title: Re: Problems with text indexation
Post by: RTT on November 27, 2016, 01:47:27 AM
Quote
How can we then re-index a document?
Right now you can do this by removing the containing folder from the DB with the database edit tool (menu database>edit). In the DB tree, select the folder(s) that contain the file(s) and delete them. Afterwards, you just need to index the files again.
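
For illustration only, here is a minimal sketch of how an indexer could decide by itself that a modified file needs re-indexing, by comparing the file's modification time with the one recorded at indexing time. This is not PDF Explorer's code; the function names and the `indexed_mtimes` dictionary are hypothetical.

```python
import os

def needs_reindex(path, indexed_mtimes):
    """Return True if `path` was never indexed or changed since it was indexed."""
    current = os.stat(path).st_mtime_ns
    return indexed_mtimes.get(path) != current

def mark_indexed(path, indexed_mtimes):
    """Record the file's modification time at the moment it is indexed."""
    indexed_mtimes[path] = os.stat(path).st_mtime_ns
```

A batch run would call `needs_reindex` for each file and skip only those that return False, instead of skipping every file that is already in the DB.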

Quote
Also, I have noticed that indexed PDF documents don't get their bookmarks indexed. That's unfortunate: I have many PDF documents which are not OCRed, i.e. do not contain any text, but have a "quick-fix" for keywords/tags as a bookmark structure. Is it possible to expand dbSearch so that it could search the bookmarks as part of the text as well?
The pdfe can already parse the PDF bookmarks, so indexation of the bookmarks is something that can be added. When the text content indexer got developed, bookmarks support wasn't available yet.
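
As a rough sketch of the idea (an assumption about one possible approach, not how pdfe or dbSearch is implemented), bookmark titles could simply be appended to the page text before the words are added to the index. The nested `title`/`children` structure and both function names below are hypothetical.

```python
import re
from collections import defaultdict

def bookmark_titles(bookmarks):
    """Yield every title in a nested bookmark tree, depth first."""
    for node in bookmarks:
        yield node["title"]
        yield from bookmark_titles(node.get("children", []))

def index_document(index, doc_id, page_text, bookmarks):
    """Add the words from the page text and the bookmark titles to an inverted index."""
    combined = " ".join([page_text, *bookmark_titles(bookmarks)])
    for word in re.findall(r"\w+", combined.lower()):
        index[word].add(doc_id)

index = defaultdict(set)
# A scanned PDF with no text layer, only a bookmark structure used as tags.
scanned = [{"title": "Invoices 2016", "children": [{"title": "Electricity"}]}]
index_document(index, "scan.pdf", "", scanned)
print(sorted(index))  # ['2016', 'electricity', 'invoices']
```

With this approach a scanned document without any text still becomes searchable through the words in its bookmarks.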
Title: Re: Problems with text indexation
Post by: Padanges on November 30, 2016, 10:21:57 AM
One more feature for the next release!  ;D
I'm leading the high-score board  8)
Title: Re: Problems with text indexation
Post by: Padanges on December 16, 2016, 09:35:45 AM
Quote
The pdfe can already parse the PDF bookmarks, so indexation of the bookmarks is something that can be added.
Will bookmark indexation be added as a separate option/check, or will it be done automatically?
Title: Re: Problems with text indexation
Post by: RTT on December 18, 2016, 01:16:18 AM
Automatically.