Author Topic: Keywords export in PDF Explorer?  (Read 8802 times)

Wolfram

  • Guest
Keywords export in PDF Explorer?
« on: March 02, 2012, 12:04:55 AM »
Your PDF Explorer is a great product. Keyword editing is easy with your program, even with many PDF files.

What I need and could not find is a keyword list export.

The keywords of every single file are shown in PDF Explorer. To search for specific keywords (stored as PDF metadata or XMP metadata), I need a list of all available keywords, ideally sorted A-Z.
Such a list would also be very helpful as a database of keywords that have already been submitted.

Q1: How can I export keywords with the current version today?

Q2: What is your opinion on whether such an export is worth considering? And if yes, what are your plans?

Q3: Do you know of other software that can export keywords after scanning PDFs?

Hoping for your answer with comments.

RTT

  • Administrator
  • *****
  • Posts: 907
Re: Keywords export in PDF Explorer?
« Reply #1 on: March 02, 2012, 12:18:38 AM »
Quote
Q1: How can I export keywords with the current version today?
You can use the File>ExportGridFields tool to export the keywords column to an external .csv or .txt file.
Then you can post-process that file with a script or a similar tool to build a sorted list of unique keywords.

Here is a simple script you could use to post-process that exported keywords column file.

Code:
/*****************
Helper prototype methods
*****************/
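// Add indexOf only where the engine does not already provide it.
if (!Array.prototype.indexOf) {
  Array.prototype.indexOf = function (obj, start) {
    // start is optional: the index at which the search begins (default 0)
    for (var i = (start || 0); i < this.length; i++) {
      if (this[i] === obj) {
        return i;
      }
    }
    return -1;
  };
}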

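// Add trim only where the engine does not already provide it.
if (!String.prototype.trim) {
  String.prototype.trim = function () {
    var str = this.replace(/^\s\s*/, ''), // strip leading whitespace
        ws = /\s/,
        i = str.length;
    // walk back from the end until a non-whitespace character is found
    while (ws.test(str.charAt(--i)));
    return str.slice(0, i + 1);
  };
}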

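// Append each cleaned, non-empty item of a that is not already in this array.
Array.prototype.uniqueMerge = function (a) {
  for (var i = 0, l = a.length; i < l; ++i) {
    var s = a[i].replace(/['"]/g, '').trim(); // drop quotes, then trim
    if (s && this.indexOf(s) === -1) {
      this.push(s);
    }
  }
  return this;
};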

/*****************
code starts here
****************/
var fso = new ActiveXObject("Scripting.FileSystemObject");
// first argument: path of the exported keywords file; 1 = open for reading
var f = fso.OpenTextFile(WScript.Arguments.Item(0), 1);
var keywordsList = [];
while (!f.AtEndOfStream) {
  // keywords on a line may be separated by commas or semicolons
  var keywords = f.ReadLine().split(/,|;/);
  keywordsList = keywordsList.uniqueMerge(keywords);
}
f.Close();

keywordsList = keywordsList.sort();

WScript.Echo(keywordsList.join('; '));

Just save this code to a BuildKeywordsList.js file, then drag and drop the keywords column file exported from PDFE onto it. It will show a popup with the sorted list of all your PDFs' keywords.

If you want that output in a file, run the following command line in the folder that contains both BuildKeywordsList.js and Keywords.txt:
cscript //NoLogo BuildKeywordsList.js Keywords.txt>KeywordsList.txt

And feel free to ask if you have any doubts about using this solution.
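
To make the post-processing step concrete, here is a minimal sketch of the same split, clean, de-duplicate and sort logic as a plain function, run on a few made-up lines. It is written for a modern JavaScript engine such as Node.js (hence console.log and the built-in trim and indexOf) rather than for Windows Script Host. The name buildKeywordList and the sample data are assumptions for illustration only; they are not part of PDFE or of the script above.

```javascript
// buildKeywordList is a hypothetical helper name; it mirrors the logic of
// the script above without reading any file.
function buildKeywordList(lines) {
  var list = [];
  for (var i = 0; i < lines.length; i++) {
    // keywords on a line may be separated by commas or semicolons
    var parts = lines[i].split(/,|;/);
    for (var j = 0; j < parts.length; j++) {
      // drop quote characters, then surrounding whitespace
      var s = parts[j].replace(/['"]/g, "").trim();
      // keep only non-empty keywords that are not in the list yet
      if (s && list.indexOf(s) === -1) {
        list.push(s);
      }
    }
  }
  return list.sort();
}

// Made-up sample of an exported keywords column, one line per PDF.
var sample = [
  '"invoice; tax, 2011"',
  "tax;report",
  "",
  "Report, invoice"
];

var result = buildKeywordList(sample);
console.log(result.join("; "));
// prints: 2011; Report; invoice; report; tax
```

Note that the default sort and the duplicate check are both case-sensitive, so "Report" and "report" are kept as two different keywords, and uppercase entries sort before lowercase ones.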

Quote
Q2: What is your opinion on whether such an export is worth considering? And if yes, what are your plans?
The next PDFE release will have a scripting tool, so custom scripting of this kind of task will be quite easy.

Quote
Q3: Do you know of other software that can export keywords after scanning PDFs?
No

Padanges

  • Newbie
  • *
  • Posts: 179
Re: Keywords export in PDF Explorer?
« Reply #2 on: September 05, 2016, 12:22:06 PM »
Code:
if (!Array.prototype.indexOf) {
  Array.prototype.indexOf = function (obj, start) {
    for (var i = (start || 0); i < this.length; i++) {
      if (this[i] === obj) {
        return i;
      }
    }
    return -1;
  };
}

Could you please clarify why we use the keyword start here? The method prototype works just fine without it, too.

RTT

  • Administrator
  • *****
  • Posts: 907
Re: Keywords export in PDF Explorer?
« Reply #3 on: September 06, 2016, 01:48:48 AM »
Quote
Could you please clarify why we use the keyword start here? The method prototype works just fine without it, too.
Not quite sure what you are asking here. :-\
The JScript language lacks the indexOf method for arrays, so the above code just adds it to the Array prototype. The code implements the same behavior as the one in JavaScript. See the JavaScript documentation of Array.prototype.indexOf for the optional start parameter.
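
As a small illustration of what start is for, here is a sketch that uses the same loop as a standalone function and resumes the search after the first hit. The function name indexOfFrom and the sample array are made up for this example, and console.log assumes a modern JavaScript engine such as Node.js; under Windows Script Host you would use WScript.Echo instead.

```javascript
// indexOfFrom is a hypothetical name; the loop is the same as in the
// prototype method above, with the array passed in as an argument.
function indexOfFrom(arr, obj, start) {
  // start is optional; when it is omitted the search begins at index 0
  for (var i = (start || 0); i < arr.length; i++) {
    if (arr[i] === obj) {
      return i;
    }
  }
  return -1;
}

var keywords = ["pdf", "xmp", "metadata", "pdf", "export"];

var first = indexOfFrom(keywords, "pdf");             // no start: search from 0
var second = indexOfFrom(keywords, "pdf", first + 1); // resume after the first hit
var none = indexOfFrom(keywords, "pdf", second + 1);  // nothing left to find

console.log(first, second, none);
// prints: 0 3 -1
```

The script in this thread always calls indexOf without start, which is why removing the parameter makes no visible difference there. It only matters when you want to continue a search from a given position, as in the second and third calls.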

Padanges

  • Newbie
  • *
  • Posts: 179
Re: Keywords export in PDF Explorer?
« Reply #4 on: September 06, 2016, 06:47:09 AM »
"It's an optional parameter which defines where to start the search". Thanks.