Author Topic: Keywords Problem by Automation with Excel VBA

stephan

Keywords Problem by Automation with Excel VBA
« on: January 17, 2015, 08:08:33 AM »
Hello everyone,

The problem is a little hard to explain. I am trying to write metadata, especially keywords, into PDF files from an Excel VBA script using the Adobe Acrobat 10.0 Type Library (adoberfp.dll). Important: to use the objects from this library, Adobe Acrobat Pro XI must be installed; otherwise you get run-time error 429, because the ActiveX component cannot be created.

The code I use is:

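Sub pdf_changer()
    Const Pdfdatnam = "C:\test.pdf"
    ' PDSaveFlags: PDSaveFull = 1, PDSaveLinearized = 4, PDSaveCollectGarbage = 32
    Const SaveFlags As Long = 1 Or 4 Or 32
    Dim objApp As Acrobat.CAcroApp
    Dim objAcroAVDoc As Acrobat.CAcroAVDoc
    Dim objAcroPDDoc As Acrobat.CAcroPDDoc

    Set objApp = CreateObject("AcroExch.App")
    Set objAcroAVDoc = CreateObject("AcroExch.AVDoc")

    ' Open returns False when the file cannot be opened
    If Not objAcroAVDoc.Open(Pdfdatnam, "") Then
        MsgBox "Could not open " & Pdfdatnam
        objApp.Exit
        Exit Sub
    End If
    Set objAcroPDDoc = objAcroAVDoc.GetPDDoc

    ' Write the basic document info fields and save over the original file
    With objAcroPDDoc
        .SetInfo "Title", "Title of PDF"
        .SetInfo "Author", "Author of PDF"
        .SetInfo "Subject", "Subject of PDF"
        .SetInfo "Keywords", "Keywords of PDF"
        .Save SaveFlags, Pdfdatnam
        .Close
    End With

    objAcroAVDoc.Close False
    objApp.Exit
End Sub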

Now the problem is: if more than one keyword has already been set with PDF-ShellTools, this sub cannot overwrite them. The function just adds its keyword to the existing ones. If you add a keyword this way and then read the keywords back with "Keywords = .GetInfo("Keywords")", GetInfo returns only the added keyword and not the others.

So my question is: is it possible to delete the keywords set by PDF-ShellTools from an Excel VBA script, so that I can rewrite them with the script above? Or does anyone know a better solution?

Thanks in advance
Stephan



RTT

Re: Keywords Problem by Automation with Excel VBA
« Reply #1 on: January 17, 2015, 08:49:51 PM »
This is not a bug in PDF-ShellTools.
The problem occurs because the Acrobat API PDDoc.SetInfo method, which only accesses the document's basic metadata fields, doesn't update the more advanced XMP object correctly.
The PDF XMP object keeps the keywords in two places: the <pdf:Keywords> node, in the pdf namespace (as a single string holding a semicolon-separated list of keywords), and the <dc:subject> node, in the Dublin Core namespace (as a Bag list of individual keywords). The document SetInfo method only updates the keywords under the pdf namespace. The Acrobat file properties dialog and PDF-ShellTools read and write both namespaces. When the keywords listed in the two namespaces are not the same, you get a merged list of both. That's why, after running your code, it looks as if your keyword was appended to the existing ones.
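
To check which of the two places holds what, something like the following could be run against the file. It is only a sketch under some assumptions: a full Acrobat installation, MSXML 6.0 on the machine, and that JSObject.metadata hands the XMP packet back to VBA as a plain string. DumpXmpKeywords and the test path are made-up names for illustration.

```vba
' Sketch: print the keywords held in each XMP location of a PDF.
' Assumes JSObject.metadata returns the XMP packet as a string.
' Note: JSObject member names are case sensitive, so make sure the
' VBA editor has not re-capitalised "metadata".
Sub DumpXmpKeywords()
    Const PdfPath As String = "C:\test.pdf"   ' hypothetical test file
    Const Ns As String = _
        "xmlns:pdf='http://ns.adobe.com/pdf/1.3/' " & _
        "xmlns:dc='http://purl.org/dc/elements/1.1/' " & _
        "xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
    Dim pdDoc As Object, jso As Object, xml As Object, node As Object

    Set pdDoc = CreateObject("AcroExch.PDDoc")
    If Not pdDoc.Open(PdfPath) Then
        Debug.Print "Could not open " & PdfPath
        Exit Sub
    End If
    Set jso = pdDoc.GetJSObject

    Set xml = CreateObject("MSXML2.DOMDocument.6.0")
    xml.async = False
    xml.setProperty "SelectionNamespaces", Ns

    If xml.LoadXML(jso.metadata) Then
        ' pdf:Keywords may be stored as an element or as an attribute
        For Each node In xml.SelectNodes("//pdf:Keywords | //@pdf:Keywords")
            Debug.Print "pdf:Keywords -> " & node.Text
        Next node
        ' dc:subject holds one rdf:li entry per keyword
        For Each node In xml.SelectNodes("//dc:subject/rdf:Bag/rdf:li")
            Debug.Print "dc:subject   -> " & node.Text
        Next node
    Else
        Debug.Print "XMP did not parse: " & xml.parseError.reason
    End If

    pdDoc.Close
End Sub
```

If the two lists printed in the Immediate window differ, that is the situation described above where the properties dialog shows the merged list.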

To deal with this:

You may use the Acrobat JSObject, which you get from the PDDoc.GetJSObject method, and then work with the XMP XML string that you can get and set through the JSObject.metadata property. You need to parse and edit this XML from your VBA code, or even in the Acrobat JavaScript engine itself (using the JSObject.console). Take a look at example 3 of the metadata property in the "JavaScript for Acrobat API Reference" manual to see what needs to be done to edit the XMP.

Or you may call the PDF-ShellTools SetMetadata function from your VBA code to set the metadata, either through the command line interface or, probably even better, through the DLL interface.
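
For the JSObject route, a rough VBA sketch of the idea could look like this. It is untested against any particular Acrobat version and rests on assumptions: MSXML 6.0 is available, JSObject.metadata can be read and assigned as a string from VBA, and both keyword nodes already exist in the XMP as elements (it does not create missing nodes or handle pdf:Keywords stored as an attribute). ReplaceXmpKeywords, the path and the keyword values are placeholders.

```vba
' Sketch: overwrite the keywords in both XMP locations, then save.
' Assumes <pdf:Keywords> and <dc:subject>/<rdf:Bag> already exist as elements.
Sub ReplaceXmpKeywords()
    Const PdfPath As String = "C:\test.pdf"   ' hypothetical test file
    Const RdfNs As String = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    Dim keywords As Variant
    Dim pdDoc As Object, jso As Object, xml As Object
    Dim bag As Object, kw As Object, li As Object, i As Long

    keywords = Array("alpha", "beta")         ' placeholder keywords

    Set pdDoc = CreateObject("AcroExch.PDDoc")
    If Not pdDoc.Open(PdfPath) Then Exit Sub
    Set jso = pdDoc.GetJSObject

    Set xml = CreateObject("MSXML2.DOMDocument.6.0")
    xml.async = False
    xml.preserveWhiteSpace = True             ' keep the packet padding
    xml.setProperty "SelectionNamespaces", _
        "xmlns:pdf='http://ns.adobe.com/pdf/1.3/' " & _
        "xmlns:dc='http://purl.org/dc/elements/1.1/' " & _
        "xmlns:rdf='" & RdfNs & "'"
    If Not xml.LoadXML(jso.metadata) Then GoTo CleanUp

    ' Dublin Core side: empty the Bag, then add one rdf:li per keyword
    Set bag = xml.SelectSingleNode("//dc:subject/rdf:Bag")
    If Not bag Is Nothing Then
        Do While Not bag.FirstChild Is Nothing
            bag.RemoveChild bag.FirstChild
        Loop
        For i = LBound(keywords) To UBound(keywords)
            Set li = xml.createNode(1, "rdf:li", RdfNs)   ' 1 = element node
            li.Text = keywords(i)
            bag.appendChild li
        Next i
    End If

    ' pdf namespace side: one semicolon-separated string
    Set kw = xml.SelectSingleNode("//pdf:Keywords")
    If Not kw Is Nothing Then kw.Text = Join(keywords, "; ")

    ' Hand the edited packet back to Acrobat and save in full (PDSaveFull = 1)
    jso.metadata = xml.XML
    pdDoc.Save 1, PdfPath

CleanUp:
    pdDoc.Close
End Sub
```

Writing both nodes in one pass keeps the two lists identical, so the properties dialog and PDF-ShellTools should no longer show a merged list. Try it on a copy of the file first.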