Topic: Acrobat 8 Mail Metatags


Anonymous

  • Guest
Acrobat 8 Mail Metatags
« on: September 09, 2007, 06:48:00 AM »
I've archived my old emails using Acrobat 8. When doing so, Acrobat 8 created the following metafile tags for the files:
 pdfx:MailFrom:
 pdfx:MailSubject:
 pdfx:MailDate:
 pdfx:MailFolder:
 pdfx:MailTo:
 pdfx:MailCc:

I'm interested in using PDF Explorer to catalog and export the tags for each of these files using these metatags. Is it possible to do this? I tried with the trial version and had no luck. These fields were not defined and I couldn't figure out how to do this. (It seems that your option to custom define fields is for new tags but doesn't cover these mail tags).

If your software could do this, it would be great and save me a lot of time.

Would appreciate your feedback on this as soon as possible (as I will manually catalog this information if there is no automated way to do so--which would be a pity as there must be a better solution given the information is already defined in the pre-existing metatags...).


Thank you,
Daniel

RTT

  • Administrator
  • *****
  • Posts: 765
« Reply #1 on: September 09, 2007, 09:53:00 AM »
If these metatags are also present in the InfoDictionary object, you just need to assign each field name to a custom field, using the custom fields settings editor.

For example:
Custom1 : MailFrom
Custom2 : MailSubject
Custom3 : MailDate
...

Next, you need to force PDFE to rescan the files so it can use the new settings while gathering the metadata. If the files are already indexed from a previous scan, you first need to delete the folder containing these files from the PDFE database using the Database>Edit tool.

Then scan the files using the DiskTree scan mode.

To show these custom fields in the Grid, you also need to create a new grid layout (Edit>GridLayout>CreateNew) that includes the custom fields you used.

If you still have no luck, send me one of these files, attached to an email message, so I can tell for sure whether it's possible or not.
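
The mapping described above can be pictured as a plain lookup. The following is only a minimal Python sketch, not PDFE's actual implementation: `info` stands in for a PDF's Info dictionary already read into a dict, and `FIELD_MAP` and `map_custom_fields` are hypothetical names introduced for illustration.

```python
# Hypothetical illustration; PDFE's internals are not described in this thread.
# Custom field slot -> Info dictionary key name (as in the example above).
FIELD_MAP = {
    "Custom1": "MailFrom",
    "Custom2": "MailSubject",
    "Custom3": "MailDate",
}


def map_custom_fields(info, field_map=FIELD_MAP):
    """Return {custom slot: value} for each mapped Info dictionary key.

    Keys that are absent from `info` yield an empty string.
    """
    return {slot: info.get(key, "") for slot, key in field_map.items()}


# Made-up sample values, standing in for one archived email.
sample_info = {
    "MailFrom": "alice@example.com",
    "MailSubject": "Quarterly report",
    "MailDate": "2007-09-09",
}
print(map_custom_fields(sample_info))
```

A key that is missing from the dictionary simply produces an empty custom field, which is why the files need a rescan after the mapping changes.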

Anonymous

  • Guest
Worked!
« Reply #2 on: September 09, 2007, 02:44:00 PM »
Followed your instructions and, voilà, it worked!

(I had to be careful to match the case exactly. I made a typo the first time through with "MailCC" instead of "MailCc", but after fixing that typo, no problems.)

As Acrobat is being used for mail archiving, you may want to more blatantly advertise/support this feature. In support of this, I'd be happy to archive an email using Acrobat 8 and email it to you...

Will purchase today. Thank you!

Anonymous

  • Guest
One more question / request re Bates numbers
« Reply #3 on: September 09, 2007, 03:46:00 PM »
Is there a way to automatically assign sequential numbers to all the documents using your software or alternatively to track the Bates numbers that can be assigned by Acrobat 8?

RTT

  • Administrator
  • *****
  • Posts: 765
Re: Worked!
« Reply #4 on: September 09, 2007, 04:36:00 PM »
Quote from: "daniel"
Followed your instructions and, voilà, it worked!

(I had to be careful to match the case exactly. I made a typo the first time through with "MailCC" instead of "MailCc", but after fixing that typo, no problems.)

Yes, these meta-tag names are unique, and capitalization counts.

Quote from: "daniel"
Will purchase today. Thank you!

Thanks, I have already received the purchase order, and the license email has already been sent to you.
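
Since capitalization counts, a typo such as "MailCC" silently matches nothing. As a rough illustration (a Python sketch with a hypothetical helper name, not a PDFE feature), configured names can be compared against the tag names actually present to flag entries that differ only in case:

```python
def find_case_mismatches(configured, available):
    """Return {configured name: actual name} where only the case differs."""
    by_lower = {name.lower(): name for name in available}
    mismatches = {}
    for name in configured:
        actual = by_lower.get(name.lower())
        if actual is not None and actual != name:
            mismatches[name] = actual
    return mismatches


# Tag names as listed in the first post of this thread.
available = ["MailFrom", "MailSubject", "MailDate",
             "MailFolder", "MailTo", "MailCc"]
print(find_case_mismatches(["MailFrom", "MailCC"], available))
# prints {'MailCC': 'MailCc'}
```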

RTT

  • Administrator
  • *****
  • Posts: 765
Re: One more question / request re Bates numbers
« Reply #5 on: September 09, 2007, 04:45:00 PM »
Quote from: "daniel"
Is there a way to automatically assign sequential numbers to all the documents using your software or alternatively to track the Bates numbers that can be assigned by Acrobat 8?

No and no, but I really need to develop, at least, the first option.

For the second option, if these numbers follow a distinct pattern, you can probably extract them to some of the custom fields using the Search&Extract batch tool, but for that you need some knowledge of regular expressions.

Anonymous

  • Guest
Re: One more question / request re Bates numbers
« Reply #6 on: September 09, 2007, 05:27:00 PM »
Quote from: "RTT"
No and no, but I really need to develop, at least, the first option.

For the second option, if these numbers follow a distinct pattern, you can probably extract them to some of the custom fields using the Search&Extract batch tool, but for that you need some knowledge of regular expressions.

Is this difficult to do? Numbers are inserted by Acrobat into, e.g., the footnotes (and are probably the ONLY thing in the footnotes) and I can put any string before and/or after the numbers to help identify.

If there is a simple set of steps to implement, I wouldn't mind trying it out. E.g., okay, I (1) assign Bates numbers in Acrobat in the form "DOC000001" (with 000001 sequential numbers that change on each page of the documents). I then create a custom field called StartDocNumber and another called EndDocNumber. Can I use the Search&Extract batch tool to go through all the documents and assign StartDocNumber (from the footnote of the first page of the document) and EndDocNumber (from the footnote of the last page of the document)? Or, if that's too complicated, any variation thereof would be great (e.g., I could just assign numbers to the first page of each document and try to just get the document number from the footer of the first page...).
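
To make the idea concrete, here is a minimal Python sketch of the extraction being asked about. It assumes the text of each page is already available as a string and that the numbers have the "DOC" plus six digits form described above; `BATES_RE` and `bates_range` are hypothetical names, and whether the Search&Extract tool accepts the same regular expression syntax as Python is not stated in this thread.

```python
import re

# Assumed pattern: the literal prefix "DOC" followed by exactly six digits.
BATES_RE = re.compile(r"\bDOC\d{6}\b")


def bates_range(page_texts):
    """Return (StartDocNumber, EndDocNumber) for one document.

    `page_texts` is a list with one text string per page. The start is
    the first match on the first page, the end is the first match on the
    last page; either is None when no match is found.
    """
    if not page_texts:
        return (None, None)

    def first_match(text):
        match = BATES_RE.search(text)
        return match.group(0) if match else None

    return (first_match(page_texts[0]), first_match(page_texts[-1]))


# Made-up page texts for a three-page document.
pages = ["Dear Bob ...\nDOC000001", "more text\nDOC000002", "Regards\nDOC000003"]
print(bates_range(pages))
```

A single-page document yields the same value for both fields, since the first and last page are the same.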

Also, is there any way to break out the number of pages in a document? I noticed that Acrobat also assigns the "MailAttachment" tag to mail attachments...

Thank you!

RTT

  • Administrator
  • *****
  • Posts: 765
« Reply #7 on: September 09, 2007, 06:13:00 PM »
First check if these numbers appear in the text-only view in the PDFView. If yes, send me a sample of these PDFs and I will try to create the regular expression to extract the Bates numbers to a custom field.

Another way is to export the PDFE grid (File>ExportGridFields), containing the files to which you want to add the Bates numbers, to a csv file (first create a grid layout with the custom fields where you are going to enter the numbers). Next, edit the csv file in a spreadsheet editor, MS Excel for example. There you have to construct a function that automatically fills these empty custom fields with the same numbers, for the same files, that you already added to the pages with Acrobat. Once the csv file is edited, you can import it back into the PDFE database (File>Import...).
Even so, the first solution, if possible, is the better one.
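
The spreadsheet step could also be scripted. The following Python sketch is only an illustration under several assumptions: the exported csv has a header row, it contains a page count column (called `Pages` here, a hypothetical name), the custom fields are exported as `StartDocNumber` and `EndDocNumber`, and the rows are in the same order in which Acrobat numbered the files, one number per page.

```python
import csv
import io


def assign_bates(csv_text, prefix="DOC", width=6, first=1):
    """Fill StartDocNumber/EndDocNumber for each row, one number per page."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    counter = first
    for row in reader:
        pages = int(row["Pages"])
        row["StartDocNumber"] = f"{prefix}{counter:0{width}d}"
        row["EndDocNumber"] = f"{prefix}{counter + pages - 1:0{width}d}"
        counter += pages  # the next file continues where this one ended
        writer.writerow(row)
    return out.getvalue()


# Made-up export with two files of 3 and 2 pages.
exported = (
    "Filename,Filepath,Pages,StartDocNumber,EndDocNumber\n"
    "a.pdf,C:/mail/,3,,\n"
    "b.pdf,C:/mail/,2,,\n"
)
print(assign_bates(exported))
```

If the row order does not match the order Acrobat used, the numbers in the database will not agree with those stamped on the pages, so the export should be sorted accordingly first.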

Quote
Also, is there any way to break out the number of pages in a document? I noticed that Acrobat also assigns the "MailAttachment" tag to mail attachments...

Are you referring to the number of pages of the pdf document? If yes, that number is already available, you just need to create a custom grid layout to include that column and this way show it in the grid.

Anonymous

  • Guest
« Reply #8 on: September 09, 2007, 07:25:00 PM »
Quote from: "RTT"
First check if these numbers appear in the text-only view in the PDFView. If yes, send me a sample of these PDFs and I will try to create the regular expression to extract the Bates numbers to a custom field.

Another way is to export the PDFE grid (File>ExportGridFields), containing the files to which you want to add the Bates numbers, to a csv file (first create a grid layout with the custom fields where you are going to enter the numbers). Next, edit the csv file in a spreadsheet editor, MS Excel for example. There you have to construct a function that automatically fills these empty custom fields with the same numbers, for the same files, that you already added to the pages with Acrobat. Once the csv file is edited, you can import it back into the PDFE database (File>Import...).
Even so, the first solution, if possible, is the better one.

Unfortunately, the Acrobat assigned Bates numbers do not appear in your text search since they can only be inserted into headers and footers of the document.

I have sent you a sample document by email for you to take a look.

Otherwise, I will take your suggestion and assign numbers manually in Excel and then import them back into your program. This seems to be an inferior solution, though, since I assume that the assigned numbers would not be embedded in the original files and would only be in your database (and it would probably be painful to add further documents, etc.).

Will eagerly await your suggestions regarding tracking Bates numbers via your software--and hope you can come up with a solution.

RTT

  • Administrator
  • *****
  • Posts: 765
« Reply #9 on: September 09, 2007, 08:33:00 PM »
I've checked your file and, unfortunately, Acrobat uses a form element to insert the numbers, and PDFE doesn't extract text from these elements :( So the Search&Extract batch tool is ruled out; alternatively, you can find another program to add these Bates numbers. There are stamp/watermark programs that add Bates numbers too. Google for them and try some; if you come up with a solution that shows these numbers in the PDFE text extractor, Search&Extract can work.


Quote
This seems to be an inferior solution, though, since I assume that the assigned numbers would not be embedded in the original files and would only be in your database

While importing you have an option to set PDFE to also edit the imported metadata in the PDF file itself.

Last edited by RTT on Mon Sep 10, 2007 1:54 am; edited 1 time in total

Anonymous

  • Guest
thank you
« Reply #10 on: September 10, 2007, 01:26:00 AM »
Okay. Thank you.

RTT

  • Administrator
  • *****
  • Posts: 765
« Reply #11 on: September 10, 2007, 01:56:00 AM »
Even so, I'm going to try to add text extraction from these form objects to the next PDFE version ;)

Anonymous

  • Guest
import wizard
« Reply #12 on: September 10, 2007, 05:17:00 AM »
I'm having trouble importing the list of numbers to assign numbers to the documents. I get the following error message when I try to import the csv file:
"Error: Zero Items to Import"

RTT

  • Administrator
  • *****
  • Posts: 765
« Reply #13 on: September 10, 2007, 10:09:00 AM »
On the second page of the import wizard you have to match each imported column to the respective internal column. Check, column by column, that the "ImporteSelectedColumnAs" combobox item is set to the correct field.

Note that the earlier export must include, at least, the Filename and Filepath fields; otherwise, on the next import, PDFE will not be able to construct a full file path. I also recommend exporting the DiskLabel and DiskSerial fields to speed up import operations.

Even better is to export all the fields, except the unused custom ones, and to use the same grid layout for importing that was used for exporting. This way, because the csv contains a header row with the same column names (which you confirm on the first import wizard page), the column correspondence will be detected automatically.
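
Before running the import wizard, the csv header can be checked for the columns mentioned above. This is a small Python sketch, not part of PDFE; `check_header` is a hypothetical helper, and it assumes the header row spells the column names exactly as they are written in this post.

```python
import csv
import io

# Column names as written in the post above; exact spelling in a real
# PDFE export is an assumption.
REQUIRED = ("Filename", "Filepath")
RECOMMENDED = ("DiskLabel", "DiskSerial")


def check_header(csv_text):
    """Return (missing required, missing recommended) column names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input gives an empty header
    names = {column.strip() for column in header}
    missing_required = [c for c in REQUIRED if c not in names]
    missing_recommended = [c for c in RECOMMENDED if c not in names]
    return missing_required, missing_recommended


# Made-up export that lacks the Filepath column.
sample = "Filename,StartDocNumber\na.pdf,DOC000001\n"
print(check_header(sample))
```

An empty first list means the file has the columns needed to build a full path; entries in the second list only affect import speed, according to the advice above.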