Recent Posts

Pages: 1 [2] 3 4 ... 10
11
General / Re: Changing OCR'd text PDF View Tab in Text Mode
« Last post by puckman on July 31, 2019, 05:54:51 AM »
Try with MS Word. It can convert the PDF to an editable document, which you can then save again as PDF.
Thanks for your prompt reply.  Full disclosure: creating PDFs is a complex topic.  I'm replying here from an i
I thought about a similar solution.  Before your suggestion I had tried it with different software and then, after reading your suggestion, with MS Office 365 Plus.  Both produced the same results.
Here's my observation:
In both cases, when opening the re-saved PDF, the only part of the PDF DOM remaining was the text.  The original scanned document was obscured or altogether missing.

This is a departure from the Adobe® document model.  I know this is beyond the control of the PDFE software.  The underlying text of the OCR'd image can be manipulated with JavaScript and then saved with the original document.  The extent and fidelity of manipulating the text relies on the script's sophistication and eloquence.

I believe that, when viewed, the PDF should retain both the original scanned document and the underlying (hidden) text, and should also preserve the perceptual accuracy of the text as much as possible.
12
General / Re: Changing OCR'd text PDF View Tab in Text Mode
« Last post by RTT on July 29, 2019, 09:09:52 PM »
After comparing the resulting text from TIFFs OCR'd by Acrobat and TOPOCR, I wanted to see if I could change the underlying text in the PDF by using the PDF View Tab in PDFE.  I switched to text mode and manually altered a few words.  Unfortunately, I haven't been able to make the edits persistent.
I have concluded that perhaps this task is not possible with any software.
Am I correct?
Yes, the text mode edit functionality exists mainly to make it easy to edit text that will be copied, or to add minor changes (punctuation, etc.) in order to get better text-to-speech results. There is no easy way to connect changes made in a text-only view, which doesn't have accurate position and font style information, to the already formatted PDF.
Try with MS Word. It can convert the PDF to an editable document, which you can then save again as PDF.
13
General / Changing OCR'd text PDF View Tab in Text Mode
« Last post by puckman on July 29, 2019, 08:36:45 AM »
Hi everyone,
My first post here.
After comparing OCR software offerings from Acrobat, ABBYY, Epson and OmniPage for use with receipts, I am still looking for a solution. Neat does a good job, but I prefer neither to place my financial information in the cloud nor to subscribe to software.
My research on this project has revealed that receipts present extraordinary OCR challenges because of the small print, poor quality of paper and ink and a myriad of formats.  Regardless, I am still looking for a solution.
I found promising results from TOPOCR.  However, the trial demo disables saving results as well as copying and pasting.  Its main advantage is the price: compared to the cost of similar software, it's affordable.  It lacks some standard features like batch processing.
After comparing the resulting text from TIFFs OCR'd by Acrobat and TOPOCR, I wanted to see if I could change the underlying text in the PDF by using the PDF View Tab in PDFE.  I switched to text mode and manually altered a few words.  Unfortunately, I haven't been able to make the edits persistent.
I have concluded that perhaps this task is not possible with any software.
Am I correct?
14
General / Re: Displaying PDF Page Size in Windows Explorer
« Last post by RTT on June 05, 2019, 01:13:03 AM »
Check if the attached script, a modification of the first script to also fill a PageOrientation named property, does the job.
Don't forget to configure a metadata property named "PageOrientation", before testing.
The script code, for easy reference.
Code: [Select]
var ProgressBar = pdfe.ProgressBar;
ProgressBar.max = pdfe.SelectedFiles.Count;

for (var i = 0; i < pdfe.SelectedFiles.Count; i++) {
    ProgressBar.position = i + 1;
    var file = pdfe.SelectedFiles(i);
    var Page = file.Pages(0);
    if (Page) {
        var w = Math.min(Page.Width, Page.Height);
        var h = Math.max(Page.Width, Page.Height);
        var PSizeStr = w.toFixed() + 'x' + h.toFixed();
        var FileMetadata = file.Metadata;
        var Changed = false;
        if (FileMetadata.PageSize !== PSizeStr) {
            FileMetadata.PageSize = PSizeStr;
            Changed = true;
        }
        var PageOrientationStr = (Page.Height > Page.Width) ? 'Portrait' : 'Landscape';
        if (FileMetadata.PageOrientation !== PageOrientationStr) {
            FileMetadata.PageOrientation = PageOrientationStr;
            Changed = true;
        }
        if (Changed) {
            if (FileMetadata.CommitChanges()) {
                pdfe.echo(file.Filename + ' : (' + PSizeStr + ' - ' + PageOrientationStr + ') [OK]');

            } else {
                pdfe.echo(file.Filename + ' [commit changes failed]', 0xFF0000);
            }
        } else {
            pdfe.echo(file.Filename + ' : (' + PSizeStr + ' - ' + PageOrientationStr + ') [properties already set]');
        }
    }
}
pdfe.echo("Done");
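For anyone who wants to check the size and orientation logic outside PDFE, here is a minimal sketch in plain JavaScript. The `describePage` helper and its plain width/height arguments are hypothetical and not part of the PDFE scripting API; the numbers are only illustrative. It just mirrors the comparisons used in the script above.

```javascript
// Hypothetical helper (not part of the PDFE API): mirrors the script's logic
// using plain numbers instead of a PDFE Page object.
function describePage(width, height) {
    var w = Math.min(width, height);
    var h = Math.max(width, height);
    return {
        // Same format as the PageSize property: shorter side first.
        pageSize: w.toFixed() + 'x' + h.toFixed(),
        // As in the script, a square page is reported as Landscape.
        pageOrientation: (height > width) ? 'Portrait' : 'Landscape'
    };
}

// Illustrative values only.
var tall = describePage(595, 842);
var wide = describePage(960, 540);
```

Because the size string always puts the shorter side first, `tall` and a page of 842 by 595 would share the same PageSize value; the separate orientation property is what tells them apart.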
15
General / Re: Displaying PDF Page Size in Windows Explorer
« Last post by nightslayer23 on June 04, 2019, 05:04:03 AM »
Hi again, could I request an add-on to this script? Keep everything as it is, but add another metadata entry for a property called "orientation": if the height of the PDF page is greater than its width, set 'orientation' to "Portrait"; likewise, if the width is greater than the height, set 'orientation' to "Landscape".
16
General / Re: rename page size file
« Last post by Urraco on May 28, 2019, 12:51:51 PM »
It works!
Thanks for the help, I appreciate it.
Great software, I recommend it.

17
General / Re: rename page size file
« Last post by RTT on May 25, 2019, 01:21:26 AM »
I updated the script but still get the same error: "Unable to rename..."
It's working here, with version 3.3 and Windows 10.
Did you delete the old one from the list of scripts before importing the updated script? If not, it's still using the old one, and the imported script ended up named PageSize1.

Quote
About the format: if you can, I want to get the trimbox size of the PDF document and, if it does not contain a trim size, return the cropbox or mediabox size.

Check if this one works:
Code: [Select]
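// Returns the size, in mm, of the first page's TrimBox, falling back to
// the CropBox and then the MediaBox when a box is not defined.
function PageTrimSize() {
    var Size = '';
    var Page = BatchFile.Pages(0);
    if (Page) {
        var box = Page.TrimBox ? Page.TrimBox : Page.CropBox ? Page.CropBox : Page.MediaBox;
        // Swap width and height for pages rotated by 90 or 270 degrees.
        if (Page.Rotation == 90 || Page.Rotation == 270) {
            Size = GetDist_mm(box.top, box.bottom).toFixed() + 'x' + GetDist_mm(box.left, box.right).toFixed() + ' mm';
        } else {
            Size = GetDist_mm(box.left, box.right).toFixed() + 'x' + GetDist_mm(box.top, box.bottom).toFixed() + ' mm';
        }
    }
    // Close the file once the size has been read.
    BatchFile.close();
    return Size;
}

// Converts the distance between two coordinates from points (1/72 inch) to mm.
function GetDist_mm(x1, x2) {
    return Math.abs(x1 - x2) * 25.4 / 72;
}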

Type [F]_[PageTrimSize] in the tool rename formula field, to use this new script.
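
As a rough check of the conversion used by GetDist_mm: by default, PDF box coordinates are expressed in points, 72 per inch, and an inch is 25.4 mm. The sketch below runs outside PDFE on a hypothetical plain-object box (the A4 numbers are illustrative, not read from a real file), and `sizeString` is an assumed helper name that mirrors the rotation handling above.

```javascript
// Points to millimetres: 72 points per inch, 25.4 mm per inch.
function GetDist_mm(x1, x2) {
    return Math.abs(x1 - x2) * 25.4 / 72;
}

// Hypothetical helper: builds the size string from a plain box object
// and a rotation value, instead of a PDFE Page object.
function sizeString(box, rotation) {
    var width = GetDist_mm(box.left, box.right).toFixed();
    var height = GetDist_mm(box.top, box.bottom).toFixed();
    // Swap the two values for pages rotated by 90 or 270 degrees.
    return (rotation == 90 || rotation == 270)
        ? height + 'x' + width + ' mm'
        : width + 'x' + height + ' mm';
}

// Illustrative A4 box, in points.
var a4Box = { left: 0, bottom: 0, right: 595.276, top: 841.89 };
var upright = sizeString(a4Box, 0);
var rotated = sizeString(a4Box, 90);
```

Here `upright` comes out as '210x297 mm' and `rotated` as '297x210 mm', since toFixed() with no argument rounds to whole millimetres.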

18
General / Re: rename page size file
« Last post by Urraco on May 24, 2019, 07:38:41 AM »
I updated the script but still get the same error: "Unable to rename..."
About the format: if you can, I want to get the trimbox size of the PDF document and, if it does not contain a trim size, return the cropbox or mediabox size.

Thanks


19
General / Re: rename page size file
« Last post by RTT on May 23, 2019, 03:14:47 PM »
Apparently it works, but when I try to rename the file, I get this error: Unable to rename file - The process cannot access the file because it is being used by another process.
My bad. I've updated the above script to fix the issue.

...and can I get the "trim" page size format of the PDF?
Do you mean the size of the PDF page trimbox, falling back to the cropbox or mediabox if that box property is not defined?
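
To illustrate the fallback order being asked about, here is a small sketch in plain JavaScript. The page objects are hypothetical stand-ins with only the box properties set, and `pickBoxName` is an assumed helper name; how PDFE reports an undefined box is also an assumption here (treated as a falsy value).

```javascript
// Hypothetical helper: returns the name of the first box defined on the
// page, trying TrimBox, then CropBox, then MediaBox.
function pickBoxName(page) {
    var names = ['TrimBox', 'CropBox', 'MediaBox'];
    for (var i = 0; i < names.length; i++) {
        if (page[names[i]]) {
            return names[i];
        }
    }
    return '';
}

// Stand-in pages; an empty object just marks the box as defined.
var withTrim = { TrimBox: {}, CropBox: {}, MediaBox: {} };
var noTrim = { CropBox: {}, MediaBox: {} };
var mediaOnly = { MediaBox: {} };
```

With these stand-ins, pickBoxName returns 'TrimBox' for `withTrim`, 'CropBox' for `noTrim`, and 'MediaBox' for `mediaOnly`.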
20
General / Re: rename page size file
« Last post by Urraco on May 23, 2019, 10:32:41 AM »
Thank you for your prompt answer!
Apparently it works, but when I try to rename the file, I get this error: Unable to rename file - The process cannot access the file because it is being used by another process.
...and can I get the "trim" page size format of the PDF?

Thanks and a good day!