PDF-ShellTools > Ideas/Suggestions

Script to count how many colour pages in PDF?


nightslayer23:
Had an issue displaying the ranges. I had to set the data type to Integer, 64-bit, and then it worked.

RTT:
Yes, I forgot to mention this. I've edited my post to add a screenshot showing the required data type configuration. I have it set to Integer, 16-bit, but any integer data type can hold a value in the 0..100 range.

nightslayer23:
:) all good

nightslayer23:

--- Quote from: RTT on May 19, 2017, 04:49:53 PM ---
--- Quote from: nightslayer23 on May 18, 2017, 03:09:10 AM ---whatever ISN'T white space..

--- End quote ---
It's not easy to find a set of ImageMagick commands that calculates this properly in all situations, not least because I'm not an expert on this subject. But thresholding the image to turn all non-white pixels black, and then calculating the percentage of black pixels on each page, seems to give good results.

--- Code: ---var imo = new ActiveXObject("ImageMagickObject.MagickImage.1");
var fso = new ActiveXObject("Scripting.FileSystemObject");

var tmpfolder = fso.GetSpecialFolder(2 /*TemporaryFolder*/ );
var InfoFilename = tmpfolder + '\\PagesInfo.txt';

var ProgressBar = pdfe.ProgressBar;
ProgressBar.max = pdfe.SelectedFiles.Count;

for (var i = 0; i < pdfe.SelectedFiles.Count; i++) {
    ProgressBar.position = i + 1;
    var file = pdfe.SelectedFiles(i);
    var FileMetadata = file.Metadata;

    //Bypass already processed files.
    if (FileMetadata.InkCoverage) {
        pdfe.echo(file.filename + ': Ink coverage = ' + FileMetadata.InkCoverage);
        pdfe.echo(' [Already set]', 0xFF, 1);
        continue;
    }

    pdfe.echo('Processing ' + file.filename + ' (' + file.NumPages + ' pages)');
    try {
        //use imagemagick to render each pdf page, convert all non-white colors to black
        //and calculate the average of black pixels, that correspond to the percentage of non-white area.

        //imo.convert(file.filename, "-fuzz","1%","-fill","white","opaque","white","-fill","black","+opaque","white","-format", "%[fx:100-mean*100]\n", "info:" + InfoFilename);               
        imo.convert(file.filename, "-colorspace", "gray", "-auto-level", "-threshold", "99%", "-format", "%[fx:100-mean*100]\n", "info:" + InfoFilename);

        //read the result info file, that contains a line of ink coverage percentage value for each page.
        var f = fso.GetFile(InfoFilename);
        var fts = f.OpenAsTextStream();
        var PagesInkCoverage = fts.ReadAll().split('\n');
        fts.Close();
        f.Delete();

        //calculate the document total ink coverage by averaging the by page values.
        var InkCoverage = 0;
        for (var index = 0, len = PagesInkCoverage.length - 1; index < len; index++) {
            InkCoverage += Number(PagesInkCoverage[index]);
        }
        InkCoverage = Math.round((InkCoverage / (len ? len : 1)));

        pdfe.echo(file.filename + ': Ink coverage=' + InkCoverage + '%', 0, 2);

        if (FileMetadata.InkCoverage !== InkCoverage.toString()) {
            FileMetadata.InkCoverage = InkCoverage;
            if (FileMetadata.CommitChanges()) {
                pdfe.echo(' [OK]', 0x006400, 1);
            } else {
                pdfe.echo(' [Setting metadata failed]', 0xFF0000, 1);
            }
        } else {
            pdfe.echo(' [Already set]', 0xFF, 1);
        }

    } catch (e) {
        pdfe.echo(file.filename + ' : ', 0, 2);
        pdfe.echo(e.name + ' ( ' + e.message + ' )', 0xff0000, 1);
    }
}

pdfe.echo('Done');

--- End code ---
This script expects a custom property named InkCoverage. To show this ink coverage percentage in the Shell as ranges named "Low", "Medium" or "High", the custom property needs to be configured as depicted in the attached screenshots.
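
The per-page averaging done in the script above can be pulled out into a small helper, which makes it easier to check on its own. This is only a sketch: the name averageInkCoverage is made up for illustration, and it assumes the info file holds one numeric percentage per line, as the "-format" string in the script produces. Unlike the loop in the script, it skips blank lines instead of assuming a trailing newline, so a single-page result without a final line break is still counted.

```javascript
// Hypothetical helper (not part of the original script): averages the
// per-page ink coverage values read from the ImageMagick info file.
function averageInkCoverage(infoText) {
    var lines = infoText.split('\n');
    var sum = 0;
    var count = 0;
    for (var i = 0; i < lines.length; i++) {
        // Trim with a regex so this also runs in older JScript engines.
        var line = lines[i].replace(/^\s+|\s+$/g, '');
        if (line === '') {
            continue; // skip blank lines, including the trailing one
        }
        var value = Number(line);
        if (isNaN(value)) {
            continue; // ignore anything that is not a number
        }
        sum += value;
        count++;
    }
    // Return a rounded integer percentage, or 0 when no page value was found.
    return count ? Math.round(sum / count) : 0;
}
```

In the script, this would take the result of fts.ReadAll() in place of the split-and-sum loop.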

If it's not giving the expected results, it's better to ask in an ImageMagick forum how to calculate this, and then we can update the script with a better set of image processing/analysis commands.
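
To make it clearer what the threshold and "100-mean*100" steps compute, here is a rough plain-JavaScript illustration on a list of grayscale pixel values. It is not how ImageMagick works internally, just the same arithmetic under some assumptions: pixels are 8-bit gray values (0 = black, 255 = white), the "-auto-level" step is left out, and the function name inkCoveragePercent is invented for this example.

```javascript
// Illustration only: mimics thresholding at a given fraction of white and
// then reporting the percentage of pixels that ended up black.
// grayPixels: array of 8-bit gray values (assumed 0 = black, 255 = white).
// thresholdFraction: e.g. 0.99 for a "99%" threshold.
function inkCoveragePercent(grayPixels, thresholdFraction) {
    var cutoff = thresholdFraction * 255;
    var black = 0;
    for (var i = 0; i < grayPixels.length; i++) {
        // Values above the cutoff count as white, everything else as black.
        if (grayPixels[i] <= cutoff) {
            black++;
        }
    }
    // Percentage of black pixels; 0 for an empty input.
    return grayPixels.length ? (black / grayPixels.length) * 100 : 0;
}
```

With a 99% threshold, only pixels that are almost pure white stay white, so light tints and faint scan noise count as ink too. That may be one reason some files report a higher percentage than expected.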

--- End quote ---


Is there a way for the file to be flattened first before doing this conversion? Some work perfectly, but others come out at a really high percentage when they aren't technically going to print that way. I figured it was looking at other layers or some other hidden info and converting that to bw too.

I did a test: saving one to JPG, converting it back to PDF, and then running the tool again gave me an accurate result. However, the process of converting to JPG and back to PDF was quite slow. I need to comb over hundreds of files at once with this tool to get a fast result. So would it be possible in code to first flatten layers before running the check?

I actually batch flattened layers in Acrobat and it didn't solve the issue. I had to batch flatten AND convert everything to CMYK to get it to work.

In the optimizer tool, can CMYK and RGB colour spaces be added somehow? Because Acrobat is just way too slow at doing these steps.. your tool is much faster!

RTT:

--- Quote from: nightslayer23 on July 19, 2017, 12:29:43 AM ---Is there a way for the file to be flattened first before doing this conversion? Some work perfectly, but others come out at a really high percentage when they aren't technically going to print that way. I figured it was looking at other layers or some other hidden info and converting that to bw too.

--- End quote ---
Are these troublesome PDF layers set to be visible in the PDF reader (screen mode) and hidden when printed? If that's the case, edit the delegates.xml file in the folder where ImageMagick is installed, and change the line "<delegate decode="ps:alpha" stealth="True" command="&quot;@PSDelegate@&quot; -q -dQUIET -..." to include the -dPrinted parameter.
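
If the change has to be made on several machines, the edit to the command string could be scripted. The following is a minimal sketch, assuming the delegate command contains "-dQUIET" as in the line quoted above. The function name addPrintedFlag is hypothetical, and placing -dPrinted right after -dQUIET is just a choice made for this example. Reading and writing delegates.xml itself is left out.

```javascript
// Hypothetical helper: returns the delegate command with -dPrinted added.
function addPrintedFlag(command) {
    // Leave the command alone if the parameter is already there.
    if (/(^|\s)-dPrinted(\s|=|$)/.test(command)) {
        return command;
    }
    // Insert after the first -dQUIET; if -dQUIET is not found,
    // replace() returns the command unchanged.
    return command.replace('-dQUIET', '-dQUIET -dPrinted');
}
```

Running it twice on the same string gives the same result as running it once, so it is safe to re-apply.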


--- Quote ---So would it be possible in code to first flatten layers before running the check?

--- End quote ---
When ImageMagick calls Ghostscript to convert each PDF page to an image, which it then uses to run the color check, it is effectively flattening the PDF. If the issue is not the one mentioned above (layers set to be hidden only when the PDF is printed) and hidden layers are being rendered too, then that's an issue with Ghostscript.
If you have Acrobat, I suppose the script could automate it to flatten the PDF layers to a temporary PDF file and then run the check on that file.


--- Quote ---I actually batch flattened layers in acrobat and it didn't solve the issue.. I had to batch flatten AND convert everything to CMYK to get it to work.

--- End quote ---
Can't opine without a sample file.


--- Quote ---In the optimizer tool, can CMYK and RGB colour spaces be added somehow?

--- End quote ---
I don't understand your question. Please explain it in more detail.
