Topic Summary

Posted by: RTT
« on: December 18, 2016, 01:13:14 AM »

Quote
An alternative could be: FileName = FileName.substring(FileName.lastIndexOf('|') + 1);
No. It fails when the archived file is in the main archive (depth 0), i.e. when no '|' character is present.
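To make the difference concrete, here is a minimal sketch that runs both one-liners on paths at depth 0, 1, and 2. The function names and sample paths are made up for illustration; only the two expressions come from the posts in this thread.

```javascript
// Hypothetical helper names; the two expressions are the ones from this thread.
function nameBySplit(fullPath) {
  // Drop the <main-archive> tag, then keep the last '|'-separated segment.
  return fullPath.substring(fullPath.indexOf('>') + 1).split('|').slice(-1)[0];
}

function nameByLastIndexOf(fullPath) {
  // Keep whatever follows the last '|'. With no '|', lastIndexOf returns -1,
  // so substring(0) returns the whole string, archive tag included.
  return fullPath.substring(fullPath.lastIndexOf('|') + 1);
}

// Sample paths (assumed, following the pattern described in the thread).
const samples = [
  '<archive.zip>document.pdf',                            // depth 0
  '<archive.zip>archive-inside.zip|document-inside.pdf',  // depth 1
  '<archive.zip>a.zip|b.zip|deep.pdf',                    // depth 2
];

for (const p of samples) {
  console.log(nameBySplit(p), '/', nameByLastIndexOf(p));
}
```

At depth 1 and deeper both variants agree. At depth 0 the lastIndexOf variant leaves the archive name tag in front of the file name.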
Posted by: Padanges
« on: December 16, 2016, 09:26:56 AM »

Quote
The archive-within-archive scan depth, which I just finished implementing, tells the scanner how many levels of archives inside archives should be scanned.
That's sweet :)

Quote
FileName = FileName.substring(FileName.indexOf('>') + 1).split('|').slice(-1)[0];
An alternative could be: FileName = FileName.substring(FileName.lastIndexOf('|') + 1);
Posted by: RTT
« on: December 01, 2016, 12:47:47 AM »

Quote
How's that?

What about a case where a textbook is archived together with an archive of its CD content, which holds many file formats the scanner recognizes (for example, *.txt) but that serve no purpose being indexed into a DB?
The archive-within-archive scan depth, which I just finished implementing, tells the scanner how many levels of archives inside archives should be scanned. If, in the scenario you are referring to, these .txt files sit in an archive inside a main archive, then setting the scan depth can indeed exclude them from indexing and speed up the scan. If you just want to scan everything, though, the scan depth check ends up making the process slightly slower. Not by much, and the feature is indeed useful.
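As a rough illustration of the idea, here is a minimal sketch of a depth-limited scan over an in-memory model of nested archives. Everything in it is an assumption made for the example: the scan function, the { name, entries } object shape, and the sample file names are hypothetical and are not the scanner's actual code.

```javascript
// Hypothetical model: an archive is { name, entries }, where each entry is
// either a file name (string) or another archive object.
function scan(archive, maxDepth, depth = 0, prefix = `<${archive.name}>`, found = []) {
  for (const entry of archive.entries) {
    if (typeof entry === 'string') {
      // Regular file: record it using the path pattern from this thread.
      found.push(prefix + entry);
    } else if (depth < maxDepth) {
      // Nested archive within the limit: descend one level.
      scan(entry, maxDepth, depth + 1, `${prefix}${entry.name}|`, found);
    }
    // Otherwise: nested archive beyond the limit, skipped without being opened.
  }
  return found;
}

// Assumed sample: a textbook archived together with an archive of its CD content.
const book = {
  name: 'textbook.zip',
  entries: [
    'textbook.pdf',
    { name: 'cd-content.zip', entries: ['readme.txt', 'license.txt'] },
  ],
};

console.log(scan(book, 0)); // main archive only
console.log(scan(book, 1)); // main archive plus one nested level
```

With a limit of 0 the nested archive is never opened, which is where the time saving would come from. The depth comparison itself is the small extra cost paid on every nested archive when nothing is excluded.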
Posted by: Padanges
« on: November 30, 2016, 10:26:17 AM »

Quote
How's that?

What about a case where a textbook is archived together with an archive of its CD content, which holds many file formats the scanner recognizes (for example, *.txt) but that serve no purpose being indexed into a DB?
Posted by: RTT
« on: November 27, 2016, 02:03:57 AM »

Quote
I think limiting the scan depth should even speed up file scanning in cases where we have archived archives of various recognizable file types.
How's that?
Posted by: RTT
« on: November 27, 2016, 01:59:50 AM »

Quote
I used to extract the file name from the full path by checking whether it's inside an archive, with code like this:
Code:
if (fileName.indexOf('>') > 0) {  // remove archive name tag
    fileName = fileName.substring(fileName.indexOf('>') + 1);
}
After messing around I found out that it does not work properly, depending on the archive depth.
Try it this way:
FileName = FileName.substring(FileName.indexOf('>') + 1).split('|').slice(-1)[0];

Quote
Currently our file name pattern is: <archive.zip>archive-inside.zip|document-inside.pdf
Wouldn't it be simpler if we had a pattern like this: <archive.zip><archive-inside.zip>document-inside.pdf ?
No. The current format is easy to parse with a simple split operation. Whatever follows the main archive name is passed to the un-archive code as the file name to extract. That code splits it and follows the resulting array in order until it reaches the last level, which is the file the caller requested.
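The split described above could look roughly like the sketch below. The parsePath function and its return shape are hypothetical, invented for illustration; the actual un-archive code is not shown in this thread.

```javascript
// Hypothetical sketch: parsePath and its return shape are assumptions,
// not the application's actual API.
function parsePath(fullPath) {
  const close = fullPath.indexOf('>');
  if (!fullPath.startsWith('<') || close < 0) {
    // Not inside an archive: nothing to split.
    return { mainArchive: null, chain: [fullPath] };
  }
  return {
    mainArchive: fullPath.substring(1, close),
    // Every element but the last is a nested archive to open in turn;
    // the last element is the file the caller requested.
    chain: fullPath.substring(close + 1).split('|'),
  };
}

const parsed = parsePath('<archive.zip>archive-inside.zip|document-inside.pdf');
console.log(parsed.mainArchive);       // main archive name
console.log(parsed.chain);             // nested archives, then the requested file
console.log(parsed.chain.length - 1);  // nesting depth below the main archive
```

With a single separator character, one split call yields the whole chain and its length gives the depth directly. The proposed <a><b>file pattern would need a small loop or a regular expression to do the same.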
Posted by: Padanges
« on: November 26, 2016, 08:59:47 AM »

I think limiting the scan depth should even speed up file scanning in cases where we have archived archives of various recognizable file types.
Posted by: Padanges
« on: November 26, 2016, 08:46:26 AM »

I used to extract the file name from the full path by checking whether it's inside an archive, with code like this:
Code:
if (fileName.indexOf('>') > 0) {  // remove archive name tag
    fileName = fileName.substring(fileName.indexOf('>') + 1);
}
After messing around I found out that it does not work properly, depending on the archive depth.
Currently our file name pattern is: <archive.zip>archive-inside.zip|document-inside.pdf
Wouldn't it be simpler if we had a pattern like this: <archive.zip><archive-inside.zip>document-inside.pdf ?
Posted by: Padanges
« on: November 25, 2016, 07:54:15 AM »

This feature would be most welcome ;D
Posted by: RTT
« on: November 23, 2016, 12:04:56 AM »

Not possible right now, but definitely something that may be implemented. I will look into it. Thanks for the suggestion.
Posted by: Padanges
« on: November 22, 2016, 10:36:23 AM »

Hi,
Is it possible to limit the depth of archives for document scanning? For example, I have an archive within an archive, and I would like to find only the documents that are in the primary archive. Is there a way to do that?


Thanks in advance