
need help with regex plz

Posted: Tue May 12, 2015 4:12 am
by nikolas1612
Trying to solve the following problem with a regex in TextPipe:

Mark (in any way) only the FIRST occurrence of each unique abbreviation in the text (for instance, one consisting of 3 capital letters: \b[A-Z]{3}\b). Later occurrences of an already-marked abbreviation should be ignored. I have no idea how to do that.
Thanks for any help on the subject.

Re: need help with regex plz

Posted: Wed May 13, 2015 5:03 pm
by DataMystic Support
Use
\b[A-Z]{3}\b
as your search text, and
$0
as the replace text.

Then add a script filter as a subfilter. Inside the script filter, record each arriving fragment in an array. If it is already there, don't mark it. If it is not there yet, mark it.
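
As a rough illustration of that logic only (plain Python, not TextPipe script code; the function name and the [ ] markers are made up for the example), a sketch might look like this:

```python
import re

# Same pattern as the search text above: exactly three capital letters.
ABBREV = re.compile(r"\b[A-Z]{3}\b")


def mark_first_occurrences(text, left="[", right="]"):
    """Wrap the first occurrence of each abbreviation in markers.

    Hypothetical helper for illustration; the marker style is arbitrary.
    """
    seen = set()  # abbreviations that have already been marked

    def replace(match):
        abbrev = match.group(0)
        if abbrev in seen:
            return abbrev          # already there: leave unmarked
        seen.add(abbrev)           # not there yet: remember it and mark it
        return f"{left}{abbrev}{right}"

    return ABBREV.sub(replace, text)


if __name__ == "__main__":
    print(mark_first_occurrences("NSA and FBI met; NSA left."))
```

The callback plays the role of the script subfilter: it receives each matched fragment in turn and decides whether to mark it based on what it has seen so far.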

We can provide consulting help here if needed.

Re: need help with regex plz

Posted: Wed May 20, 2015 4:46 pm
by nikolas1612
"Note: startFile() is ALSO called when the Script is a sub filter, for each text value that the sub filter operates on. The impact of this is that real per-file initialization/finalization needs to be performed in a script that is not inside any sub filters".

Could you explain what this means? Does it affect my task?

Re: need help with regex plz

Posted: Thu May 21, 2015 8:43 am
by DataMystic Support
Easy - just add a second script filter outside of any subfilter, that is there purely for the startFile and endFile functions.

The processLine function should just pass any text through unchanged.
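
To show why this matters, here is a small simulation in plain Python (not TextPipe script code). The class and method names are hypothetical stand-ins for the two script filters, and it assumes both scripts can see the same list of seen abbreviations; how that sharing is done in TextPipe itself is not covered here.

```python
import re

ABBREV = re.compile(r"\b[A-Z]{3}\b")

# Assumption: state that both script filters can reach.
seen = set()


class OuterScript:
    """Stand-in for the script filter placed outside any subfilter."""

    def start_file(self):
        seen.clear()               # real per-file initialization happens here

    def process_line(self, line):
        return line                # pass text through unchanged

    def end_file(self):
        pass                       # per-file finalization would go here


class InnerScript:
    """Stand-in for the script subfilter that receives each match."""

    def start_file(self):
        pass                       # runs for every fragment, so do NOT reset here

    def process_line(self, fragment):
        if fragment in seen:
            return fragment        # later occurrence: leave unmarked
        seen.add(fragment)
        return f"[{fragment}]"     # first occurrence: mark it


def run_file(text):
    """Simulate processing one file through both filters."""
    outer, inner = OuterScript(), InnerScript()
    outer.start_file()

    def on_match(match):
        inner.start_file()         # mimics startFile() firing per text value
        return inner.process_line(match.group(0))

    lines = [ABBREV.sub(on_match, outer.process_line(line))
             for line in text.splitlines()]
    outer.end_file()
    return "\n".join(lines)
```

If the reset were moved into the inner script's start_file, the list would be emptied before every fragment and every abbreviation would end up marked, which is the concern the quoted note describes.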

Does that make sense?