
Extract irregular patterns

Posted: Wed Jun 27, 2007 5:28 pm
by bone
Hi,
I'm new to TextPipe and seem to have started out with a particularly tricky problem. I need to extract data for statistical analysis from a text document.

Let's say I'm looking for information AAA and BBB for each item ###.
The structure of the text is the following:

Code:

item###1
<junk>
AAA

item###2
<junk>
AAA
<junk>
BBB
<junk>

item###3
BBB
<junk>

That is, I do not know in advance which item contains AAA and/or BBB, but I need the extracted data organized by ###. The EasyPattern searches I have come up with so far fail because they keep searching for, say, BBB past the end of the current item, ignoring the fact that the BBB they find might belong to a different item.

I read your excellent white papers, but they do not seem to offer any code that addresses my problem.

I would greatly appreciate your help on this.
Bernie

Posted: Thu Jun 28, 2007 11:12 am
by DataMystic Support
Add a special character before each item### such as '~', then use

[ longest 1+ not '~' ]

to prevent matches from occurring in a different item.
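
For readers who think in regular expressions, the same idea can be sketched in Python. This is only an analogue of the EasyPattern above, not TextPipe syntax: the sample text, the `extract` function, and the assumption that '~' never occurs in the real data are all made up for illustration. The character class `[^~]*` plays the role of `[ longest 1+ not '~' ]`, so a match can never run into the next item.

```python
import re

# Hypothetical sample shaped like the structure in the first post.
text = """item###1
junk line
AAA

item###2
junk line
AAA
junk line
BBB
junk line

item###3
BBB
junk line
"""

# Step 1: put a '~' marker in front of every item header.
# Assumes '~' does not appear anywhere else in the data.
marked = re.sub(r"(?m)^(item###\d+)", r"~\1", text)

# Step 2: capture the item number and everything up to the next '~'.
# [^~]* cannot cross a marker, so each body belongs to exactly one item.
ITEM = re.compile(r"~item###(\d+)([^~]*)")

def extract(marked_text):
    """Return (item number, has AAA, has BBB) for each item."""
    rows = []
    for match in ITEM.finditer(marked_text):
        body = match.group(2)
        rows.append((match.group(1), "AAA" in body, "BBB" in body))
    return rows

rows = extract(marked)
for row in rows:
    print(row)
```

Each row is tied to a single item number, so a BBB found in item 3 is never attributed to item 1.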

Posted: Thu Jun 28, 2007 5:00 pm
by bone
Thanks a lot!!
Each of my items now starts with a ~ and my search code is

Code:

[longest 1+ not '~']
[capture(8digits)]
[capture('Guidelines', 4chars)]
[capture('Changes based on', 5chars)]
My 'Replace with' is

Code:

####1-$1
####2-$2
####3-$3
Unfortunately, the result comes up empty, although the first item contains the 'Guidelines' string and the second the 'Changes based on' string. I would be grateful for another hint.

Thanks
B.
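
The thread does not say why this search returns nothing. One possible reading, offered purely as an assumption, is that a pattern written as one sequence only matches when every part is present in order, so an item containing only 'Guidelines' or only 'Changes based on' never matches as a whole. The Python sketch below shows the alternative of cutting the text into items at each '~' and then looking for each field independently. It is not TextPipe syntax; the sample data, the `FIELDS` table, and `extract_items` are hypothetical.

```python
import re

# Hypothetical data: each item starts with '~' and an 8-digit number.
marked = (
    "~12345678\njunk\nGuidelines v12\njunk\n"
    "~87654321\njunk\nChanges based on 2006\n"
)

# One independent pattern per field, mirroring the three captures above.
FIELDS = {
    "id": re.compile(r"\d{8}"),
    "guidelines": re.compile(r"Guidelines.{4}"),
    "changes": re.compile(r"Changes based on.{5}"),
}

def extract_items(marked_text):
    """Return one dict per item; a field that is absent becomes ''."""
    records = []
    # Everything before the first '~' is not an item, so skip it.
    for chunk in marked_text.split("~")[1:]:
        record = {}
        for name, pattern in FIELDS.items():
            match = pattern.search(chunk)
            record[name] = match.group(0) if match else ""
        records.append(record)
    return records

records = extract_items(marked)
for record in records:
    print(record)
```

Because each field is searched separately inside its own item, a missing 'Guidelines' or 'Changes based on' string leaves that field blank and does not stop the rest of the item from being extracted.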

Posted: Thu Jun 28, 2007 7:32 pm
by DataMystic Support
Why don't you email us a real sample?

Posted: Thu Jun 28, 2007 7:38 pm
by bone
In the forum or directly? Which address?
Thanks

Posted: Thu Jun 28, 2007 9:00 pm
by DataMystic Support