DataMystic

Posted: **Wed Jun 27, 2007 5:28 pm**

Hi,
I'm new to TextPipe and seem to have started out with a particularly tricky problem. I need to extract data for statistical analysis from a text document.

Let's say I'm looking for information AAA and BBB for each item ###.
The structure of the text is the following:

Code:

item###1
<junk>
AAA

item###2
<junk>
AAA
<junk>
BBB
<junk>

item###3
BBB
<junk>

That is, I do not know in advance which item contains information on AAA and/or BBB, but I need the extracted data organized by ###. The EasyPattern searches I have come up with so far fail because they keep searching for, say, pattern BBB, ignoring the fact that the next BBB might belong to a different item.

I read your excellent white papers, but they do not seem to offer any code that addresses my problem.

I would greatly appreciate your help on this.
Bernie

Posted: **Thu Jun 28, 2007 11:12 am**

Add a special character such as '~' before each item###, then use

[ longest 1+ not '~' ]

to prevent matches from occurring in a different item.
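
For readers who want to see the idea outside TextPipe, here is a minimal Python sketch of the same delimiter trick. It is an illustration, not TextPipe syntax: the function name `extract_per_item`, the `item\S*` header pattern, and the sample text (taken from the first post) are assumptions, and it presumes '~' never occurs in the data itself.

```python
import re

def extract_per_item(text, fields=("AAA", "BBB")):
    """Report which of the given fields occur inside each item."""
    # Step 1: put a '~' in front of every line that starts an item.
    marked = re.sub(r"^(item\S*)", r"~\1", text, flags=re.MULTILINE)

    results = {}
    # Step 2: '[^~]*' can never run past the next '~', so each match
    # stays inside a single item (the analogue of "1+ not '~'").
    for match in re.finditer(r"~(item\S*)([^~]*)", marked):
        item_id, body = match.group(1), match.group(2)
        results[item_id] = [f for f in fields if f in body]
    return results

sample = """item###1
<junk>
AAA

item###2
<junk>
AAA
<junk>
BBB
<junk>

item###3
BBB
<junk>
"""

for item_id, found in extract_per_item(sample).items():
    print(item_id, found)
```

Because the search for BBB is confined to the text between two '~' characters, a BBB in item###2 can no longer be attributed to item###1.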

Posted: **Thu Jun 28, 2007 5:00 pm**

Thanks a lot!!
Each of my items now starts with a ~ and my search code is

Code:

[longest 1+ not '~']
[capture(8digits)]
[capture('Guidelines', 4chars)]
[capture('Changes based on', 5chars)]

My 'Replace with' is

Code:

####1-$1
####2-$2
####3-$3

Unfortunately, the result comes up empty, although the first item contains the guidelines string and the second the changes string. I would be grateful for another hint.

Thanks
B.
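
The thread never confirms why the result was empty. One possible cause, offered only as an assumption, is that a pattern listing all three captures in sequence needs every one of them to be present in the same item. The Python sketch below reproduces that effect with ordinary regular expressions and shows one alternative: searching for each field independently inside each item. The sample data, the name `extract`, and the reading of `4chars` / `5chars` as "the next 4 or 5 characters" are all hypothetical, since the real file was never posted.

```python
import re

# Hypothetical data: item 1 has only "Guidelines", item 2 only "Changes based on".
sample = (
    "~12345678\nGuidelines 2.1\n"
    "~87654321\nChanges based on v3.2\n"
)

# All three fields in sequence within one item: fails if any one is missing.
strict = re.compile(
    r"~(\d{8})[^~]*?(Guidelines.{4})[^~]*?(Changes based on.{5})"
)

def extract(text):
    """Search each field on its own inside every '~'-delimited item."""
    rows = []
    for item in re.findall(r"~[^~]+", text):
        number = re.search(r"\d{8}", item)
        guide = re.search(r"Guidelines.{4}", item)
        change = re.search(r"Changes based on.{5}", item)
        # Missing fields become empty strings so every item yields a row.
        rows.append((
            number.group(0) if number else "",
            guide.group(0) if guide else "",
            change.group(0) if change else "",
        ))
    return rows

print(strict.search(sample))  # no item holds all three fields
for row in extract(sample):
    print(row)
```

Treating each field as optional keeps one output row per item, which matches the original goal of organizing the data by item even when AAA or BBB is absent.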

Posted: **Thu Jun 28, 2007 7:32 pm**

Why don't you email us a real sample?

Posted: **Thu Jun 28, 2007 7:38 pm**

In the forum or directly? Which address?
Thanks

Posted: **Thu Jun 28, 2007 9:00 pm**

Directly - see http://www.datamystic.com/support.html

Extract irregular patterns