Extract irregular patterns

bone
Posts: 3
Joined: Wed Jun 27, 2007 5:04 pm

Post by bone »

Hi,
I'm new to TextPipe and seem to have started out with a particularly tricky problem. I need to extract data for statistical analysis from a text document.

Let's say I'm looking for information AAA and BBB for each item ###.
The structure of the text is the following:

Code:

item###1
<junk>
AAA

item###2
<junk>
AAA
<junk>
BBB
<junk>

item###3
BBB
<junk>

In other words, I do not know in advance which items contain AAA, BBB, or both, but I need the extracted data organized by ###. The EasyPattern searches I have come up with so far fail because they keep searching for, say, pattern BBB past the end of the current item, so they pick up a BBB that belongs to a different item.

I read your excellent white papers, but they do not seem to include code that addresses my problem.

I would greatly appreciate your help on this.
Bernie
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Add a special character such as '~' before each item###, then use

[ longest 1+ not '~' ]

to prevent matches from occurring in a different item.
Regards,

Simon Carter, https://www.DataMystic.com
https://www.JadeDiabetes.com - Insulin dose calculator for Type 1 diabetes
https://www.DownloadPipe.com - 250,000 free software downloads
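
For readers who want to see the delimiter idea outside TextPipe, here is a minimal sketch in Python. It is not TextPipe code. The item headers (`item1`, `item2`, ...), the sample text, and the function names `mark_items` and `extract_by_item` are assumptions made for illustration, and the sketch assumes '~' does not already occur in the data.

```python
import re

# Hypothetical sample shaped like the structure described in the first post.
SAMPLE = """item1
junk
AAA

item2
junk
AAA
junk
BBB
junk

item3
BBB
junk
"""


def mark_items(text):
    """Step 1: put a '~' in front of every line that starts an item."""
    return re.sub(r"^(?=item\d+)", "~", text, flags=re.MULTILINE)


def extract_by_item(text, fields=("AAA", "BBB")):
    """Step 2: list, for each item, the fields found inside that item only."""
    marked = mark_items(text)
    result = {}
    # [^~]* plays the role of "1+ not '~'": it can never run past the
    # next '~', so a match cannot spill into the following item.
    for match in re.finditer(r"~(item\d+)([^~]*)", marked):
        item_id, body = match.group(1), match.group(2)
        result[item_id] = [field for field in fields if field in body]
    return result


rows = extract_by_item(SAMPLE)
for item_id, found in rows.items():
    print(item_id, found)
```

Because each item body is cut off at the next '~', a BBB that belongs to item3 is never attributed to item1.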
bone
Posts: 3
Joined: Wed Jun 27, 2007 5:04 pm

Post by bone »

Thanks a lot!
Each of my items now starts with a ~ and my search code is

Code:

[longest 1+ not '~']
[capture(8digits)]
[capture('Guidelines', 4chars)]
[capture('Changes based on', 5chars)]
My 'Replace with' is

Code:

####1-$1
####2-$2
####3-$3
Unfortunately, the result comes up empty, although the first item contains the Guidelines string and the second contains the Changes string. I would be grateful for another hint.

Thanks
B.
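
One possible explanation, offered as an assumption since the thread never confirms the cause: if the search terms are applied as one sequence, an item only matches when it contains all three captures in that order, so items holding just one of the strings produce nothing. The sketch below shows the same effect with Python's `re` module rather than TextPipe. The two sample items, the pattern names, and `extract_fields` are hypothetical.

```python
import re

# Hypothetical items: the first has only the Guidelines string,
# the second only the Changes string (as described in the post above).
ITEMS = [
    "~12345678\nGuidelines 123\n",
    "~87654321\nChanges based on 2006\n",
]

# All three captures required, in order, inside a single item.
ALL_IN_SEQUENCE = re.compile(
    r"~[^~]*?(\d{8})[^~]*?(Guidelines.{4})[^~]*?(Changes based on.{5})",
    re.DOTALL,
)

# Alternative: look for each field on its own within one item.
FIELD_PATTERNS = {
    "number": re.compile(r"\d{8}"),
    "guidelines": re.compile(r"Guidelines.{4}"),
    "changes": re.compile(r"Changes based on.{5}"),
}


def extract_fields(item):
    """Return each field's text, or None when the item lacks that field."""
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(item)
        found[name] = match.group(0) if match else None
    return found


# Neither item contains all three strings, so the sequence never matches.
sequence_hits = [ALL_IN_SEQUENCE.search(item) for item in ITEMS]

# Searching field by field still recovers whatever each item has.
per_item = [extract_fields(item) for item in ITEMS]
```

Searching for each field separately inside an item, or marking the later fields as optional, avoids the all-or-nothing behaviour. Whether this is what happens in the TextPipe filter above would need checking against a real sample.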
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Why don't you email us a real sample?
bone
Posts: 3
Joined: Wed Jun 27, 2007 5:04 pm

Post by bone »

In the forum or directly? Which address?
Thanks
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Regards,

Simon Carter, https://www.DataMystic.com
https://www.JadeDiabetes.com - Insulin dose calculator for Type 1 diabetes
https://www.DownloadPipe.com - 250,000 free software downloads