Extractions based on Variable

Get help with installation and running here.

Moderators: DataMystic Support, Moderators

mrkinsp
Posts: 2
Joined: Tue Jul 25, 2006 11:12 pm

Extractions based on Variable

Post by mrkinsp »

I need help with the following line extractions:

Example of text -

1|1001|dog|1238094710219381098
2|1001|2398479872934872938
3|1001|0892039420938029384
1|1002|cat|23409230498203948
2|1002|23094203480293840924

Wherever a line contains "dog", I need to extract (or remove) that line and all following lines that share the variable preceding "dog" (here, 1001).

Clarifications:
"dog" is constant, but "1001" is variable. "|" is the delimiter.

Thanks
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

So how would the new output file look...?
mrkinsp
Posts: 2
Joined: Tue Jul 25, 2006 11:12 pm

Post by mrkinsp »

I ultimately need to create two text files from a single text file: one holding the result of an extraction and one holding the result of a remove.

2 Examples:


Input:
1|1001|dog|1238094710219381098
2|1001|2398479872934872938
3|1001|0892039420938029384
1|1002|cat|23409230498203948
2|1002|23094203480293840924

Output file 1:
1|1001|dog|1238094710219381098
2|1001|2398479872934872938
3|1001|0892039420938029384

Output file 2:
1|1002|cat|23409230498203948
2|1002|23094203480293840924


Input:
1|1021|mouse|1238094710219381098
2|1021|2398479872934872938
3|1021|0892039420938029384
1|1022|cat|23409230498203948
2|1022|23094203480293840924
1|1023|dog|1238094710219381098
2|1023|2398479872934872938
3|1023|0892039420938029384
1|1024|rabbit|23409230498203948
2|1024|23094203480293840924

Output file 1:
1|1023|dog|1238094710219381098
2|1023|2398479872934872938
3|1023|0892039420938029384

Output file 2:
1|1021|mouse|1238094710219381098
2|1021|2398479872934872938
3|1021|0892039420938029384
1|1022|cat|23409230498203948
2|1022|23094203480293840924
1|1024|rabbit|23409230498203948
2|1024|23094203480293840924

Hope this helps.

Many thanks
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

You can do this with the following EasyPattern search/replace:


[ pipefield, pipe, capture(pipefield), pipe, 'dog', pipe, 0+ not cr or lf, cr, lf,
  longest 0+ (pipefield, pipe, group1, pipe, 0+ not cr or lf, cr, lf) ]
As a subfilter, add a Filters\Special\Secondary output filter to send this output to Output File 1. Follow this with a Remove All filter.

Set the main output filter to send text to Output File 2.
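
For readers more used to regular expressions, the EasyPattern above corresponds roughly to the Python regex below. This is only a sketch: it assumes a "pipefield" means a run of characters other than "|", CR and LF, it anchors the match at the start of a line, and it accepts a bare LF as well as CR LF. The names FIELD, REST, EOL and DOG_BLOCK are made up for illustration.

```python
import re

# Assumption: a "pipefield" is a run of characters other than "|", CR and LF.
FIELD = r"[^|\r\n]*"
REST = r"[^\r\n]*"   # "0+ not cr or lf"
EOL = r"\r?\n"       # the EasyPattern asks for CR LF; bare LF is accepted too

DOG_BLOCK = re.compile(
    rf"^{FIELD}\|({FIELD})\|dog\|{REST}{EOL}"  # header line; group 1 is the ID
    rf"(?:{FIELD}\|\1\|{REST}{EOL})*",         # following lines with the same ID
    re.MULTILINE,
)

sample = (
    "1|1001|dog|1238094710219381098\r\n"
    "2|1001|2398479872934872938\r\n"
    "3|1001|0892039420938029384\r\n"
    "1|1002|cat|23409230498203948\r\n"
    "2|1002|23094203480293840924\r\n"
)

match = DOG_BLOCK.search(sample)
extracted = match.group(0)             # the three 1001 lines (Output File 1)
remaining = DOG_BLOCK.sub("", sample)  # the two 1002 lines (Output File 2)
```

Like the EasyPattern, this only matches lines that end with a line break, so a final line without one would be left out of the match.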

Here is how the filter list looks:

|--EasyPattern [[ pipefield, pipe, capture(pipefield), pipe, 'dog', pipe, 0+ not cr or lf, cr, lf,\r\n longest 0+ (pipefield, pipe, group1, pipe, 0+ not cr or lf, cr, lf) ]] with [$0]
| | [ ] Match case
| | [ ] Whole words only
| | [ ] Case sensitive replace
| | [X] Prompt on replace
| | [ ] Skip prompt if identical
| | [ ] First only
| | [ ] Extract matches
| | Maximum text buffer size 4096
| |
| |--Merge output to file c:\outputfile1.txt
| |
| +--Remove all
|
+--Merge output to file c:\outputfile2.txt
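
If you ever need the same split outside of the filter list, it can also be sketched as a single pass over the lines. This is not what the filter above does internally, just an illustration under these assumptions: the ID is the second field, the keyword is the third field, and the lines of a group are contiguous. The function name split_by_keyword is hypothetical.

```python
def split_by_keyword(lines, keyword="dog"):
    """Return (matched, rest): keyword groups and everything else."""
    matched, rest = [], []
    current_id = None  # ID of the group currently being extracted, if any
    for line in lines:
        fields = line.rstrip("\r\n").split("|")
        if len(fields) >= 4 and fields[2] == keyword:
            # Header line of a group to extract; remember its ID.
            current_id = fields[1]
            matched.append(line)
        elif current_id is not None and len(fields) >= 3 and fields[1] == current_id:
            # Following line that carries the same ID as the header.
            matched.append(line)
        else:
            # Any other line ends the current group.
            current_id = None
            rest.append(line)
    return matched, rest


sample_lines = [
    "1|1021|mouse|1238094710219381098\n",
    "2|1021|2398479872934872938\n",
    "3|1021|0892039420938029384\n",
    "1|1022|cat|23409230498203948\n",
    "2|1022|23094203480293840924\n",
    "1|1023|dog|1238094710219381098\n",
    "2|1023|2398479872934872938\n",
    "3|1023|0892039420938029384\n",
    "1|1024|rabbit|23409230498203948\n",
    "2|1024|23094203480293840924\n",
]

file1_lines, file2_lines = split_by_keyword(sample_lines)
```

Here file1_lines holds the 1023 group and file2_lines holds the other seven lines, which you could then write to c:\outputfile1.txt and c:\outputfile2.txt. Because it works line by line, this approach does not depend on a whole group fitting inside a text buffer.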