
Delete Variable Length Text

Posted: Fri Jun 14, 2013 9:20 pm
by kbrady
I'm new to the forum and the TextPipe product. I am in my third day of evaluating the software and I'm stuck. Here's an overview of what I am trying to do: using SSIS 2012, download a text file from an FTP site every day, parse the file with TextPipe to grab only certain elements, and then update a database with those elements. My question relates to parsing the file with TextPipe...

Here is what I want the file to look like after processing through TextPipe (I didn't enclose this information in code brackets because it's not really code):
runNumber 133293238 ICN 1013140549270
runNumber 133286956 ICN 1013135241820
runNumber 133292468 ICN 1013140552240
runNumber 133291020 ICN 1013140551990

Here is where I've gotten to so far using TextPipe:
runNumber 133293238 19 595.77 0 595.77 ICN 1013140549270
runNumber 133286956 4 666.66 0 ICN 1013135241820
runNumber 133292468 19 638.81 0 638.81 ICN 1013140552240
runNumber 133291020 1 613.49 0 613.49 ICN 1013140551990

I can't figure out how to delete a variable-length block of text (and I feel rather dumb for not being able to figure it out). The underlined red text needs to be removed. The "runNumber" can be variable-length also (although in my example above that is not the case). If I were to state what I want to accomplish in words: "Delete all text starting after the space following the last digit of the runNumber until the letter "I" is found".

I do apologize if this is a truly newbie question; I browsed the forum and found some posts that were close to what I need to do, but I'm not experienced enough with TextPipe to modify those to fit my needs.

Thanks.
Kris

Re: Delete Variable Length Text

Posted: Sat Jun 15, 2013 7:53 pm
by tumtum
Hi Kris

You want to remove some text from each line, right?

First, convert the end-of-line characters in the file so that you can write a pattern that works line by line.
Next, for each line, identify the start point and the end point of the text to remove.

Once you know those two things, each step is easy to do in TextPipe.

1. Use the filter that converts EOL to DOS.

2. Use a Replace filter with a perl-style find pattern to rewrite each line into the output you want.

You can use the regular expression below to capture the data you want:

Code: Select all

^(runNumber [0-9]+ )([^I]+)(ICN [^\r\n]+)$
Explanation:

Group 1: matches from the start of the line through the run number in your example, including the space after it.
Group 2: matches all text that is not "I", up to the "I" that starts Group 3.
Group 3: matches all text from ICN to the end of the line.

Replace with:

Code: Select all

$1$3
This gives you the output you want. Group 1 already ends with a space, so no extra space is needed between $1 and $3.
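
For anyone who wants to check the pattern outside TextPipe, here is a minimal Python sketch of the same find-and-replace. It is only an illustration, not TextPipe's own behaviour: the function name `strip_middle` is made up, the sample uses plain LF line endings, and groups 1 and 3 are joined directly because group 1 already ends with a space.

```python
import re

# Pattern from the post: runNumber prefix, the unwanted middle, then ICN to end of line.
# re.MULTILINE lets ^ and $ match at each line (assumes LF line endings in this sketch).
PATTERN = re.compile(r"^(runNumber [0-9]+ )([^I]+)(ICN [^\r\n]+)$", re.MULTILINE)


def strip_middle(text: str) -> str:
    """Drop group 2 and keep groups 1 and 3 (hypothetical helper name)."""
    return PATTERN.sub(r"\1\3", text)


sample = (
    "runNumber 133293238 19 595.77 0 595.77 ICN 1013140549270\n"
    "runNumber 133286956 4 666.66 0 ICN 1013135241820\n"
)
print(strip_middle(sample))
```

Because group 2 is "anything that is not I", the pattern does not care how many numbers sit between the run number and ICN, which is why both sample lines are handled.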

Panupong

Re: Delete Variable Length Text

Posted: Mon Jun 17, 2013 9:13 am
by DataMystic Support
Another approach is to use EasyPatterns:

Code: Select all

[ 1+ digits,    ;the 19
' ', 
1+ digits, '.', 1+ digits,    ; the 595.77
' ', 
1+ digits,  ;the 0
' ', 
1+ digits, '.', longest 1+ digits   ;the 595.77
 ]
and replace with nothing.
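
As a rough cross-check, the EasyPattern above can be approximated with an ordinary regular expression in Python. The translation is an assumption about how the EasyPattern reads (integer, space, decimal, space, integer, space, decimal), and the names `FOUR_FIELDS` and `remove_fields` are made up for this sketch.

```python
import re

# Assumed regex equivalent of the EasyPattern:
# integer, space, decimal, space, integer, space, decimal.
FOUR_FIELDS = re.compile(r"[0-9]+ [0-9]+\.[0-9]+ [0-9]+ [0-9]+\.[0-9]+")


def remove_fields(line: str) -> str:
    """Replace the matched run of four numbers with nothing (hypothetical helper)."""
    return FOUR_FIELDS.sub("", line)


four = "runNumber 133293238 19 595.77 0 595.77 ICN 1013140549270"
three = "runNumber 133286956 4 666.66 0 ICN 1013135241820"

print(repr(remove_fields(four)))   # the four numbers are removed
print(repr(remove_fields(three)))  # only three numbers here, so nothing matches
```

In this sketch, replacing with nothing leaves the spaces on both sides of the match behind, so the first line ends up with two spaces before ICN. The second sample line has only three numbers and is left unchanged, so a pattern of this shape would need a variant for that case.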

Re: Delete Variable Length Text

Posted: Mon Jun 17, 2013 9:07 pm
by kbrady
Thank you both for the replies.

I'll be doing some testing today using both recommended methods. At first glance, I think I would have to create an Easy Pattern for every possible combination of numbers, decimal places, etc. - but that may not be a bad thing.

I'll keep you posted.

Thanks again.

Re: Delete Variable Length Text

Posted: Mon Jun 17, 2013 9:14 pm
by DataMystic Support
No Kris,

The EasyPattern fragment:

Code: Select all

[ 1+ digits ]
matches any number of digits, from 1 digit to 4000 digits in a row.

That's the whole point of pattern matching.

Re: Delete Variable Length Text

Posted: Mon Jun 17, 2013 11:13 pm
by kbrady
I apologize, I should have been more clear in my previous statement.

I understand about the 1+ functionality; I was referring to those situations where a decimal may occur but was not necessarily included in the sample data that I provided. Additionally, sometimes one or more of the numbers may be a negative number (again, something that was not present in the sample data that I provided).

In the Easy Pattern that you provided (see below), the "19" may sometimes be a "19.54" or even a "-19.54" or the "0" may be a "-27.36".

Code: Select all

[ 1+ digits,    ;the 19
' ', 
1+ digits, '.', 1+ digits,    ; the 595.77
' ', 
1+ digits,  ;the 0
' ', 
1+ digits, '.', longest 1+ digits   ;the 595.77
 ]
Given that, the Easy Pattern is remarkably easy; I'll just have to figure out the combinations and go from there. Thanks for providing the sample - it has been really helpful in getting me started on using Easy Pattern.

I think I'm on the right path, but would appreciate any further comments / direction / advice anyone would be willing to provide.

Thanks.

Re: Delete Variable Length Text

Posted: Tue Jun 18, 2013 8:37 am
by DataMystic Support
For a general number, use this EasyPattern:

Code: Select all

[ ('+' or '-'), 1+ (number or period) ]
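
To show how a general-number pattern covers the negative and decimal cases Kris mentioned, here is a small Python sketch. It rests on assumptions: the regex `[+-]?[0-9.]+` is only a loose reading of the EasyPattern above with the sign treated as optional, and the names `NUMBER`, `MIDDLE` and `keep_run_and_icn` are made up.

```python
import re

# Assumed regex reading of the EasyPattern, with the sign made optional.
NUMBER = r"[+-]?[0-9.]+"

# Any run of space-separated numbers between the runNumber value and "ICN".
MIDDLE = re.compile(rf"^(runNumber [0-9]+ )(?:{NUMBER} )*(ICN .*)$")


def keep_run_and_icn(line: str) -> str:
    """Keep the runNumber and ICN parts, dropping the numbers in between."""
    return MIDDLE.sub(r"\1\2", line)


lines = [
    "runNumber 133293238 -19.54 595.77 -27.36 595.77 ICN 1013140549270",
    "runNumber 133286956 4 666.66 0 ICN 1013135241820",
]
for line in lines:
    print(keep_run_and_icn(line))
```

Repeating one general number pattern, rather than spelling out each field, means the count of fields and the presence of signs or decimals no longer matter.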

Re: Delete Variable Length Text

Posted: Wed Jun 19, 2013 3:58 am
by kbrady
Oh. Duh.
Thanks. :oops:

P.S. How do I learn about Easy Patterns without constantly bugging the forum folks? I've read the tutorial and the online help, but (obviously) I still don't understand enough to work efficiently with Easy Patterns. Any suggestions?

Thanks again.

Re: Delete Variable Length Text

Posted: Wed Jun 19, 2013 6:41 am
by DataMystic Support