Extracting One Field of a Delimited Record

Posted: Thu Feb 12, 2009 1:54 pm
by RichPipe
I am trying to parse a logging record (W3C ELFF) whose fields are delimited by spaces, except where a field is a text phrase that may (or may not) itself contain a space (sample below). All text-phrase fields are quote-enclosed so that any embedded spaces are not misinterpreted as delimiters. I am trying to use one or more of the delimited-fields filters to extract just such a field, that is, one that is always quote-enclosed. My problem is that, since I must tell the filter that the field delimiter is a space, when it encounters a text-phrase field with an embedded space it treats that space as the end of the field and leaves behind the rest of the quote-enclosed field.

#Field1 #Field2 "Text Field 3" #Field4 "AnotherTextField5" #Field6 .............

I understand this behavior, but how can I work around it so as to extract the whole field between the quotes?

I am a relative novice at TextPipe, but reasonably well-versed in RegEx. Am I making this harder than it should be? If so, please don't hesitate to say so.

Thanks in advance.
R

Re: Extracting One Field of a Delimited Record

Posted: Tue Feb 17, 2009 8:59 am
by DataMystic Support
RichPipe sounds like a new product name :-)

Two approaches:
1. Design your own regex.
2. Change the file to CSV-delimited so that you can work with the native filters. You could use a Perl search/replace filter for

  "[^"]*"
Replace with

  $0
and set the Action to 'Send non-matching text to subfilter'.

As a subfilter, add a search/replace that changes each space to a comma.
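
For anyone who wants to check the logic outside TextPipe, here is a rough Python sketch of both approaches. It is not TextPipe's own filter code, and the names split_fields and spaces_to_commas are made up for illustration. It assumes, like the pattern above, that a quoted field never contains an embedded double quote.

```python
import re

# A quote-enclosed phrase, or otherwise a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')
# The same quoted-phrase pattern used in the search/replace filter above.
QUOTED = re.compile(r'"[^"]*"')


def split_fields(record):
    """Approach 1: tokenize with one regex, keeping each quoted phrase whole."""
    return TOKEN.findall(record)


def spaces_to_commas(record):
    """Approach 2: change spaces to commas only in the NON-matching text,
    i.e. everything that lies outside a quoted phrase."""
    out = []
    pos = 0
    for m in QUOTED.finditer(record):
        out.append(record[pos:m.start()].replace(" ", ","))  # outside quotes
        out.append(m.group(0))                               # quoted text, untouched
        pos = m.end()
    out.append(record[pos:].replace(" ", ","))               # trailing text
    return "".join(out)


sample = '#Field1 #Field2 "Text Field 3" #Field4 "AnotherTextField5" #Field6'
print(split_fields(sample)[2])   # "Text Field 3"
print(spaces_to_commas(sample))  # #Field1,#Field2,"Text Field 3",#Field4,"AnotherTextField5",#Field6
```

Once the spaces outside the quotes have become commas, the record is ordinary quoted CSV, which is what lets the native delimited-field filters pick out the third field intact.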