I am trying to parse a logging record (W3C ELFF) whose fields are delimited by spaces, except where a field is a text phrase that may (or may not) itself contain a space (sample below). All text-phrase fields are quote-enclosed so that any embedded spaces are not misinterpreted as delimiters. I am trying to use one or more of the delimited-fields filters to extract just such a field, that is, one that is always quote-enclosed. My problem is that, since I must tell the filter that the field delimiter is a space, when it encounters a text-phrase field with an embedded space it treats that space as the end of the field and leaves behind the rest of the quote-enclosed field.
#Field1 #Field2 "Text Field 3" #Field4 "AnotherTextField5" #Field6 .............
I understand this behavior, but how can I work around it so as to extract the whole field between the quotes?
I am a relative novice at TextPipe, but reasonably well-versed in RegEx. Am I making this harder than it should be? If so, please don't hesitate to say so.
Thanks in advance.
R
Extracting One Field of a Delimited Record
Moderators: DataMystic Support, Moderators
- DataMystic Support
Re: Extracting One Field of a Delimited Record
RichPipe sounds like a new product name
2 approaches
1. Design your own regex.
2. Change the file to CSV-delimited so that you can work with native filters. You could use a Perl search/replace filter for
Code: Select all
"[^"]*"
Replace with
Code: Select all
$0
and set the Action to 'Send NON-matching text to subfilter'.
As a subfilter, add a search/replace to change space to comma.
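If you want to check the logic outside TextPipe first, here is a rough Python sketch of both approaches. It is not TextPipe's own filter code, just an emulation under assumptions: the sample record, the TOKEN pattern, and the function names split_fields and spaces_to_commas are made up for illustration, and it assumes quoted fields never contain an embedded quote character.

```python
import re

# Sample record modeled on the one in the question (illustrative values only).
RECORD = '#Field1 #Field2 "Text Field 3" #Field4 "AnotherTextField5" #Field6'

# Hypothetical pattern for approach 1: a quoted phrase, or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def split_fields(record):
    """Approach 1: tokenize so that a quoted phrase counts as a single field."""
    return TOKEN.findall(record)


def spaces_to_commas(record):
    """Approach 2: change spaces to commas only in the text outside the quotes."""
    # Splitting on a capturing group keeps the quoted matches at the odd indexes,
    # so the even indexes hold the non-matching text (the "subfilter" input).
    parts = re.split(r'("[^"]*")', record)
    return "".join(
        part if index % 2 else part.replace(" ", ",")
        for index, part in enumerate(parts)
    )


print(split_fields(RECORD)[2])   # third field, quotes included
print(spaces_to_commas(RECORD))  # comma-delimited version of the record
```

Once the record is comma-delimited, the quoted field can be picked out by its field number like any other CSV field.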