remove errant CRLF from customer emails
I have to remove CRLFs from emails that customers write to us and organize the data into one row per customer. The CRLFs in the text data need to be removed until we encounter a 3-digit number, which starts a new line.
It looks like this:
001|DEC2007|I am frustrated with the bank
being able to help me
Please call me.
002|DEC2007| This is craxy. I have never seen
anything
like
this.
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
- Contact:
Use the EasyPattern:
Code:
[ 3 digits, 1+ chars, mustEndWith( cr, lf, 3 digits ) ]
Replace with:
Code:
$0
Add a subfilter to this to replace the EasyPattern
Code:
[cr, lf]
with nothing.
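For readers without TextPipe, the suggestion above can be approximated in Python. This is only a rough sketch: it uses Python's `re` module rather than EasyPatterns, and it assumes that `mustEndWith(...)` behaves like a lookahead (the text must be followed by the pattern, which is not consumed). The names `RECORD`, `strip_embedded_crlf`, and `sample` are made up for illustration.

```python
import re

# Assumed regex stand-in for the EasyPattern: a 3-digit ID, then as little
# text as possible, up to a CRLF that is followed by another 3-digit ID.
RECORD = re.compile(r"\d{3}.+?(?=\r\n\d{3})", re.DOTALL)


def strip_embedded_crlf(text: str) -> str:
    # Stand-in for the subfilter: inside each match, replace CRLF with nothing.
    return RECORD.sub(lambda m: m.group(0).replace("\r\n", ""), text)


sample = (
    "001|DEC2007|I am frustrated\r\nwith the bank\r\n"
    "002|DEC2007|This is\r\ncrazy.\r\n"
)
print(repr(strip_embedded_crlf(sample)))
```

Under these assumptions, the words on either side of a removed CRLF run together, and the final record is left untouched because no 3-digit ID follows it.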
Perhaps I am missing something?
I tried what you said, but it did not work correctly. I assumed you meant an EasyPattern subfilter under the filter that replaces
[ 3 digits, 1+ chars, mustEndWith( cr, lf, 3 digits ) ] with $0?
I must be missing something....
Here is the filter export.
Input from file(s)
| [ ] Confirm before processing each file
| [ ] Confirm before processing read/only files
| [ ] Delete input files after processing
| Process binary files
|
|--Remove multiple whitespace
|
|--EasyPattern [[ 3 digits, 1+ chars, mustEndWith( cr, lf, 3 digits ) ]] with [$0]
| | [ ] Match case
| | [ ] Whole words only
| | [ ] Case sensitive replace
| | [ ] Prompt on replace
| | [ ] Skip prompt if identical
| | [ ] First only
| | [ ] Extract matches
| | Maximum text buffer size 4096
| |
| +--EasyPattern [[cr, lf]] with []
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
- DataMystic Support
OK, I missed a couple of things. It does not process the last line, and it should be replacing the embedded cr, lf with a space.
I worked around this by adding an ASCII 255 (hex \xff) character at the start of each record to prevent it from joining all the records together. These get removed at the end.
I can also email you this filter if you drop us an email.
Here is the new filter:
Code:
|
|--EasyPattern [[ linestart, 3 digits ]] with [\xff$0]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
|--EasyPattern [[ ascii($ff), capture(3 digits, longest 1+ not ascii($ff)) ]] with [$0]
| | [ ] Match case
| | [ ] Whole words only
| | [ ] Case sensitive replace
| | [X] Prompt on replace
| | [ ] Skip prompt if identical
| | [ ] First only
| | [ ] Extract matches
| | Maximum text buffer size 4096
| |
| |--EasyPattern [[cr, lf]] with [ ]
| | [ ] Match case
| | [ ] Whole words only
| | [ ] Case sensitive replace
| | [X] Prompt on replace
| | [ ] Skip prompt if identical
| | [ ] First only
| | [ ] Extract matches
| | Maximum text buffer size 4096
| |
| +--Add footer [\r\n]
|
|--EasyPattern [[ ascii($ff) ]] with []
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
|
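The same marker idea can be sketched outside TextPipe. The Python below is an illustration under assumptions, not the filter itself: `SENTINEL` and `join_records` are made-up names, and trimming the trailing space before the footer is an extra step that the filter above does not perform.

```python
import re

SENTINEL = "\xff"  # the marker character used by the filter above


def join_records(text: str) -> str:
    # Step 1: put the marker in front of every line that starts with 3 digits.
    marked = re.sub(r"^(\d{3})", SENTINEL + r"\1", text, flags=re.MULTILINE)

    # Step 2: everything between two markers is one record. Replace the
    # embedded CRLFs with a space and end the record with a single CRLF.
    head, *chunks = marked.split(SENTINEL)
    rows = []
    for chunk in chunks:
        joined = chunk.replace("\r\n", " ")
        # Assumption: drop the trailing space left by the record's last CRLF.
        rows.append(joined.rstrip(" ") + "\r\n")

    # Step 3: the markers are already gone because split() discarded them.
    # Text before the first record is passed through unchanged.
    return head + "".join(rows)


sample = (
    "001|DEC2007|I am frustrated with the bank\r\n"
    "being able to help me\r\n"
    "Please call me.\r\n"
    "002|DEC2007| This is crazy. I have never seen\r\n"
    "anything\r\nlike\r\nthis."
)
print(join_records(sample))
```

As with the filter, any continuation line that happens to begin with three digits would be treated as a new record; matching the pipe after the ID as well is one possible way to tighten that.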
Not quite yet
All I want to do is get rid of the carriage returns that occur before the start of a new record. A new record always starts with an ID of 001, 002, 003, etc. The customers insert the cr lf when they are typing their messages to us. Our downstream application can't handle the cr lf because it treats it as the start of a new line when it is not. This is a pipe-delimited file, if that helps. Perhaps the simplest thing to do is to use the restrict filter and just remove the cr lf from the third field?
I apologize for not clarifying this better up front.
The first and second lines, once parsed and processed, would look like this:
001|DEC2007|I am frustrated with the bank being able to help.....
002|DEC2007|This is crazy. I have never seen anything like this.
Old
001|DEC2007|I am frustrated with the bank
being able to help me
Please call me.
002|DEC2007| This is crazy. I have never seen
anything
like
this.
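The requirement described above (one row per record, with line breaks removed from the third field only) can also be sketched line by line in Python. This is a hedged illustration, not a TextPipe filter: `RECORD_START`, `tidy`, and `one_row_per_record` are hypothetical names, and the sketch assumes every record begins with three digits followed by a pipe.

```python
import re
from typing import Iterable, Iterator

# Assumed shape of a record start: a 3-digit ID immediately followed by a pipe.
RECORD_START = re.compile(r"\d{3}\|")


def tidy(row: str) -> str:
    # Split into ID, period, and message; pipes inside the message are kept.
    parts = row.split("|", 2)
    if len(parts) == 3:
        # Collapse runs of whitespace in the message and trim both ends.
        parts[2] = " ".join(parts[2].split())
    return "|".join(parts)


def one_row_per_record(lines: Iterable[str]) -> Iterator[str]:
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if RECORD_START.match(line):
            # A new ID closes the previous record.
            if current is not None:
                yield tidy(current)
            current = line
        elif current is not None:
            # Continuation line: append it to the message with a space.
            current += " " + line
        # Lines that appear before the first record are skipped.
    if current is not None:
        yield tidy(current)


old = [
    "001|DEC2007|I am frustrated with the bank\r\n",
    "being able to help me\r\n",
    "Please call me.\r\n",
    "002|DEC2007| This is crazy. I have never seen\r\n",
    "anything\r\n",
    "like\r\n",
    "this.\r\n",
]
for row in one_row_per_record(old):
    print(row)
```

Because the last record is emitted when the input ends, it does not depend on another ID following it.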
- DataMystic Support
Did you get my email?
I sent it to simon.carter at datamystic.com. Is that correct?
- DataMystic Support
Hi Sheridan,
Clearly something is wrong with your company's email filtering, and I'm prepared to bet that my company is not the only one with the problem. It is a huge waste of our resources to constantly resend and re-reply to emails because of external filtering issues, and understandably you get frustrated with our apparent lack of response.
Please get a Gmail account and use that for contacting us in future. I'll be happy to send the filter to your Gmail account.
Email Filtering
At least now we know it is on our side. I have a Yahoo account that I can access at work. The address is my forum screen name @yahoo.com. Will that work? I at least have access to anything that goes to its spam filter and can flag it as not spam.