
string not identified & not removed

Posted: Wed Apr 10, 2013 8:25 pm
by FernandoPS
Hello,
Newcomer here, and not an expert in editing. I am trying to convert some files (90-100 MB each) from OEM - Multilingual Latin 1 to ANSI, and to split the content, at a specific point, into new, separate files (roughly 20,000 for each original file). The first two steps succeeded, but the output shows a string:
ÄMãå ¦âÙ´wMz

that I can neither remove nor identify. I have tried almost every option in the "replace" section, but nothing works.
The string shown here does not look the same as it does in Notepad, although it was pasted from there.

Having done some research (converting the original txt file to rtf), I have found the pattern that is causing the trouble, but I still cannot delete it. Some "Page Breaks" are involved, with null characters before and after:

________l__=_Ü_


__ãå_ ■___________________________________________________________________



Any ideas?
Regards.

Fernando

Re: string not identified & not removed

Posted: Fri Apr 19, 2013 1:18 pm
by DataMystic Support
Hi Fernando,

The best approach might be to use a perl pattern:


_{5,}+.*_{20,}+
This finds 5 or more _, then anything, then 20 or more _. Replace it with nothing. Enable the prompt-on-replace option so you can confirm that each match is correct before it is removed.
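
To try the pattern on a sample line outside the tool first, here is a minimal Python sketch. It makes several assumptions: `strip_junk` is a hypothetical helper name, the sample line is invented from the fragment in the question, and the possessive `+` after each quantifier is dropped because Python's `re` only accepts that syntax from version 3.11 onward. The two forms can differ in edge cases (for example, a line that is nothing but underscores), so treat this as an approximation of the pattern above, not an exact equivalent.

```python
import re

# Non-possessive form of the pattern from the reply (an assumption made
# for compatibility with Python versions before 3.11).
JUNK = re.compile(r"_{5,}.*_{20,}")

def strip_junk(line: str) -> str:
    """Remove a run of 5+ underscores, anything after it on the same line,
    and a closing run of 20+ underscores."""
    return JUNK.sub("", line)

# Invented sample: readable text around a junk run like the one in the question.
sample = "text before " + "________l__=_Ü_ __ãå_ ■" + "_" * 30 + " text after"
print(repr(strip_junk(sample)))
```

Note that `.` does not match a line break by default in Python, so this only removes junk that sits on a single line. If the underscores in the rtf view are really null characters in the original file, as the question suggests, the `_` in the pattern would probably need to be written as `\x00` instead. That is also an assumption and should be checked against the real data.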