string not identified & not removed

Post by **FernandoPS** » Wed Apr 10, 2013 8:25 pm

Hello,
I'm a newcomer here and not an expert in text editing. I am trying to convert some files (90-100 MB each) from OEM Multilingual Latin 1 to ANSI and split the content, at a specific point, into new, separate files (roughly 20,000 for each original file). I have succeeded in the first two steps, but the output contains a string:
ÄMãå ¦âÙ´wMz

that I can neither remove nor identify. I have tried almost every option in the "replace" section, but nothing works.
The string shown here does not look the same as it does in Notepad, although it was pasted from there.

Having done some research (converting the original txt file to rtf), I have found the pattern that is causing the trouble, but I still cannot delete it. It seems to involve some "Page Breaks", with null characters before and after:

________l__=_Ü_
uÃ
_Ü
__ãå_ ■___________________________________________________________________

Any ideas?
Regards.

Fernando

Post by **DataMystic Support** » Fri Apr 19, 2013 1:18 pm

Hi Fernando,

The best approach might be to use a perl pattern:


_{5,}+.*_{20,}+

This finds 5 or more underscores, followed by anything, then 20 or more underscores, and replaces the whole match with nothing. Enable the prompt on replace option so you can check each match before it is removed.
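If you want to try the idea outside the program first, here is a minimal Python sketch of the same kind of replacement. It is only an approximation of the pattern above and rests on a few assumptions that are not part of the original suggestion: the runs are literal underscores as shown in the post (in the real file they may be null characters instead), `re.DOTALL` is switched on so the match can cross the line breaks seen in the pasted sample, and the middle part is made lazy (`.*?`) so that two separate blocks are not merged into one big match. Plain quantifiers are used instead of the possessive `{5,}+` form, which Python only accepts from version 3.11. The name `strip_debris` and the sample text are made up for illustration.

```python
import re

# Approximation of the suggested pattern (assumptions: DOTALL so the match
# can span several lines, lazy ".*?" so separate blocks stay separate).
DEBRIS = re.compile(r"_{5,}.*?_{20,}", re.DOTALL)


def strip_debris(text: str) -> str:
    """Remove each block that starts with 5+ underscores and ends with 20+."""
    return DEBRIS.sub("", text)


# Hypothetical sample modelled on the string pasted in the question.
sample = (
    "first page\n"
    "________l__=_Ü_\nuÃ\n_Ü\n__ãå_ ■" + "_" * 67 +
    "\nsecond page\n"
)

print(repr(strip_debris(sample)))
```

Run it on a copy of one file and compare the result with the original before processing everything, in the same spirit as the prompt on replace check.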
