How to convert HTM to text

Posted: Fri May 20, 2005 9:05 am
by ling
Hi,

How do I convert HTM to text format, stripping out all the comments and tags, so that it becomes very clean text (retaining the original paragraphs)? I use the remove HTM function, but it leaves a lot of garbage such as comments. How can I completely remove all the irrelevant words that should not appear in the text format?

Thanks

Posted: Fri May 20, 2005 10:44 pm
by DataMystic Support
First, use a Perl search/replace to replace

<!--.*-->
with nothing, to remove comments.
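
For anyone doing this step outside the tool, here is a minimal sketch of the same comment removal in Python. It is only an illustration, not the product's own implementation: the function name `strip_html_comments` is hypothetical, and it assumes a non-greedy pattern (`.*?`) with the DOTALL flag so that comments spanning several lines are caught and text sitting between two comments is kept.

```python
import re

# Assumption: non-greedy ".*?" so each comment is matched on its own,
# and re.DOTALL so "." also matches newlines inside multi-line comments.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_html_comments(html: str) -> str:
    """Replace every HTML comment in the input with nothing."""
    return COMMENT_RE.sub("", html)


if __name__ == "__main__":
    sample = "<p>One</p><!-- note -->\n<p>Two</p><!--\nmulti\nline\n--><p>Three</p>"
    print(strip_html_comments(sample))
```

This only covers comments. Tags and everything else in the markup are left untouched, so the paragraph structure of the source is not changed by this step.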