Extract text from HTML tag
Posted: Wed Aug 27, 2008 12:23 pm
by asoydah
How can I extract the text from an HTML tag?
Example :
<h3><b><a name="F5">Family SUV / Wagon</a></b></h3>
<h4>Mitsubishi Outlander or similar</h4>
<ul>
<li>4 door SUV</li>
<li><b>Auto</b>, Power Steering, MP3/CD player</li>
<li>Air Conditioning</li>
<li>
I want to extract it into an XML file like:
<name>Mitsubishi Outlander or similar</name>
<desc>Family Suv</desc>
Can someone help me with the filter?
Re: Extract text from HTML tag
Posted: Wed Aug 27, 2008 2:17 pm
by DataMystic Support
Use an EasyPattern search/replace with the Extract option turned on.
Replace the variable sections with:
[ capture(1+ chars) as 'car' ]
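For readers without TextPipe, the same idea (keep the fixed markup literal, capture the parts that vary) can be sketched with a Python regular expression. This is only an illustration and not EasyPattern syntax; the group names `desc` and `name` and the helper `extract` are made up for this example, and the pattern assumes the markup looks exactly like the sample in the question.

```python
import re

# Sample input copied from the question above.
html = '''<h3><b><a name="F5">Family SUV / Wagon</a></b></h3>
<h4>Mitsubishi Outlander or similar</h4>'''

# Fixed markup stays literal; each variable section becomes a named
# capture group (hypothetical names: desc, name).
pattern = re.compile(
    r'<h3><b><a name="[^"]+">(?P<desc>.+?)</a></b></h3>\s*'
    r'<h4>(?P<name>.+?)</h4>'
)

def extract(text):
    """Return one <name>/<desc> snippet per matched car heading."""
    return [
        "<name>{}</name>\n<desc>{}</desc>".format(m.group("name"), m.group("desc"))
        for m in pattern.finditer(text)
    ]

print("\n".join(extract(html)))
```

Note that this keeps the full heading text ("Family SUV / Wagon") as the description; trimming it further would need an extra rule.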
Re: Extract text from HTML tag
Posted: Wed Aug 27, 2008 4:53 pm
by asoydah
Sorry for being the idiot here,
but which part should I replace?
Re: Extract text from HTML tag
Posted: Wed Aug 27, 2008 5:40 pm
by DataMystic Support
Are you buying TextPipe Pro? Have you looked at the website mining docs at
http://www.datamystic.com/docs ?
Re: Extract text from HTML tag
Posted: Thu Aug 28, 2008 9:32 pm
by Fixer
Hi asoydah
I made a filter for you in TextPipe.
Download it here:
http://plikojad.pl/bbg79d7lxszl (cars.rar > unzip to cars.fll and open it)
Result:
Code:
<cars>
<name>Mitsubishi Outlander or similar</name>
<desc>Family SUV / Wagon</desc>
<option>4 door SUV</option>
<option><b>Auto</b>, Power Steering, MP3/CD player</option>
<option>Air Conditioning</option>
</cars>
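The filter file itself is not reproduced in this thread, so the following is not Fixer's filter, just a rough Python sketch that produces the same output from the sample HTML in the first post. The function name `cars_to_xml` is hypothetical, and the regular expressions assume one car per input and `<li>` items that open and close on the same line.

```python
import re

def cars_to_xml(html):
    """Build a <cars> block from one car listing (regex sketch, not a full HTML parser)."""
    desc = re.search(r'<h3>.*?<a name="[^"]*">(.*?)</a>', html)
    name = re.search(r'<h4>(.*?)</h4>', html)
    # Only <li> items closed on the same line match; inner tags such as <b> are kept.
    options = re.findall(r'<li>(.*?)</li>', html)

    lines = ["<cars>"]
    if name:
        lines.append("<name>{}</name>".format(name.group(1)))
    if desc:
        lines.append("<desc>{}</desc>".format(desc.group(1)))
    lines.extend("<option>{}</option>".format(opt) for opt in options)
    lines.append("</cars>")
    return "\n".join(lines)

sample = '''<h3><b><a name="F5">Family SUV / Wagon</a></b></h3>
<h4>Mitsubishi Outlander or similar</h4>
<ul>
<li>4 door SUV</li>
<li><b>Auto</b>, Power Steering, MP3/CD player</li>
<li>Air Conditioning</li>
<li>'''

print(cars_to_xml(sample))
```

The unclosed trailing `<li>` in the sample is skipped because it has no matching `</li>`. For real pages with nested or multi-line markup, a proper HTML parser would be a safer choice than regular expressions.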
Re: Extract text from HTML tag
Posted: Thu Aug 28, 2008 10:20 pm
by DataMystic Support
Thanks Fixer!
Re: Extract text from HTML tag
Posted: Thu Aug 28, 2008 11:01 pm
by asoydah
Thanks, Fixer. I already clicked that link but couldn't download anything.
Maybe it's a broken link?
Re: Extract text from HTML tag
Posted: Fri Aug 29, 2008 10:19 pm
by Fixer
No, it works! Oh gosh...
You must click twice (first on the link, then on the file cars.rar).
But OK, don't worry. Try clicking this direct link instead:
http://plikojad.pl/download/bbk4h4wsehn ... 642cd74058