I am trying to extract some data from web pages. All of the content I need is contained within two
<div class="someclass">content</div>
tags.
What is the best filter to extract just these two tags from a file and then proceed with further processing?
Something like an "extract HTML/XML pair" filter would be perfect, but I don't see that as an option.
Extract certain HTML tags
- DataMystic Support
Re: Extract certain HTML tags
Hi Jim,
Just a perl pattern:
Code:
<div class="someclass">(.*)</div>
replace with
Code:
$1
and check 'Extract'.
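For anyone who wants to try the same idea outside the filter dialog, here is a rough sketch in Python. It is not part of the product and only illustrates the capture-group approach. Two things in it are my own assumptions rather than part of the pattern above: the group is non-greedy, (.*?), so that two divs on one line are captured separately, and DOTALL is set so that content spanning several lines still matches. The function name extract_divs and the sample text are made up for the example.

```python
import re

# Non-greedy variant of the pattern from the reply (an assumption, see above).
# DOTALL lets "." match newlines, so multi-line div content is captured too.
DIV_PATTERN = re.compile(r'<div class="someclass">(.*?)</div>', re.DOTALL)


def extract_divs(html):
    """Return the inner content of every matching div, in document order."""
    return DIV_PATTERN.findall(html)


# Hypothetical input: two wanted divs surrounded by markup to be discarded.
sample = (
    '<p>ignored</p>'
    '<div class="someclass">first</div>'
    '<span>ignored</span>'
    '<div class="someclass">second</div>'
)

print(extract_divs(sample))  # ['first', 'second']
```

Note that a regular expression like this stops at the first closing </div> it finds, so it will cut the content short if the target div contains other divs nested inside it.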