Extract certain HTML tags

Get help with installation and running here.

Moderators: DataMystic Support, Moderators

JimC
Posts: 3
Joined: Thu Mar 15, 2007 10:19 pm

Extract certain HTML tags

Post by JimC »

I am trying to extract some data from web pages. All of the content I need is contained within two tags of the form:
<div class="someclass">content</div>
What is the best FILTER to extract just these two tags from a file so that I can then proceed with further processing?

Something like an "extract HTML/XML pair" filter would be perfect, but I don't see that as an option.
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Extract certain HTML tags

Post by DataMystic Support »

Hi Jim,

Just use a Perl pattern filter. Search for:

Code:

<div class="someclass">(.*)</div>
replace with

Code:

$1
and check 'Extract'. The parentheses capture the text between the tags, and $1 outputs only that captured text.
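
For anyone who wants to try the same capture-group idea outside TextPipe, here is a minimal sketch in Python. It is not the TextPipe filter itself. The sample HTML and the variable names are made up for illustration, and it uses a non-greedy (.*?) with DOTALL so that two separate divs, or a div spanning several lines, are captured individually. Whether the TextPipe pattern behaves the same way depends on filter options that are not stated in this thread.

```python
import re

# Hypothetical sample page; "someclass" is the placeholder class name from the question.
html = """<html><body>
<div class="someclass">first block</div>
<p>unrelated</p>
<div class="someclass">second
block</div>
</body></html>"""

# Non-greedy (.*?) stops at the first closing </div>;
# re.DOTALL lets "." match newlines so multi-line content is captured too.
pattern = re.compile(r'<div class="someclass">(.*?)</div>', re.DOTALL)

# findall returns only the captured group, which plays the role of $1.
extracted = pattern.findall(html)

for content in extracted:
    print(content)
```

Note that a regex like this will cut the content short if the target div contains nested <div> elements. In that case an HTML parser is likely the safer choice.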