I have a lot of HTML files with a similar structure:
...
<td>47:28:...numbers</td>
<td>text</td>
<td align=right>numbers</td>
...
How can I extract the contents of each pair of tags to a different file?
At the moment I only know how to extract the numbers between the first pair of tags:
Input
Extract lines matching [<td>47:28]
Remove HTML and XML
Remove blanks from Start of Line
Remove blanks from End of Line
Output
I can't figure out how to extract the text between the other pairs. Can anybody help?
Extract text between tags
Moderators: DataMystic Support, Moderators
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Re: Extract text between tags
Use a Perl pattern search/replace filter. Find:
Code:
<td>47:28:(.*)</td>
Action: Send var 1 to subfilter.
As a subfilter, add a Special\Secondary Output filter, directing output to the file you need.
Repeat these two steps for each section you need.
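If it helps to prototype the same idea outside TextPipe, here is a rough Python sketch: one pattern with a single capture group per cell type, and each capture sent to its own file. Only the first pattern comes from the reply above. The other two patterns and all the output file names are assumptions based on the sample rows in the question, and `(.*?)` is used instead of `(.*)` so that several cells on one line would not be merged into one match.

```python
import re
from pathlib import Path

# Hypothetical output file names mapped to one pattern each.
# Every pattern has exactly one capture group (the "var 1" of the reply).
RULES = {
    "codes.txt": re.compile(r"<td>47:28:(.*?)</td>"),
    # Assumed: a plain <td> that does not start with 47:28: is the text cell.
    "text.txt": re.compile(r"<td>(?!47:28:)(.*?)</td>"),
    # Assumed: the right-aligned cell holds the second set of numbers.
    "numbers.txt": re.compile(r"<td align=right>(.*?)</td>"),
}


def extract(html):
    """Return, per output name, the stripped capture of every match."""
    return {
        name: [value.strip() for value in pattern.findall(html)]
        for name, pattern in RULES.items()
    }


def write_outputs(html, out_dir):
    """Append each rule's captures to its own file, one capture per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, values in extract(html).items():
        with open(out_dir / name, "a", encoding="utf-8") as handle:
            handle.writelines(value + "\n" for value in values)
```

Calling `write_outputs(Path("page.html").read_text(), "out")` for each input file (both names are placeholders) would accumulate the results in three files. Appending mirrors sending every match to the same secondary output.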