Extracting text between BeginPoint and EndPoint


Moderators: DataMystic Support, Moderators

apportum
Posts: 7
Joined: Thu Aug 20, 2009 5:00 pm

Post by apportum »

I have hit a stone wall in trying to accomplish the following in TextPipe.

I want to extract everything from "<td valign=top><B>Any Name</B>" through at least the first instance of "<BR><I>(Plaintiff)</I></td>". (It would be nice if I could extract through the last instance, but I don't need miracles.)

Here is a sample of the source data:

<td valign=top><B>Any Name</B>  
<BR>Law Law Firm
<BR>11 East Wacker Drive
<BR>Suite 5759
<BR>Chicago, IL 60601
<BR>(312) 123-4567
<BR>LawyerName&#064;AnyLawFirm.com<br>
&nbsp;&nbsp;<I>Assigned: 12/04/2015</I><br>
&nbsp;&nbsp;<I>ATTORNEY TO BE NOTICED</I></td>
<td>representing </td>
<td><B>Any Name</B>
<BR><I>(Plaintiff)</I></td>
</TR><td valign=top></td>
<td></td>
<td><B>Another Name </B>
<BR><I>(Plaintiff)</I></td>
</TR><td valign=top></td>
<td></td>
<td><B>Someones Name </B>
<BR><I>(Plaintiff)</I></td>
</TR><td valign=top><B>DefendantFirm</B>
<BR>Big Law Firm
<BR>214 West Monroe Street
<BR>Suite 7410
<BR>Chicago, IL 60606
<BR>(312) 456-7890
<BR>LitigastorName&#064;SomeLawOffice.com<br>
&nbsp;&nbsp;<I>Assigned: 11/22/2015</I><br>
&nbsp;&nbsp;<I>ATTORNEY TO BE NOTICED</I></td>
<td>representing </td>
<td><B>DefendantFirm</B>
<BR><I>(Defendant)</I></td>

I would appreciate responses for both Perl and EasyPattern, though I will be grateful for anything that allows me to do the job.

Thank you in advance.

Jerry
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Extracting text between BeginPoint and EndPoint

Post by DataMystic Support »

Hi Jerry,

Sorry for the late reply.

Here is an EasyPattern:

Search for:
<td valign=top><B>Any Name</B>[ 1+ chars ]<BR><I>(Plaintiff)</I></td>

Replace with:
$0

Remember to check the Extract option.
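
For the regex side of the question, here is a minimal sketch of the same idea, written in Python so it can be run outside TextPipe. The regex syntax used (`.+?` for "as little as possible", `.+` for "as much as possible", plus a dot-matches-newline flag) is Perl-style, so the pattern text should carry over to a Perl pattern with little change, but that is an assumption and has not been checked against TextPipe itself. The `sample` string is a shortened copy of the data posted above, and the names `BEGIN`, `END`, `first`, `last` and `extract` are made up for this illustration.

```python
import re

# Shortened copy of the sample data from the question (illustration only).
sample = """<td valign=top><B>Any Name</B>
<BR>Law Law Firm
<BR>11 East Wacker Drive</td>
<td>representing </td>
<td><B>Any Name</B>
<BR><I>(Plaintiff)</I></td>
</TR><td valign=top></td>
<td></td>
<td><B>Another Name </B>
<BR><I>(Plaintiff)</I></td>
</TR><td valign=top><B>DefendantFirm</B>
<BR>Big Law Firm</td>
<td><B>DefendantFirm</B>
<BR><I>(Defendant)</I></td>
"""

# Escape the literal begin/end markers so "(" and ")" are not read as groups.
BEGIN = re.escape("<td valign=top><B>Any Name</B>")
END = re.escape("<BR><I>(Plaintiff)</I></td>")

# Lazy quantifier: stops at the FIRST end marker after the begin marker.
first = re.compile(BEGIN + r".+?" + END, re.DOTALL)

# Greedy quantifier: runs on to the LAST end marker in the text.
last = re.compile(BEGIN + r".+" + END, re.DOTALL)


def extract(pattern, text):
    """Return the first matched block, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    print(extract(first, sample))
    print("-----")
    print(extract(last, sample))
```

The lazy version covers the "at least the first instance" case. The greedy version covers the "through the last instance" wish, but on a file with many records it will run to the last (Plaintiff) marker in the whole input, so it is only safe when each record is processed separately.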