Extracting regex from multiple pdfs and lines that surrounds

Get help with installation and running here.

Moderators: DataMystic Support, Moderators

simicar
Posts: 5
Joined: Thu Feb 15, 2007 7:25 am

Extracting regex from multiple pdfs and lines that surrounds

Post by simicar »

Hello.

I've got a problem. I need to find specific data using a regular expression and save everything that surrounds it,
from multiple PDFs to one file.

For example, I've got:

Code:

The California Gold Rush started in January 1848, when gold was 
discovered at Sutter's Mill. As news of the discovery spread, some 
300,000 people came to California from the rest of the United States and 
abroad. These early gold-seekers, called "Forty-Niners," traveled to 
California by sailing ship and in covered wagons across the continent, 
often facing substantial hardship on the trip

and I need to find, e.g., Forty-Niners using a regex and get one surrounding line on each side (or perhaps 50 surrounding characters), to get:

Code:

300,000 people came to California from the rest of the United States and
abroad. These early gold-seekers, called "Forty-Niners," traveled to 
California by sailing ship and in covered wagons across the continent,

How can I do this? Should I use find/replace or something else?
Which type of regex would fit best here?
Thanks.
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Use Filters\Extract\Matching lines, and include 1 context line above and below the match.
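
For anyone who wants to see the idea outside the product, here is a rough Python sketch of "matching lines plus context". It is not how the filter is implemented, just an illustration. It assumes the PDF text has already been extracted to a plain string, and the function names and the `context` and `width` parameters are made up for this example.

```python
import re


def matching_lines_with_context(text, pattern, context=1):
    """Return every line that matches `pattern`, plus `context`
    lines above and below each match, in original order."""
    lines = text.splitlines()
    regex = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            first = max(0, i - context)
            last = min(len(lines), i + context + 1)
            keep.update(range(first, last))
    # Overlapping context windows are merged, so no line is repeated.
    return [lines[i] for i in sorted(keep)]


def matches_with_char_context(text, pattern, width=50):
    """Return each match with up to `width` characters on either side."""
    return [
        text[max(0, m.start() - width): m.end() + width]
        for m in re.finditer(pattern, text)
    ]


sample = "\n".join([
    "The California Gold Rush started in January 1848, when gold was",
    "discovered at Sutter's Mill. As news of the discovery spread, some",
    "300,000 people came to California from the rest of the United States and",
    'abroad. These early gold-seekers, called "Forty-Niners," traveled to',
    "California by sailing ship and in covered wagons across the continent,",
    "often facing substantial hardship on the trip",
])

for line in matching_lines_with_context(sample, r"Forty-Niners", context=1):
    print(line)
```

To combine several documents, the same function could be called on the text of each file and the results appended to one output file. The character-based variant covers the "50 surrounding characters" case from the question.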