Get help with installation and running here.
Moderators: DataMystic Support, Moderators
pheagila
Posts: 9 Joined: Mon Aug 18, 2008 9:04 pm
by pheagila » Mon Aug 18, 2008 9:14 pm
Hi all,
I would like to extract the text between two markers from many HTML files,
i.e. all the text between:
<!-- Start Results Section -->
.... Data .....
<!-- End Results Section -->
I would like all the extracted text combined together and output to abc.txt
Below are my current settings, but they are NOT working: the filter also copies a lot of data from 'outside' the tags.
Can anyone help me with what I am doing wrong? (Yes, I am new to TextPipe.)
Code: Select all
Restrict to between tags <<!-- Start Results Section -->>...<<!-- End Results Section -->>
| [X] Include text
| [X] Match case
| Max size: 65536
|
+--Merge output to file C:\1\abc.txt
Fixer
Posts: 25 Joined: Thu Jul 31, 2008 6:39 am
Location: European Union > Poland
by Fixer » Fri Aug 22, 2008 11:08 am
I almost always use a Perl pattern:
Code: Select all
Filter List
-----------
Filter options
| [ ] Log to file
| [X] Append to logfile
| Log filename: textpipe.log
| Threshold 500
|
|--Input from file(s)
| [ ] Confirm before processing each file
| [ ] Confirm before processing read/only files
| [ ] Delete input files after processing
| Process binary files
|
|--Perl pattern [<\!-- Start Results Section -->\r\n(.*)\r\n<\!-- End Results Section -->\r\n] with [$1\r\n]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 99999
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [X] UTF-8 Support
|
|--Remove blank lines
|
+--Merge output to file c:\mergefilename.txt
Files List
----------
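For readers more used to general-purpose regex engines, here is a minimal Python sketch of what the '.' matches newline and Maximum match (greedy) options appear to control. It assumes those options correspond roughly to Python's `re.DOTALL` flag and to the choice between `.*` and `.*?`; that mapping is an assumption, not something confirmed in this thread. The sample page and the `build` helper are hypothetical.

```python
import re

START = "<!-- Start Results Section -->"
END = "<!-- End Results Section -->"

# Hypothetical page containing two result sections.
PAGE = (
    "intro\n"
    f"{START}\nfirst\n{END}\n"
    "middle\n"
    f"{START}\nsecond\n{END}\n"
    "outro\n"
)


def build(greedy):
    """Build the marker pattern with a greedy or a lazy capture group."""
    quantifier = ".*" if greedy else ".*?"
    # re.DOTALL lets '.' cross line breaks, so a section may span many lines.
    return re.compile(
        re.escape(START) + "(" + quantifier + ")" + re.escape(END),
        re.DOTALL,
    )


# Lazy matching stops at the nearest end marker: one capture per section.
lazy_hits = build(greedy=False).findall(PAGE)

# Greedy matching runs to the last end marker: one capture spanning both
# sections, including the text and markers in between.
greedy_hits = build(greedy=True).findall(PAGE)

print(len(lazy_hits), len(greedy_hits))
```

With the greedy option left unchecked, as in the listing above, each section should be captured on its own.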
pheagila
Posts: 9 Joined: Mon Aug 18, 2008 9:04 pm
by pheagila » Sat Aug 23, 2008 5:20 pm
Fixer wrote: I almost always use perl pattern
Thanks, Fixer.
How do I import what you have typed above directly into TextPipe Pro?
Cheers
DataMystic Support
Site Admin
Posts: 2227 Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia
by DataMystic Support » Mon Aug 25, 2008 9:42 am
What is shown above is just a clipboard export and can't be input directly. Soon we will have an XML export/import facility.
The key is that you are using a restriction and that is not what it is intended for. Please read the help on restrictions.
You just need to use a search/replace filter with the 'Extract matches' option.
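To illustrate the difference between a plain search/replace and one that extracts matches, here is a small Python sketch. It assumes that 'Extract matches' outputs only the replacement text for each match and discards everything else; that reading of the option is an assumption, and the function names and sample page are hypothetical.

```python
import re

PATTERN = re.compile(
    r"<!-- Start Results Section -->(.*?)<!-- End Results Section -->",
    re.DOTALL,
)

# Hypothetical input page.
PAGE = (
    "nav bar\n"
    "<!-- Start Results Section -->DATA<!-- End Results Section -->\n"
    "footer\n"
)


def replace_only(text):
    """Plain search/replace: each match becomes '$1\\r\\n', the rest is kept."""
    return PATTERN.sub(lambda m: m.group(1) + "\r\n", text)


def extract_only(text):
    """Extract-style output: only the replacement for each match survives."""
    return "".join(m.group(1) + "\r\n" for m in PATTERN.finditer(text))


print(repr(replace_only(PAGE)))
print(repr(extract_only(PAGE)))
```

In the first form the navigation bar and footer still reach the output, which matches the symptom of data from 'outside' the tags being copied.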
pheagila
Posts: 9 Joined: Mon Aug 18, 2008 9:04 pm
by pheagila » Mon Aug 25, 2008 6:37 pm
DataMystic Support wrote: What is shown above is just a clipboard export and can't be input directly. Soon we will have an XML export/import facility.
The key is that you are using a restriction and that is not what it is intended for. Please read the help on restrictions.
You just need to use a search/replace filter with the 'Extract matches' option.
Thanks, DataMystic Support.
Can you give me a clipboard export example, like Fixer's, for "use a search/replace filter with the 'Extract matches' option"?
DataMystic Support
Site Admin
Posts: 2227 Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia
by DataMystic Support » Mon Aug 25, 2008 11:05 pm
Sure:
Code: Select all
|--Perl pattern [<!-- Start Results Section -->(.*)<!-- End Results Section -->] with [$1\r\n]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [X] Extract matches
| Maximum text buffer size 64096
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [ ] UTF-8 Support
|
+--Merge output to file C:\1\abc.txt
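For comparison outside TextPipe, the same idea (extract every section from many files and merge the results into one output file) can be sketched in Python. This is a rough equivalent under stated assumptions, not a TextPipe feature: the `merge_sections` name, the UTF-8 encoding, and the trimming of surrounding whitespace are all choices made for illustration.

```python
import re
from pathlib import Path

PATTERN = re.compile(
    r"<!-- Start Results Section -->(.*?)<!-- End Results Section -->",
    re.DOTALL,
)


def merge_sections(html_paths, out_path):
    """Append every results section found in html_paths to out_path.

    Returns the number of sections written.
    """
    count = 0
    # newline="" keeps the explicit "\r\n" terminators exactly as written.
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        for path in html_paths:
            # Assumption: files are UTF-8; undecodable bytes are replaced.
            text = Path(path).read_text(encoding="utf-8", errors="replace")
            for match in PATTERN.finditer(text):
                out.write(match.group(1).strip() + "\r\n")
                count += 1
    return count
```

A call such as `merge_sections(sorted(Path("pages").glob("*.html")), "abc.txt")` would process a hypothetical `pages` folder in filename order.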
pheagila
Posts: 9 Joined: Mon Aug 18, 2008 9:04 pm
by pheagila » Sun Aug 31, 2008 2:08 pm
Thanks, Support, but your Perl pattern filter returns 0 bytes:
Code: Select all
|--Perl pattern [<!-- Start Results Section -->(.*)<!-- End Results Section -->] with [$1\r\n]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [X] Extract matches
| Maximum text buffer size 99999
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [ ] UTF-8 Support
This filter seems to work
Code: Select all
|--Extract [<!-- Start Auction Results Section -->(.*)<!-- End Auction Results Section -->]
| [ ] Include line numbers
| [ ] Include filename
| [X] Match case
| [ ] Count matches
| Pattern type: 0
What do I need to change to get your Perl pattern filter to work?
Cheers
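The thread does not settle why the Perl pattern returns nothing. One difference visible between the two listings is the marker text: the Perl pattern searches for "Start Results Section", while the working Extract filter searches for "Start Auction Results Section". If the files really contain the "Auction" wording (an assumption), the first pattern can never match. A quick way to check a file, sketched in Python with a hypothetical `marker_report` helper and sample page:

```python
def marker_report(text, start, end):
    """Count literal occurrences of the start and end markers in text."""
    return {"start": text.count(start), "end": text.count(end)}


# Hypothetical page using the marker wording from the working Extract filter.
PAGE = (
    "<!-- Start Auction Results Section -->\n"
    "lot 1\n"
    "<!-- End Auction Results Section -->\n"
)

# Marker text as written in the Perl pattern listing.
perl_report = marker_report(
    PAGE, "<!-- Start Results Section -->", "<!-- End Results Section -->"
)

# Marker text as written in the Extract filter listing.
extract_report = marker_report(
    PAGE,
    "<!-- Start Auction Results Section -->",
    "<!-- End Auction Results Section -->",
)

print(perl_report, extract_report)
```

Zero counts for either marker mean the pattern has nothing to match, whatever the other options are set to.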
DataMystic Support
Site Admin
Posts: 2227 Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia
by DataMystic Support » Mon Sep 01, 2008 3:51 pm
Please just send us an email referencing this discussion and we can send you a filter.