Hi Simon,
Remember that I need to output 2 files, the transformed HTML file + a text file with the extracted text. How do I do that without the secondary output?
Search found 7 matches
- Wed Apr 22, 2009 7:50 pm
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
- Tue Apr 21, 2009 6:28 am
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
Re: Extracting text from HTML, replacing with random codes
My question now is, could I run this stuff with TextPipe Lite?
I'm only using the "Find Perl pattern" and secondary output functions in the filter I described, but I need a search/replace list (with a tab-delimited text file) to reverse the process. Does TextPipe Lite have that? The Standard and Pro versions ...
- Mon Apr 20, 2009 6:50 am
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
Re: Extracting text from HTML, replacing with random codes
I revised the initial perl pattern to: >([^<\r][^<\r].*)<
This allows capturing text strings that start with a space, but not something like:
> <IMG...><
Neither < nor a carriage return (\r) is allowed as the 1st or 2nd char of the string.
I revised again to >( | |)([^<\r][^<\r].*|)( | |)(\r\n ...
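For anyone who wants to try the first revised pattern outside TextPipe, here is a minimal sketch in Python. It assumes Python's `re` module treats this pattern the same way TextPipe's Perl pattern engine does, which I have not verified; `extract_text` is just a name made up for the example.

```python
import re

# First revised pattern from the post: neither of the first two captured
# characters may be "<" or a carriage return.
PATTERN = re.compile(r">([^<\r][^<\r].*)<")

def extract_text(line):
    """Return the captured text for one line, or None if nothing matches."""
    match = PATTERN.search(line)
    return match.group(1) if match else None

print(extract_text("<font>You can type sample text</font>"))
print(extract_text("<p> Hello</p>"))             # text starting with a space
print(extract_text("<td> <IMG src='x'></td>"))   # space followed by a tag
```

One thing to watch: `.*` is greedy, so on a line with several tags, such as `<b>one</b><i>two</i>`, the capture runs to the last `<` on the line and picks up the tags in between.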
- Mon Apr 20, 2009 5:29 am
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
Re: Extracting text from HTML, replacing with random codes
Solved it! :D
At the end of
+--Perl pattern [^(.+)$] with [[@randomdigit@@randomdigit@@randomdigit@@randomdigit@@randomdigit@]]
I added a tab (\t) + $1 (text) + return (\r\n).
Then I output this (only tried output to clipboard for testing). The return at the end makes sure each code/text pair ...
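The code + tab + text + return idea can be sketched in Python roughly like this. It is not TextPipe syntax; `make_pair` is a hypothetical helper, and I am assuming the code is five random digits inside one pair of square brackets, which is my reading of the replacement string above.

```python
import random

def make_pair(text, rng):
    """Build a bracketed 5-digit code and the 'code<TAB>text<CR><LF>' line."""
    digits = "".join(rng.choice("0123456789") for _ in range(5))
    code = "[" + digits + "]"          # assumed code format, e.g. [48213]
    return code, code + "\t" + text + "\r\n"

rng = random.Random()                  # pass random.Random(seed) for repeatable runs
code, line = make_pair("text line 1", rng)
print(repr(line))
```

With purely random digits, two lines can end up with the same code (there are only 100,000 possible values), so it is probably worth checking for duplicates before using the list to reverse the process.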
- Mon Apr 20, 2009 3:25 am
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
Re: Extracting text from HTML, replacing with random codes
Either I'm doing something wrong, or [^<>]*? targets everything inside or outside <>.
If I make a "find pattern" for [^<>]*? and replace with $0, then add a subfilter replacing . with @randomdigit I get something like:
<4845>856202931309492836753331170<66489>
from
<font>You can type sample text in ...
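The result above makes sense once you notice that the character class only excludes the angle brackets themselves, so tag names like font and /font are also runs of "non-bracket" characters. A small Python sketch of the same effect follows (assuming Python's `re` behaves like TextPipe's engine here; I use `+` instead of `*?` because a lazy `*?` only produces empty matches with `findall`):

```python
import re

html = "<font>You can type sample text</font>"

# Every run of characters that are not < or >, wherever it sits.
everything = re.findall(r"[^<>]+", html)

# Only runs that sit between a closing > and the next opening <.
between_tags = re.findall(r">([^<>]+)<", html)

print(everything)
print(between_tags)
```

Anchoring the pattern on the surrounding `>` and `<` is what limits the match to the text outside the tags.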
- Sun Apr 19, 2009 4:50 am
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
Re: Extracting text from HTML, replacing with random codes
That pattern doesn't seem to work...
- Tue Apr 14, 2009 1:50 am
- Forum: TextPipe Tips and Tricks, Questions and Support
- Topic: Extracting text from HTML, replacing with random codes
- Replies: 10
- Views: 9444
Extracting text from HTML, replacing with random codes
I'm wondering if TextPipe can do this. I want to extract every line of text from an HTML file, replace each with a short (5 characters max) random or sequential code, and output every code + text line to a separate text file.
So, I have an HTML file like this:
<.....> text line 1 </.....>
<.....> text ...
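To make the goal concrete, here is a rough Python sketch of the whole task, not a TextPipe filter. The function name `encode_html`, the pattern, the `<p>` tags in the sample, and the choice of sequential 5-digit codes are all assumptions made for the illustration.

```python
import re

# Text between tags on one line that contains at least one non-space character.
TEXT_BETWEEN_TAGS = re.compile(r">([^<>\r\n]*[^<>\s][^<>\r\n]*)(?=<)")

def encode_html(html):
    """Return (transformed_html, listing) with each text run replaced by a code."""
    pairs = []

    def swap(match):
        code = f"{len(pairs) + 1:05d}"      # sequential code, 5 characters
        pairs.append((code, match.group(1)))
        return ">" + code

    transformed = TEXT_BETWEEN_TAGS.sub(swap, html)
    # One "code<TAB>text" line per extracted string, for the separate text file.
    listing = "".join(f"{code}\t{text}\r\n" for code, text in pairs)
    return transformed, listing

sample = "<p> text line 1 </p>\r\n<p> text line 2 </p>\r\n"
transformed, listing = encode_html(sample)
print(transformed)
print(listing)
```

The two returned strings correspond to the two files: the transformed HTML and the tab-delimited code/text list, which could later serve as a search/replace list to put the text back.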