big red apple
15 speakers
cheese *and* ham only $15
WOW!!! look-here @@fantastic@@
get in!! only 12 dollars
spades for snow .
3.1" long stick
you are the winner.
cheese and ham only £15.
and, after running it through TextPipe, have the list look like this:
big red apple
15 speakers
cheese and ham only 15
wow look here fantastic
get in only 12 dollars
spades for snow
3.1 long stick
you are the winner
So TextPipe:
Removes all characters that aren't a-z, A-Z or 0-9
Removes full stops UNLESS they are inside a number (a full stop at the end of a number is still removed)
Removes multiple spaces so there is exactly one space between each word
Removes all duplicate lines after the steps above have been performed
Randomises the order of the list
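For comparison, the steps above can be sketched outside TextPipe as a short Python function. This is only an illustration of the intended result, not TextPipe's own implementation. The names `clean_lines` and `seed` are made up for the example, and two behaviours are assumptions inferred from the sample output rather than from the step list: text is lowercased ("WOW" becomes "wow") and hyphens become spaces ("look-here" becomes "look here").

```python
import random
import re


def clean_lines(lines, seed=None):
    """Clean, de-duplicate and shuffle a list of text lines."""
    cleaned = []
    seen = set()
    for line in lines:
        # Assumption from the sample output: everything ends up lowercase.
        text = line.lower()
        # Assumption from the sample output: hyphens and tabs act as word breaks.
        text = re.sub(r"[\s\-]+", " ", text)
        # Keep only letters, digits, full stops and spaces.
        text = re.sub(r"[^a-z0-9. ]", "", text)
        # Drop a full stop unless it has a digit on BOTH sides (so "3.1" survives).
        text = re.sub(r"(?<!\d)\.|\.(?!\d)", "", text)
        # Collapse runs of spaces and trim both ends of the line.
        text = re.sub(r" +", " ", text).strip()
        # Skip empty lines and lines already seen.
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    # Randomise the order; pass a seed for a repeatable shuffle.
    random.Random(seed).shuffle(cleaned)
    return cleaned
```

Calling `clean_lines(open("list.txt").read().splitlines())` on a file holding the sample list (the file name is hypothetical) should give the eight cleaned lines in random order, with the two "cheese and ham" lines merged into one.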
|
|--Perl pattern [[a-z0-9\.\r\n\- ]] with [$0]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [X] Extract matches
| Maximum text buffer size 4096
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [ ] UTF-8 Support
|
|--Perl pattern [[ \t\-]+?] with [ ]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [ ] UTF-8 Support
|
|--Remove blanks from Start of Line
|
|--Remove blanks from End of Line
|
|--Perl pattern [([^\d])\.([^\d])] with [$1$$2$]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [ ] UTF-8 Support
|
|--Remove duplicate lines
| [X] Ignore case
| Start column 1
| Length 4096
| [ ] Include One
| format: %d %s
|
|--Randomize lines
|
I'm still a little unsure how to use the steps you've posted for me, but I'm trying to fumble through so hopefully it will become clearer once I play with it a little longer...