I am trying to write a regular expression to match all adjacent word pairs and triplets in any given string of words. For example, the sentence:
"registrars have been contracted to perform services at very low prices" would produce the following word pairs:
"registrars have", "have been", "been contracted", "contracted to", "to perform", "perform services", etc.
or the following triplets:
"registrars have been", "have been contracted", "been contracted to", "contracted to perform", etc.
I can extract the first two words from a search string with an expression such as:
(.*Subject: )(\w* ){2}
and then filter out the first back reference, but I am stuck writing an expression that will pull all of the consecutive word pairs from a string.
Any suggestions on how this can be done with regex alone?
Thanks,
Jeff
Word Pair or Word Triplet Extracts
Moderators: DataMystic Support, Moderators
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Hi Jeff,
I'm pretty sure you can't do this with regex alone. You can, however, get a regex to match each word, and then use a VBScript subfilter to process those words, keeping an array of two or three words and outputting them.
It looks like you're trying to generate a word concordance, which is something we started building into TP a long time ago but never finished to our satisfaction.
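To illustrate the "match each word, keep the last two or three" idea, here is a minimal sketch in Python rather than VBScript. It is not a TextPipe filter; the function name `word_ngrams` is made up for this example, and it assumes a word is any run of `\w` characters, so "you're" would be split into "you" and "re".

```python
import re
from collections import deque


def word_ngrams(text, n):
    """Return every run of n adjacent words in text, in order.

    Hypothetical helper for illustration only: a regex matches each
    word, and a fixed-size window remembers the most recent n words.
    """
    window = deque(maxlen=n)  # oldest word drops out automatically
    ngrams = []
    for match in re.finditer(r"\w+", text):
        window.append(match.group())
        if len(window) == n:  # window is full, so emit one n-gram
            ngrams.append(" ".join(window))
    return ngrams


sentence = ("registrars have been contracted to perform "
            "services at very low prices")
pairs = word_ngrams(sentence, 2)
triplets = word_ngrams(sentence, 3)
print(pairs[:3])     # ['registrars have', 'have been', 'been contracted']
print(triplets[:2])  # ['registrars have been', 'have been contracted']
```

The same loop handles pairs and triplets; only the window size changes. A string with fewer than n words simply produces an empty list.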
reply
Hmmmm, interesting Q/A. I don't know how to write VBScript, but I was able to create two filters that appear to have given me a solid jump on the problem. Simon, you know my email; send me and the guy who asked the original question a note, and I'll reply with my filters. Perhaps we can nip this one... what do you think?
"registrars have been","have been contracted","been contracted to","contracted to perform","to perform services","perform services at","services at very","at very low","very low prices"
"It looks like","looks like youre","like youre trying","youre trying to","trying to generate","to generate a","generate a word","a word concordance","word concordance which","concordance which is","something we started","we started buiding","started buiding into","buiding into TP","into TP along","TP along time","along time ago","time ago but","ago but never","but never finished"
joseph ring