replace everything after a particular character
Posted: Fri Aug 22, 2003 4:51 am
by mcfillen
I have read the FAQ on how to replace everything after a particular character but still don't get it. I am trying to delete all of the text after a pattern (cleaning up URLs for spam filter lists). How would I delete everything after the .COM in:
http://www.domain.COM/example/index.htm
Easy one!
Posted: Tue Sep 23, 2003 9:14 am
by jring
http://www.domain.COM/example/index.htm
OK, this takes a few steps, but it works really well:
1. Search Replace Exact:
S: http://
R: http%%%
2. Search Replace Exact:
S: /
R: <carriage return>
3. Search Replace Exact:
S: http%%%
R: http://
4. Extract URLS.
5. Remove Blanks from beginning of line.
6. Remove Blanks from end of line.
7. Remove Blank lines.
If you are working with very large files, you will bump into situations where this doesn't work, and will need to refine your strategy. However, this will definitely get you off the ground.
J-
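For anyone who would rather script this outside the editor, here is a rough Python sketch of the same idea. It is not the method from the steps above, only an approximation of it: trim_after_host mirrors the placeholder trick (protect "http://", cut at the first remaining "/", restore), while clean_url_list uses a regular expression instead and also drops blank lines. The names trim_after_host, clean_url_list and HOST_ONLY are made up for this example, and the sketch assumes one URL per line starting with http:// or https://.

```python
import re

# Hypothetical pattern: capture scheme plus host, discard the rest of the line.
HOST_ONLY = re.compile(r"^(https?://[^/\s]+).*$", re.IGNORECASE)


def trim_after_host(url):
    """Mirror the placeholder steps for a single URL."""
    # Step 1: hide the slashes in the scheme behind a placeholder.
    protected = url.strip().replace("http://", "http%%%", 1)
    # Step 2: keep only the text before the first remaining slash.
    head = protected.split("/", 1)[0]
    # Step 3: put the scheme back.
    return head.replace("http%%%", "http://", 1)


def clean_url_list(text):
    """Regex variant: trim every line to scheme + host, skip blank lines."""
    cleaned = []
    for line in text.splitlines():
        line = line.strip()  # remove blanks at both ends of the line
        if not line:
            continue  # remove blank lines
        cleaned.append(HOST_ONLY.sub(r"\1", line))
    return cleaned


if __name__ == "__main__":
    print(trim_after_host("http://www.domain.COM/example/index.htm"))
    for url in clean_url_list("http://www.domain.COM/example/index.htm\n\n"):
        print(url)
```

Lines that do not start with http:// or https:// pass through clean_url_list unchanged, so it is worth checking the output before feeding it to a filter list.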