Here's an article which should be helpful to others:
http://icu-project.org/docs/papers/iuc26_regexp.pdf
Using regular expressions with Unicode text can be a nightmare, largely because too much of the public documentation covers their use with ANSI characters only.
This 18-page article from 2004 rectifies a lot of that.
Analyzing Unicode Text with Regular Expressions
Moderators: DataMystic Support, Moderators
- Posts: 988
- Joined: Sun Dec 09, 2007 2:49 am
- Location: UK
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Re: Analyzing Unicode Text with Regular Expressions
Thanks David,
TextPipe uses the PCRE (Perl Compatible Regular Expressions) library, so all of the Unicode regex functions are implemented. Generally, you need to check the 'Allow UTF-8' option of the Perl or EasyPattern replacement.
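For anyone wondering what difference a Unicode-aware mode makes in practice, here is a rough sketch. It uses Python's built-in re module rather than PCRE or TextPipe, and the re.ASCII flag is only my stand-in for leaving an option like 'Allow UTF-8' unchecked, so treat it as an analogy and not as a description of how TextPipe behaves.

```python
import re

# Sample text with accented Latin letters and two CJK characters,
# written as escapes so the code points are unambiguous.
text = "na\u00efve caf\u00e9 \u6771\u4eac"

# ASCII-only mode: \w matches just [a-zA-Z0-9_], so words are
# split wherever a non-ASCII letter appears.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

# Unicode-aware mode (the default for str patterns in Python 3):
# \w also matches letters outside the ASCII range.
unicode_words = re.findall(r"\w+", text)

print(ascii_words)    # fragments such as 'na', 've', 'caf'
print(unicode_words)  # the three whole words
```

The same pattern gives quite different results depending on the mode, which is why it is worth checking the Unicode setting before blaming the pattern itself.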