Unicode line separator U+2028

Get help with installation and running here.

Moderators: DataMystic Support, Moderators

dfhtextpipe
Posts: 988
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Unicode line separator U+2028

Post by dfhtextpipe »

How does TextPipe handle the Unicode line separator U+2028?

e.g. when the Files to be Processed use it as the EOL marker.

Assume that these are Unicode files, encoded in either UTF-16 LE or UTF-8 (with or without a BOM).

Also, how is it handled in Perl pattern matching?
e.g. in the Patterns options button [...] dialog, which includes the tick option '.' matches newline.

David

PS. The attachment contains a simple TP filter to convert EOLs to U+2028.
Attachments
Change EOLs to U+2028.zip
TextPipe filter to change EOLs to U+2028.
(762 Bytes) Downloaded 572 times
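
For readers without TextPipe, the conversion described above can be sketched in Python. This is only an illustration of the idea, not the contents of the attached filter; the function name `eols_to_line_separator` is hypothetical. It also shows the bytes U+2028 occupies in the two encodings mentioned in the question.

```python
LINE_SEPARATOR = "\u2028"  # Unicode LINE SEPARATOR


def eols_to_line_separator(text: str) -> str:
    """Replace CRLF, lone CR and lone LF with U+2028 (hypothetical helper)."""
    # Handle CRLF first so that it becomes one separator, not two.
    return (
        text.replace("\r\n", LINE_SEPARATOR)
        .replace("\r", LINE_SEPARATOR)
        .replace("\n", LINE_SEPARATOR)
    )


converted = eols_to_line_separator("one\r\ntwo\nthree\r")

# Byte sequences for U+2028 in the encodings mentioned in the post.
utf8_bytes = LINE_SEPARATOR.encode("utf-8")         # E2 80 A8
utf16le_bytes = LINE_SEPARATOR.encode("utf-16-le")  # 28 20
```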
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Unicode line separator U+2028

Post by DataMystic Support »

Thanks David - we've included your filter in a new 'Unicode' filter subfolder.

I don't believe that PCRE (the library we use) pattern matching handles any line endings other than \r, \r\n and \n.
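
To illustrate why the engine's newline convention matters, here is a small sketch using Python's `re` module. It is not PCRE and not TextPipe, so it says nothing definite about either; it only shows how an engine that treats just `\n` as a newline behaves when it meets U+2028.

```python
import re

text = "first\u2028second\nthird"

# Without DOTALL, '.' stops only at '\n', so it runs straight across U+2028.
dot_match = re.match(r".*", text).group(0)

# With DOTALL (the "'.' matches newline" behaviour), '.' also crosses '\n'.
dotall_match = re.match(r".*", text, re.DOTALL).group(0)

# In MULTILINE mode '^' anchors after '\n' only, not after U+2028,
# so "second" is not seen as the start of a line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
```

Here `dot_match` covers "first", the U+2028 and "second", while `line_starts` contains only "first" and "third".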
dfhtextpipe
Posts: 988
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Re: Unicode line separator U+2028

Post by dfhtextpipe »

Well, well, well.

The help page entitled "Unicode Pattern Reference" includes this:

Definitions

Separator - any one of U+2028, U+2029, NL, CR.

So this suggests that TextPipe ought to be able to handle U+2028 and U+2029.

Something overlooked, perhaps?
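
As a rough sketch of what splitting on that "Separator" definition could look like, here is a Python example. It assumes that "NL" in the help text means LINE FEED (U+000A), which the help page does not spell out, and the name `SEPARATOR` is purely illustrative.

```python
import re

# Assumption: "NL" in the help text means LINE FEED (U+000A).
# CRLF is listed first so that it counts as a single separator.
SEPARATOR = re.compile("\r\n|[\n\r\u2028\u2029]")

lines = SEPARATOR.split("a\u2028b\u2029c\r\nd\ne")
```

With this input, `lines` holds the five pieces "a" to "e".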

David

PS. Of the various Unicode-compatible text editors (for Windows) that I use regularly, only SC UniPad handles these correctly.
dfhtextpipe
Posts: 988
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Re: Unicode line separator U+2028

Post by dfhtextpipe »

FWIW, here's a similar filter to change EOLs to the U+2029 Paragraph Separator.
Attachments
Change EOLs to U+2029.zip
TP filter to change EOLs to U+2029
(767 Bytes) Downloaded 792 times
David