Unicode sorts?


Moderators: DataMystic Support, Moderators

dfhtextpipe
Posts: 986
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Unicode sorts?

Post by dfhtextpipe »

Sorting lines of Unicode text is a huge topic in its own right, but TextPipe doesn't yet offer a way to sort UTF-8 (or other encodings), even on the basis of codepoint values.

Although a codepoint sort is no substitute for language-aware sorting, it would still have some useful applications, such as analysing character frequencies in Unicode text files.
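To illustrate what a plain codepoint sort would give, here is a minimal Python sketch (not TextPipe functionality). The function names are hypothetical, and it assumes the input is valid UTF-8. Python compares strings by code point, so no collation rules are involved.

```python
from collections import Counter


def sort_lines_by_codepoint(data: bytes) -> bytes:
    """Decode UTF-8, sort the lines by code point value, re-encode as UTF-8."""
    lines = data.decode("utf-8").splitlines()
    lines.sort()  # str comparison in Python is by code point, not by locale
    return "\n".join(lines).encode("utf-8")


def char_frequencies(data: bytes) -> list[tuple[str, int]]:
    """Count each character (line breaks excluded), listed in code point order."""
    counts = Counter(ch for ch in data.decode("utf-8") if ch not in "\r\n")
    return sorted(counts.items())


if __name__ == "__main__":
    sample = "b\né\na\nZ\n".encode("utf-8")
    print(sort_lines_by_codepoint(sample).decode("utf-8"))
    print(char_frequencies(sample))
```

Note that uppercase letters sort before lowercase ones and accented letters sort after both, which is exactly why this is no replacement for language-aware collation.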

Cf. my recent post on this topic in the Help and Support section.
David
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Unicode sorts?

Post by DataMystic Support »

Thanks David - we'll look into adding it shortly.
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Unicode sorts?

Post by DataMystic Support »

Hi David,

Windows does not provide native functions for sorting anything other than ANSI and Unicode (UTF-16) strings.

So UTF-8 is out of the question. You would need to convert the text from UTF-8 to UTF-16LE first, then sort (with a new widestring sort that we can add), and then convert it back afterwards.
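As a rough illustration of that three-step round trip, here is a Python sketch. It is not TextPipe's implementation: the function names are hypothetical, and it assumes the wide sort compares raw 16-bit code units (an ordinal comparison) rather than applying locale rules.

```python
import struct


def utf16_units(line: str) -> tuple[int, ...]:
    """Return the UTF-16 code units of a line, used as an ordinal sort key."""
    raw = line.encode("utf-16-le")
    return struct.unpack(f"<{len(raw) // 2}H", raw)


def sort_utf8_via_utf16le(data: bytes) -> bytes:
    """Convert UTF-8 to UTF-16LE, sort lines by code unit, convert back."""
    wide = data.decode("utf-8").encode("utf-16-le")  # step 1: UTF-8 -> UTF-16LE
    lines = wide.decode("utf-16-le").splitlines()
    lines.sort(key=utf16_units)                      # step 2: wide (code unit) sort
    return "\n".join(lines).encode("utf-8")          # step 3: back to UTF-8


if __name__ == "__main__":
    # U+FF21 is a single code unit; U+1F600 is a surrogate pair (D83D DE00).
    sample = "\uFF21\n\U0001F600".encode("utf-8")
    print(sort_utf8_via_utf16le(sample).decode("utf-8"))
```

One consequence worth knowing: a code unit sort does not always match a codepoint sort. Characters outside the Basic Multilingual Plane are stored as surrogate pairs (0xD800 to 0xDFFF), so they sort before characters in the range U+E000 to U+FFFF even though their code points are higher.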

How does that sound?
dfhtextpipe
Posts: 986
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

Re: Unicode sorts?

Post by dfhtextpipe »

Might be a useful (though somewhat awkward) workaround.

For now I'm content to just use Notepad++ | TextFX Tools | Sort.

Still, I can imagine that other users would benefit from a wide sort of UTF-16LE.
David