UTF-8 text in Comment filters not preserved in TextPipe v11.3

Moderators: DataMystic Support, Moderators

dfhtextpipe
Posts: 986
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

UTF-8 text in Comment filters not preserved in TextPipe v11.3

Post by dfhtextpipe »

I had some Hebrew text in a comment filter.

        <w>גָּח֜<s t='large'>וֹ</s><x>5</x>ן</w>
        <w>מִשְׁפָּטָ֖/<s t='large'>ן</s><x>5</x></w>
        <w>שְׁמַ֖<s t='large'>ע</s><x>5</x></w>
        <w>אֶחָֽ<s t='large'>ד׃</s><x>5</x></w>
        <w>מְ<s t='suspended'>נַ</s><x>7</x>שֶּׁ֜ה</w>
        <w>אֹ֖רֶ<s t='small'>ן</s><x>6</x></w>
        <w>וּ/נְבֽוּשַׁזְבָּ<s t='small'>ן֙</s><x>6</x></w>
        <w>מִ/יָּ֑<s t='suspended'>עַ</s><x>7</x>ר</w>
        <w>וְ֝/נִרְגָּ֗<s t='small'>ן</s><x>6</x></w>
        <w>רְשָׁ<s t='suspended'>עִ֣</s><x>7</x>ים</w>
        <w>מֵ/רְשָׁ<s t='suspended'>עִ֣</s><x>7</x>ים</w>
After saving the filter with TextPipe 11.3, each Hebrew character was replaced by a question mark (?).

        <w>?????<s t='large'>??</s><x>5</x>?</w>
        <w>???????????/<s t='large'>?</s><x>5</x></w>
        <w>??????<s t='large'>?</s><x>5</x></w>
        <w>?????<s t='large'>??</s><x>5</x></w>
        <w>??<s t='suspended'>??</s><x>7</x>??????</w>
        <w>?????<s t='small'>?</s><x>6</x></w>
        <w>??/??????????????<s t='small'>??</s><x>6</x></w>
        <w>??/????<s t='suspended'>??</s><x>7</x>?</w>
        <w>???/????????<s t='small'>?</s><x>6</x></w>
        <w>?????<s t='suspended'>???</s><x>7</x>??</w>
        <w>??/?????<s t='suspended'>???</s><x>7</x>??</w>
This is an unsatisfactory downgrade of functionality.
I don't think this is an unavoidable consequence of the new JSON text format for .fll files!
These files are currently encoded as ANSI rather than UTF-8, which would explain why characters outside the code page are lost.
I understand that the JSON data format should support UTF-8 encoding.
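
To illustrate the kind of loss described above, here is a minimal Python sketch. It is not TextPipe's code; it only assumes that the text was forced into a single-byte Western code page (Windows-1252 is used here as a stand-in for "ANSI"), and the sample string is five Hebrew code points chosen for illustration.

```python
# Hypothetical illustration only: this is not how TextPipe is implemented.
# Five Hebrew code points (letters plus vowel points and an accent).
hebrew = "\u05d2\u05b8\u05bc\u05d7\u059c"

# Windows-1252 has no Hebrew characters, so with errors="replace"
# every code point is substituted with a single "?" byte.
lossy = hebrew.encode("cp1252", errors="replace")

# UTF-8 can represent every Unicode code point, so the round trip is lossless.
utf8_round_trip = hebrew.encode("utf-8").decode("utf-8")

print(lossy)                      # b'?????'
print(utf8_round_trip == hebrew)  # True
```

Each code point, including each combining mark, becomes one question mark, which is consistent with the run lengths of ? in the saved filter.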

David
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: UTF-8 text in Comment filters not preserved in TextPipe v11.3

Post by DataMystic Support »

Thanks, David. This is a major issue that we are working to resolve ASAP.
dfhtextpipe

Re: UTF-8 text in Comment filters not preserved in TextPipe v11.3

Post by dfhtextpipe »

Thanks, Simon.

Hope you can fix it soon before too many of my filters used for current WIP get smurfed.

David

Re: UTF-8 text in Comment filters not preserved in TextPipe v11.3

Post by DataMystic Support »

v11.4 has just been released. Please let us know how it goes.
dfhtextpipe

Re: UTF-8 text in Comment filters not preserved in TextPipe v11.3

Post by dfhtextpipe »

Hi Simon,

I have just installed v11.4 but immediately hit two problems with filters saved yesterday from v11.3.

Problem 1
One filter could not be opened at first. This turned out to be due to a single character in a comment filter.
One of my comments contained the character ¦ (U+00A6 BROKEN BAR, a parted rule in typography).
TextPipe 11.3 had saved this improperly, encoded as the single byte \xA6 rather than the UTF-8 sequence \xC2\xA6.

After I used Notepad++ to correct the encoding error, I could then open the filter with TextPipe 11.4.

NB. Other filters saved yesterday from v11.3 could be opened with v11.4.
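
For anyone else with filters saved by v11.3, the byte-level problem in Problem 1 can be sketched in Python as follows. This is only a rough sketch: the helper names `is_valid_utf8` and `repair` are hypothetical, and it assumes the stray bytes came from Windows-1252. It is not a DataMystic tool, so work on a backup copy.

```python
broken_bar = "\u00a6"  # the broken bar character

as_ansi = broken_bar.encode("cp1252")  # one byte: b'\xa6'
as_utf8 = broken_bar.encode("utf-8")   # two bytes: b'\xc2\xa6'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def repair(data: bytes, legacy: str = "cp1252") -> bytes:
    """Re-encode legacy single-byte data as UTF-8.

    Data that is already valid UTF-8 is returned unchanged.
    Assumption: the whole input uses the one legacy code page.
    """
    if is_valid_utf8(data):
        return data
    return data.decode(legacy).encode("utf-8")


print(is_valid_utf8(as_ansi))  # False: a lone 0xA6 is a continuation byte
print(repair(as_ansi) == as_utf8)  # True
```

A lone \xA6 is not a valid UTF-8 sequence, which would explain why a strict UTF-8 reader rejects the file until the byte is re-encoded.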

Problem 2
TextPipe v11.4 saves .fll files encoded as UTF-8 with BOM.
IMHO, they should be encoded as UTF-8 without BOM, which is the commonly recommended practice nowadays.
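
For reference, the difference between the two encodings is just three leading bytes. A small Python sketch, in which the sample JSON text and the `strip_bom` helper are made up for illustration and are not the actual .fll layout:

```python
import codecs

# Hypothetical file content, not the real .fll structure.
text = '{"comment": "\u00a6"}'

with_bom = text.encode("utf-8-sig")  # prefixed with EF BB BF
without_bom = text.encode("utf-8")   # no prefix


def strip_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM if present; otherwise return data as-is."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


print(with_bom[:3] == codecs.BOM_UTF8)     # True
print(strip_bom(with_bom) == without_bom)  # True

# Decoding with "utf-8-sig" accepts input with or without a BOM.
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8-sig"))  # True
```

A reader that decodes with a BOM-tolerant codec can open either form, so dropping the BOM on save need not break files that were saved with one.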

Solved
I have now tested saving a filter with some fresh Hebrew text pasted again into one of the appropriate comment filters.
The Unicode Hebrew text was preserved.

Best regards,

David

Re: UTF-8 text in Comment filters not preserved in TextPipe v11.3

Post by DataMystic Support »

Hi David, thanks for the confirmation and validation.

We will consider whether to save without BOM for a future release.