In find type filters there is a dialog for Maximum match size.
For a newly inserted filter, the default is always 4096 bytes, yet the legend reads Maximum match size (default is 32768 bytes).
The mouseover tooltip says,
"If the text you are trying to match is longer than 4096 characters,
increase this value to the length of text you are trying to match"
Software bug:
The legend is clearly wrong by referring to a "default" value. This is a UI error.
Yet there does seem to be an absolute maximum limit in terms of how these filters work.
Entering values higher than 32768 seems to make no difference to what is found.
This raises the question:
Why allow the user to enter a value beyond what the filter is capable of supporting?
Application issue:
For the use I have in mind, 32768 is too small as an absolute limit for the maximum match size.
I'm using a filter to detect matched pairs of double angle quotation marks.
Some of the chunkier quotations in the text file are longer than 32768 characters.
It would seem that too narrow a variable type is being used for the match counter.
The programmed type is probably a 16-bit integer rather than a long integer. The latter would be a real improvement.
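To check whether a file really contains quotations longer than 32768, the span sizes can be measured outside TextPipe. The following is a minimal Python sketch, not TextPipe code; the function name `quotation_sizes` is made up for illustration, and it assumes quotations are not nested. It reports both characters and UTF-8 bytes, since the legend speaks of bytes while the tooltip speaks of characters.

```python
def quotation_sizes(text, open_q="«", close_q="»", encoding="utf-8"):
    """Yield (char_count, byte_count) for each open_q ... close_q span.

    Both quotation marks are included in the counts. Nested quotations
    are not handled: a second opening mark inside a span is ignored.
    """
    start = None
    for i, ch in enumerate(text):
        if ch == open_q and start is None:
            start = i
        elif ch == close_q and start is not None:
            span = text[start:i + 1]
            yield len(span), len(span.encode(encoding))
            start = None


if __name__ == "__main__":
    sample = "a «bc» d «" + "x" * 40000 + "»"
    for chars, size in quotation_sizes(sample):
        flag = "exceeds 32768" if size > 32768 else "ok"
        print(chars, size, flag)
```

The largest byte count found is the value the Maximum match size field would need to cover.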
David Haslam
Maximum match size?
Moderators: DataMystic Support, Moderators
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Re: Maximum match size?
Hi David,
1. We've fixed the default showing in the caption (thanks for that)
2. The coding allows for a match size of 2GB, although this would likely be very slow. If you have an example where the match size is large and does not work I'd be happy to check it for you.
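A reproducible example for a large match could be produced with a short script. The sketch below is only an assumption about what a useful test file might look like: one «...» quotation whose length exceeds 32768, spread over many lines. The helper name `make_sample` and the output filename are hypothetical. Python's `re` module is used as an independent check that the span is matchable in principle; it says nothing about TextPipe's own engine.

```python
import re
from pathlib import Path


def make_sample(inner_chars, line_width=80):
    """Build text with one «...» quotation holding inner_chars filler
    characters, wrapped into lines of line_width characters."""
    filler = "x" * inner_chars
    lines = [filler[i:i + line_width] for i in range(0, inner_chars, line_width)]
    return "before\n«" + "\n".join(lines) + "»\nafter\n"


sample = make_sample(40000)

# Non-greedy match across newlines, from the opening to the closing mark.
match = re.search(r"«.*?»", sample, flags=re.DOTALL)

if __name__ == "__main__":
    # Hypothetical filename; the file can then be fed to the filter under test.
    Path("large_match_sample.txt").write_text(sample, encoding="utf-8")
    print(len(match.group(0).encode("utf-8")), "bytes in the quotation")
```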
-
- Posts: 988
- Joined: Sun Dec 09, 2007 2:49 am
- Location: UK
Re: Maximum match size?
Thanks, Simon.
The original task involved searching for matching pairs of «quotation marks» over many lines of Arabic text.
Today, I adopted a different approach for the issue I was using TextPipe to investigate.
I used TextPipe on the RTF files instead of the UTF-8 text files, and inserted color highlighting. Shared in case it might inspire others facing a similar challenge.
Code:
Comment...
| Special filter to highlight all quotations in a Wordpad RTF file
|
|--Comment...
| | Insert the color table to define yellow highlighting
| |
| +--Comment...
| | highlight1 = yellow
| | highlight2 = red
| | highlight3 = green
| | highlight4 = magenta
| |
| +--Insert lines at line 2 [{\\\\colortbl ;\\\\red255\\\\green255\\\\blue0;\\\\red255\\\\green0\\\\blue0;\\\\red0\\\\green255\\\\blue0;\\\\red255\\\\green0\\\\blue255;}\r\n]
|
|--Comment...
| | Highlight text from each left pointing double angle quotation mark.
| | Unhighlight at each right pointing double angle quotation mark.
| |
| +--Perl pattern [^(.*)(\\'ab|\\'bb)(.*)$] with []
| | [X] Match case
| | [ ] Whole words only
| | [ ] Case sensitive replace
| | [ ] Prompt on replace
| | [ ] Skip prompt if identical
| | [ ] First only
| | [ ] Extract matches
| | Maximum text buffer size 32768
| | [X] Maximum match (greedy)
| | [ ] Allow comments
| | [ ] '.' matches newline
| | [X] UTF-8 Support
| |
| |--Perl pattern [(\\'ab)] with [\\highlight1$1]
| | [X] Match case
| | [ ] Whole words only
| | [ ] Case sensitive replace
| | [ ] Prompt on replace
| | [ ] Skip prompt if identical
| | [ ] First only
| | [ ] Extract matches
| | Maximum text buffer size 4096
| | [ ] Maximum match (greedy)
| | [ ] Allow comments
| | [ ] '.' matches newline
| | [X] UTF-8 Support
| |
| +--Perl pattern [(\\'bb)] with [$1\\highlight0]
| [X] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [ ] '.' matches newline
| [X] UTF-8 Support
|
+--Comment...
| Find successive left pointing double angle quotation marks
| with no right pointing one in between.
|
+--Perl pattern [(\\'ab)(.+)(\\'ab)] with []
| [X] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 32768
| [ ] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [X] UTF-8 Support
|
+--Perl pattern [(.*)(\\'bb)(.*)] with []
| [X] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [ ] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 32768
| [X] Maximum match (greedy)
| [ ] Allow comments
| [X] '.' matches newline
| [X] UTF-8 Support
|
+--Perl pattern [^(\\'ab)] with [\\highlight2$1]
[X] Match case
[ ] Whole words only
[ ] Case sensitive replace
[ ] Prompt on replace
[ ] Skip prompt if identical
[ ] First only
[ ] Extract matches
Maximum text buffer size 4096
[ ] Maximum match (greedy)
[ ] Allow comments
[ ] '.' matches newline
[X] UTF-8 Support
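For readers without TextPipe, the idea behind the filter above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not a translation of the filter: it assumes the RTF file encodes the quotation marks as the escapes `\'ab` and `\'bb` (as the patterns above do) and has at least two lines, and all function names are invented for illustration.

```python
import re

OPEN = r"\'ab"    # RTF escape for the left-pointing double angle quotation mark
CLOSE = r"\'bb"   # RTF escape for the right-pointing double angle quotation mark
COLOR_TABLE = (r"{\colortbl ;\red255\green255\blue0;\red255\green0\blue0;"
               r"\red0\green255\blue0;\red255\green0\blue255;}")


def insert_color_table(rtf):
    """Insert the color table as the second line of the document."""
    lines = rtf.splitlines(keepends=True)
    lines.insert(1, COLOR_TABLE + "\r\n")
    return "".join(lines)


def highlight_quotations(rtf):
    """Switch yellow highlighting on at each opening mark, off after each closing mark."""
    rtf = rtf.replace(OPEN, r"\highlight1" + OPEN)
    return rtf.replace(CLOSE, CLOSE + r"\highlight0")


def unbalanced_openers(rtf):
    """Return offsets of opening marks followed by another opening mark
    with no closing mark in between."""
    offsets = []
    pending = None
    for m in re.finditer(r"\\'(ab|bb)", rtf):
        if m.group(1) == "ab":
            if pending is not None:
                offsets.append(pending)
            pending = m.start()
        else:
            pending = None
    return offsets
```

Running `unbalanced_openers` on the original text before highlighting gives the positions that the last part of the filter marks in red.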
David
-
- Posts: 988
- Joined: Sun Dec 09, 2007 2:49 am
- Location: UK
Re: Maximum match size?
Simon,
I just spotted that copying part of my filter to paste into the code lines of my previous reply, there is a quirk.
The actual text my filter inserts for the color table is
{\\colortbl ;\\red255\\green255\\blue0;\\red255\\green0\\blue0;\\red0\\green255\\blue0;\\red255\\green0\\blue255;}
The double backslashes are there to force a literal backslash rather than a Perl pattern escape.
When I copied the filter using the TextPipe context menu, it doubled the quantity of backslashes. Weird!
Is this a bug?
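The effect of the extra doubling can be illustrated with Python's `re` module, which also treats `\\` in a replacement template as one literal backslash. This is only an analogy under that assumption; how TextPipe itself expands the text is not shown here, and the shortened color table and the helper `expand` are made up for the example.

```python
import re

# Escaped form as held in the filter (shortened color table).
intended = r"{\\colortbl ;\\red255\\green255\\blue0;}"
# The same text as it appeared after copying: every backslash doubled again.
copied = r"{\\\\colortbl ;\\\\red255\\\\green255\\\\blue0;}"


def expand(template):
    """Expand a replacement template the way re.sub does: '\\\\' becomes one backslash."""
    return re.sub("X", template, "X")


if __name__ == "__main__":
    print(expand(intended))  # single backslashes: valid RTF control words
    print(expand(copied))    # double backslashes: RTF reads these as literal backslashes
```

So a filter pasted back in from the copied text would insert `\\colortbl` rather than `\colortbl`, which is why the doubling matters.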
David
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Re: Maximum match size?
Yes - this is definitely a bug - fixed for the next release.
Let me know if you can reproduce the error with the large match.