Randomize lines is too slow! - worked out a faster way.

Posted: Sun Oct 09, 2005 8:21 pm
by Moz

I've worked out a faster randomise now. I was trying to randomise millions of lines of numbers, but the "Randomize Lines" filter took 9 hours to process each file (and I have a few to go through).

My solution is:
Simply grab a load of "random" numbers from somewhere like random.org, as two lists with one longer than the other. Feed both lists through an "add repeating text side by side" filter to append a random number to the end of each line. Because the lists differ in length, their repeats drift out of step with each other, so you end up with pretty much random numbers all the way down the right-hand column of your file.

Then copy these to the beginning of the line, and sort numerically whilst restricting the sort to the number of digits you have placed at the beginning.

After sorting numerically, remove the digits from the beginning and end of each line and you end up with your lines randomised in a MUCH quicker way.
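
For anyone who wants to see the idea outside TextPipe, here is a rough Python sketch of the same "tag each line with a random number, sort on the tag, strip the tag" approach. It is not what the TextPipe filters do internally; the function name shuffle_by_random_key and the key_digits parameter are made up for illustration, and the keys come from Python's random module rather than random.org.

```python
import random

def shuffle_by_random_key(lines, key_digits=8, rng=None):
    """Shuffle lines by tagging each with a random fixed-width key and sorting.

    Hypothetical helper for illustration only. Keys are zero-padded to
    key_digits characters, so sorting them as text gives the same order
    as sorting them numerically.
    """
    rng = rng or random.Random()
    # Step 1: attach a random key to every line.
    decorated = [
        (f"{rng.randrange(10 ** key_digits):0{key_digits}d}", line)
        for line in lines
    ]
    # Step 2: sort on the key only, ignoring the line content.
    decorated.sort(key=lambda pair: pair[0])
    # Step 3: throw the keys away and keep the reordered lines.
    return [line for _, line in decorated]

if __name__ == "__main__":
    sample = [f"line {i}" for i in range(10)]
    for line in shuffle_by_random_key(sample):
        print(line)
```

One thing to watch: lines that happen to get the same key stay in their original relative order, so with millions of lines it is worth using enough digits that duplicate keys are rare.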

I ran this against 5 files, each with about 2 million lines in and it took 16 minutes to process - much less than the 9 hours per file when using the "randomize lines" filter.

It would be beneficial for TextPipe Pro to have a facility to generate a random digit from 0-9 - this could be used as an "insert special character" function.

Cheers,

Moz

Posted: Mon Oct 10, 2005 10:32 am
by DataMystic Support
Good idea Moz,

Look for this in the next release (out this week)! It will be the macro @randomdigit, which will return a single digit, so @randomdigit@randomdigit will return 2 random digits. These can be used in the Add header/footer/left/right margin filters, and insert column filters.
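
To illustrate the behaviour described above (each @randomdigit yields one digit, so repeating it yields several), here is a small Python simulation. It only mimics the described result; random_digits and add_left_margin are hypothetical names, not TextPipe functions, and nothing here reflects how TextPipe implements the macro.

```python
import random

def random_digits(count, rng=random):
    """Return a string of `count` random digits, like repeating the macro `count` times."""
    return "".join(rng.choice("0123456789") for _ in range(count))

def add_left_margin(lines, count=2, rng=random):
    """Prefix every line with `count` random digits (a stand-in for a left margin filter)."""
    return [random_digits(count, rng) + line for line in lines]

if __name__ == "__main__":
    for line in add_left_margin(["alpha", "beta", "gamma"], count=2):
        print(line)
```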

We'll also add @randomletter.

Thanks!