Using line count in global variable

tahoar
Posts: 8
Joined: Tue Sep 23, 2008 10:35 am

Using line count in global variable

Post by tahoar »

Ok, I know there's a trick to this, and I just can't find it.

First, I count the lines in a file and save the number to an external text file. Next, I read the number from the text file to calculate x percent (5%) of the line count. Third, I use a VBScript to create an array of random numbers with x (5%) elements, where each element's value ranges from 1 to the total number of lines. Finally, I use VBScript to add a tag to the random lines.
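
As a rough illustration of steps two and three, the sketch below computes a percentage of a line count and collects that many random line numbers between 1 and the total. The function name pickRandomLines, the use of a Scripting.Dictionary, and the decision to skip duplicate numbers are assumptions made for the example, not part of the original scripts.

```vbscript
' Hypothetical helper (not from the original scripts): returns a Dictionary
' whose keys are distinct random line numbers in the range 1..totalLines.
Function pickRandomLines(totalLines, percent)
  Dim wanted, picked, n
  wanted = Int(totalLines * percent / 100)       ' e.g. 5 percent of the line count
  If wanted > totalLines Then wanted = totalLines ' guard so the loop below can finish
  Set picked = CreateObject("Scripting.Dictionary")
  Randomize                                       ' seed the generator from the system timer
  Do While picked.Count < wanted
    n = CLng(Int(Rnd * totalLines)) + 1           ' whole number from 1 to totalLines
    If Not picked.Exists(n) Then picked.Add n, True
  Loop
  Set pickRandomLines = picked
End Function
```

A later step could then test picked.Exists(CLng(lineNumber)) to decide whether a given line gets the tag. How the current line number is tracked depends on the filter.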

Right now, I do all four of these steps in two different TextPipe scripts. I can't assign the first step's line count to a global variable and use it later in the same script. I think there's a trick to using the T-filter, but I haven't been able to make it work. A similar task would be inserting the total line count as a file header. I can get the T-filter to insert the global variable into the footer, but not into the header (or the left or right margins).

Can you point me to a sample filter?

Thanks,
Tom
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Re: Using line count in global variable

Post by DataMystic Support »

Hi Tom,

Given that TextPipe is designed to handle files that are gigabytes in size, it doesn't let you add a line count to the start of a file, because at that point it doesn't know the count yet. It is also designed to pass over the source file only once, whereas you want it to pass over it twice, which is inefficient. And TextPipe hates being inefficient!

A header can easily be output in the startFile() function.

You can easily add a left or right margin by prepending or appending it to line in the processLine(line, EOL) function.
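
A minimal sketch of both ideas follows. It assumes that TextPipe writes whatever string each function returns, that processLine must append EOL itself, and that a CR/LF line ending suits the header. The header text and margin strings are placeholders.

```vbscript
' Sketch only. Assumption: the string returned by each function is written to the output.
Function startFile()
  ' Placeholder header text, emitted once before the first line
  startFile = "=== header ===" & vbCrLf
End Function

Function processLine(line, EOL)
  ' Prepend a left margin and append a right margin to every line
  processLine = "| " & line & " |" & EOL
End Function
```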

Re: Using line count in global variable

Post by tahoar »

Yes, multi-gigabyte processing is great. My three-pass solution saved the "Output count of matches" result to a one-line temp file; then a batch file read the temp file into an environment variable, which a second TextPipe pass used to calculate a percentage of the line total. Your suggestion to use the startFile() function now reduces that processing to one pass using the function below. The following VBScript works fine.

Thanks,
Tom

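function startFile()
  ' Count the input file's lines before processing starts and keep the total in a global variable
  TextPipe.setGlobalVar "linecount", getLineCount( TextPipe.fullInputFilename )
  startFile = ""
end function

' Returns the number of lines in srcFile by reading it from start to end
function getLineCount( srcFile )
  Const ForReading = 1
  Dim objFSO, objReadFile, lineCount
  lineCount = 0
  Set objFSO = CreateObject("Scripting.FileSystemObject")
  Set objReadFile = objFSO.OpenTextFile(srcFile, ForReading)
  Do Until objReadFile.AtEndOfStream
    objReadFile.SkipLine          ' advance one line without keeping its text
    lineCount = lineCount + 1
  Loop
  objReadFile.Close
  Set objReadFile = Nothing
  Set objFSO = Nothing
  getLineCount = lineCount
end function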
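
The same approach might also cover the original request of putting the total line count in a file header. The variant below is a sketch under two assumptions: that the string returned by startFile() is written to the output ahead of the first line, and that a CR/LF ending is wanted. It reuses the getLineCount function defined above; the header wording is a placeholder.

```vbscript
' Sketch only: replaces the startFile() above and relies on getLineCount from this post.
Function startFile()
  Dim total
  total = getLineCount(TextPipe.fullInputFilename)
  TextPipe.setGlobalVar "linecount", total          ' still available to later steps
  startFile = "Total lines: " & total & vbCrLf      ' assumed to be emitted as the header
End Function
```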

Re: Using line count in global variable

Post by DataMystic Support »

I shudder to think how long that vbscript will take to run on a file with 100,000 lines or more!

Re: Using line count in global variable

Post by tahoar »

100,000? Try 700,000 lines!

Surprisingly, not too bad. The entire pipeline included 7 TextPipe filters tied together with a command-line batch file. Two of the filters made character-by-character passes on paired data using a Perl replace filter. One filter made two separate passes on paired halves of the data in a different VBScript.

I started the pipeline at 3:00 this morning and it finished at 9:30 AM. I don't have a breakdown of which filters took how long, but this VBScript count was one of the least challenging tasks.

If I were a programmer, I could be dangerous!

Tom

Re: Using line count in global variable

Post by tahoar »

Oops, make that 960,000 lines.