split according to only first 3 bytes of the first column
Moderators: DataMystic Support, Moderators
Hi gurus!
I have a very large pipe file that is sorted by the first column.
I need to split this large file into many files based on only the first 3 bytes of the first column.
For example:
All references starting with AAA should go to the text file AAA.TXT
All references starting with AAB should go to the text file AAB.TXT
And so on.
Please keep in mind that I am a newbie...
Any ideas?
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Re: split according to only first 3 bytes of the first column
How is this different to your other questions about splitting files by rug type?
Re: split according to only first 3 bytes of the first column
I was successful in getting it done the other way. The problem is that it takes 10 days of computer time to do the extraction.
My hope is that by using a split instead I could cut the time down to a few hours.
It would be great to have this split-by-column function built into the program at some point, but in the meantime I still need help!
- DataMystic Support
Re: split according to only first 3 bytes of the first column
Here are the basics of detecting a change in column 1, followed by the code to write this data to a file.
The file-writing script doesn't quite work for me - not sure why. Thoughts anyone?
Code:
'detect changes - mark where a new file should start
dim oldfield

'Called for every line in the file
'EOL contains the end of line characters (Unix, DOS or Mac) that must be
'appended to each line
function processLine(line, EOL)
  dim field, pos
  'field = first column, i.e. the text before the first comma
  '(to compare only the first 3 characters, use field = left(line, 3))
  pos = instr(line, ",")
  if pos > 0 then
    field = left(line, pos - 1)
  else
    field = line
  end if
  if oldfield <> field then
    'column 1 changed - emit a marker line before this line
    processLine = "--- new file ---" & EOL & line & EOL
  else
    'same group - show the detected field in front of the line
    processLine = field & "*" & line & EOL
  end if
  oldfield = field
end function

sub startJob()
end sub

sub endJob()
end sub

function startFile()
  startFile = ""
  oldfield = ""
end function

function endFile()
  endFile = ""
end function
And here is the code to write this data to a file:
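Code:
'detect changes - write to new file
dim oldfield
dim fso
dim TextStream

'Called for every line in the file
'EOL contains the end of line characters (Unix, DOS or Mac) that must be
'appended to each line
function processLine(line, EOL)
  dim field, pos
  'field = first column, i.e. the text before the first comma
  '(to compare only the first 3 characters, use field = left(line, 3))
  pos = instr(line, ",")
  if pos > 0 then
    field = left(line, pos - 1)
  else
    field = line
  end if
  if oldfield <> field or TextStream is Nothing then
    'column 1 changed - close the previous file and open the next one
    closeStream
    '8 = ForAppending, True = create the file if it does not exist
    Set TextStream = fso.OpenTextFile("C:\" & field & ".txt", 8, True)
  end if
  TextStream.WriteLine line
  oldfield = field
  processLine = ""
end function

'close the current output file, if one is open
sub closeStream()
  'objects must be tested with Is Nothing; "<> Null" never evaluates to True
  if not TextStream is Nothing then
    TextStream.Close
    Set TextStream = Nothing
  end if
end sub

sub startJob()
  Set fso = CreateObject("Scripting.FileSystemObject")
  Set TextStream = Nothing
end sub

sub endJob()
  Set fso = Nothing
end sub

function startFile()
  startFile = ""
  oldfield = ""
end function

function endFile()
  closeStream
  endFile = ""
end function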
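For comparison, here is a minimal standalone sketch of the same single-pass idea in Python, outside TextPipe. It is only an illustration under stated assumptions: the input is already sorted, the first 3 characters of each line are safe to use in a filename, and the text is in a single-byte encoding so that 3 characters equal 3 bytes. The function name `split_by_prefix` and its parameters are made up for this example.

```python
import os

def split_by_prefix(src_path, out_dir, prefix_len=3):
    """Split a sorted text file into <prefix>.TXT files in one pass.

    Returns the list of output paths in the order they were opened.
    """
    written = []
    current = None   # prefix of the group being written
    out = None       # currently open output file
    try:
        # latin-1 maps every byte to one character (assumption: single-byte data);
        # newline="" keeps the original line endings untouched.
        with open(src_path, "r", encoding="latin-1", newline="") as src:
            for line in src:
                prefix = line[:prefix_len].rstrip("\r\n")
                if not prefix:
                    continue  # skip blank lines
                if prefix != current:
                    # Prefix changed: close the old file, open the next one.
                    if out is not None:
                        out.close()
                    path = os.path.join(out_dir, prefix + ".TXT")
                    # Append mode, like OpenTextFile(..., 8, True) in the script.
                    out = open(path, "a", encoding="latin-1", newline="")
                    written.append(path)
                    current = prefix
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Because only one output file is open at a time and the input is read once, the run time should scale with the size of the input file rather than with the number of output files. Note that append mode means files left over from an earlier run are added to, not replaced.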
Re: split according to only first 3 bytes of the first column
I may not know what I am talking about here... but...
How about going at it in 2 steps?
1. Marking the split points by adding a divider line
Something like:
Mashad
Mashad
Mashad
=====Mas=====
Bidjar
Bidjar
Bidjar
=====Bid=====
2. Splitting the file into:
Mas.txt
Bid.txt
Maybe the above can be done without JavaScript (which I know nothing about), hehehehe
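Step 1 above can be sketched in a few lines of Python. This is only an illustration, not a TextPipe filter: the name `mark_split_points` is hypothetical, and it assumes the lines are already sorted and have had their line endings removed.

```python
def mark_split_points(lines, prefix_len=3):
    """Yield each line, adding a divider after the last line of every group.

    A group is a run of consecutive lines sharing the same first
    `prefix_len` characters; the divider carries that prefix.
    """
    previous = None
    for line in lines:
        prefix = line[:prefix_len]
        if previous is not None and prefix != previous:
            # The prefix changed, so close off the previous group.
            yield "=====" + previous + "====="
        yield line
        previous = prefix
    if previous is not None:
        # Close off the final group.
        yield "=====" + previous + "====="
```

Putting the prefix inside the divider means the later split step already knows which name each piece should get.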
- DataMystic Support
Re: split according to only first 3 bytes of the first column
Adding the split points is easy - setting the filename for the split data is not.
Re: split according to only first 3 bytes of the first column
Let's split the files without the names first...
Then, once we have the split files, we can rename them according to the content of one of the columns in the split files.
(In my case, I will include another column in the files with the Mas, Bid and so on info for the file name replace filter to look at.)
That would make it into 3 steps:
1. Mark a divider at the split points.
2. Split the file according to the divider.
3. Rename the files according to a different column (of the new files).
Could this be done?
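As a rough sketch of how steps 2 and 3 could collapse into one, the Python below groups lines by the divider that follows them and names each output file after the text inside the divider. It assumes dividers look exactly like `=====Mas=====` and that no data line has that shape; `group_by_divider`, `write_groups` and the `_unnamed` key are hypothetical names used only for this example.

```python
import os
import re

# A divider is a whole line of the form =====NAME=====
DIVIDER = re.compile(r"^=====(.+)=====$")

def group_by_divider(lines):
    """Return {name: [lines]}; each divider names the lines before it."""
    groups = {}
    pending = []
    for line in lines:
        match = DIVIDER.match(line)
        if match:
            groups.setdefault(match.group(1), []).extend(pending)
            pending = []
        else:
            pending.append(line)
    if pending:
        # Lines after the last divider have no name; keep them anyway.
        groups.setdefault("_unnamed", []).extend(pending)
    return groups

def write_groups(groups, out_dir):
    """Write each group to <name>.txt inside out_dir."""
    for name, rows in groups.items():
        with open(os.path.join(out_dir, name + ".txt"), "w") as f:
            f.write("\n".join(rows) + "\n")
```

With the name taken from the divider itself, a separate rename pass driven by another column would not be needed.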
- DataMystic Support
Re: split according to only first 3 bytes of the first column
Attached is a script that does not use VBScript. It identifies the split positions and splits the file.
- Attachments
- detect changes to a field-no vbscript.zip (973 Bytes)