How big a file?
Moderators: DataMystic Support, Moderators
How big a file?
I have a daily file of 1MM rows (22 columns, CSV format) that needs to be cleaned extensively. How many rows can TP handle on a single-CPU workstation, or is that even an option? What's the best way to use TP for a job this big?
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Re: How big a file?
TP can handle billions of rows of CSV data. Just point it at the file with a list of filters.
What cleansing does it need?
Re: How big a file?
The usual cleanup: some search and replace, remove blanks, trim leading and trailing whitespace, etc. The usual TP stuff. From a deployment and ETL standpoint, we would like to load the clean data into a database after TP has processed the file. How might we do that?
- DataMystic Support
Re: How big a file?
I assume you want to trim blanks on each field in turn rather than on entire lines, so use Filters\Restrict\Delimited fields (CSV, Tab, Pipe, etc) to restrict to each field in turn, and inside this filter add the trim filters.
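For anyone wondering what the per-field restriction achieves, here is a minimal Python sketch of the same idea. It is not TextPipe and not how TextPipe is implemented; the function names `trim_fields` and `clean_csv` are made up for illustration. It reads the CSV one row at a time, so memory use stays flat regardless of row count, and it trims each field rather than the whole line.

```python
import csv
import io

def trim_fields(lines):
    """Yield CSV rows with leading/trailing whitespace stripped from each field."""
    for row in csv.reader(lines):
        yield [field.strip() for field in row]

def clean_csv(src, dst):
    """Stream rows from src to dst, trimming every field (hypothetical helper)."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in trim_fields(src):
        writer.writerow(row)

# Small in-memory demonstration; a real run would pass open file objects.
source = io.StringIO(' a ,"b, c ", d \n')
result = io.StringIO()
clean_csv(source, result)
cleaned = result.getvalue()
```

Note that a quoted field containing a comma stays a single field, which is the point of restricting to delimited fields instead of treating the line as plain text.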
You will then need to modify the CSV to add a Filters\Add\Left margin of
Code:
insert into tablename () values (
and a Filters\Add\Right margin of
Code:
);
Then add a Filters\Special\Database connection as the last step.
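To make the effect of the left and right margins concrete, here is a rough Python sketch of the same wrapping, again outside TextPipe. The column list `(id, name)`, the two sample rows, and the use of an in-memory SQLite database are assumptions for illustration only; the post leaves the column list empty and does not name a database. The wrapped line is only valid SQL if each field is already a valid SQL literal (numbers, or strings in single quotes).

```python
import sqlite3

# Margins as in the post; the columns (id, name) are a hypothetical example
# standing in for the empty parentheses.
LEFT_MARGIN = "insert into tablename (id, name) values ("
RIGHT_MARGIN = ");"

def wrap_line(line):
    """Turn one cleaned CSV line into an INSERT statement."""
    return LEFT_MARGIN + line.rstrip("\r\n") + RIGHT_MARGIN

def load(lines, conn):
    """Execute one INSERT per non-blank line, then commit once at the end."""
    for line in lines:
        if line.strip():
            conn.execute(wrap_line(line))
    conn.commit()

# In-memory SQLite stands in for the real target database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table tablename (id integer, name text)")
load(["1,'alpha'\n", "2,'beta'\n"], conn)
rows = conn.execute("select id, name from tablename order by id").fetchall()
```

Building SQL by string concatenation is only reasonable for data you have already cleaned and trust; for anything else, a parameterized insert is the safer choice.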