CSV-like data merging
Posted: Wed Feb 28, 2007 7:39 pm
by simicar
Hi.
I ran into a problem using "extract matching lines" with context lines selected (1 line before and 1 after), because I get this:
Code: Select all
;Trial Input;noxious gases away from the users of the machine. ;
;Trial Input;Indoor generators and furnaces can quickly fill an enclosed s;
;Trial Input;pace with carbon monoxide or other poisonous exhaust gases ;
The problem is that I need to convert those three lines into a single row in Excel (not three), e.g. for the word (*enclosed*), to get:
Code: Select all
;Trial Input;noxious gases away from the users of the machine.
Indoor generators and furnaces can quickly fill an enclosed s
pace with carbon monoxide or other poisonous exhaust gases ;
How can I do this?
I've already tried the "convert end of lines" option and using headers and footers, but with headers/footers I get ';' only at the beginning and end
of the whole extracted file, not of each row (3 rows here) as needed.
I generated the above example with:
- File input:..
Extract matching [*enclosed*]
Replace ; with ,
Insert column 1 [;@inputFilename;]
Insert column 0 [;]
Merge to file...
Please help
Posted: Thu Mar 01, 2007 8:30 am
by DataMystic Support
To join every 3 lines into one, use an EasyPattern like this:
Code: Select all
[ capture(0+ not cr or lf), cr, lf,
capture(0+ not cr or lf), cr, lf,
capture(0+ not cr or lf), cr, lf ]
Replace with
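For readers outside TextPipe, here is a minimal Python sketch of the same idea. It assumes that `capture(0+ not cr or lf), cr, lf` corresponds to the regular expression `([^\r\n]*)\r\n`. The replacement text is not shown in the post above, so joining the three captures with spaces is only an illustrative choice, and `join_every_three` is a hypothetical helper name.

```python
import re

# Assumed regex equivalent of the EasyPattern above: three consecutive
# CRLF-terminated lines, each captured without its line ending.
THREE_LINES = re.compile(r"([^\r\n]*)\r\n([^\r\n]*)\r\n([^\r\n]*)\r\n")


def join_every_three(text: str) -> str:
    """Join each run of three CRLF-terminated lines into a single line."""
    # Illustrative replacement: the three captures separated by spaces,
    # followed by one CRLF. Leftover lines (fewer than three) are untouched.
    return THREE_LINES.sub(r"\1 \2 \3\r\n", text)


if __name__ == "__main__":
    sample = "a\r\nb\r\nc\r\nd\r\ne\r\nf\r\n"
    print(repr(join_every_three(sample)))
```

Note that the pattern only matches lines ending in CR LF, so input with bare LF line endings would pass through unchanged.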
Not quite..
Posted: Fri Mar 02, 2007 12:53 am
by simicar
Unfortunately, when I used the suggested pattern in this filter list:
- File input:..
Extract matching [*tool*] - (here extraction of the 1 line before and after)
Replace ; with ,
Insert column 1 [;@inputFilename;]
Insert column 0 [;]
EasyPattern [...as given above...]
Merge to file...
on the trial input:
Code: Select all
TextPipe provides a single point of maintenance for all your text processing tasks.
You learn one tool, rather than learning 4 or more - and their associated languages,
command line options, debugging schemes, idiosyncrasies and operating system differences and dependencies.
I got this result:
Code: Select all
;Trial Input;TextPipe provides a single point of maintenance for all your text processing tasks. ;
;Trial Input;You learn one tool, rather than learning 4 or more - and their associated languages, ;
;Trial Input;command line options, debugging schemes, idiosyncrasies and operating system differences and dependencies.;
instead of:
Code: Select all
;Trial Input;TextPipe provides a single point of maintenance for all your text processing tasks.
You learn one tool, rather than learning 4 or more - and their associated languages,
command line options, debugging schemes, idiosyncrasies and operating system differences and dependencies.;
Maybe the position of
Replace -> Find EasyPattern is wrong,
I still can't sort this out...
Posted: Fri Mar 02, 2007 7:28 am
by DataMystic Support
If you only want the input filename shown once, then move the EasyPattern up just underneath the Extract filter.
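Why the order matters can be sketched in Python. This is not TextPipe itself: `join_groups` and `add_columns` are hypothetical stand-ins for the EasyPattern join and the two "Insert column" filters, and the sample lines are made up.

```python
def join_groups(lines, size=3):
    """Join every `size` consecutive lines into one, separated by spaces."""
    return [" ".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def add_columns(lines, filename):
    """Wrap each line as ;filename;line; (stand-in for the Insert column steps)."""
    return [f";{filename};{line};" for line in lines]


lines = ["first", "second", "third"]

# Join first, then add columns: the filename appears once per joined row.
joined_first = add_columns(join_groups(lines), "Trial Input")

# Add columns first, then join: the filename is repeated for every source line.
columns_first = join_groups(add_columns(lines, "Trial Input"))

if __name__ == "__main__":
    print(joined_first)
    print(columns_first)
```

With the join placed first, the result is a single `;Trial Input;first second third;` row. With the columns inserted first, each of the three source lines already carries its own filename prefix before the join runs.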
How about duplicates
Posted: Fri Mar 02, 2007 9:42 pm
by simicar
Thanks. It really works, but only when the keyword appears once per file. The problem appears when it occurs two or more times.
For example, from this input (extracting the word tool):
Code: Select all
TextPipe provides a single...
...one tool, rather than
command line options, debugging schemes,
TextPipe provides a single...
...one tool, rather than
command line options, debugging schemes,
TextPipe provides a single...
...one tool, rather than
command line options, debugging schemes,
I get:
Code: Select all
;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,TextPipe provides a single...;
;Trial Input;...one tool, rather than;
;Trial Input;command line options, debugging schemes,;
instead of (where Trial Input should be the same file each time):
Code: Select all
;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,;
;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,;
;Trial Input;TextPipe provides a single... ...one tool, rather than command line options, debugging schemes,;
Posted: Sun Mar 04, 2007 7:23 am
by DataMystic Support
Try this filter (note - this text comes from File\Export\Export to Clipboard):
Code: Select all
|--Extract lines matching [tool]
| [ ] Include line numbers
| [ ] Include filename
| [ ] Match case
| [ ] Count matches
| Pattern type: 0
| Context before: 1
| Context after: 1
|
|--EasyPattern [[ capture(0+ not cr or lf), cr, lf,\r\n capture(0+ not cr or lf), cr, lf,\r\n capture(longest 0+ not cr or lf), longest optional( cr, lf) ]] with [@inputFilename;$1 $2 $3;\r\n]
| [ ] Match case
| [ ] Whole words only
| [ ] Case sensitive replace
| [X] Prompt on replace
| [ ] Skip prompt if identical
| [ ] First only
| [ ] Extract matches
| Maximum text buffer size 4096
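As a cross-check of the expected output, here is a rough Python sketch of what this filter list is meant to produce. It is a different technique from the two TextPipe filters: instead of extracting and then regrouping, it builds each row directly from a matching line and its neighbours. `extract_with_context` is a hypothetical helper, the match is case-insensitive because "Match case" is unchecked, and the row layout follows the `@inputFilename;$1 $2 $3;` replacement. How TextPipe handles overlapping context lines is not covered here.

```python
def extract_with_context(lines, keyword, filename):
    """Return one row per matching line: filename;before match after;"""
    rows = []
    needle = keyword.lower()  # case-insensitive, mirroring "[ ] Match case"
    for i, line in enumerate(lines):
        if needle in line.lower():
            # One line of context before and after, skipping indexes that
            # fall outside the input.
            parts = [lines[j] for j in (i - 1, i, i + 1) if 0 <= j < len(lines)]
            rows.append(f"{filename};{' '.join(parts)};")
    return rows


if __name__ == "__main__":
    block = [
        "TextPipe provides a single...",
        "...one tool, rather than",
        "command line options, debugging schemes,",
    ]
    for row in extract_with_context(block * 3, "tool", "Trial Input"):
        print(row)
```

For the nine-line sample from the previous post this yields three rows, one per occurrence of the keyword.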