Delete duplicate lines in a delimited file
Posted: Sun May 15, 2016 5:29 am
by leorr11
Hi all,
I have a file with pipe-delimited fields.
I'd like to delete lines that are duplicated in one field only.
I've tried several approaches without success.
It seems that "delete duplicate lines" and "count duplicate lines" don't work as a subfilter of "restrict fields".
Re: Delete duplicate lines in a delimited file
Posted: Mon May 16, 2016 5:10 pm
by DataMystic Support
No, as each text fragment passed to Delete duplicate lines is considered a new 'file'.
Are the lines with duplicate fields next to each other? If so, use an EasyPattern like this to identify them:
Code:
[ capture( pipefield), pipe, capture( pipefield), pipe, capture( pipefield), pipe, cr, lf
group1, pipe ]
This finds lines where field 1 is repeated on the next line.
This next EasyPattern finds lines where field 3 is repeated on the next line:
Code:
[ capture( pipefield), pipe, capture( pipefield), pipe, capture( pipefield), pipe, cr, lf
capture( pipefield), pipe, capture( pipefield), pipe, group3, pipe ]
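For anyone who wants to check the same logic outside TextPipe, here is a minimal Python sketch of the idea. It is not the EasyPattern above and not anything TextPipe provides: the function name drop_repeated_field, its parameters and the sample data are all made up for illustration. It keeps the first line that carries a given field value and drops later lines that repeat it, either only when the repeat is on the very next line (as the patterns above assume) or anywhere in the file.

```python
def drop_repeated_field(lines, field_index, delimiter="|", adjacent_only=True):
    """Drop lines whose chosen field repeats an earlier line's value.

    Hypothetical helper for illustration only. The first line carrying
    a given value is always kept; field_index is zero-based.
    """
    kept = []
    seen = set()      # every value met so far (used when adjacent_only=False)
    previous = None   # value of the field on the preceding line
    for line in lines:
        fields = line.split(delimiter)
        # Lines too short to contain the field are kept unchanged.
        if field_index >= len(fields):
            kept.append(line)
            previous = None
            continue
        value = fields[field_index]
        if adjacent_only:
            is_duplicate = value == previous
        else:
            is_duplicate = value in seen
        if not is_duplicate:
            kept.append(line)
        seen.add(value)
        previous = value
    return kept


# Made-up sample data: three pipe-terminated fields per line.
sample = [
    "a|1|x|",
    "a|2|y|",
    "b|3|y|",
    "a|4|z|",
]

print(drop_repeated_field(sample, 0))                       # field 1, next-line repeats
print(drop_repeated_field(sample, 2))                       # field 3, next-line repeats
print(drop_repeated_field(sample, 0, adjacent_only=False))  # field 1, repeats anywhere
```

With adjacent_only=True the file does not need to be sorted, but a repeat is only caught when it sits directly below the line it repeats. Setting adjacent_only=False catches repeats anywhere, at the cost of remembering every value seen.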
Re: Delete duplicate lines in a delimited file
Posted: Tue May 17, 2016 6:23 pm
by leorr11
thanks, it works
Re: Delete duplicate lines in a delimited file
Posted: Mon Aug 22, 2016 3:46 pm
by JamesB
I have a similar question: is there a way to delete duplicate lines but keep the original line? When I do this, it seems to remove even the original.
Re: Delete duplicate lines in a delimited file
Posted: Mon Aug 22, 2016 4:20 pm
by DataMystic Support
Which version of TextPipe (TP) do you have?
Is Filter Library\Remove\Duplicate lines your only filter?