
Delete duplicate lines in a delimited file

Posted: Sun May 15, 2016 5:29 am
by leorr11
Hi all,

I have a file with pipe-delimited fields.
I'd like to delete lines that duplicate another line in one field only.

I've tried several approaches without success.

It seems that "delete duplicate lines" and "count duplicate lines" don't work as subfilters of "restrict fields".

Re: Delete duplicate lines in a delimited file

Posted: Mon May 16, 2016 5:10 pm
by DataMystic Support
No, they don't: each text fragment passed to Delete duplicate lines is treated as a new 'file', so a field is never compared with the same field on other lines.

Are the lines with duplicate fields next to each other? If so, use an EasyPattern like this to identify them:

Code:

[ capture( pipefield), pipe, capture( pipefield), pipe, capture( pipefield), pipe, cr, lf
  group1, pipe ]
This will find lines where field 1 is repeated on the next line.

This next EasyPattern finds lines where field 3 is repeated on the next line:

Code:

[ capture( pipefield), pipe, capture( pipefield), pipe, capture( pipefield), pipe, cr, lf
  capture( pipefield), pipe, capture( pipefield), pipe, group3, pipe ]
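For readers who want to check the idea outside TextPipe, here is a minimal Python sketch of the same adjacent-line comparison. It is not the EasyPattern engine and makes no claim about how TextPipe matches; the function name `find_adjacent_repeats`, the 0-based `field` parameter, and the sample data are all assumptions made up for illustration.

```python
def find_adjacent_repeats(lines, field, sep="|"):
    """Return the indexes of lines whose `field` (0-based) has the same
    value as the same field on the immediately following line."""
    hits = []
    for i in range(len(lines) - 1):
        current = lines[i].rstrip("\r\n").split(sep)
        following = lines[i + 1].rstrip("\r\n").split(sep)
        # Skip pairs where either line is too short to contain the field.
        if field < len(current) and field < len(following):
            if current[field] == following[field]:
                hits.append(i)
    return hits


# Hypothetical sample data: three pipe-terminated fields per line.
sample = ["a|1|x|", "a|2|y|", "b|3|y|", "c|4|z|"]

print(find_adjacent_repeats(sample, 0))  # compares field 1 of each line pair
print(find_adjacent_repeats(sample, 2))  # compares field 3 of each line pair
```

Like the patterns above, this only catches repeats on neighbouring lines, so the file would need to be sorted on that field first if the duplicates can be far apart.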

Re: Delete duplicate lines in a delimited file

Posted: Tue May 17, 2016 6:23 pm
by leorr11
Thanks, it works.

Re: Delete duplicate lines in a delimited file

Posted: Mon Aug 22, 2016 3:46 pm
by JamesB
leorr11 wrote:
Hi all,

I have a file with pipe-delimited fields.
I'd like to delete lines that duplicate another line in one field only.

I've tried several approaches without success.

It seems that "delete duplicate lines" and "count duplicate lines" don't work as subfilters of "restrict fields".

I have a similar question: is there a way to delete duplicate lines but keep the original line? When I do this, it seems to remove even the original.

Re: Delete duplicate lines in a delimited file

Posted: Mon Aug 22, 2016 4:20 pm
by DataMystic Support
Which version of TextPipe (TP) do you have?

Is Filter Library\Remove\Duplicate lines your only filter?
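
As a point of comparison outside TextPipe, the behaviour asked about above (keep the first line, drop later lines that repeat a value in one field) can be sketched in Python. This is only an illustration under stated assumptions, not a description of what the Duplicate lines filter does; `drop_repeated_field`, its 0-based `field` parameter, and the sample data are hypothetical.

```python
def drop_repeated_field(lines, field, sep="|"):
    """Keep the first line seen for each value of `field` (0-based) and
    drop any later line with the same value in that field."""
    seen = set()
    kept = []
    for line in lines:
        parts = line.rstrip("\r\n").split(sep)
        if field >= len(parts):
            # Line has no such field: keep it untouched rather than guess.
            kept.append(line)
            continue
        key = parts[field]
        if key in seen:
            continue  # a line with this field value was already kept
        seen.add(key)
        kept.append(line)
    return kept


# Hypothetical sample data: duplicates of field 1 are not adjacent.
sample = ["a|1|", "b|2|", "a|3|", "c|4|", "b|5|"]
for line in drop_repeated_field(sample, 0):
    print(line)
```

Because the values already seen are held in a set, the duplicates do not need to be next to each other, at the cost of keeping every distinct field value in memory.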