Remove partial duplicate lines from list
Moderators: DataMystic Support, Moderators
-
- Posts: 9
- Joined: Wed Apr 17, 2013 9:12 am
Remove partial duplicate lines from list
Is it possible to remove partial duplicate lines from a list? Let's say I have a file with some sentences (one per line):
"Hi there. I want to go to school because I like it"
"Hi there. I want to go to school because I like it a lot"
"Hi there. I want to go to school because I like it a lot and I don't care about what others are saying"
After applying the filter, I want only line 1 to remain. Lines 2 and 3 should be deleted because they start with the same sentence plus some extra words.
Can I do this with TextPipe Pro?
Thanks
-
- Posts: 22
- Joined: Tue May 12, 2015 3:57 am
Re: Remove partial duplicate lines from list
I have just solved a similar task:
viewtopic.php?f=17&t=2195
The only difference is that I needed to find all unique abbreviations, while you need to find all unique strings of 20 (for instance) characters at the beginning of every line.
So try this attachment. It's an attempt to adapt my filter to your needs. Look for the explanation in the filter's comments.
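For anyone reading without TextPipe at hand, the idea can be sketched in a few lines of Python. This is only an illustration of "keep one line per unique first N characters", not the contents of the attached filter; the function name `dedupe_by_prefix` and the choice to keep the first line seen for each prefix are assumptions.

```python
def dedupe_by_prefix(lines, length=20):
    """Keep only the first line seen for each distinct leading `length` characters.

    Assumption: the first occurrence wins; later lines sharing the same
    leading characters are treated as partial duplicates and dropped.
    """
    seen = set()
    result = []
    for line in lines:
        key = line[:length]  # lines shorter than `length` use the whole line as key
        if key not in seen:
            seen.add(key)
            result.append(line)
    return result


if __name__ == "__main__":
    sample = [
        "Hi there. I want to go to school because I like it",
        "Hi there. I want to go to school because I like it a lot",
        "A completely different line",
    ]
    for kept in dedupe_by_prefix(sample, length=20):
        print(kept)
```

Note that the result depends on line order: whichever line with a given prefix comes first is the one that survives, so sort the list first if you want the shortest variant kept.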
- Attachments
-
- removepartialduplicates.rar
- (1.43 KiB) Downloaded 1583 times
-
- Posts: 9
- Joined: Wed Apr 17, 2013 9:12 am
Re: Remove partial duplicate lines from list
Thank you. I will try it and get back to you.
Later edit: It seems I have some problems with the downloaded file. It's not recognized by TextPipe. Can you please enable PMs in your profile settings so I can send you a PM? Thanks
-
- Posts: 22
- Joined: Tue May 12, 2015 3:57 am
Re: Remove partial duplicate lines from list
My PM is already activated.
By the way, did you unpack the RAR before loading it into TP?
-
- Posts: 9
- Joined: Wed Apr 17, 2013 9:12 am
Re: Remove partial duplicate lines from list
Well, it seems I cannot PM you.
Anyway, I have an old version of TextPipe Pro. When I load the filter I get this error: http://prntscr.com/7a4xzt. After pressing OK I get this error: http://prntscr.com/7a4y85 and then it crashes: http://prntscr.com/7a4yig
Is there any way to save the filter for an older version, TextPipe Pro 9.1? :D
-
- Posts: 22
- Joined: Tue May 12, 2015 3:57 am
Re: Remove partial duplicate lines from list
No. You can download the trial of TP Pro 9.9 and everything will work fine. Then you can either process your text with the trial version or just use it to look inside the attached filter. I can also reassure you that your task seems to be solved there.
-
- Posts: 9
- Joined: Wed Apr 17, 2013 9:12 am
Re: Remove partial duplicate lines from list
I installed the 9.9 trial and have been struggling for the last 3 days to make it work, but without success.
I tried every combination I could think of and followed the logic, but I couldn't get the filter to work properly.
If you are kind enough and have some spare time, could you please look at the file I want to remove partial duplicates from? It would be a great help.
Thank you, appreciated.
https://www.sendspace.com/file/sb8v4t
-
- Posts: 22
- Joined: Tue May 12, 2015 3:57 am
Re: Remove partial duplicate lines from list
I dropped the file onto the TP window (with the filter already loaded) and pressed F9 to start processing. It finished in about 4 minutes.
The result obtained by the filter is attached. Look inside. Is that what you want?
Everything worked as planned. Yet keep in mind that everything depends on what you personally mean by "partial duplicates". The current filter counts a line as a "partial duplicate" if its first 20 characters are not unique.
https://www.sendspace.com/file/ljaahy
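Since the outcome depends on the definition, here is a minimal Python sketch of the stricter one from the original question: a line is dropped when it begins with another, shorter line from the same list. It is an illustration only and has nothing to do with how TextPipe works internally; the function name `remove_prefix_extensions` is hypothetical.

```python
def remove_prefix_extensions(lines):
    """Drop every line that starts with another, shorter line from the list.

    Exact duplicates are collapsed to their first occurrence.
    Empty lines are never treated as a prefix of other lines.
    """
    kept = set()
    # Visit shorter lines first, so a base sentence is always seen
    # before any longer line that extends it.
    for line in sorted(set(lines), key=len):
        has_kept_prefix = any(line[:i] in kept for i in range(1, len(line)))
        if not has_kept_prefix:
            kept.add(line)

    # Emit the surviving lines once each, in their original order.
    emitted = set()
    result = []
    for line in lines:
        if line in kept and line not in emitted:
            emitted.add(line)
            result.append(line)
    return result


if __name__ == "__main__":
    base = "Hi there. I want to go to school because I like it"
    sample = [base + " a lot", base, "Another sentence"]
    for kept_line in remove_prefix_extensions(sample):
        print(kept_line)
```

Unlike a fixed 20-character comparison, this does not depend on line order or on choosing a length, but a very short line (a single word, say) will remove every longer line that starts with it.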
P.S. I've just found a much faster equivalent of my filter, already built into the program.
Look for it in the "Remove" block. It's called "Remove duplicate lines". I never looked inside it, thinking that it compares only complete lines, but it has a "length" option defining the number of characters to compare (set it to 20 and you get the same result in 5 seconds).
So you may try this filter:
- Attachments
-
- remove-dups.rar
- (858 Bytes) Downloaded 1443 times
-
- Posts: 9
- Joined: Wed Apr 17, 2013 9:12 am
Re: Remove partial duplicate lines from list
Damn, I can't thank you enough. You are a great person for spending your time helping me with this problem and creating that filter. And the "Remove duplicate lines" option was right in front of me, it's crazy. I have used that filter forever, but I never knew that "length" is actually an option for partial duplicates.
Thanks again man