Relocating strings from one place to another

gerd
Posts: 39
Joined: Wed Mar 12, 2008 10:52 pm

Post by gerd »

I am still struggling with moving strings from one place to another in a file, so I constructed the following simple example.
My goal is to move the content between
<beginstring01> and </beginstring01> to the place where <newstring01> is located
and the content between
<beginstring02> and </beginstring02> to the place where <newstring02> is located

Here is the example file:

<html>

<beginstring01>
this is just "example text" 01 which I want to move to another place which is located further down.
</beginstring01>

<beginstring02>
this is just example text 02 Which I want to move to another place which is located further down.
</beginstring02>

Here is just other text to fill the gap. And here are the new locations I want the strings to appear:

<newstring01>

and here <newstring02>

</html>
It drives me crazy, but I cannot make it work, so I am asking for help.
gerd
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Search for perl pattern:

<beginstring01>(.*)</beginstring01>(.*)<newstring01>

Replace with

$2$$1$
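
For readers without TextPipe at hand, the same move can be sketched in Python's `re` module. This is an illustration under assumptions, not TextPipe's own behaviour: the `move_block` helper is a made-up name, `re.DOTALL` stands in for whatever setting lets `.` span line breaks, and Python spells the replacement `\g<2>\g<1>` where the post above uses `$2$$1$`.

```python
import re

def move_block(text: str, source_tag: str, target_tag: str) -> str:
    """Move the content of <source_tag>...</source_tag> to where <target_tag> sits.

    Hypothetical helper: both tag pairs and the target marker are dropped,
    mirroring the search/replace pair in the post above.
    """
    src = re.escape(source_tag)
    dst = re.escape(target_tag)
    pattern = re.compile(rf"<{src}>(.*)</{src}>(.*)<{dst}>", re.DOTALL)
    # Group 2 (text in between) comes first, then group 1 (the moved content).
    return pattern.sub(r"\g<2>\g<1>", text, count=1)

sample = (
    "<html>\n"
    "<beginstring01>\ntext 01\n</beginstring01>\n"
    "filler\n"
    "<newstring01>\n"
    "</html>\n"
)
moved = move_block(sample, "beginstring01", "newstring01")
print(moved)
```

Calling `move_block` a second time with `"beginstring02"` and `"newstring02"` would handle the second pair from the example file.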
gerd
Posts: 39
Joined: Wed Mar 12, 2008 10:52 pm

Post by gerd »

Simon,
thanks a lot, it works fine. My mistake was that I did not put "(.*)<newstring01>" in the SAME find pattern. Now I know how to move strings around.

Can you also show me how to copy (instead of move) the contents of <beginstring01>(.*)</beginstring01> to the place of <newstring01>? I would like to keep the content between the <beginstring01> tags and also have it appear at the place of <newstring01>.

I looked for suitable filters and experimented a little (e.g. send to subfilter), but I guess I always need an example I can refer to. Or is it not possible with TextPipe?
gerd
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Hi Gerd,

It's easy :-) To copy the string instead of moving it, search for perl pattern:

(<beginstring01>(.*)</beginstring01>)(.*)<newstring01>

Replace with

$1$$3$$2$
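
The copy variant can be sketched the same way in Python, purely as an illustration. The assumptions: `copy_block` is a hypothetical helper name, `re.DOTALL` stands in for a "dot matches newline" setting, and `\g<1>\g<3>\g<2>` is Python's spelling of `$1$$3$$2$`.

```python
import re

COPY_PATTERN = re.compile(
    r"(<beginstring01>(.*)</beginstring01>)(.*)<newstring01>",
    re.DOTALL,
)

def copy_block(text: str) -> str:
    """Keep the tagged block in place and repeat its inner content at <newstring01>."""
    # Group 1: whole tagged block (kept); group 3: text in between;
    # group 2: inner content, emitted again where <newstring01> was.
    return COPY_PATTERN.sub(r"\g<1>\g<3>\g<2>", text, count=1)

sample = "<beginstring01>text 01</beginstring01> filler <newstring01> end"
copied = copy_block(sample)
print(copied)
```

The only difference from the move is the extra outer group, which lets the replacement put the original tagged block back before the text in between.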
dfhtextpipe
Posts: 988
Joined: Sun Dec 09, 2007 2:49 am
Location: UK

What is the extra $ for? Is this documented in help?

Post by dfhtextpipe »

Simon,

This is the first time I have seen replacements like

$1$$3$$2$
I had wondered how you replace a pattern with more than one subpattern with nothing in between them.
  • Is this what the extra $ is for in each subpattern?
  • Is this aspect documented in the help file?
Best regards,
David Haslam
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

The extra $ is to disambiguate the two possibilities of

$1 followed by the literal '1', and
$11.

- $ marks the end of the captured variable name.

It is not required when there is other text around it, e.g.
here is my $1 separate to my $2.

If you had written
$1$3$2
this would be interpreted as
$1$ literal '3' $2
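
Other regex engines face the same ambiguity and solve it with their own syntax. As a comparison only (this is Python's `re` module, not TextPipe), `\g<1>` plays the role of the terminated `$1$` form, while a bare `\11` is read as a reference to group 11:

```python
import re

pattern = re.compile(r"(\w+)-(\w+)")

# \g<1> ends the group reference explicitly, so the following "1" is literal text.
explicit = pattern.sub(r"\g<1>1", "ab-cd")

# Adjacent references with nothing in between, comparable to $2$$1$ in the thread.
swapped = pattern.sub(r"\g<2>\g<1>", "ab-cd")

# r"\11" is parsed as a reference to group 11, which this pattern does not have,
# so Python rejects the replacement template.
try:
    pattern.sub(r"\11", "ab-cd")
    ambiguous_rejected = False
except re.error:
    ambiguous_rejected = True

print(explicit, swapped, ambiguous_rejected)
```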

Sorry - I didn't understand your other question - can you please give an example?
simoninsing
Posts: 20
Joined: Fri Jun 05, 2009 11:11 pm

Post by simoninsing »

I couldn't quite understand how to move/copy strings within a line. This function is probably what I need. I have say 800,000 lines of text, each with geographical place names and other bits and pieces within them. My ultimate aim is to determine the relevant Chinese province for each line. Here's an example:
"Changchun / China Life Insurance Company Limited, Changchun City, Chaoyang Branch Company"
For 90% of these 800,000 lines the province is explicitly stated, and these are not the problem. The problem is the 100,000 or so where the province is not stated, such as the above example. In fact the province in that case is Jilin, but because of conflicting place names in China (e.g. Chaoyang can be a district in Beijing or a city in Jilin) I need to develop some "rules" that will derive the correct province. In the above example, the relevant "rule" is "if you see Changchun and Chaoyang in the same line, then the province is Jilin".
Now there are two obvious ways to do this. One is to set up a multiple character string search for each line which looks for two character strings and, if they are present, sticks some unique character string at the end of the line (say) [not sure if TP will do this for me?]. The other is to ask TP to look for any of a list of character strings (Changchun, Chaoyang, and a hundred others) and then copy or move them to the end of the line, preferably with another character like "^" preceding them, so I can then take the output and dump it into Excel to run some IF statements to see if my predetermined string pairs are present in any lines.
Assistance greatly appreciated.

Simon D
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Why not restrict to lines matching perl pattern
Changchun(.*)Chaoyang|Chaoyang(.*)Changchun
i.e. either ordering, then add a subfilter that adds a right margin of
Jilin
?
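
Outside TextPipe, the same rule can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the `tag_province` helper, the `" ^"` separator (borrowed from the "^" idea in the question above), and the second sample line, which is invented to show a non-matching case.

```python
import re

# Either ordering of the two place names on one line.
BOTH_NAMES = re.compile(r"Changchun.*Chaoyang|Chaoyang.*Changchun")

def tag_province(line: str, province: str = "Jilin", sep: str = " ^") -> str:
    """Append the province marker to lines that mention both place names."""
    if BOTH_NAMES.search(line):
        return f"{line}{sep}{province}"
    return line

lines = [
    "Changchun / China Life Insurance Company Limited, Changchun City, Chaoyang Branch Company",
    "Beijing / Some Company, Chaoyang District",  # invented example, only one name present
]
tagged = [tag_province(line) for line in lines]
for line in tagged:
    print(line)
```

Testing the line first and appending to the whole line afterwards sidesteps the question of where the match ends, which is what the right-margin subfilter depends on in TextPipe.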
simoninsing
Posts: 20
Joined: Fri Jun 05, 2009 11:11 pm

Post by simoninsing »

This looks highly promising and I think will have application to a few other challenges I face ... Thanks in anticipation.
simoninsing
Posts: 20
Joined: Fri Jun 05, 2009 11:11 pm

Post by simoninsing »

...and that has indeed helped enormously, and has bumped me into the Perl world which is what I needed.

However, sending the Replace command output to the end of the line is not working for me. Right margin as a sub-filter is not doing anything, and I can't find anything in the Perl documentation that tells you how to send Replace command output somewhere specific (although there is plenty on ^ and $ used on the Search side of the equation). Is there some easy way I can get the replaced text placed somewhere useful (i.e. at the end or beginning of the line)?
DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Ensure the pattern matches the remainder of the line so that Add Right Margin gets to place it in the right spot. Use a perl pattern of:

(?:Changchun.*Chaoyang|Chaoyang.*Changchun).*$
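
One detail worth checking when writing patterns like this: alternation binds more loosely than concatenation, so a trailing `.*$` placed after the last alternative belongs to that alternative only, unless the alternatives are grouped. The Python sketch below illustrates the difference; it uses Python's `re` as a stand-in, and the shortened sample line is an assumption for illustration.

```python
import re

line = "Changchun City, Chaoyang Branch Company"

# Without grouping, ".*$" is part of the second alternative only.
ungrouped = re.compile(r"Changchun(.*)Chaoyang|Chaoyang(.*)Changchun.*$")
# With a non-capturing group, ".*$" applies to whichever alternative matched.
grouped = re.compile(r"(?:Changchun.*Chaoyang|Chaoyang.*Changchun).*$")

ungrouped_match = ungrouped.search(line).group(0)
grouped_match = grouped.search(line).group(0)

print(repr(ungrouped_match))
print(repr(grouped_match))
```

With the ungrouped form, a line in "Changchun ... Chaoyang" order is matched only up to "Chaoyang", so anything appended after the match would land mid-line; the grouped form runs to the end of the line in both orderings.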