Replacing bad HTML
Moderators: DataMystic Support, Moderators
I have a number of HTML pages converted from Word with variations of bad paragraph endings peppered throughout, which affect the spacing between paragraphs:
<br>
<br>
which should be replaced with
<p>
An exact match works, of course, but I don't trust the exact layout of this example to be universal, so I want to code an inclusive search that matches any pair of <br> tags separated only by whitespace and/or one or more forced spaces (' ').
I've tried a number of EZ Pattern variations but am stumped and my trial runs always miss the pattern.
Here is the trial data:
<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<br>
<br>
You have nominated great State and national tickets, your Governor,<br>
your Senators, your Congressmen, your State officers.<br>
<br>
Thanx in Advance. Textpipe is a miracle worker!
-Regards
Bernie Pobiak
Pubcomm Group NYC
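For anyone following along outside TextPipe, here is a minimal Python sketch of why the exact match is fragile. The three sample strings are hypothetical layouts (not taken from the real pages), and the assumption is that a "forced space" shows up either as the `&nbsp;` entity or as a literal U+00A0 character.

```python
# Hypothetical samples: the same bad paragraph ending in three layouts.
samples = [
    "sending.<br>\n<br>\nYou have",            # newline only
    "sending.<br>\r\n&nbsp;<br>\r\nYou have",  # CRLF plus a forced-space entity
    "sending.<br> \u00a0 \n<br>\nYou have",    # spaces around a literal NBSP
]

def exact_replace(text: str) -> str:
    """Replace only the one exact layout '<br>', newline, '<br>' with '<p>'."""
    return text.replace("<br>\n<br>", "<p>")

results = [exact_replace(s) for s in samples]
for before, after in zip(samples, results):
    # Show whether each layout was actually changed by the exact match.
    print(repr(after), "changed" if after != before else "missed")
```

Only the first layout is rewritten; the other two pass through untouched, which is the situation described above.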
- DataMystic Support
- Site Admin
- Posts: 2227
- Joined: Mon Jun 30, 2003 12:32 pm
- Location: Melbourne, Australia
Thanks Bernie,
Just use
Code: Select all
<br>[ 0+ whitespace or ' ' or cr or lf ]<br>
and replace with
<p>
Almost there...
That makes sense - but I tried it and for the example below it replaces with many <P>, not a single one. (see result below)
How can it be limited to acting between the <br> tags only once?
Thanx, Simon!
b
New Result:
<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<p>
<p><p><p> <p>
You have nominated great State and national tickets, your Governor,<p>
your Senators, your Congressmen, your State officers.<p>
<p><p><p> <p>
Sample:
<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<br>
<br>
You have nominated great State and national tickets, your Governor,<br>
your Senators, your Congressmen, your State officers.<br>
<br>
-Regards
Bernie Pobiak
Pubcomm Group NYC
- DataMystic Support
Sorry, it should be:
Code: Select all
<br>[ longest 0+ whitespace or ' ' or cr or lf ]<br>
Hmmm... still multiple <p> in the result (see new results below). Is it possible to apply a OneOrMore to the occurrences of ' '?
New Result:
<p>That means that from this great State of Michigan we want that part of the leadership. After all, you have the Senator who is the head of the Republican Policy Committee in the Senate body. By all means you must send him back and support him with the big delegation that you are capable of sending.<p><p><p><p> <p>You have nominated grea
Thanx.
-Regards
Bernie Pobiak
Pubcomm Group NYC
- DataMystic Support
Sorry, second mistake.
It should be:
Code: Select all
<br>[ longest 0+ (whitespace or ' ' or cr or lf) ]<br>
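As a rough cross-check outside TextPipe, the same idea can be sketched with a Python regular expression. This is an analogue, not a statement of how EZ Patterns are evaluated internally. It assumes the forced spaces appear as `&nbsp;` entities or literal U+00A0 characters, and the sample text below uses a hypothetical filler line for illustration.

```python
import re

# Assumption: a "forced space" is either the &nbsp; entity or a literal
# U+00A0 character. For str patterns in Python 3, \s already covers U+00A0
# along with spaces, tabs, CR and LF.
DOUBLE_BR = re.compile(r"<br>(?:\s|&nbsp;)*<br>", re.IGNORECASE)

def fix_paragraph_breaks(html: str) -> str:
    """Collapse each <br> ... <br> pair separated only by filler into one <p>."""
    return DOUBLE_BR.sub("<p>", html)

sample = (
    "capable of sending.<br>\n"
    "&nbsp;&nbsp; <br>\n"  # hypothetical filler line between the two tags
    "You have nominated great State and national tickets, your Governor,<br>\n"
    "your Senators, your Congressmen, your State officers.<br>\n"
    "<br>\n"
)

fixed = fix_paragraph_breaks(sample)
print(fixed)
```

Each double break becomes a single `<p>`, while the lone `<br>` after "Governor," is left alone because ordinary text follows it before the next tag.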
Bingo!
Perfect! That works exactly right! Thank you Simon!
So that I learn from the experience, let me try to break down the easy pattern:
<br>[ longest 0+ (whitespace or ' ' or cr or lf) ]<br>
means
Find occurrences where there are two <br> codes containing between them the longest possible run of zero or more repetitions of whitespace, ' ', cr, or lf
I think I get it. Thanx again!
-Regards
Bernie Pobiak
Pubcomm Group NYC
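The role of the parentheses in that breakdown can be illustrated with a regular-expression analogy in Python. This is only a sketch of the general precedence idea. Whether EZ Patterns bind the repetition in exactly the same way is an assumption, and the example does not try to reproduce TextPipe's multiple `<p>` output from earlier in the thread. The `mixed` string is a hypothetical filler layout.

```python
import re

# Ungrouped analogue: the repetition binds only to the first alternative,
# so the gap must be EITHER a run of plain whitespace OR exactly one &nbsp;.
UNGROUPED = re.compile(r"<br>(?:\s*|&nbsp;)<br>")

# Grouped analogue: the repetition applies to the whole set of alternatives,
# so any mixture of whitespace and &nbsp; entities is consumed in one match.
GROUPED = re.compile(r"<br>(?:\s|&nbsp;)*<br>")

mixed = "<br>\n&nbsp;&nbsp; <br>"  # hypothetical mixed filler between the tags

ungrouped_match = UNGROUPED.search(mixed)
grouped_match = GROUPED.search(mixed)

print("ungrouped:", ungrouped_match)
print("grouped:  ", grouped_match.group(0) if grouped_match else None)
```

In this analogy the ungrouped form cannot span a mixed run of filler, while the grouped form treats the whole run as one unit between the two tags.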