Extraction from HTML Page for inclusion in CSV

Posted: Mon Mar 13, 2006 12:47 pm
by insidemagic
Hi:

I am evaluating TextPipe and WebPipe. I have 1300 articles written and published in HTML that I want to move into a Drupal site, so I need to load them into MySQL. I tried marking the portions of the source code for each page with things like <!-- item --> // <!-- enditem --> and <!-- content --> // <!-- endcontent -->.

I confess that I am not bright, but at least I'm lazy. I want to be able to process the web pages, pull out the text (HTML code) between the ad hoc section markers, and put it into a CSV file in separate columns headed by the description. All of the content for a given article would be in the column "Content", the title for the same article would be under "Title", etc.

Is there a way to do this? I appreciate any help at all. The two products have blown me away with their stability and power. Outstanding!

BTW, I could avoid all of this if there were a way to convert my web pages into RSS to import into Drupal.

Thank you in advance for any thoughts or suggestions!

Tim

Posted: Tue Mar 14, 2006 9:28 am
by DataMystic Support
Hi Tim,

Are these the only two markers?

If so, a judicious search/replace can do it. Can you email us a sample file so we can show you how?
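
For readers who want to see the general idea outside of TextPipe, here is a minimal Python sketch of the marker-based extraction Tim describes. It is not the TextPipe filter itself. It assumes each page has at most one pair of each marker, that the closing marker is the opening name prefixed with "end", and that the CSV columns are simply named after the markers ("item", "content"). The function names `extract_sections` and `pages_to_csv` are made up for this illustration.

```python
import csv
import io
import re

# Marker names taken from the post; anything else here is illustrative.
MARKERS = ["item", "content"]


def extract_sections(html, markers=MARKERS):
    """Return {marker: text between <!-- marker --> and <!-- endmarker -->}."""
    row = {}
    for name in markers:
        pattern = re.compile(
            r"<!--\s*" + re.escape(name) + r"\s*-->"
            r"(.*?)"
            r"<!--\s*end" + re.escape(name) + r"\s*-->",
            re.DOTALL | re.IGNORECASE,
        )
        match = pattern.search(html)
        # A missing marker pair yields an empty cell rather than an error.
        row[name] = match.group(1).strip() if match else ""
    return row


def pages_to_csv(pages, markers=MARKERS):
    """Build CSV text with one row per page and one column per marker."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=markers, lineterminator="\n")
    writer.writeheader()
    for html in pages:
        writer.writerow(extract_sections(html, markers))
    return buffer.getvalue()


sample = (
    "<html><!-- item -->My Title<!-- enditem --><body>"
    "<!-- content --><p>Hello, world</p><!-- endcontent -->"
    "</body></html>"
)
print(pages_to_csv([sample]))
```

The csv module takes care of quoting, so HTML containing commas, quotes, or line breaks still lands in a single cell. For 1300 files you would read each file from disk and pass the contents in as `pages`.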