Extraction from HTML Page for inclusion in CSV

Get help with installation and running here.

Moderators: DataMystic Support, Moderators

Posts: 1
Joined: Mon Mar 13, 2006 12:37 pm

Post by insidemagic »


I am evaluating TextPipe and WebPipe. I have 1300 articles written and published in HTML that I want to move into a Drupal site, so I need to load them into MySQL. I tried marking the portions of the source code for each page with markers like <!-- item --> // <!-- enditem --> and <!-- content --> // <!-- endcontent -->.

I confess that I am not bright, but at least I'm lazy. I want to process the web pages, pull out the text (HTML code) between the ad hoc section markers, and put it into a CSV file in separate columns headed by the description. All of the content for a given article would be in the column "Content", the title for the same article would be under "Title", etc.

Is there a way to do this? I appreciate any help at all. The two products have blown me away with their stability and power. Outstanding!

BTW, I could avoid all of this if there were a way to convert my web pages into RSS to input into Drupal.

Thank you in advance for any thoughts or suggestions!

DataMystic Support
Site Admin
Posts: 2227
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support »

Hi Tim,

Are these the only two markers?

If so, a judicious search/replace can do it. Can you email us a sample file so we can show you how?
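
Purely as an illustration of the idea, and not a TextPipe filter, here is a rough Python sketch of pulling the text between each marker pair and writing one CSV row per page. It assumes each marker pair appears at most once per page, and it assumes that the <!-- item --> pair holds the title; that mapping, along with the names FIELDS, extract_fields and write_csv, is made up for this example and would need adjusting to the real pages.

```python
import csv
import re

# Column name -> (start marker, end marker).
# Assumption: "item" holds the article title; change this to suit the real pages.
FIELDS = {
    "Title": ("<!-- item -->", "<!-- enditem -->"),
    "Content": ("<!-- content -->", "<!-- endcontent -->"),
}


def extract_fields(html):
    """Return a dict mapping each column name to the text between its markers."""
    row = {}
    for column, (start, end) in FIELDS.items():
        # Non-greedy match so we stop at the first end marker; DOTALL lets
        # the match span multiple lines.
        pattern = re.escape(start) + r"(.*?)" + re.escape(end)
        match = re.search(pattern, html, flags=re.DOTALL)
        row[column] = match.group(1).strip() if match else ""
    return row


def write_csv(pages, out):
    """Write a header row plus one row per page to the open file object `out`."""
    writer = csv.DictWriter(out, fieldnames=list(FIELDS))
    writer.writeheader()
    for html in pages:
        writer.writerow(extract_fields(html))
```

To run it over a folder, read each file into a string and pass the list of strings to write_csv together with an output file opened with newline="" so the csv module controls the line endings. HTML containing commas, quotes or line breaks is quoted automatically by the csv module.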