Extract Data From a table

Posted: Sat Nov 29, 2008 8:20 am
by buckley
Hello,

I'm very interested in trying out TextPipe after discovering it through Offline Explorer.

I don't want you to do my homework, but I would like to check with you whether what I'm aiming for is possible.

On this page http://www.humo.be/cps/rde/xchg/humo/hs ... m_Top.html you can find the rating given to a movie.
It is below the top 10.

E.g. Vinyan has 3.5 stars.

Is it possible with TP to dump this data into a SQL Server database every time a new movie shows up in the list?

Basically, my challenge boils down to parsing a table and detecting which rating each row falls under (I think).

Kind Regards, Tom

Re: Extract Data From a table

Posted: Wed Dec 03, 2008 7:10 pm
by buckley
Bump?

Re: Extract Data From a table

Posted: Thu Dec 04, 2008 6:54 am
by DataMystic Support
Sure Tom,

As you say, you need to extract just the relevant sections. Look at the white papers section of the site for a useful guide to website mining.

The best approach would be to attempt to insert every title into a database each time you poll the website, and let a unique index on the table discard the duplicate rows.
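
As a rough illustration of that idea (not TextPipe configuration), here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for SQL Server, and the table name `movies`, the index name `ux_movies_title` and the function `store_new_titles` are all made up for the example. On SQL Server the equivalent would likely be a unique index plus either the IGNORE_DUP_KEY index option or catching the duplicate-key error, but that part is an assumption and not shown here.

```python
import sqlite3


def store_new_titles(conn, rows):
    """Insert (title, rating) pairs and return the titles that were new.

    Hypothetical helper: the unique index on title makes the database
    silently skip any title that was stored on an earlier poll.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies (title TEXT NOT NULL, rating REAL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS ux_movies_title ON movies (title)"
    )
    new_titles = []
    for title, rating in rows:
        cur = conn.execute(
            "INSERT OR IGNORE INTO movies (title, rating) VALUES (?, ?)",
            (title, rating),
        )
        # rowcount is 1 when the row was inserted, 0 when it was ignored.
        if cur.rowcount == 1:
            new_titles.append(title)
    conn.commit()
    return new_titles


# Two simulated polls of the page; "Vinyan" appears in both.
conn = sqlite3.connect(":memory:")
first_poll = store_new_titles(conn, [("Vinyan", 3.5), ("Movie 1", 3.0)])
second_poll = store_new_titles(conn, [("Vinyan", 3.5), ("Movie 3", 3.0)])
```

With this approach the polling job never has to work out which titles are new; it submits everything it extracted and the index rejects the repeats.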

Re: Extract Data From a table

Posted: Thu Dec 04, 2008 8:16 am
by buckley
OK, that makes sense (ignore duplicate values in the index).

What technique should I use to capture the rating?

rating ***
Movie 1
Movie 2
Movie 3
rating *
Movie 4

=> What if Movie 3 was new and I need to store 3 stars with it?

Should I write procedural logic to capture this, or is there another technique you can advise?

Regards, Tom

Re: Extract Data From a table

Posted: Thu Dec 04, 2008 10:15 am
by DataMystic Support
Hi Tom,

First generate an extract that contains just titles and ratings. Then use a Restrict to each line filter, and inside it add two steps: capture the rating to a variable, then add it to each line with an Add Left Margin filter.
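
The same carry-forward logic can be sketched outside TextPipe. The Python below is only an illustration of the technique, not the filter setup itself. It assumes the extract looks like the simplified sample earlier in the thread (a "rating" line made of asterisks, followed by the titles that belong to it), and the function name `attach_ratings` is made up. Half stars such as 3.5 would need a different rating format than a run of asterisks.

```python
def attach_ratings(lines, marker="rating "):
    """Pair every title with the most recent rating line above it.

    Hypothetical helper: remembers the last rating seen (the "variable")
    and attaches it to each following title line.
    """
    current = None
    pairs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(marker):
            # Remember the rating; here it is simply the number of asterisks.
            current = line[len(marker):].count("*")
            continue
        if current is None:
            # Title seen before any rating header; nothing to attach.
            continue
        pairs.append((current, line))
    return pairs


extract = """rating ***
Movie 1
Movie 2
Movie 3
rating *
Movie 4""".splitlines()

pairs = attach_ratings(extract)
for stars, title in pairs:
    print(f"{stars}\t{title}")
```

Each output line then carries its own rating, so a new title such as Movie 3 can be stored with its 3 stars without looking at the lines around it.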