downloading web-page by TP

Posted: Sun May 22, 2016 11:49 pm
by nikolas1612
Is there any way to use TP to download the source code of the following web page:

https://sev.life/
Thanks

Re: downloading web-page by TP

Posted: Mon May 23, 2016 4:06 pm
by DataMystic Support
Yes, just put the URL in the file list, and set the output filter to write to a new file.

Re: downloading web-page by TP

Posted: Mon May 23, 2016 6:47 pm
by nikolas1612
Hmmm
Did you actually try it?
The result is always blank, and the "analyze file" option produces an "Internet error 0" message.

Re: downloading web-page by TP

Posted: Mon May 23, 2016 10:30 pm
by dfhtextpipe
That URL is not a file. It's a remote folder.

TP requires a list of files or a wildcard to be specified.

But can TP even input files over TLS or SSL?

And how do we know that any files in that folder are free from malware?

David

Re: downloading web-page by TP

Posted: Mon May 23, 2016 11:55 pm
by nikolas1612
That URL is not a file. It's a remote folder.
I know that, but I was asking about downloading a web page, not a file. I don't agree that TP only accepts files, since I have successfully downloaded web pages using links similar to the one above. The problem only occurs with some pages, and this is one of them. My guess is that the secure protocol (https) is the cause, but I'm not sure.

Re: downloading web-page by TP

Posted: Tue May 24, 2016 11:03 am
by DataMystic Support
Yes, TextPipe can download any url. It only checks for :// in the string, so https should still work.

No idea why https is not working.
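
To illustrate the kind of check described above, here is a minimal sketch of a `://` test applied to file list entries. This is not TextPipe's actual code; the function names `looks_like_url` and `split_file_list` are hypothetical and exist only for illustration. It shows that under such a check an https entry is classified the same way as an http one.

```python
def looks_like_url(entry: str) -> bool:
    """Treat any entry containing '://' as a URL (assumed rule, per the post above)."""
    return "://" in entry


def split_file_list(entries):
    """Split file list entries into (urls, local_paths) using the '://' check."""
    urls = [e for e in entries if looks_like_url(e)]
    local_paths = [e for e in entries if not looks_like_url(e)]
    return urls, local_paths


if __name__ == "__main__":
    urls, local_paths = split_file_list([
        "https://sev.life/",
        "http://seasalteatery.wordpress.com",
        r"C:\data\input.txt",
    ])
    print("URLs:", urls)
    print("Local paths:", local_paths)
```

If a check like this is all that happens, the scheme itself should not matter, so a failure with https would have to come from a later step such as the actual download.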

Re: downloading web-page by TP

Posted: Sat Aug 06, 2016 4:19 am
by alnico
I also have issues accessing sites over https.

It seems that TextPipe simply ignores filenames beginning with https, since no file input record is created in the status tab during processing.
Furthermore, it skips these filenames in a split second, which is not even enough time to send a request to the server.

Sometimes an https website can be accessed using http, if the http address redirects to https, but not always...

Here are a couple of restaurant sites as examples, tested in a browser:

--> This works:

http://seasalteatery.wordpress.com
resolves to
https://seasalteatery.wordpress.com
200 OK
TextPipe can access this website using filename: http://seasalteatery.wordpress.com


-->This does not work:

http://www.spoonandstable.com
resolves to
https://www.spoonandstable.com
301 moved permanently
TextPipe cannot access this website using filename: http://www.spoonandstable.com
This is what is received:
<html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html>
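
The browser checks above can also be reproduced outside TextPipe. The following is a rough sketch, using only the Python standard library, that requests a URL without following redirects automatically and records the status code of each hop. The names `fetch_once` and `trace_redirects`, the 10-second timeout, and the hop limit are assumptions chosen for this example, not anything TextPipe does.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url):
    """Send one GET request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_once, max_hops=5):
    """Follow Location headers by hand and return a list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops


if __name__ == "__main__":
    for hop_url, status in trace_redirects("http://www.spoonandstable.com"):
        print(status, hop_url)
```

Running this against both sites shows whether each http address answers directly or redirects to https, and which status the final hop returns. The `fetch` parameter is there so the redirect logic can be exercised with a stand-in function instead of a live server.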


In the end, however, the real problem seems to be that TextPipe isn't even attempting to make a server request for https.

Brent
TP v8.9.9

Re: downloading web-page by TP

Posted: Sun Aug 07, 2016 11:03 pm
by DataMystic Support
I have confirmed the https redirection issue and updated the code for Windows 10, where it now works perfectly.

Re: downloading web-page by TP

Posted: Tue Aug 09, 2016 12:14 am
by alnico
Thanks for the fix, Simon.

But this does not fix the bigger issue, where https URLs are skipped altogether (unless this has been fixed since TP v8.9.9).
They are not even logged while the input file list is processed, as in the original request from nikolas1612 for https://sev.life/

Re: downloading web-page by TP

Posted: Tue Aug 09, 2016 10:07 pm
by DataMystic Support
Thanks - this was still an issue, and it took a little while to get it working.

Now it's all ready for TextPipe 10.0!