curl-users
Re: Building recursive list of URLs, no page downloading
Date: Thu, 30 Jan 2003 09:01:50 +1000
Wget would probably be a better bet, as it has recursive retrieval built in.
I don't know how well it would handle the cookies, though.
You could do it all with curl, but you would have to build a script around it that processes each page as it downloads and extracts the links.
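For what it's worth, here is a rough sketch of what such a wrapper script could look like, written in Python rather than as a shell script around curl. It is only an illustration under some assumptions: the function names (extract_links, make_cookie_fetcher, crawl) are made up for this example, the cookie file is assumed to be in the Netscape cookies.txt format, and only same-host links found in <a href> attributes are followed. Each page still has to be fetched so its links can be read, but nothing is saved to disk.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
import http.cookiejar
import urllib.request


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs (fragments stripped) linked from one page."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        links.append(absolute)
    return links


def make_cookie_fetcher(cookie_file):
    """Build a fetch(url) function that sends cookies loaded from a file.

    Assumption: cookie_file is in Netscape cookies.txt format.
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )

    def fetch(url):
        with opener.open(url, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")

    return fetch


def crawl(start_url, fetch, max_pages=1000):
    """Breadth-first walk of same-host links; returns the sorted URL list."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        try:
            html = fetch(url)
        except Exception:
            # Unreachable pages are still listed, just not expanded.
            continue
        for link in extract_links(html, url):
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

Usage would be something like crawl("http://www.example.com/", make_cookie_fetcher("cookies.txt")), where both the start URL and the cookies.txt file name are placeholders. Passing the fetch function in as an argument keeps the link-walking logic separate from the HTTP and cookie handling, so the fetcher could be swapped for one that shells out to curl instead.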
>>> curl_at_davidcross.com 30/01/03 8:52:16 am >>>
I need to spider thousands of URLs on our company's websites to see how
many URLs there are ahead of the move to our new CMS.
Access to some sections of the websites is user/pass restricted, and
authentication is performed through cookies, not standard HTTP auth, so it
is essential that I can load cookies into this program.
Also, I do not need to actually download each page, just note its URL and
move on to the next URL linked from it.
Not sure whether there is a way in curl, or perhaps wget?
Thanks for any suggestions,
David
-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
Received on 2003-01-30