curl-library
Re: libcurl stumbles on weird redirect location
Date: Sun, 25 Dec 2011 22:56:01 -0600
Hi Daniel,
You are right that
http://www.officedepot.com;jsessionid=0000EXsMRFMF5kwJo26qgOif31d:13ddq0tfm
is treated as a search term by Chrome, but I am afraid I was not entirely
clear that THIS URL
http://www.officedepot.com/promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com
needs to be used as the seed URL. The server responds with a 301 carrying
that weird "Location" header. However, when you try this seed URL in Chrome,
it navigates correctly to the new location. The mystery for me is that
Chrome seems to get a well-formed Location header, in contrast to curl. Here
is the Chrome log captured from chrome://net-internals/#events; please see
below.
Are any special request headers making the difference? I wonder if I can
achieve a similar effect with curl.
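One way to test the header question outside the browser is to replay the request with Chrome's headers and look at the raw Location value without following the redirect. Below is a minimal sketch in Python (used only for illustration; it is not curl or Curb code). The header values are copied from the log below; `chrome_like_headers` and `fetch_location` are hypothetical helper names, and the Cookie value has to be supplied by hand because the log strips it. Whether the cookie is what changes the server's answer is only a guess at this point.

```python
import http.client
from urllib.parse import urlsplit

# Request headers copied from the Chrome log in this message.
# Host is left out because http.client adds it automatically.
CHROME_HEADERS = {
    "Connection": "keep-alive",
    "Cache-Control": "max-age=0",
    "User-Agent": ("Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.2 "
                   "(KHTML, like Gecko) Ubuntu/11.10 "
                   "Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2"),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "en-US,en;q=0.8",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
}


def chrome_like_headers(cookie=None):
    """Return a copy of the logged headers, optionally with a Cookie header."""
    headers = dict(CHROME_HEADERS)
    if cookie is not None:
        # Assumption: the caller pastes a cookie string taken from the browser.
        headers["Cookie"] = cookie
    return headers


def fetch_location(url, headers):
    """Send one plain-HTTP GET without following redirects.

    Returns (status code, raw Location header value or None).
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("GET", target, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `fetch_location` on the seed URL once with `chrome_like_headers()` and once with a cookie string passed in would show whether the Location value differs between the two requests.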
Start Time: Sun Dec 25 2011 22:32:59 GMT-0600 (CST)
t=1324873979670 [st= 0] +REQUEST_ALIVE [dt=396]
t=1324873979670 [st= 0] URL_REQUEST_START_JOB [dt= 0]
--> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
--> method = "GET"
--> priority = 0
--> url = "http://www.officedepot.com/promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com"
t=1324873979670 [st= 0] +URL_REQUEST_START_JOB [dt=116]
--> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
--> method = "GET"
--> priority = 0
--> url = "http://www.officedepot.com/promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com"
t=1324873979670 [st= 0] HTTP_CACHE_GET_BACKEND [dt= 0]
t=1324873979670 [st= 0] HTTP_CACHE_OPEN_ENTRY [dt= 1]
--> net_error = -2 (FAILED)
t=1324873979671 [st= 1] HTTP_CACHE_CREATE_ENTRY [dt= 5]
t=1324873979676 [st= 6] HTTP_CACHE_ADD_TO_ENTRY [dt= 0]
t=1324873979676 [st= 6] +HTTP_STREAM_REQUEST [dt= 3]
t=1324873979679 [st= 9] HTTP_STREAM_REQUEST_BOUND_TO_JOB
--> source_dependency = {"id":42262,"type":11}
t=1324873979679 [st= 9] -HTTP_STREAM_REQUEST
t=1324873979679 [st= 9] +HTTP_TRANSACTION_SEND_REQUEST [dt= 0]
t=1324873979679 [st= 9] HTTP_TRANSACTION_SEND_REQUEST_HEADERS
--> GET /promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com HTTP/1.1
Host: www.officedepot.com
Connection: keep-alive
Cache-Control: max-age=0
User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: [value was stripped]
t=1324873979679 [st= 9] -HTTP_TRANSACTION_SEND_REQUEST
t=1324873979679 [st= 9] +HTTP_TRANSACTION_READ_HEADERS [dt=106]
t=1324873979679 [st= 9] HTTP_STREAM_PARSER_READ_HEADERS [dt=106]
t=1324873979785 [st=115] HTTP_TRANSACTION_READ_RESPONSE_HEADERS
--> HTTP/1.1 301 Moved Permanently
Server: IBM_HTTP_Server
Pragma: No-cache
Cache-Control: no-cache,no-store,max-age=0
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Location: http://www.officedepot.com
Content-Encoding: gzip
P3P: CP="ALL DEVa TAIa OUR BUS UNI NAV STA PRE" policyref="http://www.officedepot.com/w3c/p3p.xml"
Content-Length: 20
Content-Type: text/html
Content-Language: en-US
Date: Mon, 26 Dec 2011 04:32:59 GMT
Connection: keep-alive
Vary: Accept-Encoding
Set-Cookie: [value was stripped]
Set-Cookie: [value was stripped]
t=1324873979785 [st=115] -HTTP_TRANSACTION_READ_HEADERS
t=1324873979785 [st=115] +HTTP_CACHE_WRITE_INFO [dt= 1]
t=1324873979786 [st=116] URL_REQUEST_REDIRECTED
--> location = "http://www.officedepot.com/"
t=1324873979786 [st=116] -URL_REQUEST_START_JOB
t=1324873979786 [st=116] URL_REQUEST_START_JOB [dt= 11]
--> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
--> method = "GET"
--> priority = 0
--> url = "http://www.officedepot.com/"
t=1324873979800 [st=130] +URL_REQUEST_START_JOB [dt=238]
--> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
--> method = "GET"
--> priority = 0
--> url = "http://www.officedepot.com/"
t=1324873979800 [st=130] HTTP_CACHE_GET_BACKEND [dt= 0]
t=1324873979800 [st=130] HTTP_CACHE_OPEN_ENTRY [dt= 0]
--> net_error = -2 (FAILED)
t=1324873979800 [st=130] HTTP_CACHE_CREATE_ENTRY [dt= 1]
t=1324873979801 [st=131] HTTP_CACHE_ADD_TO_ENTRY [dt= 0]
t=1324873979801 [st=131] +HTTP_STREAM_REQUEST [dt= 0]
t=1324873979801 [st=131] HTTP_STREAM_REQUEST_BOUND_TO_JOB
--> source_dependency = {"id":42293,"type":11}
t=1324873979801 [st=131] -HTTP_STREAM_REQUEST
t=1324873979801 [st=131] +HTTP_TRANSACTION_SEND_REQUEST [dt= 0]
t=1324873979801 [st=131] HTTP_TRANSACTION_SEND_REQUEST_HEADERS
--> GET / HTTP/1.1
Host: www.officedepot.com
Connection: keep-alive
Cache-Control: max-age=0
User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: [value was stripped]
t=1324873979801 [st=131] -HTTP_TRANSACTION_SEND_REQUEST
t=1324873979801 [st=131] +HTTP_TRANSACTION_READ_HEADERS [dt=236]
t=1324873979801 [st=131] HTTP_STREAM_PARSER_READ_HEADERS [dt=236]
t=1324873980037 [st=367] HTTP_TRANSACTION_READ_RESPONSE_HEADERS
--> HTTP/1.1 200 OK
Server: IBM_HTTP_Server
Pragma: No-cache
Cache-Control: no-cache,no-store,max-age=0
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Encoding: gzip
P3P: CP="ALL DEVa TAIa OUR BUS UNI NAV STA PRE" policyref="http://www.officedepot.com/w3c/p3p.xml"
Content-Type: text/html; charset=UTF-8
Content-Language: en-US
Content-Length: 18021
Date: Mon, 26 Dec 2011 04:33:00 GMT
Connection: keep-alive
Vary: Accept-Encoding
t=1324873980037 [st=367] -HTTP_TRANSACTION_READ_HEADERS
t=1324873980037 [st=367] HTTP_CACHE_WRITE_INFO [dt= 1]
t=1324873980038 [st=368] -URL_REQUEST_START_JOB
t=1324873980038 [st=368] HTTP_TRANSACTION_READ_BODY [dt= 0]
t=1324873980039 [st=369] HTTP_TRANSACTION_READ_BODY [dt= 0]
t=1324873980040 [st=370] HTTP_TRANSACTION_READ_BODY [dt= 0]
t=1324873980040 [st=370] HTTP_TRANSACTION_READ_BODY [dt= 0]
t=1324873980041 [st=371] HTTP_TRANSACTION_READ_BODY [dt= 0]
t=1324873980041 [st=371] HTTP_TRANSACTION_READ_BODY [dt= 24]
t=1324873980066 [st=396] HTTP_TRANSACTION_READ_BODY [dt= 0]
t=1324873980066 [st=396] -REQUEST_ALIVE
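The two values worth comparing in the log above are the raw Location header the server sent and the URL Chrome reports in URL_REQUEST_REDIRECTED. A small sketch for pulling both out of a saved net-internals text dump follows; `redirect_targets` is a hypothetical helper, and the regular expressions assume the dump looks like the excerpt in this message, which may not hold for other Chrome versions.

```python
import re


def redirect_targets(log_text):
    """Return (raw Location header values, URLs Chrome redirected to)."""
    # Header lines as sent by the server, e.g. "Location: http://example.com"
    raw = re.findall(r"^[ \t]*Location:[ \t]*(\S+)[ \t]*$",
                     log_text, flags=re.MULTILINE)
    # Chrome's own record of the redirect, e.g. '--> location = "http://..."'
    followed = re.findall(r'^[ \t]*-->[ \t]*location[ \t]*=[ \t]*"([^"]*)"',
                          log_text, flags=re.MULTILINE)
    return raw, followed


# Excerpt taken from the log in this message.
sample = """\
--> HTTP/1.1 301 Moved Permanently
Server: IBM_HTTP_Server
Location: http://www.officedepot.com
Content-Length: 20
t=1324873979786 [st=116] URL_REQUEST_REDIRECTED
--> location = "http://www.officedepot.com/"
"""

raw, followed = redirect_targets(sample)
print(raw)       # ['http://www.officedepot.com']
print(followed)  # ['http://www.officedepot.com/']
```

In this capture the only difference is the trailing slash Chrome adds for the empty path; no jsessionid suffix appears in the header Chrome received.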
On Sun, Dec 25, 2011 at 12:18 PM, Daniel Stenberg <daniel_at_haxx.se> wrote:
> On Sat, 24 Dec 2011, Alex Vinnik wrote:
>
>> I am having a problem using libcurl in my Ruby app (Curb gem binds
>> directly to libcurl). Specifically libcurl can't follow a weird redirect
>> served by a web server. Server Location header "Location:
>> http://www.officedepot.com;jsessionid=0000EXsMRFMF5kwJo26qgOif31d:13ddq0tfm"
>> doesn't look RFC compliant to me. For whatever reason jsessionid gets added
>> to the end of new location.
>>
>
> Wow. That's a broken URL that I've not seen used before.
>
>> Somehow browsers can handle this redirect.
>>
>
> I pasted that URL into chrome, and it can't deal with it when given in the
> address bar at least. It treats it as a search string instead.
>
> I pasted it into Firefox's URL bar and it inserted a slash in front of the
> first semicolon by itself and then showed the site.
>
> So browsers at least aren't uniformly considering this a good address.
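The Firefox behaviour described here (a slash inserted in front of the first semicolon) can be imitated on the client side before the redirect is followed. The sketch below is only a heuristic under that assumption; `repair_location` is a hypothetical helper, not something libcurl or Curb provides, and it treats any ";" directly after the host as the start of a path parameter.

```python
import re

# Scheme and host followed immediately by ";" with no "/" in between.
_HOST_THEN_SEMICOLON = re.compile(r"^(https?://[^/?#;]+)(;.*)$", re.IGNORECASE)


def repair_location(location):
    """Insert the missing "/" between the host and a ";param" suffix."""
    match = _HOST_THEN_SEMICOLON.match(location)
    if match:
        return match.group(1) + "/" + match.group(2)
    # Anything else is returned unchanged.
    return location


print(repair_location(
    "http://www.officedepot.com;jsessionid=0000EXsMRFMF5kwJo26qgOif31d:13ddq0tfm"))
# http://www.officedepot.com/;jsessionid=0000EXsMRFMF5kwJo26qgOif31d:13ddq0tfm

print(repair_location("http://www.officedepot.com/"))
# http://www.officedepot.com/
```

An application that follows redirects by hand could pass each Location value through such a function before issuing the next request.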
>
>> if there is a way to work around this problem?
>>
>
> No, I can't think of any.
>
> Since this is a fairly big site and at least one of the major browsers
> supports this format, I think we should consider supporting it. Even though
> it would be under protest.
>
> Any other opinions?
>
> --
>
> / daniel.haxx.se
> -------------------------------------------------------------------
> List admin: http://cool.haxx.se/list/listinfo/curl-library
> Etiquette: http://curl.haxx.se/mail/etiquette.html
>
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2011-12-26