curl-library
Re: libcurl stumbles on weird redirect location
Date: Sun, 25 Dec 2011 22:56:01 -0600
Hi Daniel,

You are right that
http://www.officedepot.com;jsessionid=0000EXsMRFMF5kwJo26qgOif31d:13ddq0tfm
is treated as a search term by Chrome, but I am afraid I was not entirely
clear: it is this URL
http://www.officedepot.com/promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com
that needs to be used as the seed. The server responds to it with a 301
carrying that weird "Location" header. However, when you try the seed URL
in Chrome, it navigates correctly to the new location. The mystery for me
is that Chrome seems to get a well-formed Location header, in contrast to
curl. The Chrome log captured from chrome://net-internals/#events follows
below.

Are there any special request headers that make the difference? I wonder
if I can achieve a similar effect with curl.
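To narrow down which header might matter, a rough sketch like the following could list what Chrome sends that a plain curl request does not. The Chrome set is copied from the log below; the curl defaults (CURL_DEFAULT_HEADERS) are an assumption about a bare request, not captured output, and header_diff is a made-up helper name. The Cookie header is a plausible candidate, since servlet-style servers commonly fall back to ;jsessionid URL rewriting when no session cookie is presented, but that is a guess and not something the log proves.

```python
# Request headers Chrome sent, copied from the net-internals log in this mail.
CHROME_HEADERS = {
    "Host": "www.officedepot.com",
    "Connection": "keep-alive",
    "Cache-Control": "max-age=0",
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.2 (KHTML, like Gecko) "
        "Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "en-US,en;q=0.8",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
    "Cookie": "[value was stripped]",
}

# ASSUMPTION: roughly what a bare curl request sends; not taken from the log.
CURL_DEFAULT_HEADERS = {
    "Host": "www.officedepot.com",
    "User-Agent": "curl/7.x",
    "Accept": "*/*",
}


def header_diff(browser, client):
    """Return (missing, different).

    missing:   header names the browser sends that the client does not.
    different: header names both send, but with different values.
    Header names are compared case-insensitively.
    """
    client_lower = {name.lower(): value for name, value in client.items()}
    missing = sorted(n for n in browser if n.lower() not in client_lower)
    different = sorted(
        n for n, v in browser.items()
        if n.lower() in client_lower and client_lower[n.lower()] != v
    )
    return missing, different


missing, different = header_diff(CHROME_HEADERS, CURL_DEFAULT_HEADERS)
print("missing from curl:", missing)
print("different values: ", different)
```

Adding the missing headers back one at a time (starting with a cookie from an earlier visit to the site) would show which one, if any, changes the Location the server returns.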
Start Time: Sun Dec 25 2011 22:32:59 GMT-0600 (CST)
t=1324873979670 [st=  0] +REQUEST_ALIVE                             [dt=396]
t=1324873979670 [st=  0]     URL_REQUEST_START_JOB                  [dt=  0]
                             --> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
                             --> method = "GET"
                             --> priority = 0
                             --> url = "http://www.officedepot.com/promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com"
t=1324873979670 [st=  0]    +URL_REQUEST_START_JOB                  [dt=116]
                             --> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
                             --> method = "GET"
                             --> priority = 0
                             --> url = "http://www.officedepot.com/promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com"
t=1324873979670 [st=  0]        HTTP_CACHE_GET_BACKEND              [dt=  0]
t=1324873979670 [st=  0]        HTTP_CACHE_OPEN_ENTRY               [dt=  1]
                                --> net_error = -2 (FAILED)
t=1324873979671 [st=  1]        HTTP_CACHE_CREATE_ENTRY             [dt=  5]
t=1324873979676 [st=  6]        HTTP_CACHE_ADD_TO_ENTRY             [dt=  0]
t=1324873979676 [st=  6]       +HTTP_STREAM_REQUEST                 [dt=  3]
t=1324873979679 [st=  9]           HTTP_STREAM_REQUEST_BOUND_TO_JOB
                                   --> source_dependency = {"id":42262,"type":11}
t=1324873979679 [st=  9]       -HTTP_STREAM_REQUEST
t=1324873979679 [st=  9]       +HTTP_TRANSACTION_SEND_REQUEST       [dt=  0]
t=1324873979679 [st=  9]           HTTP_TRANSACTION_SEND_REQUEST_HEADERS
                                   --> GET /promo/redir.do?siteid=qpF0HYnRugA-63LcCNNtsuVrjXbS8XCqNA&url=http://www.officedepot.com HTTP/1.1
                                       Host: www.officedepot.com
                                       Connection: keep-alive
                                       Cache-Control: max-age=0
                                       User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2
                                       Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                                       Accept-Encoding: gzip,deflate,sdch
                                       Accept-Language: en-US,en;q=0.8
                                       Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
                                       Cookie: [value was stripped]
t=1324873979679 [st=  9]       -HTTP_TRANSACTION_SEND_REQUEST
t=1324873979679 [st=  9]       +HTTP_TRANSACTION_READ_HEADERS       [dt=106]
t=1324873979679 [st=  9]           HTTP_STREAM_PARSER_READ_HEADERS  [dt=106]
t=1324873979785 [st=115]           HTTP_TRANSACTION_READ_RESPONSE_HEADERS
                                   --> HTTP/1.1 301 Moved Permanently
                                       Server: IBM_HTTP_Server
                                       Pragma: No-cache
                                       Cache-Control: no-cache,no-store,max-age=0
                                       Expires: Thu, 01 Jan 1970 00:00:00 GMT
                                       Location: http://www.officedepot.com
                                       Content-Encoding: gzip
                                       P3P: CP="ALL DEVa TAIa OUR BUS UNI NAV STA PRE" policyref="http://www.officedepot.com/w3c/p3p.xml"
                                       Content-Length: 20
                                       Content-Type: text/html
                                       Content-Language: en-US
                                       Date: Mon, 26 Dec 2011 04:32:59 GMT
                                       Connection: keep-alive
                                       Vary: Accept-Encoding
                                       Set-Cookie: [value was stripped]
                                       Set-Cookie: [value was stripped]
t=1324873979785 [st=115]       -HTTP_TRANSACTION_READ_HEADERS
t=1324873979785 [st=115]       +HTTP_CACHE_WRITE_INFO               [dt=  1]
t=1324873979786 [st=116]           URL_REQUEST_REDIRECTED
                                   --> location = "http://www.officedepot.com/"
t=1324873979786 [st=116]    -URL_REQUEST_START_JOB
t=1324873979786 [st=116]     URL_REQUEST_START_JOB                  [dt= 11]
                             --> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
                             --> method = "GET"
                             --> priority = 0
                             --> url = "http://www.officedepot.com/"
t=1324873979800 [st=130]    +URL_REQUEST_START_JOB                  [dt=238]
                             --> load_flags = 68223105 (ENABLE_LOAD_TIMING | MAIN_FRAME | MAYBE_USER_GESTURE | VALIDATE_CACHE | VERIFY_EV_CERT)
                             --> method = "GET"
                             --> priority = 0
                             --> url = "http://www.officedepot.com/"
t=1324873979800 [st=130]        HTTP_CACHE_GET_BACKEND              [dt=  0]
t=1324873979800 [st=130]        HTTP_CACHE_OPEN_ENTRY               [dt=  0]
                                --> net_error = -2 (FAILED)
t=1324873979800 [st=130]        HTTP_CACHE_CREATE_ENTRY             [dt=  1]
t=1324873979801 [st=131]        HTTP_CACHE_ADD_TO_ENTRY             [dt=  0]
t=1324873979801 [st=131]       +HTTP_STREAM_REQUEST                 [dt=  0]
t=1324873979801 [st=131]           HTTP_STREAM_REQUEST_BOUND_TO_JOB
                                   --> source_dependency = {"id":42293,"type":11}
t=1324873979801 [st=131]       -HTTP_STREAM_REQUEST
t=1324873979801 [st=131]       +HTTP_TRANSACTION_SEND_REQUEST       [dt=  0]
t=1324873979801 [st=131]           HTTP_TRANSACTION_SEND_REQUEST_HEADERS
                                   --> GET / HTTP/1.1
                                       Host: www.officedepot.com
                                       Connection: keep-alive
                                       Cache-Control: max-age=0
                                       User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.2 (KHTML, like Gecko) Ubuntu/11.10 Chromium/15.0.874.106 Chrome/15.0.874.106 Safari/535.2
                                       Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                                       Accept-Encoding: gzip,deflate,sdch
                                       Accept-Language: en-US,en;q=0.8
                                       Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
                                       Cookie: [value was stripped]
t=1324873979801 [st=131]       -HTTP_TRANSACTION_SEND_REQUEST
t=1324873979801 [st=131]       +HTTP_TRANSACTION_READ_HEADERS       [dt=236]
t=1324873979801 [st=131]           HTTP_STREAM_PARSER_READ_HEADERS  [dt=236]
t=1324873980037 [st=367]           HTTP_TRANSACTION_READ_RESPONSE_HEADERS
                                   --> HTTP/1.1 200 OK
                                       Server: IBM_HTTP_Server
                                       Pragma: No-cache
                                       Cache-Control: no-cache,no-store,max-age=0
                                       Expires: Thu, 01 Jan 1970 00:00:00 GMT
                                       Content-Encoding: gzip
                                       P3P: CP="ALL DEVa TAIa OUR BUS UNI NAV STA PRE" policyref="http://www.officedepot.com/w3c/p3p.xml"
                                       Content-Type: text/html; charset=UTF-8
                                       Content-Language: en-US
                                       Content-Length: 18021
                                       Date: Mon, 26 Dec 2011 04:33:00 GMT
                                       Connection: keep-alive
                                       Vary: Accept-Encoding
t=1324873980037 [st=367]       -HTTP_TRANSACTION_READ_HEADERS
t=1324873980037 [st=367]        HTTP_CACHE_WRITE_INFO               [dt=  1]
t=1324873980038 [st=368]    -URL_REQUEST_START_JOB
t=1324873980038 [st=368]     HTTP_TRANSACTION_READ_BODY             [dt=  0]
t=1324873980039 [st=369]     HTTP_TRANSACTION_READ_BODY             [dt=  0]
t=1324873980040 [st=370]     HTTP_TRANSACTION_READ_BODY             [dt=  0]
t=1324873980040 [st=370]     HTTP_TRANSACTION_READ_BODY             [dt=  0]
t=1324873980041 [st=371]     HTTP_TRANSACTION_READ_BODY             [dt=  0]
t=1324873980041 [st=371]     HTTP_TRANSACTION_READ_BODY             [dt= 24]
t=1324873980066 [st=396]     HTTP_TRANSACTION_READ_BODY             [dt=  0]
t=1324873980066 [st=396] -REQUEST_ALIVE
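
If the server keeps returning a Location with ;jsessionid glued directly to the host name, one possible stopgap would be to follow redirects manually and repair the value first, by inserting a slash before the semicolon, which is what Firefox is described as doing in the quoted mail below. A minimal Python sketch of just the repair step follows; repair_location and the example.com value are made up for illustration, and this is not something libcurl does by itself.

```python
import re

# "scheme://authority" immediately followed by ";" with no "/" in between.
_BROKEN_AUTHORITY = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*://[^/?#;]+);")


def repair_location(location):
    """Insert "/" before a ";" that directly follows the host.

    Any Location value that does not have that shape is returned unchanged.
    """
    return _BROKEN_AUTHORITY.sub(r"\1/;", location, count=1)


# Hypothetical value with the same shape as the header discussed in this thread.
print(repair_location("http://www.example.com;jsessionid=ABC123"))
# http://www.example.com/;jsessionid=ABC123

# A well-formed Location passes through untouched.
print(repair_location("http://www.officedepot.com/"))
# http://www.officedepot.com/
```

The repaired string could then be handed to the next request as its URL. Whether rewriting a server's Location this way is acceptable is a judgment call, since it guesses at what the server meant.
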
On Sun, Dec 25, 2011 at 12:18 PM, Daniel Stenberg <daniel_at_haxx.se> wrote:
> On Sat, 24 Dec 2011, Alex Vinnik wrote:
>
>> I am having a problem using libcurl in my Ruby app (Curb gem binds
>> directly to libcurl). Specifically libcurl can't follow a weird redirect
>> served by a web server. Server Location header "Location:
>> http://www.officedepot.com;jsessionid=0000EXsMRFMF5kwJo26qgOif31d:13ddq0tfm"
>> doesn't look RFC compliant to me. For whatever reason jsessionid gets added
>> to the end of new location.
>>
>
> Wow. That's a broken URL that I've not seen used before.
>
>> Somehow browsers can handle this redirect.
>>
>
> I pasted that URL into chrome, and it can't deal with it when given in the
> address bar at least. It treats it as a search string instead.
>
> I pasted it into Firefox's URL bar and it inserted a slash in front of the
> first semicolon by itself and then showed the site.
>
> So browsers at least aren't uniformly considering this a good address.
>
>> if there is a way to work around this problem?
>>
>
> No, I can't think of any.
>
> Since this is a fairly big site and at least one of the major browsers
> supports this format, I think we should consider supporting it. Even though
> it would be under protest.
>
> Any other opinions?
>
> --
>
>  / daniel.haxx.se
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html
Received on 2011-12-26