Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Downloading with HTTP/3 produces broken files #7351

Closed
vl409 opened this issue Jul 6, 2021 · 9 comments
Closed

Downloading with HTTP/3 produces broken files #7351

vl409 opened this issue Jul 6, 2021 · 9 comments
Assignees
Labels

Comments

@vl409
Copy link

vl409 commented Jul 6, 2021

Downloading ~10Mb file at 10Gbit link produces broken download (garbage within content, see attachment:bug-data.tar.gz)
Downloading same file from same server using same curl binary over 100Mbit link is fine.
Downloading same file from same server using ngtcp2 client is fine.
The issue is reproducible.

The attachment contains:

  • digits.good - original file, plaintext with digits pattern
  • digits.broken - result of downloading with curl:
    export SSLKEYLOGFILE=ssl_key_file ./bin/curl -k --http3 -o digits.broken https://ip:port/digits
  • ssl_key_file - ssl keys produces by curl
  • capture.pcapng - captured packets on client side.
  • stream-from-wireshark - head of decoded quic stream from wireshark; demonstrates that captured stream is not broken (compare to digits.broken)

Versions:

curl from git, 0a0bc4a
./bin/curl -V curl 7.78.0-DEV (x86_64-pc-linux-gnu) libcurl/7.78.0-DEV OpenSSL/3.0.0 zlib/1.2.11 brotli/1.0.9 zstd/1.4.9 libidn2/2.3.1 libpsl/0.21.0 (+libidn2/2.3.0) nghttp2/1.41.0 ngtcp2/0.1.0-DEV nghttp3/0.1.0-DEV libgsasl/1.10.0 OpenLDAP/2.4.57 Release-Date: [unreleased] Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp Features: alt-svc AsynchDNS brotli gsasl HSTS HTTP2 HTTP3 HTTPS-proxy IDN IPv6 Largefile libz NTLM NTLM_WB PSL SSL TLS-SRP UnixSockets zstd

ngtctp2 from git, f60c3fb447e0b4605a99fa8cd9226330aae9014a
quic-tls from git, 50a5c6fec45aed8b49e28fd23aeaf590d39becc4
nghttp3 from git, bbc426ecdbc119fd7f12a17850f64c440844af01

curl is configured using the command:
cd curl autoreconf -fi LDFLAGS="-Wl,-rpath,<ssl path>/lib" ./configure --with-ssl=<...> --with-ngtcp2=<...> --with-nghttp3=<...> --enable-alt-svc --prefix=<...>

@bagder
Copy link
Member

bagder commented Dec 21, 2021

I cannot reproduce. I've repeatedly run curl built with both ngtcp2 and with quiche (all from current git) and neither seem to give any problem when downloading from a h3 server running on localhost.

I run nghttpx as a h3=>h1 proxy and I download a random 14MB file for testing.

@bagder bagder reopened this Dec 21, 2021
@bagder bagder self-assigned this Dec 21, 2021
@vl409
Copy link
Author

vl409 commented Dec 21, 2021

I believe this will be hard to reproduce on localhost, as no packets are dropped/reordered/delayed. Note there were no issues on 100mbit link, and it happened only on 10Gb with GSO enabled (nginx-quic was used as a server). I will try to reproduce it once again.

@vl409
Copy link
Author

vl409 commented Jan 30, 2022

ok, I've got it 100% reproducible.
My nginx config:

worker_processes 1;
daemon off;

error_log logs/error.log info;

events {
}


http {
    log_format quic '$remote_addr "$request" $status $body_bytes_sent "$http_user_agent" H3="$http3" RTIME="$request_time"';

    access_log logs/access.log quic;

    ssl_protocols       TLSv1.2 TLSv1.3;

    quic_gso on;
    http3_stream_buffer_size 16M;

    server {
        ssl_certificate_key localhost.key;
        ssl_certificate     localhost.crt;

        listen *:8443 ssl;
        listen *:8443 http3;

        location / {
            add_header X-H3 $http3;
            add_header Alt-Svc 'h3=":8443"';
        }
    }
}

I've made 2 runs of curl, then 2 runs of ngtcp2 against test.mov file,
access.log shows:

10.20.0.100 "GET /test.mov HTTP/3.0" 200 2721232298 "nghttp3/ngtcp2 client" H3="h3" RTIME="9.647"
10.20.0.100 "GET /test.mov HTTP/3.0" 200 2721232298 "nghttp3/ngtcp2 client" H3="h3" RTIME="9.717"
10.20.0.100 "GET /test.mov HTTP/3.0" 200 820131504 "curl/7.82.0-DEV" H3="h3" RTIME="62.401"
10.20.0.100 "GET /test.mov HTTP/3.0" 200 865192543 "curl/7.82.0-DEV" H3="h3" RTIME="62.404"

error.log regarding curl shows 2 timedout connections:

2022/01/31 00:13:38 [notice] 16186#0: using the "epoll" event method
2022/01/31 00:13:38 [notice] 16186#0: nginx/1.21.6 (8822:30cad5a0931e)
2022/01/31 00:13:38 [notice] 16186#0: built by gcc 9.3.0 (Gentoo 9.3.0-r2 p4) 
2022/01/31 00:13:38 [notice] 16186#0: OS: Linux 5.9.14-gentoo
2022/01/31 00:13:38 [notice] 16186#0: getrlimit(RLIMIT_NOFILE): 128000:128000
2022/01/31 00:13:38 [notice] 16186#0: start worker processes
2022/01/31 00:13:38 [notice] 16186#0: start worker process 16187
2022/01/31 00:13:40 [info] 16187#0: *1 quic unknown transport param id:0x2ab2, skipped while handling frames, client: 10.20.0.100, server: 0.0.0.0:8443
2022/01/31 00:13:50 [info] 16187#0: *7 quic unknown transport param id:0x2ab2, skipped while handling frames, client: 10.20.0.100, server: 0.0.0.0:8443
2022/01/31 00:14:15 [info] 16187#0: *13 quic unknown transport param id:0x2ab2, skipped while handling frames, client: 10.20.0.100, server: 0.0.0.0:8443
2022/01/31 00:14:32 [info] 16187#0: *19 quic unknown transport param id:0x2ab2, skipped while handling frames, client: 10.20.0.100, server: 0.0.0.0:8443
2022/01/31 00:14:43 [info] 16187#0: *6 client timed out (110: Connection timed out) while sending response to client, client: 10.20.0.100, server: , request: "GET /test.mov HTTP/3.0"
2022/01/31 00:14:52 [info] 16187#0: *12 client timed out (110: Connection timed out) while sending response to client, client: 10.20.0.100, server: , request: "GET /test.mov HTTP/3.0"
2022/01/31 00:14:57 [notice] 16187#0: signal 2 (SIGINT) received, exiting
2022/01/31 00:14:57 [notice] 16186#0: signal 2 (SIGINT) received, exiting
2022/01/31 00:14:57 [info] 16187#0: epoll_wait() failed (4: Interrupted system call)
2022/01/31 00:14:57 [notice] 16187#0: exiting
2022/01/31 00:14:57 [notice] 16187#0: exit
2022/01/31 00:14:57 [notice] 16186#0: signal 17 (SIGCHLD) received from 16187
2022/01/31 00:14:57 [notice] 16186#0: worker process 16187 exited with code 0
2022/01/31 00:14:57 [notice] 16186#0: exit

On the client side I have:

$ ./bench.sh 
running: ./curl-latest/curl/bin/curl -k --http3 -o /tmp/qbench/test.mov https://10.20.0.1:8443/test.mov
curl 7.81.0 (x86_64-pc-linux-gnu) libcurl/7.81.0 (GnuTLS/3.7.2) OpenSSL/1.1.1l zlib/1.2.11 nghttp2/1.45.1
Release-Date: 2022-01-05
Protocols: dict file ftp ftps http https imap imaps mqtt pop3 pop3s rtsp smtp smtps tftp 
Features: HTTP2 HTTPS-proxy IPv6 Largefile libz MultiSSL NTLM SSL TLS-SRP UnixSockets
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2595M  100 2595M    0     0  1024M      0  0:00:02  0:00:02 --:--:-- 1024M
rc=0
Completed in 2.543 seconds, 1020.51 Mb/s
filesize GOOD: 2721232141 2721232141
checksum BAD: got e7e3808f2c2323ac3e44826f6102d2ee expected 5acf60c62dac9382f2b5364942517ae4

$ ./bench.sh 
running: ./curl-latest/curl/bin/curl -k --http3 -o /tmp/qbench/test.mov https://10.20.0.1:8443/test.mov
curl 7.81.0 (x86_64-pc-linux-gnu) libcurl/7.81.0 (GnuTLS/3.7.2) OpenSSL/1.1.1l zlib/1.2.11 nghttp2/1.45.1
Release-Date: 2022-01-05
Protocols: dict file ftp ftps http https imap imaps mqtt pop3 pop3s rtsp smtp smtps tftp 
Features: HTTP2 HTTPS-proxy IPv6 Largefile libz MultiSSL NTLM SSL TLS-SRP UnixSockets
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2595M  100 2595M    0     0   973M      0  0:00:02  0:00:02 --:--:--  973M
rc=0
Completed in 2.675 seconds, 970.16 Mb/s
filesize GOOD: 2721232141 2721232141
checksum BAD: got d490d9a9c31230f091373796dd7228fb expected 5acf60c62dac9382f2b5364942517ae4

ngtcp2 runs are good:

$ ./bench.sh 
running: ./curl-latest/ngtcp2/client 10.20.0.1 8443 --exit-on-first-stream-close
     --no-quic-dump
     --no-http-dump
     --download=/tmp/qbench
     -q https://10.20.0.1:8443/test.mov
rc=0
Completed in 9.659 seconds, 268.68 Mb/s
filesize GOOD: 2721232141 2721232141
checksum GOOD: 5acf60c62dac9382f2b5364942517ae4 5acf60c62dac9382f2b5364942517ae4

$ ./bench.sh 
running: ./curl-latest/ngtcp2/client 10.20.0.1 8443 --exit-on-first-stream-close
     --no-quic-dump
     --no-http-dump
     --download=/tmp/qbench
     -q https://10.20.0.1:8443/test.mov
rc=0
Completed in 9.740 seconds, 266.44 Mb/s
filesize GOOD: 2721232141 2721232141
checksum GOOD: 5acf60c62dac9382f2b5364942517ae4 5acf60c62dac9382f2b5364942517ae4

unfortunately, enabling debug on server makes the issue gone (server has to write more than gig of logs).

@bagder
Copy link
Member

bagder commented Feb 24, 2022

curl:

Completed in 2.543 seconds, 1020.51 Mb/s

ngtcp2 client:

Completed in 9.659 seconds, 268.68 Mb/s

I think that massive difference says something. I can only guess that we're not doing something with the ngtcp2 API that we should, but I don't know what it is. @tatsuhiro-t, any ideas?

@bagder bagder removed the needs-info label Feb 24, 2022
@tatsuhiro-t
Copy link
Contributor

Curl is fast, that is nice, but it looks like curl does not pull all stream data and prematurely stops transfer.

@tatsuhiro-t
Copy link
Contributor

#8504 fixes this issue at least in my local test.

@bagder
Copy link
Member

bagder commented Feb 24, 2022

@vl409 can you check if that fix is good for you as well?

@vl409
Copy link
Author

vl409 commented Feb 24, 2022

Yes, the patch indeed fixes the issue - files are now downloaded without errors, thanks for fixing !

Below are some stats from 10 runs of ngtcp2 and curl:
ngtcp2:
N Min Max Median Avg Stddev
x 10 208.97 374.75 357.26 342.335 48.496656

curl:
N Min Max Median Avg Stddev
x 10 201.82 256.14 224.52 221.951 16.913805

@bagder bagder closed this as completed in 6e9373b Feb 24, 2022
@jay
Copy link
Member

jay commented Feb 24, 2022

ngtcp2: N Min Max Median Avg Stddev x 10 208.97 374.75 357.26 342.335 48.496656

curl: N Min Max Median Avg Stddev x 10 201.82 256.14 224.52 221.951 16.913805

what are these numbers and what is this supposed to be a comparison of

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

4 participants