Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression on parsing large config files #11431

Closed
aduh95 opened this issue Jul 12, 2023 · 5 comments
Closed

Regression on parsing large config files #11431

aduh95 opened this issue Jul 12, 2023 · 5 comments

Comments

@aduh95
Copy link

aduh95 commented Jul 12, 2023

I did this

We noticed the issue as sometimes requests with very large payload were failing on production when trying to bump the cURL version. I was able to create a minimal repro using a Node.js script:

import assert from 'node:assert'
import http from 'node:http'
import child_process from 'node:child_process'

const server = http
  .createServer((req, res) => {
    const { url, method } = req
    console.log({ url, method })
    let length = 0
    req.on('data', (chunk) => {
      length += chunk.length
      for (let i = 0; i < chunk.length; i++) {
        assert.strictEqual(chunk[i], 97)
      }
    })
    req.on('close', () => {
      assert.strictEqual(length, 0x400 * 99 + 1014)
      res.writeHead(200, { 'Content-Type': 'text/plain' })
      res.end('sure\n')
    })
  })
  .listen(8080)
  .unref()

await new Promise((resolve, reject) => {
  const cp = child_process.spawn('curl', ['--config', '-', 'http://[::1]:8080'])

  cp.on('exit', (code) => (code === 0 ? resolve() : reject(code)))

  cp.stdin.write('data = "')
  for (let j = 0; j < 99; j++) cp.stdin.write('a'.repeat(0x400))
  cp.stdin.write('a'.repeat(1014)) // When I pass 1013 here, everything works on all versions.
  cp.stdin.end('"\n')
})

The failure seems to appear as soon as the config file is strictly larger than 102399 bytes (≥100kiB), at which point curl ignores the data and sends an empty GET request instead of a POST request with the payload.

I expected the following

Passes on 7.72.0:

$ curl --version
curl 7.72.0 (aarch64-unknown-linux-gnu) libcurl/7.72.0 OpenSSL/1.0.2l zlib/1.2.11 c-ares/1.13.0 libssh2/1.8.0 nghttp2/1.20.0
Release-Date: 2020-08-19
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS HTTP2 HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL TLS-SRP UnixSockets
$ node repro.mjs
{ url: '/', method: 'POST' }

Starting from 7.73.0 and up (I didn't try all versions, only a few up to 8.1.1):

$ curl --version
curl 7.73.0 (aarch64-unknown-linux-gnu) libcurl/7.73.0 OpenSSL/1.0.2l zlib/1.2.11 c-ares/1.13.0 libssh2/1.8.0 nghttp2/1.20.0
Release-Date: 2020-10-14
Protocols: dict file ftp ftps gopher http https imap imaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS HTTP2 HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL TLS-SRP UnixSockets
$ node repro.mjs
{ url: '/', method: 'GET' }
node:assert:124
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:

0 !== 102390

Tried with cURL 8.1.1 on macOS, the failure is a bit different – but I would still expect the code to succeed:

$ curl --version
curl 8.1.1 (aarch64-apple-darwin22.3.0) libcurl/8.1.1 OpenSSL/3.0.9 zlib/1.2.13 brotli/1.0.9 zstd/1.5.5 libidn2/2.3.4 libssh2/1.11.0 nghttp2/1.51.0
Release-Date: 2023-05-23
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM NTLM_WB SPNEGO SSL threadsafe TLS-SRP UnixSockets zstd
$ node repro.mjs
Error: "curl --config -" exited with code 26

curl/libcurl version

See above.

https://github.com/curl/curl/compare/curl-7_72_0..curl-7_73_0

operating system

Bisected on Ubuntu to pin point which version introduced the regression. I was able to reproduce on macOS with cURL v8.1.1 as well

@bagder bagder self-assigned this Jul 13, 2023
@bagder
Copy link
Member

bagder commented Jul 13, 2023

The maximum line length as (arbitrarily) set to 100K in 47dd957.

Would it help if we bumped it to say 10M instead?

bagder added a commit that referenced this issue Jul 13, 2023
Bumped from 100K set in 47dd957

Reported-by: Antoine du Hamel
Fixes #11431
@aduh95
Copy link
Author

aduh95 commented Jul 13, 2023

It would certainly decrease the risk of real world application suffering from this, for sure! I'm wondering if a better solution would be to let config file supply multi-line params (either for lines that ends with a \, or heredoc style maybe), however maybe that's more difficult to implement.

@marcospgp
Copy link

Would it help if we bumped it to say 10M instead?

Would that mean allocating 10MB for every config? Wouldn't there be a way to start with a smaller size and allocate more memory later if needed?

Maybe starting by allocating something small like 256KB and doubling it every time one runs into an out of memory error?

@bagder
Copy link
Member

bagder commented Jul 13, 2023

Would that mean allocating 10MB for every config

No, it means that it allows the allocation to grow that big if necessary. It will still only load the amount of data that exists so this change will not make curl use any more data than before if the lines are shorter than 100K...

@bagder bagder closed this as completed in e53df4c Jul 13, 2023
@bagder
Copy link
Member

bagder commented Jul 13, 2023

@aduh95 with the pending #11346 work (targeted for 8.3.0) we can make longer data using variables without having to put everything on the same line so then we can actually make >10M data on <10M line lengths.

bch pushed a commit to bch/curl that referenced this issue Jul 19, 2023
Bumped from 100K set in 47dd957

Reported-by: Antoine du Hamel
Fixes curl#11431
Closes curl#11435
ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023
Bumped from 100K set in 47dd957

Reported-by: Antoine du Hamel
Fixes curl#11431
Closes curl#11435
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

3 participants