Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce CA certificate bundle reparsing #9620

Closed
wants to merge 7 commits into from

Conversation

tlsa
Copy link
Contributor

@tlsa tlsa commented Sep 29, 2022

This implements TODO 13.12:

When using the OpenSSL backend, curl will load and reparse the CA bundle at the creation of the "SSL context" when it sets up a connection to do a TLS handshake. A more effective way would be to somehow cache the CA bundle to avoid it having to be repeatedly reloaded and reparsed.

With this change, the first X509_STORE to be loaded exclusively via a CAfile is cached. Any subsequent SSL contexts that are created will reuse the same X509_STORE, provided they would also be loaded exclusively via the same CAfile.

@tlsa
Copy link
Contributor Author

tlsa commented Sep 29, 2022

In terms of performance improvement, to load the BBC News site in NetSurf:

LibCurl Version Total instruction fetch cost
Master 5,168,090,712
This branch 1,020,984,411

Note, these numbers include all the rest of the processing done to render the page, not just the work done in fetching over HTTPS connections.

To test, my command was:

valgrind --tool=callgrind ./nsfb -f sdl https://www.bbc.co.uk/news

I am just loading the one page and then exiting, so the page still requires that initial building of the X509_STORE before the page can load. On a more typical browsing session, subsequent pages load even faster because the X509_STORE is already built.

The profile for running against the master libcurl branch shows that the SSL_CTX_load_verify_file function is called 12 times and takes 87.37% of the total run time:

Screenshot from 2022-09-29 16-35-18

The profile for running against this libcurl branch shows that the X509_STORE_load_file function is called once and takes only 35.8% of the already 5x smaller total run time:

Screenshot from 2022-09-29 16-30-00

@tlsa
Copy link
Contributor Author

tlsa commented Sep 29, 2022

@bagder This is a draft merge request for now because it currently doesn't try to stat the file to see if it has changed, or optimise cases where the certificate store isn't loaded from a CAfile. Please let me know what you think of it so far, all suggestions/ideas/comments welcome.

@tlsa
Copy link
Contributor Author

tlsa commented Sep 29, 2022

The fuzzer test fails with:

/usr/bin/ld: openssl.c:(.text.set_up_x509_store[set_up_x509_store]+0x996): undefined reference to `SSL_CTX_set1_cert_store'
/usr/bin/ld: openssl.c:(.text.set_up_x509_store[set_up_x509_store]+0x12e4): undefined reference to `X509_STORE_up_ref'

It looks like both of those were added in OpenSSL 1.1.0. So we could make the caching dependent on OpenSSL >= 1.1.0.

Which version is the fuzzer build built against?

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch 2 times, most recently from 9da89f4 to cc528ac Compare September 29, 2022 16:52
@tlsa
Copy link
Contributor Author

tlsa commented Sep 29, 2022

Which version is the fuzzer build built against?

From the log:

libssl-dev is already the newest version (1.1.1f-1ubuntu2.16).

That's weird, the documentation states:

The X509_STORE_up_ref(), X509_STORE_lock() and X509_STORE_unlock() functions were added in OpenSSL 1.1.0.

@bagder
Copy link
Member

bagder commented Sep 29, 2022

. So we could make the caching dependent on OpenSSL >= 1.1.0.

You should assume that people will use OpenSSL from even before 1.0.0, so yes the code needs to have the proper #ifdef conditions to only use the APIs that are available.

@bagder bagder added the TLS label Sep 29, 2022
Copy link
Member

@bagder bagder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without globals, you can also skip the mutex protections.

lib/vtls/openssl.c Outdated Show resolved Hide resolved
@tlsa
Copy link
Contributor Author

tlsa commented Sep 29, 2022

You should assume that people will use OpenSSL from even before 1.0.0, so yes the code needs to have the proper #ifdef conditions to only use the APIs that are available.

Yep, I'll add that. I'm just confused about the fuzzer test compilation failing when it seemingly has a sufficient openssl version.

@dfandrich
Copy link
Contributor

dfandrich commented Sep 29, 2022 via email

@dfandrich
Copy link
Contributor

dfandrich commented Sep 29, 2022 via email

@bagder
Copy link
Member

bagder commented Sep 29, 2022

The fuzzer builds OpenSSL from source:

git clone --branch OpenSSL_1_0_2m https://github.com/openssl/openssl

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch 2 times, most recently from 12b8b6c to bba9232 Compare September 30, 2022 08:46
@tlsa
Copy link
Contributor Author

tlsa commented Sep 30, 2022

This is the biggest question IMHO. Until now, changes to the certificates will take effect on the next connection, but now they never will be. And, it's probably not straightforward to tell exactly when they've changed, since systems using a symlink tree based on hashes could have hundreds or even thousands of files to stat, not just one.

That's true, just stating isn't completely reliable.

Unless that can be somehow solved automatically in all cases, I think this feature is going to have to be hidden behind a setopt that 1) enables it, and 2) sets a caching policy. And you might be stuck with policies that are entirely time based or rely on explicit cache clearing, niether of which are ideal.

I'm happy to do this if we can reach a consensus on what it should actually look like.

Another option may be to let the client build the X509_STORE itself, and let it pass that in as an alternative to CURLOPT_CAINFO and friends. Although it would need to build a store of the right format for the TLS backend. This idea would make the static struct ossl_global_data ossl_global; and mutex go away too.

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch from bba9232 to 4cead27 Compare September 30, 2022 08:59
@bagder
Copy link
Member

bagder commented Sep 30, 2022

Applications also have the option to use CURLOPT_CACERT_BLOB, which is an existing rather effective way to "cache it themselves"...

@tlsa
Copy link
Contributor Author

tlsa commented Sep 30, 2022

Applications also have the option to use CURLOPT_CACERT_BLOB, which is an existing rather effective way to "cache it themselves"...

Oh, my understanding was that the CURLOPT_CACERT_BLOB was just the raw data from the file, and it would still get parsed (the costly part) each time a connection is made.

@bagder
Copy link
Member

bagder commented Sep 30, 2022

Oh, my understanding was that the CURLOPT_CACERT_BLOB was just the raw data from the file, and it would still get parsed (the costly part) each time a connection is made.

Ah right, I didn't realize it was that part that is so costly. Never mind me! 😬

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch 3 times, most recently from 83ee405 to 29752a4 Compare September 30, 2022 10:02
@tlsa
Copy link
Contributor Author

tlsa commented Sep 30, 2022

For what it's worth, I've experimented with CURLOPT_CAINFO_BLOB to confirm that, and found it makes no meaningful difference to performance on stock libcurl, compared to using CURLOPT_CAINFO. So it really is all in the parsing.

In terms of the speedup gained by caching the X509_STORE, over the course of a reasonable browsing session, it reduces the time spent in curl_multi_perform to less than a 20th of what it was.

@tlsa
Copy link
Contributor Author

tlsa commented Sep 30, 2022

I found that libressl claims openssl version 1.1.1, but does not have SSL_CTX_set1_cert_store.

So I changed it to calling X509_STORE_up_ref, followed by SSL_CTX_set_cert_store, which is available in version 1.1.0 even in libressl.

@tlsa
Copy link
Contributor Author

tlsa commented Oct 3, 2022

Another option may be to let the client build the X509_STORE itself, and let it pass that in as an alternative to CURLOPT_CAINFO and friends. Although it would need to build a store of the right format for the TLS backend. This idea would make the static struct ossl_global_data ossl_global; and mutex go away too.

I've been thinking some more, and I think this is the best option, for a number of reasons:

  1. Curl doesn't need to be responsible for knowing when the cached X509_STORE needs to be updated.
  2. No need for any global state in libcurl to keep hold of the cached X509_STORE.
  3. Simple premise is just passing in what the client wants to be used for certificate verification, like all the other options (so there's no need to have options to set the policy or anything like that).
  4. Doing it internally inside libcurl as I did for the proof of concept here, it was simple to check if e.g. the path had changed for a CURLOPT_CAINFO but for the other kinds it would be difficult. For example for CURLOPT_CAINFO_BLOB it would have had to keep the old blob around and memcmp.
  5. Originally I wanted clients to gain a 20x fetch speed up without needing to change any code, but on reflection I think it's safer to make it opt in.

However there's one thing I don't like, which is if we just pass in a pointer to the built X509_STORE, it needs to match the certificate store format for the SSL backend. I don't even know how compatible different OpenSSL backends (LibreSSL/BoringSSL) would be. So my proposal is basically to let libcurl build the X509_STORE and allow the client to request it.

The new API would be something like:

CURLcode curl_easy_create_castore(const CURL *curl, void **castore);
CURLcode curl_easy_destroy_castore(void *castore);

In more detail:

Client sets up an easy handle with whatever certificate store source they want, exactly as before. For example:

	CURLcode code;
	CURL *easy_handle;

	/* ... */

	code = curl_easy_setopt(easy_handle, CURLOPT_CAINFO,
	                        get_option_cafile());
	if (code != CURLE_OK)
		goto curl_easy_setopt_failed;

	code = curl_easy_setopt(easy_handle, CURLOPT_CAPATH,
	                        get_option_capath());
	if (code != CURLE_OK)
		goto curl_easy_setopt_failed;

And then, to opt in to the new cached store stuff, the client would get the cached store with the new API:

	void *castore = NULL;

	code = curl_easy_create_castore(easy_handle, CURLINFO info, void **castore);
	if (code != CURLE_OK) {
		/* We probably want a new CURLcode for something like
		 * `CURLE_SSL_BACKEND_UNIMPLEMENTED`. */
	}

And then set a new CURLOPT_CASTORE option:

	if (castore != NULL) {
		code = curl_easy_setopt(easy_handle, CURLOPT_CASTORE,
		                        castore);
		if (code != CURLE_OK)
			goto curl_easy_setopt_failed;
	}

There's no real need to worry about whether the used libcurl SSL backend implements the functionality either, because if it doesn't the easy_handle is already configured with the existing CA* options and will work as before.

Basically if a CASTORE is set it will use it, and if not it will use the old behaviour.

Does anyone have any thoughts? Reasons I shouldn't implement this? Better ideas?

@tlsa
Copy link
Contributor Author

tlsa commented Oct 3, 2022

There probably needs to be a CURLOPT_PROXY_CASTORE too, and a corresponding curl_easy_create_proxy_castore().

@tlsa
Copy link
Contributor Author

tlsa commented Oct 3, 2022

The idea of the curl_easy_create_castore() is it doesn't modify the easy handle at all and can be called before any connection is made.

@tlsa
Copy link
Contributor Author

tlsa commented Oct 3, 2022

Since the destroy call doesn't need an easy handle, maybe an include/curl/castore.h would be better:

CURLcode curl_castore_create(const CURL *curl, void **castore);
void curl_castore_destroy(void *castore);

@bagder
Copy link
Member

bagder commented Oct 4, 2022

I think we can simplify based on the common usage patterns.

  • most users use a single fixed file name for the CA bundle, that is updated (at the most) with a few months' interval
  • most easy handles use the same fixed file name, especially within the same multi handle

I could imagine a system that caches the CA context per multi handle for a given file name. Either as a cache for multiple file names (file name => cache *) or just keeping the most recently used one in memory. We could possibly set an expiry date on the cache at ... 24 hours? To make really long-living applications reload it every once in a while.

This would then by default make libcurl not load new CA certs when subsequent new connections are made, contrary to current functionality. We could add a bit to CURLOPT_SSL_OPTIONS for that purpose: either to flush the cache or to bypass it for that handle.

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch from 828ea1b to 36a4caa Compare October 12, 2022 11:21
docs/libcurl/opts/CURLOPT_CA_CACHE_TIMEOUT.3 Outdated Show resolved Hide resolved
lib/urldata.h Outdated
Comment on lines 1758 to 1759
int ca_cache_timeout; /* Certificate store cache timeout (seconds) */
int dns_cache_timeout; /* DNS cache timeout (seconds) */
Copy link
Contributor Author

@tlsa tlsa Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, I put this in struct UserDefined, next to dns_cache_timeout.

But maybe you'd prefer it in struct ssl_general_config?

I don't think it makes sense to put it in struct ssl_config_data because it doesn't have a different value for the proxy case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided it belonged in struct ssl_general_config so I moved it.

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch from 36a4caa to 9ac3132 Compare October 12, 2022 12:39
@bagder bagder added the feature-window A merge of this requires an open feature window label Oct 13, 2022
@tlsa tlsa force-pushed the optimise-openssl-x509-store branch from 9ac3132 to e7c57e6 Compare October 24, 2022 11:47
@tlsa
Copy link
Contributor Author

tlsa commented Oct 24, 2022

Rebased on master and retested. Still gives the optimisation.

free(mbackend->CAfile);
free(mbackend);
#else /* HAVE_SSL_X509_STORE_SHARE */
(void)mbackend;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWE 416 advisory: potential use after free


ℹ️ Learn about @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.


Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a false positive.

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch 4 times, most recently from 53cdb3a to d12ff16 Compare October 30, 2022 13:26
@tlsa
Copy link
Contributor Author

tlsa commented Oct 30, 2022

Rebased on master and updated the date indocs/libcurl/opts/CURLOPT_CA_CACHE_TIMEOUT.3 to "21 Dec 2022".

@tlsa tlsa force-pushed the optimise-openssl-x509-store branch from d12ff16 to 6693d57 Compare November 3, 2022 11:20
@tlsa
Copy link
Contributor Author

tlsa commented Nov 3, 2022

@bagder Rebased on master to fix conflict. I dropped the commit with with the unrelated changes to lib/curl_threads.h because that was what was conflicting.

@tlsa tlsa requested a review from bagder November 3, 2022 11:23
ccawley2011 pushed a commit to ccawley2011/netsurf-toolchains that referenced this pull request Nov 4, 2022
Making HTTPS connections was slow because libcurl forces openssl
to rebuild the same X509_STORE from the same cabundle every time
a connection is made. This is a slow process and it can happen
many times per page.

These patches change libcurl to keep the built X509_STORE cached
and reuse it for subsequent connections.

These patches are being upstreamed here:

    curl/curl#9620
Copy link
Member

@bagder bagder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks for your work on this. I found but a single little nit.

#if defined(USE_THREADS_POSIX) || defined(USE_THREADS_WIN32)
#define HAVE_THREADS
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This chunk can be removed again too right? HAVE_THREADS does not seem to be used in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good spot! I'll fix in a moment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Badger Done!

Loading a certificate store from CAfile is a processor intensive
operation. Here we cache the x509 store when loaded from CAfile,
and reuse it if we are set up to load certificates from the same
CAfile.
Adds a new option to control the maximum time that a cached
certificate store may be retained for.

Currently only the OpenSSL backend implements support for
caching certificate stores.
@tlsa tlsa force-pushed the optimise-openssl-x509-store branch from 6693d57 to 4d1cc92 Compare November 7, 2022 09:30
@bagder bagder removed the feature-window A merge of this requires an open feature window label Nov 7, 2022
@bagder
Copy link
Member

bagder commented Nov 8, 2022

Thanks!

@bagder bagder closed this in 3c16697 Nov 8, 2022
bagder pushed a commit that referenced this pull request Nov 8, 2022
Adds a new option to control the maximum time that a cached
certificate store may be retained for.

Currently only the OpenSSL backend implements support for
caching certificate stores.

Closes #9620
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

4 participants