Initial implementation of rustls backend in curl. #6350
Conversation
Since I think Rust is pretty cool, I tried a rewrite of your Makefile.Windows to build for Windows. Edit: I figured the above out; I managed to build with rustls.

Yeah, I too would appreciate complete instructions on how to proceed to build this. There are CI test failures now for test 1119.
/*
 * Write n bytes from buf to the provided fd, retrying short writes until
 * we finish or hit an error. Assumes fd is blocking and therefore doesn't
 * handle EAGAIN. Returns 0 for success or 1 for error.
 */
This sounds suspicious. libcurl should always use non-blocking sockets and this kind of loop is not a good sign.
You're right that this should assume nonblocking (as the recv function does) and handle EAGAIN. Fixed. A question about nonblocking I/O though: my understanding is that I should loop until either error or EAGAIN. Is that accurate, or should I return at the first short write?
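To illustrate the pattern under discussion, here is a minimal sketch of a nonblocking write loop that retries short writes and stops at EAGAIN, reporting partial progress. The helper name and return convention are hypothetical illustrations, not curl's actual code:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not curl's actual code): write as much of buf as
 * possible to a nonblocking fd. Returns bytes written so far, or -1 on a
 * real error. A short count means the caller should wait for writability
 * and call again with the remainder. */
static ssize_t write_nonblocking(int fd, const char *buf, size_t n)
{
  size_t done = 0;
  while(done < n) {
    ssize_t r = write(fd, buf + done, n - done);
    if(r > 0)
      done += (size_t)r;
    else if(r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
      break;              /* kernel buffer full: stop, report progress */
    else if(r < 0 && errno == EINTR)
      continue;           /* interrupted: retry immediately */
    else
      return -1;          /* genuine error */
  }
  return (ssize_t)done;
}
```

Returning the partial count lets the caller register interest in socket writability and resume later, rather than spinning or blocking.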
Thanks for the reviews! I've pushed some fixes that I think address them. I've changed bzero->memset, and write/read->swrite/sread.
I also rewrote rustls_send to do the right thing for nonblocking I/O, and rewrote Curl_rustls_connect_nonblocking to take advantage of rustls_send and rustls_recv.
I updated the PR description with notes on how to build, copied here:
To build:
- Follow the instructions at https://github.com/abetterinternet/crustls#c-rustls to build and install crustls. You'll need Rust installed, and cbindgen.
- Run `./configure --with-rustls --with-default-ssl-backend=rustls && make`
- The resulting `curl` tool uses rustls as the default backend, so you can use it to make various requests.
- The rustls backend emits a lot of logs with `infof`, so you can pass `-v` to see that it's working.
Thanks so much for writing that Makefile.Windows, @gvanem! I assume you don't mind if I add that to the crustls repo?
It looks like these test cases are failing in AppVeyor:

- FTP fetch with --proxy set to http:// and with --connect-to (tests/data/test714), 7 sec 647 ms

I'll need to set up a Windows environment to diagnose these. If anyone has insight into what aspect of this change might be causing these errors, that would definitely help get me pointed in the right direction.
lib/vtls/rustls.c
Outdated
DEBUGASSERT(tlswritten > 0);
break;
}
perror("writing to socket");
Using perror() for sockets on Windows does not print the correct error. Add something like:

static void Curl_perror(const char *what)
{
  char buf[300];
  fprintf(stderr, "%s: %s", what, Curl_strerror(SOCKERRNO, buf, sizeof(buf)));
}

at the top. @bagder Should something like this be in strerror.c?
It shouldn't be there.

No, go ahead. Here is an updated version of Makefile.Windows.
These are most likely unrelated. Unfortunately the test suite is flaky on the various Windows CI hosts. Once you are satisfied with the PR, we can retry them until they succeed, just to be sure they are really not related.
To build this as the only TLS backend (with OpenSSL disabled) I had to apply a little patch:

configure.ac:

diff --git a/configure.ac b/configure.ac
index 9bb10c5ee..19df8661b 100755
--- a/configure.ac
+++ b/configure.ac
@@ -2917,11 +2917,11 @@ if test -z "$ssl_backends" -o "x$OPT_NSS" != xno; then
fi dnl NSS not disabled
test -z "$ssl_msg" || ssl_backends="${ssl_backends:+$ssl_backends, }$ssl_msg"
fi
-case "x$OPENSSL_ENABLED$GNUTLS_ENABLED$NSS_ENABLED$MBEDTLS_ENABLED$WOLFSSL_ENABLED$SCHANNEL_ENABLED$SECURETRANSPORT_ENABLED$MESALINK_ENABLED$BEARSSL_ENABLED$AMISSL_ENABLED" in
+case "x$OPENSSL_ENABLED$GNUTLS_ENABLED$NSS_ENABLED$MBEDTLS_ENABLED$WOLFSSL_ENABLED$SCHANNEL_ENABLED$SECURETRANSPORT_ENABLED$MESALINK_ENABLED$BEARSSL_ENABLED$AMISSL_ENABLED$RUSTLS_ENABLED" in
x)
AC_MSG_WARN([SSL disabled, you will not be able to use HTTPS, FTPS, NTLM and more.])
AC_MSG_WARN([Use --with-ssl, --with-gnutls, --with-wolfssl, --with-mbedtls, --with-nss, --with-schannel, --with-secure-transport, --with-mesalink, --with-amissl or --with-bearssl to address this.])
;;
x1)

lib/curl_setup.h:

diff --git a/lib/curl_setup.h b/lib/curl_setup.h
index 22def2def..d2134a07e 100644
--- a/lib/curl_setup.h
+++ b/lib/curl_setup.h
@@ -610,11 +610,11 @@ int netware_init(void);
#if defined(USE_GNUTLS) || defined(USE_OPENSSL) || defined(USE_NSS) || \
defined(USE_MBEDTLS) || \
defined(USE_WOLFSSL) || defined(USE_SCHANNEL) || \
defined(USE_SECTRANSP) || defined(USE_GSKIT) || defined(USE_MESALINK) || \
- defined(USE_BEARSSL)
+ defined(USE_BEARSSL) || defined(USE_RUSTLS)
#define USE_SSL /* SSL support has been enabled */
#endif
/* Single point where USE_SPNEGO definition might be defined */
#if !defined(CURL_DISABLE_CRYPTO_AUTH) && \
lib/vtls/rustls.c
Outdated
static size_t
Curl_rustls_version(char *buffer, size_t size)
{
  return msnprintf(buffer, size, "rustls");
Is there really no version number? I would strongly urge you to provide one, as this is often key material when debugging issues and tracking down suspicious behavior in user reports.
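For comparison, here is a sketch of what a backend string with an actual version number could look like. CRUSTLS_VERSION_STR is an assumed placeholder, not a real crustls identifier; the bindings would need to export something like it:

```c
#include <stdio.h>

/* Hypothetical sketch: report the backend name plus a version number.
 * CRUSTLS_VERSION_STR is an assumption for illustration only. */
#define CRUSTLS_VERSION_STR "0.1.0"

static size_t rustls_version_sketch(char *buffer, size_t size)
{
  /* snprintf returns the number of characters that would have been
   * written, matching the msnprintf convention used above */
  return (size_t)snprintf(buffer, size, "rustls/%s", CRUSTLS_VERSION_STR);
}
```

Having the version in the string means `curl -V` output in bug reports immediately shows which backend build was in use.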
lib/vtls/rustls.c
Outdated
ssize_t plain_bytes_copied = 0;
int rustls_result = 0;

memset(tlsbuf, 0, sizeof(tlsbuf));
Isn't it a bit excessive to clear the entire buffer like this every time?
Fair, I've removed it. I did profile curl over a big transfer, and this did not seem to be a significant use of CPU time. I was looking at the docs for read and recv, and while implicitly they only write to the provided buf, they don't seem to explicitly guarantee that they don't read from the provided buf, which would be unsafe if the buf was uninitialized. I'm thinking of the related Rust discussions about passing uninitialized buffers to readers.
I'm working on improving the configurability of this backend (trusted roots, ciphersuites, etc). It looks like the existing backends do all their setup on a per-connection basis. That makes sense, since different connections might have different options. But the rustls ClientConfig docs recommend building one per process rather than per connection, since construction can be expensive. It seems like the best thing to do would be to cache built config objects based on their inputs. Does that make sense? Are there instances of other TLS backends doing the same thing? I think for the short term I might just create a ClientConfig per connection, and treat the cache as a later optimization.
Right. The config is set per-transfer in the libcurl API, but the connection is then set up and verified using a specific set of TLS parameters. We then make sure that a connection can only be reused for subsequent transfers if the transfers' TLS config matches that connection's.

This is a very long-standing optimization detail that we've never gotten around to implementing. It would make sense to have some sort of caching, at least for cases when the exact same config is used again - which is probably what 99% of all transfers do.
Makes perfect sense to me!

Alright, I've added a per-session ClientConfig, and I've made configuration of CAfile work. I think this is ready for another round of reviews.
After the crustls update, I needed this:

diff --git a/configure.ac b/configure.ac
index 303274716..50b17b001 100755
--- a/configure.ac
+++ b/configure.ac
@@ -2720,11 +2720,11 @@ if test -z "$ssl_backends" -o "x$OPT_RUSTLS" != xno; then
fi
if test -z "$OPT_RUSTLS" ; then
dnl check for lib first without setting any new path
- AC_CHECK_LIB(crustls, rustls_client_config_new,
+ AC_CHECK_LIB(crustls, rustls_client_session_read,
dnl libcrustls found, set the variable
[
AC_DEFINE(USE_RUSTLS, 1, [if rustls is enabled])
AC_SUBST(USE_RUSTLS, [1])
RUSTLS_ENABLED=1
@@ -2748,11 +2748,11 @@ if test -z "$ssl_backends" -o "x$OPT_RUSTLS" != xno; then
LDFLAGS="$LDFLAGS $addld"
if test "$addcflags" != "-I/usr/include"; then
CPPFLAGS="$CPPFLAGS $addcflags"
fi
- AC_CHECK_LIB(crustls, rustls_client_config_new,
+ AC_CHECK_LIB(crustls, rustls_client_session_read,
[
AC_DEFINE(USE_RUSTLS, 1, [if rustls is enabled])
AC_SUBST(USE_RUSTLS, [1])
RUSTLS_ENABLED=1
USE_RUSTLS="yes"
When I build this with the current master crustls, test 300 fails and leaks memory; the leak is reported in the check run after the test.
lib/vtls/rustls.c
Outdated
static const struct rustls_client_config *client_config = NULL;

static int
Curl_rustls_init(void)
Just noticed! We use the Curl_ prefix for symbols that are global within the library, and since this is static it should not use Curl_ (this goes for a few other functions too). Most TLS backends just go with a fixed private prefix. I would probably not recommend using rustls_ since that's the prefix the actual TLS library uses.
It seems to me that local style is to use a Curl_<backend>_ prefix for the functions that wind up as fields of the Curl_ssl struct:
$ grep Curl_.*version, lib/vtls/*.c
lib/vtls/bearssl.c: Curl_bearssl_version,
lib/vtls/gskit.c: Curl_gskit_version, /* version */
lib/vtls/gtls.c: Curl_gtls_version, /* version */
lib/vtls/mbedtls.c: Curl_mbedtls_version, /* version */
lib/vtls/mesalink.c: Curl_mesalink_version, /* version */
lib/vtls/nss.c: Curl_nss_version, /* version */
lib/vtls/openssl.c: Curl_ossl_version, /* version */
lib/vtls/schannel.c: Curl_schannel_version, /* version */
lib/vtls/sectransp.c: Curl_sectransp_version, /* version */
lib/vtls/vtls.c: Curl_multissl_version, /* version */
lib/vtls/wolfssl.c: Curl_wolfssl_version, /* version */
(each of those functions is static to the relevant file). Am I missing something here?
See #6443 ...
The memory leak came from failure to free the client_config object if creation of a client session failed. I've fixed that in 37fdf9e. The failure of test case 300 is because the hostname is an IP address, which webpki does not currently accept in certificates.
Gosh. And is that a design decision or will this be fixed soon? It seems like a rather big blow to users. IP-address certs might be slightly unusual, but they've always been specified and functional, and they work with all other curl TLS backends...
@bagder we were looking at sponsoring the work in briansmith/webpki#54 but we just don't have the budget to do the funding ourselves at the moment. @briansmith just needs some support from the community to get this over the line.
I'm surprised, because it's A) not exactly complicated, B) been in the spec since forever, and C) a feature used publicly.
It was a design decision, but has been changed and now needs implementation for the change: briansmith/webpki#54 (comment).
At the time the webpki crate was first created, there were not good policies in place for people to get IP address certificates from CAs. That's improved in the interim. At the time, it was something that was cut in the interest of getting the first version done.

There are probably going to be other areas where webpki is stricter or more limited than what you're used to. Often this affects test suites, and often test suites can be changed. For example, some projects changed their test suites so that all their certificates were in a chain of the form "Root -> Intermediate -> End Entity" (Web PKI standards require an intermediate) so that webpki will accept them, except for the tests that were explicitly testing support for chains of the form "Root -> End Entity" or just "End Entity". Similarly, webpki doesn't support the use of the "Common Name" field to specify the domain name or IP address; instead it only uses subjectAltName. Lots of people had to change their test certificates to move the domain name into subjectAltName, except for the tests that were specifically testing support for the invalid form.

My priority right now is to improve the quality of webpki's support for things that are consistent with Web PKI standards. Adding support for IP addresses in subjectAltName is definitely one of those things.

As you work on your rustls integration, please keep track of what issues you run into regarding webpki. If webpki is accepting a certificate that it shouldn't accept, that's a very high-priority issue for me to fix; please file issues when you encounter this. If it rejects a certificate that is valid and that conforms to Web PKI standards, that's a medium-priority issue for the webpki project; again, please file a separate issue for each one, after checking that there is not already an issue on file. As far as extensions to the Web PKI standards are concerned, those are low priority for the webpki project, but they won't be dismissed without consideration.
The curl test suite is not very thorough when it comes to TLS/cert testing; we mostly assume that the TLS libraries have their own test suites for that. However, once we merge this work into master it should be easy enough for a wider crowd of users to build and run curl+rustls builds against whatever they like and test it further that way.
For fun, I did a build with Hyper and rustls (all from master branches)... and I ran into this panic:
Rust has a common logging system in the log crate.
That doesn't explain the panic though. Perhaps Hyper is also registering a global logger.
I retried the command line with a build without Hyper and I got the same panic then as well.
I've removed env_logger (rustls/rustls-ffi#22), and confirmed that this fixes the panic. This is also a reminder that I need to add across-the-board panic handling and return errors, since panicking across an FFI boundary is undefined behavior (https://blog.rust-lang.org/inside-rust/2020/02/27/ffi-unwind-design-meeting.html). Separately I'll work on a better facility to share logs from rustls to curl.
Panics can never be the right choice for a library. If something goes wrong you need to return a proper error so that the user of the library can decide the next step. It is never the library's choice to abort.
There's something wrong in the socket waiting/handling that makes it very slow. Try downloading from curl.se or even curl.haxx.se:

plain debian curl with openssl:

git master with crustls:

On repeated invokes it seems to vary between 1 and 2.3 seconds, while the openssl build does it in 33-45 milliseconds.
Ah, interesting. I can reproduce locally too. Adding timestamps with ts:
We can see two 1-second gaps:
Both of these are happening in
I could use some guidance here: is it definitely the case that

I've pushed a commit (f98c23d) that fixes one of the 1-second sleeps. However, there's still another 1-second sleep remaining. It seems we get to a point where we need to read from the socket, and reading would block, so we return.
Alright, figured it out! I needed an implementation of Curl_ssl_getsock. Most of the other SSL implementations use the same implementation, which relies on connssl->connecting_state. rustls.c doesn't use (or update) the connecting_state state machine. As a quick fix, I added another clause to the #ifdef for Curl_ssl_getsock, and a non-static function in rustls.c. But I think a cleaner refactor is warranted.
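For context, a sketch of the getsock idea: the callback tells the event loop which socket events to wait for, so it can poll instead of falling back to a one-second timeout. The macro and function names here are illustrative, not curl's exact API:

```c
#include <stdbool.h>

/* Bitmask convention: low bits mean "wait for readability" on socket i,
 * higher bits mean "wait for writability". Names are illustrative. */
#define SKETCH_READSOCK(i)  (1 << (i))
#define SKETCH_WRITESOCK(i) (1 << ((i) + 16))

/* Hypothetical getsock-style callback: report which events we need on the
 * TLS socket. wants_write would be true when the TLS library has buffered
 * ciphertext left to flush. */
static int rustls_getsock_sketch(int sockfd, bool wants_write, int *socks)
{
  socks[0] = sockfd;
  return wants_write ? (SKETCH_READSOCK(0) | SKETCH_WRITESOCK(0))
                     : SKETCH_READSOCK(0);
}
```

With this information the event loop can block in poll/select on exactly the events the TLS state machine is waiting for, which removes the kind of fixed-interval sleep observed above.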
Sounds like an excellent idea!
With #6558 merged, this just needs a little adjustment and then I think we're getting ready to merge this one too... Right?
Yes, I think this is just about ready to merge! There are still some config settings that are ignored by this integration, like ciphersuite selection, TLS version selection, and disabling peer certificate validation, but I hope to add those to crustls and then send a follow-up PR to curl for them.

One thing I'm slightly stuck on: when I run tests locally, 2100 consistently fails. If I build with openssl and no rustls, 2100 passes locally. This is a DOH test. Here's the diff output:
I suspect that test is flaky. I cannot reproduce the rustls failure right now with your code and the latest crustls on my Debian box! 😢 I can rerun the test 50 times in a row just fine, with and without valgrind.

I noticed you merged master into your branch. We'd rather see you rebase your commits (and squash them suitably, then force-push) so that we can merge them without merge commits and maintain a linear history.

Since this backend is opt-in and totally new, I won't mind if we merge it early to make it easier to continue working on it with additional PRs.
Sounds good. I am happy to rebase. On my own repos, I usually squash a whole PR into a single commit. Is that okay here, or would you rather retain a series of commits that reflect the evolution of this branch during review?
Test 2100 started passing for me again, which leans towards your idea of a flaky test. I did run it under gdb once and saw that it was going to do a null pointer dereference in Curl_connecthost because Curl_dns_entry *remotehost was null. I've included the log below under a details tag in case it's useful. If I can reproduce a few more times I'll file a separate issue.
```
~/curl/tests $ ./runtests.pl 2100 -g
********* System characteristics ********
* curl 7.75.1-DEV (x86_64-pc-linux-gnu)
* libcurl/7.75.1-DEV OpenSSL/1.1.1f (crustls/0.2.0/rustls/0.19.0) zlib/1.2.11
* Features: alt-svc Debug HTTPS-proxy IPv6 Largefile libz MultiSSL NTLM NTLM_WB SSL TLS-SRP TrackMemory UnixSockets
* Disabled:
* Host: carapace
* System: Linux carapace 5.8.0-41-generic #46~20.04.1-Ubuntu SMP Mon Jan 18 17:52:23 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* OS: linux
* Servers: SSL HTTP-IPv6 HTTP-unix FTP-IPv6
* Env: Valgrind Libtool
* Seed: 257452
*****************************************
break cotest 2100...[HTTP GET using DOH]
nnecGNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Breakpoint 1, Curl_connecthost (data=0x5555555f6158, conn=0x5555555fb648, remotehost=0x5555555d4048) at connect.c:1360
Breakpoint 1, Curl_connecthost (data=0x5555555d9d08, conn=0x5555555f56c8, remotehost=0x0) at connect.c:1360
Quit anyway? (y or n) y
```
This adds a new TLS backend, rustls. It uses the C-to-rustls bindings from https://github.com/abetterinternet/crustls. Rustls is at https://github.com/ctz/rustls/. There is still a fair bit to be done, like sending CloseNotify on connection shutdown, respecting CAPATH, and properly indicating features like "supports TLS 1.3 ciphersuites." But it works well enough to make requests and receive responses. Blog post for context: https://www.abetterinternet.org/post/memory-safe-curl/
Rebased and (IMO) ready to merge. During a last round of testing I found and fixed a couple of memory leaks in error paths, and added a better error message for failure to create a session. I also realized I had missed a couple of calls to Curl_ssl_getsock in #6558. You can see the diffs in 4179224. I'm happy to pull those into their own PR if you'd like.
Thanks for your work @jsha. This is now merged and can be experimented with more easily by more people, and we can take follow-up PRs against master.
This adds a new TLS backend, rustls. It uses the C-to-rustls bindings
from https://github.com/abetterinternet/crustls.
Rustls is at https://github.com/ctz/rustls/.
Blog post for context:
https://www.abetterinternet.org/post/memory-safe-curl/
To build:

./configure --with-rustls --with-default-ssl-backend=rustls && make

The resulting curl tool uses rustls as the default backend, so you can use it to make various requests. The rustls backend emits a lot of logs with infof, so you can pass -v to see that it's working.