curl / Mailing Lists / curl-library / Single Mail
Buy commercial curl support from WolfSSL. We help you work out your issues, debug your libcurl applications, use the API, port to new platforms, add new features and more. With a team lead by the curl founder himself.

Re: Above and beyond 32 protocols

From: Ævar Arnfjörð Bjarmason via curl-library <curl-library_at_lists.haxx.se>
Date: Mon, 13 Sep 2021 17:23:18 +0200

On Mon, Sep 13 2021, Timothe Litt via curl-library wrote:

> [[PGP Signed Part:Undecided]]
> On 13-Sep-21 07:01, Daniel Stenberg via curl-library wrote:
>
> # Feedback
>
> I'm all ears. Especially if you have alternative solutions to suggest or if you have an opinion on which way to go.
>
> This is not a problem we must solve *right now*, but I would feel better if we have an idea about how to address it when we get there. Because I'm
> convinced we will reach this point eventually.
>
> Here's an approach that has some short-term pain, but solves the problem permanently - including for protocol 65, 129, ...
>
> Switch to an expandable array of bits, similar to select()'s FDSETS:
>
> Deprecate the existing CURLOPT_PROTOCOLS and CURLOPT_REDIR_PROTOCOLS, replace with CURLOPT_PROTOCOLS_EXT and CURLOPT_REDIR_PROTOCOLS_EXT.
>
> Use indices rather than bitmasks for CURLPROTO_* (e.g. add CURLPROTO_DICTn, CURLPROTO_FTPn, etc)
>
> Switch from a long to a pointer to typedef struct { unsigned int size; uint8 bits[(CURL_PROTO_MAXn + 7)/8]} CURLPROTO.
>
> Provide some macros along the lines of FD_CLR/FD_ISSET/FD_SET/FD_ZERO, but instead of the FD_SETSIZE hack, use the 'size' value of the structure, which
> will increase every time you add 8 more protocols. But clients compiled earlier will have a smaller "size", so will not inadvertently enable new
> protocols.
>
> e.g. the user-visible functions might be something like:
>
> #define CURLPROTO_SET( str, bit ) do { ASSERT((bit) <= CURL_PROTO_MAXn && (bit) <= (str)->size); (str)->bits[(bit)>>3] |= 1u<<((bit)&7); } while(0) /*
> Could also provide a vararg function to set multiple */
>
> #define CURLPROTO_ISSET( str, bit ) ( ((bit) > CURL_PROTO_MAXn || (bit) > (str)->size))? 0 : (str)->bits[(bit)>>3] & 1u<<((bit)&7) )
>
> So specifying protocols looks something like:
>
> CURLPROTO allow = { sizeof( CURLPROTO ) }; /* The initialization could be a macro - e.g CURL_PROTO_DECL(allow); */
>
> CURLPROTO_ZERO(&allow); /* If a stack or malloc()'d variable */
>
> CURLPROTO_SET( &allow, CURL_PROTO_FTPn );
>
> CURLPROTO_SET(&allow, CURL_PROTO_HTTPSn);
>
> curl_easy_setopt( handle, CURLOPT_PROTOCOLS_EXT, &allow);
>
> Internally, use the current (usually larger) size so you don't have to bounds-check every reference; just memcpy min(libraryMAX, 'size' provided) to an
> internal structure. Convert the deprecated functions to set(or clear) the first few bits in the internal structure as specified; they should zero all
> bits 32+. (Be careful about endianisms.)
>
> There ought to be a function to return, in the same format, a structure listing all the protocols implemented by the current library.
>
> This scheme provides backward compatibility with infinite expandability. There's some overhead for the client, but these aren't critical path - they're
> probably setup once and tested never. In the library, the assertions will optimize out, and a compiler will optimize the bit references to be no more
> expensive than the current bit tests. The compatibility layer is pretty thin - it probably ends up being a cast & possible byteswap.
>
> With bit more thought (pun intended), you might be able to avoid introducing the new CURLPROTO_*n symbols - but at first blush, it seems expensive to do
> that while also exposing the existing API.
>
> Polishing is left as an exercise for the reader...

This sounds already off-topic for what Daniel asked about, which is how
to do this in a backwards-compatible way. I can't think of a good way to
do that off-hand.

But just on this: Isn't this a rather elaborate way to do what C gives
you these days (including I think, in C89) with a struct where you
specify the bits a given unsigned int variable should occupy? I.e.:

    struct curl_protocols {
        unsigned int http:1,
                     https:1,
                     ftp:1,
                     ftps:1;
    };

etc., you can keep adding to that at will, trusting the compiler to
expand it for you, and accessing it is going to be:

    proto->https = !!enabled;

I believe that's quite portable, e.g. it's widely used in the git.git
codebase, the first occurance made it in-tree in 2006. I think curl's
more widely ported than that, so YMMV.


-- 
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html
Received on 2021-09-13