Re: Above and beyond 32 protocols

From: Timothe Litt via curl-library
Date: Sat, 10 Sep 2022 08:31:21 -0400

On 10-Sep-22 07:44, Patrick Monnerat via curl-library wrote:
> On 9/13/21 13:01, Daniel Stenberg via curl-library wrote:
>> Hi team.
>> We added support for GOPHERS in late 2020. There's a new PR proposing
>> support for the ManageSieve protocol. We had a PR previously
>> suggesting Gemini support and the other day ICAP was brought up in a
>> discussion. WebSockets is another common one discussed.
>> I don't think it's crazy to imagine that we might add support for
>> more protocols going forward. Sooner or later.
>> This is not a problem we must solve *right now*, but I would feel
>> better if we have an idea about how to address it when we get there.
>> Because I'm convinced we will reach this point eventually.
> One year later, all protocol bits are used !
> In the meantime, CURLOPT_PROTOCOLS_STR has been added for caller's
> use, but this only translates to bits and the internal problem has not
> been resolved yet.
> IMO, using strings internally is much too expensive in overhead.
> Do we have now an idea how we want to extend this internally ?
> - Use a packed struct of bools. Requires C99 for initialization. Very
> clear code for constant protocols but hard to access for a run-time
> computed protocol number.
> - Use an array of 8-bit flags. Also requires C99 for initialization.
> - Use a packed array of flags. Almost impossible to initialize
> statically.
> - Use an array of protocol numbers. High run-time overhead.
> - Drop support for non-64bit curl_off_t.
> - Use a struct with a second set of flags (named CURLPROTO2_*)
> - Something else...
> Adding another protocol will only be possible after this problem is
> resolved.
> I could look at it for an implementation if I knew in which direction
> to go.
> BTW: the websockets protocols are not (yet) handled by protocol2num().
> Patrick

I rather like the array of protocol numbers.  The overhead needn't be
particularly high, especially considering the use cases.

For example: at compile time (or even at curl global init), sort the
array. That allows a binary search to query support; bsearch() and
qsort() are standard. Further, any application is likely to query
protocol support infrequently (typically at initialization), and is
also likely to be interested in only a few of the protocols. So it
could (and could be encouraged to) cache the results in a compact,
application-specific way. For a binary search, the number of probes to
find a protocol is about log2(N), so roughly 8 even with 256 protocols.
It's also easy to enumerate supported protocols with a linear scan.
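
A minimal sketch of the sorted-array idea, using only standard C. The
protocol numbers, the table and the function names (proto_table_init,
proto_is_enabled) are hypothetical and chosen for illustration; they
are not libcurl identifiers or an existing libcurl API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical protocol numbers, for illustration only. */
enum proto_id {
  PROTO_HTTP = 1,
  PROTO_HTTPS = 2,
  PROTO_FTP = 3,
  PROTO_GOPHERS = 29,
  PROTO_WS = 33,  /* a number that would not fit in a 32-bit mask */
  PROTO_WSS = 34
};

/* Enabled protocols, deliberately unsorted to show the one-time sort. */
static int enabled[] = {
  PROTO_WSS, PROTO_HTTP, PROTO_GOPHERS, PROTO_HTTPS, PROTO_WS
};
#define ENABLED_COUNT (sizeof(enabled) / sizeof(enabled[0]))

/* Three-way comparison of two ints, as qsort() and bsearch() expect. */
static int cmp_int(const void *a, const void *b)
{
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x > y) - (x < y);
}

/* Sort once, e.g. at global init; check there are no duplicates. */
static void proto_table_init(void)
{
  size_t i;
  qsort(enabled, ENABLED_COUNT, sizeof(enabled[0]), cmp_int);
  for(i = 1; i < ENABLED_COUNT; i++)
    assert(enabled[i - 1] < enabled[i]);
}

/* Binary search: returns nonzero if the protocol number is enabled. */
static int proto_is_enabled(int id)
{
  return bsearch(&id, enabled, ENABLED_COUNT, sizeof(enabled[0]),
                 cmp_int) != NULL;
}
```

An application would call proto_table_init() once, query the handful of
protocols it cares about with proto_is_enabled(), and keep the answers
in whatever form suits it. Enumeration is a plain loop over enabled[].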

Many of the same arguments apply to an array of (pointers to) strings;
in addition to a simple ordered table with binary search, the
hsearch_r() family could be used. But the overhead is higher, and
intuitively it is not likely to be worthwhile with short protocol
names and a relatively modest number of protocols. O(1) for a hash
isn't much different from fewer than 10 probes for a(n infrequent)
binary search.
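
For comparison, a sketch of the ordered string table variant, again
with bsearch(). The table contents and the name proto_name_supported
are assumptions for illustration, not libcurl code. The table must be
kept in strcmp() order, and the sketch assumes lowercase names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list of supported schemes, kept in strcmp() order. */
static const char *const proto_names[] = {
  "ftp", "gophers", "http", "https", "ws", "wss"
};
#define PROTO_NAME_COUNT (sizeof(proto_names) / sizeof(proto_names[0]))

/* bsearch() passes the key first and a pointer to the element second. */
static int cmp_name(const void *key, const void *elem)
{
  const char *name = (const char *)key;
  const char *const *entry = (const char *const *)elem;
  return strcmp(name, *entry);
}

/* Returns nonzero if the (lowercase) scheme name is in the table. */
static int proto_name_supported(const char *name)
{
  assert(name != NULL);
  return bsearch(name, proto_names, PROTO_NAME_COUNT,
                 sizeof(proto_names[0]), cmp_name) != NULL;
}
```

Each probe here costs a strcmp() rather than an integer compare, which
is the extra overhead mentioned above.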

Either seems reasonable; numbers are simpler and more compact.

Neither enumerating nor querying protocol support should be critical
path items.  Over-optimization is not worthwhile.

Timothe Litt
ACM Distinguished Engineer
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.

Received on 2022-09-10