
Re: An API for extracting (HTTP) headers?

From: Timothe Litt <litt_at_acm.org>
Date: Tue, 22 Mar 2022 12:39:29 -0400


On 22-Mar-22 07:00, Daniel Stenberg wrote:
> On Tue, 22 Mar 2022, Timothe Litt via curl-library wrote:
>
>> curl_easy_header: did you consider returning an array of structures,
>> rather than just one?
>
> I did. I decided that it wouldn't improve the API but would make
> memory management somewhat more complicated. This way, we don't have
> to generate any arrays or lists, making the returned data easier to
> understand and document.
>
Documentation seems like a wash to me.  It would get rid of fields and
the index argument, but would have to say that the returned pointer is
to an array of structs, not just one.

>> This eliminates the BADINDEX error and amount/index in the
>> structure(s), and allows the application to make just one call
>> instead of one/instance.  This seems simpler and more efficient for
>> the application.
>
> I honestly don't think it makes much difference to applications as you
> would need to iterate over entries anyway and then it doesn't matter
> too much if you have to call libcurl again or if you can check an
> already extracted struct.
>
A user can avoid the index parameter and a separate local variable in
most cases, e.g. while (hout->flags & VALID) { twiddle(hout->value);
... ++hout; }.  Fewer parameters are better, as are fewer members in the
struct.  A user doesn't have to understand something that isn't there...

A user doesn't have to worry about what happens if a new header arrives
between calls.  A single API call can guarantee a consistent result. 
Since you allow this to be called while headers are arriving, multiple
calls may return different counts - presumably only increasing.

This API probably isn't performance critical, but as a general rule one
assumes calls are expensive and that it's cheaper to put a loop in a
called routine than to call a routine in a loop.

> As I also implemented support for extracting headers in the command line
> tool, I got a small chance to actually work with the API a bit and I
> found it rather friendly and straight-forward. It felt good.
>
>> Since the library is holding all the headers, it should not increase
>> the amount of memory required.
>
> Yes it would since libcurl does not store the headers internally using
> the public struct.
>
OK.  But it's only the structs -- 1 (for the end marker) + #headers vs.
1.  You're holding the names and values, which hopefully are the large
items.

I prefer to make one call that gives me a consistent snapshot and has
fewer parameters/members to deal with.  That puts some complexity in the
library - but there's one library, and (hopefully) more than one user.

But it's a matter of style.  You have a working prototype, and it's your
API.

Perhaps others will weigh in.

One more thought - it might be useful to have a flag bit that indicates
when the last header has been received.  That way a user could tell
whether a missing header is truly absent (vs. perhaps not yet arrived),
and likewise whether iterating over all headers has returned everything
in the response (vs. only everything received so far).



-- 
Unsubscribe: https://lists.haxx.se/listinfo/curl-library
Etiquette:   https://curl.haxx.se/mail/etiquette.html
Received on 2022-03-22