Re: libcurl read-like interface
From: XSLT2.0 via curl-library <curl-library_at_cool.haxx.se>
Date: Sat, 26 Dec 2020 12:09:45 +0100
"Re-architecture"!
Instead of "only critique", here is a (possible) proposition to do both
"read-like" (efficiently) and the current stuff.
There should be a definition of what *semantically* is "read" or "write"
according to the request/use case.
The "do/doing/done" internal phase should start with a "pre-rw" phase,
defined as: whatever is needed to start the read or write phase defined
above.
"do/doing/done" = "pre-rw" + "rw phase" (+ EOF|trailers?)
[For the sake of simplifying the explanation, I am skipping the
"headers" phase here, including them in "pre-rw", they SHOULD of course
still be exposed coherently.]
Let me explain with a simple example: http/1.1 GET
In this case the "read" means reading the body.
"write" means nothing (returns an error).
Now we implement:
curl_easy_perform_rw()
It does the same thing as CURLOPT_CONNECT_ONLY + the new "pre-rw"
phase, then returns to the caller.
In our example, the existing HTTP SETOPTs can still be used prior to
calling curl_easy_perform_rw(): location, user agent, cookies, specific
headers, proxies, authentication, etc...
In a "classic" programming scheme, this is sort of "openfile()".
The caller then has:
curl_easy_read():
Reads the http/1.1 body with the same semantics as read() (with or
without EAGAIN; both are possible, perhaps selected through a sort of
direct_io SETOPT flag to mimic the kernel).
This new "read" is now as efficient as it can be, ranging from
directly receiving the exact amount of data from the socket into
caller's provided buffer (plain http/1.?), to being on top of "filters"
plus an OpenSSL BIO stack with possible proxies in the mix.
http/1.1 GET is simple, and in this case curl_easy_read() performs
about the same task as curl_easy_recv(). It still makes sense if we want
gzip/http 2: libcurl can simply add "filters" on top of the stack before
exposing curl_easy_read() (same principle as BIO filters for OpenSSL),
and in this case curl_easy_read() would be curl_easy_recv() +
appropriate "filters".
curl_easy_write(): returns "error" in the example of http GET
curl_easy_eof()
curl_easy_cleanup(): stays to play the role of "closefile()".
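To make the "filters on top of the stack" idea a bit more concrete, here is a minimal sketch in plain C. Nothing in it is libcurl or OpenSSL API: struct rl_filter, rl_mem_read(), rl_upper_read(), the two init helpers and rl_read() are all hypothetical names invented for illustration. The memory source stands in for the socket, and the upper-casing layer stands in for whatever transform (gzip, http 2 framing, ...) a real stack would add.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pull-style filter: each layer reads from the one below it. */
struct rl_filter {
  /* Returns bytes produced, 0 at end of stream, negative on error. */
  long (*read)(struct rl_filter *self, char *buf, size_t len);
  struct rl_filter *below;  /* next layer down, NULL for the source */
  const char *data;         /* used by the memory source only */
  size_t remaining;         /* used by the memory source only */
};

/* Bottom layer: stands in for the socket, serves bytes from memory. */
static long rl_mem_read(struct rl_filter *self, char *buf, size_t len)
{
  size_t n = self->remaining < len ? self->remaining : len;
  memcpy(buf, self->data, n);
  self->data += n;
  self->remaining -= n;
  return (long)n;
}

/* Transform layer: stands in for a decoding step; upper-cases ASCII. */
static long rl_upper_read(struct rl_filter *self, char *buf, size_t len)
{
  long n, i;
  assert(self->below != NULL);
  n = self->below->read(self->below, buf, len);
  for(i = 0; i < n; i++)
    if(buf[i] >= 'a' && buf[i] <= 'z')
      buf[i] = (char)(buf[i] - 'a' + 'A');
  return n;  /* 0 (EOF) and negative (error) pass through unchanged */
}

static void rl_mem_init(struct rl_filter *f, const char *data, size_t len)
{
  f->read = rl_mem_read;
  f->below = NULL;
  f->data = data;
  f->remaining = len;
}

static void rl_upper_init(struct rl_filter *f, struct rl_filter *below)
{
  f->read = rl_upper_read;
  f->below = below;
  f->data = NULL;
  f->remaining = 0;
}

/* What the caller sees: a read() on whatever sits on top of the stack. */
static long rl_read(struct rl_filter *top, char *buf, size_t len)
{
  return top->read(top, buf, len);
}
```

With only the source on the stack, rl_read() copies straight into the caller's buffer, which is the "plain http" case above; pushing more layers changes what the bytes look like but not how the caller reads them.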
curl_easy_perform() still exists (for the 99.99% of users who need it, of
course!) but is now (in our example):
curl_easy_perform(CURL *curl) {
  curl_easy_perform_rw(curl);
  while(!curl_easy_eof(curl)) {
    sz = curl_easy_read(curl, internal_buf, maxsz);
    res = curl->write_callback(internal_buf, 1, sz, userdata);
    /* some error handling code */
  }
}
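The pseudocode above cannot compile as-is (the functions do not exist and CURL is an opaque type), so here is a compilable toy version of the same "reversed" layering, assuming an in-memory body instead of a network transfer. All demo_* names are hypothetical and are not libcurl API; only the callback signature and the "returned count differs from the offered count means error" convention are borrowed from libcurl's write callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a transfer handle; NOT libcurl's CURL type. */
struct demo_handle {
  const char *body;  /* pretend response body */
  size_t pos;        /* bytes already handed out */
  size_t len;        /* total body length */
};

/* Same shape as a libcurl write callback. */
typedef size_t (*demo_write_cb)(char *ptr, size_t size, size_t nmemb,
                                void *userdata);

static int demo_eof(const struct demo_handle *h)
{
  return h->pos >= h->len;
}

/* Pull-style read: copies at most maxsz bytes into the caller's buffer. */
static long demo_read(struct demo_handle *h, char *buf, size_t maxsz)
{
  size_t n = h->len - h->pos;
  if(n > maxsz)
    n = maxsz;
  memcpy(buf, h->body + h->pos, n);
  h->pos += n;
  return (long)n;
}

/* Callback-driven "perform" built on top of the pull-style read.
   Returns 0 on success, -1 on a read error or a short callback write. */
static int demo_perform(struct demo_handle *h, demo_write_cb cb,
                        void *userdata)
{
  char internal_buf[8];  /* deliberately tiny to force several rounds */
  assert(h != NULL && cb != NULL);
  while(!demo_eof(h)) {
    long sz = demo_read(h, internal_buf, sizeof(internal_buf));
    if(sz < 0)
      return -1;
    if(cb(internal_buf, 1, (size_t)sz, userdata) != (size_t)sz)
      return -1;
  }
  return 0;
}

/* Example callback: appends into a fixed-size, NUL-terminated sink. */
struct demo_sink {
  char data[64];
  size_t used;
};

static size_t demo_collect(char *ptr, size_t size, size_t nmemb,
                           void *userdata)
{
  struct demo_sink *s = userdata;
  size_t n = size * nmemb;
  if(n > sizeof(s->data) - 1 - s->used)
    return 0;  /* no room left: report a short write */
  memcpy(s->data + s->used, ptr, n);
  s->used += n;
  s->data[s->used] = '\0';
  return n;
}
```

A caller who wants the classic behaviour passes a callback to demo_perform(); one who wants the read-like behaviour calls demo_read() in a loop and never registers a callback, in line with the caveat about not mixing the two styles in one transfer.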
Caveat (for simplification): the caller CANNOT mix
curl_easy_read()/curl_easy_write() and write/read callbacks in the same
"perform".
So you see, quite a big challenge "reversing the stack" (callbacks on
top of BIO read, not the opposite) for a feature that nobody really
wanted so far!
There might be simpler solutions... they didn't cross my mind, given
the little I know of libcurl.
Cheers
Alain
-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.se/mail/etiquette.html
Received on 2020-12-26