cURL / Mailing Lists / curl-library / Single Mail

curl-library

RE: introduced an auth callback

From: Joe Mason <jmason_at_rim.com>
Date: Mon, 7 May 2012 00:01:37 +0000

> From: curl-library-bounces_at_cool.haxx.se [curl-library-bounces_at_cool.haxx.se] on
> behalf of Daniel Stenberg [daniel_at_haxx.se]
> Sent: Friday, May 04, 2012 4:27 PM
> To: libcurl development
> Subject: introduced an auth callback
>
> > I changed the interface quite a bit from what I proposed below, since I
> > decided that calling setopt from inside a callback is too weird. There's
> > too
> > much confusion about what you can and can't call.
>
> Cool. Yes, I agree with you that it is a bit of a wormcan that we probably are
> better off avoiding.

Although as you suggested, a "curl_cb_foo" namespace for functions which are intended to be called from callbacks might alleviate this. (Is curl_easy_pause the only thing that would fit in this namespace right now? Maybe we should add an alias.)

> > Instead, I'm using this callback signature:
> >
> > curl_auth_result (*curl_auth_callback)(curl_auth_type type,
> > curl_auth_scheme scheme,
> > const char *realm, (
> > unsigned retry_count,
> > const char **username,
> > const char **password,
> > void *userdata);
>
> We're getting an awful lot of arguments. I think we should consider filling
> out a struct and pass in a pointer to it, like:
>
> curl_auth_result (*curl_auth_callback)(struct curl_authcb *info,
> void *userdata);
>
> ... and then popular the 'curl_authcb' struct properly.

Yeah, I agree.

> I'm pretty sure we also need to pass in the exact URL that we tried to use
> when the need for auth was triggered. I'm thinking in cases where libcurl
> follows HTTP redirects and at some point down the line a URL wants auth...

Good point.

> I wonder if it would even be interesting from an application's point of view
> to provide a pointer to the URL info deconstructed, like hostname separately.

Probably. Does curl already have such a struct? If we need to add one, I think that might be out of scope of this patch, though.

> > Where curl_auth_result can be CURLAUTH_RESULT_CONTINUE,
> > CURLAUTH_RESULT_CANCEL or CURLAUTH_RESULT_PAUSE.
>
> I think we also need some more fatal error code like if it runs out of memory
> or similarly. Like CURLAUTH_RESULT_ERROR - due to errors it failed to do
> anything sensible and this tells libcurl to bail out.

What would curl do differently here from CURLAUTH_RESULT_CANCEL? Would the callback be able to set a more detailed error code?

> > username and password are in/out parameters: they point to the credentials
> > used in the last failed login attempt if there is one, or NULL if the
> > callback is being called because there are no credentials set. The user
> > is
> > expected to set new credentials with "*username = new_username; *password
> > =
> > new_password; return CURLAUTH_RESULT_CONTINUE". (The new strings will be
> > strdup'd.)
>
> I see a problem with this concept and I think we need to work on it a bit.
>
> Let me elaborate:
>
> An auth callback that wants to pass back a specific user name would need to
> point to (possibly malloced) memory somewhere to hold that name and while
> libcurl should indeed strdup that memory, how should the application code free
> the memory? It being a callback makes it tricky for an application to store
> that and free later, especially if the callback can get called again before
> the request is done as then it may need to return back a _second_ name that
> also might need to be allocated...

Ah. Aha. Yes, of course. I've been focusing on the CURLAUTH_RESULT_PAUSE case, since that's the one I intend to use, and only testing the simpler case with static strings...

> How do I suggest we do it instead?
>
> I don't have a really good solution, but I think we must let the application
> copy the data to a buffer libcurl holds so that the ownership gets easier and
> it reduces the malloc problem. Possibly the struct mentioned above will point
> to two buffers and hold two buffer sizes and then the application can copy at
> most that many bytes to the destination. I realize it opens up for some nasty
> mistakes and I would like an easier way but I can't really think of any atm.

I think with this in mind, I'd like to go back to the idea of calling a "curl_cb_setcredentials" function from the callback. Even though it would do 99% the same thing as curl_easy_setopt(CURLOPT_USERNAME), etc, making it a separate function would avoid confusion about whether setopt in general is legal to call from callbacks. Then we can use the same simple strdup scheme that curl_easy_setopt uses.

> > If they return CURLAUTH_RESULT_PAUSE, the username and password are
> > ignored
> > and the request is paused as if curl_easy_pause were called. It can then
> > be
> > resumed with the new function
> >
> > CURLcode curl_easy_resume_auth(CURL *handle, const char *username, const
> > char *password);
>
> I'm having more problems with this pause thing - especially since this adds a
> very unique and special-pupose function, but I also realize it might be
> necessary. The question is if a paused state is really that important instead
> of just saying cancel if the info isn't there and then force a renewed request
> later on when the information is indeed available. It would certainly make the
> API cleaner and simpler.

It would make the curl API simpler, but it would make the embedder's job more complicated. My design philosophy (which may not match libcurl's...) is to push as much complexity as possible into the lower level library where it can be dealt with once, rather than having each embedder have to deal with it separately and potentially make mistakes. The difficulty is deciding when you're pushing too much complexity down and complicating the library unjustifiably (as in the famous "worse is better" example with EINTR). In this case, since curl_easy_pause already exists, the implementation of the pause state isn't very complicated.

In this case, the embedder would need to save all the information used to construct the initial request and recreate it, versus just calling a single function to resume. Also if I'm reading the curl_easy_perform docs correctly, the embedder would need to keep track of when the handle is definitely finished with in order to reuse it to send another request. (On the other hand, use of pause/resume could lead to errors if the embedder loses track of whether it's paused or not, so maybe this is no better.)

As a practical matter, if you have to resend the request, for any auth type more complex than Basic, this will lead to an extra request/response roundtrip which is wasteful of network bandwidth, which is significant for mobile systems. (It's one of the problems I'm trying to solve by introducing this in the first place.)

> An alternative approach (that may requite slightly more internal state
> awareness) would make a curl_easy_pause(UNPAUSE) call make the auth callback
> get called again if it previously returned CURLAUTH_RESULT_PAUSE and then the
> transfer can continue like before.

This actually wouldn't require any more state awareness than implementing CURLE_BAD_STATE would.

I prefer having a separate function since, as a user of libcurl, I find it easier to pair up two different methods of pausing with two different methods of resuming. (The difference between an "auth pause" and a regular pause is that you need to set the credentials or cancel before resuming. I thought it was a good idea to put that right into the resume function call.) I felt that if I was writing an app which uses both curl_easy_pause and auth pause (for example on different streams in a multihandle), and I got mixed up about what state it was in, I'd find it less confusing to receive an explicit CURLE_BAD_STATE due to calling curl_easy_pause rather than curl_easy_resume_auth or vice versa, instead of having it accidentally resume reading/writing before I set the credential to use by calling curl_easy_pause on a stream that was using an "auth pause".

But that's a judgement call. Maybe I'm not typical. I can see an argument that using curl_easy_pause directly is easier to describe: "CURL_AUTHRESULT_PAUSE is just like CURL_WRITEFUNC_PAUSE. You can call curl_cb_setcredential before calling curl_easy_pause(UNPAUSE) - if you don't, it will cancel the auth on resume." Actually, that's so much easier to describe than the way I proposed that I just convinced myself. (And it's even easier if you can use curl_cb_setcredential at any time - though not from another thread - in which case the credential will be used for the next 401 or 407 that happens to be received. Don't see any technical reason this wouldn't work.)

> > If the username and password are NULL, this is treated as if the user had
> > returned CURLAUTH_RESULT_CANCEL, otherwise they're strdup'd and the
> > request
> > continues as if the user had returned CURLAUTH_RESULT_CONTINUE.
>
> I think CURLAUTH_RESULT_CONTINUE should imply that we use the name and
> password as provided and if they are NULL we use them blank.

I hadn't thought about blanks. Good idea. This does mean that curl_easy_resumeauth (or curl_cb_setcredential) can't just use two NULL's as a code for cancel. And adding an authresult parameter doesn't make much sense if it's being called when auth is not actually paused.

Really, the concept we want isn't that setting the credentials to NULL "cancels the current auth" - we have three states for each credential (not set at all, set to an empty string, and set to a value). And whenever we enter the auth callback, when we come back for it (either immediately or through pause/resume) if the credentials are not set, it cancels.

We could use "" to mean empty and NULL to mean cleared completely. Or NULL and "" could be synonyms, and we have a separate function to clear the credentials:

CURLcode curl_cb_setcredentials(CURL *curl, curl_auth_type type, const char *username, const char *password);
CURLcode curl_cb_clearcredentials(CURL *curl, curl_auth_type type);

And then to resume from a paused auth: "curl_cb_set_credential(curl, type, username, password); curl_easy_resume(UNPAUSE)". Or to decide to fail the auth: "curl_cb_clear_credentials(curl); curl_easy_resume(UNPAUSE)". Or from inside the auth callback, just call one of those two functions and then return.

What if the credentials are set to the same values that were just tried and failed? Should we cancel then as well, or blindly send the same credentials again and fail again? I favour the former. That way if the embedder forgets to call curl_cb_*, it just fails the auth instead of looping, possibly forever.

Joe
---------------------------------------------------------------------
This transmission (including any attachments) may contain confidential information, privileged material (including material protected by the solicitor-client or other applicable privileges), or constitute non-public information. Any use of this information by anyone other than the intended recipient is prohibited. If you have received this transmission in error, please immediately reply to the sender and delete this information from your system. Use, dissemination, distribution, or reproduction of this transmission by unintended recipients is not authorized and may be unlawful.

-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Received on 2012-05-07