Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transfer numbers for logging, --trace-ids #11185

Closed
wants to merge 9 commits into from
Closed

Conversation

icing
Copy link
Contributor

@icing icing commented May 23, 2023

Adding identifier to transfers and connections using the same connection cache (implicitly
the case when using the same multi handle). With type curl_off_t they are unique for all practical purposes.

Adding new CURLINFO_XFER_ID and CURLINFO_CONN_ID to the interface and documentation
to make them accessible for libcurl applications.

Adding --trace-ids to the curl command line to include the identifier in verbose log output. Example:

[4-x] * processing: https://curl.se/?4
...
[4-x] * STATE: INIT => CONNECT
...
[4-0] * STATE: PROTOCONNECT => DO
[4-0] * h2 [:method: GET]
...
[4-0] < HTTP/2 200
...
[4-x] * Expire cleared

This does not change any logging statements (yet). If we use these in the verbose output of curl itself, it would be suitable to change several statements that now contain connection id and a handle pointer.

lib/conncache.h Outdated
@@ -40,6 +40,7 @@ struct conncache {
struct Curl_hash hash;
size_t num_conn;
long next_connection_id;
long next_easy_id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we should maybe switch to curl_off_t here to be sure this is a 64 bit type, as long will be 32 bit on Windows - and the inconsistency might cause confusions down the line.

Or if we are fine with "just" 32 bits before wrapping, maybe we can go with unsigned int.

Whichever direction, I think we can apply the same thinking to the *connection_id fields, for the same reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both sound fine to me. With a 64 bit type, we reach uniqueness for all practical purposes, I guess (inside the same multi/conn cache).

The next question is then if we want to add this to all the info() statements or make this available to the logging functions. For the latter, it would sneak into the API and I am not sure that this is a good idea. But I do not know if someone is attached to the current log strings either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a signed 32 bit counter, there is still a theoretical chance that someone could add a handle every millisecond for 25 days and see it reach max. Going 64 bit takes away the entire discussion.

For the latter, it would sneak into the API and I am not sure that this is a good idea.

Maybe, since the DEBUGFUNCTION gets a CURL * as an argument, we can add a CURLINFO_* variable for this "easy_id" so that the app can opt to show this?

It doesn't solve the connection id, and it also then does not make the output of these id use a standard look as it will be left for the application.

But I do not know if someone is attached to the current log strings either.

I think many users are used to them at the very least, but we never promised that they would remain like this.

Copy link
Contributor Author

@icing icing May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, some more thinking between issue solving. A backwards compatible way to introduce this:

  • go curl_off_t for transfer and connection id (do we have printf format for that one btw?)
  • add CURLINFO_EASY_ID and CURLINFO_CONN_ID for the app to retrieve identifiers (-1 for unassigned, e.g. easy not part of multi yet or connection not assigned)
  • add --trace-id to curl that adds these to its verbose/trace output

Do we provide a standard implementation for apps to set as CURLOPT_DEBUGFUNCTION? Something along the lines of:

CURL_LOGFMT_DEFAULT   (0)
CURL_LOGFMT_IDS       (1 << 0) /* add EASY and CONN ids */
CURL_LOGFMT_TIME      (1 << 1) /* add timestamps */
CURL_LOGFMT_HEXDUMP   (1 << 2) /* dump data in hex table */

/* set DEBUGFUNCTION/DEBUGDATA to log to FILE for this easy handle */
curl_easy_set_log(CURL*, FILE*, int fmt)
/* set DEBUGFUNCTION/DEBUGDATA to log to FILE for all easy handles in multi
    that have no other setting already */
curl_multi_set_log(CURLM*, FILE*, int fmt)

lib/curl_log.h Outdated
@@ -124,15 +124,15 @@ void Curl_log_cf_debug(struct Curl_easy *data, struct Curl_cfilter *cf,
* [CONN-1-0][CF-SSL] this filter very much rocks
* on connection #1 with sockindex 0 for filter of type "SSL". */
#define DMSG(d,msg) \
"[CONN-%ld] "msg, (d)->conn->connection_id
"[CONN-%zd] "msg, (d)->conn->connection_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to use "%" CURL_FORMAT_CURL_OFF_T everywhere for outputting curl_off_t variables correctly. The z flag is for size_t and it can be different than curl_off_t !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed these macros.

lib/multi.c Outdated
@@ -194,7 +194,7 @@ static void mstate(struct Curl_easy *data, CURLMstate state
#if defined(DEBUGBUILD) && !defined(CURL_DISABLE_VERBOSE_STRINGS)
if(data->mstate >= MSTATE_PENDING &&
data->mstate < MSTATE_COMPLETED) {
long connection_id = -5000;
curl_off_t connection_id = -5000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be -1 now to match the documentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, no longer logged explicitly.

lib/multi.c Outdated
@@ -3690,7 +3704,7 @@ void Curl_expire_clear(struct Curl_easy *data)
}

#ifdef DEBUGBUILD
infof(data, "Expire cleared (transfer %p)", data);
infof(data, "Expire cleared (easy handle %p)", data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use the transfer id instead for consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the %p since --trace-ids takes care of that.

- add an `id` long to Curl_easy, -1 on init
- once added to a multi (or its own multi), it gets
  a non-negative number assigned by the connection cache
- `id` is unique among all transfers using the same
  cache until reaching LONG_MAX where it will wrap
  around. So, not unique eternally.
- curl_easy_getinfo() can be used to retrieve these
- CURLINFO_CONN_ID returns the connection id attached to
  data or, if none present, data->state.lastconnect_id
- variables and type declared in tool for write out
- it is not the identifier of the handle, but of the transfer
  done by it.
- removing all changes to `infof()` statements
- added and documented --trace-ids to prepend (after the timestamp)
  the transfer and connection identifiers to each verbose log line
- format is [n-m] with `n` being the transfer id and `m` being the
  connection id. In case there is not valid connection id, print 'x'.
- Log calls with a handle that has no transfer id yet, are written
  without any ids.
@icing icing changed the title Internal transfer numbers for logging transfer numbers for logging, --trace-ids Jun 11, 2023
- using CURL_FORMAT_OFF_T for connection ids
- removed curl_log.h macros to log connection ids in infof()
- changed CFLOG() implementation to only log the filter name
  since transfer and connection ids are now handled elsewhere
- removed some connection id explicit logging where it is
  now redundant
- removed some handle pointer logging
@bagder bagder closed this in e024d56 Jun 12, 2023
bagder pushed a commit that referenced this pull request Jun 12, 2023
- added and documented --trace-ids to prepend (after the timestamp)
  the transfer and connection identifiers to each verbose log line
- format is [n-m] with `n` being the transfer id and `m` being the
  connection id. In case there is not valid connection id, print 'x'.
- Log calls with a handle that has no transfer id yet, are written
  without any ids.

Closes #11185
bch pushed a commit to bch/curl that referenced this pull request Jul 19, 2023
- add an `id` long to Curl_easy, -1 on init
- once added to a multi (or its own multi), it gets
  a non-negative number assigned by the connection cache
- `id` is unique among all transfers using the same
  cache until reaching LONG_MAX where it will wrap
  around. So, not unique eternally.
- CURLINFO_CONN_ID returns the connection id attached to
  data or, if none present, data->state.lastconnect_id
- variables and type declared in tool for write out

Closes curl#11185
bch pushed a commit to bch/curl that referenced this pull request Jul 19, 2023
- added and documented --trace-ids to prepend (after the timestamp)
  the transfer and connection identifiers to each verbose log line
- format is [n-m] with `n` being the transfer id and `m` being the
  connection id. In case there is not valid connection id, print 'x'.
- Log calls with a handle that has no transfer id yet, are written
  without any ids.

Closes curl#11185
ptitSeb pushed a commit to wasix-org/curl that referenced this pull request Sep 25, 2023
- add an `id` long to Curl_easy, -1 on init
- once added to a multi (or its own multi), it gets
  a non-negative number assigned by the connection cache
- `id` is unique among all transfers using the same
  cache until reaching LONG_MAX where it will wrap
  around. So, not unique eternally.
- CURLINFO_CONN_ID returns the connection id attached to
  data or, if none present, data->state.lastconnect_id
- variables and type declared in tool for write out

Closes curl#11185
ptitSeb pushed a commit to wasix-org/curl that referenced this pull request Sep 25, 2023
- added and documented --trace-ids to prepend (after the timestamp)
  the transfer and connection identifiers to each verbose log line
- format is [n-m] with `n` being the transfer id and `m` being the
  connection id. In case there is not valid connection id, print 'x'.
- Log calls with a handle that has no transfer id yet, are written
  without any ids.

Closes curl#11185
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

2 participants