Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws_sigv4: fix headers computation for some unhandled cases. #7966

Closed

Conversation

outscale-mgo
Copy link
Contributor

Up to this PR, canonical header, and signed header was hardcoded
in the aws-sigv4 signature.

Those patch handle canonical headers and signed headers creation
as it is explain here:
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html

The algorithm tell that signed and canonical must contain at last host and x-amz-date.

So we check whatever they are present in the curl http headers list.
If they are, we use the one enter by curl user, otherwise we generate them.
then we to lower, and remove space from each http headers plus host and x-amz-date,
then sort them all by alphabetical order.

This patch also fix a bug with host header, which was ignoring the port.

Because I now handle the port in the signature,
I had to hardcode the host in the tests, in which the port was random,
which was making impossible to had a signature in the tests data.

char *date_full_hdr = curl_maprintf("x-%s-date:%s",
provider1_low, timestamp);
#ifdef DEBUGBUILD
char *force_host = getenv("CURL_FORCETIME");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the environment variable to force host called force time ?

lib/http_aws_sigv4.c Outdated Show resolved Hide resolved
#endif
else {
char *full_host = data->state.aptr.host ? data->state.aptr.host :
curl_maprintf("host:%s", hostname);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No space after the colon? And what if curl_maprintf() returns NULL?

In what situations is data->state.aptr.host NULL when this is called? What then if there's a custom port number or an IPv6 address host name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't know if not having data->state.aptr.host is possible, I've made this ternary out of fear of having full_host NULL, which would be a bug.
Should I remove the condition ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably wise to keep it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CanonicalHeaders must contain Lowercase(HeaderName) + ':' + Trimall(HeaderValue) + '\n'.
so no space before and after the Value.
I don't see why IPv6 wouldn't work with the current code ?
I guess I could try to add manually the port, if the ip is set using conn->host.name, though most case where a port is needed should be handle using data->state.aptr.host.
My guess is that this would be too much code for handling a too niche corner case, maybe add a comment saying we don't handle port if host is set using conn->host.name would be enough ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that data->state.aptr.host consists of \r\n on its end. You have to trim them in order not to create extra empty lines in CanonicalHeaders.

lib/http_aws_sigv4.c Show resolved Hide resolved
}

if(!Curl_checkheaders(data, date_hdr_key))
curl_slist_append(head, date_full_hdr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No return code check for a function that is fallible.

@bagder
Copy link
Member

bagder commented Nov 5, 2021

Why does it have to sort the headers?

I had to hardcode the host in the tests, in which the port was random, which was making impossible to had a signature in the tests data.

I think that's fine. Another way would be to replace the port number part only with a fixed number for the test run.

@outscale-mgo outscale-mgo changed the title Aws sigv4 headers Aws sigv4 headers refacto Nov 8, 2021
@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch 2 times, most recently from d5cbac7 to a12cd0f Compare November 8, 2021 18:35
@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch 7 times, most recently from 35950a2 to 510b082 Compare November 15, 2021 18:32
@outscale-mgo
Copy link
Contributor Author

Why does it have to sort the headers?

the "header" have to be sort in the canonical header and the signed header.
but if I'm correct, I don't change the existing headers order in data->set.headers

@outscale-mgo
Copy link
Contributor Author

note that I still need to converts sequential spaces to a single space, in order to be conform to the algorithm.

@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch from 510b082 to 72d8a3c Compare December 17, 2021 15:53
@outscale-mgo
Copy link
Contributor Author

should be done now

for(l = head; l; l = l->next) {
char *tmp = l->data, *tmp2;

Curl_strntolower(l->data, l->data, strlen(l->data));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to bother.

Here you lowercase the whole header while the specs require you to lowercase only the header name:

https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, now the headers are lower-cased properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback :)

*tmp++ = ' '; /* if tab convert to space */
if (*tmp == ' ' || *tmp == '\t')
memmove(tmp, tmp2, strlen(tmp2));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong here, but the spec says:

Trim() Remove any leading or trailing white-space. 

and this code seems to do extra work on intermediate spaces. Do you know any valid examples with space-containing header values?

Please note, that trailing \r\n (from data->state.aptr.host) also could be trimmed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, your right, I'm presently working on fixing this, thanks for the feedback.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Host: bug seems to be fixed now.

@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch from 72d8a3c to b595a41 Compare January 6, 2022 09:57
@matwey
Copy link

matwey commented Jan 8, 2022

@outscale-mgo sorry, it seems that this PR should be rebased onto current master, now it is conflicting.

@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch from b595a41 to 3ebc602 Compare January 8, 2022 12:17
}

for(l = head; l; l = l->next) {
char *tmp = *canonical_headers;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note, that I'm presently refactoring this code to use dynbuf instead of curl_maprintf for this chunk.

head = NULL;
}
#ifdef DEBUGBUILD
else if(force_host) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for information. I am not insisting on anything.

Take a look how I overcame the random port issue using CONNECT_TO feature: fc29695
I think this could have advantage that the real code path is tested in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've use your code, to replace my tests. (also keep you as the author of the test commit)

@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch from 3ebc602 to afdfd05 Compare January 13, 2022 14:01
tests/data/test1937 Show resolved Hide resolved
store = value;

/* skip leading whitespace */
while(*value && ISSPACE(*value))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest using ISBLANK() instead? Or do you actually think this needs to also skip control codes and more?


while(*value) {
int space = 0;
while(*value && ISSPACE(*value)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too

if(data->state.aptr.host) {
size_t pos;

strncpy(full_host, data->state.aptr.host, FULL_HOST_LEN);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is arbitrarily cutting off the header, which then will make it fail. Shouldn't it rather return an error if the existing header is too long?

}
else {
msnprintf(full_host, FULL_HOST_LEN, "host:%s", hostname);
full_host[FULL_HOST_LEN - 1] = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

msnprintf() null-terminates the string, no need to do it again.

lib/http_aws_sigv4.c Show resolved Hide resolved
@bagder
Copy link
Member

bagder commented Sep 28, 2022

also: "This branch has conflicts that must be resolved"

matwey and others added 5 commits October 3, 2022 17:14
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Up to this patch, canonical header, and signed header was hardcoded
in the aws-sigv4 signature.

This patch handle canonical headers and signed headers creation
as it is explain here:
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html

The algo tell that signed and canonical must contain at last host and x-amz-date.

So we check whatever thoses are present in the curl http headers list.
If they are, we use the one enter by curl user, otherwise we generate them.
then we to lower, and remove space from each http headers plus host and x-amz-date,
then sort them all by alphabetical order.

This patch also fix a bug with host header, which was ignoring the port.

Because I now handle the port in the signature,
I had to hardcode the host in the tests, in which the port was random,
which was making imposible to had a signature in the tests data.

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch from bde238e to ad0ee7c Compare October 3, 2022 15:19
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Copy link
Member

@bagder bagder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think is is ready for merge. I could only find one nit, and that's in a comment.

/* string been x-PROVIDER-date:TIMESTAMP, I need +1 for ':' */
#define DATE_FULL_HDR_LEN (DATE_HDR_KEY_LEN + TIMESTAMP_SIZE + 1)

/* strlen(timestamp) need to be inferior to TIMESTAMP_SIZE */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose:

/* sizeof(timestamp) needs to be at least TIMESTAMP_SIZE  bytes */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's my english lacking here, but doesn't "need to be at last", mean that strlen(timestamp) must be 17 or more ? which would mean an overflow ?
I've just patch the comment to say:

strlen(timestamp) should be 16, and having it bigger than TIMESTAMP_SIZE could cause an overflow

it's more tedious, but more precise I guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strlen() is the length of a string, not the size of a buffer so that is just plain wrong to describe how large a buffer is. Also, the word "inferior" does not mean shorter or smaller, it means "worse" or "less good", so it makes no sense there.

My version of the comment describes how large the buffer needs to be. It needs to be at least 17 bytes. Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but no more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've change it to that

/*
 * timestamp should point to a buffer of at last TIMESTAMP_SIZE bytes,
 * if it point to a buffer bigger than TIMESTAMP_SIZE bytes
 * code could overflow
 */

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain this. How would it "overflow" if the buffer is larger than that? The code assumes 17 bytes, it does not touch the bytes beyond that does it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, overflow was wrong sorry (I thought I use strcpy, but I use n functions),

msnprintf(date_full_hdr, DATE_FULL_HDR_LEN,
                "x-%s-date:%s", provider1, timestamp);

If timestamp is too big here, the signature will fail.

on the other hand timestamp been set like this:

if(!strftime(timestamp, sizeof(timestamp), "%Y%m%dT%H%M%SZ", &tm)) {

I don't think it can be something else than 17.
But if the code calling make_headers is modify, then timestamp could be too large for the buffer in which it is copy.

Copy link
Member

@bagder bagder Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If timestamp is too big here, the signature will fail.

Yes, but the comment speaks of the buffer size for timestamp, now you are talking about the string put into that buffer. They are not the same.

if(!strftime(timestamp, sizeof(timestamp), "%Y%m%dT%H%M%SZ", &tm)) {

Right, and this code makes sure it doesn't overwrite the buffer. It still works perfectly fine with a larger buffer.

I claim your updated comment is wrong. The buffer needs to be at least TIMESTAMP_SIZE bytes big. If it is bigger, the code still works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I've update the comment, thanks for the review.

@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch 3 times, most recently from d513f31 to d11b05f Compare October 10, 2022 13:31
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
@outscale-mgo outscale-mgo force-pushed the aws-sigv4-headers-rebased-master branch from d11b05f to 96eb43c Compare October 10, 2022 13:49
@bagder bagder closed this in 29c4aa0 Oct 11, 2022
@bagder
Copy link
Member

bagder commented Oct 11, 2022

Thanks. I did squash this into a single commit before merging.

obonaventure pushed a commit to mptcp-apps/curl that referenced this pull request Oct 12, 2022
Handle canonical headers and signed headers creation as explained here:
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html

The algo tells that signed and canonical must contain at last host and
x-amz-date.

So we check whatever thoses are present in the curl http headers list.
If they are, we use the one enter by curl user, otherwise we generate
them.  then we to lower, and remove space from each http headers plus
host and x-amz-date, then sort them all by alphabetical order.

This patch also fix a bug with host header, which was ignoring the port.

Closes curl#7966
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants