Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url: Fix parsing for when 'file' is the default protocol #1124

Closed
wants to merge 3 commits into from

Conversation

jay
Copy link
Member

@jay jay commented Nov 13, 2016

Follow-up to 3463408.

Prior to this set of changes hostnames were silently stripped and the
file scheme did not work properly as the default protocol.

Ref: https://curl.haxx.se/mail/lib-2016-11/0081.html


Also I thought the error message on file://foo/bar was confusing so I changed it

Before: curl: (3) Valid host name with slash missing in URL
After:  curl: (3) Invalid file://hostname/, expected localhost or 127.0.0.1 or none

@mention-bot
Copy link

@jay, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @dfandrich and @captain-caveman2k to be potential reviewers.

protobuf, path)) &&
strcasecompare(protobuf, "file")) {
for(i = 0; i < 16 && data->change.url[i] &&
data->change.url[i] != '\n'; ++i) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function already verifies that there is no CR or LF in the URL, immediately before this code snippet so this newline check is superfluous.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, I suppose the check for 16 bytes scheme could instead be modified to the longest supported scheme. Which I think is a draw between TELNET and GOPHER, at merely 6 bytes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll work on a second draft. The reason I did the things you commented on the way I did is because I was following what is already in the function as a model.

data->change.url[i] != '\n'; ++i) {
if(data->change.url[i] == ':') {
if(!i) {
failf(data, "Bad URL");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even spell out that there was no colon/valid scheme in the URL?

@@ -4073,7 +4098,8 @@ static CURLcode parseurlandfillconn(struct Curl_easy *data,
char *ptr;
if(!checkprefix("localhost/", path) &&
!checkprefix("127.0.0.1/", path)) {
failf(data, "Valid host name with slash missing in URL");
failf(data, "Invalid file://hostname/, "
"expected localhost or 127.0.0.1 or none");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're back to what we discussed on the mailing list. I tried to make the message also convey that there needs to be a valid hostname and slash in the URL as this message will be shown for file://locahost as well, which does have a valid host name but no slash. But sure, error messages are hard and if you think this is better then I don't mind.

@jay
Copy link
Member Author

jay commented Dec 6, 2016

@bagder I made the changes you requested, however I did not change the magic numbers of 15/16 since I wanted to be consistent with the size of protobuf sscanf used elsewhere in the function.

Follow-up to 3463408.

Prior to 3463408 file:// hostnames were silently stripped.

Prior to this commit it did not work when a schemeless url was used with
file as the default protocol.

Ref: https://curl.haxx.se/mail/lib-2016-11/0081.html
Ref: curl#1124
Squash this into draft1

- Support --proto-default file c:/foo/bar.txt

- Support file://c:/foo/bar.txt

Assisted-by: Anatol Belski
@jay
Copy link
Member Author

jay commented Jan 9, 2017

@bagder can you check this again, I did a second draft to cover --proto-default file c:/foo/bar.txt and file://c:/foo/bar.txt. Note I didn't use isalpha for the drive letters since I wasn't sure it would be C locale. I could put the ('a' <= path[0] && path[0] <= 'z') || ('A' <= path[0] && path[0] <= 'Z') into a macro if preferred.

@weltling
Copy link
Contributor

weltling commented Jan 9, 2017

@jay the current version of your patch fully restores the backward compatibility. Please let me know, if i have to test any follow up changes later on.

Thanks.

Copy link
Member

@bagder bagder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks fine to me. I only have one little concern: this has no #ifdefs for windows or windows-like systems so what happens when you try this URL on a *nix system? Shouldn't it fail the URL parsing then?

@jay
Copy link
Member Author

jay commented Jan 11, 2017

I only have one little concern: this has no #ifdefs for windows or windows-like systems so what happens when you try this URL on a *nix system? Shouldn't it fail the URL parsing then?

I thought it would be cleaner, and it could fail on fopen if the OS doesn't support drive letters. If you want I can wrap those areas in #if defined(WIN32) || defined(MSDOS)

@bagder
Copy link
Member

bagder commented Jan 11, 2017

I thought it would be cleaner, and it could fail on fopen if the OS doesn't support drive letters.

You're right. The biggest difference is then what error code it would return back, if we avoid the unlikely event that there actually can be a file named like that on a non-windows machine.

IMO: I think it is safer that we return "malformed URL" earlier and pay the price with the additional #ifdef.

@jay
Copy link
Member Author

jay commented Jan 11, 2017

IMO: I think it is safer that we return "malformed URL" earlier and pay the price with the additional #ifdef.

Well how about immediately after the file url parsing fail if drive letter and not msdos/win. This way the flow is the same for all platforms but there's only one guard and we catch it as malformed rather than passing it around. For example see draft3.

Squash me into previous draft

- Fail when a file:// drive letter is detected and not MSDOS/Windows.
@jay jay force-pushed the fix_file_default-protocol branch from 2d18159 to dd99723 Compare January 11, 2017 20:49
@bagder
Copy link
Member

bagder commented Jan 12, 2017

That's perfectly fine with me and makes the error message even clearer. 👍

@jay jay closed this in 1d4202a Jan 12, 2017
@jay jay deleted the fix_file_default-protocol branch January 12, 2017 20:41
@lock lock bot locked as resolved and limited conversation to collaborators Jan 19, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants