Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curl no longer accepts a space in a url, but "dict" protocol doesn't allow %20 #10298

Closed
dekerser opened this issue Jan 15, 2023 · 12 comments
Closed
Assignees

Comments

@dekerser
Copy link

So new curl versions don't accept a space, but do support the dict protocol:

$ curl --version | cut -d' ' -f 1-2
curl 7.87.0
Release-Date: 2022-12-21
Protocols: dict
Features: alt-svc
 
$ curl 'DICT://dict.org/SHOW DB'
curl: (3) URL using bad/illegal format or missing URL

But the DICT protocol requires the space, and doesn't allow a "%20"

$ curl 'DICT://dict.org/SHOW%20DB'
220 dict.dict.org dictd 1.12.1/rf on Linux 4.19.0-10-amd64 <auth.mime> <163438761.16210.1673796575@dict.dict.org>
250 ok
500 unknown command
221 bye [d/m/c = 0/0/0; 0.000r 0.000u 0.000s]

So this works for older versions:

$ curl --version | cut -d' ' -f 1-2
curl 7.74.0
Release-Date: 2020-12-09
Protocols: dict
Features: alt-svc

$ /bin/curl "dict://dict.org/SHOW DB" 2>/dev/null | head -n 5
220 dict.dict.org dictd 1.12.1/rf on Linux 4.19.0-10-amd64 <auth.mime> <163441100.19811.1673797845@dict.dict.org>
250 ok
110 166 databases present
gcide "The Collaborative International Dictionary of English v.0.48"
wn "WordNet (r) 3.0 (2006)"

Shouldn't there be an exception for the DICT protocol?

@dfandrich
Copy link
Contributor

curl deals in URLs and URLs don't allow spaces, so if they're allowed, spaces must be % encoded. However, RFC2229 doesn't define a URL format for sending the SHOW DB command. If it worked in curl in the past, it was a happy accident.

@dekerser
Copy link
Author

Don't agree here, and I was afraid for this answer:

  • Curl implements the "dict" protocol.
  • So dict is a separate protocol, the RFC says "SHOW DB", so this is what it should be no encoding.
  • You can easily skip the space test if the protocol is "dict"

@dfandrich
Copy link
Contributor

dfandrich commented Jan 16, 2023 via email

@jay jay added the DICT label Jan 16, 2023
@dekerser
Copy link
Author

dekerser commented Jan 17, 2023

What I meant was that the dict protocol is supported by curl.
The "SHOW DB" command is, and many others commands are client side commands according to the RFC.
Indeed command approach seems more like a sendmail command exchange, and is called a "command pipeline".
So by enforcing the "space" as "%20" curl no longer allows a large part of the dict commands. So this means curl no longer supports the full dict protocol. You should mention this in your documentation and show in the --version output that dict protocol is only partial. Or you can make an exception and allow the subset of commands in RFC2229.

Mind you, there is a undocumented feature which allows the commands (which I didn't know before). Which is to replace "<space>" with ":" e.g.:

 $ curl dict://dict.org/SHOW:DB 2>/dev/null | head -n 5
 220 dict.dict.org dictd 1.12.1/rf on Linux 4.19.0-10-amd64 <auth.mime> <163757157.16019.1673978319@dict.dict.org>
 250 ok
 110 166 databases present
 gcide "The Collaborative International Dictionary of English v.0.48"
 wn "WordNet (r) 3.0 (2006)" 

For me this workaround is good enough. But for 'curl' I think you should choose between the above two options.
If you want to ignore this issue (third option), at least the workaround is documented 🙂. If you choose for the third option closing this issue is fine by me.

@dfandrich
Copy link
Contributor

dfandrich commented Jan 17, 2023 via email

@rsbeckerca
Copy link
Contributor

Perhaps it might make sense to have curl have an option to translate %20 to a in the URL. So something along the lines of:

curl --space-url "dict://dict.org/SHOW%20DB"

would result in a URL "dict://dict.org/SHOW DB" internally, assuming the internal URL parse won't have an issue with that.

@dfandrich
Copy link
Contributor

dfandrich commented Jan 17, 2023 via email

@rsbeckerca
Copy link
Contributor

A new option really isn't necessary. If we decide to preserve the old behaviour, then curl can just decode the URL path itself before sending it to the server. It's just that this isn't a valid dict URL so all bets are off in how it's handled.

I guess it's too big a deal to ask all of the Dict server project to fix their URLs to conform to modern URL rules. :(

@bagder
Copy link
Member

bagder commented Jan 28, 2023

The fact that curl doesn't URL decode the path when using it for DICT looks like a bug to me.

@bagder bagder self-assigned this Jan 28, 2023
@dfandrich
Copy link
Contributor

I was thinking about this and, from a curl perspective doing this with curl -X 'show db' dict://dict.org/ is probably the cleanest. -X is there to modify the request that would normally be sent which is basically what's desired here. But, are you saying that people are handing out dict urls in the wild like dict://dict.org/show db?

@rsbeckerca
Copy link
Contributor

I was thinking about this and, from a curl perspective doing this with curl -X 'show db' dict://dict.org/ is probably the cleanest. -X is there to modify the request that would normally be sent which is basically what's desired here. But, are you saying that people are handing out dict urls in the wild like dict://dict.org/show db?

I personally wonder how many software products could handle the embedded space cleanly in a URL. Probably nothing in the Office suite or browsers.

@bagder
Copy link
Member

bagder commented Jan 28, 2023

From my view it is not actually a question specifically about space or %20, but about treating URL encoded paths correctly in general for DICT:

As described above, this command line and URL work: curl 'DICT://dict.org/SHOW:DB'

As per how URLs works, we should then be able to also pass this with parts of the path URL-encoded, like if we encode the colon as %3a: curl 'DICT://dict.org/SHOW%3aDB'. This does not work. Replace the S with %53:

curl 'DICT://dict.org/%53HOW:DB' - does not work either. I say this is a bug.

bagder added a commit that referenced this issue Jan 28, 2023
Reported-by: dekerser on github
Fixes #10298
bagder added a commit that referenced this issue Jan 29, 2023
Reported-by: dekerser on github
Fixes #10298
Closes #10354
@bagder bagder closed this as completed in 0c3d542 Jan 29, 2023
bch pushed a commit to bch/curl that referenced this issue Jul 19, 2023
Reported-by: dekerser on github
Fixes curl#10298
Closes curl#10354
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

5 participants