Re: A canonical URL host name dilemma

From: Daniel Stenberg via curl-library
Date: Mon, 11 Oct 2021 08:19:52 +0200 (CEST)

On Sun, 10 Oct 2021, Ray Satiro via curl-library wrote:

> If someone passes what is the disadvantage to storing
> it as and only returning it like that?

I don't think there is any disadvantage for that case. The possible
disadvantage rather comes when you use non-ASCII for weird cases like because that will now return the "raw" string "\"
for the host. (not that I can think of a reason anyone would provide a name
like that)

> Is it really necessary to store the encoded version as well when it's ascii
> only?

That's not what's being done. It was only mentioned as an option but it's not
an option anyone likes.

> I've never heard of anyone doing percent encoded ascii hostnames before.

Percent-encoded ASCII hostnames seem to be very rare in general, which is
probably the reason this hasn't been reported before. After all, curl has
parsed a few URLs over the years and this hasn't been reported until now! [*]

However, the URL syntax says they can be provided like that, so we should
support it. Doing so makes our parser behave more in line with the spec and
more similar to how other parsers (are expected to) behave.
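
To illustrate what "provided like that" means, here is a minimal sketch in
Python (not curl code, and not how libcurl implements it). It uses the
standard urllib.parse module; the helper name decoded_host is made up for
this example. It only shows that a percent-encoded ASCII host and its plain
form can be reduced to the same string. It does no validation of the decoded
result, which a real parser would need to do.

```python
from urllib.parse import unquote, urlsplit


def decoded_host(url: str) -> str:
    """Return the host of `url` with percent-escapes decoded, lowercased.

    Hypothetical helper for illustration only; it does not check that the
    decoded name is a valid host name.
    """
    # urlsplit() lowercases the host but leaves "%xx" sequences untouched.
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError("URL has no host part")
    # Decode the escapes, then lowercase again since "%41" decodes to "A".
    return unquote(host).lower()


if __name__ == "__main__":
    for url in ("http://example.com/", "http://%65xample.com/"):
        print(url, "->", decoded_host(url))
```

Both URLs in the loop name the same host once the escape is decoded, which
is the case the parser should handle.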

[*] = this issue has four "reported-by" names in the bug report simply because
those are the four authors of a paper, currently in the works and which I have
reviewed, on URL parsers, their differences and the problems associated with
those differences.

Received on 2021-10-11