Re: URL query syntax
From: Timothe Litt <litt_at_acm.org>
Date: Mon, 2 Oct 2023 06:28:21 -0400
On 02-Oct-23 06:02, Bachir Bendrissou via curl-users wrote:
> Hi,
>
> Are there any query strings that are invalid and should be rejected by
> the Curl parser?
>
> Curl seems to accept all sorts of strings in the query segment. For
> example:
>
> "https://example.com/?a=ba=a"
>
> The URL is accepted with no errors reported, despite missing a delimiter.
>
> Is this correct?
>
> Thank you,
> Bachir
>
See RFC 1738 and RFC 3986 (& their updates).
https://www.rfc-editor.org/rfc/rfc1738
https://www.rfc-editor.org/rfc/rfc3986.html
> 3.4 <https://www.rfc-editor.org/rfc/rfc3986.html#section-3.4>. Query
>
> The query component contains non-hierarchical data that, along with
> data in the path component (Section 3.3 <https://www.rfc-editor.org/rfc/rfc3986.html#section-3.3>), serves to identify a
> resource within the scope of the URI's scheme and naming authority
> (if any). The query component is indicated by the first question
> mark ("?") character and terminated by a number sign ("#") character
> or by the end of the URI.
>
> query = *( pchar / "/" / "?" )
>
> The characters slash ("/") and question mark ("?") may represent data
> within the query component. Beware that some older, erroneous
> implementations may not handle such data correctly when it is used as
> the base URI for relative references (Section 5.1 <https://www.rfc-editor.org/rfc/rfc3986.html#section-5.1>), apparently
> because they fail to distinguish query data from path data when
> looking for hierarchical separators. However, as query components
> are often used to carry identifying information in the form of
> "key=value" pairs and one frequently used value is a reference to
> another URI, it is sometimes better for usability to avoid percent-
> encoding those characters.
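
Applied to the example in the question: "=" is one of the sub-delims that pchar allows, so "a=ba=a" matches the query rule as written. The generic syntax does not define key=value pairs or require a delimiter between them. Below is a minimal Python sketch of that grammar check. The names (is_valid_query, QUERY_RE) are made up for illustration, and it tests only the RFC 3986 ABNF; it says nothing about what curl's own parser accepts or rejects.

```python
import re

# RFC 3986 building blocks (section 2.1-2.3, 3.3):
#   pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# query = *( pchar / "/" / "?" )   (section 3.4)
QUERY_RE = re.compile(
    rf"(?:[{UNRESERVED}{SUB_DELIMS}:@/?]|{PCT_ENCODED})*"
)


def is_valid_query(query: str) -> bool:
    """Return True if `query` (the text after '?', before any '#')
    matches the RFC 3986 query rule. Hypothetical helper, not a curl API."""
    return QUERY_RE.fullmatch(query) is not None


if __name__ == "__main__":
    for q in ["a=ba=a", "x=/path?y", "a b", "a%zz"]:
        print(repr(q), is_valid_query(q))
```

Under this grammar, only characters outside the allowed set (a raw space, for instance) or a malformed percent-escape make a query invalid.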
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views,
if any, on the matters discussed.
Received on 2023-10-02