
Re: How do I read a regex address?

From: ToddAndMargo via curl-users <curl-users_at_cool.haxx.se>
Date: Thu, 15 Jul 2021 13:13:35 -0700

On 7/15/21 12:43 PM, Jeremy Nicoll via curl-users wrote:
> On Thu, 15 Jul 2021, at 19:16, ToddAndMargo via curl-users wrote:
>> On 7/15/21 10:29 AM, Jeremy Nicoll via curl-users wrote:
>>> On Thu, 15 Jul 2021, at 17:47, ToddAndMargo via curl-users wrote:
>>>
>>>> Fedora 34
>>>> Xfce 4.14
>>>> curl-7.76.1-4.fc34.x86_64
>>>>
>>>> The below address works in a browser, but not curl.
>>>
>>> Does it really? Are you saying you type (or c&p)
>>>
>>> uupdump.net/known.php?q=regex:[2-9]\d{4}\
>>
>> Oh yes: tested with both Firefox and Vivaldi. See
>> the second picture link below
>
> Ah. OK.
>
> Well, I see you asked curl for verbose output. Were there any clues in that?
>
> I tried doing this in Firefox. I went to the site's home page then opened FF's
> developer tools, then clicked the "Network" tab in tools, then clicked the
> "Dev channel" button.
>
> The FF network tools page shows the url (the way you'd think of it). Right click
> that and choose copy as... and pick an option. For curl (windows) it gave me
> what would be the equivalent command in curl on a windows system:
>
> curl "https://uupdump.net/known.php?q=regex:[2-9]\\d{4}\\." --globoff -H "User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" -H "Accept-Language: en-GB,en;q=0.5" --compressed -H "Alt-Used: uupdump.net" -H "Connection: keep-alive" -H "Referer: https://uupdump.net/" -H "Upgrade-Insecure-Requests: 1" -H "Sec-Fetch-Dest: document" -H "Sec-Fetch-Mode: navigate" -H "Sec-Fetch-Site: same-origin" -H "Sec-Fetch-User: ?1" -H "TE: trailers"
>
> I don't know how many of the -H operands are needed, but look at the
> first part:
>
> curl "https://uupdump.net/known.php?q=regex:[2-9]\\d{4}\\."
>
> The backslashes have been escaped.
>
> That's to say, without escaping they probably wouldn't make it out of a
> Windows command/terminal window without being changed from what
> you'd see in a browser url line.
>
> The curl (posix) version is
>
> curl 'https://uupdump.net/known.php?q=regex:[2-9]\d{4}\.' --globoff -H 'User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Accept-Language: en-GB,en;q=0.5' --compressed -H 'Alt-Used: uupdump.net' -H 'Connection: keep-alive' -H 'Referer: https://uupdump.net/' -H 'Upgrade-Insecure-Requests: 1' -H 'Sec-Fetch-Dest: document' -H 'Sec-Fetch-Mode: navigate' -H 'Sec-Fetch-Site: same-origin' -H 'Sec-Fetch-User: ?1' -H 'TE: trailers'
>
> which doesn't show escaping of backslashes. So maybe it's not that.
>
> That "--globoff" might be relevant. That seems to tell curl not to do
> globbing (i.e. not expand the regex stuff itself). In this site's case
> that regex has to make its way LITERALLY to the php code on the
> server.
>
> So I'd suggest you try, at least
>
> curl 'https://uupdump.net/known.php?q=regex:[2-9]\d{4}\.' --globoff
>
> You might need a (fake) user-agent header too.
>


Hi Jeremy,

Globoff did the trick! Thank you!

man curl:

-g, --globoff
     This option switches off the "URL globbing parser". When
     you set this option, you can specify URLs that contain
     the letters {}[] without having them being interpreted
     by curl itself. Note that these letters are not normal
     legal URL contents but they should be encoded according
     to the URI standard.

:-)
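
The last sentence of that man page entry points at an alternative to --globoff: percent-encode the special characters so the globbing parser never sees them. Below is a minimal sketch of that idea, not something I have verified against uupdump.net. The helper name encode_glob_chars is made up for illustration and is not part of curl; it only handles the characters that matter for this URL ([, ], {, } and the backslash).

```shell
#!/bin/sh
# Hypothetical helper (not part of curl): percent-encode the characters
# that curl's URL globbing parser treats specially, plus the backslash.
encode_glob_chars() {
    printf '%s' "$1" | sed -e 's/\\/%5C/g' \
        -e 's/\[/%5B/g' -e 's/]/%5D/g' \
        -e 's/{/%7B/g' -e 's/}/%7D/g'
}

# Single quotes keep the shell from touching the backslashes.
query='regex:[2-9]\d{4}\.'
url="https://uupdump.net/known.php?q=$(encode_glob_chars "$query")"

# Print the encoded URL; fetch it afterwards with:  curl "$url"
printf '%s\n' "$url"
```

This prints https://uupdump.net/known.php?q=regex:%5B2-9%5D%5Cd%7B4%7D%5C. and, assuming the server decodes the query string the usual way, it should receive the same regex that the --globoff command sends.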

-T
-----------------------------------------------------------
Received on 2021-07-15