Re: How do I read a regex address?

From: Jeremy Nicoll via curl-users <>
Date: Thu, 15 Jul 2021 20:43:35 +0100

On Thu, 15 Jul 2021, at 19:16, ToddAndMargo via curl-users wrote:
> On 7/15/21 10:29 AM, Jeremy Nicoll via curl-users wrote:
> > On Thu, 15 Jul 2021, at 17:47, ToddAndMargo via curl-users wrote:
> >
> >> Fedora 34
> >> Xfce 4.14
> >> curl-7.76.1-4.fc34.x86_64
> >>
> >> The below address works in a browser, but not curl.
> >
> > Does it really? Are you saying you type (or c&p)
> >
> >[2-9]\d{4}\
> Oh yes: tested with both Firefox and Vivaldi. See
> the second picture link below

Ah. OK.

Well, I see you asked curl for verbose output. Were there any clues in that?

I tried doing this in Firefox. I went to the site's home page then opened FF's
developer tools, then clicked the "Network" tab in tools, then clicked the
"Dev channel" button.

The FF network tools page shows the url (the way you'd think of it). Right-click
that, choose copy as... and pick an option. For curl (windows) it gave me
the equivalent curl command for a Windows system:

curl "[2-9]\\d{4}\\." --globoff -H "User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" -H "Accept-Language: en-GB,en;q=0.5" --compressed -H "Alt-Used:" -H "Connection: keep-alive" -H "Referer:" -H "Upgrade-Insecure-Requests: 1" -H "Sec-Fetch-Dest: document" -H "Sec-Fetch-Mode: navigate" -H "Sec-Fetch-Site: same-origin" -H "Sec-Fetch-User: ?1" -H "TE: trailers"

I don't know how many of the -H operands are needed, but look at the
first part:

   curl "[2-9]\\d{4}\\."

The backslashes have been escaped.

That's to say, without escaping they probably wouldn't make it out of a
Windows command/terminal window without being changed from what
you'd see in a browser url line.

The curl (posix) version is

curl '[2-9]\d{4}\.' --globoff -H 'User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Accept-Language: en-GB,en;q=0.5' --compressed -H 'Alt-Used:' -H 'Connection: keep-alive' -H 'Referer:' -H 'Upgrade-Insecure-Requests: 1' -H 'Sec-Fetch-Dest: document' -H 'Sec-Fetch-Mode: navigate' -H 'Sec-Fetch-Site: same-origin' -H 'Sec-Fetch-User: ?1' -H 'TE: trailers'

which doesn't show escaping of backslashes. So maybe it's not that.
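
As a rough sketch of why the posix form needs no doubling: inside single quotes a POSIX shell hands every character, backslashes included, to the program untouched, whereas an unquoted backslash is consumed by the shell before curl ever sees it. The variable names here are just for illustration, and printf stands in for curl so nothing touches the network.

```shell
# Single-quoted: the shell passes the text through byte for byte.
quoted=$(printf '%s' '[2-9]\d{4}\.')

# Unquoted: the shell strips each backslash during quote removal,
# so the program only receives "d." here.
unquoted=$(printf '%s' \d\.)

printf 'quoted:   %s\n' "$quoted"
printf 'unquoted: %s\n' "$unquoted"
```

So on Linux the single-quoted form should be enough to get the backslashes as far as curl itself.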

That "--globoff" might be relevant. That seems to tell curl not to do
globbing (i.e. not expand the regex stuff itself). In this site's case
that regex has to make its way LITERALLY to the php code on the

So I'd suggest you try, at least

curl '[2-9]\d{4}\.' --globoff

You might need a (fake) user-agent header too.
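
Putting those two suggestions together, something like the sketch below might be a starting point. The address is a made-up placeholder (the real one isn't repeated here), fetch_literal is just a name I picked, and the user-agent string is the one Firefox sent in the copied command above. -A is curl's short option for setting the User-Agent header. I haven't checked that the site accepts this.

```shell
# Placeholder address, NOT the real site: substitute the full url.
url='https://example.com/path?pattern=[2-9]\d{4}\.'

# Browser-like user-agent, taken from the Firefox "copy as" output.
ua='Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0'

# Wrap the request in a function so nothing runs until it is called.
fetch_literal() {
    # --globoff stops curl treating [2-9] and {4} as url globbing ranges/sets.
    curl --globoff -A "$ua" "$url"
}
```

Calling fetch_literal runs the request; adding -v to the curl line shows exactly what gets sent.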

Jeremy Nicoll - my opinions are my own.
Received on 2021-07-15