
Add more code generating flags like --libcurl: --python --javascript, etc.

From: Boris Verkhovskiy via curl-users <curl-users_at_lists.haxx.se>
Date: Thu, 23 Sep 2021 22:56:30 -0600

Google Chrome, Firefox and Safari have a Network tab in their Dev
Tools that lets you view the network requests they've made and all 3
let you copy those requests as Curl commands. For example, here's
Chrome's code that generates a Curl command from a request (both Bash
and Windows cmd are supported):

https://github.com/ChromeDevTools/devtools-frontend/blob/04d4b64221e472bcbd5d1de16bef59c2cb9f8d02/front_end/panels/network/NetworkLogView.ts#L2033

When you want to scrape a site, you often can't just do

curl 'http://api.example.com?param=value'

because you're almost always missing a cookie or because many sites
block requests made with a non-browser User-Agent header. Instead of
spending time figuring out what the exact problem is by
trial-and-error and writing code that requests a cookie, you fire up
your browser, navigate to the page that displays the data you're
trying to scrape, find the request you're interested in in the Network
tab of the Dev Tools, right click it and click "copy as cURL". At this
point you almost always have a curl command that actually returns data
instead of an error page, at least until the cookie expires (often the
user-agent sniffing is the issue and then the request works for a long
time).

The next problem is that you don't want to write the rest of your code
that parses and manipulates the returned data in Bash. You *could*
stick the curl command in a string and then use whatever way the
programming language you want to use has of executing shell commands,
but that comes with drawbacks. One is error handling: you'd rather
get the return code and catch specific HTTP exceptions using your
language's syntax for doing that than parse curl's stdout/stderr.
Another is that if you want to modify a value, say change '{"page":
1}' to '{"page": 2}' in the JSON you're POSTing to the API, you have to
do string manipulation (and then worry about escaping quotes in a
bash command etc.) instead of using dictionary access or whatever.
So people have to rewrite the curl command in their
programming language. That's tedious and not real programming, so
there are websites that convert Curl commands into programs in other
programming languages for you. There's

curlconverter: https://curl.trillworks.com/

which is quite popular (5k github stars and personally I've probably
used it as much as I've used curl itself) and can output Python,
JavaScript, Go, Rust, PHP, Java, R, Elixir, Dart or MATLAB, with
varying degrees of quality and support of Curl's options. There's also

curl-to-Go: https://mholt.github.io/curl-to-go/

which only outputs Go but does a more thorough job of supporting
various Curl arguments and parses Bash a bit better.
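
To make the string-manipulation drawback above concrete, here is a
minimal Python sketch. The URL, the header and the "query" field are
made up for illustration. It keeps the JSON body as a dictionary,
changes the page number with ordinary dictionary access, and only at
the end builds the curl arguments, letting the standard library's
`shlex` module handle the shell quoting.

```python
import json
import shlex

# Hypothetical request; the URL and the payload fields are assumptions.
url = "http://api.example.com/search"
payload = {"page": 1, "query": "it's"}

def build_command(payload):
    """Return the curl argv list and an equivalent shell command string."""
    body = json.dumps(payload)
    argv = ["curl", "-X", "POST",
            "-H", "Content-Type: application/json",
            "--data", body, url]
    # shlex.join quotes every argument, including the apostrophe in "it's".
    return argv, shlex.join(argv)

# Changing a value is plain dictionary access, no quote escaping by hand.
payload["page"] = 2
argv, command = build_command(payload)
print(command)

# Splitting the shell string again gives back exactly the same arguments.
assert shlex.split(command) == argv
```

If you do shell out, passing the `argv` list to `subprocess.run`
avoids the shell (and its quoting rules) entirely. The shell string is
only needed when you want something you can paste into a terminal.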

Curl commands have become a de-facto serialization format for Web
requests. Unlike pcap files, they are human readable and executable.

Over the past couple weeks I've been working on improving both of
these projects' bash parsing and making them parse *all* of curl's
options the way curl parses them. Both are written in JavaScript and
both run in browsers, but curlconverter also has a command line
interface, which currently you use like this

curlconverter --language python 'curl -X POST example.com'

I wanted to change it so that it would be a drop-in replacement for
curl, so you'd paste a curl command, move your cursor to the beginning
of the command line, add "converter" after "curl" and get a program
instead of making the request (because wrapping your curl command in
quotes is tedious and error prone if your command contains quotes
itself).

curlconverter -X POST example.com

But because curlconverter supports multiple languages, we need a way
to specify which one to output. I considered a number of options,
which I laid out in this comment:

https://github.com/NickCarneiro/curlconverter/pull/285#issuecomment-926325378

Ultimately, though, I think the best bet would be to make `curlconverter`
act exactly like `curl` except with a few more options on top. But
then, why don't we just add this code generation feature to curl
itself? Ultimately you'd want curlconverter's parsing to exactly match
curl's, which means you'd have to convert curl's argument parsing code
to JS. For example, right now curlconverter's parsing works in two
steps: it first parses the arguments into a JS object, then converts
that object into a request. But this loses the order of the arguments.
For example, if you have

curl --data one --data-urlencode two --data-binary three --data four

you'll end up with

{
  'data': ['one', 'four'],
  'data-urlencode': ['two'],
  'data-binary': ['three']
}

and you can't put that together in the correct order. Whereas Curl
adds each argument in the order it appears, so if we want to get small
details like that right, we'll have to handle each argument as it
arises in the input arguments instead of doing it in two steps.
There's also the danger that if I add a `--javascript` flag and Curl
ever adds its own `--javascript` flag, the two would conflict. So I
thought, why am I duplicating all this work just to effectively
overlay some command line flags over curl? Can't I just convince
Daniel that this would be useful?
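
The ordering problem with the --data flags described above can be
shown with a small Python sketch. This is not curlconverter's or
curl's actual code, and the function names are made up. It also
ignores what each flag does to its value (URL-encoding, reading from
files) and only looks at order: grouping values per flag loses the
order across flags, while handling each argument as it appears keeps
it, so the pieces can be joined with '&' the way curl joins multiple
data arguments.

```python
# Flags that add to the POST body in this simplified sketch.
DATA_FLAGS = {"--data", "--data-urlencode", "--data-binary"}

def group_by_flag(args):
    """Two-step style: collect values per flag, losing cross-flag order."""
    grouped = {}
    it = iter(args)
    for arg in it:
        if arg in DATA_FLAGS:
            grouped.setdefault(arg, []).append(next(it))
    return grouped

def in_order(args):
    """Single-pass style: keep (flag, value) pairs in command-line order."""
    pairs = []
    it = iter(args)
    for arg in it:
        if arg in DATA_FLAGS:
            pairs.append((arg, next(it)))
    return pairs

args = ["--data", "one", "--data-urlencode", "two",
        "--data-binary", "three", "--data", "four"]

grouped = group_by_flag(args)  # 'four' ends up right after 'one'
pairs = in_order(args)         # original order is preserved
body = "&".join(value for _, value in pairs)

print(grouped)
print(body)  # one&two&three&four
```

From `grouped` alone there is no way to tell that 'four' came after
'three', which is the information the two-step approach throws away.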

It should then be possible to make Curl's conversion code run in the
browser by compiling it to WASM. Although, curlconverter's job is a bit
harder than curl's because we also want to notice when the input
command line contains $BASH_VARIABLES and generate code that gets
those from the env whereas curl doesn't need to worry about that since
it's a bash command already... but we can think about that later.
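
As a rough idea of what handling $BASH_VARIABLES could look like,
here is a Python sketch. It is an assumption about one possible
approach, not how curlconverter does it: the function name is made up,
and it only recognizes $NAME and ${NAME} in a word that Bash would
expand (so not inside single quotes), which is far simpler than real
Bash expansion. It turns such a word into Python source text that
reads the variables from the environment.

```python
import re

# Matches $NAME and ${NAME}; a simplification of Bash's expansion rules.
VAR_RE = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)

def to_python_expr(word):
    """Return Python source for a shell word, reading $VARS from os.environ."""
    parts = []
    pos = 0
    for match in VAR_RE.finditer(word):
        if match.start() > pos:
            # Literal text before the variable becomes a string literal.
            parts.append(repr(word[pos:match.start()]))
        name = match.group(1) or match.group(2)
        parts.append(f"os.environ[{name!r}]")
        pos = match.end()
    if pos < len(word):
        parts.append(repr(word[pos:]))
    return " + ".join(parts) if parts else "''"

print(to_python_expr("Authorization: Bearer $API_TOKEN"))
# 'Authorization: Bearer ' + os.environ['API_TOKEN']
```

The generated expression assumes the output program does `import os`
somewhere above it.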

TL;DR: I don't have any hope of you doing this, so I guess the purpose
of my rant was to make the observation that Curl commands are a
serialization format. Just something to think about. Thank you for
coming to my TED talk.
Received on 2021-09-24