Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broader IPFS and IPNS URL support like http/s #12152

Closed
wants to merge 1 commit into from

Conversation

markg85
Copy link
Contributor

@markg85 markg85 commented Oct 18, 2023

Commit msg:

Previously just ipfs://<cid> and ipns://<cid> was supported. Which is too strict for some usecases.

This patch allows paths and query arguments to be used too. Making this work according to normal http semantics: ipfs://<cid>/foo/bar?key=val
ipns://<cid>/foo/bar?key=val

The gateway url support is changed.
It now only supports gateways in the form of:
http://<gateway>/foo/bar
http://<gateway>

Query arguments here are explicitly not allowed and trigger an intended malformed url error.

There also was a crash when IPFS_PATH was set with a non trailing forward slash. This has been fixed.

Lastly, a load of test cases have been added to verify the above.

This fixes #12148 and a couple more issues that had been found.

I've added a lot of testcases to verify the fixes.
Still, the testing isn't super elaborate (say everything for ipfs, then another time for ipns and then everything 3x for the 3 different ways to set a gateway). The reasoning is that much code is following the same path so an issue popping up for ipns will be there for ipfs too. As long as that stays as-is then those current tests cover a lot :)

cc @lidel @Stebalien

Copy link

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basic feedback but I'm not familiar with curl's coding guidelines.

src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
@markg85
Copy link
Contributor Author

markg85 commented Oct 19, 2023

@bagder

[100%] Linking C executable curl
/usr/bin/ld: CMakeFiles/curl.dir/tool_operate.c.o: in function `ensure_trailing':
tool_operate.c:(.text+0x1384): undefined reference to `Curl_saferealloc'
collect2: error: ld returned 1 exit status
gmake[2]: *** [src/CMakeFiles/curl.dir/build.make:873: src/curl] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:526: src/CMakeFiles/curl.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2
Error: Process completed with exit code 2.

Any idea why many compilers can't seem to find Curl_saferealloc? I did add the header for it strdup.h. Is this function exclusive to libcurl and can't be used in the tool?

For reference, on my local pc (gcc (GCC) 13.2.1 20230801) it works just fine.

Edit
Nevermind, it is part of libcurl.
I'll just use the normal realloc instead.

@bagder
Copy link
Member

bagder commented Oct 19, 2023

I'll just use the normal realloc instead.

I recommend using a dynbuf instead. Because it is less easy to mess up.

src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
tests/data/DISABLED Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
@markg85
Copy link
Contributor Author

markg85 commented Oct 20, 2023

Hi @bagder,

Just flattened it all to one commit.
It's probably ready to get merged now.

If it's ok with you, i'd prefer to wait merging till a second functional-review has been done by @lidel.
Is merging on Monday still fine to make it in the 8.5.0 release?

@bagder
Copy link
Member

bagder commented Oct 21, 2023

Is merging on Monday still fine to make it in the 8.5.0 release?

This is a bugfix, we can merge such up until December 6...

src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
src/tool_operate.c Outdated Show resolved Hide resolved
Previously just ipfs://<cid> and ipns://<cid> was supported.
Which is too strict for some usecases.

This patch allows paths and query arguments to be used too.
Making this work according to normal http semantics:
ipfs://<cid>/foo/bar?key=val
ipns://<cid>/foo/bar?key=val

The gateway url support is changed.
It now only supports gateways in the form of:
http://<gateway>/foo/bar
http://<gateway>

Query arguments here are explicitly not allowed and trigger
an intended malformed url error.

There also was a crash when IPFS_PATH was set with a non
trailing forward slash. This has been fixed.

Lastly, a load of test cases have been added to verify the above.
@markg85
Copy link
Contributor Author

markg85 commented Nov 3, 2023

Hi @bagder,

I just updated the patch with a little off-site review feedback. Only thing changed is the CID in the testcases (from Qm..., base58) to be of the "v1" (bafyt... base32) format. It's only to discourage the Qm... format and not keep copying it over in new testcases.

Feel free to merge it!

@mkauf
Copy link
Contributor

mkauf commented Nov 7, 2023

There are failed tests for this change in the autobuilds. E. g. test 724: "Malformed URL". https://curl.se/dev/log.cgi?id=20231107030851-1844341#prob1 or https://curl.se/dev/log.cgi?id=20231107033909-1867421#prob1

@bagder
Copy link
Member

bagder commented Nov 7, 2023

I did some further tidying up in 01d9b8b so it may change things a little.

@markg85
Copy link
Contributor Author

markg85 commented Nov 7, 2023

I'm happy some testcases fail, means I've added enough cases to catch edits that might change the behavior :)

@mkauf
Copy link
Contributor

mkauf commented Nov 11, 2023

@markg85 / @bagder

  • There is an endianness bug. Proposed bugfix:
diff --git a/src/tool_ipfs.c b/src/tool_ipfs.c
index 271fff2cd..435d1697c 100644
--- a/src/tool_ipfs.c
+++ b/src/tool_ipfs.c
@@ -104,7 +104,8 @@ static char *ipfs_gateway(void)

     /* get the first line of the gateway file, ignore the rest */
     while((c = getc(gateway_file)) != EOF && c != '\n' && c != '\r') {
-      if(curlx_dyn_addn(&dyn, &c, 1))
+      char c_char = (char)c;
+      if(curlx_dyn_addn(&dyn, &c_char, 1))
         goto fail;
     }
  • Some tests set the environment variable IPFS_DATA. But this variable is not checked at all in the curl tool code. Probably you wanted to set the environment variable IPFS_PATH.

  • Typo in the tests: Change "traling" to "trailing"

  • Wrong link in the documentation: Change "#Automatic-gateway-detection" to "#automatic-gateway-detection" in docs/IPFS.md

@markg85
Copy link
Contributor Author

markg85 commented Nov 11, 2023

Hi @mkauf

Thank you for that review, that's much appreciated!

There is an endianness bug. Proposed bugfix:

I'll fix it with that suggestion.
Meanwhile, could you explain the reasoning why adding an extra temp variable with c casts as char magically fixes endianess? Internally - in curlx_dyn_addn - the input is casted to unsigned char *. I'd assume that cast + the size of 1 (1 byte) would all be enough to catch endian differences, no?

Some tests set the environment variable IPFS_DATA. But this variable is not checked at all in the curl tool code. Probably you wanted to set the environment variable IPFS_PATH.

Very good catch! While IPFS_DATA does exist, the intended use here is IPFS_PATH. Will fix the testcases.
The real thing that is being tested there though (tests 736, 737 and 738) isn't even the environment variable. It's making sure IPFS knows where to look for it's config file. Which by default is in $HOME/.ipfs. In other terms, tests 736 and 738 aren't testing properly and 737 just tests what's the default already.

Typo in the tests: Change "traling" to "trailing"

Fixed.

Wrong link in the documentation: Change "#Automatic-gateway-detection" to "#automatic-gateway-detection" in docs/IPFS.md

Fixed.

PR incoming.
And it's up: #12152

bagder pushed a commit that referenced this pull request Nov 11, 2023
- Fixed endianness bug in gateway file parsing
- Use IPFS_PATH in tests where IPFS_DATA was used
- Fixed typos from traling -> trailing
- Fixed broken link in IPFS.md

Follow-up to 859e88f

Reported-by: Michael Kaufmann
Bug: #12152 (comment)
Closes #12305
@mkauf
Copy link
Contributor

mkauf commented Nov 12, 2023

@markg85

I'd assume that cast + the size of 1 (1 byte) would all be enough to catch endian differences, no?

No, because the pointer still points to an int. It still points to the same memory address as before.

These answers explain it in more detail:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

IPFS/IPNS URLs ignore the path
4 participants