spacecheck.pl: verify tests/data/test* for non-ASCII chars #17329

Closed
wants to merge 5 commits into from

vszakats
Member

@vszakats vszakats commented May 12, 2025

Exclude test data files (4 of them) based on existing feature tags:
codeset-utf8 and Unicode.

Add the new keyword non-ascii to mark remaining exceptions (9 files).

Follow-up to 838dc53 #17247
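
A minimal sketch of the check described above, written in Python for illustration only. The real `spacecheck.pl` is a Perl script and its logic is not shown in this thread; the helper names (`section_items`, `non_ascii_lines`, `check`) and the way the tags are matched are assumptions. Only the tag names `codeset-utf8`, `Unicode` and `non-ascii` come from the description.

```python
import re

# Tags that exempt a test data file from the non-ASCII check.
# The names come from the PR description; the matching logic is an assumption.
EXEMPT_FEATURES = {"codeset-utf8", "Unicode"}
EXEMPT_KEYWORDS = {"non-ascii"}


def section_items(text, tag):
    """Return the non-empty lines inside the first <tag>...</tag> section."""
    m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    if not m:
        return set()
    return {line.strip() for line in m.group(1).splitlines() if line.strip()}


def non_ascii_lines(data):
    """Return 1-based numbers of lines that contain a byte above 0x7F."""
    return [n for n, line in enumerate(data.split(b"\n"), 1)
            if any(b > 0x7F for b in line)]


def check(data):
    """Return offending line numbers, or [] if the file is clean or exempt."""
    text = data.decode("utf-8", errors="replace")
    if section_items(text, "features") & EXEMPT_FEATURES:
        return []
    if section_items(text, "keywords") & EXEMPT_KEYWORDS:
        return []
    return non_ascii_lines(data)
```

Under these assumptions, a file is reported only when it holds non-ASCII bytes and carries none of the exempting tags.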

vszakats added 2 commits May 12, 2025 19:40
Exclude test data files (4 of them) based on existing feature tags:
`codeset-utf8` and `Unicode`.

Add a new feature 'codeset-non-ascii' to mark remaining exceptions
(9 files).

Follow-up to 838dc53 curl#17247
@vszakats vszakats added the tests label May 12, 2025
@github-actions github-actions bot added the CI Continuous Integration label May 12, 2025
@vszakats vszakats changed the title spacecheck.pl: verify tests/data/test* files for non-ASCII chars spacecheck.pl: verify tests/data/test* for non-ASCII chars May 12, 2025
@dfandrich
Contributor

This new feature should be documented in tests/FILEFORMAT.md, preferably with a description, since its name does not make it obvious what it does.

@bagder
Member

bagder commented May 12, 2025

When you make this a feature, you make sure that all curl builds that are built without this feature (all of them) will skip these tests.

I propose you don't abuse the feature tag for this tagging of test files. Perhaps use <keywords> instead?
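
The side effect can be illustrated with a deliberately simplified model of the skip decision. This is a sketch, not the test runner's actual code: `should_run` and the `build` set are hypothetical, and the real runner applies more rules than this. The feature name `codeset-non-ascii` is the one from the commit message above.

```python
def should_run(required_features, build_features):
    """A test runs only if the build provides every feature it requires."""
    return all(f in build_features for f in required_features)


# Hypothetical feature set reported by some curl build.
build = {"SSL", "IPv6"}

# Tagged through <features>: no build provides 'codeset-non-ascii',
# so the test would be skipped everywhere.
print(should_run({"codeset-non-ascii"}, build))  # False

# Tagged through <keywords> instead: keywords are not part of this
# decision, so the test still runs.
print(should_run(set(), build))  # True
```

In this model a keyword is pure metadata, which is why it suits marking files for a source checker without changing which tests execute.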

@vszakats
Member Author

Ah, indeed, I forgot about the side-effect.

Settled on making it a keyword: `non-ascii`.

@testclutch

This comment was marked as outdated.

@bagder
Member

bagder commented May 13, 2025

Instead of skipping the UTF check for these files, we can fix them to use hex encoding instead of UTF-8. I made a test shot of this in #17331
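
As a rough sketch of what such a conversion could look like, the helper below rewrites every non-ASCII byte of a UTF-8 string as `%XX`, leaving the result pure ASCII. The function names are hypothetical. The `%hex[ ... ]hex%` wrapper and the choice to also encode a literal `%` are assumptions about the test file notation that should be verified against tests/FILEFORMAT.md; the changes actually made in #17331 are not shown here.

```python
def hex_encode_non_ascii(text):
    """Replace non-ASCII bytes (and a literal '%') with %XX hex pairs."""
    out = []
    for b in text.encode("utf-8"):
        if b > 0x7F or b == 0x25:  # non-ASCII byte, or '%' itself
            out.append(f"%{b:02x}")
        else:
            out.append(chr(b))
    return "".join(out)


def wrap_for_test_file(text):
    """Wrap the encoded text in the assumed %hex[ ... ]hex% notation."""
    return "%hex[" + hex_encode_non_ascii(text) + "]hex%"


print(hex_encode_non_ascii("Lindmätarv"))  # Lindm%c3%a4tarv
```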

@vszakats vszakats closed this in 9243ed5 May 13, 2025
@vszakats vszakats deleted the uni2 branch May 13, 2025 06:49
vszakats added a commit that referenced this pull request May 13, 2025
- replace ß (scharfes S) with links.
- replace § (section sign) with links.
- replace 🙏 emoji with `:pray:`.
  Supported by GitHub, Forgejo/Gitea and most likely GitLab.
- docs/libcurl/curl_mprintf.md: replace Unicode ± with `{+|-}`.
- docs/CIPHERS.md: URL encode Unicode in URLs.
- lib1560: use hex encoding in `räksmörgås.se`.
- unit1307: use hex encoding in `Lindmätarv`.
- drop LATIN SMALL LETTER A WITH ACUTE exception.
  No longer appears in tests.

This leaves the single character exception: `ö`
And file exceptions holding contributor names.

Follow-up to 9243ed5 #17329
Follow-up to 838dc53 #17247

Closes #17335
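
Two of the replacements listed in this commit message can be sketched with small helpers. This is an illustration under assumptions, not the commit's actual tooling: `c_hex_escape` is a hypothetical name, and it assumes the `lib1560` and `unit1307` strings live in C string literals. URL encoding uses the standard library's `urllib.parse.quote`, and the example URL is made up.

```python
from urllib.parse import quote

HEX_DIGITS = "0123456789abcdefABCDEF"


def c_hex_escape(text):
    """Rewrite non-ASCII bytes as \\xNN escapes for a C string literal."""
    out = []
    prev_escaped = False
    for b in text.encode("utf-8"):
        if b > 0x7F:
            out.append(f"\\x{b:02x}")
            prev_escaped = True
            continue
        ch = chr(b)
        # C hex escapes consume every following hex digit, so split the
        # literal ("..." "...") when one comes right after an escape.
        if prev_escaped and ch in HEX_DIGITS:
            out.append('" "')
        out.append(ch)
        prev_escaped = False
    return "".join(out)


print(c_hex_escape("räksmörgås.se"))
# r\xc3\xa4ksm\xc3\xb6rg\xc3\xa5s.se

# URL encoding for documentation links; ':' and '/' are kept as-is.
print(quote("https://example.com/ö", safe=":/"))
# https://example.com/%C3%B6
```

Splitting the literal after an escape is a defensive choice: without it, a string such as `éa` would compile to a different (or invalid) byte sequence.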
Labels
CI Continuous Integration tests

4 participants