Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test 1165: improve pattern matching #12903

Closed
wants to merge 1 commit into from

Conversation

Karlson2k
Copy link
Contributor

  • Fix possible part of the symbol match.
  • Fix excluded digits at the end of the symbols (CURL_DISABLE_POP3 was checked as CURL_DISABLE_POP)

Discovered while working on PR #12897

@Karlson2k
Copy link
Contributor Author

Two failed checks seems to be unrelated.

@@ -44,7 +44,7 @@ sub scanconf {
my ($f)=@_;
open S, "<$f";
while(<S>) {
if(/(CURL_DISABLE_[A-Z_]+)/g) {
if(/(?:^|[^A-Za-z0-9\-])(CURL_DISABLE_[A-Z0-9_]+)(?:$|[^A-Za-z0-9\-])/g) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cannot be correct, since it puts the wrong part into $1.

Why not just extend the pattern? Surely it doesn't do any false positive matches elsewhere?

 if(/(CURL_DISABLE_[A-Z0-9_]+)/g) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. :)
It is non-capturing group because of ?:
(?:^|[^A-Za-z0-9\-]) is matched but not captured.
The same for (?:$|[^A-Za-z0-9\-]).

It works correctly, I tested it with modified script version with printing all matched results.
Right now there are no false positives here even without the start and end patterns. But this is the test, it must catch incorrect code changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks, that goes beyond what I knew. I still think it's overly complicated for no good reason.

@@ -67,9 +67,9 @@ sub scanconf_cmake {
my ($f)=@_;
open S, "<$f";
while(<S>) {
if(/(CURL_DISABLE_[A-Z_]+)/g) {
if(/(?:^|[^A-Za-z0-9\-])(CURL_DISABLE_[A-Z0-9_]+)(?:$|[^A-Za-z0-9\-])/g) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works as well.
I don't create PRs without proper local testing.
I added debug printing of the results and results are fine.

@@ -85,7 +85,7 @@ sub scan_file {
my ($source)=@_;
open F, "<$source";
while(<F>) {
while(s/(CURL_DISABLE_[A-Z_]+)//) {
while(s/(?:^|[^A-Za-z0-9\-])(CURL_DISABLE_[A-Z0-9_]+)(?:$|[^A-Za-z0-9\-])//) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is correct too.

@Karlson2k Karlson2k changed the title test 1165: improv patter matching test 1165: improve patter matching Feb 8, 2024
@Karlson2k
Copy link
Contributor Author

Karlson2k commented Feb 9, 2024

@bagder The code works correctly.
You don't have questions about documentation pattern matching. It must get all values from documentation. As tests are passed everywhere, it means that matching symbols were found in autoconf, cmake and sources. Otherwise tests would fail.
These non-capturing groups are actually a poor-man replacement for \b (word boundary) which was added in Perl 5.22 As a bonus - it does not depend on used locale.

Let me know, please, if you still think that code should be simplified by removing non-capturing groups.

@Karlson2k
Copy link
Contributor Author

Force-pushed with correct commit text (and rebased)

@bagder
Copy link
Member

bagder commented Feb 9, 2024

Let me know, please, if you still think that code should be simplified by removing non-capturing groups.

I think so. The regexes should be as simple as possible to help readers understand then.

* Fix excluded digits at the end of the symbols ('CURL_DISABLE_POP3'
  was checked as 'CURL_DISABLE_POP')
@Karlson2k
Copy link
Contributor Author

Updated with non-capturing groups removed, as requested.

@Karlson2k
Copy link
Contributor Author

Two build failures seem to be unrelated.

@bagder bagder closed this in b08200d Feb 9, 2024
@bagder
Copy link
Member

bagder commented Feb 9, 2024

Thanks!

@Karlson2k Karlson2k deleted the test1165_fix_01 branch February 9, 2024 22:58
@Karlson2k Karlson2k changed the title test 1165: improve patter matching test 1165: improve pattern matching Feb 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

2 participants