Wildcard URL option causes infinite loop #11775

Closed
lkordos opened this issue Aug 31, 2023 · 3 comments

lkordos commented Aug 31, 2023

I did this

I ran the following code:

#include <curl/curl.h>
#include <string>

// callback_data, file_is_coming, file_is_downloaded and write_it are
// defined elsewhere; they were not included in this report.
void Test()
{
   struct callback_data data { 0 };

   curl_global_init(CURL_GLOBAL_ALL);
   CURL* handle = curl_easy_init();
   if (!handle) {
      curl_global_cleanup();
      return;
   }

   // Enable FTP wildcard matching and register the per-file callbacks.
   curl_easy_setopt(handle, CURLOPT_WILDCARDMATCH, 1L);
   curl_easy_setopt(handle, CURLOPT_CHUNK_BGN_FUNCTION, file_is_coming);
   curl_easy_setopt(handle, CURLOPT_CHUNK_END_FUNCTION, file_is_downloaded);
   curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, write_it);
   curl_easy_setopt(handle, CURLOPT_CHUNK_DATA, &data);
   curl_easy_setopt(handle, CURLOPT_WRITEDATA, &data);

   // Tried the following URLs:
   std::string const srcFullPath = "ftp://ftp.aa.com/dir/*.parquet";
   //                            = "ftp://ftp.aa.com/dir/*";
   curl_easy_setopt(handle, CURLOPT_URL, srcFullPath.c_str());

   // With the wildcard URL this call never returns: see the description below.
   CURLcode rc = curl_easy_perform(handle);
   (void)rc;

   curl_easy_cleanup(handle);
   curl_global_cleanup();
}

I expected the following

I expected the parquet data files to be downloaded once. The files were downloaded, but after the last file in the remote directory had been downloaded, the transfer did not stop. Instead, it started again with the first file in the directory.
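
Until a fixed libcurl is available, an application could detect the restart described above by remembering which file names the chunk-begin callback has already announced. The following is a minimal sketch of that bookkeeping only, not curl code: `SeenFiles`, `ChunkDecision` and `on_file_announced` are hypothetical names introduced here, and the sketch assumes the callback receives each remote file name (in libcurl, through the `filename` field of `struct curl_fileinfo`).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper (not part of libcurl): remembers which remote file
// names have already been announced during one wildcard transfer.
class SeenFiles {
public:
    // Returns true the first time a name is seen, false on any repeat.
    bool first_time(const std::string& name) {
        return seen_.insert(name).second;
    }
    std::size_t count() const { return seen_.size(); }

private:
    std::set<std::string> seen_;
};

// Decision for a chunk-begin callback. In a real callback, Continue would
// map to CURL_CHUNK_BGN_FUNC_OK and Abort to CURL_CHUNK_BGN_FUNC_FAIL.
enum class ChunkDecision { Continue, Abort };

// A repeated name means the wildcard listing has started over, so abort.
ChunkDecision on_file_announced(SeenFiles& seen, const std::string& name) {
    return seen.first_time(name) ? ChunkDecision::Continue
                                 : ChunkDecision::Abort;
}

// Usage example: the third announcement repeats the first one.
void self_check() {
    SeenFiles seen;
    assert(on_file_announced(seen, "a.parquet") == ChunkDecision::Continue);
    assert(on_file_announced(seen, "b.parquet") == ChunkDecision::Continue);
    assert(on_file_announced(seen, "a.parquet") == ChunkDecision::Abort);
    assert(seen.count() == 2);
}
```

In a real program the `SeenFiles` object would live in the structure passed through CURLOPT_CHUNK_DATA. Aborting this way ends the transfer with an error rather than a clean success, so it is only a stopgap.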

When I changed curl_easy_setopt(handle, CURLOPT_WILDCARDMATCH, 1L); to curl_easy_setopt(handle, CURLOPT_WILDCARDMATCH, 0L); and replaced the wildcard in the URL with the full name of a single file, that file was downloaded as expected.

curl/libcurl version

curl 8.2.1

operating system

Windows 11 Pro

Using MS VC++ 2022, std::c++20

bagder added the FTP label on Sep 1, 2023

lkordos commented Sep 6, 2023

I found an older issue that may be related to this one:

Infinite loop in curl_fnmatch #2015

dfandrich (Contributor) commented

I can reproduce this on git HEAD. It continuously downloads all matching files; when it reaches the end of the wildcard matches, it starts over again. I've turned the code above into a self-contained reproduction example that shows it. In version 7.74.0 it runs through the files once and then stops; in 8.4.1-dev it loops forever (or at least a few times, until I got bored and stopped it).

11775-repro.c.txt

dfandrich (Contributor) commented

I bisected this to commit 843b3ba (version 8.1.0). I won't have time to look into it further in the near future.

bagder self-assigned this on Oct 19, 2023
bagder added a commit that referenced this issue on Oct 19, 2023:
To avoid the state machine to start over and redownload all the files
*again*.

Reported-by: lkordos on github
Regression from 843b3ba (shipped in 8.1.0)
Bisect-by: Dan Fandrich
Fixes #11775
bagder closed this as completed in df9aea2 on Oct 19, 2023