Tests take a long time to run #10818

dfandrich · 2023-03-22T17:40:50Z

Running the nearly 1600 tests in the test suite takes a long time and hurts developer productivity. This issue will consolidate efforts to speed up running tests, primarily by allowing them to run in parallel. The roadmap will follow this design proposal.

kdudka · 2023-03-23T07:52:09Z

I like the idea! Did you consider also running multiple HTTP(S) servers in parallel? It is probably not something to begin with. On the other hand, it would be good to design the solution in a way that such an extension could be introduced in the future. Some users build (lib)curl with a configuration where HTTPS is the only enabled protocol and they would not gain anything if the parallelization used a single server instance for each protocol.

bagder · 2023-03-23T07:56:35Z

I think the individual "runners" should be able to run all test servers, which should make it possible to run all runtests test cases in parallel. HTTP and HTTPS are two of the most tested protocols so it is important that they can be highly parallelized.

dfandrich · 2023-03-23T16:57:19Z

Daniel's explanation is right. My design will see many copies (possibly O(hundreds)) of the business end of runtests.pl running, each one starting whatever servers are necessary to execute perhaps a couple of dozen tests. If each test runner ends up being given one HTTPS test among its dozen, then each runner will start its own copy of an HTTPS server. With some intelligence, I hope to avoid that by passing HTTPS tests to runners that specialize in HTTPS and FTP tests to runners that specialize in FTP, to avoid the inefficiency of starting too many of the same kind of server. When I first looked at this problem last decade, I prototyped a simpler solution that only ran tests of different protocols in parallel, with a single server of each type running globally. But since 55% of tests are for HTTP/HTTPS, that solution could never cut the testing time even in half, so I didn't pursue it.
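
The idea of handing tests to runners that specialize in one protocol can be sketched roughly as follows. This is an illustration in Python rather than the Perl used by runtests.pl, and it is not the actual design: the function name `batch_by_protocol`, the `(test_number, protocol)` input shape and the default batch size of 24 ("a couple of dozen") are all assumptions made for the example.

```python
from collections import defaultdict


def batch_by_protocol(tests, batch_size=24):
    """Split tests into batches that each contain a single protocol.

    `tests` is a list of (test_number, protocol) pairs (hypothetical shape).
    Each returned batch could go to one runner, which then only needs to
    start the servers for that one protocol.
    """
    groups = defaultdict(list)
    for number, protocol in tests:
        groups[protocol].append(number)

    batches = []
    for protocol in sorted(groups):
        numbers = groups[protocol]
        # Cut each protocol's tests into chunks of at most batch_size.
        for start in range(0, len(numbers), batch_size):
            batches.append((protocol, numbers[start:start + batch_size]))
    return batches
```

Because a large protocol group is cut into several batches, the HTTP/HTTPS majority can still be spread over many runners while each runner starts only one kind of server.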

These functions scan through the entire test file every time to find the right section, so they can be slow for large test files. Ref: #10818

Unlike some other languages that just copy a pointer, perl copies the entire array contents which takes time for a large array. Ref: #10818

Namely:
- Verify that this test case should be run
- Start the servers needed to run this test case
- Check that test environment is fine to run this test case
- Prepare the test environment to run this test case
- Run the test command
- Clean up after test command
- Verify test succeeded

Ref: #10818
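
The stages listed above lend themselves to a simple sequential driver. The sketch below is written in Python for illustration, not taken from runtests.pl; `run_test`, `STAGE_NAMES` and the always-succeeding placeholder stage functions are hypothetical names that only show the shape of the split.

```python
def run_test(test_number, stages):
    """Run each (name, function) stage in order.

    Returns None when every stage succeeds, otherwise the name of the
    first stage that returned False. Later stages are skipped.
    """
    for name, stage in stages:
        if not stage(test_number):
            return name
    return None


# Hypothetical stand-ins for the seven stages; each always succeeds here.
STAGE_NAMES = [
    "verify test should run",
    "start servers",
    "check environment",
    "prepare environment",
    "run command",
    "clean up",
    "verify result",
]
stages = [(name, lambda number: True) for name in STAGE_NAMES]
```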

This takes it from a 1200 line behemoth into something more manageable. The content and order of the functions is taken almost directly from singletest() so the diff sans whitespace is quite short. Ref: #10818

Ref: #10818

Use the feature map stored in the hash table instead. Most of the variables were only used once, to set the value in the hash table. Ref: #10818

This simplifies error handling in the test verification code and makes it more consistent. Ref: #10818

The refactored code calls these functions with the same arguments more often, so this prevents redundant test case file parsing. Ref: #10818 Closes #10833
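
As an illustration of the memoization idea, here is a minimal Python sketch using `functools.lru_cache`. It is not the Perl code from the commit: the function `get_part`, its arguments and the simplified `<section>` ... `</section>` layout are assumptions for the example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_part(file_text, section):
    """Return the lines found between <section> and </section>.

    The scan walks the whole text, which is the cost that caching avoids
    when the same (file_text, section) pair is requested again.
    """
    lines = []
    inside = False
    for line in file_text.splitlines():
        stripped = line.strip()
        if stripped == f"<{section}>":
            inside = True
        elif stripped == f"</{section}>":
            break
        elif inside:
            lines.append(line)
    # Return a tuple so callers cannot mutate the cached result.
    return tuple(lines)
```

A repeated call with the same arguments is answered from the cache without scanning again; `get_part.cache_info()` reports the hit and miss counts.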

Some variables are expanded to arrays and hashes so that multiple runners can be used for running tests. Ref: curl#10818

The main test loop is now able to handle multiple runners, or no additional runner processes at all. At most one process is still created, however. Ref: curl#10818

Each runner needs a unique random seed to reduce the chance of port number collisions. The new scheme uses a consistent per-runner source of randomness which results in deterministic behaviour, as it did before. Ref: curl#10818
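
One way to get randomness that is both per-runner and deterministic is to derive each runner's generator from a shared base seed plus the runner ID. The Python sketch below only illustrates that idea; `runner_rng`, `pick_ports` and the port range are hypothetical and do not describe how runtests.pl actually does it.

```python
import random


def runner_rng(base_seed, runner_id):
    """Create a generator whose sequence depends on both seed and runner ID."""
    return random.Random(f"{base_seed}:{runner_id}")


def pick_ports(rng, count, low=20000, high=60000):
    """Draw `count` candidate port numbers from the assumed range."""
    return [rng.randint(low, high) for _ in range(count)]
```

The same base seed and runner ID always yield the same ports, so a run can be reproduced, while different runner IDs draw from different sequences.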

This must be done so variables pick up the runner's unique $LOGDIR. Ref: curl#10818

This allows all messages relating to a single test case to be displayed together at the end of the test. Ref: curl#10818
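
The buffering approach can be pictured with a small Python sketch. The class `BufferedLog` and its methods are hypothetical names for illustration, not the Perl implementation: messages are collected while a test runs and emitted as one block at the end, so output from concurrent tests does not interleave.

```python
class BufferedLog:
    """Collect log messages for one test and release them together."""

    def __init__(self):
        self._lines = []

    def logmsg(self, text):
        # Store the message instead of printing it immediately.
        self._lines.append(text)

    def flush(self):
        """Return everything buffered so far as one string and reset."""
        output = "".join(self._lines)
        self._lines.clear()
        return output
```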

Such as what happens with the --repeat option. Some functions are changed to pass the runner ID instead of relying on the non-unique test number. Ref: curl#10818

Parallel testing is enabled by using a nonzero value for the -j option to runtests.pl. Performant values seem to be about 7*num CPU cores, or 1.3*num CPU cores if Valgrind is in use. Flaky tests due to improper log locking (bug curl#11231) are exacerbated while parallel testing, so it is not enabled by default yet. Fixes curl#10818 Closes curl#11246
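
The rule of thumb above (about 7 times the core count, or 1.3 times with Valgrind) can be turned into a starting value for -j. The helper below is a hypothetical Python sketch, not part of the curl test suite, and the rounding choice is an assumption.

```python
import os


def suggested_jobs(cores=None, valgrind=False):
    """Suggest a -j value from the core count using the factors quoted above."""
    if cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    factor = 1.3 if valgrind else 7
    return max(1, round(cores * factor))
```

For two cores without Valgrind this gives 14, which would be passed as `-j14` to runtests.pl.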

Reported-by: Daniel Stenberg Ref: curl#10818 Closes curl#11255

The test-ci target now uses 2 processes by default, but the amount of parallelism is tuned for each CI service and build environment based on results of a number of test runs. Some CI services use super-oversubscribed build machines that can barely run the curl tests already with no parallelism without frequently failing with timing-induced failures. These continue to be run without parallelism. Other services provide two fast, unloaded cores and these run with 14 processes, which is a good default for this kind of environment. Here's a summary of the number of test processes by CI service:

- Appveyor: 2 (Windows MSVC), 1 (others)
- Azure: 2
- Circle CI: 14
- Cirrus: 28 (macOS), 14 (Linux), 7 (FreeBSD), 5 (macOS torture), 2 (Windows)
- GitHub Actions: 3 (macOS), 2 (Linux)

Some of these are a bit conservative to keep timing-induced flakiness down. The net result is that the first test results should arrive only 3 minutes after a commit submission.

Ref: #10818 Closes #11510

dfandrich added the tests label Mar 22, 2023

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: reduce redundant calls to getpart/getpartattr

7f54cfc

These functions scan through the entire test file every time to find the right section, so they can be slow for large test files. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: stop copying a few arrays where not needed

c2233bb

Unlike some other languages that just copy a pointer, perl copies the entire array contents which takes time for a large array. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: more refactoring for clarify

9051ba2

Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: remove duplicated feature variables

ca1abf7

Use the feature map stored in the hash table instead. Most of the variables were only used once, to set the value in the hash table. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: reduce redundant calls to getpart/getpartattr

ea37a83

These functions scan through the entire test file every time to find the right section, so they can be slow for large test files. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: stop copying a few arrays where not needed

544738c

Unlike some other languages that just copy a pointer, perl copies the entire array contents which takes time for a large array. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: more refactoring for clarify

1ac9167

Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: remove duplicated feature variables

78cecba

Use the feature map stored in the hash table instead. Most of the variables were only used once, to set the value in the hash table. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: more refactoring for clarify

5be68ec

Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: remove duplicated feature variables

13d7594

Use the feature map stored in the hash table instead. Most of the variables were only used once, to set the value in the hash table. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: reduce redundant calls to getpart/getpartattr

c279467

These functions scan through the entire test file every time to find the right section, so they can be slow for large test files. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: stop copying a few arrays where not needed

805524b

Unlike some other languages that just copy a pointer, perl copies the entire array contents which takes time for a large array. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: more refactoring for clarity

c94e873

Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: also ignore test file problems when ignoring results

f869f2b

This simplifies error handling in the test verification code and makes it more consistent. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: remove duplicated feature variables

9efb6b1

Use the feature map stored in the hash table instead. Most of the variables were only used once, to set the value in the hash table. Ref: #10818

dfandrich added a commit that referenced this issue Mar 24, 2023

runtests: memoize the getpart* subroutines to speed up access

63fd696

The refactored code calls these functions with the same arguments more often, so this prevents redundant test case file parsing. Ref: #10818 Closes #10833

dfandrich added a commit that referenced this issue Mar 25, 2023

runtests: reduce redundant calls to getpart/getpartattr

fa11cf6

These functions scan through the entire test file every time to find the right section, so they can be slow for large test files. Ref: #10818

dfandrich added a commit that referenced this issue Mar 25, 2023

runtests: stop copying a few arrays where not needed

a07bd0e

Unlike some other languages that just copy a pointer, perl copies the entire array contents which takes time for a large array. Ref: #10818

bch pushed a commit to bch/curl that referenced this issue Jul 19, 2023

runtests: prepare main test loop for multiple runners

6ebcbc5

Some variables are expanded to arrays and hashes so that multiple runners can be used for running tests. Ref: curl#10818

bch pushed a commit to bch/curl that referenced this issue Jul 19, 2023

runtests: call initserverconfig() in the runner

b964519

This must be done so variables pick up the runner's unique $LOGDIR. Ref: curl#10818

bch pushed a commit to bch/curl that referenced this issue Jul 19, 2023

runtests: buffer logmsg while running singletest()

fb5938a

This allows all messages relating to a single test case to be displayed together at the end of the test. Ref: curl#10818

bch pushed a commit to bch/curl that referenced this issue Jul 19, 2023

runtests: document the -j parallel testing option

055a8ca

Reported-by: Daniel Stenberg Ref: curl#10818 Closes curl#11255

dfandrich mentioned this issue Jul 24, 2023

CI: enable parallel testing in CI builds #11510

Open

ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023

runtests: prepare main test loop for multiple runners

2708d31

Some variables are expanded to arrays and hashes so that multiple runners can be used for running tests. Ref: curl#10818

ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023

runtests: call initserverconfig() in the runner

8abee6a

This must be done so variables pick up the runner's unique $LOGDIR. Ref: curl#10818

ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023

runtests: buffer logmsg while running singletest()

17c3486

This allows all messages relating to a single test case to be displayed together at the end of the test. Ref: curl#10818

ptitSeb pushed a commit to wasix-org/curl that referenced this issue Sep 25, 2023

runtests: document the -j parallel testing option

2719f2c

Reported-by: Daniel Stenberg Ref: curl#10818 Closes curl#11255
