curl-users
Re: Multiple `make check' failures of curl-7.18.X
Date: Sun, 17 Aug 2008 00:31:42 -0500 (CDT)
>> stunnel exited with 16777215!
>> RUN: failed to start the HTTPS server
>> test 350...FAILED
>> 
>> It seems like the first failed test is 350.  This is not true though.  In 
>> fact, it is 300.  What happens is, when test 300 runs, runtests.pl calls 
>> httpsserver.pl, which fails with the "stunnel exited ..." message, and no 
>> information about test 300 ever appears on STDOUT.  This is misleading.  I 
>> guess it would be useful if the line
>> 1863:    logmsg sprintf("test %03d...", $testnum);
>> could be executed earlier, as soon as it becomes clear that the test should 
>> not be skipped.
> 
> I must admit that I can't see the problem in the code. Can you post a patch 
> for this as well?
Well, it turned out the problem is not what I thought it to be, but yes, there 
is a problem.  I do not have a patch for it, but I will at least try to 
describe it the best I can.
Note that everything described below can be recreated under two conditions:
1. Use unpatched runtests.pl, with line 390 being
         if( -x "$_/$cmd") {
2. stunnel must fail.  If it does not fail on your system, do:
   cd $SRCDIR/tests
   mkdir stunnel
   PATH=`pwd`:${PATH}; export PATH
First, why there is no real problem where I thought there was one.  My original 
statement was that 300 was the first one to fail.  This is not correct.  As I 
fugured running runtests.pl in debug mode, test 300 is the first one that 
requires stunnel.  stunnel fails, and as a result, the test is skipped, but 
nothing is written to the terminal.  Below is a piece of code from singeltest() 
where this is happening:
    1851     if($why && !$listonly) {
    1852         # there's a problem, count it as "skipped"
    1853         $skipped++;
    1854         $skipped{$why}++;
    1855         $teststat[$testnum]=$why; # store reason for this test case
    1856
    1857         if(!$short) {
    1858             printf "test %03d SKIPPED: $why\n", $testnum;
    1859         }
    1860
    1861         return -1;
    1862     }
I would like to suggest adding some short (mostly cosmetic, for users' benefit) 
printout as follows:
andyt_at_mate: tests> diff -u runtests.pl.orig runtests.pl
--- runtests.pl.orig    2008-08-11 23:32:57.702335761 -0500
+++ runtests.pl 2008-08-16 21:27:11.777188016 -0500
@@ -1856,6 +1856,8 @@
          if(!$short) {
              printf "test %03d SKIPPED: $why\n", $testnum;
+        } else {
+            logmsg sprintf("test %03d... SKIPPED\n", $testnum);
          }
          return -1;
Now for the real problem.  In the same setup, I am running three tests, 298, 
300, and 350.  298 is the last successful test, 300 is the first skipped 
because of stunnel, and 350 is the first that actually returns failure.  Note 
that I am giving below only relevant output.
./stunnel.pl 298 300 350
test 298...[FTP CWD with --ftp-method nocwd]
-d-p-e-- OK (1   out of 3  , remaining: 00:04)
stunnel exited with 16777215!
RUN: failed to start the HTTPS server
test 300 SKIPPED: failed starting HTTPS server (stunnel)
test 350...[FTP root dir list multicwd]
  data FAILED:
--- log/check-expected  2008-08-16 21:34:54.861863327 -0500
+++ log/check-generated 2008-08-16 21:34:54.861863327 -0500
@@ -1,11 +0,0 @@
-total 20
-drwxr-xr-x   8 98       98           512 Oct 22 13:06 .
-drwxr-xr-x   8 98       98           512 Oct 22 13:06 ..
-drwxr-xr-x   2 98       98           512 May  2  1996 .NeXT
--r--r--r--   1 0        1             35 Jul 16  1996 README
-lrwxrwxrwx   1 0        1              7 Dec  9  1999 bin -> usr/bin
-dr-xr-xr-x   2 0        1            512 Oct  1  1997 dev
-drwxrwxrwx   2 98       98           512 May 29 16:04 download.html
-dr-xr-xr-x   2 0        1            512 Nov 30  1995 etc
-drwxrwxrwx   2 98       1            512 Oct 30 14:33 pub
-dr-xr-xr-x   5 0        1            512 Oct  1  1997 usr
  - abort tests
./runtests.pl 350
test 350...[FTP root dir list multicwd]
-d-p-e-- OK (1   out of 1  , remaining: 00:00)
./runtests.pl 298 350
test 298...[FTP CWD with --ftp-method nocwd]
-d-p-e-- OK (1   out of 2  , remaining: 00:03)
test 350...[FTP root dir list multicwd]
-d-p-e-- OK (2   out of 2  , remaining: 00:00)
./runtests.pl 300 350
stunnel exited with 16777215!
RUN: failed to start the HTTPS server
test 300 SKIPPED: failed starting HTTPS server (stunnel)
test 350...[FTP root dir list multicwd]
-d-p-e-- OK (2   out of 2  , remaining: 00:00)
That's it.  Test 350 fails only if both 298 and 300 are tried and 300 is 
skipped.  Seems like a weird cleanup problem.  This is what caused a huge 
number of failures in my original posting.
------------------------------------------------------
Much later, after a lot more debugging, I believe I found the culprit. What 
happens in this three-test scenario is:
1. Test 298 starts ftp server (port 8992) and leaves it running.
2. Test 300 starts http server (port 8990).
3. Test 300 fails to start https server.
4. Test 300 calls stopservers($verbose), which kills both ftp and http servers. 
However, it leaves %run intact!!!
5. Test 350 calls startservers(@what), where $what[0] is 'ftp'.
6. ftp does not even try to start because all PIDs are defined:
   DB<19>
main::startservers(./runtests.pl:2501):
2501:           if($what eq "ftp") {
   DB<19>
main::startservers(./runtests.pl:2502):
2502:               if(!$run{'ftp'}) {
   DB<19> p $run{'ftp'}
25210 25210
   DB<20>
main::startservers(./runtests.pl:2677):
2677:       return 0;
   DB<20> for (keys %run) {print "$_: $run{$_}\n";}
ftp: 25210 25210
http: 25281 25281
It is questionable whether or not test 300 should kill ftp server (which it 
neither started nor uses).  I will leave this question for your consideration.
So the real problem, the way I see it, is with the stopservers().  After 
killing off all servers, it should undef %run.  I have examined the code, and 
could not find any place where %run would be needed after stopservers().
I have made all these changes and re-ran `make test' without getting any 
errors.  After that, I removed my first patch which ensures that proper stunnel 
is called, and ran all tests again.  Many more tests are skipped but there are 
no failures.  So I think my analysis is correct.
Finally, below is a cumulative patch, against runtest.pl as it is shipped with 
curl-7.18.2:
----------------------------------------------------------------
andyt_at_mate: tests> diff -u runtests.pl.orig runtests.pl
--- runtests.pl.orig    2008-08-11 23:32:57.702335761 -0500
+++ runtests.pl 2008-08-17 00:09:12.806834418 -0500
@@ -387,7 +387,7 @@
      my @paths=(split(":", $ENV{'PATH'}), "/usr/sbin", "/usr/local/sbin",
                 "/sbin", "/usr/bin", "/usr/local/bin" );
      for(@paths) {
-        if( -x "$_/$cmd") {
+        if( -x "$_/$cmd" && ! -d "$_/$cmd") {
              return "$_/$cmd";
          }
      }
@@ -1856,6 +1856,8 @@
          if(!$short) {
              printf "test %03d SKIPPED: $why\n", $testnum;
+        } else {
+            logmsg sprintf("test %03d...SKIPPED\n", $testnum);
          }
          return -1;
@@ -2482,6 +2484,7 @@
              $prev = $pid;
          }
      }
+    undef %run;
      ftpkillslaves($verbose);
  }
----------------------------------------------------------------
Regards,
Andy
Dr Andy Tsouladze
Sr Unix/Storage SysAdmin
Information Resources
-------------------------------------------------------------------
List admin: http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-users
FAQ:        http://curl.haxx.se/docs/faq.html
Etiquette:  http://curl.haxx.se/mail/etiquette.html
Received on 2008-08-17