curl-library
Crash on shared asynchronous resolver results
Date: Thu, 01 Sep 2011 20:57:31 +0400
Hi Daniel,
I am continuing to work on porting libcurl to the marmalade (former
AirplaySDK) environment (www.madewithmarmalade.com).
I've met irregular application crash in case of sharing DNS entries when
the asynchronous (c-ares particularly) resolver is used. Unfortunately,
I can not yet reproduce the crash on "big" platform like linux or windows.
The crash itself is caused by stack destruction, so I can not determine
source of the problem directly.
In the application, I am trying to make many simultaneous HTTP requests
using multi-handle in asynchronous manner.
The crash never appears if I am using direct IP address and passing
'Host: bla-bla-bla' header myself.
The crash never appears if I am using one multi-handle per one
easy-handle, and not using share-handle.
The crash appears if I am using share-handle to share DNS entries
between easy-handles, whether I am using one multi-handle for all
easy-handlers, or many multi-handles, one multi-handle per one easy-handle.
The crash continues to appear if I am replacing c-ares by my own
marmalade-specific resolver simulating c-ares behavior.
The crash continues to appear if I am replacing real resolving by
hard-coded IP address, returned from the resolver after some number of
main circle turns, but DISAPPEARS, if I am returning the hard-coded IP
address IMMEDIATELY calling callback inside resolver request call.
I've compiled libcurl with DEBUGBUILD option to see infof() messages,
and turned marmalade trace messages on, and found some deterministic
sequence leading to crash.
First of all, the crash happens always after the following sequence of
infof() messages:
* 0x9f83a38 is at send pipe head!
* STATE: CONNECT => WAITRESOLVE handle 0x9f95528; (connection #0)
* About to connect() to jams.doroga.tv port 80 (#0)
* Trying 78.40.184.246... * STATE: WAITRESOLVE => WAITCONNECT handle 0x9f95528; (connection #0)
* About to connect() to jams.doroga.tv port 80 (#1)
Note about the * STATE message on the same line of * Trying message. The
* Trying message should be finished by some result, as I found in
sources, but isn't.
The * Trying ... * STATE sequence appears rare in traces, but sometimes
finishes correctly. The next line *About to connect() ... is a marker to
crash - it happens always, if the sequence of * Trying ... * STATE is
followed by * About to connect() line.
Inserting some number of debug printf-s, I've found the following
detailed trace:
Curl_resolver_init ...
ERROR: S3E_FILE_ERR_NOT_FOUND in s3eFileOpen - Path dev/urandom not found
Curl_resolver_init success
Curl_resolver_getaddrinfo for 0x9f95650/0x9f83a38 ...
Curl_resolver_getaddrinfo for 0x9f95650/0x9f83a38 success
Curl_resolver_is_resolved for 0x9f95650/0x9f83a38 ...
waitperform for 0x9f95650/0x9f83a38 ...
waitperform for 0x9f95650/0x9f83a38 success
Curl_resolver_is_resolved for 0x9f95650/0x9f83a38 success
* 0x9f83a38 is at send pipe head!
* STATE: CONNECT => WAITRESOLVE handle 0x9f95528; (connection #0)
Curl_resolver_init ...
ERROR: S3E_FILE_ERR_NOT_FOUND in s3eFileOpen - Path dev/urandom not found
Curl_resolver_init success
Curl_resolver_is_resolved for 0x9f95650/0x9f83a38 ...
waitperform for 0x9f95650/0x9f83a38 ...
query_completed_cb for 0x9f95650/0x9f83a38 ...
query_completed_cb for 0x9f95650/0x9f83a38 success
waitperform for 0x9f95650/0x9f83a38 success
destroy_async_data 0x9f959f4 ...
destroy_async_data 0x9f959f4 success
Curl_resolver_is_resolved for 0x9f95650/0x9f83a38 success
Curl_resolver_getsock for 0x9f95650/0x9f83a38 ...
Curl_resolver_getsock for 0x9f95650/0x9f83a38 success
* About to connect() to jams.doroga.tv port 80 (#0)
ERROR: S3E_SOCKET_ERR_INPROGRESS in s3eSocketConnect
* Trying 78.40.184.246... * STATE: WAITRESOLVE => WAITCONNECT handle 0x9f95528; (connection #0)
* About to connect() to jams.doroga.tv port 80 (#1)
ERROR: S3E_SOCKET_ERR_INPROGRESS in s3eSocketConnect
Really the crash happens on bad return from the waitconnect() call on
line connect.c:980 because of damaged stack. So, before it is happening,
some code makes damage.
Addresses reported by my printf-s are conn and conn->data values, except
destroy_async_data, which reports the async parameter passed to it.
Lines ended by '...' was inserted at the top of the call, while
'success' or 'failure' - immediately before return statements.
The marmalade trace directly before the crash tells that two
simultaneous connections are ready to be reported to the upper level.
Here the full trace:
31/08/11 14:54:13.642: [0xfa0] SOCKET: s3eInetAton: '8.8.8.8'
ERROR: S3E_FILE_ERR_NOT_FOUND in s3eFileOpen - Path dev/urandom not found
31/08/11 14:54:13.841: [0xfa0] SOCKET: new: (PF_INET, SOCK_DGRAM, IPPROTO_UDP) 5188
31/08/11 14:54:13.848: [0xfa0] SOCKET: s3eSocketCreate -> p=0x1039ea5c id=3030
31/08/11 14:54:13.852: [0xfa0] SOCKET: s3eSocketConnect: 0x1039ea5c -> 8.8.8.8:53
31/08/11 14:54:13.863: [0xfa0] SOCKET: NotifyConnect: 3030 0x1039ea5c
31/08/11 14:54:13.870: [0xfa0] SOCKET: Connect: SUCCESS
* 0x9f73a38 is at send pipe head!
* STATE: CONNECT => WAITRESOLVE handle 0x9f85528; (connection #0)
31/08/11 14:54:13.936: [0xfa0] SOCKET: s3eInetAton: '8.8.8.8'
ERROR: S3E_FILE_ERR_NOT_FOUND in s3eFileOpen - Path dev/urandom not found
31/08/11 14:54:14.144: [0xfa0] SOCKET: s3eSocketClose: 0x1039ea5c open=1
31/08/11 14:54:14.147: [0xfa0] SOCKET: close: 5188 0x1039ea5c
* About to connect() to jams.doroga.tv port 80 (#0)
31/08/11 14:54:14.191: [0xfa0] SOCKET: new: (PF_INET, SOCK_STREAM, IPPROTO_TCP) 5188
31/08/11 14:54:14.195: [0xfa0] SOCKET: s3eSocketCreate -> p=0x1039eb74 id=3031
31/08/11 14:54:14.199: [0xfa0] SOCKET: s3eSocketConnect: 0x1039eb74 -> 78.40.184.246:80
ERROR: S3E_SOCKET_ERR_INPROGRESS in s3eSocketConnect
31/08/11 14:54:14.206: [0xfa0] SOCKET: TryConnect error: 10036
31/08/11 14:54:14.211: [0xfa0] SOCKET: Connect: FAILED (1001: S3E_SOCKET_ERR_INPROGRESS in s3eSocketConnect)
* Trying 78.40.184.246... * STATE: WAITRESOLVE => WAITCONNECT handle 0x9f85528; (connection #0)
* About to connect() to jams.doroga.tv port 80 (#1)
31/08/11 14:54:14.236: [0xfa0] SOCKET: new: (PF_INET, SOCK_STREAM, IPPROTO_TCP) 5180
31/08/11 14:54:14.241: [0xfa0] SOCKET: s3eSocketCreate -> p=0x1039c98c id=3000
31/08/11 14:54:14.257: [0xfa0] SOCKET: s3eSocketConnect: 0x1039c98c -> 78.40.184.246:80
ERROR: S3E_SOCKET_ERR_INPROGRESS in s3eSocketConnect
31/08/11 14:54:14.273: [0xfa0] SOCKET: TryConnect error: 10036
31/08/11 14:54:14.278: [0xfa0] SOCKET: Connect: FAILED (1001: S3E_SOCKET_ERR_INPROGRESS in s3eSocketConnect)
31/08/11 14:54:14.282: [0xfa0] SOCKET: TryConnect error: 10056
31/08/11 14:54:14.284: [0xfa0] SOCKET: NotifyConnect: 3031 0x1039eb74
31/08/11 14:54:14.288: [0xfa0] SOCKET: TryConnect error: 10056
31/08/11 14:54:14.333: [0xfa0] SOCKET: NotifyConnect: 3000 0x1039c98c
First-chance exception at 0x00000000 in s3e_simulator_debug.exe: 0xC0000005: Access violation reading location 0x00000000.
31/08/11 14:54:14.352: [0xfa0] DEFAULT: s3eDeviceHandleException: 1 0x00000000 0x00000000
Memory exception (11) attempting to access: 00000000
S3E runtime is not handling this exception
Unhandled exception at 0x00000000 in s3e_simulator_debug.exe: 0xC0000005: Access violation reading location 0x00000000.
The program '[576] s3e_simulator_debug.exe: Native' has exited with code 0 (0x0).
Some notes to explain marmalade traces.
The dev/urandom file is searched by c-ares. It happens when the c-ares
handle is created.
The NotifyConnect is probably some callback inside a marmalade library,
and Connect: SUCCESS is reported when the successfull connection is
reached the upper level. The both, UDP and TCP protocols are 'connected'
in marmalade.
You can see two consequent NotifyConnect messages between last infof()
message and the crash. It means, that two established connections are
ready to be reported to the upper level from the marmalade environment,
but probably not yet.
The libcurl library version where the problem has been met: 7.21.7. The
7.21.3 version also has the same problem.
The curl_config.h file (manually created) is applied. The curl_build.h
and setup.h files are slightly changed to apply to the marmalade
environment. They are applied also. The full source code of the porting
project, as well as examples where the crash happens, are hosted on the
GitHub at http://github.com/marmalade/libcurl
Please help me resolve the problem.
Regards,
Vsevolod Novikov
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
- text/x-chdr attachment: setup.h
- text/x-chdr attachment: curlbuild.h
- text/x-chdr attachment: curl_config.h