

Really weird problem with CURLOPT_WRITEFUNCTION

From: Joshua McCracken <>
Date: Wed, 23 Dec 2009 00:52:50 -0500

Hey all!

Working on some code to parse data sent back from servers using
CURLOPT_WRITEFUNCTION and I've run into a very strange problem. The code for
the callback function is as follows:

struct writestruct {
    char *wdatas;
    size_t size;
};

int get_url_data(void *buff, size_t size, size_t bytes, void *userp)
{
    size_t actual_size = size * bytes;
    struct writestruct *memory = (struct writestruct *)userp;
    char *temp;

    //wanted to see why I was segfaulting
    printf("Allocating memory: %ul\r\n", memory->size + actual_size);

    memory->wdatas = malloc(memory->size + actual_size + 1);
    if (memory->wdatas)
    { //time to copy data from buffer to struct
        memcpy(&(memory->wdatas[memory->size]), buff, actual_size);
        memory->size += actual_size;
        memory->wdatas[memory->size] = 0;//null terminate
    }

    temp = strtok(datas.errstr, ", ");
    parse(buff, temp);
    while (temp != NULL)
    {
        temp = strtok(NULL, ", ");
        parse(buff, temp);
    }

    return actual_size;
}
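
For what it's worth, here is a minimal sketch of how the tokenizing part could be written so that a NULL token is never handed on to the next step. count_matches is a hypothetical stand-in for my parse helper (which isn't shown in this mail), and the plain substring check is only an assumption about what such a helper might do. It also assumes the page text is NUL-terminated, which the raw buffer libcurl hands to the callback is not.

```c
#include <string.h>

/* Hypothetical helper (not from the original code): counts how many
   tokens of a ", "-separated list occur somewhere in page.
   list is modified in place, because strtok writes NUL bytes into it.
   page must be a NUL-terminated string. */
int count_matches(const char *page, char *list)
{
    int hits = 0;

    /* The token is tested against NULL before it is used, so the last
       strtok() result (NULL) never reaches strstr(). */
    for (char *tok = strtok(list, ", "); tok != NULL; tok = strtok(NULL, ", ")) {
        if (strstr(page, tok) != NULL)
            hits++;
    }
    return hits;
}
```

Called as count_matches(memory->wdatas, list) it would search the accumulated, terminated copy of the data rather than the raw chunk in buff.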

Now, the URL I was testing this on is one for self registration.
Long story short, my software is designed to automatically enumerate
usernames in web applications using a variety of methodologies (self
registration functionality, comparing content length fields, and apache
homedir method among others), then perform a dictionary attack against the
accounts that it finds. I've been testing some of the enumeration
functionality on a certain major email service provider (not the password
cracking functionality) just because I need a realistic scenario to play
with to understand how it'd work. This probably isn't a huge deal. It's not like
usernames aren't pretty much public knowledge anyways, particularly when it
comes to email where the first part of the address is almost always the

Anyways, I wanted to parse the page that gets returned for strings
indicating the presence of captchas. Since the tool has proxy support
(thanks to libcurl), I was thinking it could use Tor and restart the daemon
when it detects a captcha so as to obtain a 'new identity', thus evading the
dreaded captchas that more intelligent service providers throw at you when
you send too many username availability queries their way.

So here's where it gets interesting. It's segfaulting. After playing around
with gdb and strace and adding some code to see what exactly was going on, I
was able to determine that when allocating memory for wdatas, it's trying to
allocate 3,087,549,552 bytes. That's over 3 gigs and the page I'm working
with is nowhere near that size lol. Shouldn't even be a kilobyte. Not sure
what's going on here. Anyone have any ideas? As you can see, I modeled my
code very closely on a working implementation shown here:

I actually went back and ended up making it pretty exact, so this is a
little odd.
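
For comparison, this is a minimal sketch of the usual realloc-based way of collecting a response in memory. It is an illustration only, not the code from the example referenced above. It assumes the struct starts out as { NULL, 0 } before the transfer and that its address is handed to libcurl with CURLOPT_WRITEDATA. If either assumption doesn't hold, memory->size is garbage on the first call, which would be one possible explanation for a huge allocation request.

```c
#include <stdlib.h>
#include <string.h>

/* Accumulates the response body; must be initialised to { NULL, 0 }. */
struct writestruct {
    char *wdatas;   /* NUL-terminated data received so far */
    size_t size;    /* bytes stored, not counting the terminator */
};

/* Same parameter list as above, but returning size_t, which is what
   libcurl expects from a CURLOPT_WRITEFUNCTION callback. */
size_t get_url_data(void *buff, size_t size, size_t bytes, void *userp)
{
    size_t actual_size = size * bytes;
    struct writestruct *memory = (struct writestruct *)userp;

    /* realloc keeps the chunks already stored; realloc(NULL, n)
       behaves like malloc(n), so the first call needs no special case. */
    char *grown = realloc(memory->wdatas, memory->size + actual_size + 1);
    if (grown == NULL)
        return 0;   /* returning less than actual_size aborts the transfer */

    memory->wdatas = grown;
    memcpy(memory->wdatas + memory->size, buff, actual_size);
    memory->size += actual_size;
    memory->wdatas[memory->size] = '\0';   /* keep the copy terminated */

    return actual_size;
}
```

With this shape, calling malloc on every chunk (as my version does) is the obvious difference: it throws away the earlier data and then writes at offset memory->size into a fresh block. Also, a size_t debug print would use "%zu" rather than "%ul".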

For the sake of clarification, I'm not doing anything illegal here. If I
was, I most certainly wouldn't be telling everyone on a mailing list about
it. I do, however, believe in disclosure. And computers and IT generally
fascinate me, especially when it comes to information security. So there you
go xD


Received on 2009-12-23