Project Hiper Roadmap
How we intend and hope this project will proceed:
1. Measure and Benchmark Existing API
- gather many unique existing URLs (a lot more than 10,000)
- mostly to unique hosts, but not necessarily all unique. To allow comparison
with a future pipelining version, the test script should be able to ask for
the same document Y times, at least in a later stage.
- Write up a curl-using script that gathers random URLs from these services
(a sketch of such a gatherer follows at the end of this section):
http://randomurl.com/body.php
http://random.yahoo.com/fast/ryl
http://www.uroulette.com/visit
- Store the tens of thousands of random URLs in a text file
- Write a program using the multi interface that extracts N random URLs from
the list and downloads them simultaneously (a sketch of such a program also
follows at the end of this section).
The random seed and N must be specifiable, so that the exact same set can be
repeated in a subsequent invocation, but in general testing we must vary the
seed in order not to exhaust the same hosts.
libcurl should most likely be built with c-ares support.
Use debug builds to be able to generate memory-trace logs, to fully track
and check the amount of memory used per connection.
Compare the memory usage measurements with the amount of memory installed in
the development machines, so that testing stays within the available system
RAM.
- Measure the time from select() returning until it is called again. Run
this multiple times and collect many measurements (the timing is included in
the benchmark sketch below).
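
As a sketch of such a URL gatherer (assumptions: the listed services answer
with a redirect that can be followed to the random site, and the "urls.txt"
file name and the count of 100 fetches per run are arbitrary), a small
libcurl program can simply follow redirects and append whatever URL the
transfer ended up at, taken from CURLINFO_EFFECTIVE_URL:

  #include <stdio.h>
  #include <curl/curl.h>

  /* throw away the body, we only care about where the redirect took us */
  static size_t discard(void *ptr, size_t size, size_t nmemb, void *userdata)
  {
    (void)ptr; (void)userdata;
    return size * nmemb;
  }

  int main(void)
  {
    /* hypothetical choices: one of the listed services, an output file */
    const char *service = "http://random.yahoo.com/fast/ryl";
    FILE *out = fopen("urls.txt", "a");
    CURL *curl;
    int i;

    curl_global_init(CURL_GLOBAL_ALL);
    curl = curl_easy_init();
    if(!curl || !out)
      return 1;

    curl_easy_setopt(curl, CURLOPT_URL, service);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, discard);

    for(i = 0; i < 100; i++) {
      if(curl_easy_perform(curl) == CURLE_OK) {
        char *final_url = NULL;
        curl_easy_getinfo(curl, CURLINFO_EFFECTIVE_URL, &final_url);
        if(final_url)
          fprintf(out, "%s\n", final_url);
      }
    }

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    fclose(out);
    return 0;
  }

Repeated runs (and the other two services) build up the list; duplicate URLs
would need to be filtered out afterwards.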
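
And a rough sketch of the phase 1 benchmark program itself, driving the
existing multi interface with a plain select() loop. The urls.txt file, its
assumed line count and the line-picking helper are made up for illustration;
the points of interest are the seed/N arguments that make a run repeatable
and the timing of the gap between select() returning and being called again:

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/select.h>
  #include <sys/time.h>
  #include <curl/curl.h>

  static size_t discard(void *ptr, size_t size, size_t nmemb, void *userdata)
  {
    (void)ptr; (void)userdata;
    return size * nmemb;
  }

  /* hypothetical helper: return line number 'idx' from the URL file */
  static char *pick_url(FILE *f, long idx)
  {
    static char line[4096];
    long i = 0;
    rewind(f);
    while(fgets(line, sizeof(line), f))
      if(i++ == idx)
        break;
    line[strcspn(line, "\r\n")] = 0;
    return line; /* current libcurl copies the string set with CURLOPT_URL */
  }

  int main(int argc, char **argv)
  {
    /* usage: bench <seed> <N> - the same seed and N repeat the same set */
    unsigned int seed = (argc > 1) ? (unsigned int)atoi(argv[1]) : 1;
    int num = (argc > 2) ? atoi(argv[2]) : 100;
    long total = 10000; /* assumed number of lines in urls.txt */
    FILE *urls = fopen("urls.txt", "r");
    CURLM *multi;
    int i, running = 1;

    if(!urls)
      return 1;
    srand(seed);
    curl_global_init(CURL_GLOBAL_ALL);
    multi = curl_multi_init();

    for(i = 0; i < num; i++) {
      CURL *e = curl_easy_init();
      curl_easy_setopt(e, CURLOPT_URL, pick_url(urls, rand() % total));
      curl_easy_setopt(e, CURLOPT_WRITEFUNCTION, discard);
      curl_multi_add_handle(multi, e);
    }

    curl_multi_perform(multi, &running); /* get the transfers going */

    while(running) {
      struct timeval t_returned, t_recalled, timeout = {1, 0};
      fd_set r, w, x;
      int maxfd = -1;

      FD_ZERO(&r); FD_ZERO(&w); FD_ZERO(&x);
      curl_multi_fdset(multi, &r, &w, &x, &maxfd);

      select(maxfd + 1, &r, &w, &x, &timeout);
      gettimeofday(&t_returned, NULL);

      /* all the work done between select() returning and it being called
         again happens here */
      while(curl_multi_perform(multi, &running) == CURLM_CALL_MULTI_PERFORM)
        ;

      gettimeofday(&t_recalled, NULL);
      printf("gap: %ld us\n",
             (long)((t_recalled.tv_sec - t_returned.tv_sec) * 1000000L +
                    (t_recalled.tv_usec - t_returned.tv_usec)));
    }

    /* easy handles are leaked here for brevity; a real test program would
       remove and clean them up so that the memory-trace logs stay useful */
    curl_multi_cleanup(multi);
    curl_global_cleanup();
    fclose(urls);
    return 0;
  }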
2. Implement the curl_multi_socket API
- Port the test program from phase 1 to the new API (sketched below).
- Run measurements and benchmarks
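
In current libcurl, this API is driven with curl_multi_socket_action()
together with CURLMOPT_SOCKETFUNCTION and CURLMOPT_TIMERFUNCTION callbacks.
A bare-bones sketch of the shape such a port takes, with a single hard-wired
URL and an ordinary select() loop standing in for the epoll/kqueue style
event loop a real benchmark would use:

  #include <stdio.h>
  #include <sys/select.h>
  #include <curl/curl.h>

  /* application-held state that the libcurl callbacks keep updated */
  static fd_set read_set, write_set;
  static curl_socket_t maxsock = -1;
  static long timeout_ms = 1000;

  static size_t discard(void *p, size_t sz, size_t n, void *u)
  {
    (void)p; (void)u;
    return sz * n;
  }

  /* libcurl tells us which socket to watch, and for what */
  static int sock_cb(CURL *easy, curl_socket_t s, int what, void *userp,
                     void *socketp)
  {
    (void)easy; (void)userp; (void)socketp;
    FD_CLR(s, &read_set);
    FD_CLR(s, &write_set);
    if(what & CURL_POLL_IN)
      FD_SET(s, &read_set);
    if(what & CURL_POLL_OUT)
      FD_SET(s, &write_set);
    if((what != CURL_POLL_REMOVE) && (s > maxsock))
      maxsock = s;
    return 0;
  }

  /* libcurl tells us how long it wants to wait at most */
  static int timer_cb(CURLM *multi, long ms, void *userp)
  {
    (void)multi; (void)userp;
    timeout_ms = (ms < 0) ? 1000 : ms;
    return 0;
  }

  int main(void)
  {
    CURLM *multi;
    CURL *easy;
    int running = 1;

    curl_global_init(CURL_GLOBAL_ALL);
    multi = curl_multi_init();
    curl_multi_setopt(multi, CURLMOPT_SOCKETFUNCTION, sock_cb);
    curl_multi_setopt(multi, CURLMOPT_TIMERFUNCTION, timer_cb);

    FD_ZERO(&read_set);
    FD_ZERO(&write_set);

    easy = curl_easy_init();
    curl_easy_setopt(easy, CURLOPT_URL, "http://example.com/"); /* placeholder */
    curl_easy_setopt(easy, CURLOPT_WRITEFUNCTION, discard);
    curl_multi_add_handle(multi, easy);

    /* kick the transfer off */
    curl_multi_socket_action(multi, CURL_SOCKET_TIMEOUT, 0, &running);

    while(running) {
      fd_set r = read_set, w = write_set;
      struct timeval tv;
      tv.tv_sec = timeout_ms / 1000;
      tv.tv_usec = (timeout_ms % 1000) * 1000;

      if(select((int)maxsock + 1, &r, &w, NULL, &tv) == 0) {
        /* the timeout expired */
        curl_multi_socket_action(multi, CURL_SOCKET_TIMEOUT, 0, &running);
      }
      else {
        curl_socket_t s;
        for(s = 0; s <= maxsock; s++) {
          int ev = 0;
          if(FD_ISSET(s, &r))
            ev |= CURL_CSELECT_IN;
          if(FD_ISSET(s, &w))
            ev |= CURL_CSELECT_OUT;
          if(ev)
            curl_multi_socket_action(multi, s, ev, &running);
        }
      }
    }

    curl_multi_remove_handle(multi, easy);
    curl_easy_cleanup(easy);
    curl_multi_cleanup(multi);
    curl_global_cleanup();
    return 0;
  }

Whether the per-call overhead of this style beats the fdset approach for
thousands of simultaneous connections is exactly what the phase 2 benchmarks
are meant to show.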
3. Implement HTTP pipelining support
Make sure that:
- the test program enables pipelining (see the sketch below)
- the test program talks to a fair number of servers that support pipelining
- a fair number of documents are fetched from the same servers so that
pipelining actually can get activated
For benchmarking, identify high-latency, high-bandwidth servers that allow
pipelining, to get a feel for the "optimal" performance boost.
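
As a sketch of what the pipelining-enabled test program could look like:
later libcurl releases expose pipelining through CURLMOPT_PIPELINING on the
multi handle (and have since dropped it again in favour of HTTP/2
multiplexing), so the example below uses that option. The server name and
document paths are made up; the point is that several documents come from
the same server so that pipelining actually has requests to combine:

  #include <curl/curl.h>

  int main(void)
  {
    CURLM *multi;
    CURL *easy[3];
    int i, running;
    /* several documents from the same (hypothetical) pipelining-capable
       server, so that requests can share one connection */
    const char *urls[3] = {
      "http://pipelining.example.com/doc1",
      "http://pipelining.example.com/doc2",
      "http://pipelining.example.com/doc3"
    };

    curl_global_init(CURL_GLOBAL_ALL);
    multi = curl_multi_init();

    /* ask libcurl to attempt HTTP pipelining on this multi handle */
    curl_multi_setopt(multi, CURLMOPT_PIPELINING, 1L);

    for(i = 0; i < 3; i++) {
      easy[i] = curl_easy_init();
      curl_easy_setopt(easy[i], CURLOPT_URL, urls[i]);
      curl_multi_add_handle(multi, easy[i]);
    }

    /* drive the transfers with the plain multi API, for brevity */
    do {
      int numfds;
      curl_multi_perform(multi, &running);
      curl_multi_wait(multi, NULL, 0, 1000, &numfds);
    } while(running);

    for(i = 0; i < 3; i++) {
      curl_multi_remove_handle(multi, easy[i]);
      curl_easy_cleanup(easy[i]);
    }
    curl_multi_cleanup(multi);
    curl_global_cleanup();
    return 0;
  }

The document bodies end up on stdout here since no write callback is set; a
real benchmark would discard them and measure transfer times instead.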