MISC-17: Image Tests

Issue Information

Issue Type: Sub-task
Priority: Major
Status: Open

Reported By: Ben Tasker
Assigned To: Ben Tasker
Project: Miscellaneous (MISC)
Resolution: Unresolved
Affects Version: TorCDN
Target version: TorCDN

Created: 2016-01-17 14:27:27
Time Spent Working
360 minutes
360 minutes
0 minutes
Child of: MISC-12: Optimising Video Delivery for Tor / Building a Tor based CDN

Note: I set this going before raising an issue

Need to run a test of how the infrastructure behaves when many small files are requested (in this case image and HTML files), as if they were static content fetched as a result of visiting a webpage.

The aim is to have multiple clients making requests over a fairly prolonged period.

Each request needs to be identifiable in the server logs so that it can be compared back to the logs the client is keeping. Using the X-Downstream header should be sufficient for this.
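As a rough sketch of that comparison, the serials recorded by a client can be checked against the serials that reached an edge. This assumes the edge log records the X-Downstream value verbatim somewhere on each line (the field position is not assumed), and that the edge log has been copied alongside the client's CSV. The helper name `missing_serials` is hypothetical.

```shell
#!/bin/sh
# missing_serials CLIENT_CSV < server_log
# Prints serials the client logged that never appear in the server log.
# Assumption: the server log contains the header value as "Serial-<prefix><number>".
missing_serials() {
    client_csv=$1
    tmp=$(mktemp -d) || return 1
    # Client side: field 1 of the CSV is the serial, e.g. D42
    cut -d, -f1 "$client_csv" | sort -u > "$tmp/client"
    # Server side: extract the serial wherever the header value was logged
    grep -o 'Serial-[A-Z][0-9][0-9]*' | sed 's/^Serial-//' | sort -u > "$tmp/server"
    # Lines only in the client list = requests with no matching server entry
    comm -23 "$tmp/client" "$tmp/server"
    rm -rf "$tmp"
}

# Only runs if both files are present locally
if [ -f metricsC.csv ] && [ -f combined.log.txt.gz ]; then
    zcat combined.log.txt.gz | missing_serials metricsC.csv
fi
```

Serials reported as missing from one edge may simply have been served by the other, so the logs from both Edge devices would need to be fed in together for a complete picture.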

Part of the aim of this test is to see how well (if at all) requests balance across the edge, so both Edge devices need to be online.


The requests were set running a few days ago with variations on the following command:
SERIAL=0
while [ "$SERIAL" -lt 500000 ]; do
    # Randomly pick the file type (2 = html, otherwise gif) and a file number
    select=$(shuf -i1-2 -n1)
    if [ "$select" -eq 2 ]; then extension="html"; else extension="gif"; fi
    number=$(shuf -i1-2000 -n1)
    # Tag the request with its serial and append curl's metrics as one CSV row
    curl -H "X-Downstream: Serial-D$SERIAL" -sL -w "D${SERIAL},%{http_code},\"%{url_effective}\",%{time_total},%{time_namelookup},%{time_connect},%{time_redirect},%{time_starttransfer},%{size_download},%{size_request},%{num_redirects},%{speed_download}\\n" -o /dev/null "http://f5jayrbaz7nmtyyr.onion/qrcodes/image-${number}.${extension}" >> metricsC.csv
    SERIAL=$(( SERIAL + 1 ))
done

There are three running, using Serial prefixes B, C and D. Each writes out to its own file.

The files being requested are generated QR codes containing random strings, with a variety of other options randomly set to affect file size. In reality, though, the HTML files are all roughly the same size on disk.
Due to a router failure, there was a 13-hour outage in comms for the clients, which led to a number of request failures. However, as that is on the order of 4000 failures per client in a test comprising 500,000 requests per client, I decided to leave it running (we're already around 276,000 requests in anyway).
Have captured the relevant log lines. As expected, some are missing as a result of the router outages:
edge2$ zcat combined.log.txt.gz | awk -F'\t' '{print $12}' | sort | uniq -c
 284394 CACHE_HIT
  12972 CACHE_MISS

edge1$ zcat combined.log.txt.gz | awk -F'\t' '{print $12}' | sort | uniq -c
 729769 CACHE_HIT
   8378 CACHE_MISS
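To put the balance question in numbers, the counts above can be turned into a per-edge share of requests and a cache hit ratio. This is only a sketch: `edge_summary` is a hypothetical helper, and the edge label in the first column has been added by hand to the `uniq -c` output.

```shell
#!/bin/sh
# edge_summary: reads "edge count STATUS" lines on stdin and prints, per edge,
# the request total, its share of all requests, and its cache hit ratio.
edge_summary() {
    awk '
        { total[$1] += $2; all += $2 }
        $3 == "CACHE_HIT" { hit[$1] += $2 }
        END {
            for (e in total)
                if (total[e] > 0)
                    printf "%s requests=%d share=%.1f%% hit_ratio=%.2f%%\n",
                        e, total[e], 100 * total[e] / all, 100 * hit[e] / total[e]
        }' | sort
}

# Counts copied from the edge1 and edge2 output above
edge_summary <<'EOF'
edge1 729769 CACHE_HIT
edge1 8378 CACHE_MISS
edge2 284394 CACHE_HIT
edge2 12972 CACHE_MISS
EOF
```

Because some log lines are known to be missing, the shares are indicative rather than exact.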

MISC17-Image_tests$ cat metrics*csv | wc -l

MISC17-Image_tests$ cat metrics* | awk -F, '{print $2}' | sort | uniq -c
 303710 000
1034016 200
    497 404
     45 502
     65 504
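The raw counts are easier to compare as percentages. A minimal sketch, assuming input in the same "count code" layout that `uniq -c` prints; `code_share` is a hypothetical helper name.

```shell
#!/bin/sh
# code_share: reads "count code" lines on stdin and prints each response code
# with its count and its percentage of all requests.
code_share() {
    awk '
        { count[$2] += $1; total += $1 }
        END {
            for (c in count)
                printf "%s %d %.3f%%\n", c, count[c], 100 * count[c] / total
        }' | sort
}

# Counts copied from the output above
code_share <<'EOF'
 303710 000
1034016 200
    497 404
     45 502
     65 504
EOF
```

The same helper could be appended to the original pipeline, i.e. piped after `sort | uniq -c`, to recompute the breakdown as the test progresses.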

Fields in the metrics logs are:

- 1 - Serial
- 2 - Response code (000 means curl received no HTTP response)
- 3 - Final URL
- 4 - Total request time (seconds)
- 5 - DNS lookup time (seconds)
- 6 - Initial connect time (seconds)
- 7 - Time spent in redirects (seconds)
- 8 - TTFB (seconds)
- 9 - Downloaded size (bytes)
- 10 - Request size (bytes)
- 11 - Number of redirects
- 12 - Average download speed (bytes per second)
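Using those field positions, a per-client timing summary can be pulled straight from the CSVs. The following is a sketch rather than part of the original tooling: `timing_summary` is a hypothetical helper, it only considers 200 responses, and it groups by the first character of the serial (the client prefix).

```shell
#!/bin/sh
# timing_summary: reads metrics CSV rows on stdin and prints, per client prefix,
# the number of successful requests with mean total time (field 4) and mean TTFB (field 8).
timing_summary() {
    awk -F, '
        $2 == "200" {
            client = substr($1, 1, 1)   # B, C or D
            n[client]++
            total[client] += $4
            ttfb[client]  += $8
        }
        END {
            for (c in n)
                printf "%s requests=%d mean_total=%.3fs mean_ttfb=%.3fs\n",
                    c, n[c], total[c] / n[c], ttfb[c] / n[c]
        }' | sort
}

# Only runs if the metrics files are present in the current directory
if ls metrics*csv >/dev/null 2>&1; then
    cat metrics*csv | timing_summary
fi
```

Splitting on commas is safe here only because the quoted URL in field 3 contains no commas for the paths used in this test.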