##########################################################################################
GPXIN-22: Measure impact of Distance Calculations
##########################################################################################

Issue Type: Task
-----------------------------------------------------------------------------------------

Issue Information
====================

Priority: Minor
Status: Closed
Resolution: Done (2015-08-27 05:22:13)

Project: PHP GPXIngest (GPXIN)
Reported By: btasker
Assigned To: btasker

Components:
    - Experimental Features

Affected Versions:
    - 1.03

Targeted for fix in version:
    - 1.03

Time Estimate: 0 minutes
Time Logged: 0 minutes

-----------------------------------------------------------------------------------------

Issue Description
==================

GPXIN-6 implemented automatic distance calculations between trackpoints based on changes in Latitude/Longitude.

The original feeling was that this might be unnecessarily processor intensive - in various tests though that doesn't seem to have been the case.

So need to run some additional tests and use the data to decide whether or not distance calculations should be enabled by default (if so, there should be an ability to suppress them).

-----------------------------------------------------------------------------------------

Attachments
============

- run2.csv.gz

-----------------------------------------------------------------------------------------

Issue Relations
================

- relates to GPXIN-6: Distance calculations
- relates to GPXIN-23: Move calcDistance out of Experimental Features

-----------------------------------------------------------------------------------------

Activity
==========

-----------------------------------------------------------------------------------------
2015-08-13 18:04:11
-----------------------------------------------------------------------------------------

btasker changed priority from 'Major' to 'Minor'

-----------------------------------------------------------------------------------------
2015-08-27 04:24:10 btasker
-----------------------------------------------------------------------------------------

Doing a very quick test run to get a basic overview.

-- BEGIN SNIPPET --
ben@milleniumfalcon:~/tmp$ grep trkpt test.gpx | wc -l
10320
-- END SNIPPET --

Created two test scripts - with_calcs.php and without_calcs.php

-- BEGIN SNIPPET --
// without_calcs.php
// [...]
$gpx->loadFile('test.gpx');
$gpx->ingest();
print_r($gpx->getGPXNameSpaces());

// with_calcs.php
// [...]
$gpx->enableExperimental('calcDistance');
$gpx->loadFile('test.gpx');
$gpx->ingest();
print_r($gpx->getGPXNameSpaces());
-- END SNIPPET --

Test run (not interested in the script output at this point)

-- BEGIN SNIPPET --
ben@milleniumfalcon:~/tmp$ for i in {1..6}; do (time php without_calcs.php > /dev/null) ; done; echo "With:"; for i in {1..6}; do (time php with_calcs.php > /dev/null) ; done;

real    0m0.446s
user    0m0.417s
sys     0m0.028s

real    0m0.463s
user    0m0.444s
sys     0m0.016s

real    0m0.442s
user    0m0.422s
sys     0m0.020s

real    0m0.443s
user    0m0.420s
sys     0m0.020s

real    0m0.443s
user    0m0.410s
sys     0m0.032s

real    0m0.445s
user    0m0.428s
sys     0m0.016s
With:

real    0m0.492s
user    0m0.460s
sys     0m0.012s

real    0m0.466s
user    0m0.450s
sys     0m0.016s

real    0m0.475s
user    0m0.421s
sys     0m0.052s

real    0m0.459s
user    0m0.446s
sys     0m0.012s

real    0m0.477s
user    0m0.432s
sys     0m0.044s

real    0m0.470s
user    0m0.426s
sys     0m0.044s
-- END SNIPPET --

Which gives the following result (real time, in seconds)

-- BEGIN SNIPPET --
==========================================================
| Run    |   1   |   2   |   3   |   4   |   5   |   6   |
----------------------------------------------------------
| With   | 0.492 | 0.466 | 0.475 | 0.459 | 0.477 | 0.470 |
----------------------------------------------------------
| Without| 0.446 | 0.463 | 0.442 | 0.443 | 0.443 | 0.445 |
----------------------------------------------------------
| Diff   | 0.046 | 0.003 | 0.033 | 0.016 | 0.034 | 0.025 |
==========================================================
-- END SNIPPET --

So on average, having the calculations enabled required an extra 0.026166667 seconds of calculation. Per trackpoint (0.026166667/10320) that's a cost of 0.00000254 seconds.

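As a cross-check of the arithmetic above, the mean difference and the per-trackpoint cost can be recomputed directly from the six timing pairs. This is only a sketch: the function name mean_overhead and the variable names are made up for illustration, and it assumes bash and awk are available.

```shell
#!/usr/bin/env bash
# Wall-clock ("real") times in seconds, copied from the test run above.
with_calcs="0.492 0.466 0.475 0.459 0.477 0.470"
without_calcs="0.446 0.463 0.442 0.443 0.443 0.445"
trackpoints=10320   # from: grep trkpt test.gpx | wc -l

# mean_overhead WITH WITHOUT POINTS  (hypothetical helper)
# Prints the mean per-run difference and the mean cost per trackpoint.
mean_overhead() {
    awk -v on="$1" -v off="$2" -v points="$3" 'BEGIN {
        n = split(on, a, " ")
        split(off, b, " ")
        for (i = 1; i <= n; i++) total += a[i] - b[i]
        mean = total / n
        printf "mean_diff=%.9f per_trackpoint=%.8f\n", mean, mean / points
    }'
}

mean_overhead "$with_calcs" "$without_calcs" "$trackpoints"
```

With the figures from the table this prints mean_diff=0.026166667 per_trackpoint=0.00000254.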
Want to run a few more test runs, with different files and bigger run-sizes, but based on that I can't see any argument against turning the calculations on by default.

-----------------------------------------------------------------------------------------
2015-08-27 04:27:58
-----------------------------------------------------------------------------------------

btasker changed status from 'Open' to 'In Progress'

-----------------------------------------------------------------------------------------
2015-08-27 05:11:59
-----------------------------------------------------------------------------------------

btasker added 'run2.csv.gz' to Attachments

-----------------------------------------------------------------------------------------
2015-08-27 05:16:09
-----------------------------------------------------------------------------------------

btasker removed 'run2.csv.gz' from Attachment

-----------------------------------------------------------------------------------------
2015-08-27 05:17:10 btasker
-----------------------------------------------------------------------------------------

Larger run this time, triggered with

-- BEGIN SNIPPET --
ben@milleniumfalcon:~/tmp$ for i in {1..1000}; do echo "$((time php without_calcs.php) |& grep real | cut -f2),$((time php with_calcs.php) |& grep real | cut -f2)," >> processing_times.csv ; done;
ben@milleniumfalcon:~/tmp$ less processing_times.csv # quick check to make sure nothing's > 60 seconds
ben@milleniumfalcon:~/tmp$ sed -i 's/0m//g' processing_times.csv
ben@milleniumfalcon:~/tmp$ sed -i 's/s,/,/g' processing_times.csv
-- END SNIPPET --

I used real time as that's consistently been the highest value.

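For reference, a minimal sketch of how a CSV in this shape could be summarised. It is not necessarily how the figures on this ticket were produced: the helper name summarise_times is hypothetical, the column order (without, then with, then a trailing comma) is taken from the loop above, and the default of 10320 trackpoints comes from the earlier grep of test.gpx. The sample data is the six timing pairs from the quick test, used as a stand-in for processing_times.csv.

```shell
#!/usr/bin/env bash
# summarise_times CSV [TRACKPOINTS]  (hypothetical helper, not part of GPXIngest)
# Expects rows of "without,with," in seconds, as left by the sed clean-up.
summarise_times() {
    local csv="$1" points="${2:-10320}"
    # Per-run difference (with - without), sorted so the median can be picked out.
    awk -F, 'NF >= 2 && $1 != "" { printf "%.3f\n", $2 - $1 }' "$csv" |
        LC_ALL=C sort -n |
        awk -v points="$points" '
            { d[NR] = $1; sum += $1 }
            END {
                if (NR == 0) exit 1
                mean = sum / NR
                median = (NR % 2) ? d[(NR + 1) / 2] : (d[NR / 2] + d[NR / 2 + 1]) / 2
                printf "mean=%.6f median=%.6f max=%.3f min=%.3f\n", mean, median, d[NR], d[1]
                printf "per_row_mean=%.9f\n", mean / points
            }'
}

# Stand-in data: the six timing pairs from the earlier quick test.
sample=$(mktemp)
cat > "$sample" <<'EOF'
0.446,0.492,
0.463,0.466,
0.442,0.475,
0.443,0.459,
0.443,0.477,
0.445,0.470,
EOF
summarise_times "$sample" 10320
rm -f "$sample"
```

For the real run, pass processing_times.csv in place of the sample file. A negative minimum means that at least one run with calculations enabled finished faster than its counterpart without them.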
-----------------------------------------------------------------------------------------
2015-08-27 05:17:32 btasker
-----------------------------------------------------------------------------------------

Resulting data is attached (run2.csv.gz) but the overall stats (difference in seconds, with calculations minus without) are

-- BEGIN SNIPPET --
===================================================================
|         | Mean        | Median      | Max         | Min         |
-------------------------------------------------------------------
| Total   | 0.030436    | 0.0305      | 0.105       | -0.006      |
-------------------------------------------------------------------
| Per-Row | 0.000002949 | 0.000002955 | 0.000010174 | -0.000000581|
===================================================================
-- END SNIPPET --

So, the overall extra time required to perform the distance calculations is negligible, and in at least one case, the version with calculations ran (marginally) faster.

The test file I'm using represents a journey of more than 3 hours with varied speeds, so there should be a good range of calculations going on.

I'm fairly comfortable with the idea of raising an FR to move _calcDistance_ out of experimental features, so that it's enabled by default (so long as it can be suppressed if needed).

-----------------------------------------------------------------------------------------
2015-08-27 05:22:13 btasker
-----------------------------------------------------------------------------------------

GPXIN-23 has been raised to enable the functionality by default.

Closing

-----------------------------------------------------------------------------------------
2015-08-27 05:22:13
-----------------------------------------------------------------------------------------

btasker changed status from 'In Progress' to 'Resolved'

-----------------------------------------------------------------------------------------
2015-08-27 05:22:13
-----------------------------------------------------------------------------------------

btasker added 'Done' to resolution

-----------------------------------------------------------------------------------------
2015-08-27 05:22:16
-----------------------------------------------------------------------------------------

btasker changed status from 'Resolved' to 'Closed'