##########################################################################################
JILS-41: Add support for downstream re-validation
##########################################################################################

Issue Type: New Feature
-----------------------------------------------------------------------------------------

Issue Information
====================

Priority: Major
Status: Closed
Resolution: Done (2016-04-29 14:58:32)

Project: Jira Issue Listing Script (JILS)
Reported By: btasker
Assigned To: btasker

Affected Versions:
- 0.01b

Targeted for fix in version:
- 0.01b

Labels: Caching, Headers

Time Estimate: 0 minutes
Time Logged: 154 minutes
-----------------------------------------------------------------------------------------

Issue Description
==================

The aim is to add support for a downstream client/cache to revalidate a copy of a page they've previously obtained.

As an initial step, should add the following headers to page responses

- Last-Modified
- ETag

(These two headers have recently been added to Issue pages and attachments/thumbs)

The ultimate aim is to add support for conditional GETs.
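
As a rough illustration of what the description is aiming for, the sketch below builds the two validator headers and then decides whether a conditional GET could be answered with a 304. It is written in Python purely for illustration and is not taken from the JILS codebase; the function names, the use of a SHA-1 of the page body as the ETag, and the plain-dict header handling are all assumptions.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_validators(body: bytes, last_changed: datetime) -> dict:
    """Build Last-Modified and ETag for a rendered page (hypothetical helper).

    last_changed must be timezone-aware UTC. Hashing the body for the ETag
    is an assumption made for this sketch only.
    """
    return {
        "Last-Modified": format_datetime(last_changed, usegmt=True),
        "ETag": '"' + hashlib.sha1(body).hexdigest() + '"',
    }


def is_not_modified(request_headers: dict, validators: dict) -> bool:
    """Return True if the request could be answered with 304 Not Modified."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # RFC 7232: when If-None-Match is present, If-Modified-Since is ignored.
        tags = [t.strip() for t in inm.split(",")]
        # Weak comparison: drop any W/ prefix before comparing.
        tags = [t[2:] if t.startswith("W/") else t for t in tags]
        return "*" in tags or validators["ETag"] in tags

    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            return False  # unparseable date: treat the header as absent
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        modified = parsedate_to_datetime(validators["Last-Modified"])
        return modified <= since

    return False  # unconditional request: serve the full page
```

A caller would send the validators with every 200 response, and return a bodyless 304 (still carrying the validators) whenever `is_not_modified` returns True.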
-----------------------------------------------------------------------------------------

Issue Relations
================

- relates to JILS-40: Database change indicator
- Discovered JILS-42: Last-Modified isn't updated if a Project Description changes

- Adding headers to Issue Page (Github) (https://github.com/bentasker/Jira-Issue-Listing/commit/81c70556b72efb4f066a6e2a6ce2c424cb4ec10a)
- Adding headers to Attachments and Thumbs (Github) (https://github.com/bentasker/Jira-Issue-Listing/commit/5d22f2d4c8fe9d5d1761c19a0283be93108fae05)
- RFC 7232 (https://tools.ietf.org/html/rfc7232)

-----------------------------------------------------------------------------------------

Activity
==========

-----------------------------------------------------------------------------------------
2016-04-20 12:25:36
-----------------------------------------------------------------------------------------

btasker changed timespent from '0 minutes' to '10 minutes'

-----------------------------------------------------------------------------------------
2016-04-20 12:34:24 btasker
-----------------------------------------------------------------------------------------

A little bit of background to help shape requirements:

I currently have an install of Sphider (http://www.sphider.eu/) which crawls JILS (and other sites) periodically.

Sphider's bot is, being generous, less than clever. The way it performs revalidations is to fetch the entire page, generate an MD5 of the content and then compare that to a stored hash to see whether it needs to reprocess.

This is less than efficient as it means the origin has to serve the entire page either way, which really starts to matter when your JIRA database grows beyond a certain (arbitrary) size.

Before it places a GET, the bot places a HEAD in order to check for non-200 results. So I've patched it to also extract Last-Modified from the headers (may well add ETag later).
This is now compared against a date stored in the database (no changes there, the date was already stored), and the GET is skipped if the date remains the same.

So, for Sphider's purposes, as of commits 81c7055 and 5d22f2d revalidation is now possible on Issue pages and attachments (and thumbs, though it doesn't fetch those).

However - true revalidation still isn't possible. If (for example) a caching Nginx reverse proxy were to be placed downstream, it'd periodically need to re-fetch the entire content, as it (correctly) uses conditional requests to revalidate the content rather than placing a HEAD first.

So, the final step in this issue is to add support for a more normal use-case - conditional GETs (i.e. If-Modified-Since etc).

As no page currently returns a Cache-Control header, also need to implement a configuration option to allow one to be set if desired (the alternative is forcing a caching age either at Apache or on the downstream proxy). The configuration option should allow a per-class setting to be used, so that different ages can be set for Issue pages, Attachments, Issue indexes (i.e. project home pages, versions, components etc).

-----------------------------------------------------------------------------------------
2016-04-20 12:41:46 btasker
-----------------------------------------------------------------------------------------

The other element that needs considering is whether to add Last-Modified/ETag headers to Project, Version and Component indexes.

If so, what's the best way to do so? The idea of adding the headers is to make revalidations cheap, so the SQL statement used to get the necessary data needs to be as simple as possible, otherwise it may well be cheaper to simply re-enumerate the issues for that page.

But, there's also the question of how smart to be. Which of the following questions do we ask?

- Has any issue linked to from this page changed in some way?
- Has any issue linked to from this page changed in such a way it'll have caused this page to change?

The difference being that the former is (largely) just a raw check of the _jiraaction_ table. The latter involves looking for certain events (change of issue status/resolution, change of issue title, change of assignee, change of priority, change of type, new issue creation etc).

The latter would give a more accurate result, but comes at a cost:

- The SQL query will be more complex
- It ties the query to the current layout, making future changes to the layout more complex (will have to remember to update the query)

Going with the simpler route, though, means that the page will need to be re-indexed whenever _any_ change is made to an issue within a project - even if that "change" is simply that a comment has been added, or an additional watcher has been added (not even displayed on the issue page currently).

But, it does avoid the risk of forgetting to update the query in the future and having new changes not be picked up when they should.

-----------------------------------------------------------------------------------------
2016-04-20 12:42:04
-----------------------------------------------------------------------------------------

btasker changed timespent from '10 minutes' to '26 minutes'

-----------------------------------------------------------------------------------------
2016-04-20 12:48:28 btasker
-----------------------------------------------------------------------------------------

I think the above needs some additional thought put into it, so for the time being will look at getting support for Conditional Requests dropped into the Issue and Attachment pages.
So will look for and honour

- If-Modified-Since
- If-None-Match

-----------------------------------------------------------------------------------------
2016-04-20 12:48:30
-----------------------------------------------------------------------------------------

btasker changed status from 'Open' to 'In Progress'

-----------------------------------------------------------------------------------------
2016-04-20 13:31:27 git
-----------------------------------------------------------------------------------------

-- BEGIN QUOTE --
Repo: Jira-Issue-Listing
Commit: 3d3684c14f127bb11fde441bcc8558b6d3c9f7e7
Author: Ben Tasker