JILS-42: Last-Modified isn't updated if a Project Description changes



Issue Information

Issue Type: Bug
 
Priority: Major
Status: Closed

Reported By:
Ben Tasker
Assigned To:
Ben Tasker
Project: Jira Issue Listing Script (JILS)
Resolution: Fixed (2018-01-05 08:39:50)
Affects Version: 0.1, 0.01b,
Target version: 0.1,
Labels: Caching,

Created: 2016-04-22 13:34:37
Time Spent Working
Estimated:
 
120 minutes
Remaining:
  
65 minutes
Logged:
  
55 minutes


Description
JILS-41 introduced support for two new response headers

- E-Tag
- Last-Modified

And for two new request headers

- If-None-Match
- If-Modified-Since

Unfortunately on Project Index pages (also Component and Version indexes), whilst the E-Tag will update if Project properties (description, name, url etc) change, the Last-Modified date doesn't.

This is an issue, because some downstream caches (NGinx being one) only use If-Modified-Since when revalidating. So although the page content has changed, the page will revalidate.

There isn't actually a good way to get around this, as JIRA doesn't keep a log of when project/version/component properties were changed. So there isn't a date in the database we can extract to identify the most recent change (unless an issue has changed, then we can use that).


Issue Links

NGinx Configuration Manual
NGinx Ticket 101
Toggle State Changes

Activity


Looking at the NGinx configuration options, it seems to imply that If-None-Match is supported.

My testbox (nginx/1.8.0) definitely isn't using it, and the NGinx caching guide (https://www.nginx.com/blog/nginx-caching-guide/) says If-Modified-Since will be used.

So it's possible If-None-Match support is more recent. Alternatively - Etags aren't currently enabled on that proxy, so perhaps they need to be turned on for it to consider etags when going upstream?
Got sidetracked...


Simply enabling etags isn't sufficient, as by default NGinx is using HTTP/1.0 to go upstream
GET /browse/EXAMPLE-1.html HTTP/1.0


E-tags don't exist in 1.0, so need to tell NGinx to use 1.1 when going to the origin
proxy_http_version 1.1;


But even then they don't seem to be used. I'm probabyl missing a config option somewhere, will have a look at the code-base to try and see what (and when it was introduced)
Looks like If-None-Match support was introduced in 1.7.3 (see NGinx ticket 101), so should be available on on my install.

Looking at the commits against that issue (primarily https://trac.nginx.org/nginx/changeset/c95d7882dfc9ff90ee161328747114b6b54667a5/nginx ) it looks like the only config variable taken into account is proxy_cache_revalidate, which is already set.

The check's pretty simple:

static ngx_int_t
ngx_http_upstream_cache_etag(ngx_http_request_t *r,
    ngx_http_variable_value_t *v, uintptr_t data)
{
    if (r->upstream == NULL
        || !r->upstream->conf->cache_revalidate
        || r->upstream->cache_status != NGX_HTTP_CACHE_EXPIRED
        || r->cache->etag.len == 0)
    {
        v->not_found = 1;
        return NGX_OK;
    }
 
    v->valid = 1;
    v->no_cacheable = 0;
    v->not_found = 0;
    v->len = r->cache->etag.len;
    v->data = r->cache->etag.data;

    return NGX_OK;
}
That moment where you realise you're being a twatspanner on the public internet. I'm returning an E-Tag header not an ETag header, so of course NGinx isn't trying to use it.

Repo: Jira-Issue-Listing
Commit: d4cf7f137964a4d143d341775b68f52b6589de53
Author: Ben Tasker <github@<Domain Hidden>>

Date: Fri Apr 22 14:44:21 2016 +0100
Commit Message: sigh see http://projects.bentasker.co.uk/jira_projects/browse/JILS-42.html#comment1921340



Modified (-)(+)
-------
attachment.php
component-issues.php
issue_page.php
project-index.php
version-issues.php




Webhook User-Agent

GitHub-Hookshot/cd33156


View Commit

Despite the fix, NGinx is still going upstream with If-Modified-Since instead of If-None-Match. I'm sure I'm probably missing something here though.
I've just double-checked the cached item on disk and the headers are definitely present and correct
# egrep -a "etag|Last-Mod" /usr/share/nginx/cache/static/d/5b/65d69cf4d601773f304231cdb736a5bd
Last-Modified: Thu, 12 Mar 2015 06:24:14 GMT
etag: is-12709-deda25ecae31b80de016e60bed193286bf93fd54


Just to rule out the obvious I switched the ordering of the headers, but no dice. Not that ordering should matter anyway

Updating to Nginx 1.8.1 hasn't helped either.

Doesn't seem to be anything wrong with the header, Chrome is happy to use If-None-Match when revalidating a cached copy of the page. So there's either something wrong with my NGinx config, or NGinx itself.
Looking at the most recent source - https://trac.nginx.org/nginx/browser/nginx/src/http/ngx_http_upstream.c#L5372 - it's not taking any new configuration variables into account when deciding whether an ETag can be used for revalidation.

After running a few tests, it seems to relate to the length of the ETag. I swapped the SHA1 for a MD5 and NGinx now presents an If-None-Match header when attempting to revalidate. Shouldn't matter as ETags are of type Entity-Tag, which according to the spec can be any length (though I had forgotten to quote them).

As the page type, and issue ID are in the clear, only need to worry about collisions within a page, so I think it should be safe to swap everything over to MD5. Will work through and do that now (and insert the errant quotes)

It definitely seems to have to do with the length of the ETag. I originally updated the ETag generation for the issue page to
$etag='"is-'.$issue->ID."-".md5($changes->maxcreate.$comments->maxcreate).'"';

and that still wasn't sufficiently short for NGinx to present an If-None-Match.

Shortening to
$etag='"is-'.$issue->ID."-".md5($changes->maxcreate.$comments->maxcreate).'"';

worked but would mean NGinx would stop using ETags for revalidation if an issues internal ID were higher than 999999. Whilst that's a fairly large JIRA install, it's not necessarily huge, so I've updated the etag generation to
$etag='"i'.$issue->ID."-".md5($changes->maxcreate.$comments->maxcreate).'"';

So, it should work for up to 99,999,999 issues. If someone's got more than that number of issues, it's going to fall back to IMS, but there's not much can be done about that other than looking at how to adjust NGinx to remove the limitation (seems to be 40 bytes/320 bits)

I've made a similar change to all the other etags (will commit in a moment) to get those working as well, and revalidation does now seem to be using If-None-Match

Obviously it's still worth trying to think of a way to identify a Last-Modified date for Project/Version/Component pages, but NGinx will now correctly use If-None-Match, so that reduces the impact a little.

Repo: Jira-Issue-Listing
Commit: 6dcbe77fefa6b5d601dd8e8edf603a19064873dc
Author: Ben Tasker <github@<Domain Hidden>>

Date: Fri Apr 22 15:59:55 2016 +0100
Commit Message: Shortened ETags to achieve NGinx compatability. See JILS-42



Modified (-)(+)
-------
attachment.php
component-issues.php
issue_page.php
project-index.php
version-issues.php




Webhook User-Agent

GitHub-Hookshot/cd33156


View Commit

btasker changed timespent from '0 minutes' to '55 minutes'
So, to summarise

Although it'll accept and serve longer ETags returned from the origin, for revalidation purposes, NGinx appears to limit the length of ETags to 40 bytes. Any longer and it won't include an If-None-Match header when sending a conditional request upstream (instead falling back on If-Modified-Since).

In order for NGinx to use If-None-Match, upstream connections need to be HTTP/1.1 (because etags didn't exist in 1.0), so the following must be set (whether globally - in the http block, on a per host basis - in the server block, or within the location being proxied - in the location block)
proxy_http_version 1.1;


There's no need to explicitly enable ETags with the etag directive (though you might want to enable it for other reasons).
btasker added '0.1' to Fix Version
btasker added '0.01b' to Version
btasker added '0.1' to Version
btasker changed status from 'Open' to 'Resolved'
btasker added 'Fixed' to resolution
btasker changed status from 'Resolved' to 'Closed'

Work log


Ben Tasker
Permalink
2016-04-22 16:06:46

Time Spent: 55 minutes
Log Entry: Looking into why NGinx doesn't use If-None-Match