FKAMP-2: Anti-Amp Script doesn't work on Google's AMP cache

Issue Information

Issue Type: Bug
Priority: Major
Status: Closed

Reported By:
Ben Tasker
Assigned To:
Ben Tasker
Project: Anti-AMP Scripts (FKAMP)
Resolution: Fixed (2019-06-09 12:38:55)
Affects Version: V1.1, v1.2, v1.3
Target version: v1.4, v1.4.1a, v1.4.1

Created: 2019-05-10 22:52:49

Issue Links

Github #2

I can't really paste the full amount into JIRA because it's so messy, but here's what Google are actually serving when you hit that location (assuming they believe you're on a mobile):
<!doctype html><html lang="en-GB" jsl="$t t-KQGreKEI5HU;$x 0;" class="r-iaKt_mpAHihk">

The opening is followed by some CSS and then a whole load of obfuscated javascript (see screenshot).

The important things of note though are

- All that's on one line (yeuch)
- Google aren't actually declaring the document as being AMP (which is why the detector isn't firing) - note the html tag
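
To illustrate why the detector stays quiet, here's a minimal sketch of a check for the AMP declaration on the html tag. The AMP spec has documents mark themselves with an amp (or ⚡) attribute on html; declaresAmp is a hypothetical helper written for this note, not the function the script actually uses.

```javascript
// Hypothetical helper (not from the RemoveAMP repo): decide whether an
// opening <html ...> tag declares the document as AMP.
function declaresAmp(htmlOpenTag) {
    // Drop the leading "<html" and any trailing ">" to leave only attributes
    var attrs = htmlOpenTag.replace(/^<html\b/i, '').replace(/>$/, '');
    // Blank out quoted attribute values so text inside them can't match
    attrs = attrs.replace(/"[^"]*"|'[^']*'/g, '""');
    // Reduce each attribute to its lowercased name
    var names = attrs.split(/\s+/).map(function (a) {
        return a.split('=')[0].toLowerCase();
    });
    // AMP documents carry a bare "amp" or lightning bolt attribute
    return names.indexOf('amp') !== -1 || names.indexOf('\u26A1') !== -1;
}

// The tag Google's wrapper serves (quoted above) carries no such attribute
console.log(declaresAmp('<html lang="en-GB" jsl="$t t-KQGreKEI5HU;$x 0;" class="r-iaKt_mpAHihk">'));
```

Inside a browser the same question could be asked of document.documentElement with hasAttribute('amp'); the string form is only used here so the check can be run outside a page.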

btasker added 'Screenshot_20190510_230246.png' to Attachments
btasker changed Project from 'STAGING' to 'Miscellaneous'
btasker changed Key from 'STGNG-10' to 'MISC-29'

Repo: RemoveAMP
Commit: 074b168c44777fb3ea7343afdab8caa31badfb99
Author: B Tasker <github@<Domain Hidden>>

Date: Fri May 10 23:07:46 2019 +0100
Commit Message: MISC-29 Search for Canonical and trigger redirect on Google AMP cache pages (see Github #2)

We need to do this because Google's cache doesn't properly declare that the resource is AMP.
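
A rough sketch of the idea in that commit (not the committed code): when the page is on a known cache host, look up the canonical link and use it as the redirect target. CACHE_HOSTS and canonicalRedirectTarget are hypothetical names, and amp-cache.example is a placeholder hostname rather than the real cache domain.

```javascript
// Placeholder list - the real script would carry the actual cache hostname(s)
var CACHE_HOSTS = ['amp-cache.example'];

// Return the canonical URL to redirect to, or null when no redirect applies.
// `doc` needs querySelector(); `loc` needs hostname and href (as window.location has).
function canonicalRedirectTarget(doc, loc, cacheHosts) {
    if (cacheHosts.indexOf(loc.hostname) === -1) {
        return null; // not a known cache host, leave the page alone
    }
    var link = doc.querySelector('link[rel="canonical"]');
    var href = link ? link.getAttribute('href') : null;
    if (!href || href === loc.href) {
        return null; // nothing to follow, or we'd redirect to ourselves
    }
    return href;
}

// Only attempt the redirect when running inside a page
if (typeof document !== 'undefined' && typeof window !== 'undefined') {
    var target = canonicalRedirectTarget(document, window.location, CACHE_HOSTS);
    if (target) {
        window.location.replace(target);
    }
}
```

Matching on hostname is what makes this fragile: it only helps for caches whose hostnames are listed.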


Repo: RemoveAMP
Commit: 354f2c3a36f46a2006744be73ec69c3de206c03f
Author: B Tasker <github@<Domain Hidden>>

Date: Fri May 10 23:22:21 2019 +0100
Commit Message: Update Greasemonkey hook to reference v1.4

Means we should auto-roll out the changes made for MISC-27


You do get a couple of window repaints when going to the Google hosted cache though.

Looking at logs it's because there are a couple of hops to make:

- Start:
- Go to listed canonical (still AMP):
- Go to listed Canonical (proper HTML):

The user is reporting that they have to manually reload the page before they're redirected away from Google's AMP cache. This only happens when they visit via Google Search though.

I've had no luck (user-agent changes, incognito + UA change etc) in getting Google to actually serve me AMP results so haven't been able to repro.
User's provided a console screenshot and also provided the outer-HTML they see (the iframe itself serves up the same content as when you go direct).

The URL is the same as those mentioned earlier, but when the user goes there direct the redirect works.
btasker added '57573041-742d9380-73f0-11e9-8346-1c85a23ee1ea.png' to Attachments
Now this is interesting, if not particularly helpful in answering the outstanding issue.

I've now been able to get Google to serve me AMP results. But... they must be doing some kind of compatibility or experience tracking on their side, because if you do the following:

- Open Private Window in FF
- Select UA that wouldn't get AMP results
- Search "theregister AMP bad"
- No AMPs
- Change to UA that should, re-search

You still get no AMP results.

This is important because it seems that the default iPhone user-agent in the User-Agent switcher extensions, for whatever reason, doesn't get given AMP results. So when I later switched to Mobile/Chrome in there, I still didn't get the results even though I should have. Closing the private window, and then opening anew and starting off with the correct UA gets me AMP results.

Unfortunately, the behaviour I'm getting still differs from the user's. I may need to lay hands on a Mac to try and repro with Safari. I don't see any obvious smoking gun in the console output - the call out to is interesting (though you can't see the full querystring, so I can't get anything beyond a 404 back atm, but it's going to which does make me wonder if that's the path that's actually being served from).

It's a blind shot in the dark, but I'm tempted to get the user to run a copy of the script which adds that domain to the checker, just to see what the behaviour ends up being.

Either way, we already know the initial change won't be sufficient on its own, as both Cloudflare and Bing run AMP caches too. The problem there, though, is they're rolling out "Real URL" support, so we won't be able to rely on the source hostname in future as they'll serve AMP from rather than (say)

Though, to be fair, based on a very quick check, Cloudflare do seem to be honouring the spec and properly declaring AMP
ben@milleniumfalcon:~$ curl -s -o /tmp/cf; head /tmp/cf | grep -o -P '<html [^>]*'
<html amp i-amphtml-layout
Right, let's take one more look at the original page first though.

So, we've got Google's wrapper page, which basically involves serving a small document with a lot of javascript in it (obfuscated in the way most of Google's JS is).

Worth noting, for clients without javascript, there's also a noscript in the head which directs them straight to the cache (seems stupid considering you need JS for amp.js to load?)
<noscript> <meta content="0;url=" http-equiv="refresh"> </noscript>

But, ultimately, Google's page just results in a page which has an iframe in it referencing the actual amp cache.

So, if we go to the iframe loads content from;amp_js_v=0.1#origin=;cid=1&amp;prerenderSize=1&amp;visibilityState=visible&amp;paddingTop=32&amp;history=1&amp;p2r=0&amp;horizontalScrolling=0&amp;csi=1&amp;storage=1&amp;viewerUrl=;cap=navigateTo,fragment,handshakepoll,cid,replaceUrl,fullReplaceHistory

If you try and access that direct in a browser, you'll get a 404; you need to make sure that every request header is correct
ben@milleniumfalcon:~$ curl ',fragment,handshakepoll,cid,replaceUrl,fullReplaceHistory' -H 'User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 12_1_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0 Mobile/15E148 Safari/604.1' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer:' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' -o /tmp/amp

At which point, you get our source document
ben@milleniumfalcon:~$ head /tmp/amp | wc -c

Neither of those pages properly declares itself as being AMP.

Now, if we try and detect the AMP hostname, that'll trigger from within the iframe, and so won't be able to redirect the browser.

The bit I still can't fully answer though, is why the user's Safari isn't triggering as the wrapper domain is (or, at least, should be) correct for the check that was added earlier in this issue. Need to look at that screenshot again, there must be something I'm missing.

Actually, thinking about it, the user was originally trying to adjust so they could use the Anti-amp script without the require directive - (seems it's not supported on iOS)

That directive was added because otherwise the anti-AMP protection wouldn't work on sites with moderately strict Content Security Policies. See here -

There is a complaint about CSP in the user's screenshot, and the bit of the path that's visible would suggest that it's on Google's wrapper page. The line number's not particularly helpful there, of course, as pretty much everything is on one line within the document. But, it seems reasonable to assume that Google wouldn't leave its own cache fucked by its CSP, so it's probably the result of injected code.

I should be able to test this to reproduce by creating a new user script which injects the code without using require and then seeing whether I get a similar line in the console. Wouldn't explain why it then works following a refresh, but we can come to that later.
Ok, tested by disabling the anti-amp hook and creating a new userscript with this content -

The result isn't actually quite as expected.

What I got was
Navigated to
eww vile... trying to redirect open_source_insider_google_amp_bad_bad_bad:5:9
Redirecting you to open_source_insider_google_amp_bad_bad_bad:13:17
Content Security Policy: The page's settings blocked the loading of a resource at inline ("script-src").
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Loading failed for the <script> with source "". open_source_insider_google_amp_bad_bad_bad:2:1
Content Security Policy: The page's settings blocked the loading of a resource at inline ("script-src").

There are, as expected, CSP violations, but they both actually originate from the document within the iframe rather than at Google's level. And actually, if we look at the response serving that wrapper, they're not serving a CSP.

So, given the CSP violation occurs within the iframe, it shouldn't stop the redirect from happening (the redirect is disabled in my test userscript and just calls console.log() instead - we can see it triggered).
Although, re-enabling that redirect, I get redirected to (as stated in that console message), but then don't get redirected onwards to the real canonical... oh, that's because I limited the test userscript to

Wait.... it's limited to so there's no way that it's responsible for the CSP violation on Disabled it, refreshed and the warning is still there. That's hilarious, especially as they've got report-uri defined, every single page view must be resulting in a report going back because the content they're serving violates their own CSP.

Anyway, means we can disregard the CSP related lines in the user's screenshot
OK, I think the long and short of it is, I'm going to need to find a Mac to try and repro further on - whatever the cause is isn't really visible (or maybe just not obvious) in the provided info.

To try and get the user up and running though, I'll create a minor release with an iframe check in it. Won't update the hook to refer to it though as the performance overhead will likely be quite high (as we'll need to do a full DOM scan).
So that it doesn't get lost in commit histories, there's a copy of my test script with the iframe changes here -

Repo: RemoveAMP
Commit: 51400177ebdc778364be15fc06f2cc3b6c3629e3
Author: B Tasker <github@<Domain Hidden>>

Date: Sun May 12 10:38:49 2019 +0100
Commit Message: MISC-29 Add slightly snarky and very temporary iframe detection to try and work around AMP detection issues on Safari when hitting Google's cache (see #2)

This involves doing a scan of the DOM, so might be quite expensive at times. The aim is to try and replace this once I've laid hands on a Mac to be able to repro the issue and troubleshoot it.
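
For reference, a scan along those lines might look something like the sketch below. It is not the committed code: findAmpFrameSrc is a hypothetical name and the hostname list is passed in, since which hosts to match is an assumption here.

```javascript
// Return the src of the first iframe served from one of `hosts`, else null.
// `doc` only needs getElementsByTagName(), as a browser document has.
function findAmpFrameSrc(doc, hosts) {
    var frames = doc.getElementsByTagName('iframe');
    for (var i = 0; i < frames.length; i++) {
        var src = frames[i].getAttribute('src');
        if (!src) {
            continue; // frame without a src, nothing to inspect
        }
        var host;
        try {
            host = new URL(src).hostname;
        } catch (e) {
            continue; // relative or invalid URL, skip it
        }
        if (hosts.indexOf(host) !== -1) {
            return src;
        }
    }
    return null;
}
```

The cost is in walking every iframe in the document on each run, which is why it's only worth doing where the cheaper checks have already failed.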


Also asking the user to edit their hook script to revert this commit -

He's noted that if he adjusts the hook script to run fuckOffAMP every 10 seconds, the redirect does happen (he just has to wait a short while - presumably 10s). So there's a theory that maybe the page load isn't being considered a new page. We do rely on the onload event listener (since that commit) so it's certainly a possibility.
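
If that theory holds, one workaround is to stop relying on load events and instead poll for the address changing. A minimal sketch, assuming the check is re-run whenever the URL differs from the last one seen; makeUrlWatcher is a hypothetical helper and the way fuckOffAMP gets called is an assumption.

```javascript
// Build a function that reports when the value returned by getHref() changes.
// Returns true (and calls onChange) only when a change was seen.
function makeUrlWatcher(getHref, onChange) {
    var last = getHref();
    return function check() {
        var now = getHref();
        if (now !== last) {
            last = now;
            onChange(now);
            return true;
        }
        return false;
    };
}

// In a userscript this might be wired up as (assumed usage):
//   var check = makeUrlWatcher(
//       function () { return window.location.href; },
//       function () { fuckOffAMP(); }
//   );
//   setInterval(check, 1000);
```

Polling is crude, but it doesn't depend on knowing how the page changes the address.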
btasker added 'Google AMP' to Version
btasker added 'Google AMP' to Fix Version
Ok, I've laid hands on a Mac. The only downside is now I have to remember how to use the thing.

Steps to repro (hopefully)

- Be using Safari
- Install Tampermonkey ( will take you to the relevant safariextensions page)
- Safari Menu -> Preferences -> Advanced -> Tick Show Develop menu in menu bar
- Go to
- When tampermonkey prompts, press Install
- Develop menu, Choose Safari -- IOS 11.3 -- iPhone
- Google "The Register Amp Bad"
- Results should have the Register article at the top with a little Amp icon
- Develop -> Show Web Inspector
- Click Network, tick Preserve Log
- Click the link to the El Reg article
- Redirect doesn't trigger
Console looks exactly like the one the user provided before. We can see that fuckOffAMP has run against something and decided it's not AMP.
Well, I had exported a HAR in the hope that I could look through it on hardware that was less... yeah... but looks like I'm stuck on the Mac as Chrome on Linux, and various online HAR viewers claim Safari's chucking out an invalid date format. I don't much fancy trying to patch various incompatibilities in the file...
OK, first thing to note is Safari Developer Tools still labels the window "Web Inspector - - search" which does support the theory we've not actually loaded a new page.

Although the address in Safari's address bar is we never actually see a request for the path /amp/s/ in the Network tab.

Looking through the HAR with grep and less supports this.
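
The same check can be scripted rather than done by eye. A HAR is JSON with the requests under log.entries, each carrying request.url, so a small sketch (harRequestsMatching is a hypothetical helper) can list any requested URLs containing a given path fragment:

```javascript
// List the request URLs in a HAR export that contain `pathFragment`.
// `harText` is the raw file content as a string.
function harRequestsMatching(harText, pathFragment) {
    var har = JSON.parse(harText);
    var entries = (har.log && har.log.entries) || [];
    return entries
        .map(function (entry) { return entry.request.url; })
        .filter(function (url) { return url.indexOf(pathFragment) !== -1; });
}
```

Run under Node against the exported file (read with fs.readFileSync) and '/amp/s/' as the fragment, an empty array would back up the observation that the path is never actually requested.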

So, it seems Google aren't reloading the page, they're simply rewriting it with JS and using JS to update the address bar (AMP and address bar tampering, two things I loathe at once.... fun).
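
Presumably the address bar update goes through the History API, though that's an assumption as I've not traced it in their JS. If so, a sketch like the following could wrap pushState and replaceState so a callback fires whenever the address is rewritten without a page load. hookHistory is a hypothetical helper, and depending on how the userscript manager sandboxes scripts the wrapper may not see calls made by the page itself.

```javascript
// Wrap pushState/replaceState on `historyObj` so `onNavigate(name, url)`
// is called after each successful call to either method.
function hookHistory(historyObj, onNavigate) {
    ['pushState', 'replaceState'].forEach(function (name) {
        var original = historyObj[name];
        if (typeof original !== 'function') {
            return; // method not available, nothing to wrap
        }
        historyObj[name] = function (state, title, url) {
            // Call through first so the address has changed before we react
            var result = original.apply(this, arguments);
            onNavigate(name, url);
            return result;
        };
    });
}

// Assumed usage inside a page:
//   hookHistory(window.history, function () { fuckOffAMP(); });
```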
OK, so looking back at the search result itself, in Web inspector we can see that the result has the following HTML
class="C8nzq BmP5tf amp_r" 

    <h2 class="bNg8Rb">Web results</h2>
    <div aria-level="3" role="heading" class="MUxGbd v0nnCb">
        Kill Google AMP before it kills the web • The Register
    <div class="zbELhe MUxGbd lyLwlc aLF0Z">
        <span jsname="zYLzN" class="ZseVEf OC0qVb" aria-label="AMP logo"></span>
        <span class="qzEoUe"> › o...</span>

I've not bothered digging through Google's JS to try and find what reads what, but it does seem that if we tamper with their attributes it won't use Google's internal page and instead appears to go straight to, which is enough for the anti-amp code to trigger and redirect us.

I just want to take a quick look though, and see whether we can disrupt even that and get taken direct
OK, pasting this into Safari's JS console leads to us going direct to the El Reg page rather than anything Ampy
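var da;
// Attributes Google's JS appears to key off
var bads = ['data-amp','data-amp-cur','data-amp-title','data-amp-vgi','ping'];
var eles = document.getElementsByClassName('amp_r');
for (var i=0; i<eles.length; i++){
    // Only interested in links
    if (eles[i].tagName.toLowerCase() != 'a'){
        continue;
    }
    da = eles[i].getAttribute('data-amp-cur');
    if (! da){
        continue;
    }
    // Point the link at the URL in data-amp-cur, then purge the AMP attributes
    eles[i].href = da;
    for (var n=0; n<bads.length; n++){
        eles[i].removeAttribute(bads[n]);
    }
}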

The challenge, of course, will be working out how best to trigger it. My inclination to begin with, though, is just to create a new Greasemonkey script for it, limited to the Google pages and try that

Repo: RemoveAMP
Commit: 91706cc3aae1904c7f19ae30c638df6ae8a846a6
Author: Ben Tasker <btasker@<Domain Hidden>>

Date: Wed May 15 15:26:27 2019 +0100
Commit Message: MISC-29 Create new greasemonkey script

This basically just dumps the script created in that issue into a greasemonkey script - may very well not work (just easier to transfer to the Mac by putting into the repo).

In theory it should prevent Google using their own Amp caches (without a page reload) or even sending the user to an Amp page in the first place. Liable to be a bit fragile though...


That was very close, but not quite.

The attributes correctly get purged, and the href updated. Unfortunately, something runs afterwards which results in &ampcf=1 being appended to the URL (so the far end returns a 404). We could do something nasty and hacky and try to ensure that always ends up in the querystring.

Fuck you Google. It doesn't look like it gets appended until you actually click the link. I guess hacky will have to do for now.

Repo: RemoveAMP
Commit: 177f3b41a6250812113d548ebd832ebca862006b
Author: Ben Tasker <btasker@<Domain Hidden>>

Date: Wed May 15 15:41:00 2019 +0100
Commit Message: MISC-29 Hacky hack fix to work around Google's behaviour

On an AMP compatible device, when the user clicks a link in Google's search results, they'll append &ampcf=1 to the URL (presumably because they expect it to go via their ping page instead of direct).

This makes sure that ends up in the querystring rather than the url path.

Longer term, might be better to rewrite the ping url rather than just overriding href though
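
One way that could be done (a sketch of the idea, not necessarily what the commit does) is to make sure the href already ends with a query string marker before Google's click handler appends to it. guardAgainstAppendedParams is a hypothetical helper and the example.com URL is made up.

```javascript
// Ensure anything later appended as "&name=value" cannot end up in the path.
function guardAgainstAppendedParams(href) {
    // With a query string or fragment present, appended text lands there already
    if (href.indexOf('?') !== -1 || href.indexOf('#') !== -1) {
        return href;
    }
    // Otherwise open an (empty) query string for the appended text to fall into
    return href + '?';
}

// Simulate the append that happens on click
var clicked = guardAgainstAppendedParams('https://example.com/2017/05/19/story/') + '&ampcf=1';
console.log(new URL(clicked).pathname);
```

The origin site still receives an unexpected ampcf argument, but the path it's asked for is the real one.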


OK, that elicited the desired behaviour.

The problem I have is that it may prove to be quite fragile. We're already reliant on Google not changing the class name amp_r (although it seems unlikely they would), and I'd rather not end up sending random query string arguments to other people's sites. Most of the time you'd expect that ampcf would probably just get ignored, but it only takes one site to be using it in some way and we've potentially broken the user's browsing.

Will take a quick look at how their ping URLs work to see if we can use that instead. I've a feeling though, that if ampcf=1 is appended to a call to them they'll redirect to an AMP cache rather than to the site itself
Looks like we can't just rewrite ping anyway
ben@thor:~/repos/RemoveAMP$ curl ";source=web&amp;rct=j&amp;url=;ved=2ahUKEwjdkI2P053iAhXYShUIHY3_DQEQFjAAegQIBRAB&amp;psig=AOvVaw1iuOAsoWZI4urLlVg9q4Vn&amp;ust=1558013609587974&amp;ampcf=1"
<html lang="en-GB"><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Redirect Notice</title><style>body,div,a{font-family:arial,sans-serif}body{background-color:#fff;margin-top:3px}div{color:#000}a:link{color:#00c}a:visited{color:#551a8b}a:active{color:red}div.mymGo{border-top:1px solid #bbb;border-bottom:1px solid #bbb;background:#f2f2f2;margin-top:1em;width:100%}div.aXgaGb{padding:0.5em 0;margin-left:10px}div.fTk7vd{margin-left:35px;margin-top:35px}</style><script nonce="rhgRoKjoLDXeFAPz09F1AA==">function go_back(){window.history.go(-1);return false;}

function ctu(oi,ct){var link = document && document.referrer;var esc_link = "";var e = window && window.encodeURIComponent ?encodeURIComponent :escape;if (link){esc_link = e(link);}
new Image().src = "/url?sa=T&url=" + esc_link + "&oi=" + e(oi)+ "&ct=" + e(ct);return false;}
</script></head><body><div class="mymGo"><div class="aXgaGb"><font style="font-size:larger"><b>Redirect Notice</b></font></div></div><div class="fTk7vd">&nbsp;The page you were on is trying to send you to an invalid URL.<br><br>&nbsp;If you do not want to visit that page, you can <a href="#" data-ct="originlink" data-oi="unauthorizedredirect" onclick="return go_back();" onmousedown="ctu(this.getAttribute('data-oi'),this.getAttribute('data-ct'));">return to the previous page</a>.<br><br><br></div></body></html>

Makes sense really, otherwise they'd have people bouncing victims off their redirect page onto other locations.

OK, will leave as is for now
Attaching the HAR for posterity's sake
btasker added 'dodgy_amp.har' to Attachments

Repo: RemoveAMP
Commit: 217d5e8ff8150eb1988d2dbb4837f0f65a4df696
Author: Ben Tasker <btasker@<Domain Hidden>>

Date: Wed May 15 16:20:14 2019 +0100
Commit Message: Revert "MISC-29 Add slightly snarky and very temporary iframe detection to try and work around AMP detection issues on Safari when hitting Google's cache (see #2)"

This reverts commit 51400177ebdc778364be15fc06f2cc3b6c3629e3.

Further testing has identified the cause of the issues in Safari/Google Search, and this detection won't help with that.

Might be useful in the long run, but as it was thrown together quite quickly I don't feel it's been adequately tested so should be removed for now


OK, user has confirmed this has fixed the issue for them.

I've rolled this as v1.4.1 -

Marking as fixed
btasker changed status from 'Open' to 'Resolved'
btasker added 'Fixed' to resolution
btasker changed status from 'Resolved' to 'Closed'
btasker changed Project from 'Miscellaneous' to 'Anti-AMP Scripts'
btasker changed Key from 'MISC-29' to 'FKAMP-2'
btasker added 'V1.1' to Version
btasker added 'v1.2' to Version
btasker added 'v1.3' to Version
btasker removed 'Google AMP' from Version
btasker added 'v1.4' to Fix Version
btasker added 'v1.4.1a' to Fix Version
btasker removed 'Google AMP' from Fix Version
I've created a new JIRA project to track development of these scripts, so

- MISC-25 becomes FKAMP-1
- MISC-29 becomes FKAMP-2
- MISC-31 becomes FKAMP-3

btasker removed 'Fixed' from resolution
btasker changed status from 'Closed' to 'Reopened'
btasker added 'v1.4.1' to Fix Version
btasker changed status from 'Reopened' to 'Resolved'
btasker added 'Fixed' to resolution
btasker changed status from 'Resolved' to 'Closed'