websites/Gitlab-Issue-Listing-Script#59: Mirroring is starting to encounter issues



Issue Information

Issue Type: issue
Status: opened
Reported By: btasker
Assigned To: btasker

Milestone: vnext
Created: 26-Dec-23 12:14


Labels: Bug

Description

Over the last few days, I've noticed that mirroring to projects.bentasker.co.uk has been having some issues.

Originally, the problem was that a project index (this one) wasn't updating - when I checked it manually, the GILS render was timing out (so, of course, wget wasn't able to fetch a new version).

Today, I've noticed that the project homepage is missing most projects.

Screenshot_20231226_121045

Curiously, the HTML is closed off properly (i.e. there's </body></html> etc.). Not sure whether that's wget fixing it during link conversion?

Hitting the GILS homepage directly works just fine.



Activity


assigned to @btasker

I suspect this'll turn out to be system load - the system that Gitlab is running on does occasionally struggle, what with Gitlab being incredibly RAM hungry.

But it would be better if it was handled properly - I'd rather a page not get synced at all than have a broken copy of it mirrored.
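A minimal sketch of what "not syncing a broken copy" could look like on the mirroring side. This is not the existing wget-based job: it is written in Python purely for illustration, and the function names, the link-count heuristic and the 0.5 threshold are all assumptions. Because the broken pages here still ended with </body></html>, the check compares the number of links against the previously mirrored copy rather than looking for closing tags.

```python
import os
import re
import tempfile
from typing import Optional

# Hypothetical heuristic: count anchor tags as a rough measure of page completeness.
LINK_RE = re.compile(r"<a\s[^>]*href=", re.IGNORECASE)


def count_links(html: str) -> int:
    """Return the number of <a ... href=...> tags found in the markup."""
    return len(LINK_RE.findall(html))


def looks_complete(new_html: str, old_html: Optional[str], min_ratio: float = 0.5) -> bool:
    """Reject empty pages and pages that lost most of their links.

    min_ratio is an assumed threshold: the new copy must keep at least
    that fraction of the links present in the previous copy.
    """
    if not new_html.strip():
        return False
    if old_html is None:
        return True  # nothing to compare against, accept the first copy
    old_links = count_links(old_html)
    if old_links == 0:
        return True
    return count_links(new_html) >= old_links * min_ratio


def publish_if_valid(dest_path: str, new_html: str) -> bool:
    """Replace dest_path with new_html only if it passes the check."""
    old_html = None
    if os.path.exists(dest_path):
        with open(dest_path, encoding="utf-8") as fh:
            old_html = fh.read()
    if not looks_complete(new_html, old_html):
        return False  # keep the previous (good) copy in place
    # Write to a temp file in the same directory, then swap it in atomically.
    directory = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(new_html)
    os.replace(tmp_path, dest_path)
    return True
```

The trade-off with a heuristic like this is that a page which legitimately shrinks by more than half would also be held back until someone intervenes, so the threshold would need tuning against real pages.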

In fact, looking at it, the index page for that project is also currently broken.

Screenshot_20231226_121714

To get things running again the other day, I had to purge the Redis cache.

Whilst we're looking into this, it might be prudent to disable the cache so that we're at least not caching broken stuff.

I have disabled Redis:

    public $redis_enabled = false;
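Longer term, an alternative to switching the cache off entirely would be to only store renders that completed successfully. The following is a rough sketch of that idea, not GILS's actual caching code: it is in Python for illustration, uses an in-memory dict as a stand-in for Redis, and `RenderError`, `GuardedCache` and `get_or_render` are hypothetical names.

```python
from typing import Callable, Dict, Optional


class RenderError(Exception):
    """Hypothetical error raised when a render fails or an upstream call times out."""


class GuardedCache:
    """Cache wrapper that never stores the result of a failed render."""

    def __init__(self) -> None:
        # In-memory stand-in for Redis, for the purposes of the sketch.
        self._store: Dict[str, str] = {}

    def get_or_render(self, key: str, render: Callable[[], str]) -> Optional[str]:
        cached = self._store.get(key)
        if cached is not None:
            return cached
        try:
            html = render()
        except RenderError:
            # Failed render: return nothing and leave the cache untouched.
            return None
        self._store[key] = html
        return html
```

This only helps if the render step can tell that it failed (for example, by raising when an API call times out) rather than quietly emitting a partial page.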

Pages are currently loading correctly.

Doing some clicking about today, things seem to have mirrored correctly overnight.

I disabled search-spider crawls in misc/Python_Web_Crawler#11, which will have lifted a reasonable chunk of load off the underlying system (although it won't be a panacea).

mentioned in issue misc/Python_Web_Crawler#12