Wiki: Mirroring only some projects/Websites / Gitlab Issue Listing Script



Issue websites/Gitlab-Issue-Listing-Script#13 introduced the ability to limit which projects GILS would display pages for.

This functionality means that (as of v0.4) it's possible to run a GILS mirror that only contains a subset of your projects. The intent when implementing the functionality was to allow a mirror of some of my projects onto projects.bentasker.co.uk (where you are probably reading this page now).


Caveats

The functionality only prevents rendering of pages.

It does nothing to filter commit notifications in issues in other projects.

So, if you've only allowed project foo/bar, but have made a commit in project stash/supersecret with the message:

update secret stash with stuff from foo/bar#1

Then the page for issue #1 in foo/bar will disclose that you have a project called stash/supersecret. Users won't be able to view anything in that project, but they'll know that it exists on your server.
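
One way to catch this kind of disclosure before publishing is to search the generated pages for the names of projects that should stay private. The following is a minimal sketch, assuming you keep your own list of private project paths and have a local copy of the pages to search. find_project_leaks is a hypothetical helper and is not part of GILS.

```shell
#!/bin/bash
# Sketch: report files in a local copy of the pages that mention
# projects which should stay private.
# find_project_leaks is a hypothetical helper, not part of GILS.
#
# Usage: find_project_leaks <mirror_dir> <private_project> [<private_project> ...]
# Returns 0 if at least one mention was found, 1 otherwise.
find_project_leaks() {
    local mirror_dir="$1"
    shift

    local status=1
    local project
    local matches

    for project in "$@"
    do
        # -r: recurse, -l: list matching files only, -F: fixed string, not a regex
        matches=$(grep -rlF -- "$project" "$mirror_dir")
        if [ -n "$matches" ]
        then
            echo "Mentions of $project:"
            echo "$matches"
            status=0
        fi
    done

    return "$status"
}
```

For example, find_project_leaks gils_projects stash/supersecret would list any mirrored page that names stash/supersecret, so the mirror can be reviewed before it is uploaded.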


Configuration

You can adjust the configuration to suit. The main thing is to ensure that the config variables $enforce_explicit_limits and $permitted_projects are set appropriately:

<?php

ini_set('display_errors', 0);

class GILSConfig {

    // My gitlab server
    public $server = "https://gitlab.server.lan";
    public $access_token = "myrealtokenisdifferent";
    public $force_https = true;

    // This could be set to true as an additional safety net
    public $public_projects_only = false;

    // Only display those we list below
    public $enforce_explicit_limits = true;

    // Only display these projects
    public $permitted_projects = array(
        "jira-projects/FKAMP",
        "jira-projects/MISC",
        "jira-projects/ADBLK",
        "jira-projects/HLS",
        "misc/docker-gitphp",
        "websites/Gitlab-Issue-Listing-Script",
        "websites/privacy-sensitive-analytics",
        "websites/videos.bentasker.co.uk"
    );

    // embed the following into <head>
    public $include_head = array(
        "scripts" => array("https://pfanalytics.bentasker.co.uk/agent.js"),
        "stylesheets" => array()
    );

    // Redis config
    public $redis_enabled = true;
    public $redis_host = 'myredis.server.lan';
    public $redis_port = 6379;
    public $redis_pass = 'asuperstrongpassword';
    public $redis_ttl = 300;


    // Misc other bits (these are the defaults)
    public $excluded_groups_from_homepage = array();
    public $enhanced_commit_notifications = true;
    public $commit_mention_text = "mentioned in commit";
}

This tells GILS to display only the projects listed in $permitted_projects.
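
Before deploying, it can be useful to double-check exactly which projects a config file will expose. The following is a rough sketch that pulls the entries out of $permitted_projects with sed and grep. It assumes the array is laid out as in the example above (one double-quoted group/project entry per line, closed by a line containing ");"). list_permitted_projects is a hypothetical helper, not something GILS provides.

```shell
#!/bin/bash
# Sketch: print the projects listed in $permitted_projects in a GILS config.php
# list_permitted_projects is a hypothetical helper, not part of GILS.
# Assumes the multi-line array layout shown in the example config.
#
# Usage: list_permitted_projects <path_to_config.php>
list_permitted_projects() {
    # Take the lines from the $permitted_projects assignment to the closing ");",
    # then keep only the double-quoted strings that contain a slash
    sed -n '/\$permitted_projects[[:space:]]*=/,/);/p' "$1" \
        | grep -o '"[^"]*/[^"]*"' \
        | tr -d '"'
}
```

Running list_permitted_projects config.php prints one project path per line, which is easy to review or compare against a previous version of the file.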


Running

The simplest way to stand up GILS is with Docker:

docker run -d \
--name=GILS-public \
--hostname=GILS-public \
-v "$PWD/config.php":/var/www/html/config/config.php \
-p 1280:80 \
--restart=unless-stopped \
bentasker12/gitlab-issue-listing-script:0.4

This will stand GILS up, listening on port 1280 and using the config.php in the current directory.
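
The container may take a moment to start answering requests, so anything that runs straight after it may want to wait first. The following is a minimal sketch that polls the same /status path that the mirror script further down this page checks for SUCCESS. wait_for_gils is a hypothetical helper, and the default retry count and delay are arbitrary.

```shell
#!/bin/bash
# Sketch: poll the GILS status endpoint until it reports SUCCESS.
# wait_for_gils is a hypothetical helper; the defaults are arbitrary.
#
# Usage: wait_for_gils <base_url> [<attempts>] [<delay_seconds>]
wait_for_gils() {
    local base_url="${1%/}"     # drop a trailing slash so we don't request //status
    local attempts="${2:-10}"
    local delay="${3:-3}"
    local i=1

    while [ "$i" -le "$attempts" ]
    do
        if wget -q -O - "$base_url/status" 2>/dev/null | grep -q SUCCESS
        then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done

    echo "GILS did not report SUCCESS after $attempts attempts" >&2
    return 1
}
```

For the container above, wait_for_gils http://localhost:1280 would return 0 once the status check passes, or 1 after ten failed attempts.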


Automating a mirror

Assuming you want to publish the mirror to a remote server, the most efficient way is to take a full local mirror and then rsync only the changes across:

#!/bin/bash
#
# Generate an offline copy of public GILS projects 
#
# Copyright (C) 2022 B Tasker
#
#
# Cron:
#
# 0 */4 * * * /mnt/work/projects.bentasker.co.uk/refresh_gils_mirror.sh


### Config

# Set the url to GILS here (no trailing slash, paths are appended below)
GILS="http://gils:1280"

# Add any additional wget opts here
ADDITIONAL_WGET_FLAGS=''

# Where are we mirroring into locally?
FILEDIR=/mnt/work/projects.bentasker.co.uk

# Where should we sync to?
SYNC_DEST=hiyori:/usr/share/nginx/static/gils/



### Onwards!

# Bail out if the working directory is missing, so the rm/mv below can't run elsewhere
cd "$FILEDIR" || exit 1

# Don't attempt to update if the system is unavailable
if ! wget -q -U "GILS-Project-Mirror" -O - "$GILS/status" | grep -q SUCCESS
then
        echo "Status check failed. Exiting"
        exit 1
fi

# Create a temporary location to operate in
rm -rf gils_projects.new
mkdir gils_projects.new
cd gils_projects.new || exit 1

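# Run the mirror
# -nH: no hostname directory, -r: recurse, -p: fetch page requisites, -k: convert links for local viewing
wget -R "robots.txt" -U "GILS-Project-Mirror" $ADDITIONAL_WGET_FLAGS -nH -r -p -k "$GILS/"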

# Switch this mirror with the previous one
cd ..
rm -rf gils_projects.old
# On the first run there is no previous mirror to move aside
[ -d gils_projects ] && mv gils_projects gils_projects.old
mv gils_projects.new gils_projects

# Rsync the data up to our webserver
TEMP_LOG=$(mktemp)
rsync -avcz -e ssh --log-file="$TEMP_LOG" --log-file-format="File-changed %f" gils_projects "$SYNC_DEST"

# Detect which files were changed
grep -o -P "File-changed [^\ ]+" "$TEMP_LOG" | cut -d' ' -f2 | while read -r changedfile
do
    # Do something
    # flush_cache "$changedfile"
    echo "Changed: $changedfile"
done

# Tidy the temporary log file away
rm -f "$TEMP_LOG"

This will mirror GILS output to a local directory and then upload changed files to a webserver. You can also have it act upon changed files (perhaps to flush a CDN cache, etc.).
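
As an illustration of acting on changed files, the following is a minimal sketch of what the commented-out flush_cache call in the script could do. It converts a logged path into a public URL and leaves the actual purge as a placeholder, since that depends entirely on your CDN. Everything here is an assumption for illustration: the helper names, the PUBLIC_URL variable, and the idea that logged paths start with gils_projects/ and map directly onto the site root.

```shell
#!/bin/bash
# Sketch: turn a changed file path from the rsync log into the URL to flush.
# changed_file_to_url, flush_cache and PUBLIC_URL are all hypothetical.

# Usage: changed_file_to_url <base_url> <changed_file>
changed_file_to_url() {
    local base_url="${1%/}"             # drop a trailing slash from the base URL
    local path="${2#gils_projects/}"    # assumption: logged paths start with gils_projects/
    echo "$base_url/$path"
}

# Usage: flush_cache <changed_file>
flush_cache() {
    local url
    url=$(changed_file_to_url "${PUBLIC_URL:-https://projects.bentasker.co.uk}" "$1")
    # Placeholder: replace this echo with whatever purge command your CDN provides
    echo "Would flush: $url"
}
```

With these defined above the loop, uncommenting flush_cache "$changedfile" would print the URL that corresponds to each changed file.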