There are currently two supported modes, depending on whether duplicate protection is enabled in Linkwarden or not. Neither is necessarily what we might want.
If, for some reason, I'm regularly linking out to https://example.com/somepage, I probably don't want 100s of copies of it to appear in Linkwarden (i.e. I want duplicate protection).
If, however, I link to it periodically (over the course of months or years), I might want duplicates because they can show how the destination has changed over time.
So, the idea is that (if this feature is enabled) the script should:

- Check whether https://example.com/somepage already exists in Linkwarden
- If it does, check whether now() - added is greater than a configured threshold
- If it is, add the link again

This would obviously rely on Linkwarden's duplicate protection being off (which is actually the default state).
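The decision logic above can be sketched as a small helper. This is a hypothetical illustration, not the script's actual code; the function name and the choice of expressing the threshold in days are both assumptions.

```python
from datetime import datetime, timedelta, timezone

def should_submit(existing_created_dates, threshold_days):
    """Decide whether a URL should be (re)submitted to Linkwarden.

    existing_created_dates: datetimes at which copies of the URL were
    previously added (empty if the URL is new to Linkwarden).
    threshold_days: hypothetical stand-in for the configured threshold.
    """
    if not existing_created_dates:
        # The URL doesn't exist yet, so submit it
        return True
    # Only duplicate if the newest existing copy is older than the threshold
    newest = max(existing_created_dates)
    return datetime.now(timezone.utc) - newest > timedelta(days=threshold_days)
```

With a 30-day threshold, a URL whose newest copy is 40 days old would be re-added, while one added 5 days ago would be skipped.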
Activity
13-Aug-24 12:11
assigned to @btasker
13-Aug-24 12:28
There's an API endpoint for searching, but the docs don't give any details on forming searches.
Doing one in the UI shows the request being made, and the date each copy was added can be extracted from the results.

If I disable duplicate protection and submit that link a second time, running the search gives me two items (and two dates).
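Pulling the dates out of a search response might look like the sketch below. The response shape (a "response" list whose items carry a "createdAt" timestamp) is an assumption based on what the UI request returned in testing, not documented schema.

```python
import json
from datetime import datetime

def extract_created_dates(raw_json):
    """Return the creation dates from a (assumed-shape) search response."""
    body = json.loads(raw_json)
    return [
        # fromisoformat doesn't accept a trailing "Z" on older Pythons,
        # so swap it for an explicit UTC offset first
        datetime.fromisoformat(item["createdAt"].replace("Z", "+00:00"))
        for item in body.get("response", [])
    ]
```

With duplicate protection off and the same link submitted twice, this would yield two dates, which is exactly what the threshold check needs to compare against.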
14-Aug-24 13:07
It occurs to me that, if we do this, we should also validate the collection (and tags) on anything that comes back.
If I'm using Linkwarden for other stuff and happen to drop a link in, I probably still want the script to preserve a copy, in case I clear out the stuff I added manually at some point.
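One way to sketch that validation: only count a returned link as one of the script's own if it sits in the expected collection and carries the expected tag. The field layout (nested collection/tags objects with name attributes) and the names used are assumptions for illustration.

```python
def is_script_owned(link, collection_name, required_tag):
    """Check whether a returned link looks like one this script added.

    link: a dict for one search result (assumed shape, see lead-in).
    """
    in_collection = link.get("collection", {}).get("name") == collection_name
    tags = {t.get("name") for t in link.get("tags", [])}
    return in_collection and required_tag in tags
```

A link dropped into Linkwarden manually (different collection, no tag) would then fail this check and not suppress the script's own submission.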
16-Aug-24 07:32
There's an unexpected (but logical) caveat with this.
Whilst doing some test runs, I found that some URLs seemed to be ignoring the periodic check as a result of returning no results.
For example
Sure enough, searching in Linkwarden for those URLs doesn't return anything.
But, if I let the script submit, Linkwarden's duplicate prevention kicks in.
The reason is that Linkwarden is stripping the trailing slash from URLs when storing them; searching for https://nearlylegal.co.uk/2024/08/its-not-me-its-you-breaking-up-with-twitter (no trailing slash) instead gets results.

I thought it might be a result of sites serving a redirect, but no, the originals have a trailing slash.
Turns out this was spotted about 20 hours ago and fixed upstream.
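Before the upstream fix, a client-side workaround would have been to mirror the stripping when building the search term, so the search matches what Linkwarden actually stored. A minimal sketch (function name is hypothetical):

```python
def search_form(url):
    """Return the URL as Linkwarden stores it (trailing slash removed)."""
    return url[:-1] if url.endswith("/") else url
```

Harmless belt-and-braces even after the fix, since a URL without a trailing slash passes through unchanged.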
18-Aug-24 17:09
Looks like the fix got released in Linkwarden 2.7.0.
Have just successfully tested against it.
18-Aug-24 17:10
mentioned in merge request !1
18-Aug-24 17:12
mentioned in commit ada2ce89d63980b5911b65e9d4e6ef40c83754a7
Message
feat: implement periodic link duplication support (utilities/auto-blog-link-preserver#21)
feat: integrate support for periodic link duplication
Note: this also adds a new stat - too_new - to record how many links were not submitted because Linkwarden already has a recent copy, and a new setting - PERIODIC_LINK_DUPLICATION_THRESHOLD - to control when links are duplicated.

18-Aug-24 17:12
mentioned in commit 66afc037f4df4410593345d1a1c85b8e76a4d386
Message
Merge branch 'feature-periodic-duplicates' into 'main'
feat: implement periodic link duplication support
Closes #21
See merge request utilities/auto-blog-link-preserver!1