utilities/snippets_cli_go#8: Allow searching of recipes too



Issue Information

Issue Type: issue
Status: closed
Reported By: btasker
Assigned To: btasker

Milestone: v0.1
Created: 27-Sep-24 13:36



Description

The extraction mechanism used in #6 has ended up being quite generic rather than being overly specialised towards snippets.

It searches a feed, then fetches content based on the link there.

So, we could quite conceivably make this support searching snippets, recipes and my blog based on the binary name in the command line (allowing us to do some busybox-like magic).




Activity


assigned to @btasker

verified

mentioned in commit 97444fd9c97cbbd09b436d0006cee4cf0a5292ef

Commit: 97444fd9c97cbbd09b436d0006cee4cf0a5292ef 
Author: B Tasker
Date: 2024-09-27T14:57:17.000+01:00 

Message

feat: search different sites based on the binary name (utilities/snippets_cli_go#8)

+47 -12 (59 lines changed)

The names introduced by ^ aren't set in stone, I just wanted to establish the structure.

Currently:

  • btcli: search www.bentasker.co.uk (note: feed only contains a limited number of items)
  • rbt_cli: search recipebook.bentasker.co.uk

We'll ultimately want to add one for snippets, but I can't do that until the site's migrated. In the meantime, though, support for the others means we can continue development.

mentioned in issue #1

The way that this works is relatively simple.

The utility looks up the command line that it was called with (allowing it to get the search terms as well as the name that it was called by).
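As a rough sketch of that step (not the code from the commit): the helper name splitInvocation is hypothetical, and joining the remaining arguments with spaces is an assumption about how the search terms are assembled.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// splitInvocation separates the name the binary was called by from the
// search terms that followed it. Hypothetical helper, for illustration only.
func splitInvocation(args []string) (name string, terms string) {
	if len(args) == 0 {
		return "", ""
	}
	// Strip any directory component so that "/usr/local/bin/rbt_cli" and
	// "./rbt_cli" both yield "rbt_cli".
	name = filepath.Base(args[0])
	// Assumption: everything after the program name is a search term.
	terms = strings.Join(args[1:], " ")
	return name, terms
}

func main() {
	name, terms := splitInvocation(os.Args)
	fmt.Printf("called as %q, searching for %q\n", name, terms)
}
```

Because only the base name is used, a symlink or copy of the same binary under a different name (rbt_cli, say) would be enough to switch behaviour, which is the busybox-style trick.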

At the top of the code is the definition of searchDestinations:

var searchDestinations = map[string]searchDestination{
    "snippets_cli" : defaultDest,
    "sbt_cli" : defaultDest,
    "btcli" : searchDestination{
        rss : "https://www.bentasker.co.uk/rss.xml",
        elemtype : "div",
        attrib : "itemprop",
        elemid : "articleBody text",
        parseTitle : false,
    },
    "rbt_cli" : searchDestination{
        rss : "https://recipebook.bentasker.co.uk/rss.xml",
        elemtype : "div",
        attrib : "class",
        elemid : "blog-post post-page",
        parseTitle : false,
    },
}

It looks for the program name in that map. If it finds it, it'll use those settings - otherwise it'll fall back to the default:

var defaultDest = searchDestination{
    rss : "https://snippets.bentasker.co.uk/rss.xml",
    elemtype : "article",
    attrib : "itemtype",
    elemid : "http://schema.org/SoftwareSourceCode",
    parseTitle : true,
    extraCol : "Language",
}
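The lookup-with-fallback can be sketched like this. The struct definition is inferred from the literals above (field types are assumptions), and destinationFor is a hypothetical name rather than a function from the repo.

```go
package main

import "fmt"

// searchDestination mirrors the fields used in the literals above; the
// field types are inferred from those values, not copied from the repo.
type searchDestination struct {
	rss        string
	elemtype   string
	attrib     string
	elemid     string
	parseTitle bool
	extraCol   string
}

var defaultDest = searchDestination{
	rss:        "https://snippets.bentasker.co.uk/rss.xml",
	elemtype:   "article",
	attrib:     "itemtype",
	elemid:     "http://schema.org/SoftwareSourceCode",
	parseTitle: true,
	extraCol:   "Language",
}

var searchDestinations = map[string]searchDestination{
	"snippets_cli": defaultDest,
	"sbt_cli":      defaultDest,
	"btcli": {
		rss:      "https://www.bentasker.co.uk/rss.xml",
		elemtype: "div",
		attrib:   "itemprop",
		elemid:   "articleBody text",
	},
	"rbt_cli": {
		rss:      "https://recipebook.bentasker.co.uk/rss.xml",
		elemtype: "div",
		attrib:   "class",
		elemid:   "blog-post post-page",
	},
}

// destinationFor returns the settings registered for the given program
// name, or defaultDest if the name is not in the map. Hypothetical helper.
func destinationFor(name string) searchDestination {
	if dest, ok := searchDestinations[name]; ok {
		return dest
	}
	return defaultDest
}

func main() {
	fmt.Println(destinationFor("rbt_cli").rss)
	fmt.Println(destinationFor("renamed_binary").rss)
}
```

Using the two-value form of the map lookup means an unrecognised name (a renamed binary, for instance) falls through to the snippets settings instead of yielding an empty searchDestination.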

The attributes parseTitle and extraCol were introduced in #12. If parseTitle is true and the page title (as listed in the RSS feed) ends in the form (something), it'll extract the text between those brackets and add an extra column to search results, using extraCol as the column title.
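A minimal sketch of that title parsing, assuming a regular expression is an acceptable way to do it (the approach used in #12 may differ). The names trailingParens and extraFromTitle, and the example titles, are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// trailingParens matches a title that ends in "(something)", allowing
// trailing whitespace, and captures the text between the brackets.
var trailingParens = regexp.MustCompile(`\(([^()]+)\)\s*$`)

// extraFromTitle returns the bracketed text at the end of title and true,
// or an empty string and false if the title does not end that way.
// Hypothetical helper, for illustration only.
func extraFromTitle(title string) (string, bool) {
	m := trailingParens.FindStringSubmatch(title)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Made-up titles, standing in for the titles listed in the RSS feed.
	titles := []string{
		"Reverse a string (Python)",
		"A title without a suffix",
	}
	for _, title := range titles {
		if extra, ok := extraFromTitle(title); ok {
			fmt.Printf("%s | Language: %s\n", title, extra)
		} else {
			fmt.Println(title)
		}
	}
}
```

Anchoring the pattern at the end of the string means brackets that appear mid-title are ignored, so only a trailing (something) populates the extra column.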