
utilities/file_location_listing#6: Regex based skip rules



Issue Information

Issue Type: issue
Status: closed
Reported By: btasker
Assigned To: btasker

Milestone: v0.1
Created: 29-Dec-23 10:37



Description

The crawler already supports skipping URLs if they contain a substring defined in config/skipstrings.txt.

However, substring matching isn't always suitable: for example, you might only want to block a substring on one specific domain.

So, the crawler should also accept regular expressions to apply against URLs.
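As a rough illustration of the idea, the sketch below checks a URL against both substring rules and regex rules. It is not the crawler's actual code: the function names, the rule contents, and the assumption that rules arrive one per line are all hypothetical.

```python
import re


def parse_rule_lines(lines):
    """Strip whitespace and drop empty lines from an iterable of rule lines."""
    return [ln.strip() for ln in lines if ln.strip()]


def compile_skip_regexes(patterns):
    """Compile each pattern once so it can be reused for every URL."""
    return [re.compile(p) for p in patterns]


def should_skip(url, skip_strings, skip_regexes):
    """Return True if the URL matches any substring rule or any regex rule."""
    if any(s in url for s in skip_strings):
        return True
    return any(r.search(url) is not None for r in skip_regexes)


# Hypothetical rules, as they might appear one per line in a config file
skip_strings = parse_rule_lines(["/logout", ""])
skip_regexes = compile_skip_regexes(parse_rule_lines([
    # Only skip "/print/" paths on one specific domain
    r"^https?://example\.com/.*/print/",
]))

print(should_skip("https://example.com/docs/print/page.html", skip_strings, skip_regexes))
print(should_skip("https://example.org/docs/print/page.html", skip_strings, skip_regexes))
```

Anchoring the pattern to the scheme and host is what allows a rule to apply to a single domain, which a plain substring check cannot express.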




Activity


assigned to @btasker

verified

mentioned in commit 5d0437abe2ff5df1f9943949f8951c871c9fb142

Commit: 5d0437abe2ff5df1f9943949f8951c871c9fb142
Author: B Tasker
Date: 2023-12-29T10:35:13.000+00:00

Message

feat: add support for regex based skip rules (utilities/file_location_listing#6)

+47 -9 (56 lines changed)