This project is the follow-on from misc/Python_Web_Crawler#12.
I discontinued that project because I no longer had a need for full-text search.
What I do still need, though, is a way to identify where I stored a file, i.e. searching by filename, path, etc.
The aim of this project is to stand up a simple crawler and web portal that allows me to search for files by location.
I don't expect it to scale (in fact, I'm certain that it won't), but I'd like the initial implementation to function without relying on a traditional database. The focus should be on getting the crawler and information collection up and running.
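As a rough illustration of what "no traditional database" could look like, the sketch below appends one JSON object per crawled URL to a flat file and searches it with a linear scan. Everything in it is an assumption made for illustration: the record fields, the function names (`make_record`, `append_record`, `search`) and the JSON Lines file format are hypothetical, not a decided design.

```python
import json
from pathlib import Path
from urllib.parse import urlsplit


def make_record(url):
    """Split a crawled URL into the fields worth searching on (hypothetical schema)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return {
        "url": url,
        "domain": parts.netloc,
        "path": path,
        # Everything after the last slash; empty for directory-style URLs.
        "filename": path.rsplit("/", 1)[-1],
    }


def append_record(index_file, url):
    """Append one JSON object per line to a flat file; no database involved."""
    with Path(index_file).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(make_record(url)) + "\n")


def search(index_file, term, field="path"):
    """Linear scan: return records whose chosen field contains term (case-insensitive)."""
    term = term.lower()
    matches = []
    with Path(index_file).open(encoding="utf-8") as fh:
        for line in fh:
            record = json.loads(line)
            if term in record[field].lower():
                matches.append(record)
    return matches
```

Every search reads the whole file, which is exactly the part that won't scale, but it keeps the storage layer trivial while the crawler itself is being built. Calling `search("index.jsonl", "invoice", field="filename")` would return the stored records whose filename contains "invoice".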
The crawler should read a list of predefined domains from config and crawl pages on those domains. It should store
Nice to haves