

This git-annex repository contains 100k files, the entire collections "internetarchivebooks" and "usenethistorical". It is the first part of the IA that we want to back up. If we succeed, we will have backed up 1/1770th of the Internet Archive.
#Git annex download registration
Write the client registration interface, which generates the client's ssh private key and git-annex UUID and sends them to the client (done).

Get the clients to upload it to our server. Tell git-annex the content is no longer in the IA.

Get a list of files, checksums, and urls.
#Git annex download install
Requirements:

- sane UNIX environment (shell, df, perl, grep)
- crontab OR systemd (NOTE: you may need to run loginctl enable-linger to make sure the job is not killed)
- shuf (optional - will randomize the order you download files in)

The installer will walk you through setup and starting to download files, and install a cron job (or a systemd .timer unit) to perform periodic maintenance. It should prompt you for how much disk space to not use. To adjust this value later, use git config annex.diskreserve 200GB in all of the IA.BAK/shard* directories.

Configuration and maintenance information can be found in the README.md file.
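
Changing annex.diskreserve in every IA.BAK/shard* directory by hand is repetitive, so the adjustment described above could be wrapped in a small loop. This is only a sketch: the helper name set_diskreserve is hypothetical and not part of IA.BAK, and it assumes the shard checkouts sit directly under one base directory.

```shell
# Sketch: apply one annex.diskreserve value to every shard checkout.
# set_diskreserve is a hypothetical helper, not something IA.BAK ships.
set_diskreserve() {
    reserve=$1            # amount of disk space to leave free, e.g. 200GB
    base=${2:-IA.BAK}     # directory holding the shard* checkouts (assumed layout)
    for shard in "$base"/shard*; do
        # When no shard directories exist the glob stays literal; skip it.
        [ -d "$shard" ] || continue
        # Stop at the first shard where git config fails.
        git -C "$shard" config annex.diskreserve "$reserve" || return 1
    done
}

# Usage: set_diskreserve 200GB IA.BAK
```

Running it again with a different value overwrites the earlier setting in each shard, which matches running git config by hand in each directory.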
