Automating Youtube Archival
Google has been known to terminate entire Youtube channels without notice in the hope of staying advertiser-friendly. 58 million “problematic” videos were deleted from the platform in Q3 2018 1. This week we are exploring various Youtube archival solutions that utilizes youtube-dl, a command-line program that extracts and download videos from web pages.
Install youtube-dl via pip to ensure that you have the latest version. Google often pushes updates that breaks youtube-dl so you always want to stay up to date. You might be able to install it via your distibution’s package manager but it’s often several versions behind.
pip install --upgrade youtube-dl youtube-dl --version
You can now use the youtube-dl command to download a single video, a playlist or a channel:
youtube-dl https://www.youtube.com/watch?v=XXXXXXXXXX youtube-dl https://www.youtube.com/channel/XXXXXXXXXX
You can use the same command to download videos from various websites.Youtube-dl is also a Python library that you can use in scripts:
Command line arguments & Scripting
This document is not a replacement for youtube-dl’s documentation, you can find the updated list of command-line arguments on its Github page.
Below is a bash script that will download everything specified in
list.txt, a text file
with a youtube channel or video on each line. The script will save the URLs of the videos
in archive.txt as it downloads them to speedup the subsequent executions.
--write-thumbnail options ensures that we also download
the video metadata such as the description and the title.
You can decide to run the script periodically or to schedule it with cron:
# Download new videos every 2 hours 0 */2 * * * /mnt/Archive/Youtube/archive_yt.sh
While a cron job downloads all videos uploaded by a channel (if the uploader does not delete the video between executions), it does not handle live streams. Youtube-dl allows you to download live streams with the same command but you obviously have to start the execution during the stream. Below is a different approach that takes advantage of the Youtube email notification feature. This simple Python script reads your last 3 emails and searches for a Youtube link in the email body. It will immediately start downloading the video using the youtube-dl Python library. You can use this method to download uploaded videos as well as live streams. If you are using a Gmail account, you will need to generate an App Password to allow the script to login.
Automatically upload to rclone remote
To take advantage of cloud storage, you can setup ytdlrc to automatically move videos to an rclone remote as they are downloaded. This simple script is completely interchangeable with youtube-dl and can be set up on a machine with low disk space. The script uses your existing youtube-dl and rclone configuration and is ideal for setting up automatic Youtube archival on a cheap VPS.
If you wish to save a video’s metadata without downloading the actual video, there are command-line utilities dedicated to this task.