I couldn't find any articles or videos about this kind of "archivist" activity (I use the word as a blend of "archiver" and "activist"); I only came across this script/software package by accident, in one place.
Why am I archiving (backing up) YouTube videos?
Backing up or archiving YouTube videos is important for several reasons.
First, a creator's videos or channel can be removed for violating YouTube's community guidelines or terms of service, sometimes by accident (for example YouTube poop makers, commentary channels, ranters, or tech channels like Enderman). A creator's computer can also be infected by malware that gets the YouTube channel hacked. One of the biggest video makers, Linus Tech Tips, has experienced this: the hijacked channel was used for an "Elon Musk gives away cryptocurrency via the Tesla channel" scam.
Another reason could be that creators are no longer interested in making content on the platform. Some creators also delete their accounts when they are not commercially viable, even though some viewers are still interested in their videos.
Et cetera.
What can you run this program on, and what are the requirements?
You can use it on any computer (mostly x86/x86-64; I haven't tried ARM). It works on a personal computer, but I recommend a VPS, a dedicated small computer such as a Raspberry Pi, or another method such as Google Colab.
I recommend a stable internet connection, ideally one with unmetered data traffic.
You may need to keep the computer running for more than a day. How long depends on the size of the channel or playlist, the available storage space, the speed of your internet connection, and the bandwidth capacity of the Archive.org servers.
The storage requirement depends on the size of the channel and/or playlist you want to archive on Archive.org. The developers tell you to use a system with at least 100 GB of storage space, but if you want to archive more than 1500 videos from one channel, for example, you may need up to 650 GB. The size depends not only on the number of videos but also on their length and resolution. yt-dlp, the downloader used by TubeUp, mostly downloads videos in WebM format (in some cases Matroska). The Internet Archive converts uploaded files to H.264-encoded video at a lower resolution than the original, but the original WebM (or MKV) files are preserved as downloadable files.
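As a rough planning aid, the example figures above (about 650 GB for roughly 1500 videos) work out to an average of about 433 MB per video. The sketch below scales that average to another channel size. The function name `estimate_storage_gb` and the per-video average are illustrative assumptions, not something TubeUp provides, and real sizes vary a lot with video length and resolution.

```shell
#!/bin/sh
# Rough storage estimate for an archiving run.
# Assumption: an average of 433 MB per video, derived from the
# "1500 videos ~ 650 GB" example (650000 MB / 1500 videos).
AVG_MB_PER_VIDEO=433

# estimate_storage_gb VIDEO_COUNT
# Prints the estimated size in whole GB (1 GB = 1000 MB), rounded up.
estimate_storage_gb() {
    count=$1
    total_mb=$((count * AVG_MB_PER_VIDEO))
    echo $(( (total_mb + 999) / 1000 ))
}
```

For example, `estimate_storage_gb 200` prints 87, which would still fit within the 100 GB minimum, while `estimate_storage_gb 1500` prints 650.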
First (if you are following a channel), save the videos in the playlist one by one. When a new video is released, and especially for the latest content, download and archive the videos one at a time; I will explain why later.
You need Linux and some knowledge of it, plus Python 3, pip, and Git. Mac users may be able to use this software the same way as on Linux: macOS is not a Linux system but a Unix system, and Linux works like Unix. If you use Windows, use at least Windows Subsystem for Linux version 2, or run Linux in a virtual machine.
If you use Docker, you only need a Docker-supported system and don't have to install the TubeUp components yourself (this does not apply to Google Colab).
I use Ubuntu (on a VPS, on a PC, and in Google Colab or similar Jupyter Notebook hosting), except when I use the Docker container; more on that later.
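Before installing anything, it can help to check which of the required tools are already present. The sketch below is one possible way to do that with the standard `command -v` lookup; the function name `missing_tools` is hypothetical, and the tool list in the usage note simply mirrors the requirements named in this article.

```shell
#!/bin/sh
# Pre-flight check: report which tools cannot be found on PATH.

# missing_tools TOOL...
# Prints the names of the missing tools, separated by spaces.
# Prints an empty line when everything was found.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            # Append with a separating space only if the list is non-empty.
            missing="${missing:+$missing }$tool"
        fi
    done
    echo "$missing"
}
```

A typical call would be `missing_tools python3 pip3 git ffmpeg`; any name it prints still has to be installed.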
If you want, please support me via thIs (fiat- and cryptocurrency), Donably, or Wishtender
Recommended VPS services
A cheap storage VPS is the most suitable server for archiving videos to Archive.org (it also works for a detached TubeUp Docker container). Choose one based on KVM or Proxmox and avoid OpenVZ VPS servers, because that technology does not support the newest Linux kernels and distros, which Docker and/or the newest Python runtime packages require. Some VPS servers support Ubuntu, but you can try this backup with any Linux distro.
InterServer: One Slice (1 CPU core; 2048 MB memory; 1 TB HDD storage; 2 TB transfer; Linux server with 10 Gbps speed) for $6. If you want, use one of these (mostly seasonal) coupons to get a 99% discount (SAVE99, CODE01, BUMPER99, 99OFFMONTH), plus a $3 fee (not skippable, not fully own domain). You can pay by card or crypto (via Coinbase Commerce). USA servers.
Contabo: Storage VPS S (2 vCPU cores; 4 GB RAM; 400 GB SSD storage; 200 Mbit/s port; Linux server) for €5.99 plus fees (on a 1-month contract the fee is €5.99, on a 12-month contract there are no fees; plus 0 to 27% VAT). There are no coupons, discounts are not always available, and you can only pay by card or bank transfer. European (Germany) server.
Free method (for backing up videos in small chunks, such as a single video or a playlist): Google Colab (variable disk size and more than enough RAM).
Some providers offer free credits or time-limited free trials for their VPS services, such as Google Cloud (I used it during the free period), DigitalOcean, the Oracle VPS free tier, and Akamai.
Limitations may apply, e.g. sign-up with proof-of-ID verification, and quotas.
How to install and start using this program (for everyone; non-Docker, attached terminal mode)
First, you need to sign up at Archive.org. After signing up, you may need to upload an item manually first, or a fresh account may not be able to upload anything right away, presumably because Archive.org wants to keep out spam from uploaders.
For all users: the Google Colab method. These commands work everywhere if you leave out the '!' prefix, which is only needed in .ipynb/Jupyter Notebook/Google Colab cells; outside Colab you don't need a Google account.
Sign in to your Google Account -> go to Google Drive -> click New in the upper left -> More -> Connect more apps -> find Colaboratory -> install Colaboratory -> go to New again -> More -> Google Colaboratory
You will land on colab.research.google.com
In the first 'Code' block, enter the following commands (if you are not using Google Colab, enter them one at a time and leave out the '!' prefix):
!sudo apt update && sudo apt upgrade -y
!sudo apt install -y ffmpeg python3-pip git
!sudo python3 -m pip install -U pip tubeup
!ia configure --username=youremail --password=yourpassword
!sudo python3 -m pip install -U tubeup pip
In Google Colab, the output looks like this:
Hit:1 http://archive.ubuntu.com/ubuntu focal InRelease
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:3 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Hit:4 https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/ InRelease
Get:5 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Hit:6 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease
Hit:7 http://ppa.launchpad.net/c2d4u.team/c2d4u4.0+/ubuntu focal InRelease
Hit:8 http://ppa.launchpad.net/cran/libgit2/ubuntu focal InRelease
Hit:9 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu focal InRelease
Hit:10 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu focal InRelease
Hit:11 http://ppa.launchpad.net/ubuntugis/ppa/ubuntu focal InRelease
Fetched 336 kB in 2s (216 kB/s)
Reading package lists… Done
Building dependency tree
Reading state information… Done
4 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists… Done
Building dependency tree
Reading state information… Done
Calculating upgrade… Done
The following packages have been kept back:
libcudnn8 libcudnn8-dev libnccl-dev libnccl2
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
Reading package lists… Done
Building dependency tree
Reading state information… Done
git is already the newest version (1:2.25.1-1ubuntu3.11).
ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).
python3-pip is already the newest version (20.0.2-5ubuntu1.9).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: pip in /usr/local/lib/python3.10/dist-packages (23.1.2)
Requirement already satisfied: tubeup in /usr/local/lib/python3.10/dist-packages (2023.5.29)
Requirement already satisfied: internetarchive==3.0.2 in /usr/local/lib/python3.10/dist-packages (from tubeup) (3.0.2)
Requirement already satisfied: urllib3==1.26.13 in /usr/local/lib/python3.10/dist-packages (from tubeup) (1.26.13)
Requirement already satisfied: docopt==0.6.2 in /usr/local/lib/python3.10/dist-packages (from tubeup) (0.6.2)
Requirement already satisfied: yt-dlp in /usr/local/lib/python3.10/dist-packages (from tubeup) (2023.6.22)
Requirement already satisfied: jsonpatch>=0.4 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (1.33)
Requirement already satisfied: requests<3.0.0,>=2.25.0 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (2.27.1)
Requirement already satisfied: schema>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (0.7.5)
Requirement already satisfied: tqdm>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (4.65.0)
Requirement already satisfied: mutagen in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (1.46.0)
Requirement already satisfied: pycryptodomex in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (3.18.0)
Requirement already satisfied: websockets in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (11.0.3)
Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (2023.5.7)
Requirement already satisfied: brotli in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (1.0.9)
Requirement already satisfied: jsonpointer>=1.9 in /usr/local/lib/python3.10/dist-packages (from jsonpatch>=0.4->internetarchive==3.0.2->tubeup) (2.4)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.25.0->internetarchive==3.0.2->tubeup) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.25.0->internetarchive==3.0.2->tubeup) (3.4)
Requirement already satisfied: contextlib2>=0.5.5 in /usr/local/lib/python3.10/dist-packages (from schema>=0.4.0->internetarchive==3.0.2->tubeup) (0.6.0.post1)
Config saved to: /root/.config/internetarchive/ia.ini
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: tubeup in /usr/local/lib/python3.10/dist-packages (2023.5.29)
Requirement already satisfied: pip in /usr/local/lib/python3.10/dist-packages (23.1.2)
Requirement already satisfied: internetarchive==3.0.2 in /usr/local/lib/python3.10/dist-packages (from tubeup) (3.0.2)
Requirement already satisfied: urllib3==1.26.13 in /usr/local/lib/python3.10/dist-packages (from tubeup) (1.26.13)
Requirement already satisfied: docopt==0.6.2 in /usr/local/lib/python3.10/dist-packages (from tubeup) (0.6.2)
Requirement already satisfied: yt-dlp in /usr/local/lib/python3.10/dist-packages (from tubeup) (2023.6.22)
Requirement already satisfied: jsonpatch>=0.4 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (1.33)
Requirement already satisfied: requests<3.0.0,>=2.25.0 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (2.27.1)
Requirement already satisfied: schema>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (0.7.5)
Requirement already satisfied: tqdm>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from internetarchive==3.0.2->tubeup) (4.65.0)
Requirement already satisfied: mutagen in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (1.46.0)
Requirement already satisfied: pycryptodomex in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (3.18.0)
Requirement already satisfied: websockets in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (11.0.3)
Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (2023.5.7)
Requirement already satisfied: brotli in /usr/local/lib/python3.10/dist-packages (from yt-dlp->tubeup) (1.0.9)
Requirement already satisfied: jsonpointer>=1.9 in /usr/local/lib/python3.10/dist-packages (from jsonpatch>=0.4->internetarchive==3.0.2->tubeup) (2.4)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.25.0->internetarchive==3.0.2->tubeup) (2.0.12)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3.0.0,>=2.25.0->internetarchive==3.0.2->tubeup) (3.4)
Requirement already satisfied: contextlib2>=0.5.5 in /usr/local/lib/python3.10/dist-packages (from schema>=0.4.0->internetarchive==3.0.2->tubeup) (0.6.0.post1)
TubeUp supports YouTube links (and not only YouTube): video links, playlists, and channel links with the /videos suffix, such as https://www.youtube.com/@AnthonyPadilla/videos. Other video sharing websites may need a different approach, because TubeUp is mostly optimized for archiving YouTube videos. Redirection links aren't supported.
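Since a channel link needs the /videos part, a small helper can add it when it is missing. This is only a sketch under the assumption that channel links use the https://www.youtube.com/@handle form shown above; the function name `channel_videos_url` is hypothetical, and every other kind of link is passed through unchanged.

```shell
#!/bin/sh
# channel_videos_url URL
# Bare channel handle link (optionally with a trailing slash):
#   print it with /videos appended.
# Channel link that already names a tab (e.g. /videos), or any other
# link (video, playlist): print it unchanged.
channel_videos_url() {
    url=$1
    case "$url" in
        https://www.youtube.com/@*/?*)
            echo "$url" ;;                 # already has a tab after the handle
        https://www.youtube.com/@?*)
            echo "${url%/}/videos" ;;      # bare handle: add /videos
        *)
            echo "$url" ;;                 # not a handle link: leave as-is
    esac
}
```

With this, `channel_videos_url "https://www.youtube.com/@AnthonyPadilla"` prints https://www.youtube.com/@AnthonyPadilla/videos, which can then be passed to tubeup.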
Simple command (in Google Colab, use the ! prefix in a separate 'Code' block):
tubeup {video|channel|playlist url}
Press Enter, wait, and don't close the terminal. If you want to close the terminal (this applies to a remote SSH terminal too) while your computer stays turned on, you need a detached mode, which is very difficult with this non-Docker method.
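A possible workaround, which is an assumption on my part and not something from the TubeUp documentation, is the standard `nohup` utility, which on most systems keeps a command running after the terminal closes. The sketch below wraps it in a hypothetical helper named `run_detached`; the log file name in the example is also just a placeholder.

```shell
#!/bin/sh
# run_detached LOGFILE COMMAND [ARGS...]
# Starts COMMAND in the background under nohup, so that it is not
# stopped by the hangup signal sent when the terminal closes.
# All output goes to LOGFILE; the process ID is stored in DETACHED_PID.
run_detached() {
    logfile=$1
    shift
    nohup "$@" >"$logfile" 2>&1 </dev/null &
    DETACHED_PID=$!
}

# Example (not run here; the URL is a placeholder):
#   run_detached tubeup.log tubeup "https://www.youtube.com/@{channel_tag}/videos"
#   tail -f tubeup.log    # follow the progress later
```

The computer still has to stay turned on, and this does not apply to Google Colab, where the notebook session itself has to stay open.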
Docker containerized method
If you haven't installed Docker yet, install it on a supported Linux system:
# To install the latest stable versions of Docker CLI, Docker Engine, and their
# dependencies:
#
# 1. download the script
#
#   $ curl -fsSL https://get.docker.com -o install-docker.sh
#
# 2. verify the script's content
#
#   $ cat install-docker.sh
#
# 3. run the script with --dry-run to verify the steps it executes
#
#   $ sh install-docker.sh --dry-run
#
# 4. run the script either as root, or using sudo to perform the installation.
#
#   $ sudo sh install-docker.sh
Once Docker is installed, you can run the command below. Unlike the previous non-Docker method, you have to provide your Internet Archive account's S3 API keys (access key and secret), which you can get on the Archive.org website while signed in to your account.
Run this command:
docker run -it --rm -e "S3ACCESS={{ S3 ACCESS KEY HERE }}" -e "S3SECRET={{ S3 SECRET KEY HERE }}" etnguyen03/tubeup {video|channel|playlist url}
If this doesn't run, you probably lack the required privileges; try running the command with the sudo prefix.
If you want to run it in detached mode, add -d, like this (with the sudo prefix):
sudo docker run -d -it --rm -e "S3ACCESS={{ S3 ACCESS KEY HERE }}" -e "S3SECRET={{ S3 SECRET KEY HERE }}" etnguyen03/tubeup {video|channel|playlist url}
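To avoid pasting the keys into every command, they can be kept in environment variables. The sketch below shows one way to do that. The variable names `IA_S3_ACCESS`, `IA_S3_SECRET`, and `DOCKER` and the function name `tubeup_docker` are all hypothetical; only the docker run options and the image name come from the command above.

```shell
#!/bin/sh
# tubeup_docker URL
# Runs the detached TubeUp container for URL, taking the Internet
# Archive S3 keys from IA_S3_ACCESS and IA_S3_SECRET (names assumed).
# Set DOCKER="sudo docker" if docker needs sudo on your system;
# the default is plain "docker".
tubeup_docker() {
    url=$1
    if [ -z "${IA_S3_ACCESS:-}" ] || [ -z "${IA_S3_SECRET:-}" ]; then
        echo "IA_S3_ACCESS and IA_S3_SECRET must be set" >&2
        return 1
    fi
    # DOCKER is left unquoted on purpose so "sudo docker" splits in two.
    ${DOCKER:-docker} run -d -it --rm \
        -e "S3ACCESS=$IA_S3_ACCESS" -e "S3SECRET=$IA_S3_SECRET" \
        etnguyen03/tubeup "$url"
}
```

After setting both variables once in the shell session, each archive job is just `tubeup_docker` followed by the link. Setting `DOCKER=echo` prints the arguments instead of starting a container, which is a cheap way to check the command first.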
Details about the Dockerized TubeUp are on GitHub.
Software feedback (in attached mode)
Video Title
There are no annotations to write.
[download] 100.0% of 187.50MiB at 6.91MiB/s ETA 00:00
Downloaded /root/.tubeup/downloads/videoid.f248.webm
[download] 100.0% of 25.53MiB at 4.51MiB/s ETA 00:00
Downloaded /root/.tubeup/downloads/videoid.f251.webm
uploading videoid.description: 100% 1/1 [00:00<00:00, 5.82MiB/s]
uploading videoid.webp: 100% 1/1 [00:00<00:00, 4.21MiB/s]
uploading videoid.info.json: 100% 1/1 [00:00<00:00, 4.09MiB/s]
uploading videoid.webm: 100% 214/214 [00:06<00:00, 32.83MiB/s]
:: Upload Finished. Item information:
Title: Video Title
Item URL: https://archive.org/details/youtube-videoid
Messages by TubeUp
":: Item already exists. Not downloading. Title: A {video title} Video URL: https://www.youtube.com/watch?v={video_id}". If this appears, the video will not be downloaded and uploaded again (because you or someone else has already uploaded it with TubeUp); TubeUp skips it and moves on to videos that have not yet been uploaded to archive.org.
"[youtube] {video_id}: nsig extraction failed: You may experience throttling for some formats
Install PhantomJS to workaround the issue. Please download it from https://phantomjs.org/download.html
n = Bprqdi-6PcRlIgQ-hd ; player = https://www.youtube.com/s/player/8c7583ff/player_ias.vflset/en_US/base.js"
If this appears: first, just be patient and wait at least 5-10 minutes. Second, replace your channel link with https://www.youtube.com/@{channel_tag}/videos, or with a regular full playlist link. If that doesn't work, you may need to install PhantomJS.
Reached Speeds
The all-time high download speed I reached from YouTube in Google Colab is 100 MB/s, but the average is 10-30 MB/s. Upload runs at 15-38 MB/s, but this can drop when the Archive.org servers are overloaded.
“the whole life is an experiment and maybe hard time-wasting thing and deadly with some vis maior, I can’t live it” -me
The featured image is AI-generated, created with Bing Image Creator, with logos added to represent this article (where possible).