[Image: a download session which is explained below.]
Published: 2022-12-09
Introduction
In this newsletter's series on Tor (of four interlinked parts) I declared that, by using tools like youtube-dl (now improved as yt-dlp) and Tor, one gets closer to the battle lines for information availability.
This short article will attempt to graphically display some of these challenges.
Before we continue, a recently published article was extremely limited in its notifications. I have no idea why. The article considers why it is that our leadership are placing so much effort on issues which are not species threatening and very little on those that are.
In the Trenches
One may ask “why be concerned about information control?” Let’s just call it censorship. As also mentioned in the Tor series, censorship is a product of surveillance. One cannot censor that which one does not know exists. Couple that with the old adage that "information is power". Thus, censorship is a form of inhibiting the ability of a population to know what is actually occurring or to consider wider issues. How do the politics, economics or philosophy of current events impact me and how can I respond?
At the end of the Tor series I committed to publicly release the toolset which I developed and use to reduce what commercial organisations like Alphabet/Google/Youtube know about what I read or watch. This has been done. The key tool is "yt-dl". This is not yt-dlp, though it uses it. This is a quizzical and serendipitous confluence of naming.
The toolset is The Torrible Toolset (or 3T). It's for GNU/Linux systems and you are most welcome to use it. The current version is 0.3. No bugs have been reported during the many months since its initial release at 0.1. Later releases are motivated by improvements: the use of yt-dlp instead of youtube-dl and, latterly, improvements for downloading from Rumble. See the release notes or the repository (git) history.
Here is what the download of a video published on Youtube looks like when using yt-dl (and its core tool, yt-dlp).
To many, the above will look like gobbledygook. It is, however, informative. Let me decode it.
At the top is the command being issued (green arrow, top). yt-dl, please download this URL (a video published on Youtube). yt-dl checks if Tor is running, and if so uses that transport. It also inserts the default to limit the graphical, and thus also the data, size of the video to be "medium". yt-dl then informs the user of the exact command that it is issuing to yt-dlp.
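The steps just described can be sketched in Python. This is not the 3T source. The SOCKS address 127.0.0.1:9050 (Tor's usual default), the helper names and the reading of "medium" as a 480 pixel height cap are all assumptions made for illustration. The options --proxy and -f are real yt-dlp options.

```python
import shlex
import socket

TOR_HOST = "127.0.0.1"  # assumption: Tor's SOCKS listener runs locally
TOR_PORT = 9050         # assumption: Tor's usual default SOCKS port


def tor_is_listening(host=TOR_HOST, port=TOR_PORT, timeout=1.0):
    """Return True if something accepts TCP connections on the SOCKS port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def build_command(url, use_tor, max_height=480):
    """Assemble a yt-dlp argument list (hypothetical helper, not 3T code)."""
    # Assumption: "medium" is modelled here as a cap on the video height.
    cmd = ["yt-dlp", "-f", f"best[height<={max_height}]"]
    if use_tor:
        cmd += ["--proxy", f"socks5://{TOR_HOST}:{TOR_PORT}"]
    cmd.append(url)
    return cmd


url = "https://www.youtube.com/watch?v=VIDEO_ID"  # placeholder, not a real video
cmd = build_command(url, use_tor=tor_is_listening())
# Show the user the exact command, as yt-dl does, without running it.
print(shlex.join(cmd))
```

Printing the assembled command before running it mirrors what yt-dl does: the user can always see what is being asked of yt-dlp.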
yt-dlp in Action
Following this is a series of messages from yt-dlp. It informs us that it is attempting to download the web page for the video from Youtube. Youtube issues an HTTP response code 429 which means “Too Many Requests” (coming from your network/computer). Technically, the 429 is defined in RFC 6585, which is published with the IETF status “Proposed Standard” rather than full “Internet Standard”.
We are using Tor with another 2 million people and requests are exiting a collection of 2 to 3 thousand exit nodes. So, yes, there are a lot of requests, let's say a thousand being issued by each exit node in a short period, say half a minute.
The 429 response would make sense from a small operator. Sorry, I can't handle that request volume! But for Google this is nothing. Thus, its 429 is employing a code that is not yet a full Internet Standard to say “I don't want to serve your network/computer.”
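For readers who want to see what a client can do with a 429, here is a small Python sketch. RFC 6585 allows (but does not require) the server to send a Retry-After header with the response. The function name and the wording of the messages are hypothetical, and the sketch only handles Retry-After given as a number of seconds, not as a date.

```python
from http import HTTPStatus


def explain(status, headers):
    """Describe a response in plain words (hypothetical helper)."""
    if status != HTTPStatus.TOO_MANY_REQUESTS:
        return f"{status}: not a rate-limit response"
    phrase = HTTPStatus.TOO_MANY_REQUESTS.phrase
    # Retry-After is optional; only the "seconds" form is handled here.
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        return f"429 {phrase}: server suggests retrying in {int(retry_after)} s"
    return f"429 {phrase}: no usable retry hint"


print(explain(429, {"Retry-After": "30"}))
print(explain(429, {}))
```

A 429 without any retry hint gives the client nothing to work with, which fits the reading that the response is a refusal rather than a request to slow down.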
So, yt-dlp falls back to one of its many counter-measures to this blockage. This is what "Downloading android player API JSON" means. Part 2 of that strategy is the next line of downloading an iframe embedded JS API. This is highlighted by the next "green arrow". (JS means JavaScript, and API means application programming interface).
yt-dlp then re-attempts to download the web page and gets the same 429-fuck-you highlighted by the second red arrow. yt-dlp falls back to the next workaround, the "Web player API JSON". It issues a technical warning that because it could not get the "nsig" you may experience throttling. This matters little. We are downloading, not watching live at full screen resolution.
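The general idea of falling back from one approach to the next can be sketched as below. This is a generic pattern, not how yt-dlp is implemented internally. The exception class, the function names and the two stand-in strategies are all hypothetical.

```python
class Blocked(Exception):
    """Raised by a strategy when the server refuses the request."""


def try_in_order(strategies):
    """Run each (name, fetch) pair until one succeeds.

    Returns the winning name, its result and the names that were refused.
    """
    blocked = []
    for name, fetch in strategies:
        try:
            return name, fetch(), blocked
        except Blocked:
            blocked.append(name)
    raise Blocked("every strategy was refused: " + ", ".join(blocked))


def refused():
    # Stand-in for a request that is answered with HTTP 429.
    raise Blocked("HTTP 429")


def answered():
    # Stand-in for a request that returns usable metadata.
    return {"formats": ["medium"]}


strategies = [("web page", refused), ("player API", answered)]
name, data, blocked = try_in_order(strategies)
print(f"{name} succeeded after {len(blocked)} refusal(s)")
```

Keeping the list of refused strategies makes the battle visible: the user learns not only what worked but also what was turned away first.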
Finally, yt-dlp can begin the download process and informs us that it is downloading 1 of 18 potential formats (i.e. the one that yt-dl instructed it to, the "medium" scale). This is the next "green arrow".
At the bottom we see the name of the file yt-dlp is creating. It includes the title of the video and critically, the unique identifier of the video. See clearly what has been chosen by the programmers. The file name includes both the human readable information, the video title, and the key data information, the unique identifier. Herein lies another window into information control. The authors of youtube-dl chose to give you both pieces of information. You may not, but they do understand the importance of a unique identifier. They ensure that it is available in the file name.
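To show why the identifier in the file name is useful, here is a Python sketch that recovers it. It assumes yt-dlp's default naming, which places the identifier in square brackets before the extension, and it assumes the 11 character form of Youtube identifiers. The file name in the example is invented, not the one from the session above.

```python
import re

# Assumption: names follow yt-dlp's default "Title [ID].ext" layout and
# the identifier is 11 characters drawn from A-Z, a-z, 0-9, "_" and "-".
NAME_PATTERN = re.compile(
    r"^(?P<title>.*) \[(?P<id>[A-Za-z0-9_-]{11})\]\.(?P<ext>[^.]+)$"
)


def split_filename(filename):
    """Return (title, video_id, extension), or None if the name does not fit."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("title"), match.group("id"), match.group("ext")


print(split_filename("Some Video Title [abc123DEF_-].webm"))
```

With the identifier in hand, the original page can be located again even if the title is later changed.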
On the final line we see the progress of the download. The speed varies over time due to uncontrollable things like the available bandwidth of the three Tor nodes doing the work and whatever policy Youtube uses to limit access. Youtube also has bandwidth limitations which are fundamental. My observation is that certain videos are given a deliberate bandwidth restriction. These are Youtube's policy choices in action on top of its choice to tell uninformed users of Tor to piss off.
We can see a download rate of over 800 KB/s. I only need around 70 KB/s for the download to keep up with viewing this size of video, so I am actually abusing Tor in this case. Perhaps I need to update the tool, yt-dl, to signal to Tor that I don't need more than, say, 200 KB/s. That would be a nice thing to do. (Adds to list of improvements.)
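The arithmetic, and one possible way to act on it, can be sketched in Python. yt-dlp has a --limit-rate option which caps the download rate on the client side. That is not the same thing as signalling to Tor, so treat it as an approximation of the idea. The helper names are hypothetical.

```python
def headroom(observed_kb_per_s, needed_kb_per_s):
    """How many times faster the download runs than playback requires."""
    return observed_kb_per_s / needed_kb_per_s


def limit_rate_args(kb_per_s):
    """Arguments asking yt-dlp to cap its own download rate."""
    return ["--limit-rate", f"{kb_per_s}K"]


# Figures from the session: over 800 KB/s observed, about 70 KB/s needed.
print(f"headroom: about {headroom(800, 70):.1f}x")
print(f"needed bitrate: about {70 * 8} kbit/s")
print("extra arguments:", " ".join(limit_rate_args(200)))
```

A cap of 200 KB/s would still leave almost three times the rate needed for playback while taking far less from the Tor relays.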
Information Control
What we are really seeing is the information availability battle. As noted in the Tor series, Alphabet/Google/Youtube really want your behavioral data. They want to know the true IP address of where the download is occurring. Even better for them is you using their web-based player in which they can also serve advertisements and know exactly which parts of the video you watched.
My toolset and its use of yt-dlp in combination with Tor denies them ALL of this information. It also denies ad revenue to the video creator. However, given the recent raft of revelations about how Youtube denies ad revenue willy-nilly most "content creators" have sought other independent methods of revenue than Google's AdSense.
All of the above is so normal for me that I don't even notice it. I don't see the battle anymore. I just note whether the download has begun and at what rate. The above is not the most stupid persistent resistance I've seen from Youtube. There are far worse cases including multiple session terminations, more 429s and even variants thereof. This case represents the most common. It is a window into the battle.
Welcome back to the information control battle
A far deeper, 12 000 word, analysis of Tor and the information control battle is provided in the Tor series.
Oh, and this article is a "backdoor" to encourage you to watch the video which was being downloaded.
[Image: a frame from the downloaded video in which Lee Camp interviews Ben Norton as a part of his new series “Behind The News” for Mint Press News.]
Ben Norton is a fantastic young journalist who has established his own outlet, Multipolarista. He is a staunch socialist, so take that into account. However, he is also a very skilled journalist. If you wish to understand the current and recent (200 years) state of political changes in Latin America he is an excellent reference.
History matters. Viva la resistencia!
Sources
The core image is by the author from his computer.
The Tor series are available in the archives of this newsletter.
The downloaded video:
Ben Norton & Lee Camp: Coup in Peru & Latin America Fights Dollar Hegemony, Lee Camp interviews Ben Norton on the 2 coups which have just occurred in Latin America, Behind The Headlines for Mint Press News, 2022-12-08
Culture
… awaiting inspiration.
Notification
Subscription is optional. Subscribers can expect notifications for most articles. Better is to use RSS, or bookmark the "Archive" page and visit at leisure. If you use Twitter, following @YesXorNo1 is also an effective notifications strategy.
Copyright and Licensing
This work is copyright to the blog's author with CC BY-SA 4.0 licensing. Have fun, reuse, remix etc. but give credit and place no further restrictions. Let’s build culture.