Subscriber Surveillance: A Public Service Announcement including the Fine Print
Publication date: 2022-08-10
Introduction
This article provides a view of the "author's" interface at Substack as a way of documenting some of the surveillance which Substack performs upon readers and, particularly, upon subscribers. This is intended as a "public service announcement" to this newsletter's community.
The newsletter has just begun making a policy change. This article is to inform the readership of this change, the reasoning behind it and most importantly to arm the readership with information so that they can take actions which best serve them.
If you don't care about surveillance, you can ignore this article. However, I suggest that there is valuable information here, and that living in the surveillance age as we do, learning a little more about it is valuable.
An Executive Summary is provided before all of the details. Thereafter mitigations are suggested. Finally, the Policy (change) section is provided.
Substack
Substack is a free publishing platform which provides a subscription service, both free and monetized. To run its business of providing free publishing, it must fund its staff and infrastructure. The author has no inside knowledge of how Substack funds itself, and has not asked. An obvious source of income is a percentage of the subscription fees.
As many readers will be aware, there is also a market for data. Like any market there are producers, "shops" (brokers), and consumers. For this market the product is individualized web behaviour. As this article will demonstrate, Substack gathers a lot of data, and it partners with at least one other organisation in the data market. Whether Substack sells the data it gathers is unknown to the author, and does not really matter. The question for the reader is: do you want to give it to them?
Substack's "free" publishing service is well run, professionally produced, and simple to use. This author has NEVER had any comment from Substack on anything that has been published in this newsletter. There has never been any editorial influence. For this, Substack deserves praise. The counter to this is the level of surveillance to which the readership and also the author are subjected.
Executive Summary
If you are a subscriber using email notification, Substack records everything it can about you: when the email notification is displayed to you, and when you open it. Thus, Substack is monitoring when you read email. Obviously, Substack also knows when it sent the email and whether you clicked the link to the article in it. All of this is recorded, and some of it is made available to the author.
Substack uses Google Analytics via the Google Tag Manager interface to record every link on which you click for every article in every newsletter at Substack. Substack knows all of this data, and it would be reasonable to assume that Google does too.
Proof of these claims, and mitigations for them, are presented below. Under Policy, at the end, a change in notification policy is made and explained.
Moving Through the Interface
The simplest way to inform a reader of some of the data gathered by Substack is to display what they show an author about you. Actual site statistics and personal information have been hidden. This article is about what types of data are recorded, not what the data is.
When you visit the front page, Substack, like any web site, obtains information from your browser and the network connection. The details are beyond the scope of this article, though countermeasures are presented in Mitigations.
Highlighted is the Archive "menu" item. This allows one to view articles in historical order, rather than "most read". This author encourages using this as a bookmark, if you use bookmarks.
After an author logs in, they are presented with the "author" frontpage. This is identical to the public frontpage, except "Login" is replaced with "Dashboard". It is the Dashboard interface which reveals the data gathering.
The Dashboard has a horizontal menu (like the public or author frontpage), but this time it is for the author to learn about statistics and viewership for their articles, podcasts, and overall statistics. “Recommendations” allows an author to recommend other sites at Substack and “Settings” is how an author controls their substack site (colours etc.).
By default (shown) the Posts menu is selected, and the posts are shown in chronological order with information available for each.
For a post in which email notification was not used the information available for the author is simple:
Total Views: the number of times the article has been viewed
Traffic Sources: a breakdown of where traffic to the article has arrived from
The 93% 'direct', below, means that people reached the page through a bookmark, an RSS reader, or some other manual method. It essentially means that the Referer header in HTTP was not provided, so Substack does not know from where you came. The 7% from substack indicates that the last site visited before the page was Substack, which may be its front page or another newsletter.
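How a request ends up in the 'direct' or the 'substack' bucket can be sketched in a few lines of Python. This is only an illustration of the idea; the function name and the bucketing rules are my assumptions, not Substack's actual logic.

```python
from typing import Optional
from urllib.parse import urlparse


def classify_traffic_source(referer: Optional[str]) -> str:
    """Bucket a request by its HTTP Referer header value (hypothetical rules)."""
    if not referer:
        # No Referer header at all: a bookmark, a typed URL, an RSS reader, etc.
        return "direct"
    host = (urlparse(referer).hostname or "").lower()
    if not host:
        # The header was present but did not contain a usable URL.
        return "direct"
    if host == "substack.com" or host.endswith(".substack.com"):
        return "substack"
    # Anything else is reported under the referring host name.
    return host


print(classify_traffic_source(None))
print(classify_traffic_source("https://substack.com/"))
print(classify_traffic_source("https://example.org/some/page"))
```

The point to take away is that 'direct' is what remains when the browser says nothing about where you came from.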
For an article which has been published with email notification much more information is available for the author.
Total Views: (same as above)
Recipients: number of recipients (to how many subscribers was a notification sent). This is essentially a record of the number of subscribers when the article notification was sent.
Open Rate: what percentage of recipients opened the email
Link Clicks: for all persons who arrived at the article via email, what percentage have clicked which links in the article
Traffic Sources: (same as above)
The Link Clicks data informs us that this information is collected for everything, not only for subscribers. It is only provided here because it can be traced to a subscriber, which an author may find useful.
The Subscriber menu (from the Dashboard) lists the newsletter's subscribers. Each subscriber can be clicked, and information about them is available to the author, which we get to below.
The information presented below is for one subscriber. For each subscriber, there are three menu items, with Events being the default.
Note the event sequence for August 9 for this subscriber: Received, Seen, Opened, Opened. This is a very granular degree of detail. I am confident that Substack knows exactly when each event occurred too. Thus, they are tracking when you read email. Note again the detail: Received, Seen, Opened, and, although not shown, when a link to the article was clicked. All of this is presumably recorded with precise timestamps.
The Subscriber Stats menu page shows overall statistics for the individual subscriber. Note the 'Subscriber Country'. This was probably derived at signup, either from what the browser declared or from the network address of the connection. As above, this is certainly only a subset of the data which Substack tracks and records.
The other producer, and likely broker, with which Substack is collaborating is Google. Google Tag Manager is a conduit through which data gathered on web pages is loaded into Google Analytics, which is a massive database of what people do on the web. It is of great use to marketers, and marketing is an important part of business. It's not all bad. However, Google never forgets anything, and over time they have accumulated a colossal amount of information on just about everyone.
The primary mechanism for data to move from a Substack (or any) page into Google Analytics is via the web language JavaScript. However, in case a browser is not running JavaScript, they have a fallback. This hidden page element, called an iframe, contacts Google Tag Manager so that data is still obtained.
<noscript><iframe
src="https://www.googletagmanager.com/ns.html?id=42-seems-on-the-point"
height="0" width="0"
style="display:none;visibility:hidden"></iframe>
</noscript>
[This HTML is obtained directly from the page rendered for the most recent article. It has been formatted for legibility.]
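If you would like to check a saved page for this kind of element yourself, the following Python is a minimal sketch using only the standard library. The list of hosts is a short, hypothetical example chosen by me, and the class and function names are my own; a real page may load trackers in ways this does not catch.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical, deliberately short list of hosts to look for.
TRACKER_HOSTS = {"www.googletagmanager.com", "www.google-analytics.com"}


class TrackerFinder(HTMLParser):
    """Collect src URLs of <script> and <iframe> tags pointing at listed hosts."""

    def __init__(self) -> None:
        super().__init__()
        self.found: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("script", "iframe"):
            return
        src = dict(attrs).get("src")
        if src and urlparse(src).hostname in TRACKER_HOSTS:
            self.found.append((tag, src))


def find_trackers(html: str) -> list[tuple[str, str]]:
    """Return (tag, src) pairs for every matching element in the HTML text."""
    finder = TrackerFinder()
    finder.feed(html)
    finder.close()
    return finder.found


# The snippet quoted above, used as sample input.
SAMPLE = """<noscript><iframe
src="https://www.googletagmanager.com/ns.html?id=42-seems-on-the-point"
height="0" width="0"
style="display:none;visibility:hidden"></iframe>
</noscript>"""

for tag, src in find_trackers(SAMPLE):
    print(tag, src)
```

To use it on a real page, save the page source from your browser and pass the file's contents to find_trackers.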
To track you across the different sites which you visit, web sites (and Google and others) use cookies. A visit to a single article produces six cookies, with four of them for .substack.com and two for the newsletter yesxorno.substack.com.
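A count like this can be reproduced by grouping the Set-Cookie response headers by the domain each cookie is scoped to. The Python below is a sketch of that grouping; the header values and cookie names are made-up placeholders, not the cookies Substack actually sets.

```python
from collections import defaultdict
from http.cookies import SimpleCookie


def group_cookies_by_domain(set_cookie_headers, request_host):
    """Map each cookie domain to the sorted names of the cookies scoped to it."""
    grouped = defaultdict(list)
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            # A cookie without a Domain attribute is scoped to the responding host.
            domain = morsel["domain"] or request_host
            grouped[domain].append(name)
    return {domain: sorted(names) for domain, names in grouped.items()}


# Made-up header values; the cookie names are placeholders.
headers = [
    "cookie_a=1; Domain=.substack.com; Path=/",
    "cookie_b=2; Domain=.substack.com; Path=/",
    "cookie_c=3; Path=/",
]
print(group_cookies_by_domain(headers, "yesxorno.substack.com"))
```

A cookie scoped to .substack.com is sent back on visits to every newsletter on the platform, which is what makes it useful for following a reader from one newsletter to another.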
Mitigations
This author wishes for only one statistic, the number of views for an article. This is a rough guide to interest in the article, and a proxy for quality. I have no interest whatsoever in knowing your name (false or otherwise), your email address, or any other detail.
Notifications
As shown above the email tracking is quite invasive. To remove this, replace your subscription with another notification mechanism. Unsubscribe and choose your replacement method.
Suggestions:
Set a weekly or twice per week calendar item "Check out articles". This newsletter rarely posts more than two articles per week, and they are never "breaking news", but rather articles that should maintain value for days or weeks or longer. Use a bookmark, and bookmark the 'Archive' page.
Use RSS
If you use Twitter, follow @YesXorNo1
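For those who have not used RSS: a feed is just a small XML document listing recent articles, and any feed reader will poll it for you. As a rough sketch of what a reader does, the Python below lists the items in an RSS 2.0 document. The sample feed is invented for illustration, and the function name is my own. Fetching a real feed (for Substack newsletters I assume the address is the newsletter's URL followed by /feed) would replace the sample string.

```python
import xml.etree.ElementTree as ET


def list_feed_items(rss_xml: str):
    """Return (title, link, pubDate) for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("pubDate", default=""),
        ))
    return items


# Invented sample feed, standing in for a downloaded one.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example newsletter</title>
    <item>
      <title>First article</title>
      <link>https://example.org/p/first</link>
      <pubDate>Wed, 10 Aug 2022 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

for title, link, published in list_feed_items(SAMPLE_FEED):
    print(published, title, link)
```

Note that no email is involved, so the Received, Seen, and Opened events described above have nothing to attach to.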
Browser Tracking
One could use a VPN, but that won't really do much. It will obscure your source IP address, but the data is moved around by JavaScript, which you need running, and the cookies will track you anyway.
The best solution is to use Tor. An interlinked four article series from April should provide you with as much understanding of Tor as you wish to know. If you wish to try Tor, which I recommend, first take the opportunity to read the elements of that series which you find of interest.
If you like to use bookmarks and you wish them to be available in Tor, import them and occasionally export them as a backup. Sometimes they get lost in major upgrades, which can be a real pain in the behind.
Policy Change
The observant among you may have seen a very subtle change in the footer of articles over the last month. The Notifications section used to say "does not", which became "rarely" and then "not always" for issuing Substack email notifications.
The new policy will be "usually" (see the footer below for exact wording). Notifications will be sent when I believe the article is of wide interest. This is also the reason I recommend using the "Archive" button, so that articles are listed in reverse publication time, and you will notice an article for which no notification was sent. Perhaps you may be interested? However, with a "most popular" listing, you will never see them.
The motivation for the change is that people have signed up for notifications, and thus I should provide them. I have been reluctant to make this change, as I know about the tracking and deeply dislike surveillance. A few notifications have recently been used as a trial. Subscribers have responded to notifications, so “your wish is my command”. However, to clear my conscience I needed to provide the above article to ensure that if you care, you have been informed.
You can expect notifications to be more frequent, according to the new policy.
Thanks
Thank you, subscribers and readers. It means a lot to this author that my work is read.
Sources
Images by the author.
Notification
Subscription is optional. Subscribers can expect notifications for most articles. Better is to use RSS, or bookmark the "Archive" page and visit it at your leisure. If you use Twitter, following @YesXorNo1 is also an effective notifications strategy.
Copyright and Licensing
This work is copyright to the blog's author with CC BY-SA 4.0 licensing. Have fun, reuse, remix etc. but give credit and place no further restrictions. Let's build culture.