Substack is one of the best CMS/social media platforms available to the general public.
In short:
Substack polarizes people who window shop/lurk on forums
Substack handles the awkwardness of converting followers and collecting payment for creators
There is higher friction in posting, leading to higher quality content for readers
Network effects are present, but are not built into the system design (for example, you can’t see how many other people subscribe to this newsletter. 10? 100? 1000? You’ll never know how (un)popular I am!)
Yet Substack has a glaring problem. Discoverability is wack!
To address this, I built a project that automatically fetches a random Substack newsletter to read.
You can try the demo on Replit.
Deep Dive
This project is based around a URL I found on Reddit.
I’m not sure if an engineer on the Substack team built it, or if it’s kept up to date with all available newsletters, but it seems to do its job of fetching a random newsletter on Substack.
let url = 'http://random.substack.com/'
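Here's a minimal sketch of how that URL could be used from Node. I'm assuming it answers with an HTTP redirect to whichever newsletter it picked, and that a global fetch is available (Node 18+). The function name fetchRandomSubstackRoot and the fetchImpl parameter are made up for this example; they are not part of the original project.

```javascript
// Assumption: random.substack.com redirects to a randomly chosen newsletter.
const RANDOM_URL = 'http://random.substack.com/'

// Resolve the random URL to the root of whichever newsletter it lands on.
// `fetchImpl` is a hypothetical parameter so a stub can be passed in for testing.
async function fetchRandomSubstackRoot(fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(RANDOM_URL, { redirect: 'follow' })
  // response.url is the final URL after any redirects were followed
  return new URL(response.url).origin + '/'
}

// Usage (hits the network, so it is left commented out):
// fetchRandomSubstackRoot().then(root => console.log(root))
```

Trimming the result down to the origin means the rest of the crawl always starts from the newsletter's root, no matter which page the redirect landed on.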
But I wanted to take it a step further.
See, most of the newsletters returned from the URL were empty, or were just a “coming soon!” page.
So I decided to parse the HTML and do some old-fashioned web crawling.
const parser = require('node-html-parser') // parse html strings lib
I found three interesting things:
All Substack post paths start with “/p”
All Substacks have a sitemap
Substack automatically generates five pages, meaning “active” newsletters have more than five pages
Here’s what that looks like in code:
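// #1: a post URL has a 'p' path segment (substackURLArr is presumably the URL path split on '/')
if (substackURLArr.includes('p')) { ... }
// #2: the sitemap lives at the root (this assumes substackRootURL ends with '/')
const sitemapURL = substackRootURL + 'sitemap.xml'
// #3: more than five sitemap entries means the author has published something
const sitemapURLArr = sitemapXML.map(urls => urls)
metadata.hasWrittenArticles = sitemapURLArr.length > 5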
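To make those three checks concrete, here is a dependency-free sketch of each one as a small function. It is not the project's actual code: the function names are hypothetical, the sitemap is read with a regular expression rather than node-html-parser, and I'm assuming the sitemap lists each page in a standard <loc> element.

```javascript
// Pages every newsletter gets for free, per the observation above.
const DEFAULT_PAGE_COUNT = 5

// #1: a post URL has "p" as its first path segment, e.g. /p/my-first-post
function isPostURL(url) {
  const segments = new URL(url).pathname.split('/').filter(Boolean)
  return segments[0] === 'p'
}

// #2: build the sitemap address from any URL on the newsletter
function sitemapURLFor(url) {
  return new URL('/sitemap.xml', url).href
}

// #3: collect every <loc> entry from the sitemap XML
// (regex stand-in for a real parser; assumes standard sitemap markup)
function sitemapEntries(sitemapXML) {
  return [...sitemapXML.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)].map(match => match[1])
}

// A newsletter counts as "active" once it has more pages than the defaults
function hasWrittenArticles(sitemapXML) {
  return sitemapEntries(sitemapXML).length > DEFAULT_PAGE_COUNT
}
```

Resolving '/sitemap.xml' against the URL sidesteps the question of whether the root URL ends with a slash.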
How all this code came together
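As a rough illustration of how the pieces could fit together (not the project's real source), the sketch below keeps drawing random newsletters until it finds one whose sitemap has more than five entries. It assumes the random URL redirects to a newsletter, that a global fetch exists (Node 18+), and that counting <loc> tags is a good enough read of the sitemap. findActiveNewsletter, fetchImpl, and maxAttempts are names invented for this example.

```javascript
const RANDOM_NEWSLETTER_URL = 'http://random.substack.com/'
const GENERATED_PAGE_COUNT = 5 // pages Substack creates for every newsletter, per the post

// Count <loc> entries in a sitemap (regex stand-in for a real parser).
function countSitemapEntries(sitemapXML) {
  return (sitemapXML.match(/<loc>/g) || []).length
}

// Keep drawing random newsletters until one has published something.
// `fetchImpl` and `maxAttempts` are hypothetical parameters for this sketch.
async function findActiveNewsletter(fetchImpl = globalThis.fetch, maxAttempts = 10) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Follow the redirect and keep only the newsletter's root URL
    const landing = await fetchImpl(RANDOM_NEWSLETTER_URL, { redirect: 'follow' })
    const rootURL = new URL(landing.url).origin + '/'

    const sitemapResponse = await fetchImpl(rootURL + 'sitemap.xml')
    if (!sitemapResponse.ok) continue // no sitemap, draw again

    const entries = countSitemapEntries(await sitemapResponse.text())
    if (entries > GENERATED_PAGE_COUNT) {
      return { rootURL, hasWrittenArticles: true, attempts: attempt }
    }
  }
  return null // every draw was empty or "coming soon"
}

// Usage (hits the network, so it is left commented out):
// findActiveNewsletter().then(result => console.log(result))
```

Capping the number of attempts keeps the loop from hammering Substack if the random URL keeps returning empty newsletters.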
Let’s do our part to make Substack more popular so we can see more social media platforms in the future that treat creators as first class citizens, encourage intellectual discourse, and separate us from the idea that more followers === better person.
Till next time,
Bram