Self-hosted RSS reading infrastructure

If you ever find yourself switching between several news websites in the morning, looking for new articles, you might be doing it wrong. Most news websites and blogs provide RSS feeds, which can be processed, aggregated, and made ready for you to read in one place.

There are many services that provide the same thing – Feedly, Inoreader, or Feedbin to name a few. Feedly and Inoreader even have a limited service for free.

However, if you want better control over your data, more flexibility for the software to do exactly what you want it to, or just want to try deploying some software, you can host your own.

Choosing the components

This project consists of several components. First, there is an RSS aggregator, the heart of this infrastructure. It periodically checks all the websites and aggregates their articles for you.
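To make the aggregator's job a bit more concrete, here is a minimal sketch of the core step: taking one RSS document and pulling out the article titles and links. The sample feed and the function name list_articles are made up for illustration; a real aggregator such as Miniflux also handles Atom, scheduling, deduplication, and much more.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document (hypothetical content).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>https://example.org/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.org/second</link>
    </item>
  </channel>
</rss>
"""


def list_articles(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    articles = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        articles.append((title, link))
    return articles


if __name__ == "__main__":
    for title, link in list_articles(SAMPLE_FEED):
        print(f"{title}: {link}")
```

An aggregator repeats this for every subscribed feed on a schedule and remembers which links it has already seen.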

Next, there has to be a service that can save selected articles, an iOS app capable of connecting to the RSS aggregator, and some way to push the articles to Kindle.

My requirements for this project were:

  1. A usable RSS reader
  2. Nice app for iOS
  3. Ability to read articles on Kindle
  4. Ability to save full-text articles and search them later

For RSS aggregators/readers there were a few possibilities:

While all of these seem like good choices, I ultimately went with Miniflux, because it seemed to have everything I wanted, it offered the best integration with Wallabag without any hassle, and it seemed to be the crowd favorite. Wallabag seems to be one of the few good self-hosted options for saving articles. I also liked this article, which puts them together as well.

With Miniflux and Wallabag in the bag, I looked for an iOS application that could connect to Miniflux’s Fever API.

  • Unread
    • Has a subscription, but most of the useful things are included for free
    • Good design and usability
    • Does not support adding folders and subscriptions through Fever API
  • Lire
    • It costs 9,99€, which is not too much and way better than a subscription
    • I like its Hot Links section (which is not behind a paywall), which shows specific links that show up the most in your feeds
    • It supports adding folders and subscriptions through Fever API
    • No direct save button as far as I know, but articles can be saved through “sharing” to the Wallabag app
  • FieryFeeds
    • A lot of useful features are behind a subscription

I chose to use Unread without a subscription and tried out Lire.

One of the easiest ways to read the articles on Kindle is to use Amazon’s Send-to-Kindle functionality. You can send an email with an ebook as an attachment to an address associated with your Amazon account, and the file will be downloaded to your Kindle automatically.
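As a rough illustration of what such a Send-to-Kindle email looks like, here is a sketch using only the Python standard library. The function names, subject line, and body text are my own placeholders, not what wallabag-kindle-consumer actually sends.

```python
import smtplib
from email.message import EmailMessage


def build_kindle_email(sender: str, kindle_address: str,
                       filename: str, ebook: bytes) -> EmailMessage:
    """Build an email that carries one ebook file as an attachment."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = kindle_address
    message["Subject"] = "Send to Kindle"  # placeholder subject
    message.set_content("Ebook attached.")
    message.add_attachment(
        ebook,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return message


def send(message: EmailMessage, host: str, port: int,
         user: str, password: str) -> None:
    """Send the message over SMTP with STARTTLS (not called in this sketch)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(message)
```

The sender address has to be on the approved list in your Amazon account, otherwise the attachment is not delivered.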

There are several tools that can accomplish this:

There are also other approaches that need a jailbroken Kindle:

I went with wallabag-kindle-consumer, which I call wallabag-kindle throughout as a shorthand.

Setup

In this section I walk through the docker-compose.yml file for each component separately, then integrate them.

Miniflux

Miniflux is easy to set up using the official documentation. I added the CLEANUP_ARCHIVE_UNREAD_DAYS and CLEANUP_ARCHIVE_READ_DAYS variables, which control the cleanup of old articles, to keep unread ones for 90 days and already read ones for 30.

version: '3.4'  
services:  
#-------------------------Miniflux-------------------------
  miniflux:
    image: ${MINIFLUX_IMAGE:-miniflux/miniflux:latest}
    container_name: miniflux
    restart: always
    ports:
      - "127.0.0.1:8080:8080"
    depends_on:
      - miniflux_db
    environment:
      - BASE_URL=${MINIFLUX_DOMAIN}
      - DATABASE_URL=postgres://${MINIFLUX_DBUSER:-miniflux}:${MINIFLUX_DBPASS}@miniflux_db/miniflux?sslmode=disable
      - RUN_MIGRATIONS=1
      - CREATE_ADMIN=1
      - ADMIN_USERNAME=${MINIFLUX_ADMINUSER:-admin}
      - ADMIN_PASSWORD=${MINIFLUX_ADMINPASS}
      - CLEANUP_ARCHIVE_UNREAD_DAYS=90
      - CLEANUP_ARCHIVE_READ_DAYS=30
    # Optional health check:
    healthcheck:
      test: ["CMD", "/usr/bin/miniflux", "-healthcheck", "auto"]

  miniflux_db:
    image: ${POSTGRES_IMAGE:-postgres:14}
    restart: always
    container_name: miniflux_db
    environment:
      - POSTGRES_USER=${MINIFLUX_DBUSER:-miniflux}
      - POSTGRES_PASSWORD=${MINIFLUX_DBPASS}
    volumes:
      - ${DATA_PATH}/miniflux/db:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "${MINIFLUX_DBUSER:-miniflux}"]
      interval: 10s
      start_period: 30s

Wallabag

For Wallabag, I use pretty much the configuration from the README, except that I disabled new user registration (directive SYMFONY__ENV__FOSUSER_REGISTRATION) and changed the database to Postgres.

Note that we run two Postgres databases where one would suffice. Even though it should not be a huge problem for both apps to use the same one, I prefer running them separately for ease of maintenance, a simpler docker-compose file, and so that these services do not share a single user account.

version: '3.4'  
services:  
  wallabag:  
    image: wallabag/wallabag
    container_name: wallabag  
    restart: always  
    depends_on:  
      - wallabag_db  
    environment:  
      - POSTGRES_PASSWORD=${WALLABAG_DBPASS_SU}
      - POSTGRES_USER=${WALLABAG_DBUSER_SU:-wallabag_su}
      - SYMFONY__ENV__DATABASE_DRIVER=pdo_pgsql
      - SYMFONY__ENV__DATABASE_HOST=wallabag_db 
      - SYMFONY__ENV__DATABASE_PORT=5432
      - SYMFONY__ENV__DATABASE_NAME=wallabag
      - SYMFONY__ENV__DATABASE_USER=${WALLABAG_DBUSER}
      - SYMFONY__ENV__DATABASE_PASSWORD=${WALLABAG_DBPASS}
      - SYMFONY__ENV__DOMAIN_NAME=${WALLABAG_DOMAIN}
      - SYMFONY__ENV__FOSUSER_REGISTRATION=false
      - SYMFONY__ENV__SECRET=${WALLABAG_SECRET}
      - SYMFONY__ENV__SERVER_NAME="Wallabag"  
    ports:  
      - "127.0.0.1:8081:80"
    volumes:  
      - ${DATA_PATH}/wallabag/images:/var/www/wallabag/web/assets/images  
  
  wallabag_db:  
    image: ${POSTGRES_IMAGE:-postgres:14}
    container_name: wallabag_db  
    restart: always  
    environment:  
      - POSTGRES_PASSWORD=${WALLABAG_DBPASS_SU}
      - POSTGRES_USER=${WALLABAG_DBUSER_SU:-wallabag_su}
    volumes:  
      - ${DATA_PATH}/wallabag/data:/var/lib/postgresql/data

Reverse proxy

The infrastructure should encrypt communication with clients, so we need to set up HTTPS.

There are many excellent tutorials on how to set up NGINX with a Let’s Encrypt certificate as a reverse proxy, for example here. Miniflux also supports HTTPS on its own; however, a reverse proxy would have to be set up for Wallabag anyway, so we are killing two birds with one stone. Moreover, we have more control over the specific setup and ciphers we may want to use.

The original idea was to set up both of these on a subpath, for example https://[domain]/miniflux/ and https://[domain]/wallabag/; however, Wallabag cannot be configured that way. It would be possible to rewrite all links to Wallabag using the reverse proxy, but that is not an elegant solution and could break with updates.

By default, we are exposing ports 8080 and 8081 on localhost, and the reverse proxy should forward traffic to them. For inspiration, here is a basic config for Miniflux:

server {
    server_name [domain];
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
    location / {
        proxy_pass http://127.0.0.1:8080/;
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/afafa.cc/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/afafa.cc/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}

Sending articles to Kindle

When sending articles to the Kindle email address, Amazon seems to be quite picky about which emails it accepts. If the sending domain appears untrustworthy, you will be prompted after every emailed article to confirm that it is not spam.

At first, I tried to use the newly registered domain for Miniflux as the sender, which unsurprisingly was not considered trustworthy. Therefore, I deployed everything I could to make it more trustworthy: I set correct MX DNS records (even though they were not strictly needed), then SPF, then DMARC, and finally even email signing with DKIM. Nothing helped.

Then I decided to use a third-party email sender. First I used gmx.com. I registered an email account and set all emails from wallabag-kindle-consumer to be sent through GMX’s SMTP. This did not help either.

Finally, I created a throwaway Gmail account, enabled SMTP access for it, and let the email sender use this address.

The Amazon account also has to be configured:

  • The Gmail account from which the emails will be sent needs to be whitelisted
  • The receiving email address, where the articles will be sent, should be configured (it should not be similar to the sending address)

A step-by-step guide can be found here

version: '3.4'  
services:  
  wallabag-kindle:  
    image: janlo/wallabag-kindle-consumer  
    container_name: wallabag-kindle  
    restart: always
    environment:  
      - WALLABAG_HOST=http://wallabag
      - DB_URI=sqlite:////app/db/database.db  
      - CLIENT_ID=${WALLABAG_KINDLE_ID}
      - CLIENT_SECRET=${WALLABAG_KINDLE_SECRET}  
      - DOMAIN=http://127.0.0.1:8082
      - SMTP_FROM=${GMAIL_USER}
      - SMTP_HOST=smtp.gmail.com  
      - SMTP_PORT=587  
      - SMTP_USER=${GMAIL_USER}
      - SMTP_PASSWD=${GMAIL_PASS}
      - INTERFACE_HOST=0.0.0.0  
      - INTERFACE_PORT=8082
    ports:  
      - 127.0.0.1:8082:8082  
    volumes:  
      - ${DATA_PATH}/wallabag/kindle_db:/app/db

This version sends the username and password as HTTP GET parameters, which means that they might be visible in Wallabag logs. To avoid this, consider building this container from the source code of a pull request I created, until these changes are merged into the master branch.
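The difference between the two ways of sending credentials can be shown with a small sketch. It only builds the requests and never sends them. The /oauth/v2/token path is Wallabag's token endpoint as I understand it; the function names and the sample credentials are made up.

```python
import urllib.parse
import urllib.request


def token_request_get(base_url: str, credentials: dict) -> urllib.request.Request:
    """Credentials travel in the query string, so they are part of the URL."""
    query = urllib.parse.urlencode(credentials)
    return urllib.request.Request(f"{base_url}/oauth/v2/token?{query}")


def token_request_post(base_url: str, credentials: dict) -> urllib.request.Request:
    """Credentials travel in the request body, so the URL stays clean."""
    body = urllib.parse.urlencode(credentials).encode()
    return urllib.request.Request(
        f"{base_url}/oauth/v2/token", data=body, method="POST"
    )


if __name__ == "__main__":
    creds = {"username": "reader", "password": "hunter2"}  # sample values
    # Web servers typically log the request URL, but not the request body.
    print(token_request_get("http://wallabag", creds).full_url)
    print(token_request_post("http://wallabag", creds).full_url)
```

Since access logs usually record the full request URL, the first variant leaves the password in the log while the second does not.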

Other possibilities

Another interesting solution would be not to use Send-to-Kindle at all. Instead, there could be an e-ink-friendly website for the RSS reader.

There are such applications for Inoreader, but they require sending Inoreader login information to a third party, which might not be a good idea.

After a lot of searching, I found a simple Flask application built exactly for this purpose: Kinss. Another possibility is to edit the CSS of the Miniflux website to make it easy to render on a Kindle.

Updating

To keep all of the containers up-to-date I use Watchtower:

version: '3.4'  
services: 
  watchtower:
    image: containrrr/watchtower
    container_name: watchtower
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

Setup

Docker and docker-compose must already be installed.

Full docker-compose file

Here is the full docker-compose.yml file:

version: '3.4'
services:
#-------------------------Miniflux-------------------------
  miniflux:
    image: ${MINIFLUX_IMAGE:-miniflux/miniflux:latest}
    container_name: miniflux
    restart: always
    ports:
      - "127.0.0.1:8080:8080"
    depends_on:
      - miniflux_db
    environment:
      - BASE_URL=${MINIFLUX_DOMAIN}
      - DATABASE_URL=postgres://${MINIFLUX_DBUSER:-miniflux}:${MINIFLUX_DBPASS}@miniflux_db/miniflux?sslmode=disable
      - RUN_MIGRATIONS=1
      - CREATE_ADMIN=1
      - ADMIN_USERNAME=${MINIFLUX_ADMINUSER:-admin}
      - ADMIN_PASSWORD=${MINIFLUX_ADMINPASS}
      - CLEANUP_ARCHIVE_UNREAD_DAYS=90
      - CLEANUP_ARCHIVE_READ_DAYS=30
    # Optional health check:
    healthcheck:
      test: ["CMD", "/usr/bin/miniflux", "-healthcheck", "auto"]

  miniflux_db:
    image: ${POSTGRES_IMAGE:-postgres:14}
    restart: always
    container_name: miniflux_db
    environment:
      - POSTGRES_USER=${MINIFLUX_DBUSER:-miniflux}
      - POSTGRES_PASSWORD=${MINIFLUX_DBPASS}
    volumes:
      - ${DATA_PATH}/miniflux/db:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "${MINIFLUX_DBUSER:-miniflux}"]
      interval: 10s
      start_period: 30s
#------------------------Wallabag-------------------------
  wallabag:  
    image: wallabag/wallabag
    container_name: wallabag  
    restart: always  
    depends_on:  
      - wallabag_db  
    environment:  
      - POSTGRES_PASSWORD=${WALLABAG_DBPASS_SU}
      - POSTGRES_USER=${WALLABAG_DBUSER_SU:-wallabag_su}
      - SYMFONY__ENV__DATABASE_DRIVER=pdo_pgsql
      - SYMFONY__ENV__DATABASE_HOST=wallabag_db 
      - SYMFONY__ENV__DATABASE_PORT=5432
      - SYMFONY__ENV__DATABASE_NAME=wallabag
      - SYMFONY__ENV__DATABASE_USER=${WALLABAG_DBUSER}
      - SYMFONY__ENV__DATABASE_PASSWORD=${WALLABAG_DBPASS}
      - SYMFONY__ENV__DOMAIN_NAME=${WALLABAG_DOMAIN}
      - SYMFONY__ENV__FOSUSER_REGISTRATION=false
      - SYMFONY__ENV__SECRET=${WALLABAG_SECRET}
      - SYMFONY__ENV__SERVER_NAME="Wallabag"  
    ports:  
      - "127.0.0.1:8081:80"
    volumes:  
      - ${DATA_PATH}/wallabag/images:/var/www/wallabag/web/assets/images  
  
  wallabag_db:  
    image: ${POSTGRES_IMAGE:-postgres:14}
    container_name: wallabag_db  
    restart: always  
    environment:  
      - POSTGRES_PASSWORD=${WALLABAG_DBPASS_SU}
      - POSTGRES_USER=${WALLABAG_DBUSER_SU:-wallabag_su}
    volumes:  
      - ${DATA_PATH}/wallabag/data:/var/lib/postgresql/data
#-------------------------Send-to-Kindle-------------------------
  wallabag-kindle:  
    image: janlo/wallabag-kindle-consumer  
    container_name: wallabag-kindle  
    restart: always
    environment:  
      - WALLABAG_HOST=http://wallabag
      - DB_URI=sqlite:////app/db/database.db  
      - CLIENT_ID=${WALLABAG_KINDLE_ID}
      - CLIENT_SECRET=${WALLABAG_KINDLE_SECRET}  
      - DOMAIN=http://127.0.0.1:8082
      - SMTP_FROM=${GMAIL_USER}
      - SMTP_HOST=smtp.gmail.com  
      - SMTP_PORT=587  
      - SMTP_USER=${GMAIL_USER}
      - SMTP_PASSWD=${GMAIL_PASS}
      - INTERFACE_HOST=0.0.0.0  
      - INTERFACE_PORT=8082
    ports:  
      - 127.0.0.1:8082:8082  
    volumes:  
      - ${DATA_PATH}/wallabag/kindle_db:/app/db
#-------------------------Watchtower-------------------------
  watchtower:
    image: containrrr/watchtower
    container_name: watchtower
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

To supply all of the variables, a .env file has to be created, looking similar to this:

DATA_PATH=/RSS-data
# Miniflux variables
MINIFLUX_DBUSER=miniflux
MINIFLUX_DBPASS=[CHANGEME]
MINIFLUX_ADMINUSER=admin
MINIFLUX_ADMINPASS=[CHANGEME]
# Wallabag variables
WALLABAG_DBUSER_SU=wallabag_su
WALLABAG_DBPASS_SU=[CHANGEME]
WALLABAG_DBUSER=wallabag
WALLABAG_DBPASS=wallapass
WALLABAG_SECRET=[CHANGEME]
# Wallabag-kindle variables
WALLABAG_KINDLE_ID=
WALLABAG_KINDLE_SECRET=

# Domain names
MINIFLUX_DOMAIN=http://localhost:8080/
WALLABAG_DOMAIN=http://localhost:8081/
# E-mailing variables
GMAIL_USER=[GMAIL USER]
GMAIL_PASS=[GMAIL PASS]

All of the passwords should be changed, and DATA_PATH should point to a folder where all of the data will be kept. If you are deploying it under some hostname, change MINIFLUX_DOMAIN and WALLABAG_DOMAIN as well. GMAIL_USER and GMAIL_PASS should be set to the credentials of the throwaway Gmail account mentioned in the Sending articles to Kindle section.
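If you do not want to invent the passwords by hand, a few lines of Python can fill in the [CHANGEME] placeholders. This is only a sketch: the function name fill_placeholders is made up, and it assumes the .env layout shown above, where a placeholder is the whole value of a line.

```python
import secrets

PLACEHOLDER = "[CHANGEME]"


def fill_placeholders(env_text: str) -> str:
    """Replace every VALUE that is exactly [CHANGEME] with a random secret."""
    lines = []
    for line in env_text.splitlines():
        key, sep, value = line.partition("=")
        is_comment = line.lstrip().startswith("#")
        if sep and not is_comment and value.strip() == PLACEHOLDER:
            # token_urlsafe only emits URL-safe characters, which matters
            # because MINIFLUX_DBPASS ends up inside DATABASE_URL.
            line = f"{key}={secrets.token_urlsafe(32)}"
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "DATA_PATH=/RSS-data\nMINIFLUX_DBPASS=[CHANGEME]\n"
    print(fill_placeholders(sample), end="")
```

Other placeholders, such as the Gmail credentials and the Wallabag client ID and secret, are left untouched because they have to come from the respective services.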

After running docker-compose up in the same folder as the .env and docker-compose.yml files, there should be three services listening:

  • Miniflux at localhost:8080
  • Wallabag at localhost:8081
  • Wallabag-kindle-consumer at localhost:8082
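A quick way to verify this is to try opening a TCP connection to each port. The following sketch does just that; the SERVICES mapping mirrors the list above, and is_listening is a hypothetical helper name.

```python
import socket

# Ports as published on localhost by the docker-compose file.
SERVICES = {
    "Miniflux": 8080,
    "Wallabag": 8081,
    "wallabag-kindle-consumer": 8082,
}


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening("127.0.0.1", port) else "DOWN"
        print(f"{name:26} 127.0.0.1:{port} {state}")
```

This only tells you that something accepts connections on the port, not that the application behind it is healthy.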

If you are deploying these services externally, a reverse proxy can be set up to forward requests to them.

Configuring the services

The next steps cannot be done through the docker-compose file alone and have to be done manually. They only have to be done once.

Miniflux and Wallabag

  1. Log into Wallabag (at localhost:8081) with wallabag:wallabag and change the default password
    • (optional) create a new non-administrative user and log into it
    • (optional) set up 2FA
  2. Click My Account (upper right corner) -> API clients management
    1. Click CREATE A NEW CLIENT
    2. Name it Miniflux
    3. Note the Client ID and Client Secret
    4. Do it again, this time name the new client Kindle and note Client ID and Client Secret as well
  3. Log into Miniflux (at localhost:8080) with the credentials you put in the .env file
    • (optional) create a new non-administrative user and log into it
  4. Click Settings -> Integrations
    1. Go into Wallabag section
    2. Tick Save articles to Wallabag
    3. Put http://wallabag into Wallabag API Endpoint
    4. Fill in the Wallabag Client ID and Wallabag Client Secret that were generated in Wallabag, along with your Wallabag username and password
    5. Click Update
  5. If everything was set up successfully, clicking Save under an article in Miniflux should add it to Wallabag
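To give an idea of what the integration configured above does with those four values, here is a sketch of the two Wallabag API calls involved: obtaining a token and saving a URL. It follows the Wallabag API as I understand it from its documentation, it is not Miniflux's actual code, and the function names are made up. The requests are only built, never sent.

```python
import urllib.parse
import urllib.request


def token_request(base_url: str, client_id: str, client_secret: str,
                  username: str, password: str) -> urllib.request.Request:
    """Build the OAuth password-grant request for a Wallabag access token."""
    form = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
    }).encode()
    url = f"{base_url.rstrip('/')}/oauth/v2/token"
    return urllib.request.Request(url, data=form, method="POST")


def save_entry_request(base_url: str, access_token: str,
                       article_url: str) -> urllib.request.Request:
    """Build the request that asks Wallabag to save one article URL."""
    form = urllib.parse.urlencode({"url": article_url}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/entries.json",
        data=form,
        headers={"Authorization": f"Bearer {access_token}"},
        method="POST",
    )
```

The base URL here would be http://wallabag, the same value as the Wallabag API Endpoint above, because both containers share the docker-compose network.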

Sending to Kindle

  1. Make sure the Gmail account is set to accept SMTP communication (guide, do not forget to “Allow less secure apps”) and that it is whitelisted in the Amazon Kindle settings (guide)
  2. In the .env file, add the Client ID and Client Secret created in Wallabag for Kindle on lines WALLABAG_KINDLE_ID and WALLABAG_KINDLE_SECRET
  3. Create a local database for this tool by running this command as a superuser or a docker user:
docker exec -d wallabag-kindle python service.py --create_db --env
  4. Rebuild the wallabag-kindle container
docker-compose up -d --build wallabag-kindle
  5. Go to the newly created interface (at 127.0.0.1:8082) and put in your Wallabag username and password, the email for your Kindle, and any other email where you wish to receive notifications
  6. Now, if you add the tag kindle, kindle-mobi or kindle-pdf to an article in Wallabag, it should be sent to the Kindle in a few minutes
  7. Set up automatic tagging in Wallabag
    1. Click My Account (upper right corner) -> Config -> TAGGING RULES
    2. Put readingTime >= 5 or any other condition into the Rule field and kindle into the Tags field
    3. Click SAVE
  8. Now, if you save an article from Miniflux and it is long enough, it should be tagged kindle in Wallabag until wallabag-kindle finds out about it, at which point it sends it to the Kindle and removes the tag. Synchronization from Amazon servers to your Kindle may take some time as well.
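The life cycle of the kindle tag described in the last step can be modelled in a few lines. This is a toy model, not the real implementation: entries are plain dictionaries, the function names are invented, and the reading speed of 200 words per minute is my assumption about how reading time is estimated.

```python
WORDS_PER_MINUTE = 200  # assumed reading speed, not a documented constant


def reading_time(text: str) -> int:
    """Estimate reading time in whole minutes from the word count."""
    return len(text.split()) // WORDS_PER_MINUTE


def apply_tagging_rule(entry: dict, min_minutes: int = 5) -> None:
    """Mimic the rule 'readingTime >= 5' -> tag 'kindle'."""
    if reading_time(entry["content"]) >= min_minutes:
        entry["tags"].add("kindle")


def consume(entries: list, send) -> None:
    """Send every entry tagged 'kindle', then remove the tag."""
    for entry in entries:
        if "kindle" in entry["tags"]:
            send(entry)
            entry["tags"].discard("kindle")


if __name__ == "__main__":
    article = {"title": "Long read", "content": "word " * 1500, "tags": set()}
    apply_tagging_rule(article)
    consume([article], lambda e: print("sending", e["title"]))
```

Removing the tag after sending is what keeps the same article from being emailed again on the next polling round.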

iPhone apps

I recommend installing an app for reading RSS from Miniflux as well as the app for Wallabag, mainly for saving articles you encounter outside of RSS feeds.

  1. Enable Fever API by going to Miniflux (at localhost:8080) -> Settings -> Integrations
  2. Fill in desired username and password, click Activate Fever API and then Update
  3. In the Unread app, click Add an account.. -> FEVER
  4. As the URL, set the Fever API endpoint, https://[domain]/fever
    • If that does not work, try https://[domain]/, or check the server access logs to see which URL the app tries to access
  5. Put in Fever API login information
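If the app refuses to log in, it can help to know what it sends. In the Fever API, the client authenticates with an api_key, which is the MD5 hash of the username and password joined by a colon, posted to the endpoint with the ?api query. Below is a sketch that builds such a request without sending it; the function names are made up and the credentials are samples.

```python
import hashlib
import urllib.parse
import urllib.request


def fever_api_key(username: str, password: str) -> str:
    """Fever API key: hex MD5 digest of 'username:password'."""
    return hashlib.md5(f"{username}:{password}".encode("utf-8")).hexdigest()


def fever_auth_request(endpoint: str, username: str,
                       password: str) -> urllib.request.Request:
    """Build the POST request a Fever client uses to check its credentials."""
    form = urllib.parse.urlencode(
        {"api_key": fever_api_key(username, password)}
    ).encode()
    url = f"{endpoint.rstrip('/')}/?api"
    return urllib.request.Request(url, data=form, method="POST")


if __name__ == "__main__":
    request = fever_auth_request("https://example.org/fever", "reader", "secret")
    print(request.full_url)
    print(request.data.decode())
```

Sending this request to your own endpoint and checking whether the JSON response reports a successful authentication is a quick way to tell a server problem from an app problem.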

In Unread, you can directly save an article. This functionality works in conjunction with Wallabag and sending to Kindle without a hitch as well.

Tips and tricks

  • Miniflux can be configured to fetch the full article and serve it to clients. This is controlled by the Fetch original content checkbox in the feed settings. If it is enabled, the iOS apps download each article in full even without “offline caching” functionality
  • With https://kill-the-newsletter.com/ you can transform a newsletter into an RSS feed

Future work

This infrastructure can be extended in some ways:

  • A local and therefore more private version of https://kill-the-newsletter.com/, which aggregates newsletters and presents them to you as an RSS feed
  • Converting Twitter, Facebook, or virtually any website into RSS by periodically scraping it and finding differences
  • As mentioned before, a Kindle-friendly website
  • Improving the parsing of some articles (might be configurable in Miniflux)
  • Trying out parsing behind a paywall (might be configurable in Miniflux)
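For the scraping idea from the list above, the "finding differences" part could start out as simply as comparing the links on a page between two visits. The sketch below is one possible approach, with made-up names; a real version would also have to fetch the page, store the previous state, and emit feed items.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in the order they appear."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list:
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def new_links(previous_html: str, current_html: str) -> list:
    """Links present in the current page but not in the previous snapshot."""
    seen = set(extract_links(previous_html))
    return [link for link in extract_links(current_html) if link not in seen]


if __name__ == "__main__":
    before = '<a href="/old-post">Old</a>'
    after = '<a href="/new-post">New</a><a href="/old-post">Old</a>'
    print(new_links(before, after))
```

Each newly discovered link would then become one item in the generated RSS feed.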