Brain surgery on the Matrix server Synapse

For many years now I have been running a Matrix server for myself and a few other users, mostly for fun but also because I love federated technology.

Intro

You can skip this introduction if you already know what Matrix is, but if not, read on!

The Matrix network was one of the early federated messaging protocols. It uses self-hostable servers that exchange messages between them. People can join a Matrix server of their choice and talk to both local users and remote users on other servers. Messages are exchanged in rooms, much like channels on IRC networks, and all the rooms are shared between the participating servers. The messages that people send form an internal DAG (Directed Acyclic Graph) that makes sure all servers will eventually see all messages, and in the right order. For encrypted rooms all these messages are even fully end-to-end encrypted. Quite a technological feat.

Problem

All the exchanged messages that are stored in this DAG build up in the database. Since the DAG needs to stay consistent you can’t just randomly delete messages, and you need the history to keep synchronized with other servers. This means the database keeps growing and growing, until something fills up. When not controlled properly the disk can fill up and even crash or corrupt the database. Besides that, the unlimited growth in storage can bring quite a hefty cost.

So how do we prevent this?

Solution

This solution is for the Matrix server Synapse. This guide is not suitable for other server software.

There are a few options and solutions, some more invasive than others. Firstly there are admin APIs to clean caches; these clean up old data and rooms but might not directly shrink your database. There are also some database queries to find the biggest rooms, which help decide which rooms to clean up. Then there are tools that help prevent your database from growing unnecessarily. Lastly we can dive into the database, lobotomize the Synapse server and force the database to shrink.

There are four parts:

  • Preparations
    Get some basic information and set up your environment.
  • Managing the Synapse server
    Cleanup rooms and caches using the admin API.
  • Prevention
    Set up the state-compressor to prevent duplicate state build-up.
  • Diving into the database
    Dive into the database to do some deep-cleaning and get information to determine your strategy.

Technically you don’t have to mess around in your database to do some cleanup but it can really help to determine where to start.

Important

There is a LOT of information in this article and it assumes you are comfortable using Linux, Bash, curl, Docker and all the other necessary tools. It also assumes both your Synapse and PostgreSQL are running in Docker containers. This guide only gives you options to pursue, it won’t hold your hand.

Using the API to purge or truncate rooms won’t shrink your database; it only frees up space to be re-used. It can even grow your database a bit because of the new state changes, so make sure you have enough spare room. To actually shrink the database afterwards, read the database chapter.

Disclaimer

Warning: this is doing a deep dive into your Matrix server. Especially the database chapter is like doing a lobotomy on your Synapse. I am not a developer of Matrix so this information is all based on my own research and has not officially been verified. These instructions worked for me but may completely bork your database or transmogrify it into unicorns, I don’t know. Do this at your own risk and make sure you have a backup!

Preparations

With these preparations we set up some environment variables and get the needed access token to talk to the admin API.

These instructions are written for a Bash session. Adjust them for your own environment as necessary.

Getting access to the API

You need admin access to the Synapse server. For this you need to be logged in as an admin user (for example with Element); you can then copy the access token from the settings. It should look like a (very) long string of random characters. Store it in the variable MYTOKEN. This token is used in most of the commands below.

export MYTOKEN="aAbBcC012345678...Z"
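If you prefer the command line, you can also obtain a token through the regular login endpoint. A minimal sketch, assuming your admin account is called admin and jq is installed; note that this creates a new login session that will show up as an extra device in your client:

export MYTOKEN=$(curl -s \
	-X POST \
	--data-raw '{"type":"m.login.password","identifier":{"type":"m.id.user","user":"admin"},"password":"[YOUR PASSWORD HERE]"}' \
	"http://localhost:8008/_matrix/client/v3/login" | jq -r '.access_token')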

Determine a moment in history

For some endpoints we need to give a timestamp, expressed as milliseconds since the epoch. A simple way to generate one and put it in an environment variable follows. Adjust the date and time to your own preferences.

This timestamp will be used as the earliest time to keep data for. So we will purge all data from before this timestamp.

export TIMESTAMP=$(date --date='2024-01-01 00:00:00' '+%s%N' | cut -b1-13)
echo "Timestamp: $TIMESTAMP"

Managing the Synapse server

Using the admin APIs we can let the Synapse server manage most of its own data. We start by purging all data from before the timestamp set in the preparations.

Purge Cached Remote Media

This will purge all cached remote media from before the timestamp. It only removes remote media and keeps local media intact, since we are responsible for that data. If needed, the server can request the purged media again from the remote servers.

curl \
	-X POST \
	--header "Authorization: Bearer $MYTOKEN" \
	"http://localhost:8008/_synapse/admin/v1/purge_media_cache?before_ts=${TIMESTAMP}"

Purging room history

For big rooms that have been growing over time we can also choose to purge data from before the timestamp. This is the best way to truncate bigger rooms like Matrix HQ without completely abandoning them.

Change the ROOM variable below to your selected room id.

export ROOM='!123aBcDeFgHiJkL:example.com'

curl \
	-X POST \
	--header "Authorization: Bearer $MYTOKEN" \
	--data-raw '{"purge_up_to_ts":'${TIMESTAMP}'}' \
	"http://localhost:8008/_synapse/admin/v1/purge_history/${ROOM}"

The above call starts the purge process, which can take a long while. The result that is returned contains a purge_id that you can use to check the progress.

{"purge_id":"naVelpADypgAlFiT"}

You can use the purge_id to check for progress.

curl \
	--header "Authorization: Bearer $MYTOKEN" \
	'http://localhost:8008/_synapse/admin/v1/purge_history_status/naVelpADypgAlFiT'
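If you don’t want to keep re-running that by hand, you can poll the status in a small loop. A rough sketch, assuming jq is installed and using the purge_id from above:

export PURGE_ID='naVelpADypgAlFiT'

while true; do
	# The status endpoint reports "active", "complete" or "failed".
	STATUS=$(curl -s \
		--header "Authorization: Bearer $MYTOKEN" \
		"http://localhost:8008/_synapse/admin/v1/purge_history_status/${PURGE_ID}" | jq -r '.status')
	echo "Purge status: $STATUS"
	if [ "$STATUS" = "complete" ] || [ "$STATUS" = "failed" ]; then
		break
	fi
	sleep 30
done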

Removing rooms

During normal use some rooms will be joined and left again. Other rooms will be migrated, relocated or simply abandoned. All data for these rooms is kept in the server, and we can choose to purge it.

A good way to determine which rooms can be purged is described in the database section below. You can also check your client for obsolete rooms. When you have determined which rooms you want to remove, issue the following commands to remove them from the server. We make sure that all local users are removed and that the room is not blocked from future use.

Change the ROOM variable below to your selected room id.

export ROOM='!123aBcDeFgHiJkL:example.com'

curl -X DELETE \
	--header "Authorization: Bearer $MYTOKEN" \
	--data-raw '{"block":false,"purge":true}' \
	"http://localhost:8008/_synapse/admin/v2/rooms/${ROOM}"

The above call starts the purge process, which can take a long while. The result that is returned contains a delete_id that you can use to check the progress.

{"delete_id":"CLfgUeBILogveHtl"}

You can use the id to check for progress.

curl --header "Authorization: Bearer $MYTOKEN" \
	"http://localhost:8008/_synapse/admin/v2/rooms/delete_status/CLfgUeBILogveHtl"

Removing unreferenced state-groups

This problem is tracked in issue #3364.

Before you go through all the steps for this clean-up action, check that it is really a problem on your Synapse server. Use the following SQL statement to check the number of unreferenced state groups. Only go through all these steps if you have more than a few thousand of them.

SELECT COUNT(*) from state_groups sg
	LEFT JOIN event_to_state_groups esg ON esg.state_group=sg.id
	LEFT JOIN state_group_edges e ON e.prev_state_group=sg.id
WHERE esg.state_group IS NULL and e.prev_state_group IS NULL;
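You don’t have to open a psql session for this one; you can run it straight from the shell, assuming your PostgreSQL container is called postgresql like in the rest of this guide:

docker exec -i postgresql psql -U postgres -d synapse <<'EOF'
SELECT COUNT(*) from state_groups sg
	LEFT JOIN event_to_state_groups esg ON esg.state_group=sg.id
	LEFT JOIN state_group_edges e ON e.prev_state_group=sg.id
WHERE esg.state_group IS NULL and e.prev_state_group IS NULL;
EOF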

There can be unreferenced state groups left over, and these can take up unnecessary space. One of the Synapse developers (@erikjohnston) made a tool to clean these from the database. The tool is really handy, but it is written in Rust and requires a full Rust toolchain to compile. And properly getting the resulting data into the database requires some trickery.

See: https://github.com/erikjohnston/synapse-find-unreferenced-state-groups

You can use the following steps to build it into a Docker image yourself:

git clone https://github.com/erikjohnston/synapse-find-unreferenced-state-groups.git
cd synapse-find-unreferenced-state-groups

Create a file called Dockerfile with the following content:

FROM docker.io/rust:alpine AS builder

RUN apk add python3 musl-dev pkgconfig openssl-dev make

ENV RUSTFLAGS="-C target-feature=-crt-static"

WORKDIR /build

COPY . .

RUN cargo build \
        --release

FROM docker.io/alpine

RUN apk add --no-cache libgcc libcrypto3 ca-certificates

COPY --from=builder /build/target/release/rust-synapse-find-unreferenced-state-groups /usr/local/bin/rust-synapse-find-unreferenced-state-groups

Then you can build the tool with:

docker build -t synapse-find-unreferenced-state-groups:latest .

Running it needs some Docker trickery with local files; adjust the following command to your own situation, don’t just blindly copy it! Make sure your server is stopped while you generate the list, otherwise incomplete events can get deleted.

export ROOM='!123aBcDeFgHiJkL:example.com'

docker run -ti \
	--rm \
	--network postgresql-network \
	--user "$(id -u)":"$(id -g)" \
	--volume="$PWD:/data" \
	-- \
	synapse-find-unreferenced-state-groups:latest \
			/usr/local/bin/rust-synapse-find-unreferenced-state-groups \
					-p "postgresql://matrix-synapse:[YOUR DATABASE PASSWORD HERE]@postgresql/synapse" \
					-r "$ROOM" \
					-o "/data/unreferenced.csv"

Now copy the resulting csv-file into your PostgreSQL database container.

docker cp ./unreferenced.csv postgresql:/unreferenced.csv

Use the following script to import the resulting list into a temporary table and use it to clean up the leftover state groups.

CREATE TEMPORARY TABLE unreffed(id BIGINT PRIMARY KEY);
COPY unreffed FROM '/unreferenced.csv' WITH (FORMAT 'csv');
DELETE FROM state_groups_state WHERE state_group IN (SELECT id FROM unreffed);
DELETE FROM state_group_edges WHERE state_group IN (SELECT id FROM unreffed);
DELETE FROM state_groups WHERE id IN (SELECT id FROM unreffed);
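To run it, save the statements to a file (say cleanup-unreferenced.sql, another name I made up for this example) and pipe it into psql inside the database container; the --single-transaction flag makes sure the deletes either all succeed or all roll back:

# Run the clean-up script in one transaction against the synapse database.
docker exec -i postgresql psql -U postgres -d synapse --single-transaction < cleanup-unreferenced.sql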

Prevention

There is a way to prevent excessive growth of the state in the database. Of course this won’t solve the massive amount of data from bigger rooms like Matrix HQ, but at least it will limit the growth of duplicate state rows.

Synapse Auto Compressor

The Synapse developers made a tool that (periodically) compresses the state rows in the database. By combining separate state changes into one bigger state change we save a lot of space. Because we reduce the number of rows this can also improve performance.

The tool is Synapse Auto Compressor but building it is a bit of a hassle. If you have access to Docker on your server you can build it there, otherwise you have to build it somewhere locally.

git clone https://github.com/matrix-org/rust-synapse-compress-state.git
cd rust-synapse-compress-state
docker build -t rust-synapse-compress-state:latest .

After the build succeeds you can run the compressor. You can (and maybe should) even automate this, for example with crontab (a sketch follows below).

Make sure you change the connection string and Docker network to work with your database.

docker run -ti \
        --rm \
        --network postgresql-network \
        rust-synapse-compress-state:latest \
                /usr/local/bin/synapse_auto_compressor \
                        -p "postgresql://matrix-synapse:[YOUR DATABASE PASSWORD]@postgresql/synapse" \
                        -c 5000 -n 100
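As a sketch of how to automate this, a crontab entry along these lines would run the compressor every night at 04:00. The schedule and log file are just my assumptions, so adjust them (and the password) to your own setup:

# m h dom mon dow  command
0 4 * * * docker run --rm --network postgresql-network rust-synapse-compress-state:latest /usr/local/bin/synapse_auto_compressor -p "postgresql://matrix-synapse:[YOUR DATABASE PASSWORD]@postgresql/synapse" -c 5000 -n 100 >> /var/log/synapse-compressor.log 2>&1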

The compressor keeps a local checkpoint to resume the next time it is run. If you are going to poke around in the database (like in the sections below) you might need to reset its state. Use the following SQL statement for that.

-- Clean up synapse_auto_compressor state tables.
--
-- From: https://github.com/matrix-org/rust-synapse-compress-state/issues/78#issuecomment-1409932869
--
-- The compressor doesn't take into account
-- - deleted rooms
-- - state groups which got deleted either as unreferenced 
--   or due to retention time
--
-- Procedure:
-- 1. Delete progress related to deleted rooms
-- 2. Delete progress for rooms where one of the referenced state groups
--    no longer exist
-- 3. Replicate changes from state_compressor_state to state_compressor_progress

BEGIN;

DELETE
FROM state_compressor_state AS scs
WHERE NOT EXISTS
    (SELECT *
     FROM rooms AS r
     WHERE r.room_id = scs.room_id);

DELETE
FROM state_compressor_state AS scs
WHERE scs.room_id in
    (SELECT DISTINCT room_id
     FROM state_compressor_state AS scs2
     WHERE scs2.current_head IS NOT NULL
       AND NOT EXISTS
         (SELECT *
          FROM state_groups AS sg
          WHERE sg.id = scs2.current_head));

DELETE
FROM state_compressor_progress AS scp
WHERE NOT EXISTS
    (SELECT *
     FROM state_compressor_state AS scs
     WHERE scs.room_id = scp.room_id);

COMMIT;

Diving into the database

We can query the Synapse database to give us some information on empty rooms or rooms that will free up the most space.

Also, after removing empty rooms and purging old data we have created some breathing room, but our database needs some help to actually give the free space back to the system.

Connecting

docker exec -ti postgresql psql -U postgres -d synapse

Most of the following snippets are to be run from the PostgreSQL REPL.

Finding (local) empty rooms

Rooms are joined and left by users all the time, but we mostly care about our local users. A room can keep existing, but if we have no local users in it we don’t need to keep track of its state.

The following query gives us a list of rooms with the number of local users and the total number of joined users.

The rooms without local users are safe to purge using the admin API above. Big rooms with only a few local users can also be candidates for deletion or truncation, depending on how much they are used and what kind of users you have.

SELECT
	room_stats_current.room_id, room_stats_state.name,
	room_stats_current.local_users_in_room, room_stats_current.joined_members
FROM room_stats_current
	LEFT JOIN room_stats_state ON room_stats_current.room_id = room_stats_state.room_id
ORDER BY joined_members DESC, local_users_in_room DESC;
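If you are only after candidates for removal, a small variation of the same query lists just the rooms without any local users:

-- Rooms that no local user is joined to anymore; candidates for deletion.
SELECT room_stats_current.room_id, room_stats_state.name, room_stats_current.joined_members
FROM room_stats_current
	LEFT JOIN room_stats_state ON room_stats_current.room_id = room_stats_state.room_id
WHERE room_stats_current.local_users_in_room = 0
ORDER BY joined_members DESC;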

Finding the biggest rooms

From the Synapse FAQ

To find the rooms that are taking up the most space in the database you can use the following query to get the top-10 (or more if you adjust the LIMIT). Use this as guidance for truncation or purging.

SELECT s.canonical_alias, g.room_id, count(*) AS num_rows
FROM
  state_groups_state AS g,
  room_stats_state AS s
WHERE g.room_id = s.room_id
GROUP BY s.canonical_alias, g.room_id
ORDER BY num_rows desc
LIMIT 10;

Database Table Size

To get a quick view of the table sizes in the Synapse database you can use the following query. You can use this as guidance for the vacuuming below.

SELECT
	table_name,
	pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
	pg_total_relation_size(quote_ident(table_name))
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY 3 desc;

Vacuuming

After all the cleaning, compressing and purging actions the database still hasn’t changed in size, or might even have grown bigger. PostgreSQL won’t automatically release deleted rows back to the system as free space; we need to trigger this ourselves.

This is a critical action with a lot of caveats.

Firstly, the tables are locked during the vacuuming actions. Especially when a table is huge this will cause a lot of problems for processes trying to write to it. So make sure you shut down your Synapse server while you do this.

Secondly, PostgreSQL reclaims the space by rebuilding a whole new data file containing only the kept rows. This means that, in the worst case scenario, you need more than the size of the full database in free disk space.
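Before you start it is worth checking how big the database actually is, and how much free disk space you have. A quick check, assuming the same container name as before:

# Current size of the synapse database, and free space on the host.
docker exec -i postgresql psql -U postgres -d synapse \
	-c "SELECT pg_size_pretty(pg_database_size('synapse'));"
df -h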

Vacuuming the small tables

The following tables are not excessively big, so vacuuming them is a good first action. This frees up a bit of space and shouldn’t take too long.

VACUUM (FULL, VERBOSE) event_json;
VACUUM (FULL, VERBOSE) device_lists_changes_in_room;
VACUUM (FULL, VERBOSE) cache_invalidation_stream_by_instance;
VACUUM (FULL, VERBOSE) events;
VACUUM (FULL, VERBOSE) event_edges;

For other candidate tables to vacuum, see the chapter Database Table Size.

Vacuuming the state table

This is the big one. This table contains all the state changes for all rooms, users, messages, etc. This table is the one that grows the largest and will take a while to vacuum. Make sure you have stopped the Synapse server and that you have enough free space (at least the full size of the database).

VACUUM (FULL, VERBOSE) state_groups_state; -- Warning, this one is big
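If you want to see how far along it is, VACUUM FULL reports its progress in the pg_stat_progress_cluster view (available since PostgreSQL 12). You can query it from a second psql session:

-- Progress of a running VACUUM FULL / CLUSTER, in scanned vs. total heap blocks.
SELECT pid, relid::regclass AS table_name, command, phase,
	heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_cluster;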

Rebuilding Indices

Now after all the slicing and dicing in the data it is a good moment to rebuild all the (stale) indices. This takes a bit of time but the server should be able to function normally during the process.

REINDEX (VERBOSE) DATABASE synapse;

Finally

I hope this helps in managing the database of your Synapse server. Being part of the Matrix network is great fun but does require some dedication sometimes. A lot of thanks to the Matrix developers, you guys are awesome!

Postscriptum

Read Part Two on how to remove left-over data from the database.