Deep Dive into the Matrix Database

So in the last post Matrix Synapse Maintenance we tried to clean our Matrx Synapse database. Well, this did not solve my problem… The database was still huge and no amount of purging or vacuuming cleared that up. Apparently there are some issues that prevent the data to be fully removed. Well, you live and learn.

After some research today we’re going to clean up all the left-overs. This will finally delete enough data to actually shrink the database.

The problem

In the last post we checked the database for rooms that were empty, left-over or just too big. We purged these rooms using the Room Delete (v2) Admin API.

The problem is that purging doesn’t actually remove all the room-state from the database (see issue: #12821 and the original issue #12821). This means that our purging left all records lingering in our database without a way to use them or delete them cleanly using the API.

Solution

So, what are we going to do? In the original issue (#12821) the user @rettichschnidi gives us a two very handy database queries in a comment.

Disclaimer / Warning

This is still like a lobotomy on your Matrix Synapse database. I myself have tested these steps, but I offer no guarantees that nothing will go wrong, your database gets corrupted or whatever horrors could happen. Make sure you have backups and think about the steps instead of blindly copy-pasting them.

Preparation

The purging of old state_groups from the database only works on rooms that are actually purged. So purge the rooms as explained in my previous post.

Now since we’re mucking about in the brain of the Synapse server make sure to stop the server before embarking on the mission.

Now connect to your database. For example when using PostreSQL in Docker (where synapse is the name of your database schema):

docker exec -ti postgresql psql -U postgres synapse

Validation

The statements in the next part will remove all state information from rooms that are not known on the server anymore. To check which rooms will be cleaned you can use the following statement:

SELECT 
	DISTINCT(state_groups.room_id) AS room_id_gone 
FROM 
	state_groups 
	LEFT JOIN events USING(room_id) 
WHERE 
	events.room_id IS NULL 
GROUP BY 
	room_id_gone

Purging the states

Now we’re going to purge the old data from state_groups and state_groups_state with the following two SQL statements.

Removing obsolete rows from state_groups_state:

DELETE FROM 
	state_groups_state 
WHERE 
	room_id IN (
		SELECT 
			DISTINCT(state_groups.room_id) AS room_id_gone 
		FROM 
			state_groups 
			LEFT JOIN events USING(room_id) 
		WHERE 
			events.room_id IS NULL 
		GROUP BY 
			room_id_gone
	);

Removing obsolete rows from state_groups:

DELETE FROM 
	state_groups 
WHERE 
	room_id IN (
		SELECT 
			DISTINCT(state_groups.room_id) AS room_id_gone 
		FROM 
			state_groups 
			LEFT JOIN events USING(room_id) 
		WHERE 
			events.room_id IS NULL 
		GROUP BY 
			room_id_gone
  );

Releasing the diskspace

So, now that all the obsolete states have been removed from the database we need to VACUUM. Remember that you need enough free space to fully copy the data to a new file.

REINDEX (VERBOSE) DATABASE synapse; VACUUM FULL VERBOSE;

Finishing up

You can now restart the Matrix Synapse server and everything should be peachy.

Conclusion

Thanks very much to @rettichschnidi for the SQL statements to remove these obsolete states. I hope this post helps admins with their database issues.