We believe a fair database benchmark should follow these key principles:
✅ Test different databases on exactly the same hardware
Otherwise, if the hardware differs even slightly, you have to acknowledge an error margin in your results.
✅ Test with full OS cache purged before each test
Otherwise, you can’t measure cold queries.
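As a sketch, purging the Linux page cache before each cold run can look like this. The sysctl path is the standard Linux one and root is required; the `path` parameter is only there so the logic can be exercised against a different file.

```python
import subprocess

def purge_os_cache(path="/proc/sys/vm/drop_caches"):
    """Drop the Linux OS page cache before a cold run (needs root)."""
    subprocess.run(["sync"], check=True)  # flush dirty pages to disk first
    with open(path, "w") as f:
        f.write("3\n")  # 3 = drop page cache + dentries and inodes
```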
✅ The database under test should have all its internal caches disabled
Otherwise, you’ll be measuring cache performance rather than query performance.
✅ Best if you measure a cold run too. It’s especially important for analytical queries, where cold queries may happen often
Otherwise, you completely hide how well the database handles I/O.
✅ Nothing else should be running during testing
Otherwise, your test results may be very unstable.
✅ You need to restart the database before each query
Otherwise, previous queries can still impact the current query’s response time, even after clearing internal caches.
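A minimal sketch of such a restart-then-measure loop, with placeholder commands — substitute your own service restart and query client invocations:

```python
import subprocess
import time

def time_cold_query(restart_cmd, query_cmd, settle_seconds=1.0):
    """Restart the server, pause briefly, then time a single query.

    restart_cmd / query_cmd are placeholders, e.g.
    ["systemctl", "restart", "your-db"] and a client command for your query.
    """
    subprocess.run(restart_cmd, check=True)
    time.sleep(settle_seconds)  # crude pause; a real warm-up check is better
    start = time.monotonic()
    subprocess.run(query_cmd, check=True)
    return time.monotonic() - start
```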
✅ You need to wait until the database warms up completely after it starts
Otherwise, you may end up competing with the database’s warm-up process for I/O, which can severely skew your test results.
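One hedged way to implement the wait is to poll a readiness probe until it reports the database is warm. The `is_warm` callable here is a placeholder for whatever check fits your database (a status query, disk I/O settling, etc.):

```python
import time

def wait_until_warm(is_warm, timeout=300.0, interval=1.0):
    """Poll is_warm() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_warm():
            return True
        time.sleep(interval)
    return False
```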
✅ Best if you provide a coefficient of variation, so everyone understands how stable your results are; and check yourself that it’s low enough
The coefficient of variation (standard deviation divided by the mean) is a very good metric of how stable your test results are. If it’s higher than N%, you can’t claim one database is N% faster than another.
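For example, with Python’s standard `statistics` module the coefficient of variation of a set of response times is just the sample standard deviation over the mean:

```python
import statistics

def coefficient_of_variation(times):
    """CoV as a percentage: sample stdev of the runs over their mean."""
    return statistics.stdev(times) / statistics.mean(times) * 100

# e.g. five runs of the same query, in milliseconds:
runs = [512, 498, 505, 520, 490]
```

If the CoV of your runs is, say, 5%, then a 3% difference between two databases is within the noise.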
✅ Best if you test on a fixed CPU frequency
Otherwise, if you are using the “ondemand” CPU governor (normally the default), it can easily turn your 500 ms response time into 1000+ ms.
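On Linux this can be sketched by writing “performance” into the cpufreq sysfs files (root required; the `pattern` parameter is a placeholder so the logic can be pointed at test files):

```python
import glob

def set_governor(governor="performance",
                 pattern="/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"):
    """Pin every core's cpufreq governor so frequency scaling stays fixed."""
    for path in glob.glob(pattern):
        with open(path, "w") as f:
            f.write(governor + "\n")
```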
✅ Best if you test on SSD/NVMe rather than HDD
Otherwise, depending on where your files are located on the HDD, you can get up to 2x lower or higher I/O performance (we tested this), which can make at least your cold-query results wrong.