Introduction
MySQL 8.0.20 introduced binary log transaction compression, a feature designed to save storage space and reduce network bandwidth by compressing each transaction's payload with the zstd algorithm. The feature is OFF by default and can be enabled dynamically, without restarting the server, by setting the binlog_transaction_compression system variable to ON. The zstd compression level can be adjusted with the binlog_transaction_compression_level_zstd variable. By default, MySQL uses level 3, which balances resource usage against compression ratio. Detailed information on the implementation can be found in work log WL#3549.
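As a minimal sketch of the settings described above, the Python helper below builds the two SET statements an operator might run. The helper name build_compression_statements is hypothetical, and the 1 to 22 bounds are the range we assume for the zstd level variable; the statements are only printed here, not sent to a server.

```python
from typing import List

# Assumed valid range for binlog_transaction_compression_level_zstd.
ZSTD_MIN_LEVEL = 1
ZSTD_MAX_LEVEL = 22
ZSTD_DEFAULT_LEVEL = 3  # the default level mentioned in the text


def build_compression_statements(level: int = ZSTD_DEFAULT_LEVEL,
                                 scope: str = "GLOBAL") -> List[str]:
    """Return SQL statements that enable compression at the given zstd level.

    Hypothetical helper for illustration; it only builds strings.
    """
    if scope not in ("GLOBAL", "SESSION"):
        raise ValueError("scope must be GLOBAL or SESSION")
    if not ZSTD_MIN_LEVEL <= level <= ZSTD_MAX_LEVEL:
        raise ValueError("zstd level must be between 1 and 22")
    return [
        f"SET {scope} binlog_transaction_compression = ON;",
        f"SET {scope} binlog_transaction_compression_level_zstd = {level};",
    ]


if __name__ == "__main__":
    for statement in build_compression_statements(level=3):
        print(statement)
```

Higher levels trade more CPU for a better ratio, so it may be worth measuring before moving away from the default.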
Limitations
While MySQL's binary log transaction compression offers several benefits, it does not apply to every event. The following restrictions apply:
- Events from non-transactional storage engines are not compressed.
- Transactions that end up in incident events are not compressed.
- The compressed payload of a transaction cannot exceed (MAX_ALLOWED_PACKET - bytes used for event header) bytes.
- Only transactions logged in ROW-based format are compressed.
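The payload size limit can be expressed as a simple inequality. The sketch below is illustrative only: fits_in_one_event is a hypothetical helper, and the header size passed in is a placeholder, since the exact number of header bytes is not given here.

```python
def fits_in_one_event(compressed_payload_bytes: int,
                      max_allowed_packet: int,
                      event_header_bytes: int) -> bool:
    """Check the limit: payload <= MAX_ALLOWED_PACKET - header bytes."""
    if event_header_bytes < 0 or event_header_bytes > max_allowed_packet:
        raise ValueError("header size must be between 0 and max_allowed_packet")
    return compressed_payload_bytes <= max_allowed_packet - event_header_bytes


if __name__ == "__main__":
    packet = 64 * 1024 * 1024   # example max_allowed_packet of 64 MiB
    header = 100                # placeholder header size, not a real figure
    print(fits_in_one_event(10 * 1024 * 1024, packet, header))  # True
    print(fits_in_one_event(packet, packet, header))            # False
```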
Monitoring
The effects of binary log transaction compression can be monitored using the Performance Schema table binary_log_transaction_compression_stats. This table provides various statistics, such as the data compression ratio for the monitored period and the impact of compression on the server's most recent transaction. The table can be truncated to reset the statistics. For example:
In this example, the compression ratio is around 66%, demonstrating significant storage savings.
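To make the ratio concrete, here is a small sketch of how such a percentage can be derived from the table's COMPRESSED_BYTES_COUNTER and UNCOMPRESSED_BYTES_COUNTER columns. It assumes the percentage is the share of bytes saved, and the byte counts used are made-up illustrative values, not measurements from this test.

```python
def compression_percentage(compressed_bytes: int, uncompressed_bytes: int) -> float:
    """Share of bytes saved by compression, as a percentage.

    Assumption: percentage = (1 - compressed / uncompressed) * 100.
    """
    if uncompressed_bytes <= 0:
        raise ValueError("uncompressed_bytes must be positive")
    return 100.0 * (1 - compressed_bytes / uncompressed_bytes)


if __name__ == "__main__":
    # Illustrative values only: 340 bytes written for 1000 bytes of events.
    print(round(compression_percentage(340, 1000)))  # 66
```

A value around 66% therefore means the compressed payload takes roughly one third of the original space.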
Readyset
We are pleased to announce that, starting with Readyset version stable-240523, users can benefit from MySQL binary log transaction compression support. As part of adding this support to Readyset, we also extended the MySQL Rust driver so that it can decode compressed events. This work was contributed upstream and released in version v0.31.0 of the Rust driver.
Test Scenarios
To test the potential savings of compressing binary logs, we conducted tests on two different EC2 instances: a smaller instance (t2.medium) with 2 CPUs and 4GB of RAM, and a larger instance (c4.4xlarge) with 16 CPUs and 30GB of RAM. The steps to reproduce this test are as follows:
1) Create a large table and populate it:
2) Run the workload from an application server:
Case 1: Small Server (2 CPUs)
In the initial run, binary log compression was disabled. Using mysqlbinlog together with shell and Perl scripts, we counted the number of transactions within a single binary log. In the example below, we can see that a single binary log contains 2,123,891 transactions.
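As a rough Python stand-in for that kind of counting script (not the scripts used in this test), the sketch below counts transactions in the text output of mysqlbinlog. It assumes each committed transaction shows up as an event header line containing `Xid = <number>`, which is typical for InnoDB transactions but should be verified against your own output. The function name and the sample lines are hypothetical.

```python
import re
from typing import Iterable

# Event header lines start with '#' plus a six-digit date, e.g. "#240523 ...".
# Assumption: one "Xid = N" header line per committed transaction.
XID_HEADER = re.compile(r"^#\d{6} .*\bXid = \d+")


def count_transactions(lines: Iterable[str]) -> int:
    """Count event header lines that carry an Xid (commit) event."""
    return sum(1 for line in lines if XID_HEADER.match(line))


if __name__ == "__main__":
    # Hypothetical, hand-written lines shaped like mysqlbinlog output.
    sample = [
        "# at 4",
        "#240523 10:00:00 server id 1  end_log_pos 500 CRC32 0x00000000 \tXid = 42",
        "COMMIT/*!*/;",
        "#240523 10:00:01 server id 1  end_log_pos 900 CRC32 0x00000000 \tXid = 43",
        "COMMIT/*!*/;",
    ]
    print(count_transactions(sample))  # 2
```

For a real log, the same function could be fed the lines of a file that holds the saved mysqlbinlog output.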
Network utilization was approximately 23GB:
And CPU usage was around 22%:
After enabling binary log compression, the number of transactions in a single binary log increased to 5,764,847, about 2.7 times the uncompressed count (a 171% increase):
We also observed a 40% reduction in network usage:
CPU usage, however, increased from 22% to 39%:
Case 2: Large Server (16 CPUs)
On the larger server, we repeated the same steps. In the first run, with compression disabled, the binary log contained the same number of transactions as on the small server:
Network usage stabilized at 63GB/hour:
CPU usage was stable at 3.85%:
In the second run with binary log compression enabled, we observed a similar increase in the number of transactions stored in the binary log, as anticipated:
Network usage was reduced by 68%:
CPU usage showed a slight increase:
Results
Below are the charts summarizing these numbers:
Summary
Binary log compression delivered significant savings in disk and network usage. The storage gains were consistent across both servers because the algorithm and workload were the same. The network savings, however, differed (40% on the small server versus 68% on the large one) because the increased CPU load reduced the capacity to process the workload. Servers with more capacity benefited more from compression because they experienced less CPU impact. These observations may explain Oracle's decision to leave binlog_transaction_compression disabled by default.
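The small-server figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers reported above; percent_change is a hypothetical helper. Note that a count rising to about 2.7 times the baseline corresponds to an increase of roughly 171%.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, as a percentage."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Transactions per binary log on the small server (from the test above).
    tx_before, tx_after = 2_123_891, 5_764_847
    print(f"{tx_after / tx_before:.2f}x")                  # 2.71x
    print(f"{percent_change(tx_before, tx_after):.0f}%")   # 171%

    # CPU usage on the small server, in percent of capacity.
    cpu_before, cpu_after = 22.0, 39.0
    print(f"{cpu_after - cpu_before:.0f} points")          # 17 points
    print(f"{percent_change(cpu_before, cpu_after):.0f}%") # 77%
```

In other words, on the small server compression fit about 2.7 times as many transactions into a binary log at the cost of 17 extra percentage points of CPU.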
Users who enable binary log transaction compression on their MySQL server can now see the same reduction in network usage when using Readyset.