Building a scalable MySQL Proxy in Rust

At AgilData, we have many years of experience running production MySQL infrastructures at scale. For example, we run a 64-server sharded MySQL cluster for Pokémon GO. One component of our AgilData Scalable Cluster infrastructure is a MySQL proxy server, written in Java, that intercepts queries and executes them against a sharded database cluster. We go beyond the simple single-server routing provided by most sharding solutions and implement full distributed transactions and federated queries across the shards, which allows aggregate queries to be used and minimizes application changes.

While Java and Scala most definitely can work at scale, developers inevitably end up working around the JVM platform to some degree to reduce the cost of garbage collection. Apache Spark is a great example of this. Some brilliant engineering has gone into Spark to make it run on the JVM while not really using the JVM, from dynamic bytecode generation to custom off-heap memory management.

We’re currently working on a new product that requires a MySQL proxy, and this time we decided to build it in Rust and release it as open source. Our timing for this project turned out to be quite fortuitous, thanks to the recent release of two key crates, tokio-rs and futures-rs, which provide a great foundation for scalable asynchronous I/O.

Proxy Overview

The overall design of the proxy is pretty simple. We start up a server that binds to a socket and listens for incoming connections. For each incoming connection we spin up a Future to service requests from that socket. We pass the Future to the tokio-core reactor, which controls the event loop.

The connection-handling code uses future combinators to chain together a number of futures. The call to TcpStream::connect() (to connect to the MySQL server) does not actually connect right away but returns a future of that connection. Rather than waiting there, we can chain this future together with further future operations using the and_then() call. The input to each and_then() function is the output from the previous future.
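To make the chaining idea concrete, here is a minimal sketch that models it using only the standard library. It is not the proxy's code and does not use tokio-core or futures-rs: the `Async` enum, the `Future` trait, and `AndThen` are simplified, infallible stand-ins written for this illustration, and `Delayed` and `run` are hypothetical helpers. The point is only that building the chain does no work, and that each `and_then()` closure receives the previous future's output.

```rust
/// Outcome of a single poll, modelled on the `Async` type in futures 0.1.
enum Async<T> {
    Ready(T),
    NotReady,
}

/// Simplified, infallible stand-in for a poll-based `Future` trait.
trait Future {
    type Item;

    fn poll(&mut self) -> Async<Self::Item>;

    /// Chain a second future that is built from this future's output.
    fn and_then<F, B>(self, f: F) -> AndThen<Self, B, F>
    where
        Self: Sized,
        B: Future,
        F: FnOnce(Self::Item) -> B,
    {
        AndThen::First(self, Some(f))
    }
}

/// State machine behind `and_then`: run the first future, then the second.
enum AndThen<A, B, F> {
    First(A, Option<F>),
    Second(B),
}

impl<A, B, F> Future for AndThen<A, B, F>
where
    A: Future,
    B: Future,
    F: FnOnce(A::Item) -> B,
{
    type Item = B::Item;

    fn poll(&mut self) -> Async<B::Item> {
        let second = match *self {
            AndThen::Second(ref mut b) => return b.poll(),
            AndThen::First(ref mut a, ref mut f) => match a.poll() {
                Async::NotReady => return Async::NotReady,
                // The first future's output becomes the closure's input.
                Async::Ready(item) => (f.take().expect("polled after completion"))(item),
            },
        };
        *self = AndThen::Second(second);
        self.poll()
    }
}

/// Hypothetical future that is `NotReady` for a few polls, then yields a value.
struct Delayed {
    polls_until_ready: u32,
    value: u32,
}

impl Future for Delayed {
    type Item = u32;

    fn poll(&mut self) -> Async<u32> {
        if self.polls_until_ready == 0 {
            Async::Ready(self.value)
        } else {
            self.polls_until_ready -= 1;
            Async::NotReady
        }
    }
}

/// Toy event loop: polls until ready and reports how many polls were needed.
/// A real reactor would wait for socket readiness instead of spinning.
fn run<F: Future>(mut future: F) -> (F::Item, u32) {
    let mut polls = 0;
    loop {
        polls += 1;
        if let Async::Ready(item) = future.poll() {
            return (item, polls);
        }
    }
}

fn main() {
    // Building the chain performs no work; nothing runs until it is polled.
    let chain = Delayed { polls_until_ready: 2, value: 7 } // stands in for "connect"
        .and_then(|conn_id| Delayed { polls_until_ready: 1, value: conn_id + 100 });

    let (output, polls) = run(chain);
    println!("output {} after {} polls", output, polls);
}
```

In the real crates the trait also carries an error type and the reactor drives the polling, so this sketch should be read as a model of the control flow only.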

The Pipe future takes care of polling both the client and server for read and write readiness. It performs as many I/O operations as it can until all of these resources are in the NotReady state, and then returns. The tokio reactor is notified when the sockets are ready for further read/write operations, and the poll() method is then called again.
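The polling behaviour described above can be sketched with in-memory stand-ins for the two sockets. This is an illustrative model, not the crate's Pipe implementation: `MockSocket`, its `try_read` and `write` methods, and the byte-count return value are all assumptions made for the example, and writes are assumed to always succeed, whereas the real future also has to track write readiness.

```rust
use std::collections::VecDeque;

/// Hypothetical in-memory stand-in for a non-blocking socket.
#[derive(Default)]
struct MockSocket {
    /// Chunks that have "arrived" and can be read without blocking.
    readable: VecDeque<Vec<u8>>,
    /// Everything written to this socket so far.
    written: Vec<u8>,
}

impl MockSocket {
    /// Returns `None` when nothing is available (the NotReady case).
    fn try_read(&mut self) -> Option<Vec<u8>> {
        self.readable.pop_front()
    }

    /// Assumed to always succeed in this sketch.
    fn write(&mut self, bytes: &[u8]) {
        self.written.extend_from_slice(bytes);
    }
}

/// Simplified model of a pipe between a client and a MySQL server.
struct Pipe {
    client: MockSocket,
    server: MockSocket,
}

impl Pipe {
    /// One poll: shuttle data in both directions until neither side has
    /// anything left to read, then return the number of bytes moved.
    fn poll(&mut self) -> usize {
        let mut moved = 0;
        loop {
            let mut progressed = false;
            if let Some(request) = self.client.try_read() {
                self.server.write(&request);
                moved += request.len();
                progressed = true;
            }
            if let Some(response) = self.server.try_read() {
                self.client.write(&response);
                moved += response.len();
                progressed = true;
            }
            if !progressed {
                // Both sides are NotReady: hand control back to the event loop.
                return moved;
            }
        }
    }
}

fn main() {
    let mut pipe = Pipe { client: MockSocket::default(), server: MockSocket::default() };
    pipe.client.readable.push_back(b"SELECT 1".to_vec());
    pipe.server.readable.push_back(b"OK".to_vec());

    println!("first poll moved {} bytes", pipe.poll());
    // Nothing new has arrived, so a second poll does no work.
    println!("second poll moved {} bytes", pipe.poll());
}
```

Keeping both directions inside one poll loop is what lets a single future service the whole connection without any messaging between tasks.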

Potentially, we could break this one future down into separate futures for each individual operation (client-read, client-write, server-read, server-write). We would then need to implement messaging between the futures, perhaps using channels, which adds complexity and possibly overhead too. We haven’t explored that option extensively.

PacketHandler API

The real value in this mysql-proxy crate, though, is the ability for the user to provide their own implementation of the PacketHandler trait, which tells the proxy what action to perform for each request or response packet. The supported operations are Drop, Forward, Mutate, Respond, and Error.

The simplest implementation of a packet handler that simply forwards all packets without any further behavior would look like this:
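The original listings are not reproduced here, so the following is a reconstruction sketch rather than the crate's actual source. The five variant names and the PacketHandler trait name come from the text above; the payload attached to each variant, the `Packet` struct, the method names `handle_request` and `handle_response`, and the `PassThroughHandler` type are assumptions for illustration. Check the crate's documentation for the real definitions.

```rust
/// A raw MySQL protocol packet (assumed representation: header plus payload).
#[derive(Debug, Clone, PartialEq)]
pub struct Packet {
    pub bytes: Vec<u8>,
}

/// What the proxy should do with a packet. Variant names follow the post;
/// the payloads are assumptions.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Silently discard the packet.
    Drop,
    /// Pass the packet through unchanged.
    Forward,
    /// Send a replacement packet instead of the original.
    Mutate(Packet),
    /// Answer directly with these packets without forwarding.
    Respond(Vec<Packet>),
    /// Answer with a MySQL error.
    Error { code: u16, state: [u8; 5], msg: String },
}

/// User-supplied logic, consulted for every request and response packet.
pub trait PacketHandler {
    fn handle_request(&mut self, packet: &Packet) -> Action;
    fn handle_response(&mut self, packet: &Packet) -> Action;
}

/// The simplest possible handler: forward everything in both directions.
pub struct PassThroughHandler;

impl PacketHandler for PassThroughHandler {
    fn handle_request(&mut self, _packet: &Packet) -> Action {
        Action::Forward
    }

    fn handle_response(&mut self, _packet: &Packet) -> Action {
        Action::Forward
    }
}

fn main() {
    let mut handler = PassThroughHandler;
    // COM_PING: 3-byte payload length (1), sequence id 0, command byte 0x0e.
    let ping = Packet { bytes: vec![0x01, 0x00, 0x00, 0x00, 0x0e] };
    println!("{:?}", handler.handle_request(&ping));
}
```

A handler that needs state, such as counters or a query log, can keep it in its own fields, since both methods take `&mut self` in this sketch.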

One of the things we really liked about implementing this in Rust was the conciseness of the code to invoke the packet handler and then perform the requested action. Rust’s pattern matching works great for use cases like this.
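As a rough illustration of that dispatch, the sketch below applies a handler's decision for a client request with a single `match`. It is a simplified model and not the proxy's code: packets are plain byte vectors, the two output buffers stand in for the server and client sockets, `apply_request_action` is a hypothetical name, and the error arm writes a readable placeholder rather than a properly encoded MySQL ERR packet.

```rust
/// Simplified action type for this sketch (payload shapes are assumptions).
enum Action {
    Drop,
    Forward,
    Mutate(Vec<u8>),
    Respond(Vec<Vec<u8>>),
    Error { code: u16, msg: String },
}

/// Apply the handler's decision for one client request.
/// `to_server` and `to_client` stand in for the two write buffers.
fn apply_request_action(
    action: Action,
    original: &[u8],
    to_server: &mut Vec<u8>,
    to_client: &mut Vec<u8>,
) {
    match action {
        // Nothing is written anywhere.
        Action::Drop => {}
        // The original bytes go to the server untouched.
        Action::Forward => to_server.extend_from_slice(original),
        // The replacement goes to the server instead of the original.
        Action::Mutate(replacement) => to_server.extend_from_slice(&replacement),
        // The server never sees the request; the client gets canned replies.
        Action::Respond(packets) => {
            for packet in &packets {
                to_client.extend_from_slice(packet);
            }
        }
        // Placeholder text only; a real proxy would encode an ERR packet.
        Action::Error { code, msg } => {
            to_client.extend_from_slice(format!("ERR {}: {}", code, msg).as_bytes());
        }
    }
}

fn main() {
    let mut to_server = Vec::new();
    let mut to_client = Vec::new();

    apply_request_action(Action::Forward, b"SELECT 1", &mut to_server, &mut to_client);
    apply_request_action(
        Action::Error { code: 1064, msg: String::from("blocked by proxy") },
        b"DROP TABLE t",
        &mut to_server,
        &mut to_client,
    );

    println!("to server: {}", String::from_utf8_lossy(&to_server));
    println!("to client: {}", String::from_utf8_lossy(&to_client));
}
```

Because the `match` must be exhaustive, adding a new action variant later produces a compile error at every dispatch site until it is handled.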

Benchmarks

We used the industry-standard TPC-C benchmark to compare the performance of the proxy with that of mysql-router. We ran the benchmarks both in the cloud (AWS) and on dedicated hardware. In each case, the TPC-C benchmark and the proxy were on one server, with MySQL on a second server. We used mysql-proxy-rs version 0.1.7 built with the --release flag in both cases.

Cloud

[Figure: TPC-C benchmark results, cloud (AWS)]

Dedicated Hardware

[Figure: TPC-C benchmark results, dedicated hardware]

We were pleased to see that our proxy had comparable performance to mysql-router despite being built on a framework that was only released a few weeks ago. As with any benchmark, this should be taken with a grain of salt because at this stage it is entirely possible that mysql-proxy-rs is missing basic features that will need to be added and those features could impact performance. However, this is an encouraging start!

Why are we open sourcing this?

In our experience with supporting multiple versions of MySQL and MariaDB over the years, we know that there are protocol compatibility issues we have to handle correctly and it takes time to get this right. By open sourcing the proxy we hope we can receive contributions from the community to help accelerate the time-to-maturity for this project. We’re also excited to be able to make a contribution to the Rust ecosystem.

Future Roadmap

We’re tracking our roadmap on GitHub, but here are some of the main areas we are exploring:

  • Currently, the proxy maintains a one-to-one mapping of a client connection to a server connection. For more flexibility, the API should be extended so that the user-provided proxy logic can decide which MySQL instance to connect to. This would facilitate basic query routing and sharding.
  • We’re currently working with the base tokio-core crate, but would most likely benefit from moving up the stack to tokio-proto and potentially tokio-middleware. This would be a good opportunity for us to get more involved with the tokio project too.
  • Currently, the proxy doesn’t contain much code for parsing different packet types. We will be moving more of this code from our product into the open source code base over time.

Lessons Learned

Rust is a young language and is evolving quickly. We’re using nightly builds, and we have occasionally hit issues where we need to make sure all developers are using the same version, with coordinated upgrades to take advantage of new features.

The tokio-rs and futures-rs crates are very new and are evolving even faster than the language, so we had a couple of false starts trying to code against a moving target. Now that both of these crates have 0.1.0 releases we expect this to become less of an issue, but we understand that we’re on the bleeding edge here.

As I already said in my post on my experiences at RustConf 2016, the Rust community is awesome. We often ask questions on the Rust IRC channel or The Tokio Project’s Gitter, and we use the Rust Playground to share code samples.

Get Involved!

The source code for the proxy is hosted on GitHub and the crate has also been published to crates.io.

We’d love to hear from you if you find this useful, and we would welcome contributions.
