A next-generation trading system that delivers faster performance
I. Development of the Electronic Trading System
The US Paperwork Crisis in the 70s
Latency and throughput are the key indicators of a trading system’s performance. Our primary objective in designing a trading system is to achieve low latency and high throughput.
In the context of trading, latency is the time interval between a trading system receiving a request and returning its response. The surge in high-frequency trading volume largely drives the market’s demand for low latency. For high-frequency traders to cross-trade on crypto exchanges, trading systems need low-latency trading engines that handle orders quickly and reflect market realities in the highly competitive crypto market.
Throughput is the number of requests or events that a trading system can process per second. Throughput directly affects trading efficiency, so crypto trading systems should be designed to withstand extreme load and to make full use of their processing units.
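As a rough illustration of the two indicators, the sketch below times a request handler and reports per-request latency and overall throughput. The function name `measure` and the handler are hypothetical; a production benchmark would also account for network time and concurrency.

```python
import time


def measure(handler, requests):
    """Return (per-request latencies in seconds, throughput in requests/second)."""
    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        handler(req)  # the work whose latency we want to observe
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    # Throughput is the number of processed requests divided by wall-clock time.
    throughput = len(requests) / elapsed if elapsed > 0 else float("inf")
    return latencies, throughput
```

Latency is usually reported as a distribution (for example the median and the 99th percentile) rather than a single average, because occasional slow responses matter to traders.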
b. Maintainability and scalability
Compared to traditional assets, crypto prices are more volatile and more vulnerable to global shocks. Because crypto trading systems handle requests 24/7, they are designed to need as little offline maintenance as possible. The crypto sector is also changing rapidly: derivatives services as varied as margin, futures, and options trading have all been rolled out within about a decade of the sector’s rise. This proliferation of new services has raised the requirements for the maintainability and scalability of crypto trading systems.
II. OKEx Lightning System 2.0: Lightspeed Performance
Lightning 2.0 upgrade framework
In the early development stage of crypto trading systems, platforms usually matched orders directly in the database: the system looked up counterparty orders for each incoming order until it was filled or expired, then calculated the traded amount and generated a transaction entry. This method ensured data consistency but could not handle many simultaneous market requests because of its long processing time.
Our next-generation trading system, Lightning 2.0, adopts in-memory matching: the order matching engine keeps order data in memory during auto-matching and accesses the database less frequently during trading. All matching outcomes and intermediate data are also stored in memory, which reduces the number of input/output operations involved and significantly boosts order matching speed.
Moreover, modern central processing units (CPUs) access main memory more slowly than one might expect. According to a test, retrieving data from a CPU’s L2 cache takes only about 1/7 of the time needed to retrieve it from main memory. To reduce latency further, it is important to make good use of the CPU cache. The unit of data transfer is the cache line, which is usually 64 bytes. When the CPU loads data from memory, it transfers the adjacent data in the same 64-byte line into the cache. Accordingly, we have made the following improvements to our Lightning system by controlling the distribution of in-memory data:
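The idea of in-memory matching can be sketched as follows. This is a minimal illustration with price-time priority, not the Lightning 2.0 engine itself; the class `OrderBook`, its method names, and the use of integer price ticks are all assumptions made for the example.

```python
import heapq
from itertools import count


class OrderBook:
    """Hypothetical in-memory order book with price-time priority."""

    def __init__(self):
        self._bids = []      # max-heap via negated price: (-price, seq, order)
        self._asks = []      # min-heap: (price, seq, order)
        self._seq = count()  # arrival counter, breaks ties by time
        self.trades = []     # matching results are kept in memory as well

    def submit(self, side, price, qty):
        """Match an incoming order; return the quantity left unfilled."""
        order = {"side": side, "price": price, "qty": qty}
        if side == "buy":
            book, opposite = self._bids, self._asks
        else:
            book, opposite = self._asks, self._bids
        # Match against the best resting orders on the opposite side.
        while order["qty"] > 0 and opposite:
            best = opposite[0][2]
            if side == "buy":
                crosses = price >= best["price"]
            else:
                crosses = price <= best["price"]
            if not crosses:
                break
            fill = min(order["qty"], best["qty"])
            self.trades.append((best["price"], fill))  # trade at resting price
            order["qty"] -= fill
            best["qty"] -= fill
            if best["qty"] == 0:
                heapq.heappop(opposite)
        # Rest any unfilled remainder in the book; no database round trip.
        if order["qty"] > 0:
            key = -price if side == "buy" else price
            heapq.heappush(book, (key, next(self._seq), order))
        return order["qty"]
```

In a design like this, persistence to a database would happen outside the matching path, for example by writing the `trades` list asynchronously.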
The two main types of messaging models are as follows:
Comparison of Lightning 1.0 and Lightning 2.0
In the request-response model, the client and the server are tightly coupled: both must be available at the same time, and the client has to wait until the server finishes processing the request, which lowers its processing speed. In the publish-subscribe model, by contrast, the publisher’s part of the request is complete once it places the message onto the queue, so the publisher is decoupled from the subscriber. In addition, if the subscriber’s service is interrupted, the message persists on the queue and processing continues when the service resumes, without the publisher having to resend the message, which enhances the reliability of system communication. Therefore, the publish-subscribe pattern is adopted in almost all scenarios to improve our Lightning 2.0 system’s availability and throughput.
After selecting the publish-subscribe pattern, the next step is to choose a suitable message exchange format. The essence of communication is exchanging messages, which usually carry data. Exchange formats differ in transmission speed, in how easily the communication protocol can evolve, and in which programming languages they support. The format is therefore a key consideration in designing a trading system.
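The decoupling described above can be illustrated with a small in-process queue. This is only a sketch: `MessageQueue` is a hypothetical stand-in for a real message broker, and a real deployment would persist messages on disk and deliver them across processes.

```python
from collections import deque


class MessageQueue:
    """Hypothetical in-process stand-in for a message broker."""

    def __init__(self):
        self._messages = deque()

    def __len__(self):
        return len(self._messages)

    def publish(self, message):
        # The publisher's work ends as soon as the message is enqueued;
        # it does not wait for any subscriber.
        self._messages.append(message)

    def drain(self, handler):
        # Deliver queued messages in order. A message is removed only after
        # the handler returns without raising, so a failed subscriber can
        # resume later without the publisher resending anything.
        delivered = 0
        while self._messages:
            handler(self._messages[0])
            self._messages.popleft()
            delivered += 1
        return delivered
```

If the subscriber is offline, `publish` still succeeds and the messages simply wait on the queue until `drain` is called again.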
Two common types of message formats: text-based & binary
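To make the difference between the two families concrete, the sketch below encodes the same order once as JSON text and once as a fixed binary layout. The field names and the layout are assumptions chosen for illustration; the source does not state which format Lightning 2.0 uses.

```python
import json
import struct

# Hypothetical order message; field names and values are illustrative only.
order = {"order_id": 123456789, "price": 4215075, "qty": 250, "side": 1}

# Text-based: self-describing and easy to read, but larger on the wire.
text_msg = json.dumps(order).encode("utf-8")

# Binary: little-endian, no padding.
# Q = unsigned 64-bit order_id, q = signed 64-bit price in ticks,
# I = unsigned 32-bit qty, B = unsigned 8-bit side.
ORDER_LAYOUT = struct.Struct("<QqIB")
binary_msg = ORDER_LAYOUT.pack(
    order["order_id"], order["price"], order["qty"], order["side"]
)

# Decoding needs the layout agreed in advance, since no field names are sent.
decoded = dict(zip(order, ORDER_LAYOUT.unpack(binary_msg)))
```

The binary message here is 21 bytes, well under the size of the JSON text. The trade-off is that both sides must share the layout, which makes the format harder to evolve than a self-describing text format.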
3. Horizontal scaling
To improve and expand the processing capability of a trading system, both vertical and horizontal scaling are useful. Vertical scaling means upgrading a server, while horizontal scaling means adding servers. A single server’s hardware performance is limited by what can currently be manufactured; once its configuration reaches that limit, it cannot be improved further, and horizontal scaling is the only option. However, horizontal scaling raises the problem of load balancing: how can the load of the entire system be distributed reasonably across different servers?
The first consideration is the data race. Although adding servers improves the system’s ability to process data in parallel, its processing capacity still cannot improve effectively if the load is distributed poorly, because servers working in parallel may frequently contend for the same data.
A trading system basically stores order, fund, and position data. To reduce data races, we shard the load by user, partitioning these data so that each user’s order, fund, and position data are processed independently. We further optimized our system by adding a round of batch processing for each shard to increase processing capacity. Margin data for derivatives trading pairs is a second target for load sharding, since for a given user each trading pair is completely independent. In this way, we employ load sharding in two phases. When our system needs more servers, load rebalancing on top of the sharding provides flexible system expansion.
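The two sharding phases and the per-shard batching might look roughly like this. The function names, the modulo scheme, and the use of CRC32 are assumptions for the sketch; the source does not describe the actual partitioning function.

```python
import zlib
from collections import defaultdict


def user_shard(user_id: int, num_shards: int) -> int:
    # Phase 1: all of a user's orders, funds, and positions map to one shard,
    # so no two shards ever touch the same user's data.
    return user_id % num_shards


def margin_shard(user_id: int, pair: str, num_shards: int) -> int:
    # Phase 2: margin data is keyed by (user, trading pair). CRC32 is used
    # because it is stable across processes, unlike built-in hash() on str.
    key = f"{user_id}:{pair}".encode("utf-8")
    return zlib.crc32(key) % num_shards


def batch_by_shard(requests, num_shards):
    # Group pending requests so each shard can process its batch in one pass.
    batches = defaultdict(list)
    for req in requests:
        batches[user_shard(req["user_id"], num_shards)].append(req)
    return dict(batches)
```

A plain modulo scheme moves many keys when `num_shards` changes, so a system that rebalances often would more likely use a shard mapping table or consistent hashing.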
4. System Scaling
A basic way to enhance the maintainability and scalability of a trading system is to separate its functionality. In this upgrade, we further split our system’s functionality into three modules: order matching, counter, and risk control. Each module holds its own internal data and state. The order matching module maintains the order book, the counter module stores data on positions and account balances, and the risk control module performs risk management.
As the modules work with each other to enable the functionality of the entire trading system, a mechanism is required for their communication. There are two options for inter-service communication: data sharing and messaging.
Data sharing is the most basic method: one module updates its data, and another module obtains the new data by querying. However, this approach has two significant disadvantages. First, if multiple modules modify and query the same data, data races usually result, and the database’s response time becomes far longer. Second, it is difficult to learn of changes in other modules in real time, because a change is visible only after the next query.
As a result, our Lightning 2.0 system’s modules are designed to keep their own data and not share data with each other. When a module’s internal state changes, the change is encapsulated into an event and placed onto the event loop. This reduces coupling and contention between system modules, and the modules can communicate with each other quickly once the change is encapsulated as an event, which greatly enhances our system’s communication speed.
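A minimal sketch of this event-based design is shown below. The classes `EventLoop`, `Counter`, and `RiskControl`, the event names, and the balance rule are hypothetical; they only illustrate modules that own their state and communicate through events.

```python
from collections import defaultdict, deque


class EventLoop:
    """Hypothetical single-threaded event loop shared by the modules."""

    def __init__(self):
        self._queue = deque()
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        self._queue.append((event_type, payload))

    def run(self):
        # Process events in arrival order, including events emitted by handlers.
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(payload)


class Counter:
    """Owns account balances; no other module reads or writes this dict."""

    def __init__(self, loop):
        self.balances = defaultdict(int)
        self._loop = loop
        loop.subscribe("trade", self.on_trade)

    def on_trade(self, trade):
        value = trade["price"] * trade["qty"]
        self.balances[trade["buyer"]] -= value
        self.balances[trade["seller"]] += value
        # Publish the state change instead of exposing the balances directly.
        for user in (trade["buyer"], trade["seller"]):
            self._loop.emit(
                "balance_changed", {"user": user, "balance": self.balances[user]}
            )


class RiskControl:
    """Builds its own view from events only; flags balances below a floor."""

    def __init__(self, loop, floor=0):
        self.flagged = set()
        self._floor = floor
        loop.subscribe("balance_changed", self.on_balance)

    def on_balance(self, event):
        if event["balance"] < self._floor:
            self.flagged.add(event["user"])
```

Because each module reacts to events as they arrive, the risk module learns of a balance change immediately, rather than only after its next query.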
III. Lightning 2.0 Data Performance
Here are the latest statistics from our Hong Kong server testing in November:
Our system has a peak order processing capacity of 100,000 txn/s, comparable to mainstream trading systems in the global equity market.
The following three indicators are used to test system latency:
Three common indicators to test latency: ACK, Live, and Cancel
The results show that our Lightning 2.0 trading system has lower latency.
Before Upgrade/ After Upgrade
IV. Industry Leader in Technology
As a world-leading cryptocurrency exchange with comprehensive C2C, spot, and derivatives trading services, we are constantly improving our trading products, risk management system, order matching engine, crypto asset storage service, and customer service. We have become the world’s largest crypto derivatives trading platform and are popular with users worldwide. Our ultimate goal is to grow with the blockchain and crypto sectors by committing extra resources to higher trading security and efficiency, pushing forward the development of the blockchain-driven world that everyone in the crypto space is dreaming of.
Disclaimer: This material should not be taken as the basis for making investment decisions, nor be construed as a recommendation to engage in investment transactions. Trading digital assets involves significant risk and can result in the loss of your invested capital. You should ensure that you fully understand the risk involved, take into consideration your level of experience and investment objectives, and seek independent financial advice if necessary.