System design for millions of users (Part 1)

Designing a system that supports millions of users is not a small challenge, it requires continuous and continuous improvement. In this article, we will build a system that supports one user and gradually expands to millions of users.

System design for millions of users (Part 1)


System design for millions of users (Part 1)

Designing a system that supports millions of users is not a small challenge, it requires continuous and continuous improvement. In this article, we will build a system that supports one user and gradually expands to millions of users.



1. Single system configuration

When you get started, do everything on a single server. The following figure shows how to configure the server to run everything on the server, including web applications, databases, caches, etc.



To understand this setting, look at request flows and traffic sources.



First, let's take a look at the request flow.


1. Users access the website through a domain name such as api.mysite.com. The domain name of the website is provided by DNS. DNS (Domain Name System) is a third-party paid service that provides domain names and is not hosted on a server.


2. The Internet Protocol (IP) address is returned by the browser or mobile application. In this example, the IP address returned is


3. Once the IP address is obtained, Hypertext Transfer Protocol (HTTP) [1] is sent directly to the webserver.


4. The web server returns an HTML or JSON page in the response extraction.

Next, check the traffic source. Web server traffic comes from two sources: web applications and mobile applications.


• Web application: Combines server-side languages ​​(Java, Python, etc.) to process business logic, storage, and client-side languages ​​(HTML, JS) to represent an application.


• Mobile application: The HTTP protocol is a communication protocol between a mobile application and a web server. JSON is commonly used as a response API format for simple data conversions.



2. Database

As the number of users gradually increases, one server is not enough, one for web/mobile access and another for the database. By separating the access server (web tier) and the database server (data tier), you can extend them independently.



Database selection


You can choose from SQL or NoSQL types. Relational database (SQL) is also known as RDBMS (relational database management system). There are common names such as MySQL, Oracle, and PostgreSQL. SQL represents data and stores it in tables and rows. It is possible to perform join operations between tables with different SQL.


Non-relational databases (NoSQL) have common names such as MongoDB, Cassandra, Neo4j, and Redis.


There are four types of non-relational databases: key values, document-oriented, column-oriented, and graph-oriented.


Join operations are not supported in NoSQL.


For most developers, SQL is a better choice because it has a development period of over 40 years, a lot of documentation, and a strong community. However, in special cases, you can choose NoSQL instead for the following reasons:


  • Applications require very low latency.
  • Unstructured or irrelevant data.
  • Data (JSON, XML, etc.) must be serialized and deserialized.
  • It is necessary to save a large amount of data.



3. Scale-up and scale-out

Scale-up (making your system bigger) means the process of adding hardware (CPU, RAM, etc.) to your server. Scale-out allows you to add multiple servers to a resource.


When traffic is low, scaling up is a great solution to simplify the problem. Unfortunately, there are serious restrictions.


  • Scale-up is limited because you cannot add unlimited CPU and memory to the server
  • Scale-up without automatic conversion or failover. If the server goes down, both the application and the web will crash completely.
  • Scale-out is better suited for scaling large-scale applications than scaling up.


In the previous design, the user was directly connected to the webserver. If the server is offline, users will not be able to access the website. Also, if many users access the webserver at the same time, the server may become overloaded, slowing down the user's response, or making it impossible to connect to the server. At this point, load balancers are the best way to solve this problem.



4. Load balancer

The load balancer evenly distributes traffic among the web servers defined by the load balancer.



As mentioned above, the user connects directly to the load balancer's public IP address. The above settings do not allow clients to access the webserver directly. For added security, private IP is used for communication between servers. A private IP is an IP address that can be reached between servers in the same network, but not from the external Internet. The load balancer communicates with the webserver over a private IP.


There are two web servers behind the load balancer, which solves the problem of automatic switching and improves the usability of the web. The details are as follows.


  • If server 1 is offline, access will be forwarded to server 2. This will prevent the website from crashing. It also adds a new healthy web server for load balancing.
  • If your web traffic surges and your two servers can't handle it well, you can handle it properly with your load balancer. Simply download the server, add it to a nearby webserver group, and the request will be sent to the server automatically.
  • There is no problem with the Web layer, but there is no problem with the data layer. Automatic and failover are not supported because the current design has only one database. Database replication is a common technique for solving this problem.



5. Database replication

Database replication can be used on systems that manage multiple databases. Usually, there is a master/slave relationship between the master and the replica (slave).


The master database only supports write operations. The slave database, on the other hand, retrieves replication data from the master database and provides read operations only. All data editing commands such as INSERT, DELETE, UPDATE are sent to the master database. Most applications require read access rather than write access, so the system has more slave databases than the master database.



Benefits of database replication.

• Better performance

In the master-slave model, all writing and updates are done on the master, and read operations are distributed across the slaves. This model can process queries in parallel, which improves performance.

• Reliability

If one of your databases is destroyed by a natural disaster such as a hurricane or an earthquake, you can still recover your data. You don't have to worry about data loss as it will be copied to multiple locations.

• High availability

The data is copied to multiple locations so that you can access the other database even if one database is offline.


The following image shows how to add a load balancer to a database replica.



From the image above, you can see that:


• The user gets the IP address of the load balancer from DNS.

• The user connects to the load balancer from this IP address.

• HTTP requires a redirect to Server 1 or Server 2.

• The web server reads user data from the slave database.

• The web server transfers data editing operations to the master database. You can add, edit, and delete these operations.



6. Cache

The cache is temporary storage for storing the results of frequently accessed responses in data memory, resulting in faster response to subsequent requests. Each time a web page is reloaded, one or more databases are called to fetch data, as shown in Figures 1-6. Application performance can be affected by these duplicate calls. The cache can solve the above problem.



7. Cache layer

The cache layer is the layer that stores temporary data that is faster than the database. The advantages of separate cache tiers are improved system performance, reduced database load, and independent cache scalability. The following figure shows the cache server setup.



After receiving the request, the webserver checks to see if the cache is available for the response. In that case, send the data back to the client. Otherwise, it queries the database, caches the response, and sends it back to the client. This caching strategy is called read-through caching.


Problems with cache usage


• Determine when to use the cache. If you read the data frequently and rarely edit it, consider using a cache. The server cache is not suitable for persistent data because the cached data is stored in volatile memory. For example, restarting the cache server will result in the loss of all data, so important data must be stored in long-term memory.


• The policy will expire. Each time the cached data expires, it will be cleared from the cache. If there is no expiration policy, the cached data will be stored permanently.


• Consistency: Includes keeping cached and stored data in sync. Inconsistencies can occur if database and cache data operations are not within a single transaction. Maintaining consistency between the database and cache is difficult when scaling across multiple regions.


• Fault mitigation: A single server cache can be a single point of failure (SPOF). As defined by Wikipedia, "a single point of failure (SPOF) is a component of a system, and in the event of a failure, the entire system fails." Therefore, multiple server caches on different data centers avoid SPOFs. Another approach is to overprovision the required memory at a certain percentage. This will provide a buffer as memory usage increases.



If you are considering offshore development, please feel free to contact us.

Here is our contact information.

Account Manager: Quan (Japanese/English available)

Phone number: (+84) 2462 900 388

Email: contact@hachinet.com

Please feel free to contact us for consultation/application by phone.

Click here for more information ▶