What is network latency?
Network latency is the delay in network communication. It shows the time that data takes to transfer across the network. Networks with a longer delay or lag have high latency, while those with fast response times have low latency. Businesses prefer low latency and faster network communication for greater productivity and more efficient business operations. Some types of applications, such as fluid dynamics and other high-performance computing use cases, require low network latency to keep up with their computation demands. High network latency causes application performance to degrade and, at high enough levels, causes applications to fail.
Why is latency important?
As more companies undergo digital transformation, they use cloud-based applications and services to perform basic business functions. Operations also rely on data collected from smart devices connected to the internet, which are collectively called the Internet of Things. The lag caused by high latency creates inefficiencies, especially in real-time operations that depend on sensor data. High latency also reduces the benefits of spending more on network capacity, which affects both user experience and customer satisfaction even when businesses implement expensive network circuits.
Which applications require low network latency?
Although all businesses prefer low latency, it's more crucial for specific industries and applications. The following are example use cases.
Streaming analytics applications
Streaming analytics applications, such as real-time auctions, online betting, and multiplayer games, consume and analyze large volumes of real-time streaming data from various sources. Users of such applications depend on accurate real-time information to make decisions. They prefer a low-latency network because lag can have financial consequences.
Real-time data management
Enterprise applications often merge and optimize data from different sources, like other software, transactional databases, cloud, and sensors. They use change data capture (CDC) technology to capture and process data changes in real time. Network latency problems can easily interfere with these applications' performance.
API integration
Two different computer systems communicate with each other using an application programming interface (API). In many cases, system processing stops until an API returns a response, so network latency directly creates application performance issues. For instance, a flight-booking website uses an API call to get information about the number of seats available on a specific flight. Network latency might impact the website's performance, causing it to stall. By the time the website receives the API response and resumes, someone else might have booked the ticket, and you would have missed out.
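As a minimal illustration, the following Python sketch shows how a blocking API call ties a processing step to network latency. The endpoint URL, response format, and timeout value are all hypothetical assumptions you would adapt to your own application.

```python
import requests

# Hypothetical seat-availability endpoint; processing blocks here
# until the API responds or the timeout expires.
FLIGHTS_API = "https://api.example.com/flights/XY123/seats"

try:
    # The timeout caps how long network latency can stall this step.
    response = requests.get(FLIGHTS_API, timeout=2.0)
    response.raise_for_status()
    seats = response.json().get("available_seats")
    print(f"Seats available: {seats}")
except requests.Timeout:
    # On a high-latency network, you hit this branch instead of booking.
    print("Seat lookup timed out; retry or show cached availability.")
```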
Video-enabled remote operations
Some workflows, such as video-enabled drill presses, endoscopy cameras, and drones for search and rescue, require an operator to control a machine remotely by using video. In these instances, low-latency networks are crucial to avoid life-threatening consequences.
What are the causes of network latency?
In network terminology, a client device and a server communicate over a computer network: the client sends data requests, and the server sends data responses. The network itself consists of devices, such as routers, switches, modems, and firewalls, and of links, such as copper cables, fiber-optic cables, or wireless transmission media. Data requests and responses travel as small data packets that hop from one device to the next over these links until they reach their destination, while the network devices continuously process and route the packets over different network paths. Network operations are therefore complex, and various factors affect the speed of data packet travel. The following are common factors that create network latency.
Transmission medium
The transmission medium or link has the greatest impact on latency as data passes through it. For instance, a fiber-optic network has less latency than a wireless network. Similarly, every time the network switches from one medium to another, it adds a few extra milliseconds to the overall transmission time.
Distance the network traffic travels
Long distances between network endpoints increase network latency. For example, when application servers are geographically distant from end users, those users might experience more latency.
Number of network hops
Multiple intermediate routers increase the number of hops that data packets require, which increases network latency. Network device functions, such as website address processing and routing table lookups, also add to latency.
Data volume
A high volume of concurrent data can increase network latency because network devices have limited processing capacity. That is why shared network infrastructure, like the internet, can increase application latency.
Server performance
Application server performance can create perceived network latency. In this case, the data communication is delayed not because of network issues, but because the servers respond slowly.
How can you measure network latency?
You can measure network latency by using metrics such as Time to First Byte and Round Trip Time. You can use any of these metrics to monitor and test networks.
Time to First Byte
Time to First Byte (TTFB) records the time that it takes for the first byte of data to reach the client from the server after the connection is established. TTFB depends on two factors:
- The time the web server takes to process the request and create a response
- The time the response takes to return to the client
Thus, TTFB measures both server processing time and network lag.
You can also measure latency as perceived TTFB, which is longer than actual TTFB because of how long the client machine takes to process the response further.
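The following Python sketch approximates TTFB over a plain HTTP connection, measuring from the moment the request is sent until the first response byte arrives. It assumes an HTTP server on port 80 and ignores HTTPS, redirects, and DNS resolution time for brevity; the host name is a placeholder.

```python
import socket
import time

def measure_ttfb(host: str, port: int = 80, path: str = "/") -> float:
    """Approximate TTFB: time from sending the request until the first
    byte of the response arrives. Connection setup is excluded."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        start = time.perf_counter()
        sock.sendall(request.encode())
        sock.recv(1)  # block until the first response byte arrives
        return time.perf_counter() - start

# Example: print TTFB in milliseconds for a placeholder host.
print(f"TTFB: {measure_ttfb('example.com') * 1000:.1f} ms")
```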
Round Trip Time
Round Trip Time (RTT) is the time that it takes the client to send a request and receive the response from the server. Network latency causes round-trip delay and increases RTT. However, all the measurements of RTT by network monitoring tools are partial indicators because data can travel over different network paths while going from client to server and back.
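One rough way to sample RTT without special tooling is to time a TCP handshake, which takes approximately one round trip. This is a simplified sketch rather than a substitute for a network monitoring tool, and the host and port are placeholders.

```python
import socket
import time

def tcp_rtt(host: str, port: int = 443) -> float:
    """Approximate RTT as the time to complete a TCP handshake:
    one round trip of SYN and SYN-ACK, plus some stack overhead."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass
    return time.perf_counter() - start

# Take several samples; the minimum is the least noisy estimate.
samples = [tcp_rtt("example.com") for _ in range(5)]
print(f"min RTT: {min(samples) * 1000:.1f} ms")
```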
Ping command
Network admins use the ping command to determine the time required for 32 bytes of data to reach a destination and receive a return response. It is a way to identify how reliable a connection is. However, you cannot use ping to check multiple paths from the same console, and it can only diagnose latency issues, not reduce them.
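If you want to invoke ping from a script, a sketch like the following wraps the system command. Note that flag names and default payload sizes differ by operating system (-n on Windows, -c on Unix-like systems), and the host name is a placeholder.

```python
import platform
import subprocess

def ping(host: str, count: int = 4) -> str:
    """Run the system ping command and return its raw output.
    Windows uses -n for the packet count; Unix-like systems use -c."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, str(count), host],
        capture_output=True, text=True, check=False,
    )
    return result.stdout

print(ping("example.com"))
```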
What are the other types of latency?
A computer system can experience many different latencies, such as disk latency, fiber-optic latency, and operational latency. The following are important types of latency.
Disk latency
Disk latency measures the time that a computing device takes to read and store data. It explains why writing a large number of small files can take longer than writing a single large file of the same total size. For example, hard drives have greater disk latency than solid-state drives.
Fiber-optic latency
Fiber-optic latency is the time light takes to travel a specified distance through a fiber-optic cable. In a vacuum, light incurs a latency of 3.33 microseconds for every kilometer traveled. In fiber-optic cable, each kilometer instead adds a latency of about 4.9 microseconds, because light travels more slowly through glass. Network speed can also decrease with each bend or imperfection in the cable.
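To make the arithmetic concrete, this short sketch derives the per-kilometer figure above from the speed of light and a typical fiber refractive index of about 1.47 (an assumption consistent with the 4.9-microsecond figure).

```python
# Propagation delay through fiber, assuming a refractive index of
# about 1.47, which is typical for glass optical fiber.
SPEED_OF_LIGHT_KM_S = 299_792  # km per second in a vacuum
REFRACTIVE_INDEX = 1.47

def fiber_latency_ms(distance_km: float) -> float:
    speed_in_fiber = SPEED_OF_LIGHT_KM_S / REFRACTIVE_INDEX
    return distance_km / speed_in_fiber * 1000  # milliseconds

# One-way latency for a 1,000 km fiber run: roughly 4.9 ms.
print(f"{fiber_latency_ms(1000):.1f} ms")
```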
Operational latency
Operational latency is the time lag due to computing operations. It is one of the factors that cause server latency. When operations run one after another in a sequence, you can calculate operational latency as the sum of the time each individual operation takes. In parallel workflows, the slowest operation determines the operational latency.
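The following sketch shows that calculation with hypothetical operation durations: sequential latency is the sum of the individual times, while parallel latency is set by the slowest operation.

```python
# Hypothetical operation durations in milliseconds.
operation_times_ms = [120, 45, 300, 80]

sequential_latency = sum(operation_times_ms)  # 545 ms: operations run in sequence
parallel_latency = max(operation_times_ms)    # 300 ms: the slowest operation dominates

print(f"Sequential: {sequential_latency} ms, Parallel: {parallel_latency} ms")
```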
What factors other than latency determine network performance?
Other than latency, you can measure network performance in terms of bandwidth, throughput, jitter, and packet loss.
Bandwidth
Bandwidth measures the data volume that can pass through a network at a given time. It is measured in data units per second. For example, a network with a bandwidth of 1 gigabit per second (Gbps) often performs better than a network with a 10 megabits per second (Mbps) bandwidth.
Comparison of latency to bandwidth
If you think of the network as a water pipe, bandwidth is the width of the pipe, and latency is how long the water takes to travel through it. Although too little bandwidth increases latency during peak usage, adding bandwidth does not by itself reduce latency. In fact, latency can reduce the return on investment in expensive, high-bandwidth infrastructure.
Throughput
Throughput refers to the average volume of data that actually passes through the network over a specific time. It reflects both the number of data packets that arrive at their destination successfully and the rate of data packet loss.
Comparison of latency to throughput
Throughput measures the impact of latency on network bandwidth. It indicates the available bandwidth after latency. For example, a network's bandwidth may be 100 Mbps, but due to latency, its throughput is only 50 Mbps during the day but increases to 80 Mbps at night.
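As a rough illustration, you can estimate throughput by timing a download and dividing the payload size by the elapsed time. The test-file URL below is a placeholder, and a real measurement should repeat the test and account for caching.

```python
import time
import urllib.request

def measure_throughput_mbps(url: str) -> float:
    """Approximate throughput: payload size divided by transfer time.
    The URL should point to a sufficiently large test file."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    elapsed = time.perf_counter() - start
    return (len(data) * 8) / (elapsed * 1_000_000)  # megabits per second

print(f"{measure_throughput_mbps('https://example.com/testfile.bin'):.1f} Mbps")
```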
Jitter
Jitter is the variation in time delay between data transmission and its receipt over a network connection. A consistent delay is preferred over delay variations for better user experience.
Comparison of latency to jitter
Jitter is the change in the latency of a network over time. Latency causes delays in data packets traveling over a network, whereas jitter occurs when the delay between successive packets varies, which can also cause packets to arrive in a different order than expected.
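Given a series of latency samples, jitter can be summarized in several ways; this sketch shows two common ones, the standard deviation and the mean difference between consecutive samples. The RTT values are hypothetical.

```python
import statistics

# Hypothetical RTT samples in milliseconds collected over a connection.
rtt_samples_ms = [42.1, 44.7, 41.9, 58.3, 43.2]

# Two common jitter summaries: overall variation (standard deviation)
# and the average change between consecutive samples.
jitter_stdev = statistics.stdev(rtt_samples_ms)
jitter_mean_delta = statistics.mean(
    abs(b - a) for a, b in zip(rtt_samples_ms, rtt_samples_ms[1:])
)
print(f"stdev: {jitter_stdev:.1f} ms, mean delta: {jitter_mean_delta:.1f} ms")
```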
Packet loss
Packet loss measures the number of data packets that never reach their destination. Factors like software bugs, hardware issues, and network congestion cause dropped packets during data transmission.
Comparison of latency to packet loss
Latency measures delay in a packet’s arrival at the destination. It is measured in time units such as milliseconds. Packet loss is a percentage value that measures the number of packets that never arrived. So if 91 out of 100 packets arrived, packet loss is 9%.
How can you improve network latency issues?
You can reduce network latency by optimizing both your network and your application code. The following are a few suggestions.
Upgrade network infrastructure
You can upgrade network devices by using the latest hardware, software, and network configuration options on the market. Regular network maintenance improves packet processing time and helps to reduce network latency.
Monitor network performance
Network monitoring and management tools can perform functions such as mock API testing and end-user experience analysis. You can use them to check network latency in real time and troubleshoot network latency issues.
Group network endpoints
Subnetting is the method of grouping network endpoints that frequently communicate with each other. A subnet acts as a network inside a network to minimize unnecessary router hops and improve network latency.
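As an illustration, Python's ipaddress module can carve a network into subnets; the address ranges and tier assignments below are hypothetical.

```python
import ipaddress

# Split a private network into subnets so endpoints that communicate
# frequently can share one, minimizing router hops between them.
network = ipaddress.ip_network("10.0.0.0/16")
subnets = list(network.subnets(new_prefix=24))  # 256 /24 subnets

print(subnets[0])  # 10.0.0.0/24, e.g. for the application tier
print(subnets[1])  # 10.0.1.0/24, e.g. for the database tier
print(ipaddress.ip_address("10.0.0.25") in subnets[0])  # True
```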
Use traffic-shaping methods
You can improve network latency by prioritizing data packets based on type. For example, you can make your network route high-priority applications like VoIP calls and data center traffic first while delaying other types of traffic. This maintains acceptable latency for critical business processes on an otherwise high-latency network.
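Real traffic shaping happens in network devices, but the idea can be sketched with a simple priority queue: higher-priority packets always leave first. The traffic classes and priority values below are assumptions for illustration.

```python
import heapq

# Lower number = sent sooner. VoIP outranks data center traffic,
# which outranks bulk transfers.
PRIORITY = {"voip": 0, "datacenter": 1, "bulk": 2}

queue = []
for seq, (traffic_class, payload) in enumerate([
    ("bulk", "backup chunk"),
    ("voip", "voice frame"),
    ("datacenter", "replication"),
]):
    # seq breaks ties so equal-priority packets keep arrival order.
    heapq.heappush(queue, (PRIORITY[traffic_class], seq, payload))

while queue:
    _, _, payload = heapq.heappop(queue)
    print("sending:", payload)  # voice frame, replication, backup chunk
```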
Reduce network distance
You can improve user experience by hosting your servers and databases geographically closer to your end users. For example, if your target market is Italy, you will get better performance by hosting your servers in Italy or Europe instead of North America.
Reduce network hops
Each hop a data packet takes as it moves from router to router increases network latency. Typically, traffic must take multiple hops through the public internet, over potentially congested and nonredundant network paths, to reach its destination. However, you can use cloud solutions to run applications closer to their end users, reducing both the distance that network communications travel and the number of hops the traffic takes. For example, you can use AWS Global Accelerator to onboard traffic onto the AWS global network as close to your users as possible, using the AWS globally redundant network to help improve your application availability and performance.
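One way to see how many hops your traffic currently takes is to run traceroute and count the hop lines. The sketch below is a rough heuristic: output formats vary by platform, and the host name is a placeholder.

```python
import platform
import subprocess

def count_hops(host: str) -> int:
    """Count router hops to a host by parsing traceroute output.
    Uses tracert on Windows; the line parsing is a simple heuristic."""
    cmd = ["tracert", host] if platform.system() == "Windows" else ["traceroute", host]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    # Hop lines begin with a hop number; count those lines.
    return sum(
        1 for line in result.stdout.splitlines()
        if line.strip().split(" ")[0].isdigit()
    )

print(f"Hops to example.com: {count_hops('example.com')}")
```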
How can AWS help you reduce latency?
AWS has a number of solutions to reduce network latency and improve performance for better end-user experience. You can implement any of the following services, depending on your requirements.
- AWS Direct Connect is a cloud service that links your network directly to AWS to deliver more consistent, lower network latency. When creating a new connection, you can choose a hosted connection that an AWS Direct Connect Delivery Partner provides, or choose a dedicated connection from AWS to deploy at over 100 AWS Direct Connect locations around the world.
- Amazon CloudFront is a content delivery network service built for high performance, security, and developer convenience. You can use it to securely deliver content with low latency and high transfer speeds.
- AWS Global Accelerator is a networking service that improves the performance of your users’ traffic by up to 60% by using the AWS global network infrastructure. When the internet is congested, AWS Global Accelerator optimizes the path to your application to keep packet loss, jitter, and latency consistently low.
- AWS Local Zones are a type of infrastructure deployment that places compute, storage, database, and other select AWS services close to large population and industry centers. You can deliver innovative applications that require low latency closer to end users and on-premises installations.
Get started with AWS Direct Connect by creating an AWS account today.