How do I benchmark network throughput on an Amazon EC2 Windows instance?

I need to measure the network bandwidth between Amazon Elastic Compute Cloud (Amazon EC2) Windows instances.

Resolution

Network performance benchmark testing can help you determine the Amazon EC2 instance types, sizes, and configuration that best suit your needs. For more information about network performance for each instance type, see Amazon EC2 instance types.

Launch and configure your Amazon EC2 Windows instances

Before you run benchmark tests, follow these steps:

  1. Launch two EC2 Windows instances to run network performance testing.
  2. Confirm that the instances support enhanced networking for Windows.
  3. If the instances aren't in the same placement group, or they don't support jumbo frames, check and set the maximum transmission unit (MTU) before you run network tests between them.
  4. Verify that you can connect to the instances.

Install the NTttcp network benchmark tool on both instances

Connect to each of the two Windows instances, and then follow these steps:

  1. Download the latest release of Microsoft's NTttcp from the GitHub website.
  2. Unzip the contents of the file to a folder.
  3. Open a command prompt with administrator privileges, and then change directories to the folder where you unzipped the NTttcp network benchmark tool.
  4. Before you run NTttcp, change directories to the folder with the name that matches the architecture of your EC2 Windows instance.

Test TCP and UDP network performance between the instances

When you use NTttcp to test TCP and UDP performance, it communicates over port 5001 by default. However, you can use the -p switch to configure the port.

Important:

  • You must configure security groups to allow communication over the ports that NTttcp uses.
  • Add inbound and outbound Windows Firewall rules on both the receiver and sender that allow NTttcp.exe connections.
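
Before a long test run, it can help to confirm that the security group and firewall rules actually let TCP traffic through. The following is a minimal sketch that assumes Python is installed on the sender, which this article doesn't require. The can_connect helper is a hypothetical name and isn't part of NTttcp. It checks only TCP reachability, so it says nothing about UDP, and it succeeds only while something (such as the NTttcp receiver) is listening on the port.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, with the receiver from this article listening, can_connect("192.168.1.4", 5001) would be expected to return True when the rules are in place. A False result means that the port is blocked or that nothing is listening yet.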

Test TCP network performance

First, configure one instance as a receiver/server to initialize listeners. Start with the default port 5001. Or, specify an alternate initial listener port with the -p switch.

For example, the following command initializes a two-threaded receiver that listens on ports 80–81 of the specified IP address. The first thread runs on CPU 0, and the second thread runs on CPU 1:

ntttcp -r -p 80 -a 6 -t 60 -cd 5 -wu 5 -v -xml c:\bench.xml -m 1,0,192.168.1.4 1,1,192.168.1.4

In this example, the ntttcp.exe receiver parameters describe the following tasks:

  • -r: Receive.
  • -p 80: Port that's used by the first thread to receive data. The port number is incremented for each additional receiver thread.
  • -a 6: Asynchronous data transfer that posts six receive-overlapped buffers per thread.
  • -t 60: Sets the test duration to 60 seconds.
  • -cd 5: Sets a cooldown time of 5 seconds.
  • -wu 5: Sets a warmup time of 5 seconds.
  • -v: Specifies verbose test output.
  • -xml: Saves test output to the specified file (default saves to xml.txt).
  • -m: Specifies three mapping parameters per session (# threads, CPUID, receiver IP address). Multiple sessions are space delimited.
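
To illustrate how the -p and -m values relate, the following Python sketch assembles an ntttcp argument list with one single-threaded session per CPU and lists the ports that the threads would use. It uses only switches described in this article. The helper names build_ntttcp_args and ports_in_use are hypothetical and aren't part of NTttcp, and the sketch doesn't run the tool.

```python
from typing import List


def build_ntttcp_args(role: str, receiver_ip: str, cpus: List[int],
                      base_port: int = 5001, duration_s: int = 60,
                      udp: bool = False) -> List[str]:
    """Build an ntttcp argument list with one single-threaded session per CPU."""
    if role not in ("receiver", "sender"):
        raise ValueError("role must be 'receiver' or 'sender'")
    args = ["ntttcp", "-r" if role == "receiver" else "-s"]
    if udp:
        args.append("-u")
    args += ["-p", str(base_port), "-t", str(duration_s), "-m"]
    # Each session is "threads,cpu,receiver_ip"; sessions are space delimited.
    args += [f"1,{cpu},{receiver_ip}" for cpu in cpus]
    return args


def ports_in_use(base_port: int, total_threads: int) -> List[int]:
    """Ports to allow: the base port, incremented once per additional thread."""
    return [base_port + i for i in range(total_threads)]


print(" ".join(build_ntttcp_args("receiver", "192.168.1.4", [0, 1], base_port=80)))
print(ports_in_use(80, 2))
```

The second helper gives the port range to allow in the security group and in Windows Firewall for a given thread count.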

Then, configure the second instance as a sender/client, and then run a test against the receiver with your chosen parameters.

For example, the following command initializes a two-threaded TCP sender to ports 80-81 of the specified IP address. The first thread runs on CPU 0, and the second thread runs on CPU 1:

Note: The following command uses the same IP address as the receiver command. Enter the receiver IP address in both commands.

ntttcp -s -p 80 -a -t 60 -cd 5 -wu 5 -m 1,0,192.168.1.4 1,1,192.168.1.4

In this example, the ntttcp.exe sender parameters describe the following tasks:

  • -s: Send.
  • -p 80: Port that's used by the first thread to send data. This port number is incremented for each additional sender thread.
  • -a: Uses asynchronous data transfer. When no value is specified, the default is two send-overlapped buffers per thread. Specify a non-default value if needed.
  • -t 60: Sets the test duration to 60 seconds.
  • -cd 5: Sets a cooldown time of 5 seconds.
  • -wu 5: Sets a warmup time of 5 seconds.
  • -m: Specifies three mapping parameters per session (# threads, CPUID, receiver IP address). Multiple sessions are space delimited.

This generates XML output on the receiver, such as the following example. In this test, the total used bandwidth is about 9.02 Gbps (9,023 Mbps):

<ntttcpr computername="Win_EC2_Recv" version="5.31">
  <parameters>
    <send_socket_buff>0</send_socket_buff>
    <recv_socket_buff>-1</recv_socket_buff>
    <port>82</port>
    <sync_port>False</sync_port>
    <async>True</async>
    <verbose>True</verbose>
    <wsa>False</wsa>
    <use_ipv6>False</use_ipv6>
    <udp>False</udp>
    <verify_data>False</verify_data>
    <wait_all>False</wait_all>
    <run_time>60000</run_time>
    <warmup_time>5000</warmup_time>
    <cooldown_time>5000</cooldown_time>
    <dash_n_timeout>10800000</dash_n_timeout>
    <bind_sender>False</bind_sender>
    <sender_name></sender_name>
    <max_active_threads>2</max_active_threads>
  </parameters>
  <thread index="0">
    <realtime metric="s">60.012</realtime>
    <throughput metric="KB/s">542199.263</throughput>
    <throughput metric="MB/s">529.491</throughput>
    <throughput metric="mbps">4441.696</throughput>
    <avg_bytes_per_compl metric="B">65091.350</avg_bytes_per_compl>
  </thread>
  <thread index="1">
    <realtime metric="s">60.012</realtime>
    <throughput metric="KB/s">559260.669</throughput>
    <throughput metric="MB/s">546.153</throughput>
    <throughput metric="mbps">4581.463</throughput>
    <avg_bytes_per_compl metric="B">65535.750</avg_bytes_per_compl>
  </thread>
  <total_bytes metric="MB">64550.500000</total_bytes>
  <realtime metric="s">60.011000</realtime>
  <avg_bytes_per_compl metric="B">65316.236</avg_bytes_per_compl>
  <threads_avg_bytes_per_compl metric="B">65313.550</threads_avg_bytes_per_compl>
  <avg_frame_size metric="B">8194.809</avg_frame_size>
  <throughput metric="MB/s">1075.644</throughput>
  <throughput metric="mbps">9023.160</throughput>
  <total_buffers>1032808.000</total_buffers>
  <throughput metric="buffers/s">17210.311</throughput>
  <avg_packets_per_interrupt metric="packets/interrupt">5.749</avg_packets_per_interrupt>
  <interrupts metric="count/sec">23942.694</interrupts>
  <dpcs metric="count/sec">9546.816</dpcs>
  <avg_packets_per_dpc metric="packets/dpc">14.417</avg_packets_per_dpc>
  <cycles metric="cycles/byte">2.826</cycles>
  <packets_sent>730596</packets_sent>
  <packets_received>8259632</packets_received>
  <packets_retransmitted>0</packets_retransmitted>
  <errors>0</errors>
  <cpu metric="%">7.813</cpu>
  <bufferCount>9223372036854775807</bufferCount>
  <bufferLen>65536</bufferLen>
  <io>6</io>
</ntttcpr>
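
To pull the headline numbers out of a result file without reading the XML by hand, a short script can help. The following Python sketch uses the standard library's xml.etree.ElementTree on a trimmed copy of the sample output. The summarize function is a hypothetical helper, and the sketch assumes that the result file has the same element layout as this sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample receiver output from this article.
SAMPLE = """<ntttcpr computername="Win_EC2_Recv" version="5.31">
  <thread index="0">
    <throughput metric="mbps">4441.696</throughput>
  </thread>
  <thread index="1">
    <throughput metric="mbps">4581.463</throughput>
  </thread>
  <throughput metric="MB/s">1075.644</throughput>
  <throughput metric="mbps">9023.160</throughput>
</ntttcpr>"""


def summarize(xml_text):
    """Return (total_mbps, per_thread_mbps) from NTttcp-style XML text."""
    root = ET.fromstring(xml_text)
    # Totals are direct children of the root; per-thread values sit under <thread>.
    total_mbps = float(root.find("./throughput[@metric='mbps']").text)
    per_thread = [float(t.find("./throughput[@metric='mbps']").text)
                  for t in root.findall("./thread")]
    return total_mbps, per_thread


total, threads = summarize(SAMPLE)
print(f"total: {total / 1000:.2f} Gbps across {len(threads)} threads")
```

To read a saved result instead of the inline sample, pass the contents of the file that you specified with -xml, such as c:\bench.xml. The per-thread values in this sample add up to the reported total to within rounding.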

Test UDP network performance

First, configure one instance as a receiver/server to initialize listeners. Start with the default port 5001. Or, specify an alternate initial listener port with the -p switch.

For example, the following command initializes a two-threaded receiver that listens on ports 80–81 of the specified IP address. The first thread runs on CPU 0, and the second thread runs on CPU 1:

ntttcp -r -u -p 80 -t 60 -cd 5 -wu 5 -v -xml c:\bench.xml -m 1,0,192.168.1.4 1,1,192.168.1.4

In this example, the ntttcp.exe receiver parameters describe the following tasks:

  • -r: Receives.
  • -u: Tests UDP.
  • -p 80: Port that's used by the first thread to receive data. The port number is incremented for each additional receiver thread.
  • -t 60: Sets the test duration to 60 seconds.
  • -cd 5: Sets a cooldown time of 5 seconds.
  • -wu 5: Sets a warmup time of 5 seconds.
  • -v: Specifies verbose test output.
  • -xml: Saves test output to the specified file (default saves to xml.txt).
  • -m: Specifies three mapping parameters per session (# threads, CPUID, receiver IP address). Multiple sessions are space delimited.

Then, configure a second instance as a sender/client, and then run a test against the receiver with the desired parameters.

For example, the following command initializes a two-threaded UDP sender to ports 80-81 of the specified IP address. The first thread runs on CPU 0, and the second thread runs on CPU 1:

Note: The following command uses the same IP address as the receiver command. Enter the receiver IP address in both commands.

ntttcp -s -u -p 80 -t 60 -cd 5 -wu 5 -m 1,0,192.168.1.4 1,1,192.168.1.4

In this example, the ntttcp.exe sender parameters describe the following tasks:

  • -s: Sends.
  • -u: Tests UDP (default is to test TCP).
  • -p 80: Port that's used by the first thread to send data. The port number is incremented for each additional sender thread.
  • -t 60: Sets the test duration to 60 seconds.
  • -cd 5: Sets a cooldown time of 5 seconds.
  • -wu 5: Sets a warmup time of 5 seconds.
  • -m: Specifies three mapping parameters per session (# threads, CPUID, receiver IP address). Multiple sessions are space delimited.

This generates an XML output on the receiver:

<ntttcpr computername="Win_UDP_Test" version="5.31">
  <parameters>
    <send_socket_buff>8192</send_socket_buff>
    <recv_socket_buff>-1</recv_socket_buff>
    <port>82</port>
    <sync_port>False</sync_port>
    <async>False</async>
    <verbose>True</verbose>
    <wsa>False</wsa>
    <use_ipv6>False</use_ipv6>
    <udp>True</udp>
    <verify_data>False</verify_data>
    <wait_all>False</wait_all>
    <run_time>60000</run_time>
    <warmup_time>5000</warmup_time>
    <cooldown_time>5000</cooldown_time>
    <dash_n_timeout>10800000</dash_n_timeout>
    <bind_sender>False</bind_sender>
    <sender_name></sender_name>
    <max_active_threads>2</max_active_threads>
  </parameters>
  <thread index="0">
    <realtime metric="s">60.016</realtime>
    <throughput metric="KB/s">6463.886</throughput>
    <throughput metric="MB/s">6.312</throughput>
    <throughput metric="mbps">52.952</throughput>
    <avg_bytes_per_compl metric="B">128.000</avg_bytes_per_compl>
  </thread>
  <thread index="1">
    <realtime metric="s">60.016</realtime>
    <throughput metric="KB/s">7712.922</throughput>
    <throughput metric="MB/s">7.532</throughput>
    <throughput metric="mbps">63.184</throughput>
    <avg_bytes_per_compl metric="B">128.000</avg_bytes_per_compl>
  </thread>
  <total_bytes metric="MB">830.880005</total_bytes>
  <realtime metric="s">60.015000</realtime>
  <avg_bytes_per_compl metric="B">128.000</avg_bytes_per_compl>
  <threads_avg_bytes_per_compl metric="B">128.000</threads_avg_bytes_per_compl>
  <avg_frame_size metric="B">127.780</avg_frame_size>
  <throughput metric="MB/s">13.845</throughput>
  <throughput metric="mbps">116.136</throughput>
  <total_buffers>6806569.000</total_buffers>
  <throughput metric="buffers/s">113414.463</throughput>
  <avg_packets_per_interrupt metric="packets/interrupt">1.968</avg_packets_per_interrupt>
  <interrupts metric="count/sec">57715.621</interrupts>
  <dpcs metric="count/sec">11576.306</dpcs>
  <avg_packets_per_dpc metric="packets/dpc">9.814</avg_packets_per_dpc>
  <cycles metric="cycles/byte">210.673</cycles>
  <packets_sent>2</packets_sent>
  <packets_received>6818294</packets_received> 
  <packets_retransmitted>0</packets_retransmitted>
  <errors>1</errors>
  <cpu metric="%">44.976</cpu>
  <bufferCount>9223372036854775807</bufferCount>
  <bufferLen>128</bufferLen>
  <io>2</io>
</ntttcpr>
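
The totals in this output can be cross-checked from the buffer counters. The following Python sketch recomputes the totals from total_buffers, bufferLen, and realtime. It assumes that "MB" means 2^20 bytes and that "mbps" means 10^6 bits per second. That interpretation is inferred from the figures in this sample, not taken from NTttcp documentation, and throughput_from_buffers is a hypothetical helper name.

```python
def throughput_from_buffers(total_buffers: int, buffer_len_bytes: int,
                            realtime_s: float) -> dict:
    """Recompute totals from buffer count, buffer length, and elapsed seconds."""
    total_bytes = total_buffers * buffer_len_bytes
    mib = total_bytes / (1024 * 1024)  # assumed meaning of "MB" in the output
    return {
        "total_MB": mib,
        "MB_per_s": mib / realtime_s,
        # assumed meaning of "mbps": millions of bits per second
        "mbps": total_bytes * 8 / realtime_s / 1_000_000,
    }


# Values copied from the UDP sample output in this article.
result = throughput_from_buffers(6806569, 128, 60.015)
print({name: round(value, 3) for name, value in result.items()})
```

With the sample values, the recomputed figures agree with the reported total_bytes (830.880005 MB) and throughput (13.845 MB/s, 116.136 mbps) to within rounding. With 128-byte buffers, the large number of small datagrams also matches the high cycles/byte and CPU figures in this output compared with the TCP test.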

To view all switches that are available with NTttcp, open a command prompt, and then run the following command:

ntttcp

Related information

Network maximum transmission unit (MTU) for your EC2 instance

Placement groups

How do I benchmark network throughput between Amazon EC2 Linux instances in the same Amazon VPC?

AWS OFFICIAL
Updated 9 months ago