Testing Cloud Network Throughput – Data Guard

Part 2: Oratcptest Throughput Test

Building on the iPerf test results from Part 1 of this article, we will now simulate Data Guard throughput using the oratcptest tool. Instructions for the tool can be found in the Oracle Support note “Measuring Network Capacity using oratcptest (Doc ID 2064368.1)”. It is a cross-platform jar file that can be run in client and server modes.

OraTCPtest – Running the Standby Database (Server) Process

1) Copy the oratcptest.jar tool to the Oracle Cloud Infrastructure (OCI) instance.

2) Confirm the java version running on the Server (Standby Database).

[opc@localhost ~]$ java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)


3) Start the server process on a particular port.

[opc@localhost ~]$ java -jar oratcptest.jar -server -port=5555
OraTcpTest server started.


OraTCPtest – Running the Primary Database (Client) Process

1) Copy the oratcptest.jar tool to the Google Compute Engine (GCE) instance.

2) Confirm the java version running on the Client (Primary Database).

kayode@instance-1:~$ java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-1~deb9u1-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)


3) Start the client process

kayode@instance-1:~$ java -jar oratcptest.jar server.cnt.com -port=5555 -duration=10s -interval=2s
[Requesting a test]
        Message payload        = 1 Mbyte
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = SYNC
        Disk write             = NO
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)
 
(14:47:15) The server is ready.
                    Throughput             Latency
(14:47:17)     19.999 Mbytes/s           50.003 ms
(14:47:19)     20.634 Mbytes/s           48.464 ms
(14:47:21)     19.047 Mbytes/s           52.501 ms
(14:47:23)     17.759 Mbytes/s           56.311 ms
(14:47:25)     20.477 Mbytes/s           48.835 ms
(14:47:25) Test finished.
               Socket send buffer = 846 kbytes
                  Avg. throughput = 19.580 Mbytes/s
                     Avg. latency = 51.073 ms


We can see that the client process is running in Data Guard SYNC mode without disk write. SYNC mode means that the primary database waits for an ACK from the standby database before sending the next message. With disk write enabled, the standby database writes each network message to disk before it replies with an ACK to the primary database.

The output reports the average throughput, average latency, and socket send buffer size. The corresponding output on the server instance in Oracle Cloud after running the test is:

[opc@localhost ~]$
 
[A test was requested.]
        Message payload       = 1 Mbyte
        Disk write            = NO
        Socket receive buffer = (system default)
 
The test terminated. The socket receive buffer was 1903744 bytes.


So how much throughput is required for Data Guard transport?

To answer this question, we need to know the redo generation rate at the production database site; we can then determine whether the network has sufficient bandwidth. We will now look at the throughput for the various Data Guard modes using the oratcptest tool. We are not running an actual database in this example, so we will use the tool's default values.
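As a rough sketch of that sizing exercise, the required bandwidth can be derived from the peak redo rate. The redo figure and the 30% headroom factor below are assumptions for illustration, not values from this article; the real redo rate would come from an AWR report covering peak load.

```python
# Back-of-envelope sizing: required Data Guard bandwidth from the peak redo
# rate. The redo figure is hypothetical; take the real value from the
# "redo size" statistic in an AWR report covering peak load.
peak_redo_bytes_per_sec = 20 * 1024 * 1024   # assumed peak: 20 Mbytes/s of redo

# Allow ~30% headroom for protocol overhead, encryption and retransmissions
# (the headroom factor is an assumption, not an Oracle recommendation).
required_mbit_per_sec = peak_redo_bytes_per_sec * 8 / 1_000_000 * 1.3
print(f"Required bandwidth: {required_mbit_per_sec:.0f} Mbit/s")
```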

Data Guard ASYNC with No-Write

Now we run the tool in Data Guard ASYNC mode with No-Write. ASYNC mode implies that the client will not wait for an ACK from the server (standby database) before sending the next message. With No-Write, the server does not write the network message to disk before replying; in fact, in ASYNC mode the server does not reply with an ACK at all.

kayode@instance-1:~$ java -jar oratcptest.jar server.cnt.com -port=5555 -mode=async -duration=10s -interval=2s
[Requesting a test]
        Message payload        = 1 Mbyte
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = ASYNC
        Disk write             = NO
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)
 
(07:53:14) The server is ready.
                    Throughput
(07:53:16)     53.040 Mbytes/s
(07:53:18)     41.330 Mbytes/s
(07:53:20)     31.966 Mbytes/s
(07:53:22)     25.269 Mbytes/s
(07:53:24)     27.700 Mbytes/s
(07:53:24) Test finished.
               Socket send buffer = 2 Mbytes
                  Avg. throughput = 35.856 Mbytes/s


The average throughput is around 36 Mbytes/s which corresponds to 287 Mbit/s. We achieved 270 Mbit/s of throughput with iPerf using an MSS value of 1308 bytes (MTU of 1348 bytes).
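For reference, the 287 Mbit/s figure is the simple ×8 conversion used throughout this article. (oratcptest itself counts 1 Mbyte as 1024×1024 bytes, so a strict binary-to-decimal conversion would give a slightly higher figure, about 301 Mbit/s.)

```python
# Simple x8 conversion of the tool's Mbytes/s average into Mbit/s,
# matching the 287 Mbit/s quoted in the text.
avg_throughput_mbytes = 35.856   # ASYNC average from the run above
mbit_per_sec = avg_throughput_mbytes * 8
print(f"{mbit_per_sec:.0f} Mbit/s")
```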

End-to-end throughput depends on many factors:

1) Virtual machine instance size – the Google Cloud instance used for this demo was an f1-micro (1 shared vCPU, 0.6 GB memory).

2) IPsec VPN overhead, which consumes a considerable share of the internet connection's bandwidth.

3) Specifications of the database servers (CPU, network card, etc.).

4) MTU and the characteristics of the network path.

And we can observe the response on the server side as follows.

[opc@localhost ~]$ java -jar oratcptest.jar -server -port=5555
OraTcpTest server started.

[A test was requested.]
Message payload = 1 Mbyte
Disk write = NO
Socket receive buffer = (system default)

The test terminated. The socket receive buffer was 3 Mbytes.


We can see that the socket receive buffer on the server side is 3 Mbytes whilst the socket send buffer on the client side is 2 Mbytes. This implies there is potential to increase throughput by raising the socket send buffer (the tool's -sockbuf option) on the client to match the socket receive buffer on the server.

But there are limitations to this approach of increasing socket buffer values. The sockbuf value cannot exceed the maximum allowed by the operating system. The network path itself also limits how much in-flight data is useful – roughly its bandwidth-delay product. Ultimately, TCP's slow start and congestion avoidance mechanisms determine how much unacknowledged data the sender (client) can keep on the wire, based on the congestion window.
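A rough way to sanity-check the buffer sizes is the bandwidth-delay product (BDP): the amount of data the link can hold in flight. A sketch using the figures measured above (the ~287 Mbit/s ASYNC throughput and the ~51 ms average latency from the SYNC run):

```python
# Sketch: estimate the bandwidth-delay product (BDP) of the link from the
# figures measured above. A socket send buffer much smaller than the BDP
# caps throughput; one much larger buys little.
link_bandwidth_bits_per_sec = 287_000_000   # ~287 Mbit/s from the ASYNC run
round_trip_time_sec = 0.051                 # ~51 ms average latency

bdp_bytes = link_bandwidth_bits_per_sec / 8 * round_trip_time_sec
print(f"BDP = {bdp_bytes / (1024 * 1024):.2f} Mbytes")
```

The result, about 1.74 Mbytes, is close to the 2 Mbyte send buffer the system chose for the ASYNC run, which suggests the defaults were already near the useful limit on this path.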


Data Guard SYNC with Write

And now we run the tool in Data Guard SYNC mode with Write. SYNC mode implies that the client (primary database) will wait for an ACK from the server (standby database) before sending the next message. With Write, the server writes each network message to disk before replying with an ACK to the client. This means we should expect higher latency due to the disk write – which adds 3.947 ms on average, as can be seen in the output below.

kayode@instance-1:~$ java -jar oratcptest.jar server.cnt.com -port=5555 -write -duration=10s -interval=2s
[Requesting a test]
        Message payload        = 1 Mbyte
        Payload content type   = RANDOM
        Delay between messages = NO
        Number of connections  = 1
        Socket send buffer     = (system default)
        Transport mode         = SYNC
        Disk write             = YES
        Statistics interval    = 2 seconds
        Test duration          = 10 seconds
        Test frequency         = NO
        Network Timeout        = NO
        (1 Mbyte = 1024x1024 bytes)
 
(14:13:20) The server is ready.
                    Throughput             Latency
(14:13:23)     19.426 Mbytes/s           51.478 ms   (disk-write 4.238 ms)
(14:13:25)     16.635 Mbytes/s           60.115 ms   (disk-write 3.613 ms)
(14:13:26)     17.027 Mbytes/s           58.731 ms   (disk-write 3.801 ms)
(14:13:28)     20.721 Mbytes/s           48.261 ms   (disk-write 3.617 ms)
(14:13:31)     18.798 Mbytes/s           53.197 ms   (disk-write 4.416 ms)
(14:13:31) Test finished.
               Socket send buffer = 873 kbytes
                  Avg. throughput = 18.512 Mbytes/s
                     Avg. latency = 54.021 ms (disk-write 3.947 ms)


We can instantly observe that the throughput in SYNC Write mode, 18.512 Mbytes/s (148.096 Mbit/s), is significantly less than that of ASYNC No-Write mode, 35.856 Mbytes/s (287 Mbit/s). This is due to the combined latency and processing overhead of the client requiring an ACK for every message sent, and the server committing data to disk before sending the ACK back to the client.
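The SYNC figure can in fact be predicted from the latency alone: with a single connection, the client sends at most one 1 Mbyte payload per round trip, so throughput is roughly payload divided by latency. A quick check against the numbers above:

```python
# SYNC mode with one connection is latency-bound: one payload per round trip.
payload_bytes = 1024 * 1024    # 1 Mbyte message payload (the tool's default)
avg_latency_sec = 0.054021     # 54.021 ms average latency from the SYNC run

predicted_mbytes_per_sec = payload_bytes / avg_latency_sec / (1024 * 1024)
print(f"Predicted SYNC throughput = {predicted_mbytes_per_sec:.3f} Mbytes/s")
```

This matches the measured 18.512 Mbytes/s to within rounding, confirming that round-trip latency, not bandwidth, is the bottleneck in SYNC mode.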

[opc@localhost ~]$ java -jar oratcptest.jar -server -port=5555
OraTcpTest server started.
 
[A test was requested.]
        Message payload       = 1 Mbyte
        Disk write            = YES (oratcp.tmp)
        Socket receive buffer = (system default)
 
The test terminated. The socket receive buffer was 1957120 bytes.


Limitations of the Test Scenario

Default values were used in the oratcptest simulations. The default message payload (1 Mbyte) was used for both ASYNC and SYNC modes. In reality, redo write sizes differ from this default. To accurately simulate SYNC Write mode, we need to determine the average redo write size that the log writer (LGWR) performs. We can do this using the Automatic Workload Repository (AWR) on the primary database. For details on how this is done, refer to the Oracle Support documentation.
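A sketch of that calculation, using hypothetical sample values for the AWR statistics "redo size" and "redo writes" (substitute the figures from your own report):

```python
# Average redo write size = total redo generated / number of LGWR writes,
# both taken from AWR over the same snapshot interval (sample values below).
redo_size_bytes = 52_428_800   # "redo size" over the interval (hypothetical)
redo_writes = 4_096            # "redo writes" over the interval (hypothetical)

avg_write_bytes = redo_size_bytes / redo_writes
print(f"Average redo write = {avg_write_bytes / 1024:.1f} kbytes")
```

The resulting size could then be supplied as the oratcptest message payload (the MOS note documents a payload-length option) so that the SYNC test mirrors real LGWR write sizes.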

MSS (and MTU) values also affect throughput. An MSS value of 1308 bytes was set for the iPerf tests, which likely differs from the MSS negotiated by the oratcptest connections. For more comparable results, the tests can be conducted with matching MSS values.


Conclusion

When setting up an Oracle Database Data Guard configuration in the cloud (or on-premises), we need to measure the network throughput correctly in order to ensure that the disaster recovery RTO and RPO objectives can be met.

Network benchmarking tools like iPerf are very capable but lack the granularity to simulate Data Guard application behaviour. The oratcptest tool allows us to measure the throughput required for Data Guard transport more accurately.
