This doc will be deprecated soon. For higher performance needs, please refer to the High Throughput Recommendations documentation.
The WhatsApp Business API client allows your business to communicate with your customers in a programmatic way over WhatsApp. When you install the API client, it comes with a Webapp container and Coreapp container.
This document covers:
To send a message to a recipient, you first need to check whether the recipient has a valid WhatsApp account.
When you send an API call to the contacts endpoint, the Coreapp talks to the WhatsApp servers to verify that the recipient has a WhatsApp account, then caches that status in the database for 7 days. Because of this cache, the Coreapp does not need to check the recipient's account status with the server for the next 7 days. As a result, the Coreapp may allow a message to be sent to a WhatsApp account that was deleted less than 7 days ago. In this case, no error message is returned to the WhatsApp Business API client; the message goes to the WhatsApp servers but is not delivered to the recipient.
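The caching behavior described above can be sketched as a small time-to-live cache. This is only an illustrative model, not the Coreapp's implementation: the class name, the `lookup` callable, and the injectable `clock` are all hypothetical.

```python
import time

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days, as described above


class ContactStatusCache:
    """Hypothetical in-memory stand-in for the contact status cache."""

    def __init__(self, lookup, clock=time.time):
        self._lookup = lookup    # assumed callable: phone number -> status string
        self._clock = clock      # injectable clock so expiry can be simulated
        self._entries = {}       # phone number -> (status, time it was cached)
        self.server_calls = 0    # number of lookups that went to the "server"

    def status(self, phone_number):
        now = self._clock()
        cached = self._entries.get(phone_number)
        if cached is not None and now - cached[1] < CACHE_TTL_SECONDS:
            return cached[0]     # served from cache, no server roundtrip
        self.server_calls += 1
        result = self._lookup(phone_number)
        self._entries[phone_number] = (result, now)
        return result
```

The model also shows the stale-entry case: an account deleted after it was cached keeps reporting its old status until the entry expires.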
It is also useful to know that you can check more than one phone number with a single contacts API call. Similarly, the Coreapp can include more than one phone number in its calls to the WhatsApp servers. The WhatsApp servers rate limit the Coreapp to only one contacts API call at a given time, but the Coreapp rate limits the WhatsApp Business API client to 40 calls/sec. While a contacts call from the Coreapp to the server is in progress, the Coreapp accumulates all the incoming check contacts requests; once the Coreapp receives a response from the servers, it makes the next contacts call to the server with all the accumulated phone numbers.
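The accumulate-then-flush behavior can be modeled in a few lines. This is a simplified sketch with hypothetical names (`ContactBatcher`, `request`, `on_server_response`); it only tracks which batches would go to the server and does no networking.

```python
class ContactBatcher:
    """Hypothetical model: one server call in flight, requests accumulate meanwhile."""

    def __init__(self):
        self.in_flight = None    # batch currently being checked by the server, or None
        self.pending = []        # numbers accumulated while a call is in flight
        self.sent_batches = []   # every batch sent to the server, in order

    def request(self, phone_number):
        if self.in_flight is None:
            self._send([phone_number])        # nothing in flight: send immediately
        else:
            self.pending.append(phone_number)  # otherwise wait for the response

    def on_server_response(self):
        self.in_flight = None
        if self.pending:
            batch, self.pending = self.pending, []
            self._send(batch)                 # flush everything that accumulated

    def _send(self, batch):
        self.in_flight = batch
        self.sent_batches.append(batch)
```

Three requests arriving back to back therefore produce two server calls: one with the first number, and one with the two numbers that accumulated while the first call was in progress.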
When a message is sent using the messages endpoint, the Coreapp checks the database to verify that the recipient has an existing WhatsApp account. If the account's status is valid, it persists the message in the job queue (database), then asynchronously issues the message ID to the client and attempts to send the message to the recipient number via the servers.
Each recipient in WhatsApp has two keys: a public key and a private key. Public keys are shared with senders so that the sender can encrypt the message. Using the private key, the recipient decrypts the message. When your business sends messages with the WhatsApp Business API, the sender is the Coreapp. When the Coreapp messages the recipient for the first time, the public key exchange happens between the Coreapp and the recipient. Subsequently, these keys are used to encrypt and decrypt the messages. However, if the Coreapp has never messaged the recipient before, it doesn't have a recipient key to encrypt with. In this case, the Coreapp talks to the server and requests a pre-key, which the Coreapp can use to encrypt the message. After encrypting the message using the pre-key or public key, the Coreapp attempts to send the message via the servers.
The above explanations highlight that for the Coreapp to send one message successfully, it needs to make up to three successful roundtrip exchanges with the server: a contacts check, a pre-key fetch, and the message send itself.
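As a rough illustration, the roundtrip count can be written as a function of what the Coreapp already has cached. The function and parameter names are hypothetical, and the sketch assumes the three exchanges are the contact check, the pre-key fetch, and the send itself, as the preceding paragraphs suggest.

```python
def server_roundtrips(contact_cached: bool, has_recipient_key: bool) -> int:
    """Count the server roundtrips needed to send one message (illustrative only)."""
    roundtrips = 1                 # the message send always goes through the servers
    if not contact_cached:
        roundtrips += 1            # contact status must be verified with the servers
    if not has_recipient_key:
        roundtrips += 1            # a pre-key must be fetched before encrypting
    return roundtrips
```

A first message to a never-checked recipient costs three roundtrips, while a follow-up message to a cached, already-messaged recipient costs one.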
The WhatsApp Business API client can achieve a throughput of 70 messages per second (mps). We tested this performance using the setup specified below.
Coreapp performance might degrade if the volume of messaging is above 70 mps, so the number of API requests to the messages endpoint is rate limited to 80 per second. Request rate limits are implemented for each TCP endpoint. The messages and contacts calls run on different TCP endpoints. This means it's possible to achieve both 80 messages per second and 40 contact checks per second without being rate limited.
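On the client side, one way to stay under these limits is to throttle each endpoint separately. The token-bucket class below is a generic sketch, not part of the WhatsApp Business API client; `RateLimiter` and the two limiter variables are hypothetical names, and only the 80/sec and 40/sec figures come from the text above.

```python
import time


class RateLimiter:
    """Token bucket allowing at most `rate` calls per second (client-side sketch)."""

    def __init__(self, rate, clock=time.monotonic):
        self.rate = rate
        self.clock = clock            # injectable clock, useful for testing
        self.tokens = float(rate)     # start with a full bucket
        self.last = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at one second's worth.
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Separate limiters, because the two endpoints are rate limited independently.
messages_limiter = RateLimiter(80)
contacts_limiter = RateLimiter(40)
```

A caller would check `messages_limiter.try_acquire()` before each messages request and back off briefly when it returns `False`.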
The report below covers 70 mps, with the expected latencies and throughput between the Coreapp and the other components of the system. These numbers can be used as benchmarks to compare against your own setup.
AWS infrastructure is used to measure the performance. We used the AWS template to launch a WhatsApp Business API client. Below are the configuration parameters used for this template while launching it.
* Remaining configuration left with default values
The tests are designed to generate traffic simulating real-world usage of the WhatsApp Business API client.
A total of 4 concurrent test clients generate messages, each sending a message and waiting for a 200 OK response. As soon as the 200 OK response is received, the test client attempts to send the next message without any delay. For each test run, we sent a total of 10,000 messages.
Further, the test clients make a contacts API call for each message before sending. A test run generally took around 8-9 minutes to complete. The results below are based on these tests. For each test run, we cleared the contacts cache and all pre-keys; this means the Coreapp has to make two additional roundtrip exchanges with the servers for these two operations.
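A test driver along these lines could be sketched as follows. This is not the harness used for the published results: `run_load_test`, `check_contact`, and `send_message` are hypothetical, and the two callables stand in for whatever HTTP client code issues the contacts and messages requests and returns the HTTP status code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_load_test(check_contact, send_message, recipients, clients=4):
    """Run `clients` concurrent workers; return how many sends got a 200 response."""

    def worker(chunk):
        ok = 0
        for recipient in chunk:
            check_contact(recipient)            # contacts call before each send
            if send_message(recipient) == 200:  # block until the response arrives
                ok += 1
        return ok

    # Split the recipients round-robin across the workers.
    chunks = [recipients[i::clients] for i in range(clients)]
    with ThreadPoolExecutor(max_workers=clients) as pool:
        return sum(pool.map(worker, chunks))
```

Each worker sends its next message as soon as the previous call returns, which matches the no-delay behavior described above.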
It's important to configure a monitoring setup for the Coreapp, as it helps measure the performance of the Coreapp and its interactions with other system components. While you can build your own monitoring setup by making stats API calls, we strongly recommend setting up the instance monitoring provided by WhatsApp. This is the setup used for this demonstration.
Definitions of terms for each test operation as seen in the following graphs:
messages call from the WhatsApp Business API client to the Coreapp
contacts call from the WhatsApp Business API client to the Coreapp
contacts call from the Coreapp to the server
messages call from the Coreapp to the server
The number of operations per second (OPS) for messaging is a constant 70 for the entire test duration.
The latency for each server call may not be the same because the kind of work the server needs to perform for each operation is different. But, in general, these latency values are in the range of 80 to 150 ms. If the latency values with the servers are much higher than this, you should focus on debugging and correcting your configuration.
The OPS for sending a message from the Coreapp to the server is almost always the same as the throughput from the WhatsApp Business API client to the Coreapp. The OPS for a pre-key fetch (SendGetPreKeyBatchResult:ok) may be less than or equal to the OPS for sending a message (SendMessage:ok). This is because a pre-key fetch call to the server is not required when the Coreapp has messaged the recipient before.
The OPS for checking contacts (UnifiedSyncResult:ok) will be less than or equal to the OPS for sending a message. This is because only one contacts call can be in progress from the Coreapp to the servers at a time. While a contacts call is in progress, the Coreapp consolidates all the incoming contacts requests. It sends these requests in the subsequent call, after receiving the response to the previous call.
One of the primary factors in performance is the database. It's important that the database is located as close to the Coreapp as possible. When sending messages or processing callbacks, the Coreapp performs intensive IO operations against the database. This sample of IO write latency for a throughput of 20 mps shows that the recommended database write latency is ~5 ms.
For each message, there can be up to three callbacks from the WhatsApp servers to the Coreapp. These callbacks are the sent, delivered, and read notifications. When the callbacks are received, the Coreapp persists them in the database, asynchronously returns success to the WhatsApp server, and forwards them to the callback server. As the number of callbacks increases, so does the number of writes to the database; this is another reason to keep monitoring database latency.
Here is the callback latency graph from our testing. Please note that it is processing only sent notifications but not delivered or read.
The default size of the callback queue is 100,000. If the callback server latency is high and the callback volume is very high, these callbacks are queued in the callback queue. When the callback queue is full, the Coreapp stops accepting all messages API calls but continues to accept callbacks from the server, appending them to the callback queue. At that point the Coreapp's message-sending throughput drops to zero until the queue is emptied, up to a certain point. This is why it's important that the latency with your callback server falls into the above range.
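The interaction between the callback queue and message sending can be modeled roughly as below. The class is hypothetical, and it makes one simplifying assumption that the text does not confirm: sending resumes as soon as the queue drops below capacity, whereas the actual resume threshold is not stated.

```python
from collections import deque


class CallbackQueueModel:
    """Illustrative model of a callback queue that blocks sending when full."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity      # default queue size given above
        self.queue = deque()

    def accepts_messages(self):
        # ASSUMPTION: sending resumes once the queue is below capacity again.
        return len(self.queue) < self.capacity

    def on_callback(self, callback):
        # Callbacks from the server are always appended, even when the queue is full.
        self.queue.append(callback)

    def deliver_one(self):
        # Called when the callback server has accepted one callback.
        return self.queue.popleft() if self.queue else None
```

With a slow callback server, `deliver_one` is called less often than `on_callback`, the queue fills, and `accepts_messages` starts returning `False`.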
The job queue of the Coreapp is almost flat, with ~1% usage for the entire test duration.
Please refer to the Debugging section below to learn about mysqlslap and its recommended configuration.
Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 5.648 seconds
    Minimum number of seconds to run all queries: 5.648 seconds
    Maximum number of seconds to run all queries: 5.648 seconds
    Number of clients running queries: 5
    Average number of queries per client: 1000
Running the suggested query against the Single-AZ database took ~5 seconds. In other words, executing a total of 1000 writes took approximately 5 seconds. This means each write operation took ~5 ms on average.
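The per-write figure is simply the reported average run time divided by the number of writes. A small helper, with a hypothetical name, can pull the number out of the mysqlslap report; it assumes the report contains the "Average number of seconds to run all queries" line shown above.

```python
import re


def average_write_latency_ms(mysqlslap_output: str, total_writes: int) -> float:
    """Convert mysqlslap's average run time into milliseconds per write."""
    match = re.search(
        r"Average number of seconds to run all queries: ([\d.]+) seconds",
        mysqlslap_output,
    )
    if match is None:
        raise ValueError("no average run time found in mysqlslap output")
    return float(match.group(1)) / total_writes * 1000
```

For the Single-AZ run above, 5.648 seconds over 1000 writes works out to about 5.6 ms per write.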
The way a database is configured is important to achieving high throughput. In this section, we show how a single database setting can affect Coreapp performance, and we publish the degraded results. These should make it easier to compare metrics when your Coreapp performance is not ideal.
As mentioned, we used AWS RDS as the database for our testing. Creating the RDS instance with Multi-AZ reduced the throughput of the Coreapp by at least 25%. The reduction is most noticeable when sustaining peak load for longer periods.
Multi-AZ: Amazon RDS provides high availability and failover support for database instances using Multi-AZ deployments (see High Availability (Multi-AZ) for Amazon RDS for more information). Importantly, AWS states that database instances using Multi-AZ deployments may have increased write and commit latency compared to a Single-AZ deployment due to the synchronous data replication that occurs, and that this impact is more noticeable for large and write-intensive database instances.
According to the Amazon RDS Under the Hood: Multi-AZ blog post, assessment shows increases in database commit latencies of between 2 ms and 5 ms. With our test client, which has a write-intensive load, the commit latencies are ~15 ms, about three times the ~5 ms measured with Single-AZ. The above graph shows the average database write latency while using Multi-AZ.
Here is the number of operations per second (OPS) for messaging when using Multi-AZ during one of our test runs. It sustained ~12 OPS. For different database configurations with different storage types (magnetic vs. SSD), the throughput of the Coreapp fluctuated between 10 and 15 OPS when using Multi-AZ. This is a huge dip in performance from 20 OPS when using Single-AZ.
Here is the CloudWatch graph comparing the write IOPS of Multi-AZ and Single-AZ, keeping the rest of the setup constant. While Single-AZ could reach more than 1800 IOPS, Multi-AZ hardly reaches 1000 IOPS. This reduced database write IOPS affects the performance of the WhatsApp Business API client. In our testing, we noticed the performance of the Coreapp dropped from 20 mps to ~10 to 15 mps.
The above is the job queue utilization with Multi-AZ turned on.
At its best, the Coreapp using Multi-AZ gives a performance of ~15 mps, and we can see the job queue reaching its peak under intense load. As per the Coreapp design, once the job queue reaches its limit, it waits until 50% of the queue is served before accepting new jobs.
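The full-then-drain behavior of the job queue can be sketched as a gate with hysteresis. The class and method names are hypothetical, and the sketch reads "50% of the queue is served" as the queue falling to half its capacity or less, which is an interpretation rather than a documented detail.

```python
class JobQueueGate:
    """Illustrative model: block new jobs when full, resume at half capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.size = 0
        self.blocked = False

    def try_enqueue(self):
        if self.blocked:
            return False              # still draining, new jobs are refused
        self.size += 1
        if self.size >= self.capacity:
            self.blocked = True       # limit reached: stop accepting new jobs
        return True

    def job_done(self):
        if self.size > 0:
            self.size -= 1
        # ASSUMPTION: accepting resumes once the queue is at or below 50% full.
        if self.blocked and self.size <= self.capacity // 2:
            self.blocked = False
```

This explains why throughput under sustained load shows long stalls rather than a steady lower rate: after the queue fills, nothing new is accepted until half of it has been processed.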
On the other hand, the job queue of the Coreapp using Single-AZ is almost a flat ~1% usage. This is a large difference in the performance of the job queue between using Single-AZ and Multi-AZ. These results should help us understand the important role the database plays in Coreapp performance.
Please refer to the Debugging section below to learn about mysqlslap and its recommended configuration.
Benchmark
    Running for engine innodb
    Average number of seconds to run all queries: 15.581 seconds
    Minimum number of seconds to run all queries: 15.581 seconds
    Maximum number of seconds to run all queries: 15.581 seconds
    Number of clients running queries: 5
    Average number of queries per client: 1000
Running the suggested query against the database with Multi-AZ turned on took ~15 seconds. In other words, executing a total of 1000 writes took approximately 15 seconds. This means each write operation took ~15 ms on average, compared with ~5 ms for Single-AZ.
The mysqlslap tool can be used to simulate client load on a MySQL server, as if multiple clients were accessing the server. It then reports the overall time taken to execute those queries.
mysqlslap -uWA_DB_USERNAME -pWA_DB_PASSWORD -h WA_DB_HOSTNAME -P WA_DB_PORT --auto-generate-sql --concurrency=5 --number-int-cols=5 --auto-generate-sql-load-type=write --auto-generate-sql-secondary-indexes=2 --auto-generate-sql-execute-number=1000 --engine=innodb --commit=1 --auto-generate-sql-add-autoincrement --auto-generate-sql-unique-write-number=200 --verbose
mysqlslap -uroot -pmypassword -h dbhostname.rds.amazonaws.com -P 3306 --auto-generate-sql --concurrency=5 --number-int-cols=5 --auto-generate-sql-load-type=write --auto-generate-sql-secondary-indexes=2 --auto-generate-sql-execute-number=1000 --engine=innodb --commit=1 --auto-generate-sql-add-autoincrement --auto-generate-sql-unique-write-number=200 --verbose
Make sure to run these queries from the machine running the Coreapp. This query simulates 5 concurrent clients doing a total of 1000 inserts into the database.
The WhatsApp Business API troubleshooting tool (wadebug) can be used to debug callback server latency.
In the above sections, we presented the critical system components that contribute to performance. If you would like to look at all the metrics, or at more detailed metrics for the above components, there are snapshots of those results:
For these two tests, a total of 10,000 templated messages were sent.
Also, here is the snapshot for the long-running test. In this test, a total of 100,000 templated messages were sent.
If there is a delay for only a subset of numbers, then it is likely not an issue affecting the customer's integration but rather an issue on the recipient's end. These delays in delivery can happen for a number of reasons, including: