How do I determine throttling in my CloudWatch Logs?

I receive a "RequestLimitExceeded" or "ThrottlingException" error when I work with Amazon CloudWatch Logs, and my API call is throttled.

Short description

When you work with CloudWatch Logs, you might exceed the API rate limit. When this happens, you receive a RequestLimitExceeded or ThrottlingException error, and your API call is throttled. To resolve these errors and make informed requests for rate limit increases, you must identify where and when throttling happens.

Resolution

Intermittent throttling on CloudWatch Logs when accessing logs

You can use the FilterLogEvents or GetLogEvents API calls to list log events from your log groups or log streams. These API calls have a hard limit, and they don't qualify for a limit increase. For example, if you use the FilterLogEvents API to search for log events from a specified log group, then the quota is 5 transactions per second (TPS) per account, per AWS Region. If you reach this limit, then you receive the RateExceeded error.
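To make the 5 TPS figure concrete, the following is a minimal sketch of a client-side pacer that spaces calls at least 1/5 of a second apart. The `Pacer` class and its parameters are hypothetical names for illustration. The quota applies per account and Region, so a pacer inside one process doesn't coordinate with other callers in the same account.

```python
import time


class Pacer:
    """Hypothetical helper: spaces calls so at most `max_tps` start per second."""

    def __init__(self, max_tps, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_tps  # for 5 TPS, 0.2 seconds between calls
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Call `pacer.wait()` immediately before each FilterLogEvents or GetLogEvents request that the script sends.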

Use these best practices to avoid throttling errors in this use case:

ThrottlingException errors when using an application/script to fetch CloudWatch log data

To collect CloudWatch Logs data, you can develop a collector script. This script makes DescribeLogStreams or GetLogEvents API calls to pull data from different log streams or different time frames in the same log group. API calls such as FilterLogEvents, GetLogEvents, and DescribeLogStreams are designed for human interaction and not for automation. When a script makes these calls in rapid succession, the calls can exceed the rate limit. You then receive an error, and the API call is throttled.

Use these best practices to avoid throttling in this use case:

  • Use exponential backoff and retries when you make an API call. For more information, see Exponential backoff and jitter and Error retries and exponential backoff in AWS.
  • Distribute your API calls over time. Schedule actions with some randomization so that they are spread over a period of time.
  • Add sleep intervals between consecutive API calls. Add some delay between API calls that are sent from the same script or application. If you send API calls in rapid succession, then this is more likely to cause rate errors.
  • In some cases, you might use a security information and event management (SIEM) solution such as Splunk to fetch logs from CloudWatch. SIEM solutions gather data from multiple systems and analyze this data to detect unusual behavior. You might experience API throttling when you use the Splunk plugin. To avoid this issue, create a CloudWatch Logs subscription filter with Amazon Kinesis Data Firehose, and deliver the log data to Splunk. For more information, see the Splunk documentation for Configure Kinesis inputs for the Splunk Add-on for AWS.
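The backoff, retry, and randomization practices in the preceding list can be sketched as follows. This is an illustrative example, not an official implementation: `ThrottledError` and `call_with_backoff` are hypothetical names, and the delay values are arbitrary defaults. In a boto3 script, you would likely catch `botocore.exceptions.ClientError` and check the error code for ThrottlingException instead of using the stand-in exception.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error such as ThrottlingException."""


def call_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Retry `call` on ThrottledError with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            # The delay ceiling doubles on every attempt, up to max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep for a random time in [0, cap) so that
            # concurrent clients don't retry at the same moment.
            sleep(rng() * cap)
```

The random jitter also spreads calls from several collectors over time, which is the purpose of the "distribute your API calls" practice.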

Throttling errors when integrating PutLogEvents API calls with a Lambda function

The PutLogEvents API call uploads logs to a specified log stream in batches of up to 1 MB. This API has a rate limit of 800 TPS per account, per Region. The exception is the following Regions, where the quota is 1500 TPS per account, per Region: US East (N. Virginia), US West (Oregon), and Europe (Ireland). You can request a quota increase.

For more information on this, and to request a quota increase, see CloudWatch Logs quotas.
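Because each PutLogEvents call counts against the TPS quota, sending fewer and fuller batches reduces the number of calls. The following is a minimal sketch that groups messages into batches under the 1 MB limit. The function name `batch_events` is hypothetical. The 26 bytes of overhead per event and the cap of 10,000 events per batch are not stated in this article. They are assumptions taken from the PutLogEvents API reference, so verify them against the current documentation.

```python
MAX_BATCH_BYTES = 1_048_576   # 1 MB batch limit
PER_EVENT_OVERHEAD = 26       # assumption: bytes counted per event in addition to the message
MAX_BATCH_EVENTS = 10_000     # assumption: maximum events in one batch


def batch_events(messages, max_bytes=MAX_BATCH_BYTES, max_events=MAX_BATCH_EVENTS):
    """Yield lists of messages whose combined size stays within the batch limits."""
    batch, size = [], 0
    for msg in messages:
        event_size = len(msg.encode("utf-8")) + PER_EVENT_OVERHEAD
        # Start a new batch when adding this event would exceed either limit.
        if batch and (size + event_size > max_bytes or len(batch) >= max_events):
            yield batch
            batch, size = [], 0
        batch.append(msg)
        size += event_size
    if batch:
        yield batch
```

A single message that is larger than the limit is still yielded alone in its own batch, so the caller must truncate or split it before upload.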

Manage your CloudWatch Logs service quotas

AWS defines quotas for services to protect performance and to help ensure availability. CloudWatch has quotas for metrics, alarms, API requests, and alarm email notifications. Use these steps to view your service quotas and to set alarms that notify you when you reach a threshold:

1.    Open the Service Quotas console.

2.    In the navigation pane, choose AWS services.

3.    From the AWS services list, search for Amazon CloudWatch Logs.

4.    The Service quotas list shows you several attributes: the service quota name, the applied value (if it's available), the AWS default quota, and whether the quota value is adjustable.

5.    To view more information about a service quota, such as the description, choose the quota name.

6.    After you choose the quota name, you can see more information about this quota. For example, if you choose GetLogEvents throttle limit in transactions per second, then you see the following items:

  • Description
  • Quota code
  • Quota ARN
  • Utilization: %
  • Applied quota value
  • AWS default quota value
  • Adjustable: Y/N

7.    In the Amazon CloudWatch alarms section, choose Create alarm, and enter an Alarm threshold and Alarm name.
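The quota values that the console shows in these steps can also be read from a script. The following is a minimal sketch, assuming a boto3 `service-quotas` client and assuming that `logs` is the service code for CloudWatch Logs. The function name `list_quotas` is hypothetical, and the client is passed in so that the function itself has no dependency on AWS credentials.

```python
def list_quotas(client, service_code="logs"):
    """Return (name, value, adjustable) for each quota of one service.

    `client` is assumed to behave like boto3.client("service-quotas").
    """
    paginator = client.get_paginator("list_service_quotas")
    rows = []
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page.get("Quotas", []):
            rows.append((quota["QuotaName"], quota["Value"], quota["Adjustable"]))
    return rows
```

With boto3 installed and credentials configured, a call such as `list_quotas(boto3.client("service-quotas"))` returns rows that correspond to the quota name, value, and adjustable columns in the console.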


Related information

How do I avoid throttling when I call PutMetricData in the CloudWatch API?

AWS OFFICIAL
Updated a year ago