AWS STS token failure leads to ClickHouse Cloud outage
Resolved · Degraded performance

Updates

Resolved

We received confirmation from AWS support that the issue has been resolved, and our monitoring no longer detects any errors. This incident is now considered resolved. For additional details regarding the AWS IAM issue, you can refer to https://health.aws.amazon.com/health/status?eventID=arn:aws:health:global::event/IAM/AWS_IAM_OPERATIONAL_ISSUE/AWS_IAM_OPERATIONAL_ISSUE_62881_637B393821C

Mon, Dec 18, 2023, 04:02 AM

Monitoring

Our team has validated that the creation of new ClickHouse services is operational across all AWS regions. While awaiting the definitive confirmation from the AWS support team, we will continue to monitor all regions for any potential issues.

Mon, Dec 18, 2023, 03:50 AM (12 minutes earlier)

Identified

Update: The AWS support team has verified that this issue specifically impacts the creation and modification of IAM roles, users, and policies, with no impact on existing IAM configurations. Our team thoroughly checked our internal services and existing customer instances and observed no disruptions. We therefore conclude that this incident affects only newly created ClickHouse instances.

Mon, Dec 18, 2023, 03:36 AM (13 minutes earlier)

Investigating

We have just been informed of an ongoing problem affecting the AWS IAM STS token endpoint. This issue has disrupted ClickHouse services for customers in all AWS regions. Currently, it is not possible to provision new services, and a small number of existing services may experience some disruption. Our team is working with AWS to resolve this issue as quickly as possible.

Mon, Dec 18, 2023, 03:19 AM (16 minutes earlier)

Availability metrics are reported at an aggregate level across all tiers and error types.
Individual customer availability may vary depending on their workload, autoscaling settings and API features in use.