Availability metrics are reported at an aggregate level across all tiers and error types.
Individual customer availability may vary depending on their workload, autoscaling settings and API features in use.
ClickHouse continues to scale external-dns in each region. The following regions are complete and operating normally: us-east-1, af-south-1, ap-east-1, ap-northeast-1, ap-northeast-2, ap-south-1.
Additional regions are in progress or will begin shortly, and the remaining regions are estimated to complete in 3-5 hours: us-east-2, us-west-2, eu-central-1, eu-west-1, eu-west-2, ap-southeast-1, ap-southeast-2, il-central-1.
Monitoring
We applied the fix to a subset of regions and see signs of improvement. We are gradually applying the fix to all other regions and monitoring the status.
Identified
We identified the issue and are applying the remediation. The issue was caused by a manual operation performed on the AWS Route53 configuration earlier today. It increased the rate of requests to the Route53 API, leading to throttling and retries that affected operations depending on it, including provisioning of new ClickHouse instances.
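For readers unfamiliar with why throttling and retries can compound each other, the following is a minimal, illustrative sketch of a common client-side mitigation: capped exponential backoff with full jitter. It is not the remediation applied in this incident (the updates above describe scaling external-dns per region), and `ThrottledError` and `call_with_backoff` are hypothetical names that do not come from ClickHouse or AWS code.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an API response signalling throttling."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Run operation(), retrying on ThrottledError with capped exponential
    backoff and full jitter. All parameter values are illustrative."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The backoff window doubles per attempt, up to max_delay.
            window = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random fraction of the window so that
            # many clients do not retry at the same instant.
            sleep(window * rng())
```

Spreading retries out in this way tends to lower the request rate seen by a throttled API, whereas immediate retries add to the load that triggered the throttling in the first place.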
Investigating
We spotted an issue where the creation of new ClickHouse instances is delayed or stuck. The team is investigating and is already applying a temporary remediation.