Yesterday morning, starting around 8:30 AM US Eastern time, the Konnected Cloud service stopped getting responses from SmartThings. We strongly suspect that this is related to this incident reported by SmartThings, which occurred at the same time.
When the SmartThings connection became unresponsive, our cloud infrastructure on AWS started scaling up capacity as your sensor updates piled up, ultimately overwhelming some services and hitting AWS rate limits. This caused all Konnected Cloud features, including device provisioning, to fail for a brief period.
Unfortunately, this caused a complete disruption of all Konnected devices connecting to SmartThings via the Konnected Cloud, and also prevented users from provisioning new devices and/or adding and updating zones. We estimate that a partial or total outage lasted for about 3 hours.
What We've Done in Response
This was our first major outage since Konnected Cloud launched in January of this year. It revealed some weaknesses and potential points of failure in our cloud infrastructure, as well as a lack of comprehensive alerting. Konnected Cloud is built on "infinitely" scaling technology, namely AWS Lambda and AWS IoT. Infinite scalability is usually a good thing, but when a third-party service (SmartThings) fails, our scaling needs to respond more intelligently so that it doesn't cause other things to fail.
We've implemented a few remediations to prevent a problem like this in the future:
Introduced Concurrency Limits
We introduced concurrency limits in our AWS Lambda infrastructure to reduce the chance of "runaway" scaling like we experienced yesterday.
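For readers curious what a limit like this can look like, here is a minimal sketch that uses Lambda reserved concurrency through boto3. It is an illustration only: the helper name, the function name, and the limit of 50 are hypothetical, and this is not necessarily the exact configuration running in Konnected Cloud.

```python
def cap_lambda_concurrency(lambda_client, function_name, limit):
    """Reserve at most `limit` concurrent executions for one Lambda function.

    `lambda_client` is expected to be a boto3 Lambda client, for example
    boto3.client("lambda"). Passing it in keeps this helper easy to test.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    # put_function_concurrency sets the function's reserved concurrency.
    # Invocations beyond this limit are throttled instead of scaling further.
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )


# Usage (requires boto3 and AWS credentials; the names are hypothetical):
#   import boto3
#   cap_lambda_concurrency(boto3.client("lambda"), "sensor-update-forwarder", 50)
```

With a cap in place, a stalled third-party service causes throttled invocations for one function rather than unbounded scaling that can exhaust account-wide limits.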
Improved Alerting & Monitoring
AWS provides robust alerting and monitoring through Amazon CloudWatch. We've added to and improved our CloudWatch metrics and alerts, and integrated them with an alert notification app so that Konnected staff will be notified much more quickly if and when problems like this happen again.
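As a rough illustration of this kind of alerting, the sketch below builds the settings for a CloudWatch alarm on Lambda throttling. The helper name, alarm name, thresholds, and SNS topic are all hypothetical and chosen for the example; they are not a description of the actual alarms we run.

```python
def build_throttle_alarm(function_name, sns_topic_arn, threshold=1):
    """Return keyword arguments for the CloudWatch put_metric_alarm call.

    The alarm fires when the Lambda function reports at least `threshold`
    throttled invocations per minute for five consecutive minutes.
    """
    return {
        "AlarmName": f"{function_name}-throttles",
        "Namespace": "AWS/Lambda",       # built-in Lambda metrics namespace
        "MetricName": "Throttles",       # count of throttled invocations
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,                    # seconds per evaluation window
        "EvaluationPeriods": 5,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no data means no throttling
        "AlarmActions": [sns_topic_arn],     # where notifications are sent
    }


# Usage (requires boto3 and AWS credentials; the names are hypothetical):
#   import boto3
#   params = build_throttle_alarm("sensor-update-forwarder", topic_arn)
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```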
Transparency and Communication
We also launched a Konnected status page that's integrated with the new alerts and automatic monitoring. You can check here for real-time updates on the state of our cloud services, and even subscribe to get email or SMS notifications in the event of a future outage.
Cloud Confidence and Your Choices
We're very confident that the Konnected Cloud infrastructure, built on the world-class Amazon Web Services, will be stable and scalable for years to come. We've built this service to provide the ultimate in convenience and ease of use for integrating with SmartThings (and, in the future, other cloud-based smart home platforms), and we are 100% committed to maintaining its stability and security.
That said, you always have choices with Konnected. The Konnected Cloud service is never required to use our devices. Our Home Assistant and Hubitat integrations work 100% locally, without any dependency on any first-party or third-party cloud service.
If you have any questions, don't hesitate to reach out to us at firstname.lastname@example.org.