Frontend Builds Not Completing
Incident Report for Shogun
Postmortem

Summary:

From 8:40 PM ET on September 21st, 2022 to 11:52 AM ET on September 22nd, 2022, Shogun Frontend’s site builder was unavailable. When a new build was triggered, customers received an error message asking them to contact Shogun support. The cause was that the underlying service that powers the site builder was unavailable for the duration of the incident.

Details:

Shogun partners with a global leader in infrastructure as a service to run storefront builds and to store the static assets output by the Frontend builder. Within that provider, Shogun maintains geographically redundant builder and storage infrastructure.

At 8:40 PM ET on September 21st, Shogun was alerted to errors within the Frontend builder infrastructure. The errors indicated that Shogun’s account could no longer access the service underlying the builder. Our provider gave Shogun no prior communication or warning that access to the service would be affected in any way.

Shogun’s Engineering team immediately engaged with support at our provider to re-establish access to the underlying service.

In parallel, Shogun’s Engineering team evaluated and planned alternative solutions in case access was not restored in a timely fashion. Because access was cut at the account level, every region in the account was affected at once, so the geographic redundancy previously established by Shogun did not allow for failover.

Shogun’s Engineering team then began re-creating the Frontend builder infrastructure in a separate account at our provider to restore service.

At 11:52 AM ET on September 22nd, Shogun’s Engineering team completed the installation, configuration and testing of the new builder infrastructure.

Shogun Engineering continues to work with our partners to restore access to the underlying service on our original account.

Preventative Measures:

Shogun will investigate other technology partnerships for the Frontend site builder. Shogun will also maintain backup builder infrastructure in a separate account, which will allow for an instantaneous failover to a backup builder if similar issues occur in the future.
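
The difference between regional redundancy and account-level redundancy can be sketched as follows. This is a minimal illustration only, not Shogun's actual failover logic: the `Builder` type, the `pick_builder` function, and the account and region labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Builder:
    account: str  # hypothetical account label
    region: str   # hypothetical region label


def pick_builder(
    builders: Iterable[Builder],
    blocked_accounts: Iterable[str] = (),
    down_regions: Iterable[str] = (),
) -> Optional[Builder]:
    """Return the first builder whose account and region are both usable."""
    blocked = set(blocked_accounts)
    down = set(down_regions)
    for builder in builders:
        if builder.account in blocked:
            continue  # account-level loss takes out every region in that account
        if builder.region in down:
            continue  # region-level loss only affects that one region
        return builder
    return None


# Two regions in a single account: survives a regional outage only.
primary_only = [Builder("primary", "region-a"), Builder("primary", "region-b")]
# Same setup plus a builder in a separate account.
with_backup = primary_only + [Builder("backup", "region-a")]

regional_outage = pick_builder(primary_only, down_regions={"region-a"})
account_outage = pick_builder(primary_only, blocked_accounts={"primary"})
account_outage_with_backup = pick_builder(with_backup, blocked_accounts={"primary"})
```

In this sketch, a regional outage still leaves a usable builder in the other region, an account-level block leaves none, and adding a builder in a separate account restores a failover target.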

Shogun is continuing to work with our provider on their own root cause analysis (RCA) for this issue, and we will provide relevant updates as more information becomes available.

Posted Sep 23, 2022 - 00:05 UTC

Resolved
We have resolved this incident, and have put in place additional measures to allow for instantaneous failover to a backup builder in case of similar issues in the future. A post-mortem will be posted shortly.
Posted Sep 23, 2022 - 00:03 UTC
Update
We are aware of intermittent build failures and are working to increase the amount of resources allocated to our builder infrastructure. In the event that a build fails, please wait a few minutes and retry.
Posted Sep 22, 2022 - 16:37 UTC
Monitoring
The issue has been resolved and users can now build and publish. We will be actively monitoring this for the rest of the day. If you experience any difficulties building or publishing, please contact support at https://getshogun.atlassian.net/servicedesk/customer/portals
Posted Sep 22, 2022 - 16:02 UTC
Update
We are continuing to work on a fix for this issue.
Posted Sep 22, 2022 - 15:22 UTC
Update
We are continuing to work with our hosting provider on this issue and are simultaneously working to bring redundant infrastructure online.
Posted Sep 22, 2022 - 07:38 UTC
Identified
There is currently an issue preventing Frontend builds from completing. We have identified the source and are working to resolve it. Thank you for your patience.
Posted Sep 22, 2022 - 02:43 UTC
This incident affected: Frontend (AWS codebuild-us-east-1).