In August, we experienced no incidents resulting in service downtime. This month’s GitHub Availability Report will dive into updates to the GitHub Status Page and provide follow-up details on how we’ve addressed the incident mentioned in July’s report.
At GitHub, we’re always striving to be more transparent and clear with our users. We know our customers trust us to communicate the current availability of our services, and we are now making that clearer for everyone to see. Previously, our status page shared a 90-day history of GitHub’s availability by service, but this history can distract users from what’s happening right now, which during an incident is the most important piece of information. Starting today, the status page will display our current availability and inform users of any degradation in service with real-time updates from the team. The history view will continue to be available and can be found under the “incident history” link. As a reminder, if you want to receive updates on any status changes, you can subscribe to get notifications whenever GitHub creates, updates, or resolves an incident.
Check out the new GitHub Status Page.
Since the incident mentioned in July’s GitHub Availability Report, we’ve worked on a number of improvements to both our deployment tooling and the way we configure our Kubernetes deployments, with the goal of improving the reliability posture of our systems.
First, we audited all the Kubernetes deployments used in production to remove all usages of the ImagePullPolicy of Always configuration.
Our philosophy when dealing with Kubernetes configuration is to make it easy for internal users to ship their code to production while continuing to follow best practices. For this reason, we implemented a change that automatically replaces the ImagePullPolicy of Always setting in all our Kubernetes-deployed applications, while still allowing experienced users with particular needs to opt out of this automation.
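To make the idea concrete, here is a minimal sketch of what such a rewrite could look like, written in Python against a Deployment manifest loaded as a plain dictionary. It is an illustration under stated assumptions, not a description of our actual tooling: the replacement value `IfNotPresent`, the opt-out annotation name `example.com/keep-image-pull-policy`, and the function name are all hypothetical.

```python
import copy

# Hypothetical annotation name, chosen only for this illustration.
OPT_OUT_ANNOTATION = "example.com/keep-image-pull-policy"


def replace_always_pull_policy(deployment, replacement="IfNotPresent"):
    """Return a copy of a Deployment manifest with imagePullPolicy: Always rewritten.

    The replacement value is an assumption made for this sketch.
    """
    result = copy.deepcopy(deployment)

    # Owners with particular needs can opt out through an annotation.
    annotations = result.get("metadata", {}).get("annotations") or {}
    if annotations.get(OPT_OUT_ANNOTATION) == "true":
        return result

    pod_spec = result.get("spec", {}).get("template", {}).get("spec", {})
    for key in ("initContainers", "containers"):
        for container in pod_spec.get(key) or []:
            if container.get("imagePullPolicy") == "Always":
                container["imagePullPolicy"] = replacement
    return result


# Example usage with a made-up manifest.
manifest = {
    "metadata": {"name": "web"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "app", "image": "app:1.0", "imagePullPolicy": "Always"}
                ]
            }
        }
    },
}
rewritten = replace_always_pull_policy(manifest)
print(rewritten["spec"]["template"]["spec"]["containers"][0]["imagePullPolicy"])
```

Note that Kubernetes itself defaults an omitted policy to Always when the image tag is `latest` or missing, so a real implementation would probably need to handle that case too. This sketch only rewrites explicit settings.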
Second, we implemented a mechanism equivalent to Kubernetes mutating admission controllers, which we use to inject the latest version of sidecar containers, identified by the SHA256 digest of the image.
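As a rough illustration of digest pinning, the Python sketch below adds a sidecar to a pod spec and references its image by SHA256 digest instead of a mutable tag. The mechanism itself is not detailed in this report, so everything here is an assumption: the function names, the registry host `registry.example.com`, the sidecar name, and the placeholder digest are hypothetical.

```python
import re

# A digest reference has the form "sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def pinned_image(repository, digest):
    """Build an image reference of the form repository@sha256:<hex>."""
    if not DIGEST_RE.match(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    return f"{repository}@{digest}"


def inject_sidecar(pod_spec, name, repository, digest):
    """Return a new pod spec with the named sidecar added or replaced.

    The input spec is not modified. Injecting twice with the same
    arguments yields the same result.
    """
    sidecar = {"name": name, "image": pinned_image(repository, digest)}
    # Drop any existing container with the same name, then append the pinned one.
    containers = [c for c in pod_spec.get("containers", []) if c.get("name") != name]
    containers.append(sidecar)
    return {**pod_spec, "containers": containers}


# Example usage with a placeholder digest (not a real image).
placeholder_digest = "sha256:" + "0" * 64
spec = {"containers": [{"name": "app", "image": "app:1.0"}]}
mutated = inject_sidecar(spec, "proxy", "registry.example.com/proxy", placeholder_digest)
print(mutated["containers"][-1]["image"])
```

Because a digest names one exact image, a node that already has that image cached can restart the container without resolving a tag against the registry.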
These changes allowed us to remove the strong coupling between Kubernetes Pods and the availability of our Docker registry in case of container restarts. We have more improvements in the pipeline that will further increase the resilience of our Kubernetes deployments, and we plan to share more information about them in the future.
We’re excited to share these updates with you and look forward to future updates as we continue our efforts to make GitHub more resilient every day.