@zheng022
Last active August 30, 2024 17:22
revive runbook

We would like to have runbooks to support engineers who are on call.

  1. Review the existing runbooks from Support that relate to GHES engineering.

A spreadsheet has been created to start with the runbooks that are over 1 year old.

  1. Create runbooks for GHES engineering, organized by categorized topics.

A runbook is a how-to guide that focuses on solving a practical problem or completing a real-world task. Refer to the compass to decide whether what you want to document is a runbook or something else.

  1. Runbooks will be created from an issue template.

The engineer fills in the template, and the corresponding labels for its subtopics are added accordingly. Once the content is filled in, the issue can be marked with the runbook-review-ready label, and a reviewer (an engineer from the topic's AOR or the availability champion) is assigned to review it.

Once the review is done, the reviewer adds the runbook-review-approved label. An Actions workflow then creates a PR that adds the content to its corresponding folder, using the information filled in on the issue.
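The label-driven flow above could be sketched roughly as follows. This is only an illustration, not the actual workflow: the two review labels come from the proposal, while the `topic:` label prefix, the `runbooks/<topic>/` folder layout, and the function names are all assumptions made for the example.

```python
import re

REVIEW_READY = "runbook-review-ready"        # label named in the proposal
REVIEW_APPROVED = "runbook-review-approved"  # label named in the proposal
TOPIC_PREFIX = "topic:"  # hypothetical label convention, not from the proposal


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def next_step(labels: list[str]) -> str:
    """Decide what should happen to a runbook issue based on its labels."""
    if REVIEW_APPROVED in labels:
        return "open-pr"          # workflow creates the PR
    if REVIEW_READY in labels:
        return "assign-reviewer"  # AOR engineer or availability champion
    return "draft"                # author is still filling in the template


def target_path(title: str, labels: list[str]) -> str:
    """Pick the folder from the first topic label (hypothetical layout)."""
    topics = [l[len(TOPIC_PREFIX):] for l in labels if l.startswith(TOPIC_PREFIX)]
    topic = topics[0] if topics else "uncategorized"
    return f"runbooks/{topic}/{slugify(title)}.md"
```

Keeping the decision logic in small pure functions like these would make it easy to unit-test separately from whatever workflow file ends up calling it.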

  1. On closing of ANY sev 1 issue, a comment will be posted that tags the assignees and asks them to create a runbook for the issue.
  2. On closing of a weekly support issue from the oncall project board, always ask whether any runbook can be improved.
  3. Periodically summarize runbook contributions and announce them in the ghes team Slack channel to promote maintenance of runbooks.
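As a rough sketch of the first and third reminders above, the helpers below build the comment for a closed sev 1 issue and a plain-text contribution summary. The `sev1` label name, the shape of the `issue` dictionary, and the wording of the comment are assumptions for illustration; the real automation would read these from the GitHub API.

```python
from collections import Counter
from typing import Optional

SEV1_LABEL = "sev1"  # hypothetical label name; the proposal only says "sev 1 issue"


def runbook_reminder(issue: dict) -> Optional[str]:
    """Return the reminder comment for a closed sev 1 issue, else None."""
    if issue["state"] != "closed" or SEV1_LABEL not in issue["labels"]:
        return None
    mentions = " ".join(f"@{login}" for login in issue["assignees"])
    return f"{mentions} This sev 1 issue is closed. Please create a runbook for it."


def contribution_summary(authors: list[str]) -> str:
    """Count runbook contributions per author, most active first."""
    counts = Counter(authors)
    return "\n".join(f"{name}: {n}" for name, n in counts.most_common())
```

For example, `contribution_summary(["ana", "bo", "ana"])` yields one line per author with their count, which could be pasted into the Slack announcement as is.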
@gamefiend

Looks good! I'd love it if we could use the Diataxis framework for how-to guides as a structure for our runbooks: https://diataxis.fr/how-to-guides/

@gamefiend

Also, I think it would be good if we differentiate reference from the runbooks as well. I think a lot of the docs we do have try to act as both reference and how-to/runbook. Keeping the purposes and forms more focused will be a huge help in this IMO.

For structure of docs, I think the compass that Diataxis uses might be useful: https://diataxis.fr/compass/

Last thought: I think it would be very useful if we built a library of black-box flow and architecture diagrams that show how the components interact. Having an up-to-date list of those would be IMO a huge help, since it lets on-call engineers understand the shape and flow of a GHES component quickly.

@lindarodgers

This looks great, Zheng! I will echo what Quinn said... I'm hoping we can get a good set of runbooks (aka "how-to" guides) for specific issues, each listing a set of instructions that describe how a technical issue can be solved (or how a customer support issue was solved in the past). I'm hoping that we can create a runbook out of the majority of the SEV-1s that we review in our GHES Availability meetings.

@tonytrg

tonytrg commented Aug 29, 2024

Thanks for creating the runbook. It's always hard to start defining a process.

I love this idea of automation, especially on closing issues. It would be good to remind people to follow a routine:

  • fix the title for better search
  • add the correct labels for better search
  • update the topic's runbook if we made an interesting finding

@phumpal

phumpal commented Aug 30, 2024

summarize periodically on runbook contribution and announce in ghes team slack channel to promote maintenance of runbooks

Would it be worth posting these updates to a ghes team Slack channel? Or possibly a weekly summary?
