What do SREs have to do with project-based services?

A lot has been said and written about how site reliability engineers (SREs) are shaping (and reshaping) the IT operations of modern applications. There are still a few areas where we debate the benefits of bringing SREs onto the scene. Nevertheless, in most cases, site reliability engineering is the answer.

Recently I was asked what SREs have to do with project-based services. Imagine short-term projects where we do a technology refresh or implement a new infrastructure model, with the work done mostly by architects. These projects typically have no Ops phase per se, as they last only a few months. Would site reliability engineering still be relevant?

My immediate reaction was to defend our flag. If we think about the "secure by design" and "build to manage" principles, an SRE would make a difference in such projects. Designing applications (and IT infrastructure) to be secure from the start and building them to be easier to manage (and observe) are things SREs can drive.

Of course, I was not satisfied with my answer, so I decided to put my head down and think about this more pragmatically. Sometimes, when you need to correlate different things, there's no better way than going back to the basics. So I revisited the SRE tenets and checked which ones would apply to short-term project execution.

Scale Ops with Load

Most likely, on a 3-month project, we won't be able to automate all runbooks (the way we operate an application in production). However, SREs can design such runbooks in a way that makes automation not only possible but nearly effortless. Writing runbooks as algorithms or pseudocode might allow future automated code generation in any scripting language. Since SREs are developing such documentation, they will also feed their findings back into the architectural design to improve it from an operations scalability perspective. Not just that: better architecture also leads to less Ops toil. On this aspect, architects and SREs should be best friends forever.
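To make the "runbook as an algorithm" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Step` structure, the `run_runbook` helper, and the "disk almost full" scenario with its thresholds are hypothetical, not something taken from a real project or tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One runbook step: when it applies and what to do (hypothetical structure)."""
    name: str
    condition: Callable[[Dict], bool]  # does this step apply to the current state?
    action: Callable[[Dict], None]     # what the operator (or automation) does


def run_runbook(steps: List[Step], state: Dict) -> List[str]:
    """Walk the steps in order and return the names of the ones executed."""
    executed = []
    for step in steps:
        if step.condition(state):
            step.action(state)
            executed.append(step.name)
    return executed


# Hypothetical "disk almost full" runbook; the numbers are made up.
disk_runbook = [
    Step(
        name="rotate logs",
        condition=lambda s: s["disk_used_pct"] > 80,
        action=lambda s: s.update(disk_used_pct=s["disk_used_pct"] - 15),
    ),
    Step(
        name="escalate to on-call",
        condition=lambda s: s["disk_used_pct"] > 80,
        action=lambda s: s.update(escalated=True),
    ),
]

state = {"disk_used_pct": 90, "escalated": False}
print(run_runbook(disk_runbook, state))
```

A runbook written this way is still readable as documentation during the project, while each step already has the "if this, then do that" shape that a future automation effort would need.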

Observability

As a common ground with DevOps, observability requirements should be part of software engineering, as both functional and non-functional requirements. SREs are experts in system reliability who can provide valuable insights into application or infrastructure designs to make them more observable. It's not about mere infrastructure monitoring; it's about the golden signals (latency, traffic, errors, and saturation), where the end-user experience is incorporated and considered.
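As a rough illustration of what "designing for the golden signals" could mean, the sketch below derives latency, traffic, errors, and saturation from a batch of request records. The `Request` record, the `golden_signals` function, the simple index-based 95th percentile, and the use of CPU utilization as the saturation measure are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """A single end-user request as the application might log it (hypothetical)."""
    latency_ms: float
    ok: bool


def golden_signals(requests: List[Request], window_s: float,
                   cpu_utilization: float) -> Dict[str, float]:
    """Summarize one observation window into the four golden signals."""
    n = len(requests)
    latencies = sorted(r.latency_ms for r in requests)
    # Simple index-based approximation of the 95th percentile.
    p95 = latencies[min(n - 1, int(0.95 * n))] if n else 0.0
    errors = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": p95,              # latency
        "traffic_rps": n / window_s,        # traffic
        "error_rate": errors / n if n else 0.0,  # errors
        "saturation": cpu_utilization,      # saturation (assumed: CPU, 0.0 to 1.0)
    }


sample = [Request(100, True), Request(200, True),
          Request(300, False), Request(400, True)]
print(golden_signals(sample, window_s=2.0, cpu_utilization=0.5))
```

The point for a short-term project is less the arithmetic and more the requirement it implies: the application has to emit per-request latency and success data in the first place, which is a design decision architects and SREs can make together.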

Ops Readiness Review (ORR)

That's the last tenet that I can correlate between solution architecting and site reliability engineering. It is by far the most straightforward one, as it's all about readiness for the operations phase. SREs can drive continuous improvement in operational readiness reviews (ORRs). Working closely with architects, they can check system readiness before releasing an app or infrastructure component version to production. ORR scorecard definitions and metrics will be optimal when they are an output of such collaboration.
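An ORR scorecard can be as simple as a weighted checklist. The sketch below is one possible shape, not a standard: the `Check` fields, the weights, the 0.8 threshold, and the rule that any failed mandatory check blocks the release are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Check:
    """One ORR checklist item (hypothetical fields and weights)."""
    name: str
    weight: int
    passed: bool
    mandatory: bool = False


def orr_score(checks: List[Check]) -> float:
    """Weighted share of passed checks, between 0.0 and 1.0."""
    total = sum(c.weight for c in checks)
    earned = sum(c.weight for c in checks if c.passed)
    return earned / total if total else 0.0


def is_ready(checks: List[Check], threshold: float = 0.8) -> bool:
    """Ready only if no mandatory check failed and the score meets the threshold."""
    blockers = [c.name for c in checks if c.mandatory and not c.passed]
    return not blockers and orr_score(checks) >= threshold


scorecard = [
    Check("runbooks documented", weight=3, passed=True, mandatory=True),
    Check("alerts on golden signals", weight=3, passed=True),
    Check("dashboards reviewed", weight=2, passed=False),
    Check("rollback tested", weight=2, passed=True, mandatory=True),
]
print(orr_score(scorecard), is_ready(scorecard))
```

Which checks exist, how much each one weighs, and which are mandatory is exactly the kind of definition that works best when architects and SREs agree on it together.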

What other activities can you uncover for project-based services with SREs? Please put your suggestions in the comments section below.

I hope you enjoyed this post. Stay tuned by following me on Twitter (@ranami) and on LinkedIn.

As the times demand, stay safe, protect your loved ones, and help others!

Thanks, Rod.

James Maclean

Site Reliability Engineer

2y

Blameless Post Mortems are another tenet that is useful in Project Delivery. Most projects will encounter some kind of issue during the testing phases of project delivery (that's why we test). Embedding this process into projects can really fast-track remediation of the issues encountered and keep project delivery on track.

Kitty Smith

Retired Distinguished Engineer

2y

“Architects and SREs should be best friends forever” I love it!
