CloudBees IT Operations - Background

CloudBees IT Operations

CloudBees engineering has grown from its early days of a handful of engineers through multiple iterations that have been challenging both technically and operationally.

Products such as DEV@cloud, CloudBees Jenkins Enterprise, RUN@cloud, RUN@cloud ecosystem providers and, most recently, Codeship have all shared one common feature - sandwiches.

CloudBees in general has undergone dramatic growth. As the CloudBees offering has evolved, revolved, died and been replaced, so too and

Day 2 Operations

One interesting description of what we do in CloudBees Ops is the term “Day 2 Operations”.

It’s unclear where it was coined - but DC/OS has a definition here - https://dcos.io/day2ops/ - and Ben Hindman of Mesosphere has a video here - https://www.youtube.com/watch?v=gqwcUgZOoyI

The basic description is:

  • Day 0 Ops - Discovering a product, getting it running locally and proving it works
  • Day 1 Ops - It’s deployed - but not yet stable or easily managed
  • Day 2 Ops - The long term maintenance, management and operation of products

The amount of time spent operating in “Day 2” dwarfs all other phases - so it makes sense to focus on optimisation and operationalisation of Day 2.

But DevOps is cool, and CloudBees is kind of “all in” on Continuous Delivery

So if you are doing all this Day 2 Ops with a team of dedicated Operations engineers, then what are the developers doing? Are they doing no operations?

That’s the interesting thing - as we progressed through cloud “friendliness” we went from DevOps (A) to dedicated Ops to embedded Ops and back to DevOps (B) again.

In all these changes we discovered that what works for 4 engineers simply doesn’t work for hundreds of engineers, support staff, professional services and so on.

The cloud is getting bigger and more complex

So many services!