Joel Tosi

Lessons Learned From Five Years in the Dojo - Part 1


Having helped organizations with Dojos for five years, we felt it was the right time to share what we’ve learned so far. In this series of blog posts, we want to offer you our “best tips” for starting your own Dojo or for improving your existing Dojo. We’ll wrap up with our thoughts on where Dojos are going next.

Without further ado…

Dojos Need to Support a Strategy

We talk with many organizations excited about starting a Dojo. The concept of teams learning new skills while building products is enticing and practical. Excitement is great, but Dojos work best when there is an overarching strategy the Dojos serve. Without a strategy, Dojos can flounder.

For example, an organization invested in moving from a project model to a product model would leverage that desired outcome as their strategy for the Dojo to support. The strategy frames the purpose of the Dojo. The Dojo is there to help teams understand how to work in a product model and learn the skills required for that model that they don’t already have.

Another strategy we often see is leveraging Dojos to adopt DevOps. While this is narrower than we recommend, it is still a nice frame for the purpose the Dojo is serving. (We prefer to see Dojos address as much of the product development value stream as possible. We’ll cover this in our next post in this series.) A “DevOps Dojo” would focus on helping teams learn how to build continuous delivery pipelines and automate infrastructure setup while forgoing other skills like product discovery practices.

A third example is using a Dojo to help the organization migrate applications to the cloud. This is an interesting start, but for the Dojo to be truly effective the strategy should be clear on how teams will migrate their applications. Will teams refactor applications for the cloud, move them in a “lift and shift” model, or follow a “re-platforming” model? If it’s a combination of those approaches, what are the criteria for determining which migration model to use for an individual application? And what is more important - having teams leave the Dojo with deep knowledge of the cloud or getting their applications into the cloud with sufficient knowledge of how to support them? Knowing the answers to these questions is critical if you want to use your Dojo to drive toward specific outcomes.

Starting from a sound strategy is key. It provides the following benefits:

  • Teams understand why participating in the Dojo is valuable

  • The skills and topics taught in the Dojo are well-defined

  • Growing coaches is easier because coaches can focus on specific skills

  • Measuring the success and impact of the Dojo is easier since you can measure outcomes against the strategy

The strategy your Dojo supports should be clear and easily stated. If the strategy is nebulous or complicated, your Dojo will struggle to provide value to the rest of the organization.

What strategy is your Dojo supporting?

Be on the lookout for our next topic in this series: why Dojos need to address the entire value stream.



Joel Tosi

Measuring Impact In The Dojo


Last month at Agile Day Chicago, I (Joel) had the pleasure of listening to Mark Graban speak about separating signal from noise in our measurements. Mark referenced Process Behavior Charts, a technique described in the book Understanding Variation: The Key to Managing Chaos by Donald J. Wheeler. This simple tool helps us look at metrics over time and understand the difference between naturally occurring variations and signals, or variation in the metrics representing real changes. Wheeler calls both of these (signal and noise) “The Voice of the Process,” with the key being able to distinguish between the two. Signals can be indicators that a desired change is manifesting, or they can be indicators that something is wrong and requires further investigation.

We immediately saw the value in being able to separate signal from noise when evaluating the types of metrics we’re capturing in the Dojo that we talked about in our last post. We both grabbed copies of the book, devoured it quickly, and started brainstorming on applications for Process Behavior charts.

Let's look at an example of how to use Process Behavior Charts in the Dojo.

BEFORE YOU START

This may sound obvious, but before you start any measurement, think about the questions you want to answer or the decisions you want to make with the data you’ll collect.

In the Dojo, we help teams shift from a project mindset to a product mindset. We focus on delivering specific outcomes, not simply more features. When delivering a new feature, the obvious question is: did the feature have the desired outcome?

THE SCENARIO

Imagine yourself in this type of situation…

We’re working with a team and we’re helping them move from a project model to a product model. In the past, the team cranked out features based on stakeholders’ wishes and success was simply judged on whether the features were delivered or not. We’re helping the team shift to judging success on whether outcomes are achieved or not.

We’re also working with the stakeholders and there’s resistance to moving to a product model because there’s fear around empowering the teams to make product decisions. New features are already queued up for delivery. Before we give the team more ownership of the product, the stakeholders want delivery of some of the features in the queue.

We can use this as a coaching opportunity.

The stakeholders believe the next feature in the queue will lead to more sales - more conversions of customers. The team delivers the feature. Now we need to see if we achieved the desired outcome.

Our first step is to establish a baseline using historical data. Luckily, we’re already capturing conversion rates and for the 10 days prior to the introduction of the new feature the numbers look like this:

Sample Data Set

Then we look at the data for the next 10 days. On Day 11, we have 14 conversions. Success, right? But on Day 12, we have 4 conversions. Certain failure?

Here’s the full set of data for the next 10 days:

Sample Data Set, Next 10 Days After Feature Introduced. Did It Matter?

Overall, it looks better, right? The average number of conversions has increased from 6.1 to 7.9. The stakeholders who pushed for the new feature shout “success!”

PROCESS BEHAVIOR CHARTS

Given a system that is reasonably stable, a Process Behavior Chart shows you what values the system will produce without interference. In our case, that means what values we can expect without introducing the new feature. Let's create a process behavior chart for our example and see if our new feature made a difference.

First Step - Chart Your Data In A Time Series and Mark the Average

Plotting Daily Conversions, Average Marked In Dotted Red Line

What does this show us? Well, not much. Roughly half of our points are below average and half are above average (some might call that the definition of average).
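
If you want to build this chart yourself, a minimal matplotlib sketch of the first step might look like the following. The daily numbers are invented (the post only shows the chart); they are chosen so they reproduce the 6.1 average and the day-to-day changes used in the next two steps.

```python
# Step one: plot the daily conversions as a time series and mark the average.
import matplotlib.pyplot as plt

daily_conversions = [4, 6, 2, 6, 8, 2, 5, 7, 12, 9]       # hypothetical baseline data
average = sum(daily_conversions) / len(daily_conversions)  # 6.1

plt.plot(range(1, len(daily_conversions) + 1), daily_conversions, marker="o")
plt.axhline(average, color="red", linestyle="--", label=f"average = {average:.1f}")
plt.xlabel("Day")
plt.ylabel("Conversions")
plt.legend()
plt.show()
```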

Second Step - Calculate the Moving Range Average

Our next step is to calculate the average change from day to day. Our day-to-day changes would be 2, 4, 4, 2, 6, 3, 2, 5, 3, for an average change of 3.4. All this means is that, on average, the number of conversions changes by about 3.4 from one day to the next. If we were to plot those day-to-day changes, roughly half would fall above that average and half below - again, the definition of average.
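
A small helper makes the moving range calculation concrete; the baseline series below is the same illustrative one used in the chart sketch above, not the actual data behind the example.

```python
# Moving range: the absolute change between consecutive observations.
def moving_range_average(daily_values):
    ranges = [abs(b - a) for a, b in zip(daily_values, daily_values[1:])]
    return sum(ranges) / len(ranges)

baseline = [4, 6, 2, 6, 8, 2, 5, 7, 12, 9]   # hypothetical series from the sketch above
print(moving_range_average(baseline))         # 3.44..., the 3.4 used in this example
```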

Third Step - Calculate The Upper And Lower Bounds

To calculate the upper and lower bounds, you take the moving range average and multiply it by 2.66. Why 2.66? Great question - and it is well covered in Don Wheeler's book. In brief, you could calculate the standard deviation and look at 3 sigma, but 2.66 is faster, easier to remember, and ultimately tells the same story.

We take our moving range average of 3.4 and multiply it by 2.66, giving us 9.044. What does this number mean? It means that with normal variance (the Voice of the Process), we can expect conversions to fluctuate up to 9.044 above or below our average number of conversions, which was 6.1.

To put it more clearly, without any intervention or new features added, we should expect between 0 and 15 conversions per day - and that would be completely normal.
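
Pulling the arithmetic together, a sketch of the limit calculation looks like this; the 6.1 and 3.4 come from the example above, and 2.66 is Wheeler's scaling constant. Only the variable names are ours.

```python
average_conversions = 6.1    # mean of the first 10 days
average_moving_range = 3.4   # average day-to-day change

natural_spread = 2.66 * average_moving_range                  # 9.044
upper_limit = average_conversions + natural_spread            # ~15.1
lower_limit = max(0.0, average_conversions - natural_spread)  # ~-2.9, floored at 0

print(f"expect roughly {lower_limit:.0f} to {upper_limit:.0f} conversions per day")
```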

Let's visualize this data. We add our upper and lower bounds to our chart for our first 10 days. It now looks like this:

Data With Process Control Limits Applied. UPC - Upper Process Control, LPC - Lower Process Control. NOTE - Since the LPC is actually -3, we use 0 since a negative is not possible

Fourth Step - Introduce Change & Continue To Measure

We have established the upper and lower bounds of what we can expect to happen. We know that after the feature was introduced, our conversion numbers looked better. Remember, the average went up almost 30% (from 6.1 to 7.9) - so that is success, right?

We extend our chart and look to see if the change actually made a difference.

 

Conversion Chart With Upper And Lower Process Controls. Note - Average, UPC, and LPC Were Not Updated With New Data Points To Prove The Next 10 Days Fell Within Previous Dataset Limits

Our average for the next 10 days was higher, but looking at what we could normally expect the system to produce, all of the conversions fell within the expected range. In essence, the feature we delivered did not have a meaningful impact on our conversions.
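
A hedged sketch of that check: compare each new day against the limits computed from the baseline window. Only Days 11 and 12 (14 and then 4 conversions) are quoted above; the remaining values are placeholders chosen to match the 7.9 average.

```python
def find_signals(observations, lower, upper):
    """Return the (day, conversions) pairs that fall outside the process limits."""
    return [(day, value) for day, value in observations if not (lower <= value <= upper)]

LOWER_LIMIT, UPPER_LIMIT = 0.0, 15.1  # baseline limits from the previous step
next_ten_days = [(11, 14), (12, 4), (13, 9), (14, 7), (15, 6),
                 (16, 10), (17, 8), (18, 5), (19, 7), (20, 9)]  # days 13-20 are made up

print(find_signals(next_ten_days, LOWER_LIMIT, UPPER_LIMIT))  # [] -> no signal, just routine variation
```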

Note, we’re not saying that nothing could be learned from delivering the new feature. The point we’re making is that prior to delivering the feature we assumed it would lead to an increase in conversions. Using a Process Behavior Chart we were able to show our assumption was invalid.

Now we can continue the conversation with the stakeholders around empowering the team to improve the product. Maybe now they'll be more open to listening to what the team thinks will lead to an increase in conversions.

MORE USES FOR PROCESS BEHAVIOR CHARTS

We like using this visual display of data to help us concretely answer questions focused on whether or not our actions are leading to the intended outcomes. For example, we are experimenting with Process Behavior Charts to measure the impact of teaching new engineering and DevOps practices in the Dojo.

REMEMBER - MEASURE IMPACTS TO THE WHOLE VALUE STREAM

Process Behavior Charts can be powerful, but they require that you ask the right questions, collect the right data, AND take the right perspective. Using a Process Behavior Chart to prove a change is beneficial to one part of the value stream (e.g., the “Dev” group) while not taking into consideration the impact on another group (e.g., the “Ops” group) would be missing the point. Consider the complete value stream when you are looking at these charts.

FURTHER READING

For more information on these charts, the math behind them, and which other patterns in the data count as significant signals, we recommend Donald J. Wheeler’s Understanding Variation: The Key to Managing Chaos.

If you found this helpful and you adopt Process Behavior Charts, please let us know how you are using them and what you are discovering.

Joel Tosi

Dojo Metrics - Moving From What Is Easy to Capture to What Matters


A fair question to ask when starting a Dojo (or any initiative for that matter) is “how do we know this is working?” Invariably, right on the heels of that question, somebody brings up the idea of capturing metrics. Then they turn to us and ask, “What are the right metrics for the Dojo?”

The best metrics provide insights that help us take action to improve the current situation. In the case of a new initiative like a Dojo, that action might be making a decision to continue the initiative, modify it, or end it.

Sadly, metrics are often arbitrary or they tell an incomplete story. Single metrics fail to capture the interplay and tradeoffs between different metrics. We’ve heard many stories of how organizations optimizing for one metric created detrimental results overall. (We’re looking at you, capacity utilization.)

How Do We Measure the Effectiveness of the Dojo?

The primary goal of the Dojo is to foster learning. We need to measure the effectiveness of that learning and ultimately, we need to measure the economic impact that learning has on the organization. But it’s not learning at any cost. We’re aligned with Don Reinertsen on this point.

In product development, neither failure, nor success, nor knowledge creation, nor learning is intrinsically good. In product development our measure of “goodness” is economic: does the activity help us make money? In product development we create value by generating valuable information efficiently. Of course, it is true that success and failure affect the efficiency with which we generate information, but in a more complex way than you may realize. It is also true that learning and knowledge sometimes have economic value; but this value does not arise simply because learning and knowledge are intrinsically “good.” Creating information, resolving uncertainty, and generating new learning only improve economic outcomes when cost of creating this learning is less than its benefit.

Don Reinertsen - "The Four Impostors: Success, Failure, Knowledge Creation, and Learning"

 

Reinertsen stresses the need to generate information efficiently. This is easy to understand when thinking in terms of generating information that helps you make decisions about your product. For example, it’s a fairly straightforward exercise to determine the costs for generating information by running low-fi, paper prototype tests that answer the question “should we include this feature or not?”

It’s also easy to understand how you might measure the effectiveness of knowledge creation when helping teams make improvements on their continuous delivery pipelines. We can calculate the cost of learning DevOps practices and compare that to expenses saved by automating manual processes.
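
As a rough sketch of what that comparison can look like (every number below is an assumption made for illustration, not data from an actual Dojo engagement):

```python
# Cost of the learning investment (all figures hypothetical)
team_size = 6
dojo_duration_weeks = 6
loaded_cost_per_person_week = 4_000
cost_of_learning = team_size * dojo_duration_weeks * loaded_cost_per_person_week  # 144,000

# Savings from automating formerly manual release work (also hypothetical)
hours_saved_per_release = 20
releases_per_year = 50
loaded_hourly_rate = 100
annual_savings = hours_saved_per_release * releases_per_year * loaded_hourly_rate  # 100,000

print(f"payback in roughly {cost_of_learning / annual_savings:.1f} years")  # ~1.4 years with these assumptions
```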

What’s not as easy to understand is how to measure the impact of learning cloud native architecture or microservices - or something even more nebulous, like product thinking and the impact of learning a design practice like personas.

We would expect the impact of these learnings to result in lower development costs, decreased cycle times, and increased revenues resulting from better market fit for our products. But – there is a high degree of uncertainty as to the level of impact these learnings are going to have on the organization. (Again, hat tip to Don Reinertsen. His post about looking at the economics of technical debt influences our thinking here.)

In addition, during a team’s tenure in the Dojo it’s quite probable that their productivity will decrease as the team is creating new knowledge and incorporating new practices. The team's investment in learning carries a cost.

Ultimately, we need to understand the impact the Dojo has on lifecycle profits. That impact will often occur after a team has left the Dojo.

We have started organizing metrics in the Dojo into three groups. Our goal is to help orient stakeholders, leaders, and teams around what actions these metrics will help them take. We also want to help them understand the level of effort required to collect the metrics and the timeframes in which they will be available.

Three Categories of Metrics for the Dojo

Simple To Capture - Organizational Reach

These metrics simply show the amount of “touch” the Dojo has.

Examples include:

  • Number of teams going through the Dojo

  • Total number of attendees

  • Number of Programs / Portfolios touched

Astute readers may critically call these “vanity metrics” and they would not be wrong. These metrics do not equate to impact. They don’t help us answer the questions “Were the right teams involved?”, “Did the amount of learning that happened justify the investment?”, or “How much learning stuck?”

However, these metrics are simple to collect and can be used as leading indicators once we have metrics on the economic impact the Dojo has on teams. For many organizations, these metrics are important because they imply value as the Dojo is being bootstrapped, even though they don't prove it. They are metrics everyone is comfortable with.

Harder To Capture – Directional/Team Based Improvements

Metrics in this category are more important than those in the previous category: they look at the directional impact of learning in the Dojo and how that learning affects teams.

Examples include:

  • Number of automated tests

  • SQALE code quality index

  • Percentage reduction in defects

  • Cycle time reduction to deliver a product increment

  • Velocity / Story count (with the obvious caveat that these can be easily gamed)

Again, these metrics are far from perfect. The testing related metrics do not prove the right tests were written (or the right code for that matter). Metrics showing products were built faster don’t shed any light on whether those products should have been built in the first place (what if nobody buys them?).

What these metrics do show is the incorporation of product delivery practices that are being taught in the Dojo - practices that our experience and the experiences of other organizations have shown to have a positive impact on lifecycle profits. These metrics can be collected with agile project management software, SonarQube, Hygieia, or other comparable tools.
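
For example, a baseline pull from SonarQube might look roughly like the snippet below. The server URL, project key, token, and even the metric keys here are our assumptions about a typical setup; check the web API documentation for your SonarQube version and swap in whichever metrics your Dojo actually tracks.

```python
import requests

SONAR_URL = "https://sonarqube.example.com"  # hypothetical server
PROJECT_KEY = "my-product"                   # hypothetical project key
API_TOKEN = "replace-me"                     # user token, passed as the basic-auth username

response = requests.get(
    f"{SONAR_URL}/api/measures/component",
    params={"component": PROJECT_KEY, "metricKeys": "tests,coverage,sqale_index"},
    auth=(API_TOKEN, ""),
)
response.raise_for_status()

measures = {m["metric"]: m["value"] for m in response.json()["component"]["measures"]}
print(measures)  # e.g. {'tests': '412', 'coverage': '61.3', 'sqale_index': '12450'}
```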

When we use these types of metrics we need to have a baseline. It’s helpful to have data for teams for two to three months prior to when they enter the Dojo. We don’t always have this baseline, however, and in some cases the best we can do during a team’s tenure in the Dojo is help them establish the baseline. Obviously, we want to track these metrics for teams after they’ve left the Dojo to see how well new practices are sticking.

Difficult To Capture – Impact/Economic Improvements

Metrics in this group are challenging - not only to collect but also because using them to drive action challenges the way many organizations work. These are the metrics that force us to look at the question “Is this initiative having a positive economic impact on the organization?”

Examples include:

  • Increase in sales conversion

  • Cycle time reduction for a delivery with impact (not just delivery, but a delivery that mattered)

  • Systematic cost reductions (not silo optimizations that may have detrimental effects in other areas)

  • Savings resulting from killing bad product ideas early in the discovery/delivery cycle

Metrics like these can prove initiatives like the Dojo are having a positive impact on lifecycle profits. These metrics will be substantially harder to collect. We need to collect data for a much longer period of time. We need to align with the finance department in our organizations. And, we need whole product communities aligned around a shared understanding of what successful outcomes look like. In addition, we need to understand how to separate real signals of change from noise. (This post has more on that topic.)

Ultimately, this last category of metrics is what matters. This is where the Dojo shines. We work with teams to teach the practices, thinking, and communication strategies that will have an impact on lifecycle profits.

This is an ongoing area of improvement for us. This is what we are currently practicing. These categories of metrics are helping foster conversations, clarify what knowledge individual metrics can provide, and demonstrate the value of investing in the Dojo.

Joel Tosi

Technical Debt - Learning in (and With!) the Face of Debt


Technical Debt has a few definitions ranging from 'the previous person's bad code' to 'the shortcuts taken to hit a deadline' to my favorite - Technical Debt is 'the gap in the code between what we knew when we started building our product and what we know now'.

It's easy to look at a codebase with no automated tests, high cyclomatic complexity, and a manual build process and say 'look - Technical Debt'. It is more challenging to work with a team implementing new features where previous technical choices are making it costly to improve the product. In other words, the code is not maneuverable (Nygard).

These situations happen frequently in the dojo. Let's look at a couple examples.

THE UNTESTED 20-YEAR-OLD MONOLITH

Imagine a codebase more than 20 years old, modified every year by the lowest-cost outsourcing firm the organization could find. Now imagine you inherited that codebase, complete with such delights as 900-line constructors that connect directly to databases, establishing pools and locks. An agile coach could come into that team (and one had) and say, “You need to get to 70% code coverage for your unit tests.” The team laughed because crying was too obvious.

When teams are in this kind of bind, entering the dojo isn't just about learning how to write unit tests. The team already knew how to write unit tests and was eager to do so. The challenge was finding a strategy to attack the beast. And when deadlines were hitting, it was easier to keep adding code and crossing your fingers.

We did a few things to address the problem. First was getting the team together to identify and discuss quick wins they all wanted to knock out. We did some light brainstorming and affinity mapping which led to a new set of lightweight design changes - all in about an hour or two - and the team had a nice shortlist of changes to start working on. The list would take a week to complete and would be worked on concurrently with new work. It was a small set of changes, but it gave the team a few wins to build upon.

Next, we came up with strategies for attacking the technical debt while working on new stories. One strategy we used quite a bit was a test-driven approach focused on the design of the tests themselves. We’d have a quick discussion around the tests: “Where should the responsibility of the tests lie?” With that in mind we’d ask: “What is the delta between where those responsibilities should be and where they’d be given our current design? Can we get there now?”

For some of the stories, we could implement our desired changes immediately. For others, we couldn't. In those cases, we kept a visual list of the spots we wanted to work on. We also created higher-level tests that we could implement immediately with the knowledge that we would remove those tests once we had implemented lower-level tests.
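
To make that idea concrete, here is a deliberately tiny, hypothetical illustration of such a higher-level test, written pytest-style. The real code hid database connections inside a 900-line constructor and was nowhere near this small.

```python
# Stand-in for a tangled legacy routine we don't want to change blindly.
def legacy_price_order(quantities, unit_price=50):
    return sum(quantities) * unit_price

# A coarse "pin the current behaviour in place" test we can write today.
# TODO: remove once the pricing logic has focused, lower-level unit tests.
def test_legacy_pricing_end_to_end():
    assert legacy_price_order([2, 1]) == 150
```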

THE MULTI-TEAM MATRIXED DESIGN

Another team was similar in that they also lacked tests, but their challenges were slightly different. This team was formed as an organization’s first attempt at creating 'product teams.' The codebase was gnarly and had contributions from multiple teams over multiple years without much knowledge sharing.

Enter the dojo. The team was doing a little product exploration around a new feature using impact mapping. They came up with a few ideas that would have more impact for the customer than the new feature as it was originally defined. Excellent, right? Not quite. While these ideas were great and the team agreed they were better approaches to the problem, the technology would not allow the better ideas to be built. The way the product was designed made it difficult for required data to be accessed when needed, resulting in unnecessary extra steps. Streamlining a separate process would break downstream services. And so on...

Teaching ideas like parallel change can help in this space. But the real value here came from the whole team learning the cost of technology decisions together, and working together to learn approaches to attack technical debt.
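
For readers who haven't seen parallel change (sometimes described as expand, migrate, contract), a minimal, hypothetical sketch of its shape might look like this; the class and method names are ours, not from the team's codebase.

```python
class CustomerLookup:
    """Hypothetical class that many callers depend on."""

    # Expand: introduce the API we actually want, alongside the old one.
    def find_by_email(self, email: str) -> dict:
        return {"email": email.lower()}  # new, better-shaped implementation

    # The old API stays for now but delegates, so there is only one code path.
    def get_customer(self, email: str) -> dict:
        return self.find_by_email(email)

# Migrate: move callers over to find_by_email() one story at a time.
# Contract: once nothing calls get_customer(), delete it.
```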

When items like this arise, it is still, first and foremost, a good thing. Now we can start quantifying design problems and communicating the opportunity cost of technical debt. In this example, we could calculate the cost of the options we could not deliver. The learning expanded beyond the team and became organizational learning.

And in the dojo, we don’t simply teach approaches for tackling technical debt and send teams on their way; we help them work through it.

What have you done to help teams with technical debt?

 
