Risk-First Analysis Framework
In the previous sections on Dependency Risk we’ve touched on Boundary Risk several times, but now it’s time to tackle it head-on and discuss this important type of risk.
In terms of the Risk Landscape, Boundary Risk is exactly as it says: a boundary, wall or other kind of obstacle in your way to making a move you want to make. This changes the nature of the Risk Landscape, and introduces a maze-like component to it. It also means that we have to make decisions about which way to go, knowing that our future paths are constrained by the decisions we make.
As we discussed in Complexity Risk, there is always the chance we end up at a Dead End, and we’ve done work that we need to throw away. In this case, we’ll have to head back and make a different decision.
Boundary Risk is an emergent risk, which exists at the intersection of Complexity Risk, Dependency Risk and Communication Risk. Because of that, it’s going to take a bit of time to pick it apart and understand it, so we’re going to build up to this in stages.
Let’s start with an obvious example: say you want to learn to play some music. There are a multitude of options available to you, and you might choose an uncommon instrument like a balalaika, or you might choose a common one like a piano or guitar. In any case, once you start learning this instrument, you have picked up the three risks from the diagram above:
Those risks are true for any instrument you choose. However, if you choose the uncommon instrument you have worse Boundary Risk, because the ecosystem is smaller. It might be hard to find a tutor, or a band needing a balalaika, and you’re unlikely to find one in a friend’s house (compared to the guitar, say).
If you spend time learning to play the piano, you’re mitigating Communication Risk issues, but mostly, your skills won’t be transferable to playing the guitar. Your decision to choose one instrument over another cements the Boundary Risk: you’re following a path on the Risk Landscape and changing to a different path is expensive.
Also, it stands to reason that making any choice is better than making no choice, because you can’t try and learn all the instruments. Doing that, you’d make no meaningful progress on any of them.
Let’s look at a software example now.
As discussed in Software Dependency Risk, if we are going to use a software tool as a dependency, we have to accept the complexity of its protocols. You have to use its protocol: it won’t come to you.
Let’s take a look at a hypothetical system structure, in the diagram above. In this design, we are transforming data from the input to the output. But how should we do it?
The choice of approach presents us with Boundary Risk, because we don’t know that we’ll necessarily be successful with any of these options until we go down the path of choosing one to see:
… and so on.
Wherever we integrate dependencies with complex protocols, we potentially have Boundary Risk. The more complex the dependencies being integrated, the higher the risk. As shown in the above diagram, when we choose software tools, languages or libraries to help us build our systems, we are trading Complexity Risk for Boundary Risk. It could include:
As we saw in Software Dependency Risk, Boundary Risk is a big factor in choosing libraries and services. However, it can apply to any kind of dependency:
Because of Boundary Risk’s relationship to Learning Curve Risk, we can avoid accreting it by choosing the simplest and fewest dependencies for any job. Let’s look at some examples:
mkdirp is an npm module defining a single function. This function takes a single string parameter and recursively creates directories. Because the protocol is so simple, there is almost no Boundary Risk.
Sometimes, one choice leads to another, and you’re forced to “double down” on your original choice, and head further down the path of commitment.
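To make the mkdirp point concrete, here is a minimal sketch of what a single-function protocol looks like. It is written in Python rather than JavaScript, and `make_dirs` is a hypothetical stand-in built on the standard library's `os.makedirs`, not the npm module itself. The whole protocol is one string in, directories created as a side effect.

```python
import os
import tempfile


def make_dirs(path: str) -> str:
    """Create `path` and any missing parent directories, then return `path`.

    Hypothetical stand-in for a mkdirp-style function (an assumption for
    illustration, not the npm module's actual API).
    """
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = make_dirs(os.path.join(root, "a", "b", "c"))
        print(os.path.isdir(target))
```

With so little to learn, there is very little protocol to collide with, which is the sense in which the Boundary Risk is low.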
On the face of it, WordPress and Drupal should be very similar:
In practice, they are very different, as we will see. The quality, and choice of plugins for a given platform, along with factors such as community and online documentation is often called its ecosystem:
“Software Ecosystem is a book written by David G. Messerschmitt and Clemens Szyperski that explains the essence and effects of a “software ecosystem”, defined as a set of businesses functioning as a unit and interacting with a shared market for software and services, together with relationships among them. These relationships are frequently underpinned by a common technological platform and operate through the exchange of information, resources, and artifacts.” - Software Ecosystem, Wikipedia
You can think of the ecosystem as being like the footprint of a town or a city, consisting of the buildings, transport network and the people that live there. Within the city, and because of the transport network and the amenities available, it’s easy to make rapid, useful moves on the Risk Landscape. In a software ecosystem it’s the same: the ecosystem has gathered together to provide a way to mitigate various different Feature Risks in a common way.
Ecosystem size is one key determinant of Boundary Risk: a large ecosystem has a large boundary circumference. Boundary Risk is lower in a large ecosystem because your moves on the Risk Landscape are unlikely to collide with it. The boundary got large because other developers before you hit the boundary and did the work building the software equivalents of bridges and roads and pushing it back so that the boundary didn’t get in their way.
In a small ecosystem, you are much more likely to come into contact with the edges of the boundary. You will have to be the developer that pushes back the frontier and builds the roads for the others. This is hard work.
In the real world, there is a tendency for big cities to get bigger. The more people that live there, the more services they provide, and therefore, the more immigrants they attract. And, it’s the same in the software world. In both cases, this is due to the Network Effect:
“A network effect (also called network externality or demand-side economies of scale) is the positive effect described in economics and business that an additional user of a good or service has on the value of that product to others. When a network effect is present, the value of a product or service increases according to the number of others using it.” - Network Effect, Wikipedia
You can see the same effect in the software ecosystems with the adoption rates of WordPress and Drupal, shown in the chart above. Note: this is over all sites on the internet, so Drupal accounts for hundreds of thousands of sites. In 2018, WordPress is approximately 32% of all web-sites. For Drupal it’s 2%.
Did WordPress gain this march because it was always better than Drupal? That’s arguable. Certainly, they’re not different enough that WordPress is 16x better. That it’s this way round could be entirely accidental, and a result of Network Effect.
But, by now, if they are to be compared side-by-side, WordPress should be better due to the sheer number of people in this ecosystem who are…
Is bigger always better? There are five further factors to consider…
When a tool or platform is popular, it is under pressure to increase in complexity. This is because people are attracted to something useful, and want to extend it to new purposes. This is known as The Peter Principle:
“The Peter principle is a concept in management developed by Laurence J. Peter, which observes that people in a hierarchy tend to rise to their ‘level of incompetence’.” - The Peter Principle, Wikipedia
Although designed for people, it can just as easily be applied to any other dependency you can think of. This means when things get popular, there is a tendency towards Conceptual Integrity Risk and Complexity Risk.
The above chart is an example of this: look at how the number of public classes (a good proxy for the boundary) has increased with each release.
As we saw in Software Dependency Risk, the art of good design is to afford the greatest increase in functionality with the smallest increase in complexity possible, and this usually means Refactoring. But this is at odds with Backward Compatibility.
Each new version has a greater functional scope than the one before (pushing back Boundary Risk), making the platform more attractive to build solutions in. But this increases the Complexity Risk as there is more functionality to deal with.
You can see in the diagram above the Peter Principle at play: as more responsibility is given to a dependency, the more complex it gets, and the greater the learning curve to work with it. Large ecosystems like Java react to Learning Curve Risk by having copious amounts of literature to read or buy to help, but it is still off-putting.
Because Complexity is Mass, large ecosystems can’t respond quickly to Feature Drift. This means that when the world changes, new systems will come along to plug the gaps.
This implies a trade-off:
Sometimes, technology comes along that allows us to cross boundaries, like a bridge or a road. This has the effect of making it easy to go from one self-contained ecosystem to another. Going back to WordPress, a simple example might be the Analytics Dashboard which provides Google Analytics functionality inside WordPress.
I find that a lot of the code I write is of this nature: glue code that joins together two different ecosystems.
Protocol Risk From A | Protocol Risk From B | Resulting Bridge Complexity | Example |
---|---|---|---|
Low | Low | Simple | Changing from one date format to another. |
High | Low | Moderate | Status Dashboard. |
High | High | Complex | Object-Relational Mapping (ORM) Tools. |
High + Evolving | Low | Moderate, Versioned | Simple Phone App, e.g. note-taker or calculator |
Evolving | High | Complex | Modern browser (see below) |
Evolving | Evolving | Very Complex | Google Search, Scala |
As shown in the above diagram, mitigating Boundary Risk involves taking on complexity. The more Protocol Complexity there is to bridge the two ecosystems, the more Complex the bridge will necessarily be. The above table shows some examples of this.
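The simplest row of the table (low Protocol Risk at both ends) can be sketched in a few lines. The example below assumes the two "ecosystems" are a day-first `DD/MM/YYYY` string format and ISO 8601 `YYYY-MM-DD`. The function name `day_first_to_iso` and both formats are illustrative assumptions, not something taken from the table.

```python
from datetime import datetime


def day_first_to_iso(text: str) -> str:
    """Convert a 'DD/MM/YYYY' date string to ISO 8601 'YYYY-MM-DD'."""
    parsed = datetime.strptime(text, "%d/%m/%Y")  # raises ValueError on bad input
    return parsed.strftime("%Y-%m-%d")


if __name__ == "__main__":
    print(day_first_to_iso("31/12/2018"))  # prints 2018-12-31
```

Because both protocols are small and stable, the bridge is two lines long. The bridges in the later rows grow as the protocols on either side grow or start to evolve.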
From examining the Protocol Risk at each end of the bridge you are creating, you can get a rough idea of how complex the endeavour will be:
The Calculator app on my phone remains the same, but new versions have to be released as the phone APIs change, screens change resolution and so on.
Standards allow us to achieve the same thing, in one of two ways:
The C programming language provided a way to get the same programs compiled against different CPU instruction sets, therefore providing some portability to code. The problem was, each different operating system would still have its own libraries, and so to support multiple operating systems, you’d have to write code against multiple different libraries.
Java took what C did and went one step further, providing interoperability at the library level. Java code could run anywhere where Java was installed.
ASCII: fixed the different-character-sets boundary risk by being a standard that others could adopt. Before everyone agreed on ASCII, copying data from one computer system to another was a massive pain, and would involve some kind of translation. Unicode continues this work.
Internet Protocol. As we saw in Communication Risk, the Internet Protocol (IP) is the lingua franca of the modern Internet. However, at one time there were many competing standards, and IP was the ecosystem that “won”, and was subsequently standardised by the IETF. This is actually an example of both approaches: as we saw in Communication Risk, Internet Protocol is also an abstraction over lower-level protocols.
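Going back to the ASCII example, the translation step that was needed between systems with different character sets can be sketched as follows. This is a minimal illustration using Python's built-in codecs. `translate_charset` is a hypothetical helper, and `cp037` (one EBCDIC code page) is chosen here only as an example of a non-ASCII character set.

```python
def translate_charset(data: bytes, source: str, target: str) -> bytes:
    """Re-encode `data` from the `source` character set into `target`."""
    return data.decode(source).encode(target)


if __name__ == "__main__":
    ebcdic = "HELLO".encode("cp037")  # the text as one EBCDIC system would store it
    ascii_bytes = translate_charset(ebcdic, "cp037", "ascii")
    print(ebcdic != ascii_bytes)        # same text, different bytes on the wire
    print(ascii_bytes.decode("ascii"))
```

Once both ends agree on a single standard, this translation layer (and the boundary it bridges) is no longer needed.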
Boundary Risk seems to progress in cycles. As a piece of technology becomes more mature, there are more standards and bridges, and boundary risk is lower. Once Boundary Risk is low and a particular approach is proven, there will be innovation upon this, giving rise to new opportunities for Boundary Risk. Here are some examples:
Although ecosystems are one very pernicious type of boundary in software development, it’s worth pointing out that Boundary Risk occurs all the time. Let’s look at some ways:
Unless your project ends, you can never be completely sure that Boundary Risk isn’t going to stop you making a move you want. For example:
mkdirp might not work on a new device’s operating system, forcing you to swap it out.
This third point is perhaps the most interesting aspect of Boundary Risk: how can we ensure that the decisions we make now are future-proof? You can’t always be sure that a dependency you use now will keep the same guarantees in the future:
In Feature Risk, we saw that the features people need change over time. Let’s get more specific about this:
The only thing we can expect in the future is that the lifespan of any ecosystem will follow the arc shown in the above diagram, through creation, adoption, growth, use and finally either be abstracted over or abandoned.
Although our discipline is a young one, we should probably expect to see “Software Archaeology” in the same way as we see it for biological organisms. Already we can see the dead-ends in the software evolutionary tree: the COBOL and BASIC languages, and CASE systems. Languages like FORTH live on in PostScript, and SQL is still embedded in everything.
Let’s move on now to the last Dependency Risk section, and look at Agency Risk.