Boundary Risk

In the previous sections on Dependency Risk we’ve touched on Boundary Risk several times, but now it’s time to tackle it head-on and discuss this important type of risk.

In terms of the Risk Landscape, Boundary Risk is exactly as it says: a boundary, wall or other kind of obstacle in your way to making a move you want to make. This changes the nature of the Risk Landscape, and introduces a maze-like component to it. It also means that we have to make decisions about which way to go, knowing that our future paths are constrained by the decisions we make.

As we discussed in Complexity Risk, there is always the chance we end up at a Dead End, and we’ve done work that we need to throw away. In this case, we’ll have to head back and make a different decision.

Emergence Through Choice

Boundary Risk is due to Complexity Risk, Dependency Risk and Communication Risk

Boundary Risk is an emergent risk, which exists at the intersection of Complexity Risk, Dependency Risk and Communication Risk. Because of that, it’s going to take a bit of time to pick it apart and understand it, so we’re going to build up to this in stages.

Let’s start with an obvious example: say you want to learn to play some music. There are a multitude of options available to you, and you might choose an uncommon instrument like a Balalaika, or you might choose a common one like a piano or guitar. In any case, once you start learning this instrument, you have picked up the three risks from the diagram above:

Dependency Risk: You have a physical Dependency on it in order to play music, so get to the music shop and buy one.
Communication Risk: You have to communicate with the instrument in order to get it to make the sounds you want. And you have Learning Curve Risk in order to be able to do that.
Complexity Risk: As a music playing system, you now have an extra component (the instrument), with all the attendant complexity of looking after that instrument, tuning it, and so on.

Those risks are true for any instrument you choose. However, if you choose the uncommon instrument you have worse Boundary Risk, because the ecosystem is smaller. It might be hard to find a tutor, or a band needing a balalaika, and you’re unlikely to find one in a friend’s house (compared to the guitar, say).

If you spend time learning to play the piano, you’re mitigating Communication Risk issues, but mostly, your skills won’t be transferable to playing the guitar. Your decision to choose one instrument over another cements the Boundary Risk: you’re following a path on the Risk Landscape and changing to a different path is expensive.

Also, it stands to reason that making any choice is better than making no choice, because you can’t try and learn all the instruments. Doing that, you’d make no meaningful progress on any of them.

Boundary Risk For Software Dependencies

Let’s look at a software example now.

As discussed in Software Dependency Risk, if we are going to use a software tool as a dependency, we have to accept the complexity of its protocols. You have to use its protocol: it won’t come to you.

Our System receives data from the `input`, translates it and sends it to the `output`. But which dependency should we use for the translation, if any?

Let’s take a look at a hypothetical system structure, in the diagram above. In this design, we have are transforming data from the input to the output. But how should we do it?

We could use library ‘a’, using the Protocols of ‘a’, and having a dependency on ‘a’.
We could use library ‘b’, using the Protocols of ‘b’, and having a dependency on ‘b’.
We could use neither, and avoid the dependency, but potentially pick up lots more Codebase Risk and Schedule Risk because we have to code our own alternative to ‘a’ and ‘b’.

The choice of approach presents us with Boundary Risk, because we don’t know that we’ll necessarily be successful with any of these options until we go down the path of choosing one to see:

Maybe ‘a’ has some undocumented drawbacks that are going to hold us up.
Maybe ‘b’ works on some streaming API basis, that is incompatible with the input protocol.
Maybe ‘a’ runs on Windows, whereas our code runs on Linux.

… and so on.

Boundary Risk Pinned Down

The tradeoff for using a library

Wherever we integrate dependencies with complex protocols, we potentially have Boundary Risk. The more complex the dependencies being integrated, the higher the risk. As shown in the above diagram, when we choose software tools, languages or libraries to help us build our systems, we are trading Complexity Risk for Boundary Risk. It could include:

The sunk cost of the Learning Curve we’ve overcome to integrate the dependency, when it fails to live up to expectations.
The likelihood of, and costs of changing to something else in the future.
The risk of Lock In.

As we saw in Software Dependency Risk, Boundary Risk is a big factor in choosing libraries and services. However, it can apply to any kind of dependency:

If you’re depending on a Process or Organisation, they might change their products or quality, making the effort you put into the relationship worthless.
If you’re depending on Staff, they might leave, meaning your efforts on training them don’t pay back as well as you hoped.
If you’re depending on an Event occurring at a particular time, you might have a lot of work to reorganise your life if it changes time or place.
If you are tied into a contract, you might have to pay for something despite no longer using it.

Boundary Risk and Sunk Costs

Because of Boundary Risk’s relationship to Learning Curve Risk, we can avoid accreting it by choose the simplest and fewest dependencies for any job. Let’s look at some examples:

mkdirp is an npm module defining a single function. This function takes a single string parameter and recursively creating directories. Because the protocol is so simple, there is almost no Boundary Risk.
Using a particular brand of database with a JDBC driver comes with some Boundary Risk: but the boundary is specified by a standard. Although the standard doesn’t cover every aspect of the behaviour of the database, it does minimise risk, because if you are familiar with one JDBC driver, you’ll be familiar with them all, and swapping one for another is relatively easy.
Choosing a language or framework comes with higher Boundary Risk: you are expected to yield to the framework’s way of behaving throughout your application. You cannot separate the concern easily, and swapping out the framework for another is likely to leave you with a whole new set of assumptions and interfaces to deal with.

Lock-In

Sometimes, one choice leads to another, and you’re forced to “double down” on your original choice, and head further down the path of commitment.

On the face of it, WordPress and Drupal should be very similar:

They are both Content Management Systems
They both use a LAMP (Linux, Apache, MySql, PHP) Stack
They were both started around the same time (2001 for Drupal, 2003 for WordPress)
They are both Open-Source, and have a wide variety of Plugins. That is, ways for other programmers to extend the functionality in new directions.

In practice, they are very different, as we will see. The quality, and choice of plugins for a given platform, along with factors such as community and online documentation is often called its ecosystem:

“Software Ecosystem is a book written by David G. Messerschmitt and Clemens Szyperski that explains the essence and effects of a “software ecosystem”, defined as a set of businesses functioning as a unit and interacting with a shared market for software and services, together with relationships among them. These relationships are frequently underpinned by a common technological platform and operate through the exchange of information, resources, and artifacts.” - Software Ecosystem, Wikipedia

You can think of the ecosystem as being like the footprint of a town or a city, consisting of the buildings, transport network and the people that live there. Within the city, and because of the transport network and the amenities available, it’s easy to make rapid, useful moves on the Risk Landscape. In a software ecosystem it’s the same: the ecosystem has gathered together to provide a way to mitigate various different Feature Risks in a common way.

Ecosystem size is one key determinant of Boundary Risk: a large ecosystem has a large boundary circumference. Boundary Risk is lower in a large ecosystem because your moves on the Risk Landscape are unlikely to collide with it. The boundary got large because other developers before you hit the boundary and did the work building the software equivalents of bridges and roads and pushing it back so that the boundary didn’t get in their way.

In a small ecosystem, you are much more likely to come into contact with the edges of the boundary. You will have to be the developer that pushes back the frontier and builds the roads for the others. This is hard work.

Big Ecosystems Get Bigger

In the real world, there is a tendency for big cities to get bigger. The more people that live there, the more services they provide, and therefore, the more immigrants they attract. And, it’s the same in the software world. In both cases, this is due to the Network Effect:

“A network effect (also called network externality or demand-side economies of scale) is the positive effect described in economics and business that an additional user of a good or service has on the value of that product to others. When a network effect is present, the value of a product or service increases according to the number of others using it.” - Network Effect, Wikipedia

WordPress vs Drupal adoption over 8 years, according to [w3techs.com](https://w3techs.com/technologies/history_overview/content_management/all/y)

You can see the same effect in the software ecosystems with the adoption rates of WordPress and Drupal, shown in the chart above. Note: this is over all sites on the internet, so Drupal accounts for hundreds of thousands of sites. In 2018, WordPress is approximately 32% of all web-sites. For Drupal it’s 2%.

Did WordPress gain this march because it was always better than Drupal? That’s arguable. Certainly, they’re not different enough that WordPress is 16x better. That it’s this way round could be entirely accidental, and a result of Network Effect.

But, by now, if they are to be compared side-by-side, WordPress should be better due to the sheer number of people in this ecosystem who are…

Creating web sites.
Using those sites.
Submitting bug requests.
Fixing bugs.
Writing documentation.
Building plugins.
Creating features.
Improving the core platform.

Is bigger always better? There are five further factors to consider…

1. The Peter Principle

When a tool or platform is popular, it is under pressure to increase in complexity. This is because people are attracted to something useful, and want to extend it to new purposes. This is known as The Peter Principle:

“The Peter principle is a concept in management developed by Laurence J. Peter, which observes that people in a hierarchy tend to rise to their ‘level of incompetence’.” - The Peter Principle, Wikipedia

Although designed for people, it can just as easily be applied to any other dependency you can think of. This means when things get popular, there is a tendency towards Conceptual Integrity Risk and Complexity Risk.

Java Public Classes By Version (3-9)

The above chart is an example of this: look at how the number of public classes (a good proxy for the boundary) has increased with each release.

2. Backward Compatibility

As we saw in Software Dependency Risk, The art of good design is to afford the greatest increase in functionality with the smallest increase in complexity possible, and this usually means Refactoring. But, this is at odds with Backward Compatibility.

Each new version has a greater functional scope than the one before (pushing back Boundary Risk), making the platform more attractive to build solutions in. But this increases the Complexity Risk as there is more functionality to deal with.

3. Focus vs Over-Reach

The Peter Principle. Backward Compatibility + Extension leads to complexity and learning curve risk

You can see in the diagram above the Peter Principle at play: as more responsibility is given to a dependency, the more complex it gets, and the greater the learning curve to work with it. Large ecosystems like Java react to Learning Curve Risk by having copious amounts of literature to read or buy to help, but it is still off-putting.

Because Complexity is Mass, large ecosystems can’t respond quickly to Feature Drift. This means that when the world changes, new systems will come along to plug the gaps.

This implies a trade-off:

Sometimes it’s better to accept the Boundary Risk innate in a smaller system than try to work within the bigger, more complex system.

4. Ecosystem Bridges

Boundary Risk is mitigated when a bridge is built between ecosystems

Sometimes, technology comes along that allows us to cross boundaries, like a bridge or a road. This has the effect of making it easy to to go from one self-contained ecosystem to another. Going back to WordPress, a simple example might be the Analytics Dashboard which provides Google Analytics functionality inside WordPress.

I find, a lot of code I write is of this nature: trying to write the glue code to join together two different ecosystems.

Protocol Risk From A	Protocol Risk From B	Resulting Bridge Complexity	Example
Low	Low	Simple	Changing from one date format to another.
High	Low	Moderate	Status Dashboard.
High	High	Complex	Object-Relational Mapping (ORM) Tools.
High + Evolving	Low	Moderate, Versioned	Simple Phone App, e.g. note-taker or calculator
Evolving	High	Complex	Modern browser (see below)
Evolving	Evolving	Very Complex	Google Search, Scala

As shown in the above diagram, mitigating Boundary Risk involves taking on complexity. The more Protocol Complexity there is to bridge the two ecosystems, the more Complex the bridge will necessarily be. The above table shows some examples of this.

From examining the Protocol Risk at each end of the bridge you are creating, you can get a rough idea of how complex the endeavour will be:

If it’s low-risk at both ends, you’re probably going to be able to knock it out easily. Like translating a date, or converting one file format to another.
Where one of the protocols is evolving, you’re definitely going to need to keep releasing new versions. The functionality of a Calculator app on my phone remains the same, but new versions have to be released as the phone APIs change, screens change resolution and so on.

5. Standards

Standards allow us to achieve the same thing, in one of two ways:

Abstract over the ecosystems. Provide a standard protocol (a lingua franca) which can be converted down into the protocol of any of a number of competing ecosystems.

The C programming language provided a way to get the same programs compiled against different CPU instruction sets, therefore providing some portability to code. The problem was, each different operating system would still have its own libraries, and so to support multiple operating systems, you’d have to write code against multiple different libraries.
Java took what C did and went one step further, providing interoperability at the library level. Java code could run anywhere where Java was installed.

Force adoption. All of the ecosystems start using the standard for fear of being left out in the cold. Sometimes, a standards body is involved, but other times a “de facto” standard emerges that everyone adopts.

ASCII: fixed the different-character-sets boundary risk by being a standard that others could adopt. Before everyone agreed on ASCII, copying data from one computer system to another was a massive pain, and would involve some kind of translation. Unicode continues this work.
Internet Protocol. As we saw in Communication Risk, the Internet Protocol (IP) is the lingua franca of the modern Internet. However, at one period of time, there were many competing standards. and IP was the ecosystem that “won”, and was subsequently standardised by the IETF. This is actually an example of both approaches: as we saw in Communication Risk, Internet Protocol is also an abstraction over lower-level protocols.

Boundary Risk Cycle

Boundary Risk Decreases With Bridges and Standards

Boundary Risk seems to progress in cycles. As a piece of technology becomes more mature, there are more standards and bridges, and boundary risk is lower. Once Boundary Risk is low and a particular approach is proven, there will be innovation upon this, giving rise to new opportunities for Boundary Risk. Here are some examples:

Processor Chip manufacturers had done something similar in the 1970’s and 1980’s: by providing features (instructions) on their processors that other vendors didn’t have, they made their processors more attractive to system integrators. However, since the instructions were different on different chips, this created Boundary Risk for the integrators. Intel and Microsoft were able to use this fact to build a big ecosystem around Windows running on Intel chips (so called, WinTel). The Intel instruction set is nowadays a de-facto standard for PCs.
In the late 1990s, faced with the emergence of the nascent World Wide Web, and the Netscape Navigator browser, Microsoft adoped a strategy known as Embrace and Extend. The idea of this was to subvert the HTML standard to their own ends by embracing the standard and creating their own browser Internet Explorer and then extending it with as much functionality as possible, which would then not work in Netscape Navigator. They then embarked on a campaign to try and get everyone to “upgrade” to Internet Explorer. In this way, they hoped to “own” the Internet, or at least, the software of the browser, which they saw as analogous to being the “operating system” of the Internet, and therefore a threat to their own operating system, Windows.
We currently have just two main mobile ecosystems (although there used to be many more): Apple’s iOS and Google’s Android, which are both very different and complex ecosystems with large, complex boundaries. They are both innovating as fast as possible to keep users happy with their features. Bridging Tools like Xamarin exist which allow you to build applications sharing code over both platforms.
Currently, Amazon Web Services (AWS) are competing with Microsoft Azure and Google Cloud Platform over building tools for Platform as a Service (PaaS) (running software in the cloud). They are both racing to build new functionality, but at the same time it’s hard to move from one vendor to another as there is no standardisation on the tools.

Everyday Boundary Risks

Although ecosystems are one very pernicious type of boundary in software development, it’s worth pointing out that Boundary Risk occurs all the time. Let’s look at some ways:

Configuration. When software has to be deployed onto a server, there has to be configuration (usually on the command line, or via configuration property files) in order to bridge the boundary between the environment it’s running in and the software being run. Often, this is setting up file locations, security keys and passwords, and telling it where to find other files and services.
Integration Testing. Building a unit test is easy. You are generally testing some code you have written, aided with a testing framework. Your code and the framework are both written in the same language, which means low boundary risk. But, to integration test you need to step outside this boundary and so it becomes much harder. This is true whether you are integrating with other systems (providing or supplying them with data) or parts of your own system (say testing the client-side and server parts together).
User Interface Testing. If you are supplying a user-interface, then the interface with the user is already a complex, under-specified risky protocol. Although tools exist to automate UI testing (such as Selenium, these rarely satisfactorily mitigate this protocol risk: can you be sure that the screen hasn’t got strange glitches, that the mouse moves correctly, that the proportions on the screen are correct on all browsers?
Jobs. When you pick a new technology to learn and add to your CV, it’s worth keeping in mind how useful this will be to you in the future. It’s career-limiting to be stuck in a dying ecosystem and need to retrain.
Teams. if you’re asked to build a new tool for an existing team, are you creating Boundary Risk by using tools that the team aren’t familiar with?
Organisations. Getting teams or departments to work with each other often involves breaking down Boundary Risk. Often the departments use different tool-sets or processes, and have different goals making the translation harder.

Likelihood of Change

Unless your project ends, you can never be completely sure that Boundary Risk isn’t going to stop you making a move you want. For example:

mkdirp might not work on a new device’s operating system, forcing you to swap it out.
You might discover that the database you chose satisfied all the features you needed at the start of the project, but came up short when the requirements changed later on.
The front-end framework you chose might go out-of-fashion, and it might be hard to find developers interested in working on the project because of it.

This third point is perhaps the most interesting aspect of Boundary Risk: how can we ensure that the decisions we make now are future-proof? You can’t always be sure that a dependency now will always have the same guarantees in the future:

Ownership changes Microsoft buys GitHub. What will happen to the ecosystem around GitHub now?
Licensing changes. (e.g. Oracle buys Tangosol who make Coherence for example). Having done this, they increase the licensing costs of Tangosol to huge levels, milking the Cash Cow of the installed user-base, but ensuring no-one else is likely to use it.
Better alternatives become available: As a real example of this, I began a project in 2016 using Apache Solr. However, in 2018, I would probably use ElasticSearch. In the past, I’ve built web-sites using Drupal and then later converted them to use WordPress.

Patterns In Boundary Risk

In Feature Risk, we saw that the features people need change over time. Let’s get more specific about this:

Human need is Fractal: This means that over time, software products have evolved to more closely map to human needs. Software that would have delighted us ten years ago lacks the sophistication we expect today.
Software and hardware are both improving with time: due to evolution and the ability to support greater and greater levels of complexity.
Abstractions accrete too: As we saw in Process Risk, we encapsulate earlier abstractions in order to build later ones.

The only thing we can expect in the future is that the lifespan of any ecosystem will follow the arc shown in the above diagram, through creation, adoption, growth, use and finally either be abstracted over or abandoned.

Although our discipline is a young one, we should probably expect to see “Software Archaeology” in the same way as we see it for biological organisms. Already we can see the dead-ends in the software evolutionary tree: COBOL and BASIC languages, CASE systems. Languages like FORTH live on in PostScript, SQL is still embedded in everything

Let’s move on now to the last Dependency Risk section, and look at Agency Risk.

Start Here

Discuss

Publications