Best to break this into a couple of key areas: a) major investments, and b) baseline risks, the things you as an Exec team and/or Board of Directors should probably be aware of.
Major Investments tend to get a lot of focus up front, but often go sideways in similar ways and for similar reasons. Baseline Risks, on the other hand, are more hidden. Those tend to sneak up and bite you on the ass, especially if you don’t know what to look for. Both can be extremely painful – or even fatal – for organisations.
This, from a while ago, is a decent starting point on the topic of major investments. These may get pitched as “transformation” or a Programme, or perhaps a single project. The bigger they are the harder the fall.
For Boards, specifically, I’d recommend getting independent advice. Not from the usual audit firms or big consultancies — find a smaller local firm who have a good track record of delivering meaningful value through delivery. Make sure they report directly to the Chair, not via Management. Give them access to the teams right at the coal face, and any code base.
You may think this undermines trust in management, but a good exec team should welcome the independent lines of communication. It’s about keeping each other safe, and creating a no surprises culture. After all, for the Exec team, a major project failure can be career defining – and not in a good way.
Additional questions that are super important:
- Do we need to build this ourselves? (If the tech isn’t a key foundation for your strategy, and others are more likely to push the category forward, better to find a cloud-native SaaS offering.)
- Show us working software within the first few months at least, and iterate from there. Don’t believe any project or programme plan you are shown at the beginning. Trust, but verify. And regularly. Let them get on with it, but monitor it very carefully.
- Set it up in such a way that you can pull the plug (with, say, less than a month’s notice) and you will still have delivered something of value to the organisation. At the start, most of the value may be risk reduction and learning, but that has GOT to come from working software, not analysis, design and “planning”.
- What’s our long-term run rate, the one we are happy to fund in perpetuity? Starting there and working backwards is a good rule of thumb. If you can’t deliver working software with the same size team as you expect to look after it at the end, that’s a bad sign. Don’t attempt to “scale” if you can’t do that. Think big, start small.
- What evidence do we have that this is the right option for this business? Have we got hands-on and built out a prototype that specifically attacks our top 3-4 risks? Procurement is your friend. Work with them.
- Be wary of outsourcing delivery. It’s tempting, but you cannot outsource the risk. There’s little comfort in “warranty” periods, or any other attempts to share risk. The big SI firms are MUCH better at this game than your company, and they’ll always walk away unharmed, leaving you with a mess.
Warning: this is not going to be exhaustive, more highlighting a couple of areas where problems often crop up, and some pointers of things to ask…
Some of the best indicators of underlying tech risk are the DevOps Research and Assessment (DORA) metrics.
- Lead Time: what is your lead time for changes (that is, how long does it take to go from code committed to code successfully running in production)?
- Deploy Frequency: how often does your organisation deploy code to production, or release it to end users?
- Time to Restore: how long does it take to restore service when a service incident or a defect that impacts users occurs?
- Change Failure Rate: what percentage of changes to production or releases to users result in degraded service (for example, lead to service impairment or service outage) and subsequently require remediation (for example, require a hotfix, rollback, fix forward, patch)?
Long lead times, infrequent deployments, slow time to restore and high change failure rates tell you that you’re likely carrying quite a lot of risk in your tech landscape. Some of these are counterintuitive. For instance, it may seem more risky to be doing releases multiple times a week, but it’s the organisations doing releases quarterly or slower that are much more likely to have quality problems and hidden risks. Being unable to release more frequently is often down to underlying quality issues, especially a lack of test automation and/or an architecture that no longer reflects how the software is now being used.
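If you want to see how little data it takes to get a first read on those four numbers, here is a rough sketch in Python. Everything in it is an assumption for illustration: the `Deployment` record, its field names and the `dora_metrics` function are made up, and time to restore is approximated as the gap between a failed deploy going live and service being restored. Your own tooling will record this differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; real data would come from your CI/CD
    # and incident tooling.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it was running in production
    failed: bool = False                    # degraded service and needed remediation?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deploys: list, period_days: int) -> dict:
    """Summarise the four metrics for deployments made over period_days."""
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    # Assumption: the incident starts when the failed change goes live.
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "median_lead_time": median(lead_times) if lead_times else None,
        "deploys_per_week": len(deploys) / (period_days / 7),
        "median_time_to_restore": median(restore_times) if restore_times else None,
        "change_failure_rate": len(failures) / len(deploys) if deploys else None,
    }


# Made-up example data: four deployments over a two-week period.
deploys = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),
    Deployment(datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 15),
               failed=True, restored_at=datetime(2024, 1, 3, 16)),
    Deployment(datetime(2024, 1, 8, 9), datetime(2024, 1, 8, 13)),
    Deployment(datetime(2024, 1, 10, 9), datetime(2024, 1, 10, 10)),
]
metrics = dora_metrics(deploys, period_days=14)
```

The point is less the code than the question behind it: if nobody can produce these four numbers for you, that in itself tells you something.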
You may also hear the phrase “tech debt”. This isn’t a bunch of idiots making bad decisions, or poor programming. The debt metaphor attempts to convey that we need to be continually weeding and tidying up our codebase to reflect what we now understand that it needs to do. If you put that work off, over time, your risks start to creep up on you.
If your DORA metrics are toward the better end, then that’s a decent indicator that you’ve probably got a team you can trust to look after risks. Listen to them. Ask them what keeps them up at night. Ask them what they need to keep the organisation safe. Make it safe for them.
There’s no version where the risk is zero, of course, but if you’re on the Exec team or the Board and you do nothing about known risks, that’s something that might come back to bite you. Negligence is the word you want to avoid. We have a duty of care, to act in “good faith”. That absolutely extends to the technology that the organisation needs to operate, and survive for the long term.
What about Data Security?
Data in the Cloud: the key word here is encryption. You really want any customer data to be double encrypted: in transit and at rest.
- “Encrypted at rest” means that if someone happens to get access to your data, they can’t do anything with what they find. It’s a jumble of nonsense that requires keys to decrypt.
- “Encrypted in transit” means that if someone manages to get in between you (or your customers) and your cloud storage, again, all they can see is jumbled nonsense. Not very exciting.
Needless to say, the above only works if you have solid security around the handling of those keys. Good practice is always evolving, but you might want to ask whether you’re using a Key Vault, whether security practices are up to date, and whether Developers and others have the appropriate Security training. If your org doesn’t have policies and procedures in place for revoking access for staff when they leave, that’s another red flag.
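The leaver check in particular is easy to automate. A minimal sketch, assuming you can export a list of accounts per system and a list of current staff (the `stale_access` function, the system names and the people here are all hypothetical):

```python
def stale_access(access_by_system: dict, current_staff: set) -> dict:
    """Return, per system, the accounts that belong to nobody on the current staff list."""
    stale = {}
    for system, accounts in access_by_system.items():
        leftover = set(accounts) - current_staff
        if leftover:
            stale[system] = leftover
    return stale


# Made-up example: Bob has left, but still has access to the key vault.
access_by_system = {
    "key_vault": {"alice", "bob"},
    "prod_db": {"alice", "carol"},
    "ci": {"alice"},
}
current_staff = {"alice", "carol"}
to_revoke = stale_access(access_by_system, current_staff)
```

Anything that turns up in `to_revoke` is access that should have been removed. A non-empty result, run regularly, is exactly the kind of red flag worth asking about.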
It’s worth mentioning that “Cyber Security” is not something you can really “bolt on” after the fact. We used to rely on high levels of security at the perimeter, protecting our data centres and offices to keep data safe, but that doesn’t really work these days. Security is more of a mindset in how we build, improve and maintain our technology landscape. You may have heard the term “DevSecOps”, which is mostly just making it clear that DevOps always included a collective responsibility for Security. So, make sure your Developers get training! Some training for Execs and the Board would also help. Call Laura Bell and the team at SafeStack. They know their shit.
That’s a decent start, off the top of my head while on the train. What am I missing? Anything you’d add? Will add some extras when they come to mind…