Mitigations for Technical Risk


Edition e1.0.1

Updated August 7, 2023

Divide and Conquer

People often compare the ability to program a computer with superhuman powers. Sure, it may seem like that at times when you see programs do things that are seemingly impossible or futuristic, but programmers are only human. There’s a limit to how much the human brain can comprehend at any given time, and we often find that limit when learning a new codebase or managing a large project at work.

There are some projects that are so incredibly complex that they cannot be built or fully understood by a single individual. To complete these projects, a team of developers needs to work together to build individual components that fit together to build a complete system.

Large software projects are inherently risky. They take up huge chunks of the engineering organization’s time and resources in an effort to build something that no one fully understands and that no one can be sure will succeed in the end.

It’s impossible to completely eliminate the risk involved in these large initiatives, but there is a useful tool for managing the complexity: decomposition.

Decomposition involves breaking down the problem into smaller and smaller pieces until each individual piece can be comprehended and completed individually. When decomposing large-scale projects, look for patterns or common components of work within the requirements and group them to create boundaries around related tasks. Doing this will help you expose natural hierarchies that simplify complex systems.

Breaking down tasks into smaller, more manageable chunks has the added benefit of exposing relationships and dependencies between the tasks. You may find that one task has to be completed before another can begin, or you may be able to identify tasks that can be worked in parallel by you and your team members so that the team can move quickly. Sometimes, things need to be built in a specific order, so use this technique to help expose critical dependencies and identify risks that may delay the project or prevent your team from meeting their deadline.
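The dependency ordering described above can be sketched in a few lines of code. The example below is only an illustration: the task names and their dependencies are hypothetical, and it assumes Python 3.9 or later for the standard library's graphlib module. It groups tasks into batches, where every task in a batch can be worked on in parallel once the previous batches are done.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks for an illustrative project. Each task maps to the
# set of tasks that must be finished before it can start.
dependencies = {
    "design schema": set(),
    "build API": {"design schema"},
    "build UI": {"design schema"},
    "write docs": {"build API"},
    "integration test": {"build API", "build UI"},
}


def plan_batches(deps):
    """Group tasks into batches; tasks within a batch can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        # Everything returned here has all of its prerequisites completed.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = plan_batches(dependencies)
for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {', '.join(batch)}")
```

A long chain of single-task batches is a hint that the work is mostly sequential, which is exactly the kind of scheduling risk worth raising early.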

When to use decomposition:

  • When dealing with large projects: In an agile development shop, you would break down long-term initiatives into medium-term epics, which are further broken down into short-term user stories. You can then help prioritize the order of user stories that should be worked on first.

  • When refactoring large pieces of a codebase: Big changes equal big risk. Break up the changes into small pieces and refactor them piece by piece over time. There’s much less risk in deploying incremental changes to a production environment than there is in deploying one large change.

  • When dealing with quarterly or annual goals: You may have a few main short-term and long-term priorities, but what you need to do to meet those goals may not be obvious. Breaking them down to smaller subgoals will help you work backwards and figure out a plan of action.

No matter how large a task or project is, decomposing the problem is all about breaking down the requirements into smaller puzzle pieces. While this allows you to organize the pieces so they are easier to understand, what it really comes down to is managing risk by planning ahead.


Planning Ahead

“Every battle is won before it’s ever fought.” - Sun Tzu

“Part of our strategy is getting the programmers to think everything through before they go to the coding phase. Writing the design documents is crucial, because a lot of simplification comes when you see problems expressed as algorithms.” - Bill Gates

Whether you’re assigned a ticket to work on or you’re able to choose which ticket to pull in next, taking a little time to put a plan together can go a long way in reducing wasted time and effort on coding the wrong solution. Depending on how much information is in the ticket, it may be a straightforward change that’s been thought through already, which is great! But there will be times when you don’t have quite enough information from the ticket, and you’ll need to do some research and planning before writing any code.

A common habit among junior programmers is to begin writing code as soon as they pull a ticket from the backlog. They may not fully understand the problem or may not have a complete grasp on the codebase, so they start making small changes here and there to see if they can come up with a solution that works. While you might find a good solution now and then, there’s a good chance some other programmer on your team had a different implementation in mind, and oftentimes, theirs might be better because they understand the problem or the codebase better.

Coding without a plan is a mistake that telegraphs your inexperience to your manager and the rest of your team. It often results in a lot of rework because you don’t fully think through the problem and have to change direction before coming up with the final solution. You should be deliberate with most changes and not settle on the first solution that comes to mind, because there are oftentimes better ways to solve a problem.

An easy technique you can use to reduce the risk of rework is to plan out your work ahead of time. In fact, this is a technique you’ll be using quite a bit as a professional programmer. You’ll find that this will improve your decision-making skills because you can work through different scenarios and eliminate ones that are insufficient or that would be difficult to maintain or extend in the future. Planning gives you an opportunity to iterate on your solution before writing code, rather than having to rewrite large chunks of code. After all, it’s faster and cheaper to refactor an idea on paper than it is to refactor code that’s already been written.

Rewriting code is expensive. It can cost hundreds or thousands of dollars when you haven’t considered the consequences or side effects of a solution before implementing it. You may have to toss out code you spent all day writing because your coworker pointed out an edge case you didn’t think about ahead of time. It’s frustrating when you have to throw out work, especially after spending a long time working on the solution. Planning ahead hedges against the risk of having to toss out code.

Planning ahead also helps you see the bigger picture of the problem you’re trying to solve, because it forces you to think about important decisions upfront when it’s cheap to iterate to better solutions. Over time, the software industry has adopted different tools and frameworks for writing and structuring your planning. Design documents are the most common tool used for planning out the technical details of a software project.

A design document is a template or worksheet that helps you think about how your solution will meet a set of technical requirements. The technical requirements describe what the end result should be, and your design document describes how your solution will meet those requirements. A thorough design document should contain everything you and the other developers need to write the code to satisfy the project’s requirements. Once you’ve spent the time thinking through the solution, the design document will serve as a guide for you and the other programmers throughout the life of the project.

Not only does a design document serve as your guideline, but it also allows your teammates to evaluate and peer review your ideas before you spend valuable time and money implementing your solution. It’s nearly impossible to know every side effect and dependency of the code you’re writing, so the more kinks you can iron out in the design, the easier your code will come together when it’s time to write it. Spending time compiling your ideas in a design document forces you to think through the architecture and how it will integrate with other parts of your codebase. It also helps you gather valuable feedback before spending developer-hours implementing the wrong solution or even solving the wrong problem.

A primary purpose of using design documents is to come to a consensus on a solution before implementing it, which helps avoid costly disagreements in the future. By getting all parties on board with a solution, you can be sure that what you deliver is what was agreed upon. And if the stakeholders come to you and try to increase the scope of the project or change direction, you can point to the design document and show them what they agreed to at the beginning of the project.

It may feel like more work up front, but it’s much cheaper to change the design of a solution during the planning period than it is to make an expensive change once the code has already been written.

Code Reviews

While this one may seem obvious to most people, the number of teams that ship code to production without a proper code review process is probably higher than you’d think. Under tight deadlines and stressful or even lax work environments, it is easy to skip the code review process altogether, and that introduces the risk that you’ll ship buggy code to production. It’s especially common on small teams or on newer projects because things change so quickly as you’re building out a minimum viable solution.

While it may allow you to ship code faster, skipping code reviews comes at the expense of code quality. Adding a second pair of eyes to peer review your code increases the quality of your work because your coworkers might catch bugs that you didn’t even know existed. In addition to catching syntactic errors or nitpicking on coding standards, your coworkers may also catch potentially dangerous errors in the logic itself. Code you were confident worked one way may work a completely different way if there’s a misplaced operator or parenthesis. Your coworkers may also have more knowledge about a specific part of the codebase that you’re changing and can help identify unforeseen consequences of the changes you’re proposing.
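As a small illustration of how a missing pair of parentheses can change behavior, consider the sketch below. The function names and the permission rule are hypothetical, invented only for this example. Both versions run without errors, which is why this kind of mistake tends to be caught by a reviewer rather than by the interpreter.

```python
def can_edit_buggy(is_admin, is_owner, is_locked):
    # Intended rule: admins or owners may edit, but only when unlocked.
    # Python's `and` binds tighter than `or`, so this actually parses as
    # is_admin or (is_owner and not is_locked).
    return is_admin or is_owner and not is_locked


def can_edit_fixed(is_admin, is_owner, is_locked):
    # Parentheses make the intended grouping explicit.
    return (is_admin or is_owner) and not is_locked


# An admin looking at a locked document:
print(can_edit_buggy(True, False, True))  # True: the lock is ignored
print(can_edit_fixed(True, False, True))  # False: the lock is respected
```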

There’s no doubt that code reviews can be frustrating. You may think you’ve done a good job coming up with a good solution to the problem, and you’re probably proud of the code you’ve written, but your teammates may pick apart your code and ask for changes. They’ll ask questions about why you built something the way you did and suggest edits that you may not think are correct.

It’s easy to get defensive when it feels like you’re being attacked, but it’s important to remember that they’re not criticizing you personally. You’re all part of the same team and it’s everyone’s responsibility to ship quality code. Try to keep in mind that they’re just trying to help you make improvements to your code.

Plus, there are plenty of benefits to code reviews that you may not realize, such as:

  • When you review other people’s code, it helps you learn the codebase.

  • The codebase is constantly changing, so it also helps you stay up-to-date with the modifications being made.

  • You’ll be exposed to new techniques and patterns from the code that your coworkers write, and it will help you write better code.

  • Your coworkers may offer advice on a better way of solving a problem, helping you learn and grow as an engineer.

  • Requiring one or two pre-merge code review approvals adds checks and balances to reduce the risk of shipping buggy code.

  • Having your code reviewed forces you to tie up any loose ends and make sure your code works and has been tested before submitting it for peer review. Just knowing that your coworkers will catch bugs means you’ll spend extra effort making sure your code works properly.

  • Code reviews give developers a chance to enforce consistency within the codebase, from patterns to naming conventions and syntax.

  • Code reviews help catch critical mistakes that are often overlooked or misunderstood by the author.

  • Your coworkers will help ensure your code meets the project requirements as well as your organization’s coding standards.

  • Your coworkers may find performance issues in your code and suggest ways to improve the efficiency of your algorithms.

  • Likewise, your coworkers may find security issues in your code that could compromise your business’s credibility or, worse, your customers’ data.

The list above is by no means exhaustive, and there are many more benefits to the process of reviewing code before merging it into the main branch. While it can feel like a burden and extra overhead to some programmers just starting their career, the benefits outweigh the costs in the long run based on the number of issues that are caught during the development phase instead of allowing them to slip through to the staging and production environments.

Code reviews are all about managing and reducing the risk involved in shipping defective code. Just like authors, researchers, and students need to have their writing peer reviewed, so do programmers. We’re not able to catch every mistake, especially when we’re deep in the weeds trying to get our code to compile correctly. Having other developers double-check your work benefits everyone in the long run.


Static Code Analysis

Static code analysis is the act of analyzing a codebase without actually executing its code. The technique is gaining popularity among software organizations, and many teams are adopting tools to help standardize and find vulnerabilities within their code.
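To make the idea of inspecting code without running it concrete, here is a minimal sketch built on Python's standard ast module. It is not how any particular commercial tool works; the two rules it checks (bare except clauses and calls to eval) and the sample source are assumptions chosen only for illustration. The sample code is parsed into a syntax tree and never executed.

```python
import ast

# Sample source to inspect. It is only parsed, never executed.
SOURCE = '''
def load(path):
    try:
        return eval(open(path).read())
    except:
        return None
'''


def find_issues(source):
    """Return sorted (line, message) pairs found by walking the syntax tree."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # Illustrative rule 1: an `except:` clause with no exception type.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            issues.append((node.lineno, "bare except hides errors"))
        # Illustrative rule 2: a direct call to the built-in eval().
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
        ):
            issues.append((node.lineno, "eval() on untrusted input is risky"))
    return sorted(issues)


for line, message in find_issues(SOURCE):
    print(f"line {line}: {message}")
```

Real analyzers apply many more rules and track data flow across files, but the principle is the same: the findings come from the structure of the code, not from running it.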

There is an entire industry dedicated to automating static code analysis so that you can focus on what you do best: building value for your customers. Some of the more advanced static code analysis tools will scan your software dependency graph for vulnerabilities and alert you to any libraries that you should upgrade or replace due to security issues. They often use proprietary or open-source databases, maintained by security researchers, to track known software vulnerabilities.

Here are a few examples of some great static code analysis tools:

  • SonarCloud helps you quantify code coverage and identify security vulnerabilities, duplicate code, and code smells.

  • Snyk helps you find and automatically fix security vulnerabilities in your code, open-source dependencies, and infrastructure code so you can focus on building.

  • GitHub’s Dependabot helps you keep your dependencies up-to-date by automatically opening pull requests against your GitHub repositories to install updates.

If your team doesn’t already use static code analysis to aid in finding and fixing vulnerabilities, consider suggesting that they try out some tools. You’d be surprised at what vulnerabilities may be lurking in your codebase, and you can leverage these tools to harden your systems and build more reliable software.


Automated Tests

In the previous section you learned how an automated test suite can provide immense value to your team. Automated testing is so important that it’s worth mentioning again, because it doubles as a way to manage and reduce the risk of introducing defects when making changes to existing code. A team with sufficient automated test coverage across their codebase can proactively catch bugs more quickly and cheaply, before their code changes hit production.

Building good habits like writing unit and functional tests when you commit new code is one of the best things you can do as a junior programmer. If your team doesn’t already have a test suite or a continuous integration system in place, use that as an opportunity to suggest one and implement it yourself. It’s a lot of work up front, but it’s a long-term investment that will bring improvements to developer productivity for years to come.

Here are examples of how automated testing can help you and your team:

  • Automated tests lead to increased productivity, because you can make changes to parts of the codebase with confidence that you’re not breaking existing functionality.

  • Feedback loops are faster because you can run the tests locally or on your continuous integration server as you’re making changes. There’s no need to deploy your code to hosted environments to make sure it’s working properly.

  • The overall software development life cycle can be shortened because you can make changes and write new tests to ensure the code is working properly.

  • You’re able to reduce the risk of introducing new defects because you can write test code that checks for specific edge cases and then run those tests over and over again.

  • Automated tests allow you to focus on feature development and building for scale, rather than tracking down and fixing bugs introduced into the system when you make changes to legacy code.
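As a rough sketch of what checking specific edge cases looks like in practice, the example below uses Python's built-in unittest module. The apply_discount function is hypothetical and exists only to give the tests something to exercise; your own code and test framework will differ.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(80.0, 25), 60.0)

    def test_zero_and_full_discount(self):
        # Boundary values are a common source of off-by-one mistakes.
        self.assertEqual(apply_discount(80.0, 0), 80.0)
        self.assertEqual(apply_discount(80.0, 100), 0.0)

    def test_rejects_out_of_range_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(80.0, 101)
```

Saved in a file, these tests can be run locally or on a continuous integration server with `python -m unittest`, and they will keep guarding the same edge cases every time the function changes.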

If your team already has a continuous integration system in place, that’s great. All you have to do then is build the habit of adding new tests with every change you make. You’ll be surprised at how quickly your test suite grows, and pretty soon you’ll have good coverage over the business-critical components of your system. The more test cases you can cover, the lower the probability of introducing regression issues into your codebase. And lowering the probability of introducing breaking changes lowers the risk when refactoring or making changes to the system.

Postmortems

“Failure is only the opportunity more intelligently to begin again.” - Henry Ford

This whole section has been about managing and reducing risk, but an unfortunate fact of life is that it’s nearly impossible to completely eliminate all risk involved in writing software. With any moderately complex software, things will go wrong at some point. And sometimes things will go very wrong. Failure is inevitable, and at some point, you’ll be pulled into an incident. When these incidents happen, it’s important to use them as learning experiences and take the time to reflect on the preceding events in order to better understand how and why they happened. In doing so, you’ll be able to learn from your mistakes and make any appropriate changes to prevent them from happening again in the future.

The best thing you can do in the aftermath of an incident is to capture and document what happened leading up to, during, and after the incident so that you can reflect, learn, and share that knowledge with others within your organization. This process is known as a postmortem.
