Making sense of code
Know your angle
Knowing why you have the code in front of you will help you to define the detail in which you need to understand it. Here are some possible options.
You’ve been asked to QA it- you will probably need to understand the nuts and bolts of functionality, at least enough to
try and break ittest its robustness.You need to run it in totality- in this case, you probably just need a high level view of what the code does and the key steps. You need to understand enough to debug errors and understand outputs but the nitty gritty is less important.
You want to use elements for yourself- you can focus just on the functionality you want, no need to understand everything, just how to find the key bits and why they do what they do.
You will need to maintain and adapt the code yourself- in this case, you will probably need an in-depth understanding of the code and its processes.
Know what the code does
Get an idea of what the piece of code / project was designed to do. Maybe it’s well-documented or its author has already told you what the purpose is, or maybe it’s a complete mystery. Either way, pick a simple example to work through or just run it all and see what it does.
Understand the geography
Most code will be split into sections, e.g. libraries, functions, data import, wrangling / manipulation, visualisation, output. Just knowing how the script is organised can help hugely with unpicking sections. Try and figure out broadly what each part of the code is doing and how the pieces fit together. Maybe it’s helpful to sketch out the different sections and what they do?
Familiarise yourself with the data
Have a look at the data that the script uses. Familiarise yourself with the format, column headers, metric values. This will help you to understand what specific bits of code and focusing on and, if you want to reuse any code, this can also help you to understand what format you need your data to be in.
Pick apart small sections
It’s often difficult to read code from start to finish- partly because the optimal organisation for code is not often the way that code is initially written. Try and focus on trying to understand one bit in detail at a time.
Think of something that you know that the code does and find that bit.
Make notes- maybe run the code line by line and make a comment detailing what that bit did.
Don’t get sidetracked- try and focus on one bit at a time.
Start from the beginning of a process, or the middle, or the end! Sometimes it’s easiest to start from the beginning and understand each step in-turn, other times the steps taken can feel illogical without knowing the end point. In these cases, you might want to start from the output and understand the previous steps. Other times you might do a bit of both and meet in the middle- do whatever helps you!
Trust the code
If there’s a section of the code that you don’t understand but it isn’t part of the process that you need to understand at the moment, it’s ok to treat it as a magical black box. You can always come back to it later but, for now, it’s just a part of the code that serves a purpose. Examples of chunky bits you might be able to ignore are anything that ‘makes things pretty’. Unless you’re specifically interested in the nuts and bolts of e.g. visualisations, that’s often quite a bit of scary looking that you can happily ignore without compromising your understanding of code processes.
The internet is your friend
Error messages are not always the most helpful. Fortunately, you won’t be the first person to have encountered an error. Copy the error message, minus any bits that are specific to your instance e.g. local paths, file names, etc., and ask the internet for help. If you encounter multiple error messages, start with the first (earliest) one- often it’s a single problem causing the subsequent errors.
Tips
Note: these are R specific tips to help with organising your own, and others’, code.
Indentation:
Highlighting code and hitting ctrl + i will indent all code. This makes it easier to read and spot the starts and ends of complex functions and loops.
Rainbow parenthesis:
This is a really handy way to patch opening to closing parenthesis. You can spend more time understanding what things are doing rather than counting and matching opening and closing brackets.
Tools > Global Options > Code > Display > Rainbow parenthesis
Searching functions:
If you don’t know what a function does or what its inputs / outputs are, you can search in the console e.g. don’t know what mutate does, type ?mutate and hit enter and you will see a short description of what mutate does and an explanation of the variables it takes as inputs.
Sections:
Create a section by ctrl + shift + r and giving it a name. Sections created like this can be collapsed and navigated to using the menu between the script and the console. This can help you navigate your code and focus only on the section you need.