How long does it take to get used to a codebase?
How long does it normally take a developer to get used to a code base, or at least familiar enough with it to become productive? It is a difficult question, but long story short: you can’t know. If you have to guess, though, 6 months is a decent guess.
I’ve heard people say that a brand new hire usually gets about 6 months of warm-up time. However, that’s more of a wild guess than a measurement.
The trouble is that no two projects are the same. They differ in complexity and in size, which makes them difficult to compare and measure. Even size isn’t a reliable yardstick, because it’s hard to tell which code is actually in use and which is just part of an imported library. Sometimes you can’t even trust the git history to give you an idea of who and what you’re dealing with.
The general rule is “the larger the code base, the longer it’ll take to understand”, and the relationship isn’t linear: it feels closer to exponential. If the code you’re working on is big enough, you’ve got no hope of understanding all of it within your lifetime! Think of it: even if you did nothing but read code all day, every day, you still wouldn’t reach the end. I’ve heard stories of Oracle’s or Google’s code bases being so large that just running the unit tests can take multiple hours, sometimes more than a day. So the fact is you’ll never completely understand it. Even with a medium-sized code base you’re not likely to understand every piece. The best you can do is arm yourself with some tools and techniques that help you understand the code you’ll actually be touching yourself.
The first technique is learning to see how a function fits in. That means finding all the touch points, all the interactions with the code you’re looking at. One function can be called from many places, so you’ll want to understand how it is called and what’s going on in each place that calls the function you’re thinking of changing.
Once you’re sure you understand all the different places where it’s called, you’ll find that you understand the purpose of your function pretty well.
Take a function that gets users, or something equally basic and common. Find that function, then find every usage of its name. If it lives inside a class, you’ll also need to find every usage of that class (every time the class gets instantiated). You might find it useful to put the different classes and usages down in a note on the side to keep track of them, then mark them off one by one as you inspect them. This is the problem with making changes in a large code base: there are a lot of things, and they’re all connected somehow, so you need to understand what is going to get affected.
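Here is a rough sketch of that "find every usage and tick them off" routine in Python. Everything in it is made up for illustration: the function name get_users, the two tiny files, and the helper names find_usages and load_sources. It only does a whole-word text search, so treat the results as a starting checklist rather than a complete call graph.

```python
import re
from pathlib import Path


def find_usages(name, sources):
    """Return (file, line_number, line_text) for every whole-word match of name.

    `sources` maps a file label to that file's text.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for label, text in sources.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((label, number, line.strip()))
    return hits


def load_sources(root, suffix=".py"):
    """Read every file under root that ends with suffix into a dict."""
    return {
        str(path): path.read_text(encoding="utf-8", errors="replace")
        for path in Path(root).rglob("*" + suffix)
    }


# Hypothetical two-file project, used only for illustration.
example = {
    "models.py": "def get_users(active=True):\n    return []\n",
    "views.py": (
        "from models import get_users\n"
        "\n"
        "users = get_users()\n"
        "all_users = get_users(active=False)\n"
    ),
}

# Print the hits as a checklist to mark off one by one.
checklist = find_usages("get_users", example)
for label, number, line in checklist:
    print(f"[ ] {label}:{number}  {line}")
```

On a real project you would pass load_sources("path/to/project") instead of the example dict. Your editor's "find references" feature or a plain grep does the same job; the point is ending up with a list you can work through.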
If you’re going to make a change and there are a lot of tests, the coverage seems good, and everyone has high confidence in the test suite, then it’s probably safe to make the change, run the tests, and see if everything still works. However, if you’re working in an environment that’s more chaotic than that, what you’re doing can be pretty risky, and you’ll need more time to understand your surroundings.
Another way to demystify little pieces of code that don’t make any sense is a nice technique I learned from an old coworker: just delete the code (in your own working copy) and see what breaks. You should be able to see changes in the application, and when and where errors start to crop up. If there are no errors at all, maybe the code doesn’t get used anymore, which is worth investigating further. I find that deleting pieces of code works wonders and quickly shows you the significance of the code in question. It works especially well in interpreted languages, because everything else keeps running normally until execution reaches the missing piece. A compiled language will instead give you compiler errors, which also help in your search to understand.
Another thing you can do is take some of the features that you know exist inside the code base and write them yourself. This puts you in the mood to see how they were done in the actual code base. As soon as you run into a problem or a really tricky part, you’ll feel the motivation to cheat and just look up how it’s done in the existing code.
If you’re in a hurry and want to understand quickly what’s going on, often just reading the function names and class names can give you a general idea. You’ll be introduced to the biggest actors, or classes, and the kinds of behavior they are expected to have.
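If the code happens to be Python (an assumption, the idea applies to any language), the standard ast module can pull out just the class and function names for you. This is a minimal sketch: the outline helper and the sample source are hypothetical, and it only looks at top-level definitions and the methods directly inside classes.

```python
import ast


def outline(source):
    """List top-level classes (with their methods) and functions in source."""
    tree = ast.parse(source)
    function_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    names = []
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            names.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, function_types):
                    names.append(f"  {node.name}.{item.name}()")
        elif isinstance(node, function_types):
            names.append(f"def {node.name}()")
    return names


# Hypothetical source file, used only for illustration.
sample = '''
class UserRepository:
    def get_users(self):
        return []

    def add_user(self, user):
        pass


def send_welcome_email(user):
    pass
'''

summary = outline(sample)
print("\n".join(summary))
```

Even without reading a single function body, the names alone suggest that there is a repository for users and that new users get an email.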
If a code base is very large and very confusing, you might consider looking at the history and watching from the beginning as the project gets built up, commit by commit. You can see the parts being added by using the git log command, or by viewing the history in a Git GUI. This helps you understand the rationale behind the structure of the whole program, and which parts of the codebase are the oldest and which are new.
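One way to read the history from the start is git log with the --reverse flag. The sketch below wraps that in Python. The helper names (parse_log, oldest_commits) and the sample commits are invented for illustration, and it assumes git is installed and the path points at a real repository.

```python
import subprocess

SEP = "\x1f"  # unit separator, unlikely to appear in names or subjects
# %h = short hash, %ad = author date, %an = author name, %s = subject
LOG_FORMAT = "%h%x1f%ad%x1f%an%x1f%s"


def parse_log(text):
    """Turn the formatted git log output into a list of dicts."""
    commits = []
    for line in text.split("\n"):
        if not line.strip():
            continue
        short_hash, date, author, subject = line.split(SEP, 3)
        commits.append(
            {"hash": short_hash, "date": date, "author": author, "subject": subject}
        )
    return commits


def oldest_commits(repo_path, limit=20):
    """Return the first `limit` commits of the repository, oldest first."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--reverse", "--date=short",
         f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    # Slice here rather than passing -n to git: git applies the commit
    # limit before --reverse, which would give the newest commits instead.
    return parse_log(result.stdout)[:limit]


# Made-up output in the same format, so the parser can be tried without a repo.
sample_output = "\n".join([
    SEP.join(["a1b2c3d", "2015-03-01", "Alice", "Initial commit"]),
    SEP.join(["d4e5f6a", "2015-03-02", "Bob", "Add user model"]),
])

history = parse_log(sample_output)
for commit in history:
    print(commit["date"], commit["hash"], commit["author"], commit["subject"])
```

Inside a real repository, oldest_commits(".") gives you the first twenty commits, which is usually enough to see what the project started out as.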
Using a debugger is another way to get a good idea of exactly what’s going on, especially if you have a bug somewhere or a function you know you’ll use. Just put down a breakpoint and run your program. You will see when the code is called and what kind of context it’s called in. You’ll find out if it’s in a loop. You can look at the stack trace and find out if it’s towards the front or the back of a long line of execution. Sometimes just seeing the values inside the variables can help clear up the behavior the code is causing.
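In Python (again just an assumed language for illustration), the built-in breakpoint() call drops you into the pdb debugger at that line. The sketch below shows where you would put it, and uses the traceback module to print the same "who called me, and with what" information without stopping the program. The functions get_users, build_report, and main are hypothetical.

```python
import traceback


def get_users(active=True):
    # In an interactive session you could write `breakpoint()` here instead
    # and look around by hand (pdb commands: `w` shows the stack, `p active`
    # prints a variable, `c` continues).
    stack = traceback.extract_stack()[:-1]  # drop this function's own frame
    callers = [frame.name for frame in stack]
    print("get_users reached via:", " -> ".join(callers), "| active =", active)
    return callers


def build_report():
    chains = []
    for flag in (True, False):  # the output makes the loop visible
        chains.append(get_users(active=flag))
    return chains


def main():
    return build_report()


chains = main()
```

Running it prints one line per call, so you can see at a glance that get_users sits at the end of the main, build_report chain, that it is called twice, and what value it received each time.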
Sometimes you have access to the authors of the code base. I don’t think you should contact strangers, but if the author is in your company, or if the situation is dire enough, you can go find them. If you know how, you can use git blame to see who committed a piece of code, and so find out who to talk to. Generally you’ll be able to see their email address, and you can ask to meet with them, or buy them lunch.
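If you want to find out who wrote most of a file, git blame has a machine-readable mode, --line-porcelain, that repeats the author name and email for every line. Here is a small sketch that counts lines per author from that output. The helper names and the sample text are made up, and the sample is abbreviated: real output carries more header lines per entry (committer, timestamps and so on), which this parser simply ignores.

```python
import subprocess
from collections import Counter


def authors_from_blame(porcelain_text):
    """Count lines per (author, email) in `git blame --line-porcelain` output."""
    counts = Counter()
    author, mail = None, None
    for line in porcelain_text.split("\n"):
        if line.startswith("author "):
            author = line[len("author "):]
        elif line.startswith("author-mail "):
            mail = line[len("author-mail "):].strip("<>")
        elif line.startswith("\t"):
            # A tab marks the actual source line, which ends one entry.
            counts[(author, mail)] += 1
    return counts


def blame_authors(repo_path, file_path):
    """Run git blame on one file and count lines per author."""
    result = subprocess.run(
        ["git", "-C", repo_path, "blame", "--line-porcelain", "--", file_path],
        capture_output=True, text=True, check=True,
    )
    return authors_from_blame(result.stdout)


# Abbreviated, made-up blame output for a three-line file.
sample_blame = "\n".join([
    "1111111111111111111111111111111111111111 1 1 2",
    "author Alice",
    "author-mail <alice@example.com>",
    "summary Initial commit",
    "filename models.py",
    "\tdef get_users():",
    "1111111111111111111111111111111111111111 2 2",
    "author Alice",
    "author-mail <alice@example.com>",
    "summary Initial commit",
    "filename models.py",
    "\t    return []",
    "2222222222222222222222222222222222222222 3 3 1",
    "author Bob",
    "author-mail <bob@example.com>",
    "summary Add logging",
    "filename models.py",
    "\tprint('loaded')",
])

line_counts = authors_from_blame(sample_blame)
for (name, email), lines in line_counts.most_common():
    print(f"{name} <{email}>: {lines} lines")
```

Keep in mind that blame shows who last touched each line, not necessarily who designed it, so the person at the top of the list is a good first guess and nothing more.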
If you’re stubborn, digging through the code yourself without asking for help can take hours. But if you’re more flexible and humble, the original author of the code might surprise you with friendly and helpful advice, or understanding. And if they refuse, then you’re just back to square one.
You might also get information that’s not in the code: what parts are finished, what parts aren’t finished but seem to be, what the plan was, what the intention was, business logic, special cases, etc.
Even if it was too long ago and they don’t remember, they can look at their code and jog their own memory.
Getting information from the author can be easy, but be warned: don’t be rude, and don’t criticize them about their code! They will probably be happy to give you the story behind the code that already exists, but if you ruin your opportunity by messing up your relationship with them, they probably won’t want to help you.
Be tactful when asking about someone else’s code. A lot of programmers do get sensitive about their work; it’s just a fact.
You might ask, “How long will it take me to get a hold of this code base and understand it?”, but that’s not the right question, because it doesn’t matter how long it would take you to totally understand the whole code base. The real question is: how long until you can do good work?
That can happen quickly as long as you can get a grip on little pieces of the code over and over again as needed.
Now, language features like PHP functions, built-in JavaScript functions, libraries, APIs, etc. are a different case. Those are generally limited to a size where you can learn every function, or at least see every piece that you need to know about. With these you can get after it and come to a pretty complete understanding.
Anyway, happy code reading. It’s the thing you will probably end up doing more than writing code in your career, and it’s a great skill to learn.