Graham King

Solvitas perambulum

Growing software, in two tweets

Summary
I find gardening to be the perfect metaphor for software development, as I navigate its complexity through a clear process. I start with a simple `main` function, then extract and organize functions into objects when it grows too large. When the number of objects increases, I group them into packages and protect their internal methods. If the package structure becomes unwieldy, I create a new project for logical clusters. This iterative approach, reinforced by unit tests, helps me manage complexity effectively.

Gardening is my preferred metaphor for software. When I grow a new piece of software, there is a very predictable path I go through as I attempt to manage its complexity. It is, in fast-forward, the history of programming language design. That process is short enough to capture in about 400 characters, as two tweets.

1. Write main. Put your program in there.

Call it fear of the blank page, but I always start by writing a program that prints “Start”. I build and run it. From then on, it’s all maintenance. Next I put some things after “Start”.
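As a minimal sketch of that first step, here is what it might look like in Go. The language is an assumption (the steps apply to any language), and the character count after “Start” is a made-up first feature, only there to show code accumulating inside `main`.

```go
package main

import "fmt"

func main() {
	// The very first version: prove that the program builds and runs.
	fmt.Println("Start")

	// "Some things after Start": a hypothetical first feature, written inline.
	line := "growing software in two tweets"
	fmt.Println("characters:", len(line))
}
```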

2. main too big? Extract functions. Use global vars.

Eventually the single function grows too big. Exactly what “too big” means depends on you and the function, but a solid guideline is “easily fits on the screen”. The big function is obviously doing a few things. One easy way to identify the “things” is that they often appear as a block of code preceded by a comment. I move each of those blocks into its own function, passing the data as function parameters. Some data will be shared by many functions. Make those variables global.
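A rough illustration of this step, assuming Go and a hypothetical word-counting program: the names `countWords`, `report` and `totalWords` are invented for the example. Each function used to be a commented block inside `main`, and the one value they share is simply a global.

```go
package main

import (
	"fmt"
	"strings"
)

// totalWords is used by more than one function, so at this stage it is global.
var totalWords int

// countWords used to be a block in main under a "count the words" comment.
// It returns the number of words in line and adds it to the global total.
func countWords(line string) int {
	n := len(strings.Fields(line))
	totalWords += n
	return n
}

// report used to be a block in main under a "print the summary" comment.
func report(lines int) string {
	return fmt.Sprintf("%d lines, %d words", lines, totalWords)
}

func main() {
	fmt.Println("Start")

	lines := []string{"growing software", "in two tweets"}
	for _, l := range lines {
		countWords(l)
	}
	fmt.Println(report(len(lines)))
}
```

`main` now reads as an outline of the program, and each extracted function fits on the screen.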

Different programs grow to different sizes. It’s OK to stop at any point, so sometimes I’m done. Simpler is better. Find a different way to impress your intelligence on people.

3. Too many functions? Group functions and their data (replacing global vars) into objects.

Gradually I notice that some of those functions belong together. They have similar names. They work on the same data. Time for an object (struct or class, depending on the language). A new object means a new file, even if the language doesn’t force that. The functions that belong together become methods on this new object, and the global variables they use become the object’s data.
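Sketched in Go (an assumption, since the step works the same with classes), with a hypothetical `Counter` type for a word-counting program: the functions that touched the same data become methods, and what would have been global variables become fields.

```go
package main

import (
	"fmt"
	"strings"
)

// Counter groups the functions that worked on the same data.
// Its fields replace what were global variables.
type Counter struct {
	lines int
	words int
}

// Add records one more line and the words it contains.
func (c *Counter) Add(line string) {
	c.lines++
	c.words += len(strings.Fields(line))
}

// Report summarises everything added so far.
func (c *Counter) Report() string {
	return fmt.Sprintf("%d lines, %d words", c.lines, c.words)
}

func main() {
	fmt.Println("Start")

	var c Counter
	for _, l := range []string{"growing software", "in two tweets"} {
		c.Add(l)
	}
	fmt.Println(c.Report())
}
```

In a real project `Counter` would get its own file, for example a hypothetical `counter.go`; it is shown in one file here only so the sketch runs as is.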

4. Too many objects? Group objects into packages. Make some methods private.

When there are too many files in the current directory (i.e. too many objects), it is time to group some of those objects together. The way to do that is packages. Some set of objects forms a logical grouping, usually because they relate to the same real-world thing and collaborate. I move those into their own directory and make that a package.

Some parts of this new package are used from outside. That is the package’s interface. I protect the other methods so that they can only be used from inside the package. This public interface might later become an actual interface, with multiple implementations. This is particularly helpful when writing unit tests.
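A hedged sketch of the idea in Go, where visibility follows capitalisation: exported names start with an upper-case letter, everything else is private to the package. All names here (`LineCounter`, `Counter`, `split`, `fakeCounter`, `process`) are hypothetical, and everything sits in one file only so that it runs as is; in a real layout the counter would live in its own directory and package.

```go
package main

import (
	"fmt"
	"strings"
)

// LineCounter is the public interface turned into an actual interface.
type LineCounter interface {
	Add(line string)
	Report() string
}

// Counter is the real implementation. Its fields are unexported.
type Counter struct{ lines, words int }

// Add is exported: it is part of what outside code may use.
func (c *Counter) Add(line string) {
	c.lines++
	c.words += len(split(line))
}

// Report is exported too.
func (c *Counter) Report() string {
	return fmt.Sprintf("%d lines, %d words", c.lines, c.words)
}

// split is unexported: from another package it would not be visible at all.
func split(line string) []string { return strings.Fields(line) }

// fakeCounter is a second implementation, the kind a unit test might use.
type fakeCounter struct{ added []string }

func (f *fakeCounter) Add(line string) { f.added = append(f.added, line) }
func (f *fakeCounter) Report() string  { return "fake" }

// process depends only on the interface, not on a concrete type.
func process(c LineCounter, lines []string) string {
	for _, l := range lines {
		c.Add(l)
	}
	return c.Report()
}

func main() {
	lines := []string{"growing software", "in two tweets"}
	fmt.Println(process(&Counter{}, lines))
	fmt.Println(process(&fakeCounter{}, lines))
}
```

Because `process` only sees `LineCounter`, a test can hand it the fake and inspect what was added, without touching the real counter.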

5. Too many packages? Split out a new project, either library or network service.

Occasionally even a tidy hierarchy of packages is not enough to control the complexity. Time to split the project, to make a new repository, and move a logically consistent set of packages to that new project. The original project might use the new project as a library, as a sub- or peer process, or as a network service. Occasionally they don’t need to talk at all.

6. Iterate from 2 when other functions grow too big.

Things that are too complicated feel uncomfortable. As soon as a function trips that switch I start splitting it up, iterating from step 2 above.

7. (bonus) Write unit tests.

Writing tests makes it more obvious when things are too complicated, and makes it easier to safely refactor to reduce that complexity. By far the best book here is Growing Object-Oriented Software, Guided by Tests. It dramatically improved my code.