Have you ever thought about large software systems as organic living organisms? This sounds a bit odd and, maybe, philosophical. Starting with first principles: everything in the world is based on fundamental physical laws. When we think about the organic world we often imagine fuzzy, imprecise, evolving and reproducing things. But when we think about a software program, we often imagine strictly deterministic outputs for given inputs, with none of the typical attributes of living organisms, even though at the most fundamental level both are built from the same basic physical particles and follow the same laws. At the level of molecules or even proteins, the organic world is fairly deterministic, much like small software systems. But a system built by hundreds of software engineers over a decade, serving millions or billions of requests per day, bears more resemblance to a living organism than to exact deterministic machinery.

I could think of multiple similarities, for instance:

  • Growth and evolution. A codebase constantly adapts to a changing environment; engineers constantly modify code. Some sub-systems die off and new ones are added.
  • Metabolism. The system consumes resources (CPU, RAM, GPU, storage, network, or whatever other capacity) in order to operate (live). It has a “metabolic rate” (your cloud bill) and produces “waste” (logs, error reports, heat).
  • Homeostasis. The system attempts to maintain a stable internal state. Monitoring, alerting, and auto-scaling all exist to keep it healthy and in balance. At work people even say “system health”.
  • Emergent behaviors. This one is a bit interesting. In any system with enough complexity, the interactions between individual parts sometimes create unpredictable, emergent behaviors that no single person designed. This sometimes results in bugs that are extremely difficult to debug, or system behaviors that are challenging to explain.
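
The homeostasis point can be made concrete with a toy control loop. This is just a sketch of the idea, with a proportional rule similar in spirit to how horizontal autoscalers compute replica counts; the function name and numbers are illustrative, not a real autoscaler API:

```python
import math

# Toy homeostasis rule for a service: nudge the replica count toward a
# target CPU utilization, the way an autoscaler keeps a system "in
# balance". Hypothetical sketch, not a real autoscaler interface.

def desired_replicas(current: int, cpu_utilization: float,
                     target: float = 0.6) -> int:
    """Proportional rule: new = ceil(current * observed / target)."""
    if cpu_utilization <= 0:
        return current  # no signal, hold steady
    return max(1, math.ceil(current * cpu_utilization / target))

print(desired_replicas(4, 0.9))  # running hot -> scale up to 6
print(desired_replicas(4, 0.3))  # running cold -> scale down to 2
```

Run in a loop against a live metric, a rule like this is exactly the kind of negative feedback that keeps an organism's internal state stable.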

Counterarguments could be things like:

  • Replication. Arguably, complex systems are not that advanced in self-replication; at least it isn’t their primary goal (unless the system is a computer virus). Data replication and horizontal scaling offer partial analogies, but maybe not quite enough.
  • Consciousness. Large systems don’t have “consciousness”, though we could argue that encoded adaptations, distributed algorithms, automatic maintenance, and state monitoring are partially related. This becomes blurrier in the world of AI.
  • Non-determinism. Systems are often designed to be deterministic. But once they are large and complex enough, randomness is necessarily introduced, and ML models leverage randomness by design.

Working at multiple very large companies serving billions of users has definitely made me think that large software systems grow beyond human-scale comprehension. Sometimes I think these large systems are organisms, and software engineers are part of the organism, working to keep parts of it alive and evolving. I also like the analogy of software engineers looking after a large system the way gardeners look after a living garden: if it isn’t tended properly, it becomes messy and overgrown.

What does this mean practically? Well, this was just a thought I typed in, so I’m not 100% sure, but maybe it could mean:

  • Stop trying to fit the entire system in your head. Instead, build world-class observability (metrics, distributed traces, and logs) and keep the parts of the system you are responsible for healthy.
  • Embrace a bit of chaos. The system will sometimes behave unexpectedly. Practices that stress-test the system (chaos engineering) are good, just as organisms are sometimes tested by adverse environments or by viruses and infections.
  • Design with evolution in mind, and do not expect your solution to be final or to never be pruned by other engineers.
  • The more parts of the system are built with AI assistance, the more organic the system becomes. We should focus on the “trunk” of the tree while letting AI build the leaves and small branches.
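
The chaos-engineering point can be sketched in a few lines: inject random faults into a call, then check that the caller’s retry logic keeps overall behavior healthy. Everything here (the wrapper, the fetch function, the failure rate) is hypothetical and only illustrates the idea:

```python
import random

# Toy chaos-engineering sketch: wrap a function so calls randomly fail,
# then verify that retry logic survives the injected faults. The names
# and failure rate are illustrative, not a real chaos tool.

def chaotic(fn, failure_rate=0.3, rng=random.Random(42)):
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise RuntimeError("injected fault")
        return fn(*args, **kwargs)
    return wrapper

def fetch_profile(user_id):
    return {"id": user_id}

flaky_fetch = chaotic(fetch_profile)  # the "infected" dependency

def fetch_with_retry(user_id, attempts=5):
    for _ in range(attempts):
        try:
            return flaky_fetch(user_id)
        except RuntimeError:
            continue  # immune response: try again
    return None

print(fetch_with_retry(7))
```

Like exposing an organism to a mild stressor, deliberately injecting faults in a controlled way tells you whether the system’s defenses actually work before a real outage tests them for you.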