First craft a language then build your software

Babel Tower

A language fit for purpose enables us to express our thoughts clearly and concisely. In programming, this language consists of the programming language itself, the libraries we use, and the abstractions we build. I believe that most of the incidental complexity in code stems from an unsuitable language. A misfit language forces you to speak in a lengthy and roundabout way, never quite getting at what you actually want to say.

Imagine you lacked the word "snow" and always had to say "white, fluffy, crystalline, frozen water". Now imagine you are writing instructions for waxing cross-country skis and you lack the words for snow, skis, and the different kinds and states of snow. How verbose and incredibly hard to understand that would be! And this is exactly how most of our code looks. As Alan Kay puts it: "Most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves." [Kay] Let’s explore this idea further.

We also think that creating languages that fit the problems to be solved makes solving the problems easier, makes the solutions more understandable and smaller, and is directly in the spirit of our “active-math” approach. These “problem-oriented languages” will be created and used for large and small problems, and at different levels of abstraction and detail.

Sources of inspiration

The idea of the critical role of a "language" has been inspired by three sources. The first one is the STEPS Project 2012 report, which postulates that our code bases are 100, 1000, or more times larger than should be necessary. And the authors go on to demonstrate how equivalent software could be built with much less, mainly by leveraging custom, purpose-built languages. For example, their "Nile Language creates standard graphics rendering, compositing, filtering, etc., which cover virtually all personal computing graphics functions in just 496 lines of code." (If this topic interests you then you must check out the article Little Languages Are The Future Of Programming.) So the language we use really matters. (To contrast this, I have seen a damn context menu that took over 500 lines of code!)

The second source of inspiration was Christian Johansen’s observation that when we build web UIs, we are in the UI domain and the terms we use should deal with the visual structures we work with, and not the data entities in our database. You could say that we need to build our own little language of "design-based components" for data visualization and use this language to present domain data to the user.
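A minimal Python sketch of this separation, under my own assumptions: the component names (`heading`, `key_value_list`, `card`) and the `Customer` entity are hypothetical and invented for illustration, not taken from Christian Johansen's work. The components speak only of visual structure; a single function translates domain data into that vocabulary.

```python
from dataclasses import dataclass


# --- UI vocabulary: knows visual structures, never domain entities ---

def heading(text: str) -> str:
    """Render a title with an underline of matching length."""
    return f"{text}\n{'=' * len(text)}"


def key_value_list(pairs: list[tuple[str, str]]) -> str:
    """Render label/value pairs with the labels padded to equal width."""
    width = max(len(label) for label, _ in pairs)
    return "\n".join(f"{label.ljust(width)} : {value}" for label, value in pairs)


def card(title: str, body: str) -> str:
    """Combine a heading and a body into one visual block."""
    return f"{heading(title)}\n{body}"


# --- Domain data: a hypothetical entity, invented for this example ---

@dataclass
class Customer:
    name: str
    email: str
    open_orders: int


def customer_card(customer: Customer) -> str:
    """The only place where the domain meets the UI vocabulary."""
    return card(
        customer.name,
        key_value_list([
            ("Email", customer.email),
            ("Open orders", str(customer.open_orders)),
        ]),
    )


print(customer_card(Customer("Ada", "ada@example.com", 2)))
```

Because `card` and `key_value_list` know nothing about customers, the same words could present an invoice or a product without change.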

The third source was the old claim that LISP programmers spend a good amount of time crafting a language for the problem at hand, which then makes solving the problem easy(er).

What constitutes a good solution language?

What ingredients do we need to craft the "ideal" language to solve a particular problem? The answer is somewhat fractal, because you need the best programming language / DSL at the lowest level, then the best libraries, and finally the best abstractions at the top level. The higher the level, the more important it is to make good choices. But a bad choice at any level will cost you.

You need a programming language powerful enough to express the abstractions you want, one that allows you to focus on the business problem and not waste time on technical details. You likely want a high-level language and are willing to give up a lot of control in exchange for conciseness. Compare adding a text field to a UI with a single word vs. specifying every single pixel of the component. The STEPS project indicates to me that a general-purpose programming language might be too low level and thus not a good enough choice here. (Even if it is Clojure!) The language also includes the platform and its capabilities in the case of languages that come with one, such as Smalltalk or Erlang.

Next you need good libraries that provide you with capabilities (such as interacting with the AWS cloud), abstractions (e.g. state machines), and "architectural patterns" [I need a better term here] that allow you to structure your code in a particular way (for example core.async’s communication via queues). Libraries provide you with bigger building blocks to build upon.
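To make the "abstractions (e.g. state machines)" point concrete, here is a small Python sketch of a state machine expressed as data. It is an assumption-laden illustration: the order workflow, the `TRANSITIONS` table, and the `step` and `run` helpers are hypothetical and do not come from any particular library.

```python
# Hypothetical order workflow: (current state, event) -> next state.
TRANSITIONS = {
    ("draft", "submit"): "submitted",
    ("submitted", "approve"): "approved",
    ("submitted", "reject"): "draft",
    ("approved", "ship"): "shipped",
}


def step(state: str, event: str) -> str:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state!r}") from None


def run(state: str, events: list[str]) -> str:
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state


print(run("draft", ["submit", "reject", "submit", "approve", "ship"]))  # prints "shipped"
```

The whole workflow is readable in the table; the surrounding code never needs a chain of conditionals to find out what may happen next.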

The final key ingredient is the right abstractions for your problem domain. I would think that Domain Driven Design is a good starting point here. This requires a deep understanding of the domain you are working in, one that can only be gained incrementally, through trial and error.
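As a hedged illustration, the ski-waxing picture from the introduction could get its missing words like this. Everything here is an assumption made for the example: the names `SnowKind`, `Snow`, and `recommend_wax` are hypothetical, and the rules are invented and are not real waxing advice.

```python
from dataclasses import dataclass
from enum import Enum


class SnowKind(Enum):
    NEW = "new"
    OLD = "old"
    ICY = "icy"


@dataclass(frozen=True)
class Snow:
    kind: SnowKind
    temperature_c: float

    @property
    def is_wet(self) -> bool:
        """Treat snow at or above freezing as wet (a simplification)."""
        return self.temperature_c >= 0


def recommend_wax(snow: Snow) -> str:
    """Illustrative rules only, invented for this sketch."""
    if snow.kind is SnowKind.ICY or snow.is_wet:
        return "klister"
    if snow.temperature_c < -8:
        return "hard wax, cold"
    return "hard wax, mild"


print(recommend_wax(Snow(SnowKind.NEW, -10)))  # prints "hard wax, cold"
```

Once "snow", its kind, and "wet" are words in the language, the rule reads almost like the sentence a domain expert would say.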

On languages and architectures

How does solution language relate to software architecture?

A language is about possibilities, about everything that can be built. An architecture is about structure, namely the underlying, elementary structure of the final solution. It is the skeleton we build - using the language - so that we can layer the meat of the solution onto it. Thus architecture determines the shape of the solution. In other words, the language provides us with the building blocks and ways of relating them to each other. But what structures we build from those … that is architecture.

I believe someone described architecture as a particular solution to a set of conflicting forces. That is an excellent way of thinking about architecture.

Solution language evolves over time

A solution language is not like the ten commandments, set in stone and immutable once finished. It is rather a living, evolving system, perhaps most similar to the DNA of a virus: there are parts that do not change or change very little, while other parts change constantly to adapt to the constantly evolving environment.

It should be noted that the whole process of crafting a solution language, including its initial version, is rather evolutionary. You start with a best-guess draft solution and a "cost function" - likely your experience-based intuition - telling you how good a fit it is for the problem at hand. Then you continuously iterate towards a better solution, perhaps applying heuristics to avoid getting stuck in local optima.

A key design skill is to separate the stable and the dynamic parts of the language so that a new insight can be incorporated without reworking everything.
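One possible reading of this split, sketched in Python under my own assumptions: a small evaluator is the stable part, and a dictionary of words is the part expected to change. The expression format, the `evaluate` function, and the words `sum`, `times`, and `discounted` are all hypothetical.

```python
from typing import Callable, Union

# An expression is a number, or a tuple of (word, arg1, arg2, ...).
Expr = Union[int, float, tuple]


def evaluate(expr: Expr, vocabulary: dict[str, Callable]) -> float:
    """Stable part: walks an expression without knowing what any word means."""
    if isinstance(expr, (int, float)):
        return expr
    word, *args = expr
    return vocabulary[word](*(evaluate(arg, vocabulary) for arg in args))


# Dynamic part: the words of the language, expected to change with each insight.
vocabulary = {
    "sum": lambda *xs: sum(xs),
    "times": lambda a, b: a * b,
}

# A later insight: discounts deserve their own word. Only the vocabulary changes.
vocabulary["discounted"] = lambda price, percent: price * (100 - percent) / 100

order_total = ("discounted", ("sum", ("times", 2, 30), 40), 10)
print(evaluate(order_total, vocabulary))  # prints 90.0
```

Adding `discounted` did not touch `evaluate`, which is the property the paragraph above asks for.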

The friends and enemies of brevity

The more code you write, the more complex it becomes, the more likely it is to contain defects, and the harder it is for a developer to keep in mind. Thus brevity is a worthy goal. Even with the ideal language, your code can be shorter or longer depending on your choices. Let’s explore some of the relevant factors.

Automation

Automation allows brevity - the more you let the system determine for you, the less you need to type. We touched on that briefly before. Giving up control may feel painful but it may really be worth it:

[..] a lot of the Cairo code was dedicated to hand-optimization of pixel compositing operations; work that could, in theory, be offloaded to a compiler. Dan Amelang from the Cairo team volunteered to implement such a compiler, and the result was Jitblt. This meant that the optimization work in the graphics pipeline could be decoupled from the purely mathematical descriptions of what to render, which allowed Nile to run about as fast as the original, hand-optimized Cairo code.

Side note: Wolfram is a fascinating language - primarily for scientific computations - that automates far more than your average language. Absolutely worth a look!

Customization

The more you want to customize something, the more you want to make it unique, the more code you will need. This is true even if you have the perfect language, and doubly so if you don’t. For example, if you want to give each pixel in a dialog a unique color, you have to at least type out those (count pixels) colors. So beware customization and its lifetime cost.
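The trade-off between automation and customization can be sketched with the text field mentioned earlier. This is an illustration under assumptions: `TextField`, its default values, and the `customizations` helper are hypothetical and not part of any real UI toolkit.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TextField:
    label: str
    # Defaults the system picks for you; the values are invented for this sketch.
    width_px: int = 240
    height_px: int = 32
    font: str = "system"
    border_color: str = "#cccccc"


def customizations(field: TextField) -> dict:
    """List every decision that deviates from the defaults,
    i.e. what someone now has to write and maintain."""
    default = TextField(field.label)
    return {
        f.name: getattr(field, f.name)
        for f in fields(field)
        if getattr(field, f.name) != getattr(default, f.name)
    }


plain = TextField("Name")  # one word, everything else is automated
custom = TextField("Name", width_px=300, font="serif", border_color="#ff0000")

print(customizations(plain))   # prints {}
print(customizations(custom))
```

The plain field costs one word; each override in `custom` is another line that has to be typed, reviewed, and kept consistent over the lifetime of the software.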

Corner cases

Corner cases are a huge source of complexity, possible bugs, and tons of extra code. So when you design your language and discuss requirements with your clients, try to minimize corner cases. And for those you cannot design out, consider early how to handle them. Could you ignore them, allowing the software to behave in a bad / undefined way, because their frequency and impact are not worth the effort and increased complexity? (Though you likely do not want to follow the example of C++ here :-).) Or could you handle them in a suboptimal but cheap way?

Examples

A few examples of efforts that are moving in the right (or wrong) direction.

  • Above all, the Nile Language described above

  • HyperFiddle / Photon, which tries to solve the hairy, time-devouring problem of making sure that the frontend has the data it needs and that this data (and changes to it) stays in sync with the backend (they observe that "most web application complexity is spent coordinating data flow between places in your system over network"). A key part of the solution is compiling your single source code to a dataflow signal graph, with parts distributed over backend and frontend. I am not sure how well this works in practice, but it sure is fascinating.

  • Clojure - one of the motivators that brought me to Clojure was my frustration with the inability to express certain abstractions (mostly related to control flow) in Java. Paul Graham praised the virtues of a more powerful language ages ago. Clojure is better because it allows me to craft more of my solution language.

  • Fulcro and its Rapid Application Development extension demonstrate that by building the right abstractions (here in the domain of full-stack Single-Page web Apps), we can become much more productive. The author, Tony Kay, has an uncanny ability to see problems clearly and in depth and thus distinguish what can and cannot be abstracted away. Most abstractions / generalizations we devs create turn out to be unnecessarily abstract in some directions and too constricting in others. Fulcro’s are spot on.

  • Spring Framework ☠️ is an anti-example (see /2020/spring-nevermore/) of an effort to abstract and simplify that resulted in a complex, incomprehensible black box, which works at first but starts sucking the life out of you as soon as something goes wrong (YMMV).

Summary

The software we build is unnecessarily complex and the code far too long. But we can do better. The key problem is that we use a misfit solution language, which forces us to speak in a lengthy and roundabout way, unable to clearly and concisely express our business logic and deal with essential technical complexity.

The solution then is to craft a "language" fit for the problem and domain. And by language I mean the complete tooling we use.

Normally, little thought is given to crafting a solution language for the problem at hand. We take the programming language we are given as the only option. We hopefully think a little more about libraries, but not through the lens of language building. And finally we seek domain abstractions as something orthogonal to the first two, not seeing all three as parts of the same, synergetic whole. I would like to challenge and change this and introduce language crafting as a holistic, key part of software development. A good solution language is expressive and clear with respect to our particular needs.

We all know that a good domain model is crucial for building good software. But knowing is not enough: how well a team does at discovering and maintaining its domain model depends on its discipline and prioritization. And as we have just seen, this goes much deeper than the domain model and its abstractions. That is just one of the three parts of a solution language. The programming language or problem-specific language we use - or ideally craft - and the libraries we pick both also contribute critically to the quality of our solution language. All three together determine how well we will be able to express business concepts and rules and solve the problem at hand. And each of these three legs supports and strengthens the other two, the whole being far more than a simple sum of its parts.

Aside: Question the status quo

This whole exploration has been triggered by my questioning of the status quo of software development. We take too many things for granted: general-purpose programming languages as the proper tool, the necessity of the deep separation between frontend and backend (contrast Photon). Don’t. Look around, challenge all established "truths," keep looking for better ways.

References


Tags: opinion productivity


Copyright © 2024 Jakub Holý