Supersonic Man

March 27, 2019

what makes one programming language better than another?

Filed under: Hobbyism and Nerdry,technology — Supersonic Man @ 3:26 pm

Every programmer who knows more than one language has opinions about which languages are better than which other languages. I’m no different from anyone in that respect, but I realize now that I’ve never taken the time to clearly think through what criteria to use in such a comparison. Of course there are times when different language features and styles are suitable for different tasks, but some generalities are pretty universal, and I think different situations mostly just change the emphasis and priority between them.

What got me to take a closer look was hearing someone state the opinion that the measure of the better language is simply the one which forces you to write less repetitive boilerplate. That turns out to be a surprisingly valid and comprehensive metric, despite how plodding and un-abstract it sounds, and I had never thought in those specific terms before.

So, what are some useful criteria for distinguishing a good language from a bad language? Here’s what comes to mind for me:

1. Efficiency. In general, the language should not penalize you for using it by producing a slow-running program, nor should it waste your time during the writing process. Typing a lot of boilerplate is definitely a loss of efficiency.

Supporting good execution performance implies that the language should include some capability to manually optimize the most performance-sensitive parts of the code — the little inner loops in which nanoseconds count because they might go through billions of repetitions. This in turn also implies that there needs to be some access to low-level primitive data types such as bytes. The performance criterion generally favors strongly typed compiled languages over loosely typed scripted ones.

2. Clarity. Simple and commonplace actions should be expressed simply, and complex innovative actions should be expressible as clearly and elegantly as possible, without being forced to translate the basic idea into some other idiom which doesn’t fit it. The language should support writing for ease of reading, as well as for effectiveness of execution.

Sometimes clarity comes a bit into conflict with writing efficiency: a more verbose language can be more readable than a terse one. And sometimes having to type more boilerplate makes the resulting code easier to follow, because the structure is easier to see. There are exceptions to this correlation, but languages that are famed for their brevity are often the same ones with a reputation for unreadability.

For clarity, language features which some consider overkill, or ripe for abuse, can be quite beneficial overall. Capabilities such as redefining what the “+” operator means, or using a novel definition of how iteration works in some new context, are beneficial in that they allow what may be complex inner behaviors to be used and combined in a very readable way.
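As a rough illustration of that point, here is a minimal Python sketch. The names Vec2 and Path are made up for this example. It redefines "+" for a small vector type and gives a path its own notion of iteration (segment by segment), so the calling code at the bottom stays short and readable.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Vec2:
    """A 2-D vector (hypothetical type, for illustration only)."""
    x: float
    y: float

    def __add__(self, other: "Vec2") -> "Vec2":
        # Redefine "+" so vector sums read like ordinary arithmetic.
        return Vec2(self.x + other.x, self.y + other.y)


class Path:
    """A sequence of points with its own definition of iteration."""

    def __init__(self, *points: Vec2) -> None:
        self._points = list(points)

    def __iter__(self) -> Iterator[Tuple[Vec2, Vec2]]:
        # Iterating a Path yields its segments (pairs of neighboring points),
        # hiding the index arithmetic from the caller.
        return zip(self._points, self._points[1:])


offset = Vec2(1, 1)
path = Path(Vec2(0, 0), Vec2(3, 0), Vec2(3, 4))

# The complex inner behavior is used and combined in one readable line.
shifted_segments = [(a + offset, b + offset) for a, b in path]
```

The same features could obviously be abused, for instance by making "+" do something unrelated to addition, which is the usual objection to them.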

3. Expressive power and convenience. If a language helps the writer express in one statement what might otherwise have to be broken down into several, that’s beneficial. This aids both writing and reading, and also improves code quality: a commonly cited finding is that the number of bugs introduced is roughly proportional to the number of distinct statements written by the coder, not to the total complexity of the resulting program. When the language empowers the coder to use complex tools in a simple way, the software they produce can be more advanced and powerful for the same effort, and with the same quality.

It is beneficial to support templates with type parameters, so that it’s possible to support a substantial library of collection and iterable types which can be used flexibly and easily. To include features that ease common operations such as “for each distinct” or “first matching” or “all for which” on such collections is also of value — perhaps more value than is commonly appreciated. This is one of the most important factors for avoiding boilerplate: not just avoiding empty verbiage, but making sure the coder does not have to write their own version of a well-known standard operation.
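To make this concrete, here is a small Python sketch of the three operations named above, written once with a type parameter so they work on any collection. The function names distinct, first_matching, and all_for_which are my own labels for this example, not standard library names.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")  # the type parameter: these helpers work for any item type


def distinct(items: Iterable[T]) -> List[T]:
    """Each distinct item once, in first-seen order (items must be hashable)."""
    return list(dict.fromkeys(items))


def first_matching(items: Iterable[T], pred: Callable[[T], bool]) -> Optional[T]:
    """The first item satisfying pred, or None if there is none."""
    return next((item for item in items if pred(item)), None)


def all_for_which(items: Iterable[T], pred: Callable[[T], bool]) -> List[T]:
    """Every item satisfying pred, in original order."""
    return [item for item in items if pred(item)]


words = ["pear", "apple", "pear", "fig", "apple", "banana"]

unique_words = distinct(words)                                  # "for each distinct"
first_long = first_matching(words, lambda w: len(w) > 4)          # "first matching"
short_words = all_for_which(unique_words, lambda w: len(w) <= 4)  # "all for which"
```

Without such helpers (or built-in equivalents), each of those three lines at the bottom turns into a hand-written loop with its own temporary variables, which is exactly the kind of boilerplate at issue.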

Property getters and setters also help here. They were ideologically controversial at first because of the potential for misuse, but are popular for good reason. So is the fluent pattern — chaining consecutive method calls.
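A minimal Python sketch of both ideas follows. The Temperature and Query classes are hypothetical examples invented for illustration. The first uses a property so that validation happens behind what looks like a plain field; the second returns self from each method so that calls can be chained.

```python
from typing import List, Optional


class Temperature:
    """Stores Celsius; exposes Fahrenheit as a computed read-only property."""

    def __init__(self, celsius: float = 0.0) -> None:
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        # The caller writes "t.celsius = x" but still gets validation.
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self) -> float:
        return self._celsius * 9 / 5 + 32


class Query:
    """A tiny fluent builder: each step returns self so calls can chain."""

    def __init__(self, table: str) -> None:
        self._table = table
        self._filters: List[str] = []
        self._limit: Optional[int] = None

    def where(self, condition: str) -> "Query":
        self._filters.append(condition)
        return self

    def limit(self, count: int) -> "Query":
        self._limit = count
        return self

    def build(self) -> str:
        sql = f"SELECT * FROM {self._table}"
        if self._filters:
            sql += " WHERE " + " AND ".join(self._filters)
        if self._limit is not None:
            sql += f" LIMIT {self._limit}"
        return sql


boiling = Temperature()
boiling.celsius = 100
sql = Query("users").where("age >= 18").where("active = 1").limit(10).build()
```

The potential for misuse is visible here too: a setter can hide arbitrarily expensive or surprising work behind what looks like a simple assignment.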

4. Nonrepetitiveness. DRY — don’t repeat yourself — is a rule to help coders produce quality work, and the language should follow it too. The code should let you express a fact once, rather than making you say the same thing twice. Repetition counts as boilerplate, but beyond that, there should be a single authoritative source for each piece of information, with no ambiguity.

This can be a surprisingly challenging goal once you start bringing in “glue” for APIs and data that exist outside of the language. Even within a language, mirroring the same data in different formats can lead to confusion over which version is the one that matters. When this occurs, the development environment needs to be very clear about which one is the original and which one is a conversion byproduct — a rule that is frustratingly ignored by popular systems such as Microsoft Visual Stupido.

“Glue” should be simplified as much as possible; if interfacing with something external which needs wrappers or something, try to either find a way to express the interface compactly without requiring manual coding of a translation, or provide a tool to completely automate the translation process. This particularly applies to ORM classes which wrap database tables. Expressing all this with fewer and more general language features is better than having to memorize a larger number of specific features or idioms of narrower use.
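One way to picture "automating the translation" is to derive the database glue from a single class definition. The Python sketch below is only an illustration under simplifying assumptions: the Customer class, the SQL_TYPES mapping, and the helper names are all invented here, the table name is just the lowercased class name, and a real ORM handles far more (keys, migrations, relationships).

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python field types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Customer:
    """The single authoritative description of a customer record."""
    id: int
    name: str
    balance: float


def create_table_sql(cls) -> str:
    """Generate the CREATE TABLE statement from the class definition."""
    columns = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__.lower()} ({columns})"


def insert(conn: sqlite3.Connection, row) -> None:
    """Generate and run an INSERT for any dataclass instance."""
    names = [f.name for f in fields(row)]
    column_list = ", ".join(names)
    placeholders = ", ".join("?" for _ in names)
    table = type(row).__name__.lower()
    conn.execute(
        f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})",
        [getattr(row, name) for name in names],
    )


conn = sqlite3.connect(":memory:")
ddl = create_table_sql(Customer)
conn.execute(ddl)
insert(conn, Customer(1, "Ada", 12.5))
rows = conn.execute("SELECT id, name, balance FROM customer").fetchall()
```

Adding a field to Customer changes the table definition and the insert statement together, so there is no second copy of the schema to keep in sync by hand.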

Another aspect of nonrepetitiveness is that arbitrary restrictions which make the coder jump through hoops, in the name of enforcing safety and best practices, should be kept relatively minimal, or offer only a low hurdle if some inconvenience is necessary. This favors weak, dynamic, or “duck” typing over strong static typing, but there is also a solid argument the other way, as such approaches can sometimes make mistakes more difficult to find, thereby wasting coders’ time.

Some commonplace language features to be avoided if possible because they are repetitive include:

  • forward or import declarations, especially in separate header files, or anything similar which requires you to re-specify an interface that’s already declared elsewhere
  • writing multiple overloads of a function just to give it optional parameters
  • having to spell out something’s type both when declaring it and when initializing it, particularly when generic parameters are involved
  • having to put the same modifiers onto many consecutive similar declarations
  • glue which has to be manually updated to be kept in sync with outside changes
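For the optional-parameters item in the list above, the alternative to a stack of overloads is default values declared once in a single signature. Here is a small Python sketch; connect is a hypothetical function that only builds a description string rather than opening a real connection.

```python
def connect(host: str, port: int = 5432, *, timeout: float = 10.0,
            use_tls: bool = True) -> str:
    """One definition covers every combination of optional arguments.

    Hypothetical example: returns a description instead of connecting.
    """
    scheme = "tls" if use_tls else "plain"
    return f"{scheme}://{host}:{port} (timeout={timeout}s)"


# Callers state only what differs from the defaults.
default_target = connect("db.example")
custom_target = connect("db.example", 6543, use_tls=False)
```

In a language without default or named arguments, the same flexibility would take one overload per combination, each one restating the parameter list and forwarding to the next.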

5. Locality. All of the material which describes a given code or data entity should be readable in one place, not divided up into different sections or different files. For instance, if a piece of code represents a component of a web page, you should ideally have the markup, the styles used by it, the client-side script, and the server-side logic all packaged together. This is a goal of web framework systems such as Angular.

ORM in particular should strive to come as close as possible to seeing the database definition and the business logic spelled out in very nearby locations, even if normally they would have to be expressed in completely separate unrelated languages.

Exception handlers reduce locality; inline error handling may be better, though it has its own downsides in terms of clarity, as exception handlers let you express clearly how the algorithm is supposed to work when everything is as expected. I guess which is better depends on whether the errors in question are considered part of the normal flow (such as validating user input), or something rare and unexpected (like a disk failure).
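The tradeoff can be sketched in Python as follows. Both functions are invented for this example, and the file path is deliberately one that should not exist. The first treats bad user input as a normal outcome and reports it inline as a return value; the second lets a rare failure travel as an exception so that its body describes only the success path.

```python
from typing import Optional, Tuple


def parse_age(text: str) -> Tuple[Optional[int], Optional[str]]:
    """Inline style: bad input is an expected outcome, returned as a value."""
    text = text.strip()
    if not text.isdecimal():
        return None, "age must be a whole number"
    age = int(text)
    if age > 150:
        return None, "age is out of range"
    return age, None


def load_settings(path: str) -> str:
    """Exception style: the body states only what happens when all is well."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Validation is handled right where it occurs, as part of the normal flow.
age, error = parse_age("forty")
message = f"Please try again: {error}" if error is not None else f"Age is {age}"

# The rare failure is handled away from the code that caused it.
try:
    settings = load_settings("no-such-directory/settings.ini")
except OSError:
    settings = ""  # fall back to empty settings
```

The handler at the bottom is where locality suffers: in a larger program it may sit several call levels away from the open call that actually failed.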

Note that locality conflicts with the traditional teaching that code and classes should be broken up into many small units. Breaking things up adds clarity if the behavior that’s extracted can be fully described in a short phrase — otherwise, it impedes clarity.

6. Separation of concerns, and scalability in general. The language should aid teams of coders in dividing work, and minimize the need for people working in different areas to have to communicate and coordinate details. This sometimes makes locality more difficult, but there are major productivity costs when coders start working in large cooperative teams, and clear separations keep those costs to a minimum. The best thing you can do for one coder in a group, or one group in a larger team, is minimize the amount that they have to pay attention to what everyone else is doing.

Conflicts with the goal of locality can be mitigated by having exports and glues and such be automatically generated quietly and behind the scenes, so that depending on your context, you either never have to look at the glue file or never have to look at the original. Language features that are good by this metric include namespaces, packages, automatic dependencies, and automatically generated documentation. Also important here is the clear separation of public from private knowledge; you should be able to present your code, even if it has high locality, as an exported API that the reader doesn’t need to ask about the internals of.

* * * * * * * * * * * * * * * * *

So having set out some criteria, how do I feel some languages stack up by these metrics?

JavaScript — the most widely used language in the world:
Efficiency is rather low, as it’s interpreted and lacks primitive type access, but this is mitigated by aggressive optimizations that competing interpreters one-up each other with. Clarity and convenience of expression are medium, but suffer from strange workarounds and unwieldy syntax for common cases, such as using closures as class constructors (pre-ES6). Locality and nonrepetitiveness are good. Scalability and separation are a struggle, but improving as the ECMAScript 6 standard gains support. Overall avoidance of boilerplate is decent, given the necessarily rather small size of the language. Given its mandatory ubiquity, we could do a lot worse. (The TypeScript dialect improves expressive convenience and scalability, but is much more limited in where you can use it, and may raise the boilerplate level.)

Java — the inspiration of both JavaScript and .Net:
Efficiency is nothing special. Clarity and expressive convenience are definitely lacking compared to its contemporary competitors. Locality is fairly good, and so is scalability. But the boilerplate factor is frustratingly high, even before something like ORM gets involved. I call Java the oldest new language — the one that defines the cutoff between what’s obsolete and what’s current.

C — the classic that many of today’s popular choices are descended from:
Efficiency is exceptionally high, but clarity and convenience are way behind the times. Locality is not very good, and scalability is not either. The boilerplate factor is pretty bad when it comes to the actual algorithms you’re coding, unless you have an extensively updated library to mitigate this, but then readability suffers.

C++, a halfway modernized version of C:
Efficiency is not as high as C but is certainly competitive. Clarity and convenience are greatly improved in some areas but backward in others. Locality is possibly worse than C, despite extensive updating. Scalability is pretty good. Overall boilerplate factor is still not very good. This language is the last strong survivor from premodern times, and though it is still widely seen as the default for heavy-duty development, at its core it is now obsolete.

C# and related .Net languages:
Efficiency is pretty fair. Clarity and convenience are dramatically further improved over C++, but at a cost in complexity and feature-creep. Locality is pretty good — of course it falls down if you get ORM and glue involved, but that’s no different from its predecessors. Scalability is a strong point. Overall boilerplate factor is low-medium and has improved with language revisions.

PHP — the new default first language for noob web developers:
Efficiency sucks, clarity sucks, expressive convenience sucks, and scalability sucks. The only bright spot is locality. Boilerplate factor is moderate. This language fully deserves its bad reputation.

SQL dialects with coding extensions, such as PL/SQL and Transact-SQL:
Efficiency can’t be measured the same way as for other languages. Clarity and expressive convenience are a struggle, even in the specialized niche uses for which these languages primarily exist, which is where they’re at their strongest. Locality and nonrepetitiveness are poor. Scalability is an area which remains awkward despite a lot of effort. Boilerplate factor is on the high side.

. . .

And now, let’s try forming some baseless opinions about some additional languages which I have not actually learned, going just by what I’ve picked up about them through idle curiosity:

Python — a popular scripting language which supersedes the likes of Perl:
Efficiency is not going to be a strong suit. Clarity is probably pretty good, but it sounds like expressive convenience is nothing to brag about. I’m not aware of any bad issues with locality or avoiding repetitiveness. Scalability sounds like it’s not too bad. Boilerplate factor is… I don’t know, kind of medium?

Ruby — a competitor to Python:
Efficiency is again probably nothing to brag about. Clarity is, I suspect, frequently on the challenging side, depending very much on the coder. Expressive convenience is apparently a strong suit; this is its selling point against Python. Again, I’m not aware of any bad issues with locality or avoiding repetitiveness. Scalability sounds like it’s probably not bad. Boilerplate factor might be pretty low, but I don’t know. This language is now dropping in popularity; the plain speaking of Python has won over more users than the more florid expressiveness of this alternative.

Go — the Google language:
Efficiency is said to be surprisingly poor. Clarity may be decent, but expressive convenience is not a strong suit. Again, I’m not aware of any bad issues with locality or avoiding repetitiveness. Scalability is… I have no idea, but it can’t be too bad. Boilerplate factor appears to be mild, as far as I can tell.

Rust — the competitor to Go from the Mozilla Foundation:
Efficiency is said to be much better than Go. Clarity looks decent. Expressive convenience is by all accounts higher than that of Go, at the cost of a tougher learning curve, but it’s still more of a detail-oriented language than Python or Ruby, being meant for systems work. It makes much heavier demands on the coder to adopt a novel idiom and jump through hoops in the name of safety. I’m not aware of any bad issues with locality, but there may be more repetitiveness than in Go. Boilerplate factor might be higher than in Go — maybe the difference is minor, but I suspect it might be substantial. This language is aggressively innovative in some areas, and we’ll need time to see whether its concepts move us forward or lead to a dead end. It’s starting to look trendy, but if it catches on I bet someone else will find an easier way to express similar ideas.

Swift — Apple’s language:
I really don’t know enough about this one to have any idea. What little I’ve glimpsed of it makes it seem quite middle-of-the-road, with nothing exceptional about it.

Some other languages that I know too little about, but that are probably worth mentioning, include R, Haskell, Clojure, Erlang, OCaml, and Scala. Some of these languages are of interest because they are outside of the mainstream and have intriguingly unusual approaches such as functional programming, not because they are widely used. (Actually, Erlang just made codementor.io’s list of the top five languages to not bother learning.) I will say in general that the paradigm of functional programming, to me, is not ideal for clarity unless the problem being solved is of a mathematical nature, as it forces commonplace concepts into a new idiom, one which doesn’t fit well with interacting with live users.

* * * * * * * * * * * * * * * * *

My overall takeaways:

First: modern languages good, old languages bad. There are lots of languages people used to use thirty years ago which I have experience in and could elucidate above, but which I don’t consider to be worth listing because there are so few bright spots. For instance, many people are nostalgic about BASIC, but I consign it to the same trashcan as PHP. Just take it as read that even the ones I liked at the time, such as Pascal, generally suck.

Second, I feel pretty okay about sticking with C# and JavaScript as my default languages. Nothing else out there is giving me all that much grounds for envy. Been thinking about looking deeper at Python or Ruby or Rust, but they all have aspects that make me doubt whether the effort is worthwhile. Rust might be the most important one to look at.

Third, if someone can come up with a really practical way to move beyond SQL, they’ll get free drinks for life.
