dependencies

Software with too many dependencies is an obstacle to sustainability.

This is because each software dependency:

  1. follows its own life cycle and its own version scheme;

  2. is dependent on the machine where the software is installed;

  3. makes it difficult to rebuild the software exactly;

  4. can “break” the software at any time;

  5. increases the size of the software deliverable;

  6. can be a source of anomalies.

Not every dependency has all of these disadvantages, and this list is not exhaustive. Moreover, dependencies often have dependencies of their own, which multiplies the potential drawbacks.

Adding a new dependency to a piece of software can save time in the short term. However, this time saving is a debt that must always be paid back in the long term: the longer the software’s lifespan, the lower the return on investment of the dependency [TechDebt].

Scientific projects, however, should in most cases be designed for the long term. As a result, all their software components should pay the utmost attention to dependency management.

direct or indirect dependencies

All the disadvantages listed above apply to a project’s dependencies, but also to the dependencies… of its dependencies. And things can escalate quickly [1] [2].

If a project depends on even a single dependency, and that dependency has a complex dependency graph, not only is your project far more likely to suffer from the disadvantages listed, but you may also be unable to work around them, since you have no control over the life cycle of your dependency!
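To make the escalation concrete, here is a minimal sketch that counts the indirect dependencies hidden behind a single direct one. The graph and all package names (`my-project`, `lib-a`, ...) are hypothetical and only serve as an illustration.

```python
from collections import deque

# Hypothetical dependency graph: package name -> its direct dependencies.
GRAPH = {
    "my-project": ["lib-a"],
    "lib-a": ["lib-b", "lib-c"],
    "lib-b": ["lib-d"],
    "lib-c": ["lib-d", "lib-e"],
    "lib-d": [],
    "lib-e": [],
}


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue
        seen.add(package)
        queue.extend(graph.get(package, []))
    seen.discard(root)  # a cyclic graph could lead back to the root
    return seen


direct = set(GRAPH["my-project"])
indirect = transitive_dependencies(GRAPH, "my-project") - direct
print(len(direct), "direct,", len(indirect), "indirect")
```

In this toy graph, one declared dependency brings in four more packages that the project never chose explicitly.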

internal or external dependencies

This section discusses the disadvantages of having too many dependencies external to the project, i.e. dependencies not directly managed by the members of the project.

Conversely, a project can be developed in a modular way. In this case, some of its components depend on other components that are internal to the project. This offers a number of advantages, which are described in the modularity chapter.

“dev” dependencies

Some dependency declaration files [3] distinguish between main dependencies and devDependencies. The difference is that “dev” dependencies are not essential for the software to fulfill its functional scope once it is built.

In particular, “dev” dependencies include:

  • tools used to build the software itself (e.g. build tools);

  • tools used to check software quality (e.g. test engines, linters, …);

  • tools for generating software documentation;

  • tools to make life easier for developers (e.g. generators, fixtures, configurations, …).
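As an illustration of this distinction, the following sketch reads a manifest written in the style of an npm package.json file and separates the two groups. The manifest content and package names are hypothetical; only the `dependencies` and `devDependencies` field names come from the npm format.

```python
import json

# Hypothetical manifest in the style of an npm package.json file.
MANIFEST = """
{
  "name": "example-project",
  "dependencies": {"lib-a": "^2.0.0"},
  "devDependencies": {"test-runner": "^5.1.0", "linter": "^9.0.0"}
}
"""


def split_dependencies(manifest_text):
    """Return (runtime, dev_only) lists of dependency names from a manifest."""
    manifest = json.loads(manifest_text)
    runtime = sorted(manifest.get("dependencies", {}))
    dev_only = sorted(manifest.get("devDependencies", {}))
    return runtime, dev_only


runtime, dev_only = split_dependencies(MANIFEST)
print("needed once the software is built:", runtime)
print("needed only to build, test or document it:", dev_only)
```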

The same precautions apply to “dev” dependencies as to others. This means that you shouldn’t just declare any new dependency, even under the pretext that it’s a “dev” dependency. Everyone should be able to contribute with the tools they favor, without being overly constrained by the practices of others.

Sharing techniques is profitable and worthy of encouragement. However, everyone needs to be able to differentiate between what is optional and what is not.

Incidentally, being able to “do without” when needed is also a guarantee of sustainability.

Not Invented Here

This goal of minimizing the number, the size and the importance of external dependencies should not be interpreted as evidence of NIH syndrome [NIH].

The aim is not to redevelop everything yourself, or to neglect the benefits of shared development. The challenge is to think carefully about the management of technical debt [TechDebt], before contracting it.

In practice, this can be summed up by a series of questions:

  1. Do I really need this functionality?

  2. Do I really need to implement it this way?

  3. Do I really need this dependency, or just a subset of its functional scope?

  4. Is there a library that offers only those features I need?

And so on…

practical implications

All software presented here pays the utmost attention to dependency management.

In the best case, a piece of software has no dependencies.

Otherwise, the use of each of its dependencies:

  • must be justified: that is, the dependency must have a real benefit, and this benefit must be all the greater if the dependency is large or poses a risk to the project’s long-term sustainability;

  • must be declared explicitly and in accordance with the standards of the technology used [3];

  • must be justified in the same place as its declaration [4], i.e.:

    1. what actual benefit does the dependency offer for the project?

    2. where is the documentation for the dependency?

    3. what are the alternatives, and why aren’t they preferable?

  • must be as flexible as possible in terms of accepted versions: at best, all versions of the dependency (and at worst, at least the most recent version) must work correctly with the software. In particular, the dependency must not be “frozen” at a specific version, even a recent one.
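The last rule can be checked automatically. The sketch below flags version constraints that accept only one version. It is a simplified heuristic that assumes npm-style constraint strings, not a full version-range parser, and the package names and constraints are hypothetical.

```python
import re

# Simplified heuristic: a constraint made only of "1.2.3" or "=1.2.3"
# (optionally with a pre-release or build suffix) accepts a single version.
EXACT_VERSION = re.compile(r"^=?\s*v?\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$")


def frozen_dependencies(constraints):
    """Return the names of dependencies pinned to one exact version."""
    return sorted(
        name
        for name, constraint in constraints.items()
        if EXACT_VERSION.match(constraint.strip())
    )


# Hypothetical declarations: package name -> version constraint.
CONSTRAINTS = {
    "lib-a": ">=2.0.0",
    "lib-b": "1.4.2",
    "lib-c": "*",
    "lib-d": "=3.0.1",
    "lib-e": "^1.2.0",
}

print(frozen_dependencies(CONSTRAINTS))
```

Here only `lib-b` and `lib-d` are reported, since every other constraint accepts a range of versions.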

Dependencies versions

The fact that the components of a project do not wish to “freeze” any of their dependencies in a specific version, even a recent one, may seem to run counter to the idea of a stable, reproducible software environment.

Version “freeze” tools have two objectives:

  1. to stabilize software development and maintenance, so that every member of the development team, and every production target, has the same software environment;

  2. to make the software and its environment reproducible.

These two objectives are based on the assumption that the software life cycle is divided into two phases: an active development phase, and an inactive phase, often much longer than the development phase.

This assumption does not hold if the software is continuously under development over a very long period of time, i.e. long enough for the versions of your dependencies that were the latest when you started development to be obsolete today.

But then, why does the software presented here choose not to do “what everyone else does”, i.e. update previously “frozen” dependencies from time to time, in line with the project schedule?

The answer is that, with such an approach, all dependencies tend to “break” at the same time, and such a situation can quickly become hellish to fix. Such time-consuming and risky tasks require an investment in human resources that research projects generally just don’t have.

So, if a component must break, it’s best that it does so in a spectacular, localized and quickly detectable way, so that the issue can be addressed quickly.

Consequently, you should never see any package-lock.json, pnpm-lock.yaml, composer.lock or similar lock file at the root of any of our software projects. If you are contributing to any of them (thank you! ᵔᴥᵔ), make sure that the lock files generated locally don’t prevent you from upgrading packages on your dev workstation often.
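A check of this kind is easy to automate, for example before a commit. The sketch below looks for lock files at the root of a project directory. The list of file names is an assumption limited to the tools named above (pnpm-lock.yaml being the name pnpm gives its lock file), and the demonstration uses a throwaway temporary directory.

```python
import tempfile
from pathlib import Path

# Lock file names for the tools mentioned in the text (assumed list, not exhaustive).
LOCK_FILES = ("package-lock.json", "pnpm-lock.yaml", "composer.lock")


def find_lock_files(project_root):
    """Return the names of known lock files found at the project root."""
    root = Path(project_root)
    return sorted(name for name in LOCK_FILES if (root / name).is_file())


# Demonstration on a temporary directory standing in for a project root.
with tempfile.TemporaryDirectory() as demo_root:
    (Path(demo_root) / "package.json").write_text("{}")
    (Path(demo_root) / "composer.lock").write_text("{}")
    print(find_lock_files(demo_root))
```

An empty result means that no known lock file would be committed along with the project.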

  • Our minimum reproducible environment is everything in its latest (stable) version.

  • Our maximum reproducible environment is everything in all (stable) versions.

If a dependency is unable to provide its functionality in its latest stable version, maybe it is time to get rid of it and choose a more stable alternative…

References