Drupal’s increasing complexity is becoming a turnoff for developers

August 10, 2011

I’ve been developing custom applications with Drupal for three years: a little with 4.7 and 5, primarily with 6, and more recently with 7. Lately I’ve become concerned with the trend in Drupal’s code base toward increasing complexity, which I believe is becoming a danger to Drupal’s adoption.

In general, when writing code, a solution can solve the scenario in front of us right now, or it can try to account for future scenarios in advance. I’ve seen this referred to as N-case or N+1 development. N-case code is efficient but not robust; N+1 code is abstract and complex, and theoretically offers an economy of scale, letting more be done with less code/work. In practice, it also shifts the burden: as non-developers want the code to accommodate more use cases, the developers write more code, with more complexity and abstraction.

Suppose you want to record a date with a form and save it to a database. You’d need an HTML form, a timestamp (integer) field in your schema, and a few lines of code. Throw in a stable jQuery date popup widget and you have more code but not much more complexity. Or you could imagine every possible date permutation, all theoretically accessible to non-developers, and you end up with the 14,673 lines in Drupal’s Date module.
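To make the simple end of that spectrum concrete, here is a minimal sketch in plain Python (not Drupal code) of the N-case approach: one integer timestamp column and a few lines to parse and save a submitted date. The table name `events`, the column `event_date`, the function `save_date`, and the `YYYY-MM-DD` input format are all assumptions made up for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def save_date(conn, date_string):
    """Parse a YYYY-MM-DD string and store it as a UTC Unix timestamp."""
    # Hypothetical input format; a real form would validate this first.
    parsed = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    timestamp = int(parsed.timestamp())
    conn.execute("INSERT INTO events (event_date) VALUES (?)", (timestamp,))
    conn.commit()
    return timestamp


# In-memory database with a single integer column for the date.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, event_date INTEGER NOT NULL)"
)
stored = save_date(conn, "2011-08-10")
```

This handles exactly one case (a single date, in one format, in UTC), which is the point: it is easy to read and easy to change, and every additional case it has to anticipate adds code.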

Drupal is primarily a content management system, not simply a framework for efficient development, so it needs to account for the myriad use cases of non-developer site builders. This calls for abstracting everything into user interfaces, which takes a lot of code. However, there needs to be a countervailing force in the development process, pushing back against increasing abstraction (in the name of end-user simplicity) for the sake of preserving underlying simplicity. In other words, there is an inherent tension in Drupal (like any big software project) between keeping the UI both robust and simple, and keeping the code robust and simple - and increasingly Drupal, rather than trying to maintain a balance, has tended to sacrifice the latter.

User interfaces are one form of abstraction; N+infinity APIs - which I’m more concerned with - are another, and they particularly increase underlying complexity. Drupal has a legacy code base built on partly outdated assumptions, and developers adding new functionality have to make a choice: rewrite the old code to be more robust but less complex, or add abstraction layers on top? The latter takes less time but easily creates a mess. For example: Drupal 7 tries to abstract nodes, user profiles, actions, etc. into “entities” and attach fields to any kind of entity. Each of these still has its legacy ID, but now there is an additional layer in between tying these “entity IDs” to their types, and then another layer for “bundles,” which apply to some entity types but not others. The result from a development cycle perspective was a Drupal 7 release that, even delayed a year, lacked components of the Entity system in core (they moved to “contrib”). The result from a systems perspective is an architecture with more layers than would make sense if it were built from scratch. Why not, for example, have everything be a node? Content as nodes, users as nodes, profiles as nodes, etc. The node table would need to lose legacy columns like “sticky” - they would become fields - and some node types like “user” might need fixed meanings in core. Then three structures get merged into one, and the system gets simpler without compromising flexibility.
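As a rough illustration of the “everything is a node” idea, here is a minimal sketch in plain Python (not Drupal code, and not how Drupal stores anything). The `Node` class, the `nodes_of_type` helper, and the field names `mail`, `user_nid`, and `author_nid` are hypothetical; the only point is that one structure with a type and a bag of fields can hold content, users, and profiles, with “sticky” demoted from a column to an ordinary field.

```python
from dataclasses import dataclass, field
from itertools import count

_next_nid = count(1)  # stand-in for an auto-increment primary key


@dataclass
class Node:
    """One structure for content, users, and profiles alike."""
    node_type: str
    title: str
    fields: dict = field(default_factory=dict)  # "sticky" and friends live here
    nid: int = field(default_factory=lambda: next(_next_nid))


def nodes_of_type(nodes, node_type):
    """Return only the nodes of the given type."""
    return [n for n in nodes if n.node_type == node_type]


# A user, a profile, and an article all share one ID space and one shape.
user = Node("user", "alice", {"mail": "alice@example.com"})
profile = Node("profile", "Alice's profile", {"user_nid": user.nid})
article = Node("article", "Hello world", {"author_nid": user.nid, "sticky": True})
all_nodes = [user, profile, article]
```

References between things become ordinary fields holding another node’s ID, so there is no separate layer mapping entity IDs to entity types. Whether this would scale to everything core needs is a separate question; the sketch only shows how small the underlying model could be.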

I recently tried to programmatically use the Activity module - which used to be a simple way to record user activity - and had to “implement” the Entities and Trigger APIs to do it, requiring hundreds of lines of code. I gave up on that approach and instead used the elegant core Watchdog logging system - which, with a simple custom report pulling from the existing log, produced the same end-user effect as Activity with a tiny fraction of the code and complexity. The fact that Views doesn’t natively generate Watchdog reports and Rules doesn’t report to Watchdog as an action says a lot, I think, about the way Drupal has developed over the last few years.

On a Drupal 7 site I’m building now, I’ve worked with the Node API, Fields API, Entities API, Form API, Activity API, Rules API, Token API… I could have also worked with the Schema, Views, Exportables, Features, and Batch APIs, and on and on. The best definition I’ve heard for an API (I believe by Larry Garfield at DrupalCon Chicago) is “the wall between two systems.” In a very real way, rather than feeling open and flexible, Drupal’s code base increasingly feels like it’s erecting barriers and fighting with itself. When it’s necessary to write so much code for so many APIs to accomplish simple tasks, the framework is no longer developer-friendly. The irony is that the premise of that same DrupalCon talk was the way APIs create “power and flexibility” - but that power has come at great cost to the developer experience.

I’m aware of all these APIs under the hood because I’ve seen them develop for a few years. But how is someone new to Drupal supposed to learn all this? (They could start with the Definitive Guide to Drupal 7, which sounds like a massive tome.) Greater abstraction and complexity lead to a steeper learning curve. Debugging Drupal - which requires “wrapping your head” around its architecture - has become a Herculean task. Good developer documentation is scarce because it takes so much time to explain something so complex.

There is a cycle: the code gets bigger and harder to understand; the bugs get more use-case-specific and harder to nail down; the issue queues get bloated; the developers have less time to devote to code quality improvement and big-picture architecture decisions. But someone wants all those use cases handled, so the code gets bigger and bigger and harder to understand… As of this writing, Drupal core has 9166 open issues, the Date module has 813, and Rules has 494. Queues that big need a staff of dozens to manage effectively, and even if those resources existed, the business case for devoting them can’t be easy to make. The challenge here is not simply in maintaining our work; it’s in building projects from the get-go that aren’t so complicated as to need endless maintenance.

Some other examples of excessive complexity and abstraction in Drupal 7:

  • Field Tokens. This worked in Drupal 6 with contrib modules; to date, it can’t be done in Drupal 7. The APIs driving all these separate systems have gotten so complex that either no one knows how to do this anymore, or the architecture doesn’t allow it.
  • The Media module was supposed to be an uber-abstracted API for handling audio, video, photos, etc. As of a few weeks ago, basic YouTube and Vimeo integration didn’t work. The parts of Media that did work (sponsored largely by Acquia) didn’t conform to long-standing Drupal standards. Fortunately there were workarounds for the site I was building, but their existence is a testament to the unrealistic ambition and excessive complexity of the master project.
  • The Render API, intended to increase flexibility, has compounded the old problem in Drupal of business logic being spread out all over the place. The point in the flow where structured data gets rendered into HTML strings isn’t standardized, so knowing how to modify one type of output doesn’t help with modifying another. (Recently I tried to modify a date_select field at the code level to show the date parts in a different order - as someone else tried to do a year ago - and gave up after hours. The solution ended up being in the UI - so the end-user was given code-free power at the expense of the development experience and overall flexibility.)

Drupal 8 has an “Initiatives” structure for prioritizing effort. I’d like to see a new initiative, Simplification: Drupal 8 should have fewer lines of code, fewer APIs, and fewer database tables than Drupal 7. Every component should be re-justified and eliminated if it duplicates an existing function. And the Drupal 8 contrib space should follow the same principles. I submit that this is more important than any single new feature that can be built, and that if the codebase becomes simpler, adding new features will be easier.

A few examples of places I think are ripe for simplifying:

  • The Form API has too much redundancy. #process handlers are a bear to work with (try altering the #process flow of a date field) and do much the same as #after_build.
  • The Render API now has hook_page_build, hook_page_alter, hook_form_alter, hook_preprocess, hook_process, hook_node_view, hook_entity_view, (probably several more for field-level rendering), etc. This makes understanding even a well-architected site built by anyone else an enormous challenge. Somewhere in that mix there’s bound to be unnecessary redundancy.

Usable code isn’t a luxury; it’s critical to attracting and keeping developers in the project. I saw a presentation recently on Rapid Prototyping and it reminded me how far Drupal has drifted from being able to do anything like that. (I don’t mean the rapid prototype I did of a job listing site - I mean application development, building something new.) The demo included a massive data migration accomplished with four lines of JavaScript in the MongoDB terminal; by comparison, I recently tried to change a dropdown field to a text field (both identical strings in the database) and Drupal told me it couldn’t do that because “the field already had data.”

My own experience is that Drupal is becoming more frustrating and less rewarding to work with. Backend expertise is also harder to learn and find (at the last meetup in Boston, a very large Drupal community, only one other person did freelance custom development). Big firms like Acquia are hiring most of the rest, which is great for Acquia, but skews the product toward enterprise clients, and increases the cost of development for everyone else. If that’s the direction Drupal is headed - a project understood and maintained only by large enterprise vendors, for large enterprise users, giving the end-user enormous power but the developer a migraine - let’s at least make sure we go that way deliberately and with our eyes open. If we want the product to stay usable for newbie developers, or even people with years of experience - and ultimately, if we want the end-user experience to work - then the trend has to be reversed toward a better balance.