So, You Blew Up Your Drupal 8 Site - Now What?

Submitted by Kevin on Wed, 01/24/2018 - 18:14

Sooner or later, it's going to happen.  You'll apply a module update or do some other under-the-hood task and the next thing you know, your Drupal 8 site is no longer responding (often called whitescreening).  While whitescreening is a good security practice (as opposed to the site spewing error messages to the browser that might reveal sensitive details), it's a royal pain when you just want your site to work and you have no idea what's wrong or where to start with fixing it.

What follows are some tips and advice that might get you back in action again without (completely) losing your sanity.

Backup, Backup, Backup (and Restore)

I'll go ahead and get this one out of the way, even though it may be too late at this point to be of any help.  In the future, always have an automatic backup generation plan in place.  The typical advice is to always make a backup before doing anything dangerous to your site, like applying patches and updates, but it's often when you don't think your action could be very dangerous that it ends up frying the whole site.

Many hosting services come with backup systems, but you have to make sure the system is turned on and configured to create backups on a regular basis.  For sites that change daily, daily backups are a good idea.  For sites that change less frequently, you might be okay with weekly backups instead.  Either way, test the system to make sure it's actually generating usable backup files (and make sure you're getting both the filesystem and the database, too).  After the first scheduled run, locate these backup files, copy them somewhere else and try to bring up a copy of your site with them.  Make sure that process works before going off and assuming you're going to have good backups when you need them.
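As a rough illustration of the "make sure you're getting both" check, here is a minimal Python sketch.  The function name, the thresholds and the example paths are my own assumptions, not part of any hosting tool.  It only confirms that both files exist, aren't empty and aren't stale, so it doesn't replace actually restoring a copy of the site from them.

```python
import os
import time

def check_backup(files_archive, db_dump, max_age_hours=24, min_bytes=1):
    """Return a list of problems found with a backup pair (empty list = looks OK)."""
    problems = []
    now = time.time()
    pairs = (("filesystem archive", files_archive), ("database dump", db_dump))
    for label, path in pairs:
        if not os.path.isfile(path):
            problems.append(f"{label} missing: {path}")
            continue
        size = os.path.getsize(path)
        if size < min_bytes:
            problems.append(f"{label} is too small ({size} bytes): {path}")
        # Age is based on the file's last-modified time.
        age_hours = (now - os.path.getmtime(path)) / 3600
        if age_hours > max_age_hours:
            problems.append(f"{label} is {age_hours:.0f} hours old: {path}")
    return problems

if __name__ == "__main__":
    # Hypothetical paths; substitute wherever your host writes its backups.
    for problem in check_backup("/backups/site-files.tar.gz", "/backups/site-db.sql.gz"):
        print(problem)
```

For a weekly backup schedule, you'd pass something like max_age_hours=24 * 8 so the check doesn't complain between runs.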

If you follow this advice, then hopefully the worst you ever have to do is just wipe out the production site and reload the last backup into its hosting space.  You might lose a few hours or days of updates, but that's a lot better than a few months' (or years') worth of updates.

Expert Tip: Even when restoring a backup after a site failure, never completely delete a broken site or its database - always save a copy somewhere, as you might be able to recover it in a development workspace eventually and/or be able to dig through the database to recover recent changes and copy them by hand into the restored website.

Recovery Sans Backups

The first step to recovery without backups is to find your web server logs and study them, both of which could be challenging steps.  If you are on a commercial standard hosting platform, such as Plesk (used by the Office of Information Technology) or cPanel, there's likely a tool in your hosting control panel to let you view the raw web server logs for your site.  If you are on something non-standard, you'll have to dig around and figure out where the logs are located.

Expert Tip: If you are on a non-standard hosting system, go ahead now and figure out where your logs are located, so you don't have to scramble to find them when a site crash happens.

Reading the logs hasn't gotten any easier with Drupal 8, as version 8 is built on Symfony, which spits out long backtrace logs, much of which is usually irrelevant to figuring out what's causing the site to whitescreen.  You'll have to skim through all of that output for keywords (usually found at the top of each backtrace) that point you to a specific module or core Drupal component.  Especially be watchful for errors relating to whatever components you might have been tinkering with when the problem occurred.  Even then, the error messages could be kind of cryptic.
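If the log is too long to skim by eye, a small script can do the first pass.  This is only a sketch: the keyword list and the assumption that module code lives under a "modules/" directory (optionally "modules/contrib/" or "modules/custom/") are mine, and your log format may need different patterns.

```python
import re
from collections import Counter

# Phrases that commonly mark the first line of a PHP failure (an assumed list).
KEYWORDS = ("PHP Fatal error", "Uncaught", "Exception", "Error:")

# Captures the module name from paths such as modules/foo/ or modules/contrib/foo/.
MODULE_PATH = re.compile(r"modules/(?:contrib/|custom/)?([a-z0-9_]+)/")

def summarize_log(lines):
    """Return (headline lines, Counter of module names seen on those lines)."""
    headlines = []
    modules = Counter()
    for line in lines:
        if any(keyword in line for keyword in KEYWORDS):
            headlines.append(line.rstrip("\n"))
            modules.update(MODULE_PATH.findall(line))
    return headlines, modules

if __name__ == "__main__":
    # Made-up log lines, just to show the shape of the result.
    demo = [
        "PHP Fatal error:  Uncaught Error in /var/www/modules/contrib/example_mod/src/Foo.php:12",
        "#0 /var/www/core/lib/Drupal/Core/Something.php(34): foo()",
    ]
    headlines, modules = summarize_log(demo)
    print(headlines)
    print(modules.most_common())
```

To run it against a real log, open the file with open(path, errors="replace") and pass the file object in as lines.  The module names that come up most often are a reasonable place to start looking.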

If the web server logs don't provide enough detail, you can try looking at the Drupal logs to see if they offer more insight.  You might be asking, "How can I look at the Drupal logs when I can't access my site?"  Well, there are a couple of things you can try here:

  1. See if you can directly access the admin pages of your site (go to your site's URL with "/admin" appended).  If you are lucky, the problem you are encountering will only affect public pages of the site and not the admin pages, allowing you to dig around in the admin controls and use them to identify and possibly fix the problem.
  2. If you can't get to the admin pages, you can use your hosting account's control panel to dig into the site's database (usually through phpMyAdmin).  Look in the watchdog table to find the internal logs.  This table is actually fairly readable except for the timestamps, which are in UNIX epoch format.  However, since everything comes in chronological order, you can simply skip to the end of the table to see the most recent messages, which are likely to be the ones that will help you the most.
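The UNIX epoch timestamps in the watchdog table are just seconds since January 1, 1970 (UTC), so they're easy to convert.  Here's a minimal Python sketch; the function name is my own, and the sample value is simply this post's publication time read as UTC.

```python
from datetime import datetime, timezone

def epoch_to_utc(timestamp):
    """Convert a UNIX epoch value (in seconds) to a readable UTC string."""
    moment = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")

print(epoch_to_utc(1516817640))  # prints "2018-01-24 18:14:00 UTC"
```

If you'd rather stay inside phpMyAdmin, MySQL's FROM_UNIXTIME() function does the same conversion in a query, though it reports the time in the database session's time zone rather than UTC.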

Expert Tip: To get the most out of your phpMyAdmin view, look for the "+Options" control above the header row of table output and open it up.  Select the "Full texts" and "Show BLOB contents" options and then select the "Go" button.  This will let you see a lot more detail without having to open up each record individually.

Dealing with Module Failure

A common problem that causes whitescreening is a module update failing.  The first thing to try is to simply remove the module's files from your site's /modules/ directory and replace them with the last known working version.  This may solve the problem until you can get an update that actually works.  Just make a big note to yourself to not try to apply the update again until a newer version has come out and you've tested it in a development space.

In some cases, though, a module update that fails midstream can leave the database out-of-sync, and simply reverting to the older version won't get the site working.  The next thing to try is to just remove the module's directory completely from your site and see if the site starts working then.  If so, then you can try flushing all caches (and running cron as well for good measure), then back up the site, and finally try installing the latest version of the module again.  This may or may not work, depending on what residue was left behind in the database and whether or not the module's install procedure can deal with that residue without freaking out.

Do note that Drupal will not react well to physically removing the directory for any module that adds blocks, fields, field formatters, or other modular components.  That's because there will be pointers in the database to the module code that supports these objects, and Drupal wasn't written to gracefully deal with this code going missing.  While it's not always impossible to recover from problems with such a module, you will have to do more digging into the database to remove references (see "When All Hope is Lost" below), and even then you may not get your site back up.  If you do, then you'll find key elements missing:  in the case of a module providing a custom field type, you'd be missing all of the data connected with that field type in your nodes, taxonomy entries, user records, or anything else that used fields of that type.

Disabling Modules in the Database

I'd only recommend doing the following steps after you've backed up the broken site's filesystem and database, so that you can undo these steps if they don't make things better.

An additional step to try is disabling the module in the database.  This used to be pretty easy in earlier versions of Drupal, but in Drupal 8, it's (of course) gotten more complex.  You'll need to look in the config table for the core.extension record.  The value of this record is a PHP serialized hash array.  You have to extract that value, modify it, and insert the modified version.  This is not for the faint of heart - only seasoned PHP programmers should try this!  The safest approach is to write your own little PHP script with the serialized value pasted into it as a constant or variable value.  Run the unserialize() function against it, delete the module's key from the hash array (it sits in the nested "module" array), then serialize() it again.  Braver souls can try editing the serialized code by hand, but you have to make sure you decrement the element count of the array you removed the key from, or the serialized code won't process correctly.
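To show why that count matters, here is a sketch of the round trip in Python.  It's an illustration of the format only, handling just the arrays, strings, integers, booleans and nulls I'd expect in this record.  For a real repair, PHP's own unserialize() and serialize() remain the safer tools.  The function names and the sample value (including the module name broken_mod) are made up.

```python
def php_unserialize(data: bytes):
    """Parse a subset of PHP's serialize() format into Python values."""
    value, pos = _parse(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after serialized value")
    return value

def _parse(data, pos):
    kind = data[pos:pos + 1]
    if kind == b"N":                        # N;
        return None, pos + 2
    if kind in (b"i", b"b"):                # i:123;  or  b:1;
        end = data.index(b";", pos)
        number = int(data[pos + 2:end])
        return (bool(number) if kind == b"b" else number), end + 1
    if kind == b"s":                        # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                   # skip the colon and opening quote
        return data[start:start + length].decode("utf-8"), start + length + 2
    if kind == b"a":                        # a:<count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                     # skip the colon and opening brace
        result = {}
        for _ in range(count):
            key, pos = _parse(data, pos)
            result[key], pos = _parse(data, pos)
        if data[pos:pos + 1] != b"}":
            raise ValueError("array count does not match its contents")
        return result, pos + 1
    raise ValueError(f"unsupported type at offset {pos}")

def php_serialize(value) -> bytes:
    """Serialize the same subset back; counts and lengths are recomputed."""
    if value is None:
        return b"N;"
    if isinstance(value, bool):             # test bool first: bool is a subclass of int
        return b"b:%d;" % int(value)
    if isinstance(value, int):
        return b"i:%d;" % value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return b's:%d:"%s";' % (len(raw), raw)
    if isinstance(value, dict):
        body = b"".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return b"a:%d:{%s}" % (len(value), body)
    raise TypeError(f"unsupported type: {type(value).__name__}")

def disable_module(serialized: bytes, module_name: str) -> bytes:
    """Drop one module from the nested "module" array and re-serialize."""
    config = php_unserialize(serialized)
    config["module"].pop(module_name, None)
    return php_serialize(config)

# A made-up, heavily shortened value in the shape of core.extension.
sample = (b'a:2:{s:6:"module";a:2:{s:4:"node";i:0;s:10:"broken_mod";i:0;}'
          b's:5:"theme";a:1:{s:6:"bartik";i:0;}}')
print(disable_module(sample, "broken_mod").decode("utf-8"))
```

Notice that in the output the inner array header changes from a:2 to a:1.  That's the count you'd have to fix yourself when editing the serialized text by hand.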

If you are familiar with the module and the data it puts into the database (whether in the config table or in other tables), you can also try manually deleting that data.

When All Hope is Lost, But You Have an Old Backup

Certain types of failures may simply be unfixable without having extreme expert knowledge of the underpinnings of Drupal 8 and the Symfony framework.  A good example is an upgrade failure related to a module that provides a custom field type, as mentioned above.  Field types are so integral to Drupal 8 that getting the system running again when the internal pointers get scrambled is nearly impossible.  This is why Drupal won't let you uninstall these modules as long as you have any fields defined in your system using the provided custom field type.

If you can't find any way to get the broken site working again and feel you need to start over (whether with a really old copy from when you first launched the site in Drupal 8, or starting completely from scratch with a new instance of Drupal 8) you might be able to at least recover a good bit of content from the broken site.

First, copy the broken site to a development space and/or make a full backup of it, so that you can always start over at the current state of the site if your experimentation doesn't work out so well.

Next, go into the database for this copy of the site and delete every table with the prefix "cache_".  This is the equivalent of flushing all of your caches, and if the site just has bad cached config data, this might get it working again.
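If there are a lot of cache tables, you can generate the statements rather than clicking through each one.  This is a sketch with a made-up function name, and note that I've swapped in TRUNCATE, which empties each table but leaves it in place, as a slightly more cautious stand-in for deleting the tables outright.  The table names would come from your own database, for example from the table list in phpMyAdmin.

```python
def cache_table_statements(table_names, prefix="cache_"):
    """Build one TRUNCATE statement per table whose name starts with the prefix."""
    statements = []
    for name in sorted(table_names):
        if name.startswith(prefix):
            # Backticks quote the identifier in MySQL.
            statements.append(f"TRUNCATE TABLE `{name}`;")
    return statements

if __name__ == "__main__":
    # Example table names only; use the real list from your database.
    for statement in cache_table_statements(["node", "cache_render", "cache_config"]):
        print(statement)
```

Paste the output into the SQL tab of phpMyAdmin, and only ever on the copy of the site, never on your one remaining original.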

If that doesn't work, go back into the database and delete anything related to the problem area.  Specifically, for a failure related to a custom field type module, delete every record from the config table that relates to a field built on that type, including (where field_name is the machine name of each affected field):

  • field.field.*.field_name
  • field.storage.*.field_name
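To find the matching records without scrolling through the whole config table, you can run wildcard patterns against the list of config names.  This sketch uses Python's fnmatch module.  The function name, the field name field_rating and the sample config names are all hypothetical, and the patterns assume the config names end with the field's machine name.

```python
from fnmatch import fnmatchcase

def matching_config_names(config_names, patterns):
    """Return the config names that match at least one wildcard pattern."""
    return [name for name in config_names
            if any(fnmatchcase(name, pattern) for pattern in patterns)]

if __name__ == "__main__":
    # Hypothetical names, as they might appear in the config table's name column.
    names = [
        "field.field.node.article.field_rating",
        "field.storage.node.field_rating",
        "field.field.node.article.body",
        "system.site",
    ]
    patterns = ["field.field.*.field_rating", "field.storage.*.field_rating"]
    for name in matching_config_names(names, patterns):
        print(name)
```

Review the list before deleting anything.  A wildcard is a blunt instrument, and you want to be sure every match really belongs to the broken field.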

That might be enough (combined with deleting the module's files and disabling the module in the database) to get your copy of the broken site working enough to extract any recent updates from it.  Of course, data that relates to the now deleted field will not be accessible, but you can dig for it in the database, and everything else you can cut-and-paste from the broken copy of the site to wherever you are rebuilding the site.

If you're rebuilding using an out-of-date backup of the site, to help with identifying what has changed, you can compare the node_revision tables of the broken and restored sites.  Look for all of the new entries at the end of that table in the broken site and pull up each page as /node/## where '##' is the value from the nid column in the table.  These are the pages that have been changed since the date the restored site's backup files were made.  You'll still have to figure out what's changed on each page, but at least you won't have to visit every single page in the site.  For long pages, you can try copying and pasting the page content into local files and running them through some sort of "diff" program to compare them and highlight the differences.
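Both halves of that comparison can be scripted.  In this sketch, you'd export the nid and vid columns of node_revision from each database (as CSV, say) and load them as pairs.  The function names and sample values are my own.  The second function uses Python's built-in difflib as the "diff" program.

```python
import difflib

def changed_node_ids(broken_rows, restored_rows):
    """Return sorted nids with revisions in the broken site that the restored site lacks.

    Each row is a (nid, vid) pair taken from the node_revision table.
    """
    known_vids = {vid for _nid, vid in restored_rows}
    return sorted({nid for nid, vid in broken_rows if vid not in known_vids})

def page_diff(restored_text, broken_text):
    """Return a unified diff of two copies of a page's content."""
    return "\n".join(difflib.unified_diff(
        restored_text.splitlines(), broken_text.splitlines(),
        fromfile="restored", tofile="broken", lineterm=""))

if __name__ == "__main__":
    # Made-up rows: node 1 gained a revision and node 4 is new.
    broken = [(1, 1), (2, 2), (1, 3), (4, 5)]
    restored = [(1, 1), (2, 2)]
    print(changed_node_ids(broken, restored))
    print(page_diff("Intro\nOld hours", "Intro\nNew hours"))
```

In the diff output, lines starting with "-" exist only in the restored copy and lines starting with "+" exist only in the broken copy, so the "+" lines are the ones you need to carry over.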

Important:  Don't forget to look for newly added pages as well.  Also, be aware that if revisioning is not enabled, then the node_revision table won't help much, but you can check the changed field in the node_field_data table to see which nodes were changed recently.  And, of course, this process only covers dealing with content – if other areas of the site have changed (views, taxonomies, user profiles, etc), then you'll have to track down and copy over those changes separately.

This is obviously a labor-intensive way of restoring a site, but it might be your only option if you don't have a recent backup and the site is important to the stakeholders.

As a Last Ditch Last Resort: Internet Archive

When you have nothing else to work with for a public facing site that seems to be completely unrecoverable, don't forget to check the Internet Archive to see if the site has been scanned in the past.  Available scans are usually at least six months out of date, but if you've gotten to this point, anything is probably better than nothing.  A lot of sites end up getting scanned by Internet Archive if they are public facing and other well known sites link to them.  In this case, you're probably looking at building a whole new site from scratch, but if you can recover some usable content from the Internet Archive, that's content you don't have to have someone rewrite or reconstruct for the site.