Case Study: A National Law Journal

It crashed every Friday morning.

About a decade ago, several aeons in Internet time, my company had the privilege of designing and building a new website for a preeminent national legal journal. We built a good site. It even won a Webby. But it crashed every Friday morning.

It took us longer than it should have to figure out why the site was crashing at the end of every week. If we’d had access to the Web Reliability Framework (aka WR) back then, we probably would have solved the problem very quickly. But how can we know for sure?

Let's go back in time ten years and use that old journal website with its weekly crashes as a case study. We’ll test the WR Framework to see if it works as well as we imagine it will to diagnose the issues and expose the right course of action to address them.


Flow

WR asserts that the most important thing about a website is flow - the flow of the customer to the site, then through the site to achieve their goal. The customer's desire to fulfill a need (their mission) is considered paramount. Because the flow of customer desire comes first in WR, we begin with empathy. WR requires that we begin by working to understand our customer's desire and put it into words. Once we are able to stand in the customer’s shoes and understand their desire empathetically, we can understand how to optimize for flow.

The primary target customer of the journal is a member of the legal profession in the U.S.: lawyers, paralegals and other members of the legal field who sell expertise. They are service providers whose service is knowledge. They must maintain that knowledge in order to serve their clients well. They are also chronically busy people. They have little time to keep their knowledge up to date, but they must nevertheless.

Customers of the journal subscribe to it in part for professional networking, but they also depend on it for staying on top of current affairs and obtaining further knowledge within the legal field. Because they are busy, they can commit to only a once-weekly journal read. And since they are busy, they appreciate receiving an alert when the weekly issue is published online. The email alert is scheduled for release on Friday mornings. Subscribers receive the alert and can immediately click through to the site, where their goal is to quickly absorb the week's information.

Fiona

Empathy comes more easily when a specific person is imagined. For this case, let’s imagine an attorney in New York. We’ll call her Fiona.

Fiona has just boarded an early morning train at the Ronkonkoma station on Long Island, headed for her office in Manhattan. She sits down next to a window, pulls out her phone, and sees an alert email from the journal site. She clicks on the link to the new issue, looking forward to quickly reading through the week's articles. Her commute is about an hour and twenty minutes. She dedicates a portion of that to reading the journal on Fridays so she can be conversant about current events with her colleagues at the office. Her desire as a customer is for the journal to inform her quickly and meaningfully within the window of time she has allotted for it.

Perfect Storm

As you may see, we already have the conditions for a perfect storm. Some of you probably already see that the website had to crash every Friday. What else could it do when hundreds of thousands of lawyers and judges hit it all at once? You can also see how frustrated and angry they were going to be. They had blocked off a specific time on a specific day of the week for the site. Their desire was boxed into a tiny space. Fulfilling it was going to take some doing.

So we now have enough of an empathetic connection to our customer's desire. We now know what it is that is going to flow into, through and out of our website. Now we need the rest of the framework.

The Board

Grab a napkin or old envelope or something. Flip it over. Draw a tic-tac-toe board on the back: 3 rows, 3 columns, a 3 x 3 matrix. Across the top of the board, above the 3 columns, write Team, Plan, and Action. Down the side, next to each of the 3 rows, write Motivation, Resistance, and Management. Above the board, at the top of the page, write the customer mission statement – their reason for coming to the website and why it matters. This is the framework.

           | Team | Plan | Action
Motivation |      |      |
Resistance |      |      |
Management |      |      |

To use the board, we’ll work our way through all of the tic-tac-toe squares, which represent the 9 attributes of WR. We’ll score each one with an X or an O. O is awesome. X is terrible.

We're asking if our customer's desire is flowing through the website. The answer is no. It's getting clogged. The website crashes and goes offline every Friday morning. This is our primary symptom. Any version of a customer's desire meeting with resistance falls within the resistance level of the framework.

Ouch, X!

So now we are clear that we have a resistance problem. We look at the board, and across the Resistance row. Which cell of the Resistance row gets the X? The Resistance row of the framework intersects with 3 columns: Team, Plan and Action. Team represents the group of people who build and maintain a website. Plan represents the strategy for how to optimize flow on that site. Action represents the actual working systems and processes that were strategized and designed by the team and put into motion. Fiona’s action is the thing that’s blocked, so we land on the Action/Resistance cell.

The website is crashing. The servers are dying every Friday morning when a multitude of lawyers hit the site at the same time. Servers live at the action layer of the WR Framework. Our core problem is here: friction at the action level. But we're probably not done. Machines are dumb. They just do what we tell them to do. Humans are always ultimately to blame in the Web Reliability Framework. We mark a big X in the Action/Resistance cell.

           | Team | Plan | Action
Motivation |      |      |
Resistance |      |      | X
Management |      |      |

Resistance

So let's see where else we can detect failures. We already identified the first issue – Fiona’s inability to complete her mission. That's Action/Resistance. But what is the cause of it? We have to look back to the plan for the server infrastructure layer and who designed it, because we find failure there as well. We look on the board where Team and Plan intersect with Resistance. We mark a big X in the Team/Resistance cell, and another big X in the Plan/Resistance cell.

           | Team | Plan | Action
Motivation |      |      |
Resistance | X    | X    | X
Management |      |      |

We have identified 3 problems with the website already, clearly defined by the WR Framework. We have a serious resistance problem, with 3 Xs filling in that row on the board. Now, what about the other 2 rows on our tic-tac-toe board, motivation and management?

Motivation

The motivation layer of the framework is concerned with incentive, the driving force that causes you to take an action. Now, you can’t know this from where you sit, but we worked with those folks, and can say from experience that the Team/Motivation cell of the framework should get an O. Everyone involved in creating this legal journal website was highly motivated to serve customer desire. There were no slackers. There were no attitude problems. Everyone was dedicated. So let’s mark a big O in the Team/Motivation cell.

           | Team | Plan | Action
Motivation | O    |      |
Resistance | X    | X    | X
Management |      |      |

The Plan/Motivation cell was also in good shape. To be clear, Plan/Motivation is not about the quality of the plan that was ultimately produced. It’s about the driving force behind the plan. Customers with desire were being engaged on a crowded internet and driven in force to the website. There was plenty of traffic from the optimal customer. We can draw another nice big O in Plan/Motivation. (Remember, Os are good.)

           | Team | Plan | Action
Motivation | O    | O    |
Resistance | X    | X    | X
Management |      |      |

Last we have Action/Motivation. This level focuses on maintaining the customer's desire to continue to engage actively all through their journey on the site. We found that the journal site did a great job of keeping customers engaged by serving them informative articles that related well to one another, so there was a desire to move from one to the next. The design was lean and clean and stayed out of the way. (Remember there was a Webby.) Customer desire was maintained very well. Write an O here as well.

           | Team | Plan | Action
Motivation | O    | O    | O
Resistance | X    | X    | X
Management |      |      |

Management

So we’ve established that the Motivation level of the framework is in good shape as far as the journal site is concerned. Now we drop down to Management. The Team/Management cell is where we make sure the team maintaining the site is aligned properly, and focused on continuously improving themselves and their capabilities over time. This is the monitoring, the validation layer. Each expert is improving their knowledge daily and each team member is working smarter every day to communicate and function better with other team members. You can draw another O in the Team/Management cell. This group, unlike some others I've seen, was quite healthy and enthusiastic about becoming better over time.

           | Team | Plan | Action
Motivation | O    | O    | O
Resistance | X    | X    | X
Management | O    |      |

There's a problem in the Plan/Management cell though. You probably already saw this one coming! Plan/Management must always include plan validation. The plan validation process ensures you ask the right questions and get answers based on real evidence. "Ok you have a grand strategy for this site. Have you tested the strategy before implementing it? Have you monitored your own work objectively? You’re about to build some grand apparatus. Is it going to fall over? Is it sustainable with the available resources? Are you in touch with reality?"

Plan/Management - For Shame!

As we have already figured out, the web strategy was not properly validated. Our team moved forward without knowing that the journal staff planned to send an email blast to its entire customer base every Friday morning to drive traffic to the site. If we’d had proper strategy validation we would have uncovered this early in the planning process, and gotten very nervous about such a huge traffic spike concentrated into such a small window of time. We would have seen that the Plan/Motivation was going to generate a problem at the Action/Resistance level. But we didn’t, and failure resulted. Draw an X in the Plan/Management cell.

           | Team | Plan | Action
Motivation | O    | O    | O
Resistance | X    | X    | X
Management | O    | X    |
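
Plan validation of this kind can start with back-of-the-envelope arithmetic. Here is a minimal sketch in Python. Every number in it is an illustrative placeholder, not a figure from the journal project, and the function name and its parameters are our own invention for this example. It estimates the average request rate when an email blast funnels readers into a short window.

```python
def surge_requests_per_second(subscribers: int,
                              click_through_rate: float,
                              pages_per_visit: float,
                              window_seconds: float) -> float:
    """Estimate average requests per second during the post-email surge.

    Assumes visits are spread evenly over the window. Real traffic is
    burstier, so the true peak is likely higher than this average.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    visits = subscribers * click_through_rate
    return visits * pages_per_visit / window_seconds


# Illustrative placeholders only (not data from the journal site).
estimate = surge_requests_per_second(
    subscribers=200_000,
    click_through_rate=0.25,
    pages_per_visit=5,
    window_seconds=30 * 60,
)
print(f"Estimated load: {estimate:.0f} requests per second")
```

Comparing an estimate like this against what the planned servers can actually sustain is the sort of question Plan/Management is supposed to force before anything gets built.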

Band-Aids Won't Fix The Problem

Now, you might be inclined to focus on erasing the O in the Plan/Motivation cell. We were. We wanted to fix the problem of driving so many customers to the website in such a small window of time. If we had done this, our server infrastructure would have been fine. But this wasn’t our choice to make. Remember the plan was a response to the customer's desire. Remember your empathy with our attorney Fiona on the train to Manhattan, who has just the length of her train ride to read the journal, plus whatever else she needs to do. That's our available window of time. We serve her. She doesn’t serve us. So we can't manipulate the motivation that brings her to the site every Friday morning. That plan is already perfect. It serves her desire perfectly. It drives an excellent flow into the site. It's not broken. It's just a problem for us to solve.

Action/Management

The Action/Management cell is where we’ll do the scoring for server uptime monitoring and traffic pattern monitoring, because it’s all about managing ongoing actions occurring on the live site. The Friday morning website crashes happened suddenly, without warning, and so fast that no monitoring would have helped. It would have been like sticking your arms out in front of you in a car accident. So the problem is not here, with monitoring and management. We’ll mark an O in this Action/Management cell. Once we fix the problem in the Action/Resistance cell, we can come back to management and have another look at it. But in truth, nothing will matter until we fix the bigger problem.

           | Team | Plan | Action
Motivation | O    | O    | O
Resistance | X    | X    | X
Management | O    | X    | O

Evaluating The Score

So let’s have a look at our WR Framework scorecard. Remember that only Xs count for scoring. The tic-tac-toe board has 3 Os along the top row, giving us a score of 0 across the whole Motivation row. Great! There’s no problem there. We can ignore that row.

The next row down is Resistance, and every cell has an X in it. Resistance gets a total score of 3, the worst!

The last row is Management. The board has an O for Team/Management, an X for Plan/Management and an O for Action/Management. The train wreck on Friday morning happened so fast that no early-warning monitoring system would have helped. It may feel like we should be marking more Management cells with an X, but remember that’s not the source of our problem. So the Management row has a total score of 1.

Now we add it up, for an overall score of 4. This is our WR flow score. The worst score is 9, so maybe a 4 isn’t really so bad. But scoring doesn’t work like that in WR! Any score at all – even just one X - is a problem. The goal of the score isn’t to tell us how relatively awesome we are. The scoring’s only purpose is to tell us where to focus our attention to improve web flow. So even with a score of 4, we understand that there’s an urgent need to drop whatever else is going on and get to work fixing the issue. Because things are still broken, and our client is still freaking out every Friday morning because her boss is furious that the website is down yet again.

           | Team | Plan | Action
Motivation | O    | O    | O
Resistance | X    | X    | X
Management | O    | X    | O
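
The tally is simple enough to do on the napkin, but it can also be written down as a small data structure. The sketch below is one possible encoding in Python. The dictionary layout and the function names are our own for this illustration and are not part of the WR Framework itself.

```python
ROWS = ("Motivation", "Resistance", "Management")
COLUMNS = ("Team", "Plan", "Action")

# The finished board from this case study: O is good, X is a problem.
board = {
    "Motivation": {"Team": "O", "Plan": "O", "Action": "O"},
    "Resistance": {"Team": "X", "Plan": "X", "Action": "X"},
    "Management": {"Team": "O", "Plan": "X", "Action": "O"},
}


def row_scores(board):
    """Count the Xs in each row. Only Xs count toward the score."""
    return {
        row: sum(1 for col in COLUMNS if board[row][col] == "X")
        for row in ROWS
    }


def flow_score(board):
    """Total number of Xs on the board (0 is best, 9 is worst)."""
    return sum(row_scores(board).values())


def cells_needing_attention(board):
    """List every X cell as Column/Row, reading the board row by row."""
    return [
        f"{col}/{row}"
        for row in ROWS
        for col in COLUMNS
        if board[row][col] == "X"
    ]


print(row_scores(board))
print(flow_score(board))
print(cells_needing_attention(board))
```

The list of cells matters more than the total, since the score exists only to tell us where to focus.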

Getting To TGIF

We know that we have customer desire flowing into and through the site. But that flow is not merely being impeded; it is being totally blocked. The biggest fail is happening at the server level in real time on Friday mornings. Now we know that this is where we must focus our priority attention.

We didn't have the framework at the time the journal site was crashing, so we muddled through this part, after realizing the source of the issue. We worked closely with the server team who managed online publication of the journal. The relationship was very positive and productive. Often there is a big X at the Team/Motivation level when it comes to IT people helping web developers, but in this case there was a big O there. Our IT friends looked into their logs and found problems with our CMS's database. It turned out, once we dove in and looked, that some specific features on the site were locking up the database.

Into the Weeds And Out Again

One of the culprits was a feature from the Action/Motivation category: a 'most popular articles' widget. This feature kept customers engaged with the site by encouraging their desire to learn and be informed by the same things that other lawyers found interesting. Every time a customer read a specific article, we would increase the count in the database. And every time we wanted to show the list of most popular articles, we would read the counts from that same database table. Because of the kind of database we were using, the writes to the table were locking it. This in turn caused processes to stack up. Consequently, that stack would get so big that the database itself would completely lock up. And when the database died, the site died. When the site died, lawyers could not fulfill their desire.
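
To make the pattern concrete, here is a minimal sketch of a counter table that is written on every article read and queried every time the widget is shown. It uses Python with an in-memory SQLite database purely as a stand-in. The table name, column names and functions are hypothetical, and this is not the journal's actual CMS or database. SQLite's locking also differs from the database we were using, so the sketch shows the shape of the read/write pattern rather than reproducing the lockup.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical schema: one row per article with a running view count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS article_views ("
        "article_id INTEGER PRIMARY KEY, "
        "views INTEGER NOT NULL DEFAULT 0)"
    )


def record_view(conn: sqlite3.Connection, article_id: int) -> None:
    """Write path: runs once for every article read."""
    conn.execute(
        "INSERT OR IGNORE INTO article_views (article_id, views) VALUES (?, 0)",
        (article_id,),
    )
    conn.execute(
        "UPDATE article_views SET views = views + 1 WHERE article_id = ?",
        (article_id,),
    )
    conn.commit()


def most_popular(conn: sqlite3.Connection, limit: int = 5):
    """Read path: runs every time the widget is displayed."""
    cursor = conn.execute(
        "SELECT article_id, views FROM article_views "
        "ORDER BY views DESC, article_id ASC LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
create_schema(conn)
for article_id in (1, 1, 1, 2, 3, 3):
    record_view(conn, article_id)
print(most_popular(conn, limit=2))
```

On a quiet day the two paths rarely collide. When a whole readership arrives in the same few minutes, every page view is both a write to the table and a read from it.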

We eventually found a way to cache the relatively small 'most popular articles' feature so that it no longer killed the database. This freed up enough server resources to allow the site to limp along on Friday mornings. There were other Action/Resistance issues we would have to address later, but this was the big root issue for the problem at hand.
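
One common way to cache a feature like this is to recompute the list at most once per time interval and serve the stored copy in between. The sketch below assumes that approach. It is not the journal site's actual fix, and the class name, the loader function and the five-minute interval are all hypothetical.

```python
import time
from typing import Callable, List, Optional


class CachedPopularList:
    """Serve a stored copy of the list and reload it only after it expires."""

    def __init__(self,
                 load: Callable[[], List],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load            # the expensive query (hypothetical)
        self._ttl = ttl_seconds      # how long a stored copy stays fresh
        self._clock = clock          # injectable so the cache can be tested
        self._value: Optional[List] = None
        self._expires_at = 0.0

    def get(self) -> List:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Only this branch touches the database.
            self._value = self._load()
            self._expires_at = now + self._ttl
        return self._value


def load_from_database() -> List[str]:
    """Stand-in for the real 'most popular articles' query."""
    return ["article-a", "article-b", "article-c"]


popular = CachedPopularList(load_from_database, ttl_seconds=300.0)
print(popular.get())  # first call runs the loader
print(popular.get())  # second call is served from the stored copy
```

The trade-off is that the list can be a few minutes stale, which seems a small price next to a site that stays up on Friday mornings.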

If we’d had the WR Framework 10 years ago, we would have used it before building anything at all, ensuring the plan was validated, and uncovering the intention to produce a Friday morning traffic surge. We’d have adjusted the build plan to accommodate sudden, high levels of traffic and avoided a great deal of expense and unhappiness all the way around. If only we had known then what we know now!

Solspace, Inc.
PO Box 7282
Santa Cruz, Ca. 95061

https://solspace.com

© 2019 Solspace, Inc.