
The problem with data visualizations

You’ve probably heard the skeptical aphorism about statistics: “There are three kinds of lies: lies, damned lies, and statistics.” Unfortunately, I worry that we may soon hear “data visualization” tacked on to that list as the fourth and most deceiving way of communicating information. This is a problem.

A couple of months ago, I was reading through some materials about a company’s financial health, and they included a dual-axis chart showing the company’s Revenue and EBITDA from 2008 to 2014 (projected). (See below.)

Dual Axes example

Looking at the chart, it appears that the two measures increase in near-parallel fashion. The slopes of their lines are pretty comparable, particularly in the later years on the chart. Problem is, this misrepresents what is actually happening. The axis on the left increases in increments of 20 while the one on the right does so in increments of 10, so the two lines are drawn to different scales and their apparent relationship is an artifact of the chart.

When we chart the same data on a single axis we see that EBITDA fails to increase nearly as dramatically between 2012 and 2014 as revenue does. (See below.) If I’m evaluating the health of this company and its future prospects, that difference may be important!

Single Axis example

Adding a second axis seems like such a simple, innocuous thing, but it changes how the data might be interpreted and understood. This is just one example of the substantial impact a seemingly small design decision can have.
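To make the effect concrete, here is a minimal Python sketch. The revenue and EBITDA figures and the axis ranges are invented for illustration (they are not the numbers from the chart above), and the helper name `visual_rise` is hypothetical. It measures how far each line climbs as a fraction of chart height, first with each series on its own axis and then with both on a shared axis.

```python
# Hypothetical figures (in $ millions), invented purely for illustration.
years = [2012, 2013, 2014]
revenue = [100.0, 120.0, 140.0]
ebitda = [30.0, 40.0, 50.0]


def visual_rise(values, axis_min, axis_max):
    """Return how far the line climbs on the chart, first point to last,
    as a fraction of the chart's height for the given axis range."""
    span = axis_max - axis_min
    first_height = (values[0] - axis_min) / span
    last_height = (values[-1] - axis_min) / span
    return last_height - first_height


# Dual axes (assumed ranges): revenue on 0-160, EBITDA on its own 0-80 axis.
dual_revenue_rise = visual_rise(revenue, 0, 160)
dual_ebitda_rise = visual_rise(ebitda, 0, 80)

# Single shared axis (assumed range): both series on 0-160.
shared_revenue_rise = visual_rise(revenue, 0, 160)
shared_ebitda_rise = visual_rise(ebitda, 0, 160)

print("Dual axes:  ", dual_revenue_rise, dual_ebitda_rise)
print("Shared axis:", shared_revenue_rise, shared_ebitda_rise)
```

With these made-up numbers, both lines climb a quarter of the chart height on dual axes, so they look parallel. On a shared axis, the EBITDA line climbs only half as far as the revenue line.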

Why should we care about this? There are two main reasons:

  1. As consumers of more and more visual data, we need to be aware of situations like this where visualization design decisions may obscure (or at least distract from) certain critical pieces of information. Just because it’s data (data never lies!) and you can see it (my eyes would never deceive me!), doesn’t mean it is presented in an objective way.
  2. As more of us are in roles where we create data visualizations, we need to be aware that if we are careless, we run the risk of misleading our audience or imposing (hopefully unintentionally) our own viewpoint on the information we present.

Data visualization likely will be one of those things many of us try to do without any formal training, and I worry that, as a result, a lot of folks will do it badly. Am I being overly paranoid? I hope so. But this particular example doesn’t do much to allay that paranoia.


The ‘analytic’ investment: A Response

One of my favorite bloggers/data-geniuses, Kevin MacDonell, wrote a nice post on which I wanted to comment. Unfortunately he turned off comments on his blog (I can’t say I blame him), so I’ll write a post of my own in response.

There are two concepts in Kevin’s post I wanted to respond to:

  1. Becoming a data driven organization is a journey, not a destination
  2. Those of us who are already bought in to the data-driven mentality need to speak the language our bosses respond to.

Point 1: Becoming a data-driven organization is a journey, not a destination

I couldn’t agree more with this concept. It’s not enough to just hire a smart data analyst; start generating more reports; create new ways of scoring and analyzing our prospects; send out charts and graphs; etc. These are all tools that help move us in the right direction, but the real impact of a data-driven way of operating is in how we use all of that information to change behavior. What do those reports and charts and graphs show us that presents new opportunities? How do they inform us about our progress and what adjustments we might need to make in light of that progress?

In fact, I would say that being a data-driven organization is not so much about journeying as it is about completely changing how you think about, manage, and do your work. Moving in that direction is indeed a continual journey, and the end result is a rich, pervasive integration of a data-driven mentality.

Point 2: We need to speak the language that others respond to.

Again – agreed. Kevin’s big point is that if we want to convince others of the value of a data-driven mentality, we need to stop trying to get others to learn the language of data and analytics, and start putting data and analytics into their language, particularly for “bosses” and those leading the organization. I would say we need to make this shift in all settings where we are talking about (and using) data.

It seems that every organization has that brilliant data analyst who can go miles deep with his or her knowledge and analyses, and quickly lose the end users in highly technical language. This person is a huge asset with the potential to add tremendous value, but there’s a missed opportunity if we can’t take their good work and make it understandable and immediately relevant.

To be clear, this absolutely is not an issue of end-users being dumb or needing additional education. People simply think in different ways and they value different things. If I’m presenting new information or analysis to someone, and I don’t tailor my message to fit in with how they think and what they think about, far less of my message will resonate. If I’m lucky they’ll ask good questions that force me to reconfigure what I’m saying so they can take in more of it. But more likely a large portion of what I have to say will simply be ignored.

If we want smart data usage to become a regular part of how we do business, it needs to fit with how we all talk about our work, from the top of the organization to the bottom.

Run your Prospect Research shop like the Google Search page or a Swiss Army knife

Google search, for better or worse, plays a pretty central role in the Research profession. Lots of people use it; the best researchers know how to get a lot out of it; lots of development staff mistakenly think researchers spend all their time just “googling” things; it’s loved for its power and ease of use and sometimes dissed for its search personalization. When we talk about research, it’s hard to avoid Google.

But I came across a quote this morning that suggested an even better way we can harness the power of Google: be more like it.

Google’s first female engineer, Marissa Mayer, is reportedly responsible for the site’s clean, minimalist look. She said of the site, “Google has the functionality of a really complicated Swiss Army knife, but the home page is our way of approaching it closed. It’s simple, it’s elegant, you can slip it in your pocket, but it’s got the great doodad when you need it. …  A lot of our competitors are like a Swiss Army knife open — and that can be intimidating and occasionally harmful.”

The best prospect research and management shops definitely feature the utility of a really complicated Swiss Army knife. From complex prospect identification tools like statistical modeling analysis to robust prospect management systems that track myriad concurrent activities, to rich, in-depth information development – prospect research departments can do a lot of really cool, really useful things.

But do we feature the elegance and simplicity of a closed Swiss Army knife or the Google search interface? Do we make it easy and effortless for our “users” to interact with us? I’m not convinced that we do so as much as we should, and that may be an opportunity for improvement.

A complaint I’ve heard from frontline fundraising staff (not necessarily at my institution) is that it can be too difficult to interact with a prospect management system – entering contact reports or updating prospect or proposal tracking information is too tricky. And that’s a big turnoff and clear deterrent from use.

Similarly, we can analyze and score prospects twelve ways to Tuesday, and the barrage of numbers runs the risk of making people just throw up their hands and say “Uncle! That’s too much for me. I’m just going to ignore all that.”

And if a researcher puts together a prospect profile that goes WAY beyond what is needed for the task at hand, we run the risk of crowding out the most important, most useful information. (Imagine opening ALL the gadgets on a Swiss Army knife and then trying to just use the scissors tool! Anybody need a band-aid?)

To some extent, most of us think about this sort of thing frequently, but I imagine by incorporating more thoughtful, elegant design with a focus on simplicity into everything we do, we make the fruits of our labors more accessible and desirable to use.


The “Next Big Thing”

I’ve been involved with programming for a few different professional associations, and there seems to be a constant hunger for “the next big thing.” I’ve even seen other programming committee members say to potential speakers “we’d like you to do a talk on what you think the next big thing is going to be in prospect research and/or development.”

Over the last decade or so, I’ve noticed a few key areas that, in the prospect research world, have taken their turn as “the next big thing”: prospect management and tracking; data mining and modeling (or just analytics); and probably most recently, social media.

People are getting acclimated to social media and its newness is wearing off a little bit, which makes me wonder again, what will be the next big thing in prospect research circles?

I know what it is. Well, I know what I think it should be: Data Visualization.

“Why?” you may be thinking. “Why is this something that we should spend any time on? Why do you think I need someone to tell me how to do this?”

Data visualization is just one of many things that we do in the workplace that people rarely get appropriate training on. (Other such things include giving presentations, hiring, giving feedback, assigning work, writing effectively, etc. [There’s another post in there somewhere…]) We’re just thrown into situations where we’re expected to do these things, and we don’t always get good training on them (if any), so most of us just make it up as we go along and generally get mediocre results. Which tends not to be a problem, because everyone is probably getting mediocre results, so if I do too, I fit right in and nobody knows there’s a problem!

But we should do data visualization better. There are a number of reasons for this, not least of which is that we should just strive to do everything better on principle! But also because data visualization can be a powerful tool in all of the work that we do.

That data. ALL that data…

As everyone is well aware, data is becoming more and more a part of our lives and the work that we do. You’ll notice that two of the three past “next big things” I listed above are directly involved with data: prospect management systems and analytics. We have more and more access to data and we are finding more and more powerful ways of working with and capitalizing on that data.

But as we gather more information, the signal-to-noise ratio changes dramatically, and it becomes more difficult to determine what is relevant. Patterns are hard to find. The meaningful information gets buried. And our ability to actually take advantage of the data diminishes.

This is where data visualization comes in: it can help us more effectively explore, interpret, and understand our information.

One of our strongest, most utilized senses is that of visual perception. Most of us experience the world through our eyes, and our brains are wired to process visual cues in a tremendously effective way. (Don’t believe me? Check out this site on perception in visualization. There are some powerful examples and tools that illustrate how amazing our preattentive processing is.)

We can take advantage of this hard-wiring by looking at our data (literally) from different perspectives. In doing so, patterns become evident, data of note comes to our attention, and we will understand where further exploration is warranted to identify actionable information.

Let me tell you what I really want to tell you…

In my mind, using visualization to explore our data is reason enough to embrace it and learn as much as we can about how to do so effectively. Case closed!

But let me put on my infomercial pitch-man hat and say “but wait, there’s more!”

Visualizations can be incredibly effective at communicating information. If I’ve got a table of data, I have a few options for communicating that to you:

  • I can just show you the table. You’ll have to comb through it to see where the high and low points are, what trends are occurring (if it’s longitudinal), etc.
  • I can explain it with words. I will have to comb through it to see where the high and low points are, what trends are occurring, etc., and THEN I’m going to have to put that into words and either write it down or just tell you.
  • I can put it into a chart. (Line graph, bar chart… doesn’t matter) The chart will show visually where the high and low points are, what trends are occurring, etc. Even if I don’t put any numbers on the chart, you can compare one datapoint to another and see them relative to one another.

By using a chart, I save myself and the other person lots of mental energy and drastically reduce the chance that there will be a misunderstanding or that the main message won’t be heard.

One of the benefits of this improved communication is the fact that we can more effectively make a particular point. If I want to highlight to our staff the fact that we have steadily increased the number of visit-ready prospects in our pipeline over the past five years, I can certainly just say that. But if I gather the data that supports that statement and put it into a visual format, it will be much more impactful. (See example below.) It also begins to quickly shed light on a number of other questions of the data:

  • How many prospects do we have now, compared to five years ago? (about four times as many, based on a quick look at the first and last columns)
  • When did we have the greatest increase in prospects? (2007-2008 and 2008-2009 look like they experienced the largest jumps)
  • How has the rate of our prospect identification fared recently? (it slowed about two years ago, but looks like it might be picking up)
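The same three questions can also be answered straight from the underlying table. Here is a minimal Python sketch, assuming a simple count of visit-ready prospects per year. The counts are invented for illustration and are not the data behind the chart; the variable names are hypothetical too.

```python
# Hypothetical counts of visit-ready prospects by year (invented for illustration).
prospects_by_year = {2007: 45, 2008: 85, 2009: 125, 2010: 140, 2011: 150, 2012: 175}

years = sorted(prospects_by_year)

# Question 1: how many prospects now compared to the first year on record?
growth_multiple = prospects_by_year[years[-1]] / prospects_by_year[years[0]]

# Year-over-year change for each consecutive pair of years.
yearly_change = {
    (prev, curr): prospects_by_year[curr] - prospects_by_year[prev]
    for prev, curr in zip(years, years[1:])
}

# Question 2: which periods saw the largest jump? (Ties are all reported.)
largest_change = max(yearly_change.values())
biggest_jumps = [span for span, change in yearly_change.items() if change == largest_change]

# Question 3: how has growth fared recently? Look at the last three changes.
recent_changes = list(yearly_change.values())[-3:]

print(f"Growth multiple: {growth_multiple:.1f}x")
print("Biggest jumps:", biggest_jumps)
print("Recent yearly changes:", recent_changes)
```

The arithmetic is trivial, which is rather the point: a good chart lets the reader do all three of these calculations by eye.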

As our organizations move more and more toward embracing data, it becomes ever more important that we are able to present that data in a way that is understandable and powerful. In my opinion, the only way to do this is with visualizations, and the only way to become more effective at utilizing visualization techniques is to make data visualization “the next big thing” in our professional development.

Only recently have I become seriously interested in data visualization, so I’m just scratching the surface of what is possible. But in spite of my newness to the field of study, I’m tremendously excited about the possibilities. My current books-to-read list includes a number of titles by Edward Tufte and Stephen Few, and I’m excited to get deeper into these. What are your favorites?

Some Thoughts on CRISP-DM (a response)

The always-thoughtful Alex Oftelie at Bentz Whaley Flessner recently posted about the Cross Industry Standard Process for Data Mining (CRISP-DM) and wondered how often it is truly realized. His ultimate question: “Is [CRISP-DM] a firm road map to successful data-mining, or does it suggest merely an outline of processes that is malleable?”

I’m not sure how others feel, but I have a hard time thinking of the process as “firm” or unmalleable, and there are certainly a couple of reasons why this is so.

As its name indicates, CRISP-DM is a standard: it provides a good starting framework. It is generally well-accepted, and seems to have a solid reputation as a common best practice. But like any standard or best practice, it will be most effective if we adapt it to fit the existing situation and business need. This flexibility allows us to capitalize on our capacity for creativity.

Just this weekend, I ran across some great thoughts on standards in Show Me the Numbers by Stephen Few. Chapter 12 is entitled “The Interplay of Standards and Innovation,” and in it Few notes “A good set of standards … provides a framework for innovation in the face of important … challenges. Standards and innovation are partners in the pursuit of excellence.” There may certainly be good reasons to bend the rules of CRISP-DM, especially in the pursuit of creative, effective innovation.

As I thought through this, I also was brought back to my early music theory days in college. In the first term of music theory coursework, we were subjected to loads of “rules” about voice-leading and harmonization and the like. In reality, these weren’t really rules about how music must be written, but they were guidelines: if you wanted your compositions to sound like certain Classical composers, you would be wise to observe the guidelines! Really excellent compositions in the Classical style often demonstrate a clear understanding of these concepts, while still taking liberties to skirt them for the sake of creativity.

Similarly, as we work on analytics projects, we do ourselves a favor by understanding CRISP-DM and where we are within that construct, but allowing ourselves the flexibility to veer from the established path to get to where we need to go.

When models and intuition collide…

“All models are wrong, but some are useful.”

This is a quote (from George E.P. Box, I believe) that I use pretty much any time I’m talking about the statistical models I create for our major and planned gift fundraising programs. It’s nice in that it covers my rear and makes clear that, while our models are helpful and do point us in the right direction, they are not foolproof, and we’re going to get some false positives and false negatives from time to time.

I found myself reiterating the quote again today as I defended my models to one of our frontline fundraisers.

We had found a couple of prospects who had scored well in the modeling process, and they came to our attention on a wealth screening. From a numerical perspective, these folks looked great! They were predicted to be major donors, and we confirmed that they likely had the assets to be able to make big gifts.

But the problem was the fact that, while they looked great on paper, each of them had none of the hallmark intuitive indicators of a good prospect. In fact, many of their attributes raised red flags for this particular gift officer:

  • poor (if any) giving history;
  • non-grads;
  • had attended few (if any) college events; and
  • one of them had even let the College know back in ’97 that he wasn’t too fond of us.

This gift officer had a pretty hard time wrapping his head around the idea that he should even try to see this person.

I did my best to sell the model: “It takes a lot of non-intuitive things into consideration, so it catches things we’d never think of!” and “We do have some major donors who match some of these criteria, so it’s not entirely out of the question that he could be a major donor!” But ultimately, I know he wasn’t convinced.

So I was left in an unresolved quandary this afternoon: what do we do about people who score very high on predictive models, but who look terrible according to all the traditional “good prospect” attributes? We can’t just write them off, because then we’re essentially throwing out the model because it doesn’t match our fundraising paradigm. (And this is precisely one of the key benefits of statistical modeling: it brings to our attention those people we wouldn’t think to find on our own.)

On the other hand, don’t instinct and experience play a role in the prospecting process? I truly think there is a place for both science and art in prospect research, so shouldn’t we embrace this notion and let this prospect get vetoed, so we avoid wasting staff time and energy on someone who probably won’t pan out? (And for the record, if it weren’t for the models, I’d discount these prospects entirely – they had few redeeming qualities as potential donors. This gift officer had a pretty good point.)

I’m not sure what my conclusion is about it yet, but I lean towards at least giving these ugly duckling prospects a shot. Sure, I’m not the guy who has to make the cold call or sit in these folks’ living rooms, but it seems like we should at least make a respectable effort.

Now if I can just convince this gift officer…

The best fundraising analytics/modeling blog I’ve seen…

I’m always on the lookout for good resources to inform and improve the work I do, especially when it comes to prospect research, analytics, predictive modeling, and fundraising. Generally, it’s hard to say there’s a lot out there that deals with all of these topics. You can find things on prospect research; boatloads of people seem to write about fundraising; analytics and predictive modeling are mushrooming in a number of sectors — and so is the writing about them. But there don’t seem to be a whole lot of people writing about analytics in fundraising, so I was really pleasantly surprised to stumble onto Kevin MacDonell’s blog, “CoolData.”

Kevin clearly knows his stuff when it comes to analytics and statistical techniques. He covers a wealth of topics, and while he claims to be a “non-expert,” his thorough understanding of the nearly-always-complex subject matter is exemplified by how well he explains things in his posts. (Kevin was a journalism major. I’m guessing that, based on the quality of my own writing here, you can tell that I was not.)

As any CRISP-DM process model enthusiast will attest, a quality predictive modeling project requires good business understanding and good data understanding. By extension, I would opine that any quality fundraising analytics professional will also possess solid understanding of both the business (fundraising) and the data and the techniques required to deal with that data. By FURTHER extension, the best fundraising analytics resources also get to the heart of good business understanding and good data understanding. CoolData definitely hits the mark where this is concerned.

Kevin appears to have started CoolData a mere six months ago, and he has already populated the blog with LOTS of great posts. I look forward to seeing what else CoolData will cover.