Making Sense out of Numbers

I was able to attend the Seminar on Innovative Approaches to Turn Statistics into Knowledge held at the Census Bureau in DC last week. It was an interesting mix of officials from government statistical offices and central banks, along with academics and designers/data geeks. Some of the highlights: Amanda Cox from the New York Times talked about how she is like a tour bus driver, describing the interesting stuff she finds. She believes in visualizations that pull something forward while pushing back the rest of the data, and that have an annotation layer. She also said that distributions are more interesting than averages, and that when you have something move, make sure you know why it is moving. You can see some of these principles here and here.

Helen North from South Africa's Stats Office talked about the need to build trust in the data as well as to educate people in the uses of their data. This was accomplished by bringing together delegates from the local municipalities so that they could learn about, discuss, and debate with each other the demographic statistics collected about their home districts.

Irene Ros from ManyEyes talked about people uploading their personal data (Warcraft stats, Facebook friends). She described how they were using this tool to create "Data Mirrors", i.e. a picture of themselves. She also mentioned that 88% of Wordle users feel creative when using the tool.

Jim Ridgway from Durham University's Smart Centre talked about how students (14-15 years old), when faced with a media story and data about the same subject, were able to critique the story using the data, and in some cases spontaneously found more data to include in their critique.

David Spiegelhalter from Cambridge University demonstrated some very interesting site designs used to explain uncertainty to people. He stressed that there was no one right method, as different people responded to different methods. During the discussion he brought up an important point to keep in mind when creating visualizations: what is the purpose behind the visualization? Is it:

  1. to get a WOW! reaction?
  2. to increase knowledge?
  3. or to affect behavior?

Data Scientist > Data Geek > Designer

Reading Nathan Yau's recent post about the Rise of the Data Scientist inspired me to take a look at Ben Fry's dissertation on Computational Information Design, in which he describes the process for understanding data as follows:

  1. acquire – the matter of obtaining the data, whether from a file on a disk or from a source over a network.
  2. parse – providing some structure around what the data means, ordering it into categories.
  3. filter – removing all but the data of interest.
  4. mine – the application of methods from statistics or data mining, as a way to discern patterns or place the data in mathematical context.
  5. represent – determination of a simple representation, whether the data takes one of many shapes such as a bar graph, list, or tree.
  6. refine – improvements to the basic representation to make it clearer and more visually engaging.
  7. interact – the addition of methods for manipulating the data or controlling what features are visible.
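
To make the steps above a little more concrete, here is a minimal sketch in Python of the first five (acquire through represent). It is my own illustration, not code from Fry's dissertation: the sample data, the function names, and the text bar chart are all assumptions made up for the example, and the refine and interact steps are left out because they depend on the medium you are designing for.

```python
import csv
import io
import statistics

# Hypothetical sample data, inlined so the sketch runs without any files.
RAW = """district,population
North,1200
South,800
East,
West,1500
"""

def acquire():
    # acquire: obtain the data (an in-memory string here, standing in for disk or network)
    return io.StringIO(RAW)

def parse(source):
    # parse: give the raw text some structure, as a list of dict rows
    return list(csv.DictReader(source))

def filter_rows(rows):
    # filter: keep only rows that have a population value, converted to int
    return [(r["district"], int(r["population"])) for r in rows if r["population"]]

def mine(pairs):
    # mine: put the data in mathematical context with simple summary statistics
    values = [v for _, v in pairs]
    return {"mean": statistics.mean(values), "max": max(values)}

def represent(pairs, width=30):
    # represent: a simple text bar chart, scaled to the largest value
    peak = max(v for _, v in pairs)
    return [f"{name:<6}{'#' * round(v / peak * width)} {v}" for name, v in pairs]

pairs = filter_rows(parse(acquire()))
summary = mine(pairs)
chart = represent(pairs)
print("\n".join(chart))
```

Each function hands its result to the next, so any one step can be swapped out (a different source in acquire, a different chart in represent) without touching the others.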

I took his process and created a diagram that maps my own skill set, with the addition of Interaction Design (my current profession), which I believe covers the represent, refine, and interact steps.

[Diagram: CompInfoDesign]

While I don't disagree that these steps represent the process for understanding data for the individual creating the data visualization, they don't cover a step needed to create a design that is readily understood or that is persuasive to others.

User research and testing of the design are needed to verify that the representation is clear and appropriate. Although this could be considered part of the refine step, it may be needed at other points in the process (i.e. represent or interact). For anyone who is interested in creating data visualizations for other people, it should be considered an important part of the design process.

[Diagram: CompInfoDesign_2-01]

