Data & Design How-to's Note 7: Get the details
Needles and haystacks and such. Jennifer Hagy. Source
As we mentioned in Note 3, on Opening Open Data, it sometimes makes sense for advocates to give their audiences equality of arms by making their evidence base available online, and thus providing the public with the same resources that they use themselves. Publishing detailed information online can increase the credibility of an organisation by demonstrating the depth of its knowledge. It also creates transparency around a claim or argument. Slogan-style campaign messaging has very limited potential to influence people, particularly if they are not already partly convinced. Some audiences need a neutral proposition before they will even engage with you. Making the data available directly, without dictating how it should be interpreted, can help people to make up their own minds. It may swing them towards you. By creating an open evidence base, you are making a different set of offers to your audience: to participate, to invest time, to extend, re-use or add to the information. Although the complete evidence base may in itself be message-free, the art in giving the details is to allow our audiences to explore the evidence for themselves and find the stories that mean something to them.
This Note on Data and Design is about giving people all the information – the details – in ways that are both useful to them and helpful to your campaign. If you look at the diagram above, it's about getting people to the bottom of the “U”. To help you do this we look at two main themes:
- Initiatives that aim to help people get the details are very focussed on technology, and many of the examples we use are digital. However, these projects require insight beyond simply an understanding of the technologies that permit the publication of data. In Designing the journey, we look at the questions we have found it essential to ask about the design of projects that help people get the details. You should, for example, ask yourself why your data is interesting, whether at scale it will stand up to scrutiny, and what role you think visual and interactive design has in your project.
- Many advocates are confronted with problems that are well-known but difficult to grasp fully. Others are working on issues that are far less obvious and hard to pin down precisely. In Exploring the Data, we look at ways of helping people get the details of everything between the “known knowns” and the “unknown unknowns”. We look at the work of the artists Trevor Paglen and Ai Weiwei, work that reveals traces of secret worlds. Alongside this, we consider the tactics used by people who are trying to reconstruct tragic histories, or to increase the transparency of manufacturing and of corporate and government spending.
2. Designing the journey
After examining the data, the coalition decided to make 1,000 of the most verifiable records available. The data set was published as the Land Matrix, which enables site visitors to explore different visual representations of key aspects of the data set, such as the top investing countries and who's buying land from whom. There was an overwhelmingly positive response when the set of eight online visualisations and the related database were launched. The coalition received widespread coverage from offline and online media and increased weekly web-traffic ten-fold, from 3,000 to 30,000 visits. One of the challenges the coalition faced was the wide variation in the reliability of the data. Instead of allowing that to hold the publication of the data set back, they decided to make this a feature. Each record was accompanied by an indicator showing how reliable the data was. In addition, an infographic praised the five states that made the most data available, and “flop-scalded” the five that were the least open. This made the problem of finding information into a feature, rather than a flaw of the data set. The data were released with an invitation for people to add to, comment on and verify the data set. Making it public became a way not only to focus attention on the data but also to correct the weaknesses of the data that had previously been privately held. This open invitation was taken up by many researchers, activists and organisations, who reused the content further.
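A per-record reliability indicator of this kind can be sketched in a few lines. This is only an illustration: the field names, the scoring rule (up to two points for independent sources, one more if a contract was seen) and the sample records are assumptions made for the example, not the coalition's actual method.

```python
from dataclasses import dataclass


@dataclass
class LandDeal:
    investor_country: str
    target_country: str
    sources: int          # number of independent sources (hypothetical field)
    contract_seen: bool   # whether a contract was seen (hypothetical field)


def reliability(deal: LandDeal) -> int:
    """Hypothetical score from 0 to 3; higher means easier to verify."""
    score = min(deal.sources, 2)
    if deal.contract_seen:
        score += 1
    return score


def most_verifiable(deals, limit):
    """Return up to `limit` deals, most reliable first, each paired with its score."""
    ranked = sorted(deals, key=reliability, reverse=True)
    return [(reliability(d), d) for d in ranked[:limit]]


# Placeholder records, not real data.
deals = [
    LandDeal("A", "X", sources=1, contract_seen=False),
    LandDeal("B", "Y", sources=3, contract_seen=True),
    LandDeal("C", "Z", sources=2, contract_seen=False),
]
published = most_verifiable(deals, limit=2)
for score, deal in published:
    print(f"{deal.investor_country} -> {deal.target_country}: reliability {score}/3")
```

Publishing the score alongside each record, rather than silently dropping weaker records, is what lets readers judge the evidence for themselves.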
The past few years have seen a shift in the way raw information is used to expose misconduct and increase accountability. In the examples that make up the bulk of this Note we are inspired by the new journeys that are now possible, thanks (for example) to new “access to information” laws, to the game-changing impact of initiatives such as Wikileaks, to experimentation with data journalism by mainstream media, and to collaborations between hackers and activists. While digital technology has been central to these opportunities, it's the way that the journey is designed that distinguishes a pointless online data dump from a platform that is an effective support for advocacy. Here are three ideas:
Are your data actually interesting to anyone?
It's easy to become enthusiastic about new techniques for presenting data, but it's important to have a sense of perspective about what people are willing to do and why. To non-Americans it doesn't sound like a particularly inviting proposition to trawl through 24,000 pages of the emails of former Governor of Alaska, Sarah Palin – but somehow, the New York Times and Washington Post managed to engage a lot of people online in this hunt for dirt. When releasing data we have to think about why it may appeal to people. We can't anticipate everything, but here are some ideas to help you think it through:
- Why are the data personally relevant to the audience? Have they already thought about the issues raised, or grasped why these issues relate to them? Before publishing, think about how you can give your audiences a very personal reason to care about the data.
- Why should the audience know about these particular data? Think about how you can enable your audiences to explore data related to their own environment – their community, the country they live in or the companies they buy products and services from.
- How are the data relevant to their professional or personal interests? It’s easier to get individuals to take part actively, and to reuse or contribute to data, if you’re presenting information to people who already have an interest in the topic, either for personal or for professional reasons.
Are your data good enough to be useful?
When we publish data at scale and ask people to sift through it, it can start to work like a peer review system. We have to be very confident about the validity of the data.
It helps to have good data to start with. Often people think they have a good data set: they may have collected the information themselves or have taken it from a well-researched report. However, when they start trying to make it accessible online in a more dynamic way, or to demonstrate patterns in a visual and interactive way, it becomes obvious that their data is either patchy or hard to interpret.
If you want to start a get the details project, you should be prepared to fill the holes in your data set yourself, and give your project depth by adding contextual information or doing additional research. If the data is really going to influence the audience it has to be comprehensive and stand up to scrutiny. It must speak for itself!
Are you prepared to make the visual design useful, and not just pretty?
Data sets can be overwhelming, and the wrong sort of visual presentation can make this worse rather than better. In get the details projects, the visual aspects of data presentation play a supporting role, and not a leading one as they do in get the idea projects.
Design needs to serve as a way to help people understand the information by reinforcing patterns and aiding reading. Good visual and graphic design in these projects creates usable filter and search aids that support users navigating the data. Far too many online data visualisations focus on ingenious ways of showing huge amounts of data; these can become overly intricate, even Byzantine. Data visualisations that are technically impressive are not necessarily examples of good information or interaction design. Try not to go over the top – you may give your users a headache and deter them from staying around very long, or from coming back! Think of visualisation as an entry point for your data, and tailor each entry point for different audiences. Some people want to see the big, global aggregations, while others want to see small details and micro-views, and others want to see just the most interesting or significant details from the entire data set.
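The idea of several entry points into one data set can be illustrated with a small sketch. The records, field names and function names below are invented for the example; the point is only that the big picture, the micro-view and the highlights are three different queries over the same underlying data.

```python
from collections import Counter

# Placeholder records, not real data.
records = [
    {"country": "X", "year": 2009, "hectares": 500},
    {"country": "X", "year": 2010, "hectares": 12000},
    {"country": "Y", "year": 2010, "hectares": 3000},
]


def overview(rows):
    """Big picture: total hectares per country."""
    totals = Counter()
    for row in rows:
        totals[row["country"]] += row["hectares"]
    return dict(totals)


def detail(rows, country):
    """Micro-view: every record for one country."""
    return [row for row in rows if row["country"] == country]


def highlights(rows, n):
    """The n largest records in the whole data set."""
    return sorted(rows, key=lambda row: row["hectares"], reverse=True)[:n]


print(overview(records))
print(detail(records, "Y"))
print(highlights(records, 1))
```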
Design should also serve the organising principles of a project, which should be focussed on enabling both shallow and deeper explorations of the data set. Think about complementary ways of presenting the information about your issue, for groups of people with different interests. In the text world, these are things like factsheets and blog posts; for visual products, think about the examples in get the idea and get the picture. It's hard to campaign directly without these.
3. Exploring data
To structure this section we've appropriated a well-known sound-bite from former US Secretary of Defence Donald Rumsfeld:
“There are known knowns; there are things we know that we know. There are known unknowns; that is to say there are things that we now know we don't know. But there are also unknown unknowns – there are things we do not know we don't know.”
Rumsfeld was talking about the nature of military intelligence in the run up to the 2003 invasion of Iraq. This fabulously opaque statement was roundly mocked – it even won the “prestigious” Foot in Mouth award from the British Plain English Campaign.
Despite our feelings about the ethics of the statement and the context in which it was made, it did become widely notorious because of its built-in irony and simplicity. Since getting the details is about enabling people to explore all sorts of information, whether it concerns issues we know about but don't understand in detail or issues of which we know little or nothing, we're sure Donald won't be too upset if we use his phrase.
A. Exploring “known knowns”
Exploring things that we know we’re familiar with can help to build depth of understanding, particularly in relation to how things work or how they relate to a particular audience. Enabling people to interact with a familiar issue in greater depth serves to deepen their engagement with it. In each of the examples outlined below the audience has the opportunity to become a part of the journey, either by controlling the way the data are presented and how they relate to the interests of the audience, or by becoming part of the project by adding content to it themselves.
Making data meaningful
Even when governments publish their budgets every year it is very hard for most of us to make the mental leap required to grasp the huge figures involved and what they actually imply. In order for the transparency of government spending to have any meaning it needs to be translated into a relevant form, one that is useful to us. The Open Knowledge Foundation in the UK took the details of the annual government budget, starting with information from the Treasury, and transformed these data into something directly relevant. The result was an interactive website called Where does my money go?
The Daily Bread. Visualisation of how an individual UK citizen's income tax is spent by the UK government. Source
Where does my money go? allows users to see the data directly and to map it on to regional spending, but it also makes the data directly relevant to the individual. One of the interfaces – The Daily Bread – enables users to slide a counter along a scale stating levels of annual income. On the basis of this it shows the taxes paid at different levels of income, and how they are divided across different spending areas. The site provides details to the penny about how much of the money that people pay in taxes goes to which part of the budget each day. For example, of the taxes on a salary of £38,340 per year, £5.69 goes to education every day and £3.43 goes to running government.
This enables users to break down their personal contribution to running the country, something that can have two effects. First, it connects them more directly to what is otherwise an abstract and invisible deduction of funds from their income; second, it drives them to reflect more substantially on public services that they are not currently in need of, but may access in the future, such as health and old age services, or unemployment benefits.
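The arithmetic behind a view like The Daily Bread is simple: take the tax paid in a year, split it according to each spending area's share of the budget, and divide by the number of days. The sketch below assumes a flat list of budget shares and uses made-up numbers; it is not the calculation or the data used by Where does my money go?

```python
def daily_breakdown(annual_tax, budget_shares, days_per_year=365):
    """Split an annual tax bill across spending areas and express it per day.

    budget_shares maps each spending area to its fraction of total spending;
    the fractions must sum to 1.
    """
    total = sum(budget_shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("budget shares must sum to 1")
    return {
        area: round(annual_tax * share / days_per_year, 2)
        for area, share in budget_shares.items()
    }


# Illustrative numbers only, chosen so the result is easy to check by hand.
shares = {"education": 0.25, "health": 0.5, "running government": 0.25}
per_day = daily_breakdown(annual_tax=7300.0, budget_shares=shares)
print(per_day)
```

Expressing the result per day, and in pounds and pence, is a design choice: it turns a national budget into a figure comparable with everyday spending.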
Putting together the facts
Sourcemap. Supply chain of a laptop computer. Source
In some cases it is impossible to help audiences get the details without asking them to find the information themselves. This paradox is the premise behind Sourcemap, a website that enables its users to construct supply chains and calculate the carbon footprints of common products.
Sourcemap started as a tool to help students calculate the environmental impact of the materials they use in product design and grew rapidly into a platform for public transparency about manufacturing. Leo Bonnani, who founded the site, tells us: “One of the reasons we have been successful is that people within manufacturing and retail companies have been dying to get data on the impact of their practices, especially environmental and social impacts; they feel like that information has almost been withheld from them. Companies now ask us to help them figure out where they are buying things from and the impact of this, and what that means for sustainability of their products. For me that has been a focus – from the beginning it has been important – because it was about influencing decision-makers: the people who put products on shelves. Across the board I've seen that they don't have access to information to influence their choices.”
Sourcemap also aims to help consumers better to understand the sustainability aspects of their choices by asking them to take their interest a step further and help pull together data that everyone can use. Bonnani continues: “People want to be able to drill deeper. They are being presented with simplified forms of data. Environmental data is the best example of that. You will have 'eco' labels on products – like FSC (Forest Stewardship Council) certified, 'green' or 'natural' or 'organic'. These are actually very reductive and are data-poor. People don't trust these labels and they look the same whether they are certified and valid or just a greenwashing attempt, like calling something 'farmpicked'.”
Bonnani's view is that there is genuine interest in data about how products are made, but the key is the journey that people go through in piecing this together themselves. “One of the surprises was that people would spend time on the website, browsing and learning and exploring the history of products, going through it and seeing the carbon footprint calculations and how they are made, looking at photo slideshows, watching videos, looking at calculations for carbon footprint assessments. So we're trying to replace that reductive eco-label approach with simple online QR [Quick Response] codes that are linking you to a visually punchy story about the product, [something] that you can spend minutes or hours delving into. I don't expect that people will know about the impact of their choices unless they really play around with the calculations and visualisations, so it's a learning experience for anyone involved.”
Filling in the gaps
Homepage of the Digital Monument to the Jewish Community of the Netherlands. Filtered to show the dead who lived in Amsterdam. Source
The information that you have may in itself be incomplete. Thinking through ways that your audience can help you with this problem can be a key design principle. An interesting example of this is the Digital Monument to the Jewish community in the Netherlands, created by the Jewish Historical Museum in the Netherlands. The Digital Monument is a database of official documents about Jewish people from the Netherlands who were killed by the Nazis between 1941 and 1944. On the website there are personal and family pages for each of the people who died, which include biographical details and home addresses; users can add to this information with photographs and other documents or information they have. This data is made more accessible by a strong search engine and an effective data visualisation to help visitors who don't know where to start.
The key insight of this database is that the process of historical reconstruction is contentious and never finished, and that the audience may have the missing pieces. Between 2005 and 2010, the site editors received 10,000 corrections and additions to the data on the site, reflecting the gaps and absences in the historical sources on which the site is based. Eventually, they decided that this task should be carried on through a community-led process, and set up a Community Memorial to expand and improve what is known about the dead.
This example shows that information products that started off as one-way communications of a traditional nature can be up-ended and turned into something participatory, something owned and sustained by a community of users. This approach can be designed for and encouraged, creating a powerful platform where people can invest their time and efforts in curating knowledge about things that they know something about, but want to understand in more detail or know more about.
View of an individual's details when you click on one of the pixels on the front page; information includes address, relationship to others and a clipping from the newspaper. Note the invitation to fill in the gaps featured on the top right: 'do you have extra information about Abraham Goud?' Source
B. Exploring “known unknowns”
This side of data presentation is about digging deeper into issues about which we may have a few facts and some suspicions. Presenting the details on such issues can help move audiences who are open to a particular point of view while either remaining undecided or not having enough information to arrive at an informed opinion. Making such details available can help these people to create a clearer basis for their thoughts on an issue, and may actually motivate them to hold a much stronger position.
"Biggest Exxon Winners". From Exxon Secrets, Greenpeace. Source
ExxonSecrets is a research tool built by Greenpeace. It tracks the flow of money from the oil company ExxonMobil to institutions that have produced research that is critical of or hostile to the notion of climate change. The data about organisations, their funding and affiliations, including statements and public output by the key people who work for them, is very dense and interconnected. It's difficult to understand in the linear, flat form of tables or even as a simple database. What ExxonSecrets does is help the visitor see how a political agenda is shaped and supported by corporate influence. This is a connection that many suspect but don't know the details of. Above is a network map showing the eight institutions that received the most funding from ExxonMobil during the time the data was collected.
The value of this system as an advocacy tool is to help the visitor find out for themselves how interconnected right-wing research organisations are, and how well-organised the movement to deny climate change is. The insight that the designers of this tool had is that most people visiting a site like this don't know where to start, so they created a set of network maps that demonstrate the capabilities of the system while making its message clear. The user can be 'trained' in how to navigate the system and can also make their own journey by clicking on the individuals and then jumping off from these descriptions to look directly at the source material.
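A pre-built map such as “Biggest Exxon Winners” comes down to ranking recipients by total funding and keeping only the links that touch the top few. The sketch below shows that step under stated assumptions: the record layout, function names and amounts are placeholders, not the ExxonSecrets data or code.

```python
from collections import defaultdict


def top_recipients(grants, n):
    """Sum grants per recipient and return the n largest as (name, total) pairs."""
    totals = defaultdict(int)
    for _funder, recipient, amount in grants:
        totals[recipient] += amount
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def edges_for(grants, recipients):
    """Keep only the funder -> recipient links that touch the selected recipients."""
    wanted = {name for name, _ in recipients}
    return [(f, r, a) for f, r, a in grants if r in wanted]


# Placeholder records: (funder, recipient, amount).
grants = [
    ("Funder", "Institute A", 100),
    ("Funder", "Institute B", 250),
    ("Funder", "Institute A", 300),
    ("Funder", "Institute C", 50),
]
top = top_recipients(grants, n=2)
print(top)
print(edges_for(grants, top))
```

Starting visitors from a small, ranked subgraph like this gives them a readable first view before they explore the full network.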
Connecting the dots
I Love Mountains' My Connection Tool creates a map showing how the White House is connected to Mountain Top Removal. Source
The connection tool works by stitching together six different public data sets that cumulatively establish and then present visually:
- The location of your house
- The company that runs the electricity grid supplying your house
- The coal-fired power stations that supply electricity to that company and grid
- The sources of coal for those power stations, and the companies that run them
- The sources of coal in the US (and in this case, including coal from mountain top removal in Appalachia)
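The chain of lookups listed above can be sketched as a series of joins between tables. Everything in this example is a hypothetical stand-in: the table names, the use of a postcode as the key for a house, and the sample values are assumptions, not the data sets or code behind the My Connection tool.

```python
# All tables below are hypothetical stand-ins for the public data sets.
UTILITY_BY_POSTCODE = {"12345": "Utility One"}
PLANTS_BY_UTILITY = {"Utility One": ["Plant North", "Plant South"]}
MINES_BY_PLANT = {"Plant North": ["Mine 1"], "Plant South": ["Mine 2", "Mine 3"]}
MOUNTAINTOP_REMOVAL_MINES = {"Mine 1", "Mine 3"}


def trace_connection(postcode):
    """Follow postcode -> utility -> power plants -> mines.

    Returns one (postcode, utility, plant, mine) path for every mine in the
    chain that is flagged as a mountaintop removal site.
    """
    utility = UTILITY_BY_POSTCODE.get(postcode)
    if utility is None:
        return []
    paths = []
    for plant in PLANTS_BY_UTILITY.get(utility, []):
        for mine in MINES_BY_PLANT.get(plant, []):
            if mine in MOUNTAINTOP_REMOVAL_MINES:
                paths.append((postcode, utility, plant, mine))
    return paths


for path in trace_connection("12345"):
    print(" -> ".join(path))
```

Each path is a line that can be drawn on a map, which is what makes the connection between one household and one mine visible.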
A year in expenses. The Guardian, 5 February 2010. Source
Table of 'expenses claimed' organised by parliament member, constituency and exact details of what they claimed. Source
By making an executive infographic as well as allowing direct access to the data, the Guardian Datablog changed the dynamic between media producer and consumer. They did frame the story with their own editorial decisions, but by making the details available they put readers in the shoes of an investigative journalist. Readers are able to download a detailed spreadsheet that they can explore line by line (above). The scandal was followed by the media, by campaigning organisations and by activists, and ultimately led to a number of embarrassed MPs paying back expenses that they had wrongly claimed. Some, including the Speaker of the House, resigned from their posts, and criminal charges were brought against three others.
C. Exploring “unknown unknowns”
Details of companies in Germany that make and sell surveillance technology.
Company details and direct access to technology catalogues, published on the Wikileaks site.
The visual output of Paglen's work is often intriguing. However, it is the questions he attempts to ask and the negative spaces he tries to fill in that make his projects interesting to the audience. His work starts with important questions that become documented journeys as he tries to investigate the answers. The outcomes of his projects are often unknown in advance and this for him is part of the end result.
“[A.C.] says everything that happens in the world generates some kind of paperwork, and even if that paperwork's secret there's going to be some point where that paperwork isn't secret. Paperwork has a ripple effect... if you're interested in an industrial plant and you think that there are environmental crimes being committed there, you're going to have a very hard time turning up at the front door and knocking on the plant door and seeing if they'll let you come in, and tell you whatever crimes they are doing. But what you can do is assume that if they're dealing with toxic chemicals, there's a good chance they have a bad safety record, so what you can do is go to the local fire department and ask if there are any documented incidences of a hazmat response... So you start to build up evidence around the thing that you are looking at when you can't look at it directly.”
4. Analysing data
The map is based on research from a report 'The Effects of Drug Trafficking and Corruption on Democratic Institutions in Mexico, Colombia and Guatemala'.
Wikileaks Iraq SIGACTS redacted - network overview. Jonathan Stray, 2010.
Wikileaks Iraq SIGACTS redacted - network overview (zoom-in) Jonathan Stray, 2010.
5. Wrap-up – using detail to connect people to an issue