What is Big Data?
Well, first we need to know what data is. Data is information. It is the raw material of knowledge. Units of data, collected, organised and analysed, help us to understand our world.
Data is created every time an action is captured. Before computers, data was largely captured by the written word. Now data is captured in a multiplicity of ways through smartphones and satellites, cameras and cash registers, sensors and social media. We are creating records of our actions at breakneck speed: churning out data in quantities that cannot be adequately expressed in millions, billions or even trillions.
When we talk about the amount of data that people are now creating, we need to talk in exabytes. One exabyte is a billion gigabytes.
All the words ever spoken by human beings = 1 exabyte.
We are now creating 1 exabyte of data every six hours. That’s Big Data.
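To get a feel for that rate, the short sketch below turns the "1 exabyte every six hours" figure into daily, yearly and per-second amounts. It assumes the decimal definition of an exabyte (10^18 bytes) and takes the six-hour figure at face value; the names used are purely illustrative.

```python
# Rough arithmetic only, assuming 1 exabyte = 10**18 bytes (decimal definition)
# and the article's figure of one exabyte created every six hours.
EXABYTE_IN_BYTES = 10**18
HOURS_PER_EXABYTE = 6


def exabytes_created(hours: float) -> float:
    """Return how many exabytes are created in the given number of hours."""
    return hours / HOURS_PER_EXABYTE


per_day = exabytes_created(24)          # exabytes per day
per_year = exabytes_created(24 * 365)   # exabytes per (non-leap) year
bytes_per_second = EXABYTE_IN_BYTES / (HOURS_PER_EXABYTE * 3600)

print(f"{per_day:.0f} exabytes per day")
print(f"{per_year:.0f} exabytes per year")
print(f"about {bytes_per_second / 1e12:.1f} terabytes per second")
```

At that rate we would produce four exabytes a day, or well over a thousand a year.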
Buried in this avalanche of data is valuable information about health, business, education, consumer behavior, and culture. The role of the data scientist is to mine this enormous resource and find answers to society’s biggest questions.
Smart data use for a better society
Smart use of data can inform and influence what is going on in the world around us.
It is also disruptive: the future of many industry sectors will depend on their engagement with Big Data. In fact, many aspects of the world in which we live will be disrupted by the answers we find in our data. Consider surveys. Right now, we still rate consumer or voter sentiment by contacting representative samples directly. Before too long, that function will be obsolete, as sentiments on everything from political parties to washing powder will be extractable from the data footprints we leave online. The data is there; the winners will be those who can find and use it.
It doesn’t just have the potential to benefit politicians looking for votes and businesses looking for customers. We as individuals can harness our own data to improve our own decision making. Already, an ultrasound scanner connected to an iPhone could theoretically save our lives by alerting emergency services to an impending cardiac arrest. A €1 sensor on an inhaler can record how well we are managing our asthma.
Gathered together, such sensor-harvested data could give us valuable information about health patterns across entire populations. Are people using their inhalers more frequently at key locations? Is there an issue with air quality?
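As a rough illustration of how such pooled readings might be examined, the sketch below counts inhaler uses per location. The readings, user IDs and location names are entirely hypothetical, invented for this example; a real study would need far more data and careful privacy safeguards.

```python
from collections import Counter

# Hypothetical sample data: one (user_id, location) pair per recorded inhaler use.
# These values are made up for illustration and are not real measurements.
readings = [
    ("user_a", "city_centre"),
    ("user_b", "city_centre"),
    ("user_a", "suburb"),
    ("user_c", "city_centre"),
    ("user_b", "harbour"),
]


def uses_by_location(readings):
    """Count inhaler uses per location, ignoring which user triggered them."""
    return Counter(location for _user, location in readings)


counts = uses_by_location(readings)

# The location with the most recorded uses is a candidate for a closer look,
# for example at local air quality.
hotspot, hotspot_uses = counts.most_common(1)[0]
print(hotspot, hotspot_uses)
```

A high count on its own proves nothing, since busy places will naturally record more uses, but it shows where to start asking questions.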
Big Data is transforming traditional industries. Google used to buy satellite companies to build maps; now maps are building themselves via apps that track the movement of smartphones. People are invited to become sensors every time they agree to let a smartphone use their location.
The capture of so many tiny movements, communications and transactions creates a data exhaust, a spewing out of data that is useless unless processed. Google have been successful at using the data deluge to build better search engines.
Amazon tracks what we like and dislike. Political parties are starting to mine the sentiment of Twitter users. Entire disciplines, from anthropology to politics to linguistics, are being transformed by Big Data thinking.
Insight’s Magna Carta for Data project
There’s a shadow over Big Data. Some like to call it Big Brother. Right now, individuals have little control over what data is collected and how it is used. If you’re not paying for the service, you’re not the customer. You’re the product. Your data is generating money for someone, and that may not be to your benefit. Every week we hear another story on the dangers of data misuse. We need to figure out how to give individuals ownership of their data, while supporting responsible and society-enhancing research and innovation. These are questions that society must start asking. Developing an ethics framework around Big Data is everyone’s responsibility. For more on Insight’s work to develop a rights-based approach to data innovation, visit the Magna Carta for Data Project.