01 Sep 2015 18:52 IST

How to stay on top of Big Data

This one term has become so ubiquitous that you will have to deal with it sooner or later

Big Data: this technical term is on its way to becoming a catchword, or a piece of management jargon. It is so ubiquitous that you will have to deal with it sooner or later, even if you aren’t an IT professional.

Unlike other nerdy terms, this one means exactly what it says: Big Data is a lot of data. But how much is a ‘lot’? Experts say it runs into terabytes (a terabyte is 1,000 gigabytes). “So what if there is a lot of data?” you might ask. Isn’t there a huge amount of data in a Webster’s English dictionary too?

Useless until used

But Big Data is useless unless it is searched, processed and acted upon. Google’s search engine does exactly this with the Big Data of web pages on the internet. The World Wide Web is a good example of the concept, and Google has built a fantastic business on it. Likewise, retailers and e-tailers, airlines, telecom operators, navigation service providers, social networks and companies in many other sectors are sitting on huge piles of data.

What is it, though?

Big Data is a by-product of most modern enterprises. It has always been around; what has changed is technology. Today, technology allows vastly richer data to be collected and processed. Structured data in traditional databases, and unstructured data in emails, social media posts and the like, can now be processed. Existing businesses can use this Big Data profitably. It also makes new businesses possible.

What can you do with it?

What you can ‘do’ with your data depends on which fields or parameters it includes (for instance name, age, date and time, location, transactions, text information, merchandise returns). It will also vary from company to company. Big Data can be used to manage inventories for the holiday season by forecasting a surge in demand, to provide better estimates of aircraft arrival times (by taking airport traffic congestion into account), to forecast the weather, to predict the spread of disease, to manage traffic… the list is endless. Better predictions in all these applications can save a lot of cash and effort, apart from other benefits. Big Data’s benefits come from framing sophisticated questions about the data, developing search algorithms, running them, presenting the results and interpreting them.

Big Data is an important business input and a potentially valuable business asset for your company. It is good to know what Big Data (technology) can and can’t do. Since it is used for making predictions based on large amounts of data, the principles of statistics apply.

Data size (or sample size): Must be ‘statistically’ large. You can’t predict the behaviour of 10 million people from data on just one million, unless the sample is chosen correctly. Since Big Data is not a planned experiment with controlled collection of samples, this condition is rarely satisfied. This is why you need much larger data sets.
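The point about correctly chosen samples can be illustrated with a small Python sketch. Everything in it is a made-up assumption for illustration: a hypothetical population of 10,000 customers, half of whom shop online and spend 100 each, and half of whom shop in store and spend 50 each. It compares a large data set that happens to cover only online shoppers with a much smaller one drawn at random.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 customers, half online and half in store.
population = [("online", 100)] * 5000 + [("store", 50)] * 5000
true_mean = mean(spend for _, spend in population)

# A large but skewed data set: only the online customers were ever logged.
skewed = [spend for channel, spend in population if channel == "online"]
skewed_mean = mean(skewed)

# A much smaller data set, but drawn at random from everyone.
random_sample = random.sample(population, 500)
random_mean = mean(spend for _, spend in random_sample)

print(f"true mean spend:      {true_mean}")
print(f"5,000 skewed records: {skewed_mean}")
print(f"500 random records:   {random_mean:.1f}")
```

In this toy setup the 5,000 skewed records overstate average spend, while the 500 random records land close to the true figure. Real business data is seldom as cleanly split as this, so treat it as an illustration rather than a recipe.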

Data duration (period of observation): Must be long enough relative to the time frame of the prediction. You can’t predict a month based on a week’s data. The duration should take into account known cyclical factors such as weekends, month-ends and season-ends.
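A rough Python sketch of why one week of data cannot stand in for a month. The demand figures are invented for illustration: a base of 100 units a day, 50 extra on weekends and 200 extra on the last three days of a 28-day month.

```python
# Hypothetical daily demand for a 28-day month (day 0 is a Monday).
def daily_demand(day: int) -> int:
    demand = 100                 # assumed base demand
    if day % 7 in (5, 6):        # Saturday and Sunday
        demand += 50             # assumed weekend surge
    if day >= 25:                # last three days of the month
        demand += 200            # assumed month-end surge
    return demand

month = [daily_demand(day) for day in range(28)]
actual_month = sum(month)

# Naive forecast: observe the first week only and multiply by four.
first_week = sum(month[:7])
week_based_estimate = first_week * 4

print("actual demand for the month:", actual_month)
print("estimate from one week x 4: ", week_based_estimate)
print("shortfall:                  ", actual_month - week_based_estimate)
```

The first week captures the weekend cycle but never sees a month-end, so the estimate falls short by exactly the month-end surge. Observing at least one full cycle of every known pattern avoids this particular trap.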

Can’t predict single events: You shouldn’t use Big Data to predict, or rule out, single events. Remember the crash of 2008? It had been deemed statistically improbable. Predicting an event (or the impossibility of it) is like trying to predict the next throw of a die from the pattern of previous throws.
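The dice analogy can be checked with a short simulation. This is only a sketch: it assumes a fair six-sided die simulated with Python’s random module, and asks whether knowing the previous throw changes the chance of the next one being a six.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Simulate a long run of throws of a fair six-sided die.
throws = [random.randint(1, 6) for _ in range(120_000)]

# Overall share of sixes across all throws.
overall = throws.count(6) / len(throws)

# Share of sixes among throws that come immediately after a six.
after_six = [nxt for prev, nxt in zip(throws, throws[1:]) if prev == 6]
six_after_six = after_six.count(6) / len(after_six)

print(f"chance of a six overall:     {overall:.3f}")
print(f"chance of a six after a six: {six_after_six:.3f}")
```

Both figures come out near one in six. The long-run pattern is stable, but the history of throws tells you nothing extra about any single throw.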

Patterns: Big Data predictions are based on ‘patterns’ in past data, and may be used to predict ‘patterns’ in the future.

The shorter, the better: We know that the future does not simply repeat the past. The longer the time frame of a prediction, the less ‘firm’ it will be. A prediction of this year’s holiday season demand may be dependable, but using the same data for next year’s holiday season is not wise.

Interpretation: The patterns thrown up by Big Data need interpretation. They also depend on how queries are framed. This is where ‘gut’ or ‘insight’ comes in. Big Data doesn’t eliminate the need for these. Subject matter experts, who have gained insights into the business from long experience, are needed to frame the complex questions to be posed and processed on Big Data. They are also needed to interpret the findings, validate them, reframe the questions and rerun them.

Generalists who ask questions from non-traditional angles can make a big difference. Technology experts in Big Data tools, infrastructure and security are vital for implementation. A manager who leads the effort, and who appreciates all of the above areas, must establish Big Data practices and manage them.

Almost any area of business can benefit from making use of Big Data. Market intelligence, product portfolio, demand forecasting, customer management, loyalty programmes, promotions and human resources are just a few of the areas that can benefit immensely from the technology.

To sum up the above: Big Data is Pattern In, Pattern Out.

Amazon’s zillion recommendations may not entice you enough to make you click the ‘order’ button. But the same company seems to manage holiday season demand quite well, shipping your orders on time. You now have some idea how.
