Solving the Big Data vs Data Science Conundrum
You’ve been hearing quite a lot about data: how it’s the next big thing, and how you should take it up for an interesting and promising career. You decide to check what the hoopla is all about and find that you have the aptitude for a career working with data. There is a confident grin on your face, yet you’re both excited and confused. A career in data analytics sounds fantastic, but there are so many options to choose from, and you’re struggling to tell one from the other. Take ‘data science’ and ‘big data’, for instance: both have the word ‘data’ in them, but beyond that, you’re clueless. Do you become a Data Scientist or a Big Data Analyst? Let’s put that question to rest and sort out what’s what.
Data science refers to the collection, processing, manipulation and interpretation of data using diverse tools and techniques from mathematics, statistics, computation and domain expertise. In other words, data science covers the whole journey, from collecting data to deriving insights from it.
Big Data, on the other hand, refers to the kind of data collected: typically, datasets so large and complex that they cannot be processed by traditional software. We are talking about petabytes, or even exabytes, of data mapped to hundreds of parameters (or specifications), which, when analysed, yield insights that can help solve business problems. By dint of its size and the speed at which it is generated, big data requires very specific, sophisticated data science techniques.
Data Science is the discipline and Big Data is the sub-discipline or sub-speciality (think ‘medical science’ vis-à-vis ‘neurology’). Big Data is the hottest new actor in Tinseltown, and Data Science is the director who has the knack for, and knows the craft of, dealing with a superstar. With the tools and techniques of data science, one can use Big Data to get insights that help a business make its critical decisions.
As a data scientist, you would work with techniques such as predictive modelling, statistical modelling, data mining, machine learning, data visualisation and cluster analysis, and then present the findings thus obtained. A data scientist seeking to analyse big data would do so using technologies such as Hadoop and NoSQL databases, and tools such as MapReduce, Spark, Hive and Pig.
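To give a flavour of what one of those tools does, here is a minimal sketch of the MapReduce idea in plain Python. It is only an illustration, not how Hadoop itself is used: it runs on a single machine, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this example. The point is the three-step pattern of mapping records to key-value pairs, grouping by key, and reducing each group.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three steps in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    docs = ["big data needs data science", "data science uses big data"]
    print(word_count(docs))
```

In a real big data setting, the map and reduce steps would run in parallel across many machines, with the framework handling the grouping in between; the logic of each step, however, stays about this simple.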
So you see, it’s really not that difficult to plan your career in the data domain. Play to your strengths and interests, and pick your area of data specialisation accordingly. If theory excites you, Data Science is the way to go; if application is your cup of tea, then Big Data Analytics is what you are looking for.