Data Science is...?

By Daniel Nicola in R Posts Random Stuff

February 22, 2022

As a Data Scientist, have you ever been asked what data science is? Have you struggled to answer?

I’ve found myself in this situation before: I just started talking and talking, yet failed to give a clear answer. The short answer is: “everything that has to do with data”.

To start, data has to be sourced somehow and from somewhere. This is usually done by data engineers or software engineers (data scientists’ best friends in the whole wide universe), although if you’re in a small team or working alone, you might have to do it yourself. So add this part to data science.

Once you have some data, you will most likely have to clean it, prepare it, reclean it, reprepare it, reclean it, reprepare it (ad infinitum)… then explore it, build and train models, test models, automate processes, and come up with visualizations to communicate results, all while (hopefully) having some version control tools to help you stay sane. Add machine learning & deep learning to the mix. So yes, a whole lot of shit mixed together.

That is why you find data scientists who are very different from each other, from the ones working with traditional models (e.g. the good old lm’s) to the ones tuning neural nets for image and speech recognition. However, there’s one thing they all have in common: learning non-stop.

Take a look at this video; it’ll definitely clarify what data science is:

Clear as water, right? If not, give it a couple more reads. Adeu!
