The field of data science is exploding, and the variety of roles for up-and-coming data scientists has expanded right along with it. The practice was known in the 1990s as data mining, but the term “data science” actually dates to the early 1970s, when it was first introduced as an alternative to computer science. These days, the process of finding patterns among intersecting data sets involves a host of related technologies and techniques, including machine learning.
My own career in data science has led me to a wide range of opportunities, both in the field and in my life. Every one of them resulted from my exposure to people and communities that helped me diversify the skills I use to analyze and visualize complex datasets. Since college, when I first discovered data science, I’ve been able to actively participate in the Kaggle community, which allows users to engage and compete with other data scientists.
I was also invited to become a Z by HP Data Science Global Ambassador and, in the process, got to help design one of the data science challenges for Unlocked, a new online short film challenge aimed directly at emerging data scientists. With all the hallmarks of a Hollywood thriller, the narrative puts you, and your analytical skills, at the center of the action. It’s a little bit like being dropped into a James Bond movie, to be honest, though one in which real-world challenges span the cornerstones of our field, including NLP (natural language processing), audio information signaling, image analysis and data visualization. The Z by HP short film challenge presents this range beautifully, in a fun and entertaining way: you start with data visualization, exploring all your options in the process. Then you move on to the next challenges, which center on images, audio files and, finally, NLP.
The intersection of storytelling and data science is actually quite personal for me. I’m super passionate about images, for example, and on Kaggle, I was initially focused on storytelling through exploratory data analysis and data visualization on any sort of data. I got my start working with tabular data, audio files, images, NLP, and sentiment analysis before moving into more proper data science modeling. But computer vision has always been a north star for me.
As an undergrad, I came to data science by juggling a course load of statistics and working as a data analyst. One day, a colleague told me, “I want to be a data scientist!” Her excitement was palpable. This new field sounded so flashy to me, though I really didn’t understand what data science meant. She explained that it involved machine learning, so I got curious. I had never coded before in my life. But my love of data analytics and competitive problem solving spurred me on toward the artistic visualization side of the equation. I also began to delve deeper into machine learning, specifically Python-based machine learning, and enrolled in a one-year master’s program in data science. At the end of my course, one of my professors told me that if I wanted to go even further with it, I should just try Kaggle. It was the best advice I’ve ever received. When I joined that community, I became very invested in data science, and eventually became a Kaggle Notebooks Grandmaster.
When Everything Clicked: Transferring Skills to Grow My Career
During those early days of my career in data science, my first dataset iterations were a bit clumsy. I soon realized how many small technical issues can creep into your tabular data, images, text, and audio files.
Once it came together for me through visualization, data science began to feel a little bit like magic. That’s the beauty of it. It’s not just about numbers. You can apply it to the early detection of skin diseases, for example, or to autonomous driving. You can build recommender systems like the ones behind YouTube, or you can take a bunch of audio files, like bird songs, classify them, and count the species that you’ve heard. The same goes for finance, medicine, entertainment, publishing, manufacturing, logistics… you name it.
This is why it’s extremely important to diversify early in your career. You want to be comfortable with any possible problem you might encounter and be able to assess any and all types of data. It also helps to be curious: you should feel comfortable researching new fields whenever an opportunity to apply your data science tools presents itself. It’s often best to learn a little bit about everything.
Tools of the Trade: Fully Loaded and Lightning Fast
Having an industry-best workstation to create your data sets and visualizations will always save you time and frustration along the way. My Z by HP workstation was life changing for me when it arrived. Turn it on and you’re ready to roll with a preloaded software stack: you have all the environments; you have Git; and you have the most popular software tools right at your fingertips. It requires much less effort to do your job. Take it from me: you don’t want a software glitch, resulting from something you hastily or poorly installed, to slow you down. You want to keep your focus on the data, no matter the challenge, and throw every skill you’ve got at unlocking the answers that lie within it.
Andrada Olteanu, a data scientist at Endava, is a Kaggle Notebooks Grandmaster, a Dev Expert at Weights & Biases, and a Data Science Global Ambassador at Z by HP. She has a bachelor’s in statistics and a master’s in data science and analytics. She likes to combine the visual side of data with the technical side to create fun, educational and insightful notebooks and expand the idea of data science for all.