In this instalment of our People in STEM series we talk to Kieran Cooney, a data scientist working in Optum. Kieran represented Ireland twice at the International Mathematical Olympiad.
Part one covers working in data science, while part two will delve deeper into Olympiad maths and self-driven learning.
Could you tell us a bit about your career?
I haven’t been in the field for all that long. I did my undergraduate degree in UCC, where I studied maths and physics, joint honours. After that I did a master’s in photonics.
Photonics is, basically, the branch of physics to do with light engineering. We know about electronics, which is to do with electrical engineering; photonics is to do with light engineering. It’s really important right now because communications are crucial, and so much data today is actually carried by optical fibre. We know about optical fibre coming to our homes, improving our internet connection, but long before that there was optical fibre running across the bottom of the Atlantic Ocean, carrying data between America and Europe.
So I did a master’s in that, kind of physics and theoretically mathsy, and then after that I started working as a data scientist two years ago in Dublin. Right now I am working as a data scientist for Optum in Dublin. They’re a subsidiary of UnitedHealth Group, so there’s a lot of health insurance and examining health-related data involved in that job.
Why did you choose to study maths?
There was nothing else really. Even though I did my degree in maths and physics I went in through maths. I wasn’t actually planning on doing physics at all, but then I realised that if I wanted to keep my options open for courses I needed to do physics in first year of UCC and then I went with it.
But back to your question, I was kind of unsure of what I wanted to do up until I was about 15, because there were plenty of subjects I was good at in school but I was bored of quite a lot of them. Maths was one of them. It wasn’t until it was suggested to a couple of us that we go to these maths enrichment classes that I found my love, and it was mathematics.
I went to these enrichment sessions in UL. They were on every Thursday night from 7 to 10, so it was a bit of a commitment, but I just fell in love with it. My parents were driving me home, and I remember saying to my mother one evening, “I don’t know why I’m coming back here”, because I was finding them so difficult. It felt like I was understanding very little of them at all, but there was something that kept me coming back. I just loved it. After those enrichment classes I knew that it had to be mathematics.
What does a data scientist actually do?
A data scientist essentially is someone who interprets or manipulates (manipulate is a bad choice of word but I’ll stick with it) data for someone. To give you an example, where I’m working at the minute I’m working with health insurance data. Someone will typically have some kind of a question. For example, something that’s current right now, for the sake of argument: “How has coronavirus impacted the number of claims which have been made every month?”
There can be a bit of work in that actually. There can be a bit of work in just figuring out what that question really means, and having a conversation with the interested party to figure out what they’re really looking for. Then there is going into the data and investigating it. Going into the data for me means going in using Python. Python is a programming language, and it’s really good for manipulating data. I’ll go in and I’ll write some scripts to perform some calculations on the data and then I might make a couple of plots of the results.
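As an illustration of the kind of script described here, the following is a minimal sketch of counting claims per month. The records, the `claims` list and the `claims_per_month` helper are all made up for this example and are not taken from Kieran’s work; real claims data would be far larger and would probably be handled with a dedicated data library rather than plain Python.

```python
from collections import Counter
from datetime import date

# Hypothetical claim records: (claim id, date the claim was made).
claims = [
    (1, date(2020, 1, 14)),
    (2, date(2020, 1, 30)),
    (3, date(2020, 2, 3)),
    (4, date(2020, 3, 9)),
    (5, date(2020, 3, 21)),
    (6, date(2020, 3, 28)),
]


def claims_per_month(records):
    """Count claims per (year, month), returned in chronological order."""
    counts = Counter((made_on.year, made_on.month) for _, made_on in records)
    return dict(sorted(counts.items()))


monthly = claims_per_month(claims)
for (year, month), number in monthly.items():
    print(f"{year}-{month:02d}: {number} claims")
```

Comparing these monthly counts before and after a given date would be one simple first step towards answering a question like the coronavirus one above.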
Then moving past that, the next level, and this is what I meant by manipulating, is that you can build models on the data too. This is where the maths side of it comes in actually. You can build models which will make predictions for you based on the data you have.
So say you have Mrs Smith, who lives in Florida in the US. She’s 70 years old, and we know she was in hospital last year for an angiogram. What can we say about what she’s going to do this year? Can we make some kind of a statement on what her claims for 2020 are going to be? Things like this.
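To give a flavour of what “a model which makes predictions” can mean, here is a minimal sketch that fits a straight line by ordinary least squares, one of the simplest possible models. It assumes, purely for illustration, that this year’s claims can be predicted from last year’s claims alone. The figures and the `fit_line` and `predict` helpers are invented for this example, and nothing here is claimed to be the method used at Optum.

```python
# Hypothetical, made-up figures: each member's total claims (in dollars)
# for last year and for this year.
last_year = [100.0, 200.0, 300.0, 400.0]
this_year = [150.0, 250.0, 350.0, 450.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(last_year, this_year)


def predict(last_year_claims):
    """Predict this year's claims from last year's, using the fitted line."""
    return slope * last_year_claims + intercept


# Prediction for a new member whose claims last year came to 500 dollars.
print(predict(500.0))
```

A realistic model would use many more inputs, such as age, location and medical history, but the idea is the same: learn a relationship from past data, then apply it to a new case.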
Data science is at the intersection point of a lot of different things. Industry is obviously involved, because you are not doing something for the sake of intellectual investigation, like you would in academia. You are doing it because someone has a vested interest in knowing about something. There is a lot of programming involved, because that is how you’re actually interacting with the data. There is a lot of mathematics and statistics involved too. You need to understand the assumptions you are making, probability and probability distributions. You need to understand how the models work, you need to be able to interpret the results of either your data or your models, and figure out which model is the best.
For people going into mathematics now, data science is actually very important, and the same goes for physics and a lot of these numerically inclined disciplines. It’s an emerging job basically, an emerging field. When I was starting college people weren’t really talking about it, but now a lot of people are going to work in it. There are a whole host of reasons for that historically. Basically data is cheap; saving data is cheap. You can get online hard drive space and save terabytes of data. As a result, businesses can now hire these people to help make decisions for them. It’s something to keep an eye out for, definitely. If you’re thinking “maybe I’d like to give industry a shot after college”, it’s definitely a good option, because it’s so flexible.
Does your work involve Artificial Intelligence (AI)?
Yeah, absolutely. Artificial Intelligence is a term that I am kind of wary of because it is not too dissimilar to a phrase like “bitcoin” or “blockchain”, in that its meaning can be quite vague but you often see it in marketing campaigns. Some people refer to artificial intelligence as building a model or getting a computer to figure out something. Other people refer to it as androids being able to have a human conversation and things like this. If you refer to it as teaching computers how to make a prediction then yes.
But I would be very careful about using the term artificial intelligence, because that to me suggests some form of intelligence. Whereas if you actually take a look under the hood and you’re working with it and you see what’s going on, you see that all you are really doing is performing a calculation. It’s just that the calculation is so unwieldy for us humans that only computers can do it, and it’s so complicated that it gives the impression of intelligence. But really if you poke at it a little bit you can see it is more artificial than it is intelligent.
What is the difference between a data scientist and a statistician?
I think you wouldn’t see too many statisticians in industry any more. I think if you are talking about a statistician you’d be referring to someone in academia who is actually studying statistics. Whereas now in industry it would be more of a data scientist or a data analyst maybe. There has been a trend away from what we’d call ‘statistics’ in favour of pattern recognition and machine learning and AI because we have so much data.
Say you have a hundred data points (by data points I mean rows in your table; if what you are trying to understand is a group of people, then 100 data points would mean 100 people). If you only have 100 data points it’s quite difficult to build a model on them or apply machine learning techniques, because the amount of data you have is quite small. That is where statistics becomes quite useful, because you are being very precise about the assumptions that you’re making and you can be very careful about what you’re saying. Because we have so much data now, people in industry aren’t as careful as they were, because they don’t have to be.
What is the biggest difference between student life and working in industry?
They’re completely different. There isn’t a huge amount that’s similar between them really at all. In college or in student life, you’re very autonomous. You’re told you have these courses. You’re told about these homeworks and really the rest is left up to you. So long as you get your grades at the end of the year, do you turn up to class? Do you do your homework? Do you study? None of that is mandatory actually. So you’re free to do what you want with your time. But that’s a bit of a poisoned chalice, in that you can do some really nice things with that, or you can not use the time well at all. It’s up to you. But it is a very, very nice time to learn about lots of different things, academic and otherwise, whereas in industry things are very structured.
So I’m living in Dublin. I’m commuting, or I was commuting pre-Covid. My job is nine to five or half five every day. I’ve got a lot of meetings during the day, which I definitely need to attend. When I am working, I get up at six to go to the gym in the morning and I get back at six. There’s a lot more pressure with time, but there’s a lot more structure too. It’s not to say that I’m not learning. I absolutely am. But the learning is less self-driven now and it’s more ‘what do I need to know in order to do my job better right now?’. Which isn’t to say that I’m not enjoying what I’m doing. I wouldn’t be in data science otherwise. For me I think that’s the key difference, structure.
Part two will be available later this week. Kieran’s LinkedIn is https://www.linkedin.com/in/kieran-cooney-6126868a/ . Thank you for reading, and a big thank you to Kieran!