From genomic sequencing to finance and FMCG, Sigrid has been working in data science for many years. She shares invaluable insights on building the right team, dealing with the pace of change in technology, and the constant need to educate organizations on data science. Buckle up!
Hi Sigrid, what do you do and what brought you to Singapore?
I am Head of Data Science at the Singapore Exchange (SGX). I came to Singapore 11 years ago to complete my PhD in Statistics, jointly conducted at A*STAR (GIS) and Université Paris Sud. After that, I did a postdoc at A*STAR in cancer genomics before moving to the private sector, gaining experience across several industries, namely FMCG, Telco and Finance. Initially, the plan was to stay in Singapore for 6 months, but 11 years later I am still here!
You started your career in biological sciences, worked in a few different industries and are now in finance. What did you learn from working in these different industries?
Working in different industries has given me a broader knowledge of how to use data science to solve problems in different contexts.
Genomic sequencing was my first exposure to big data. Processing large amounts of data is not necessarily what statisticians care about (you usually leave this to engineers or computer scientists), but dealing with ever larger datasets is unavoidable in today’s world, where the amount of data created daily grows exponentially. Genomic data give rise to the multiple testing (or multiple comparisons) problem, which, in statistics, occurs when a large number of hypotheses are tested simultaneously. In the case of genomics, the expression levels of thousands of genes are measured at the same time. The more inferences are made, the more likely erroneous inferences are to occur. The multiple comparisons issue needs to be tackled in order to obtain meaningful and reproducible results.
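To make the multiple testing problem concrete, here is a minimal Python sketch. It is an illustration only, not the method used in the work described above: the p-values are hypothetical, and the two corrections shown (Bonferroni and Benjamini-Hochberg) are standard textbook procedures chosen as examples. The first function assumes independent tests.

```python
def family_wise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests,
    each run at significance level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is below alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one per gene tested.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

print(family_wise_error_rate(0.05, 20))
print(sum(bonferroni(p_values)))
print(sum(benjamini_hochberg(p_values)))
```

With 20 independent tests at the 5% level, the chance of at least one false positive is already well above one half, which is why some correction is needed when thousands of genes are tested at once. Bonferroni is the more conservative of the two; Benjamini-Hochberg tolerates a controlled share of false discoveries in exchange for more power.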
Working in the largest FMCG company in the world taught me how to develop a customer-oriented approach. Everything I did, I did with the customer in mind. The CEO always used to remind us that “the customer is the boss”.
In Telco, I dealt with geolocation data and how to productise and monetise it. I also learnt how to deal with personal data and data privacy.
When I joined the financial industry, I discovered a whole new world. Financial data, and trading data in particular, is very time sensitive. Everything happens so fast that speed and latency are critical. Financial data analysis requires specific infrastructure and tools. Dealing with time series also brings additional complexity because what happened yesterday can’t be treated independently from what happens today. This is totally different from clinical research, where each subject can be treated independently from the rest.
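One simple way to see this dependence between yesterday and today is the lag-1 autocorrelation of a series. The Python sketch below is only an illustration with made-up numbers; the function name and the sample data are assumptions, not anything taken from SGX systems.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag.

    Values near 0 suggest observations behave roughly independently;
    values near +1 or -1 mean each observation is strongly tied to
    the one observed `lag` steps earlier.
    """
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


# Hypothetical daily closing prices with a steady upward drift.
prices = [1, 2, 3, 4, 5, 6, 7, 8]
print(autocorrelation(prices, lag=1))
```

A clearly non-zero result is a warning that methods assuming independent observations, which are often reasonable for separate subjects in a clinical study, should not be applied to the series as is.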
Even though some data science skills are transferable across different domains, context plays a crucial role too. Clearly understanding the problem requires data scientists to understand the industry in order to gather relevant data and draw relevant conclusions. Data scientists can have very strong technical skills, but if they can’t interpret models correctly and tell a story about the data in their respective contexts, all their efforts will be in vain.
When you joined SGX, you had to build everything from scratch. What did you enjoy the most?
First, building my own data science team from scratch, and second, having strong support from upper management.
Building a team means addressing four key areas from my perspective:
We have adopted a hybrid approach with a core data science team that reports to me and what we call super users from different business units. The core team is built with diversity in mind. There is a mix of people with expertise in stats, computer science, engineering and finance. Because the team is quite small and the amount of work is large, we have trained the super users in coding. These super users help us with simpler tasks such as data retrieval, dashboards and reports tailored to their BU needs, and also relay requirements from their teams back to us. This gives us more bandwidth for more advanced analytics.
Sound scientific principles require a set of best practices to guide data scientists in their daily activities. I created the SOPs and pushed to adopt agile practices. Finally, the team has defined a set of analytics deployment processes, which are the steps to deploy code from a development environment, to a testing environment (on real data), and finally to a production environment (deployment to end users).
I spent the first year building a Machine Learning Platform which enables data scientists to run analytics at scale. The platform allows us to process large amounts of data using parallel processing. It supports the coding languages most commonly used by data scientists, which are R and Python. It also satisfies all the security and compliance requirements.
A big part of my role is to define use cases. In the beginning, the team built easy use cases to prove the value of data science and fund the analytics journey. Now that the team is bigger and the processes are mature, we can spend more time on both exploratory work (which aims at bringing innovative technologies to the exchange) and delivery mode (which works very closely with the different business units in order to cater to their respective needs, e.g. better consumer understanding, task automation).
Our company organization gives my team the unique opportunity to interact with the CEO and President. Their vision for the future is truly inspiring and acts as a real enabler for our team’s development.
The success of data initiatives in a company largely relies on the CEO’s support. It is critical for increasing the adoption of a data-oriented mindset across the company. Creating a data team from scratch often implies changing processes, the organization chart and the culture. If the CEO doesn’t buy into the data transformation journey, the data science team tends to spend more time convincing the organization than doing its job. Data science is not all about technology; it is also about communication and change management.
After a few experiences now, what main challenges do data leaders face in their role?
There are two main challenges:
- Convincing people of the importance of data science for the company and getting rid of misconceptions around data science
- Coping with a constantly changing environment where technology evolves very quickly
On the first challenge, I adopted two key strategies.
First, we have to educate people. I organized sharing sessions on specific topics (e.g. what is AI/ML), demos on specific use cases, and training on statistics, Python and q (the coding language we use in the company to extract data from our database).
In addition, I put a lot of effort into building a data-driven culture, i.e. leveraging data whenever and wherever possible in order to help business units make better decisions. One initiative that I started recently was to build a data science community in order to promote the exchange of data and ideas, encourage cross-disciplinary work, and brainstorm new ways to look at data.
To cope with a constantly changing technology landscape, I practice continuous learning. I love being challenged and acquiring new skills, be it technical skills, soft skills or business understanding. Learning is like playing a sport: the more you learn, the better you are at it and the more addicted you become to it too! When I moved to SGX, I had to pick up financial knowledge as well as the q programming language. Three months later, I was training people on q. The important thing is to stay humble and not be afraid to start from scratch again and again.
As more and more data is used to train the Machine Learning algorithms, AI has made tremendous progress over the last few years. How can we ensure that AI is used for good?
In order to ensure that AI is used for good, and to prove Elon Musk wrong when he says that AI is evil, we need to start thinking about how we can build AI in a sustainable manner.
In order to build a sustainable AI, i.e. an AI that is here to last, there are four aspects to keep in mind: AI should be inclusive and fair, ethical, responsible and explainable.
To be inclusive and fair, AI should benefit everyone and no one should be left behind. AI should not discriminate and we should prevent biases. Private companies have made some progress and, for example, Google Photos no longer tags black people as gorillas.
An ethical AI means finding a balance between privacy and the common good. Where should we stand between a world where nobody shares any data and one where we know everything about everyone? If we don’t share any data, it becomes hard for models to be fair as they are not trained with a representative set. At the same time, if we reveal too much, our freedom may be impacted and the data we share may be used against us.
In order to build a responsible AI, our biases need to be properly managed. There are two types of biases to manage: biases we are conscious of and, more importantly, biases we aren’t. Having a third party look at the data or model built by data scientists would be of great help. This is what I call a “data psychologist” who, much like a psychologist advising a patient, would ask the data scientists various questions about their data and models to ensure that no biases are introduced.
Finally, AI should be explainable and transparent. For example, when a bank builds a credit scoring model, it should be able to explain why it is approving or rejecting applicants. More importantly, the decisions to accept or reject an application should be consistent over time. As more and more data is used to train the Machine Learning algorithms, there is a risk that the final decision may change. If models and decision making processes can be explained, these problems are more likely to be avoided.
In order to ensure that AI is used for good and in a sustainable way, we, Data Scientists, should keep these four principles in mind when building Machine Learning algorithms. We can play a key role in ensuring that data is used appropriately and responsibly.
What are the challenges of being a woman in data? How do you “dare” to be a woman in data?
People often ask me this question. Working in a male-dominated environment can be very intimidating for some women and they want to know how I deal with it. Being in Technology, especially in the finance sector, I have always been one of the few women (this has been so since my Masters in Statistics). It is not just about knowing your domain very well. I think the main issue is related to self-confidence. Being self-confident is key if you want to convince people, influence and inspire them. People naturally tend to listen to people they respect or who inspire them.
Here are a few tips I often share with the women I mentor. First, work on aligning your message with your body language. Always get a seat at the table; don’t try to hide at the back of the room in the shadows where no one can see you. In meetings, don’t hesitate to speak up and join the conversation. People can’t read minds but they understand words. Market yourself. What is the point of building the best analytical tool if nobody knows about it? Think about yourself a bit more and don’t be afraid to say no. Finally, get mentors (both male and female) to understand where your blind spots are.
How do you bring more diversity?
I have built a data science team based on talent diversity. All the team members have very different backgrounds, different cultures and complement each other very well.
I also encourage individual training and the exchange of ideas via various means, both offline (workshops, team bonding events, sharing sessions, stand-ups) and online (chats, intranet, shared spaces). Because their skills complement each other, team members constantly learn from each other and seek each other’s advice.
Our Technology division is also very diverse in terms of skills as well as gender. We have several women leading different teams. Our head of technology is also a woman.
In data science, however, there is a clear lack of female talent. Most women I interview want to join the healthcare sector or smart city initiatives.
As part of a constant effort to bring more diversity into the team, I am working hard to get more women on board.
Diversity in genetics allows species to evolve and adapt to changing environments. In the same way, diversity in the workplace is vital for organisations to stay relevant in a constantly changing world.