Hi Ming-Li. Why did you come to Asia and what do you do?
I landed in Asia two years ago. I was previously based in France, and working abroad had always been in the back of my mind. The opportunity came up to work in Singapore, a country where companies are looking for innovation, where AI is a discussion at C-level, but also where data scientists are a scarce resource.
I am now working at DataRobot as a Customer-Facing Data Scientist. I took Financial Mathematics as my major at my engineering school and ended up in “data mining” after a Master’s in Mathematics. Previously, I was in two start-ups in France, both data-oriented, where I learned data recipes from gifted co-workers. Still, experience is the best teacher, and I landed where I am because I worked on many projects across various industries (transport, insurance, luxury goods, etc.) and their specific business issues.
I understand you are working on automated machine learning. What does that mean?
Machine learning gives you the ability to discern patterns in your data automatically, without giving explicit rules or instructions. When a data scientist builds machine learning models, there are some repetitive tasks, standard processes, and common errors to avoid. Automated machine learning gives data scientists the ability to build and deploy machine learning models automatically, automating the repetitive tasks, embedding guardrails, and following best practices. Not all modeling approaches are automated today, but the major ones are already there, such as classification, regression, time series, image recognition, and anomaly detection.
This technology is an accelerator for companies. It enables large and small enterprises to speed up their innovation roadmap and become AI-driven. Companies have been collecting a high volume of data, with the plan to use it later, but most of the data is still sleeping on servers and does not bring any value. Being AI-driven means making use of the full potential of this data asset.
Many corporations have launched AI or machine learning projects. Is recruiting for these projects hard?
There is definitely a gap between the demand for AI and the number of data scientists available. First you have to fight to hire them; then you have to fight to keep them. Automated machine learning (AutoML) offers two ways of reducing this gap.
First, it helps your data science team deliver projects faster. I recently did a PoC with the machine learning team of a large Asian bank. With the traditional approach, the data science team needed eight weeks and three data scientists to deliver a project. With AutoML, they took four weeks and only needed two people. That is 8 person-weeks instead of 24, so the team was three times more efficient.
Second, automated machine learning opens the door to AI democratization. Indeed, business analysts usually understand the data and visualize the expected solution well, but they do not code. Software engineers code but do not have the mathematics background. This technology enables business analysts or software engineers to run and deliver AI solutions by themselves.
I’m not saying that AutoML is magic and anyone can become a data scientist. Machine learning has just become more accessible to people willing to try it. It is tailored to be quite easy to use without having to do a deep dive into coding or mathematics. Gartner’s name for these analysts and engineers is “citizen data scientists.”
What is a citizen data scientist?
Gartner gives this definition: “Citizen data scientists are ‘power users’ who can perform […] analytical tasks that would previously have required more expertise. Today, citizen data scientists provide a complementary role to expert data scientists.”
Here’s an example. Beth works on the fraud team of an insurance company. She receives many claims every day and needs to approve each of them within one week. Often, she looks at the same fields and, based on her many years of experience, can identify which information points to a fraudulent claim. Ideally, she would like to receive a fraud score for each individual claim, with the relevant information to investigate highlighted, which would ease her workload and increase her team’s productivity.
But the data scientist’s focus this year is on use cases with the marketing team, and the actuaries are currently busy with pricing. With access to citizen data scientist tools, Beth first takes historical claims and does the data wrangling with an ETL tool. Then she builds a predictive model with an automated machine learning platform and visualizes the outcome with data visualization software. In the end, she does not write a single line of code to build her AI solution. To go into production, she hands a data engineer a simple API script that the tools already provide.
You have been involved in many AI projects in different industries. Could you share the key success factors with us?
Lots of AI projects fail and never bring value to their companies. The solution may be state-of-the-art, but it never goes into production. The reasons for this can be diverse. Here are some of the main blocking points I have experienced and ways to prevent them from happening:
Think about the implementation plan before starting. Many analytics teams collect data and start building a solution right away. They only think about the production workflow at the end of the development stage. The business and IT implementation should be designed before development begins. You should sit with the end users and decide the frequency, the format, and the system they would like to use. You should also make sure they think about how to change their business processes once you provide your solution. Otherwise, your project won’t lead to extra value. Then, design the system workflow and share it with the IT or data engineering team.
Estimate an ROI. Measuring the estimated value of a use case before you start can help a lot. This enables you to decide if the project is worth doing. It also gives you a monetary value so that you can ask management for dedicated resources.
Build your team. This is the most important driver of success. There are three stakeholders you need to involve: an executive sponsor, a business champion, and technical staff.
The executive sponsor ensures communication with other teams, adherence to the timeline, and removal of blockers. If you don’t have support from an executive, it will be hard to dedicate resources to push your project to production.
The business champion is the representative of the end users. If the business team does not like your solution, then the project will fail. The champion makes sure the solution is designed properly during development and implementation.
The technical staff includes modelers and IT engineers. They are the do-ers.
Projects can also fail despite these guardrails, but the commitment of these three key people will give you a higher chance of success in making your organization AI-driven.
How do you dare to be a woman in machine learning?
Data science is known to be very male-dominated. The Bigcloud Data science salary report Singapore 2017 showed that among 200 participants, 94% were male. The sample is not large, but the trend is there. However, I never felt I had to fight and justify my position because I am a woman. I just liked data and math, so I ended up in that space. The gender ratio had little impact on my personal experience. I’ve been lucky to have great co-workers, a smooth evolution into my role, and a comfortable work atmosphere.
What about diversity in the machine learning space? How do you think it will evolve?
Machine learning is the combination of math, coding, and business analysis. With a few exceptions, data guys usually excel in only one or two of these skills.
I would say that in code-oriented roles, such as data engineering, there are fewer women. But the gender ratio might be more balanced in data analysis and insight sharing. This is just a personal observation. For example, in my previous company, we had a data team of 15-20 people. It was 100% male for data scientists and engineers, and 62% female for data project managers and analysts.
But this might change with AI democratization technologies. These platforms enable users to be code-free. I currently work with a large company, and within its new citizen data scientist team there are five women and four men.
I believe these technologies will open the doors to diversity in the data and AI space. More diverse profiles will be involved with the rise of citizen data scientists. There will be better gender balance, and companies with lower AI maturity will catch up faster.
Ming-Li, thank you!
Get in touch with us @ womenfrenchtech at gmail dot com