In 2012, Harvard Business Review called data scientist the sexiest job of the 21st century. In 2019, an Indeed survey ranked Computer Vision Engineer and Machine Learning Engineer as the highest-paid tech jobs, ahead of data scientist. Within seven years, the balance has shifted from modeling to implementation, and from focused ML problems to more complex, integrated solutions.
This could be due to several factors:
- An explosion of data-science-focused MOOCs and graduate programs. Machine learning courses have been among the most in demand, with classes filling up very quickly even in undergraduate programs and drawing students from outside computer science. This should ease some of the hiring shortage for entry-level data scientists.
- Tools such as AutoML and ML-as-a-service (MLaaS) offerings from platform vendors have brought simple data science problems within the reach of non-data-scientists such as business analysts, BI developers and application developers.
- Increased organizational maturity: organizations are beginning to integrate machine learning solutions into their overall business processes. They no longer look at an artificial intelligence (AI) based solution as an isolated application, but as a component within the overall business solution. This requires a focus on the engineering side of the data science solution. Model optimization, code quality, sustained performance and scalability have become required components.
The last factor is evident in our client interactions too, where a large number of organizations are trying to work out the architectures, tools and best practices for implementing AI solutions. MLOps is probably the most popular AI buzzword currently. Open source initiatives such as Kubeflow and MLflow have a lot of organizational support and generate excitement within the developer community. Academic research, industry conference presentations, vendor platforms and even regulations are shifting towards advanced functionality such as explainable AI, model fairness, adversarial robustness, privacy, reproducibility and overall governance of the model that brings together business and technical owners.
While these remain topics of continued research, two of my recent documents, “Machine Learning Training Essentials and Best Practices” and “5 Techniques to Troubleshoot Your Machine Learning Model”, provide insights into a few of these factors.