Is Google Cloud Platform Ready to Run Your Data Analytics Pipeline?
This is the focus of my latest research, which was published in January 2019. So, why did I decide to write on this topic? I am glad you asked.
My journey in helping our customers with their technical queries started when I joined Gartner in late 2016. At that time, the burning topic was choosing the right database for burgeoning use cases. I spent the majority of my time helping clients decide which was the right Hadoop platform and which NoSQL / nonrelational data store to pick for specific use cases.
Fast forward to early 2017. I saw the winds change as inquiry requests shifted towards advanced analytics, particularly machine learning (ML). Then in the middle of 2017, a realization set in that we were one year away from GDPR and needed to focus on data governance. That’s where the bulk of my time was spent. I ended up writing two documents on data governance. In fact, this space remains hot, as can be seen from Alation’s $50M and Collibra’s $100M funding rounds in January 2019.
By the middle of 2018, GDPR was in effect and our clients’ attention shifted towards cloud migration. I saw a palpable acceleration in the movement of workloads into the public cloud. Interestingly, all the topics I just mentioned (selecting the right data store, data science, and data governance) were still the use cases our clients cared about most, except that now they were all wrapped in the cloud context.
To address client questions about cloud, I wrote a document on GCP. I chose GCP because it was the cloud platform our clients knew the least about among the three big cloud vendors – AWS, Azure and Google Cloud. Here is a sample graphic from the document that covers the end-to-end data and analytics architecture on GCP:
My analysis of GCP revealed some of its core strengths and weaknesses. Many GCP customers spoke highly of its well-engineered products. The people responsible for providing analytical capabilities almost consistently ranked Google BigQuery very high on security, performance and ease of use. GCP also contributes frequently to the open source community and uses open source products such as Apache Beam, Apache Airflow, Kubernetes and TensorFlow in its suites.
What are the weaknesses of GCP? As you can tell, data governance is a hot topic, but it is an area in which many public cloud vendors are weak. In GCP, I haven’t yet seen an integrated native cloud suite that performs the functions of a business glossary, data discovery, business metadata management, data catalog, data quality and lineage, but it’s an area I expect to hear more on soon. What else? GCP users pointed out that its documentation and support were not at the same level as those of other major public cloud vendors, but they also noted that they had already seen visible improvements in this area.
My research revealed that GCP has seen success with customers in fast-moving, cloud-native, bleeding-edge organizations that are looking to derive competitive advantage through advanced analytics, including ML and AI. GCP has gained acceptance for development and experimentation, and more enterprise customers are putting it into production. The bottom line is that GCP has all the ingredients required for cloud computing success, including advanced infrastructure, sophisticated products, high security, and widespread machine learning and artificial intelligence capabilities.
You can view the entire research report here (requires Gartner subscription)