Occasionally, clients ask what is on the horizon to replace SQL as the dominant language for data management and analytics. They’re not asking about a replacement for the relational model – several alternative models already exist. Specifically, they’re looking for the next way to manage and interact with data, regardless of data store. They can stop asking. SQL is here to stay.
NoSQL (now nonrelational) vendors, once proponents of, well, anything other than SQL, have added SQL-like features. And now Confluent has announced KSQL, a SQL-like dialect on top of Apache Kafka, further solidifying SQL’s reign across a wide range of data stores. KSQL isn’t the first option for streaming SQL – far from it – but it reinforces the importance of SQL as the primary interface to operational and analytical data stores.
Over time, we may see natural language query (NLQ) gain traction in isolated use cases for targeted audiences, but I’m skeptical of wide adoption over the next three to five years. Cypher and Gremlin are also options, but today they target only graph data models. Although they are powerful, I think they’ll remain specialized to the graph domain.
SQL’s continued dominance underscores another point of confusion with clients around unstructured data. I maintain that there is no such thing as unstructured data, because the first thing people do with it is define the missing structure. Without structure, tools like SQL don’t work. Data management practices like security and governance don’t work either. What has changed, however, is that everyone can now create the structure that makes the most sense for their context of analysis. Compromising on a single structure or data model is no longer essential until your goals shift from discovery to optimization. And along the spectrum of analytics workloads, from data science lab to analytics workbench to information portal, SQL will remain the lingua franca for data and analytics.
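To make the point about structure concrete, here is a minimal sketch, not a recommended pipeline. It assumes some hypothetical free-text log lines, imposes one possible structure on them with a regular expression, and only then queries them with SQL through Python's built-in `sqlite3` module. The sample lines, the `events` table, and the function names are all invented for illustration; a different context of analysis could define a different structure over the same text.

```python
import re
import sqlite3

# Hypothetical raw text; the lines and their layout are invented for illustration.
RAW_LINES = [
    "2017-08-28 10:01:07 ERROR payment timeout for order 1001",
    "2017-08-28 10:02:13 INFO order 1002 shipped",
    "2017-08-28 10:05:42 ERROR payment declined for order 1003",
    "free-form note with no recognizable layout",
]

# One possible structure for this context: day, time, level, message.
LINE_PATTERN = re.compile(r"^(\S+) (\S+) (ERROR|INFO|WARN) (.*)$")


def impose_structure(lines):
    """Return (day, time, level, message) tuples for lines that fit the pattern."""
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:  # lines that do not fit this structure are skipped
            rows.append(match.groups())
    return rows


def count_by_level(lines):
    """Load the structured rows into an in-memory table and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (day TEXT, time TEXT, level TEXT, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?)", impose_structure(lines)
    )
    result = conn.execute(
        "SELECT level, COUNT(*) FROM events GROUP BY level ORDER BY level"
    ).fetchall()
    conn.close()
    return result


print(count_by_level(RAW_LINES))
```

The SQL query only becomes possible after `impose_structure` runs. The line that does not fit the chosen pattern never reaches the table, which is the trade-off of picking one structure for one analysis.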