I don’t know what it is about this job, but it sure attracts all kinds. Today I spoke with three very different vendors, all addressing “getting more business value from information” but from very different perspectives. In a nutshell, each was focused on providing one of the following:
- A reference base of usable, certified, standardized data (as a cloud-based lookup service)
- A “big data” based platform to distill a semantic model from a mass of heterogeneous systems of growing dark data, in order to yield a successful “search”
- A means to identify any piece of data by comparing it to THE semantic model that represents life, the universe and everything
I can’t quite explain all of these, but they sure make for a fun day. And this is on top of all the regular, run-of-the-mill work one has with customers, emails and bosses. The odd thing is that all three approaches add value somewhere, yet they all compete in some fashion. They cannot all coexist in the same place at the same time.
The first example presupposes that firms can agree to define policy and rules and will make the necessary effort to change what they do to enforce their own policies. This does not necessarily mean more control; that would depend on the policy. But this is more of a “top down” kind of approach that drives the need for change management. This is closely associated with Master Data Management, even though this specific example relates to a subset of MDM, that being Reference Data Management.
The second example comes from the venerable “search” market. To me, “search” is anathema to MDM. Search tools thrive on complexity since they actually work well when there is a mess and no ability to change or enforce policy on how the information is governed. In a way, search and MDM are diametrically opposed. Search lives off chaos; MDM seeks to establish order. That being said, in order for search to do what it does well, the technology has to infer a number of things that get very close, conceptually, to a defined data model or an inferred model of how information is related. This would include some notional master data. But the output of this search stuff is not designed to simplify application integration or information lifecycle management. It just so happens that even though the two may appear to be at odds with each other, there are some similarities.
The third example was my most exciting of the day, nay, the month. This vendor has analyzed thousands, perhaps millions, of streams of text, in different languages, in order to distill a semantic model to which all streams of text (in different languages) can be mapped. The result is that any term can be explicitly identified and its semantics inferred by putting it into the “solution”. A “target” is identified, and a subsequent, related term is added, then another and so on. At some point, enough of the related or connected terms have been entered that the “target” is identified semantically. (Don’t think of irrational terms; think of terms like “customer”, “order”, “location”, “service”, “consumer”, “bank plc” etc.) The really interesting part here is that apparently there is a maximum number of terms needed before the analysis is conclusive: 154. It so happens, according to this vendor’s example, that most terms only need between 8 and 12 related terms for the engine to do its thing; but there is a maximum, that being 154. Simply amazing, if true. The concept is mind-blowing, but the question is, what use is “the model”? In the two preceding examples, vendors have solutions that meet “your model”, since that is all any firm really needs. Who really cares if a data model uses the correct definition of “tree”? Surely the users only need to agree on “a” definition of “tree”, not “the” definition of “tree”. Interesting.
I am not sure how I will assimilate today’s learning, and I am not altogether sure I understand it all, but it sure beats packing shelves at the local store. I have to admit though, that does sound appealing on some days. Think on.

Andrew White