I’ve been writing and speaking about taxonomies and metadata for a little over a decade. In the early days, my audiences consisted mostly of library science refugees seeking shelter in corporate IT departments. I considered myself lucky if there were a dozen people in the room. Last week I attended the annual Microsoft SharePoint Conference in Anaheim, California, and realized things have changed a bit. The session on taxonomies was held in a room with a capacity of 900 people. It was standing room only. People are finally starting to “get it”. Five years ago, taxonomy was all about “findability”. Consistent terminology and tagging make search engines work better and navigation easier to… well, navigate. This is as true as ever, but today taxonomy and metadata are more about content lifecycle management, running the gamut from content creation to disposal. They are finding their way into every corner of the enterprise.
With popularization comes the increased likelihood of dilution. As people, vendors in particular, jump on the buzzword bandwagon and co-opt terminology for their own nefarious purposes, concepts get muddled and best practices are lost. At the conference, I heard the phrase “unstructured taxonomy” being thrown around. This is an oxymoron at best and utter nonsense at worst. A taxonomy, by definition, is a structured vocabulary. The hierarchy is the whole point. There are other forms of vocabulary that are unstructured, but they are not taxonomies. The offending vendor in this case was attempting a neologism for “folksonomy” and, in the process, confusing his audience and annoying the analysts. (Maybe it was just me.)
As people start to get religion with metadata, other heresies are sneaking in as well. The most common I’ve encountered recently is managers placing artificial and arbitrary constraints on vocabularies. I’ve heard teams say things like “we are not allowed to have more than 200 terms in the vocabulary” or “a document can’t have more than two tags”. When they are pressed for the motivation behind such strictures, the answer usually amounts to “we want to keep it simple.” A noble goal, but too often one issued as a fiat rather than as the result of analysis. Simple means the least amount of work necessary, but no less. It should be driven by functional requirements (and possibly platform limitations), not by artificial mandates from on high. Decision makers are starting to understand the benefits and potential of managed vocabularies and metadata, but don’t yet understand the practice of managed vocabularies and metadata.
Start with the standards. Z39.19 and ISO 2788 are as close to scripture as it gets in the taxonomy world, though ISO 25964 “Thesauri and Interoperability with other Vocabularies” should soon be canonized as well. Invest in training. The practice is mature enough and the community large enough that you no longer need to go it alone. Don’t reinvent the wheel. Vocabularies and metadata frameworks are available for most common domains. Licensing and modifying one is usually more effective than from-scratch DIY. And of course, call Gartner for guidance.
It was gratifying to see so many people packed into a session on taxonomy at the SharePoint conference. The practice has come a long way, but some things never seem to change. In the early days, metadata champions were a small group of oddballs who couldn’t get funding for their projects. Now, it is managers, architects and business analysts who still can’t get funding for their projects. The practitioners have seen the light. Now we need to convince the people who sign the cheques.