Gartner Blog Network


From Taxonomy to Managed Vocabularies

by Darin Stewart  |  May 24, 2011  |  2 Comments

You may have noticed in my last post that I’ve started avoiding the word “taxonomy.” In its place, I’ve started referring to these hierarchically organized sets of common terms as “managed vocabularies.” This may seem odd given that I wrote a book called Building Enterprise Taxonomies, but the switch is for a very practical reason. In my experience, using the “T” word in conversation provokes one of two responses. Either people immediately fall asleep or they run screaming from the room. There is very little middle ground. “Taxonomy” evokes bad memories of AP biology courses and trying to remember if “species” comes before “genus” or vice versa. Even the more general “controlled vocabulary” has certain “Orwellian” overtones that frighten off users and sponsors alike, who suddenly feel constrained or censored because you are restricting how they express themselves or label their content. But that is not the intention behind these sets of preferred terms.

In the context of the enterprise, taxonomy no longer means a place for everything and everything in its place. The intention is not to restrict terms and keywords, but rather to manage their use and to bring some consistency and structure to how users communicate about information and content. First and foremost, the goal is to help them find the information they need when they need it. But just as importantly, consistent terminology can help them make sense of the information when they do find it and give them a bit of confidence that their understanding aligns with the author’s intent.

But even so, it is important to remember that no matter how we try, no matter how carefully we craft these things, we are going to get it wrong. Our taxonomies, or any domain model (which is essentially what we are talking about), are going to be rough approximations of the real world at best. There are going to be errors and omissions. So it’s important to give our users some leeway in how they tag things because, like it or not, they will take it anyway. I discussed this at some length in my post on “desire lines” but it bears repeating. The idea comes from landscape architecture. When planning a new park or public space, the designers try to think through how their visitors will move through the park and accommodate these routes. A paved walk will lead from the gazebo to the pond. Another will lead from the hot dog stand to the playground. But even with all this careful planning, people will still take shortcuts. If enough people find the shortcut useful, over time a permanent pathway of packed dirt will emerge, often cutting right across the officially sanctioned sidewalks.

The same phenomenon emerges in information spaces. If people do not find an official term or keyword they deem appropriate to the situation at hand, they will pick their own and use it to tag the document. If these tags are visible to the user community, certain user tags will become popular and a new vernacular or folksonomy will emerge to organize and describe your content. While many librarians may consider this heresy, folksonomies should be embraced and managed. Rather than undermining or even supplanting your formal, official vocabularies, these user-generated tags should be used to inform them. As a user term gains currency, it can be migrated into the formal term collection to keep that collection relevant.
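To make the migration step concrete, here is a minimal sketch of one way it might work, assuming that "gaining currency" can be approximated by a simple usage count. The function name `promote_popular_tags`, the `min_uses` threshold and the sample data are all hypothetical and chosen only for illustration; a real process would almost certainly put a human reviewer between the count and the promotion.

```python
from collections import Counter


def promote_popular_tags(managed_terms, user_tags, min_uses=3):
    """Return (updated vocabulary, promoted tags).

    Hypothetical helper: a user tag is promoted once it has been applied
    at least `min_uses` times and is not already a managed term.
    """
    # Normalize so "Onboarding" and "onboarding " count as the same tag.
    counts = Counter(tag.strip().lower() for tag in user_tags if tag.strip())
    known = {term.lower() for term in managed_terms}
    promoted = sorted(
        tag for tag, uses in counts.items()
        if uses >= min_uses and tag not in known
    )
    return sorted(known | set(promoted)), promoted


# Illustrative data only: three official terms and a stream of user tags.
managed = ["contract", "invoice", "policy"]
tags = ["invoice", "onboarding", "Onboarding", "onboarding ", "rfp", "rfp", "policy"]

vocabulary, promoted = promote_popular_tags(managed, tags, min_uses=3)
print(promoted)    # tags that crossed the threshold
print(vocabulary)  # formal vocabulary after promotion
```

The threshold is the "desire line" test in miniature: a single stray tag is ignored, while a path that enough people have walked gets paved.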

Just as you shouldn’t micro-manage staff, you shouldn’t micro-manage how your users tag their content. Provide them structure and oversight in the form of preferred terms and formal vocabularies, but also listen to what they are saying in the form of user tags. Again, the goal isn’t so much to restrict your users’ terminology as to liberate their access to information. So I’ve sworn off terms like taxonomy, ontology and controlled vocabularies except when absolutely necessary for clarity (I’m still a librarian at heart after all) and will henceforth talk about managed vocabularies.

I’m interested in your thoughts on the terminology switch.  Do you think we should stick with the traditional library science terms (taxonomy, ontology, controlled vocabularies, etc.) or is it time to move to a more liberal stance (managed vocabularies) for the enterprise?

 

 


Category: collaboration  enterprise-content-managment  knowledge-management  

Tags: managed-vocabularies  taxonomy  

Darin Stewart
Research Vice President
6 years with Gartner
21 years IT industry

Darin Stewart is a research vice president for Gartner in the Collaboration and Content Strategies service. He covers search, knowledge management, semantic technologies and enterprise content management.


Thoughts on From Taxonomy to Managed Vocabularies


  1. John Baker says:

    Darin, I totally agree with you. I have always hated the use of the word taxonomy and felt it was mis-placed (and over-used).

    Having said that, I understand and acknowledge that language evolves over time, and that it is accepted that if a word/term is used in association with something for a long enough period, then that word/term is generally accepted as correct in context with that ‘something’.

    I like the liberal stance of ‘managed vocabularies’, but wonder if in itself, the term ‘vocabularies’ is mis-understood by shall we say – the less-learned amongst us.

    It will be interesting to see what other comments you get.

    Regards,
    John Baker
    Document Management Specialist
    EnQuest PLC (UK)

  2. […] Stewart the author of Building Enterprise Taxonomies writes in his blog, “In my experience, using the “T” word in conversation invokes one of two responses. […]




Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.