<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Tag Cloud</title>
	<atom:link href="http://blogs.gartner.com/whit_andrews/2008/10/29/tag-cloud/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.gartner.com/whit_andrews/2008/10/29/tag-cloud/</link>
	<description>A member of the Gartner Blog Network</description>
	<lastBuildDate>Mon, 08 Feb 2010 22:13:19 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
	<item>
		<title>By: Whit Andrews</title>
		<link>http://blogs.gartner.com/whit_andrews/2008/10/29/tag-cloud/comment-page-1/#comment-62</link>
		<dc:creator>Whit Andrews</dc:creator>
		<pubDate>Wed, 05 Nov 2008 16:22:13 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.gartner.com/whit_andrews/?p=100#comment-62</guid>
		<description>This absolutely is a problem that will not go away; I could not agree more. I think that Dan Tunkelang&#039;s point is well-taken, and I am also a believer in transparency as a critical element of future relevancy calculations, especially those in which the intention is to exploit the apparently naive but possibly malicious or simply calculated behaviors of passionate users. However, as we know, most people decline to invest the effort in understanding a system such that the transparency may be allowed to work effectively. So, as a result, the default settings are allowed to dominate, no matter what. Now, what we have here is the ability to alter the default settings at a very high level, if one can make this work. 

Dan Sholler, on the other hand, is of course also right. Tagging, however, is even more vulnerable than other kinds of link manipulation. Google bombing requires technical sophistication. (You don&#039;t think so? Try explaining it to a friend who is not interested in how it works.) Piggyback attacks require some sophistication, but overload attacks would be trivially easy for someone with enough interested, or compensated, minions. 

My intention here is to point out that any roomful of perfect new solutions for relevancy is always vulnerable insofar as those solutions benefit from publicly available tweaking.</description>
		<content:encoded><![CDATA[<p>This absolutely is a problem that will not go away; I could not agree more. I think that Dan Tunkelang&#8217;s point is well-taken, and I am also a believer in transparency as a critical element of future relevancy calculations, especially those in which the intention is to exploit the apparently naive but possibly malicious or simply calculated behaviors of passionate users. However, as we know, most people decline to invest the effort in understanding a system such that the transparency may be allowed to work effectively. So, as a result, the default settings are allowed to dominate, no matter what. Now, what we have here is the ability to alter the default settings at a very high level, if one can make this work. </p>
<p>Dan Sholler, on the other hand, is of course also right. Tagging, however, is even more vulnerable than other kinds of link manipulation. Google bombing requires technical sophistication. (You don&#8217;t think so? Try explaining it to a friend who is not interested in how it works.) Piggyback attacks require some sophistication, but overload attacks would be trivially easy for someone with enough interested, or compensated, minions. </p>
<p>My intention here is to point out that any roomful of perfect new solutions for relevancy is always vulnerable insofar as those solutions benefit from publicly available tweaking.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Sholler</title>
		<link>http://blogs.gartner.com/whit_andrews/2008/10/29/tag-cloud/comment-page-1/#comment-61</link>
		<dc:creator>Dan Sholler</dc:creator>
		<pubDate>Thu, 30 Oct 2008 15:33:48 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.gartner.com/whit_andrews/?p=100#comment-61</guid>
		<description>Whit, 
I do need to point out that this problem is not just with tags, but with the entire notion of the web and hypermedia. After all, a tag is just an abstraction for a list of links (all the things that have that tag). This is even how the resources are represented if you get them from del.icio.us or whoever.  The real issue is that links have no machine-readable distinctions. In fact, much of the business of the internet (search optimization in particular) consists of heuristic attempts to create those kinds of distinctions. 

The Semantic Web supposedly gives us a mechanism for attaching metadata to those links, but that only gives the mechanism, and the models still need to be created (possibly by using the same techniques we do today).  In the end, all of these techniques are fundamentally statistical, and we all know that the statistics lie when the data set is too small. This is what happens at the long tail, and nothing anyone does can change that. 

Mr. Tunkelang above suggests that we abandon the statistical techniques for what is essentially a deterministic one, in which each individual controls the level of influence of the various inputs that could be used. This has some appeal as a means of allowing user input to improve the results, but it still relies on the underlying statistical techniques (unless the user is planning to spend a lot of time choosing which of the elements on the list for the del.icio.us tag &quot;programming&quot; are relevant to his or her current search...). Since the information volumes are so great, these kinds of manual approaches usually cannot work, or they restrict the results to only those items that have been manually vetted. 

So basically, we end up with two choices: either we use a statistical technique that is pretty much assured of screwing up in the long tail, or we use a manual technique which is assured of being very limited in coverage. 

Sorry Whit, this one is a problem that IMHO will not go away.</description>
		<content:encoded><![CDATA[<p>Whit,<br />
I do need to point out that this problem is not just with tags, but with the entire notion of the web and hypermedia. After all, a tag is just an abstraction for a list of links (all the things that have that tag). This is even how the resources are represented if you get them from del.icio.us or whoever.  The real issue is that links have no machine-readable distinctions. In fact, much of the business of the internet (search optimization in particular) consists of heuristic attempts to create those kinds of distinctions. </p>
<p>The Semantic Web supposedly gives us a mechanism for attaching metadata to those links, but that only gives the mechanism, and the models still need to be created (possibly by using the same techniques we do today).  In the end, all of these techniques are fundamentally statistical, and we all know that the statistics lie when the data set is too small. This is what happens at the long tail, and nothing anyone does can change that. </p>
<p>Mr. Tunkelang above suggests that we abandon the statistical techniques for what is essentially a deterministic one, in which each individual controls the level of influence of the various inputs that could be used. This has some appeal as a means of allowing user input to improve the results, but it still relies on the underlying statistical techniques (unless the user is planning to spend a lot of time choosing which of the elements on the list for the del.icio.us tag &#8220;programming&#8221; are relevant to his or her current search&#8230;). Since the information volumes are so great, these kinds of manual approaches usually cannot work, or they restrict the results to only those items that have been manually vetted. </p>
<p>So basically, we end up with two choices: either we use a statistical technique that is pretty much assured of screwing up in the long tail, or we use a manual technique which is assured of being very limited in coverage. </p>
<p>Sorry Whit, this one is a problem that IMHO will not go away.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://blogs.gartner.com/whit_andrews/2008/10/29/tag-cloud/comment-page-1/#comment-60</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Thu, 30 Oct 2008 02:41:36 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.gartner.com/whit_andrews/?p=100#comment-60</guid>
		<description>Whit, as you know, I&#039;m a big fan of pointing out the damage done by the adversarial nature of &quot;objective&quot; relevance functions, which is just the academic way of talking about the hostile ecosystem. What we really need--on the web as much as in the enterprise--is transparency and user control.

These principles also apply to tagging. Take away the anonymity or quasi-anonymity of tagging, and give users control over which tags affect the user experience. Given that transparency and control, I&#039;ll only consider the tags of users I trust to have non-spammy motives and ideally good judgment / taste. We&#039;ll get there, even if it&#039;s only when the current model collapses under its own weight.</description>
		<content:encoded><![CDATA[<p>Whit, as you know, I&#8217;m a big fan of pointing out the damage done by the adversarial nature of &#8220;objective&#8221; relevance functions, which is just the academic way of talking about the hostile ecosystem. What we really need&#8211;on the web as much as in the enterprise&#8211;is transparency and user control.</p>
<p>These principles also apply to tagging. Take away the anonymity or quasi-anonymity of tagging, and give users control over which tags affect the user experience. Given that transparency and control, I&#8217;ll only consider the tags of users I trust to have non-spammy motives and ideally good judgment / taste. We&#8217;ll get there, even if it&#8217;s only when the current model collapses under its own weight.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

