<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Humans do it better &#8211; but do they scale?</title>
	<atom:link href="http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/feed/" rel="self" type="application/rss+xml" />
	<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/</link>
	<description>Digital, social media, and everything in between from someone who likes to live in bubbles, be they dotcom, social media, or whatever&#039;s next...</description>
	<lastBuildDate>Tue, 07 Feb 2012 06:02:59 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Brendan</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8198</link>
		<dc:creator><![CDATA[Brendan]]></dc:creator>
		<pubDate>Wed, 29 Jul 2009 17:52:37 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8198</guid>
		<description><![CDATA[@Elena and @Alex - One thing&#039;s certain - whatever systems you&#039;re using to monitor debates about sentiment analysis seem to work! ;) I notice Alex, you even have an image on your blog that is reminiscent of the one I chose for my post!

Great comments though. It&#039;s always good to get the insights from people working in the field.]]></description>
		<content:encoded><![CDATA[<p>@Elena and @Alex &#8211; One thing&#8217;s certain &#8211; whatever systems you&#8217;re using to monitor debates about sentiment analysis seem to work! <img src='http://s1.wp.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  I notice Alex, you even have an image on your blog that is reminiscent of the one I chose for my post!</p>
<p>Great comments though. It&#8217;s always good to get the insights from people working in the field.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Fortney</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8197</link>
		<dc:creator><![CDATA[Alex Fortney]]></dc:creator>
		<pubDate>Wed, 29 Jul 2009 16:07:31 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8197</guid>
		<description><![CDATA[Accuracy in sentiment analysis is as difficult to achieve as it is desirable to have. I think when a company leads with the idea that they have the best sentiment analysis available you should proceed with caution, because whether or not you use computers to do it, everyone runs into similar barriers.

With computers you get scale but probably more mistakes; with humans you can&#039;t scale but you probably get more accuracy. Regardless, if you are analyzing the English language, no one can claim the holy grail because both humans and computers make mistakes. You have to use the right tools for the job.

Not strictly about sentiment but it&#039;s something we try to deal with at Networked Insights too: http://bit.ly/gRb8T]]></description>
		<content:encoded><![CDATA[<p>Accuracy in sentiment analysis is as difficult to achieve as it is desirable to have. I think when a company leads with the idea that they have the best sentiment analysis available you should proceed with caution, because whether or not you use computers to do it, everyone runs into similar barriers.</p>
<p>With computers you get scale but probably more mistakes; with humans you can&#8217;t scale but you probably get more accuracy. Regardless, if you are analyzing the English language, no one can claim the holy grail because both humans and computers make mistakes. You have to use the right tools for the job.</p>
<p>Not strictly about sentiment but it&#8217;s something we try to deal with at Networked Insights too: <a href="http://bit.ly/gRb8T" rel="nofollow">http://bit.ly/gRb8T</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Elena Haliczer</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8196</link>
		<dc:creator><![CDATA[Elena Haliczer]]></dc:creator>
		<pubDate>Wed, 29 Jul 2009 14:30:15 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8196</guid>
		<description><![CDATA[My company faces these training challenges all the time. It is my experience that in order to build a reliable sentiment analysis tool you need a reliable source of training documents and/or human input at the outset. 

An algorithm may &quot;learn over time&quot; but in order to do so it needs consistent re-training by a user base or trained group of taggers. The more complex the sentiment categories get, the more human input is required to build accuracy over time. This training factor does not take away from the fact that a working machine solution to certain problems is far more scalable than a human one. 

I think the key is transparency--in terms of how you do your training and to what degree human input is required. If interested, I wrote a little about our own training process here: http://adaptivesemantics.com/blogs/Building_consistency]]></description>
		<content:encoded><![CDATA[<p>My company faces these training challenges all the time. It is my experience that in order to build a reliable sentiment analysis tool you need a reliable source of training documents and/or human input at the outset. </p>
<p>An algorithm may &#8220;learn over time&#8221; but in order to do so it needs consistent re-training by a user base or trained group of taggers. The more complex the sentiment categories get, the more human input is required to build accuracy over time. This training factor does not take away from the fact that a working machine solution to certain problems is far more scalable than a human one. </p>
<p>I think the key is transparency&#8211;in terms of how you do your training and to what degree human input is required. If interested, I wrote a little about our own training process here: <a href="http://adaptivesemantics.com/blogs/Building_consistency" rel="nofollow">http://adaptivesemantics.com/blogs/Building_consistency</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brendan</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8188</link>
		<dc:creator><![CDATA[Brendan]]></dc:creator>
		<pubDate>Fri, 24 Jul 2009 15:59:34 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8188</guid>
		<description><![CDATA[&quot;This is why we don’t display the sentiment for individual tweets, but instead as an aggregate since this encompasses a greater accuracy.&quot;

I&#039;m not sure I understand this. Surely if your individual sentiment calculations are correct, then in aggregation they would show correct sentiment too? 

So, if you have ten positive tweets and five negatives, then you would show a bar chart with positives at ten, and negatives at five, or give a &#039;score&#039; of 10:5 or 200% or 2:1.

If not then you&#039;re aggregating incorrect sentiment to give... incorrect sentiment.]]></description>
		<content:encoded><![CDATA[<p>&#8220;This is why we don’t display the sentiment for individual tweets, but instead as an aggregate since this encompasses a greater accuracy.&#8221;</p>
<p>I&#8217;m not sure I understand this. Surely if your individual sentiment calculations are correct, then in aggregation they would show correct sentiment too? </p>
<p>So, if you have ten positive tweets and five negatives, then you would show a bar chart with positives at ten, and negatives at five, or give a &#8216;score&#8217; of 10:5 or 200% or 2:1.</p>
<p>If not then you&#8217;re aggregating incorrect sentiment to give&#8230; incorrect sentiment.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: KDPaine</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8187</link>
		<dc:creator><![CDATA[KDPaine]]></dc:creator>
		<pubDate>Fri, 24 Jul 2009 15:46:51 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8187</guid>
		<description><![CDATA[Anyone doing human content analysis properly should be using formal coding instructions that are tailored to the market&#039;s definition of desirable or undesirable content. We call it an &quot;Optimal Content Score&quot;, which changes for each client. Our coders routinely achieve between 85 &amp; 90% intercoder reliability scores. Most auto-sentiment doesn&#039;t come close.]]></description>
		<content:encoded><![CDATA[<p>Anyone doing human content analysis properly should be using formal coding instructions that are tailored to the market&#8217;s definition of desirable or undesirable content. We call it an &#8220;Optimal Content Score&#8221;, which changes for each client. Our coders routinely achieve between 85 &amp; 90% intercoder reliability scores. Most auto-sentiment doesn&#8217;t come close.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8186</link>
		<dc:creator><![CDATA[Tim]]></dc:creator>
		<pubDate>Fri, 24 Jul 2009 15:29:49 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8186</guid>
		<description><![CDATA[One thing I discovered in my research is that human analysis is also far from perfect. Jon&#039;s dispute of the &quot;Sad Jade died&quot; example illustrates this. What I found was that disagreement most often arose over content that was not strongly polarized, either positive or negative.

The other aspect is how you interpret auto-sentiment. With twendz, we&#039;re trying not to present it as the gospel truth, but rather to help you to identify trends as an early-warning indicator, or a snapshot of a point-in-time for your brand. This is why we don&#039;t display the sentiment for individual tweets, but instead as an aggregate since this encompasses a greater accuracy.

You&#039;re right, though. It&#039;s not a measurement metric; it should instead be treated as an indicator or a gauge.]]></description>
		<content:encoded><![CDATA[<p>One thing I discovered in my research is that human analysis is also far from perfect. Jon&#8217;s dispute of the &#8220;Sad Jade died&#8221; example illustrates this. What I found was that disagreement most often arose over content that was not strongly polarized, either positive or negative.</p>
<p>The other aspect is how you interpret auto-sentiment. With twendz, we&#8217;re trying not to present it as the gospel truth, but rather to help you to identify trends as an early-warning indicator, or a snapshot of a point-in-time for your brand. This is why we don&#8217;t display the sentiment for individual tweets, but instead as an aggregate since this encompasses a greater accuracy.</p>
<p>You&#8217;re right, though. It&#8217;s not a measurement metric; it should instead be treated as an indicator or a gauge.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Katie Delahaye Paine</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8185</link>
		<dc:creator><![CDATA[Katie Delahaye Paine]]></dc:creator>
		<pubDate>Fri, 24 Jul 2009 11:34:04 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8185</guid>
		<description><![CDATA[Only in a society fueled by Zoloft and antidepressants would &quot;Sad&quot; be considered a negative. This is the quintessential example of why automated sentiment analysis is almost always misleading.]]></description>
		<content:encoded><![CDATA[<p>Only in a society fueled by Zoloft and antidepressants would &#8220;Sad&#8221; be considered a negative. This is the quintessential example of why automated sentiment analysis is almost always misleading.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brendan</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8184</link>
		<dc:creator><![CDATA[Brendan]]></dc:creator>
		<pubDate>Fri, 24 Jul 2009 08:09:40 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8184</guid>
		<description><![CDATA[Hmmm. OK, so the tone is negative, but it&#039;s definitely not negative about Jade Goody or her death, is it? If you were to show this statement to someone and ask whether it&#039;s negative or positive in relation to Jade Goody dying, they wouldn&#039;t say negative, I&#039;m sure.

Perhaps &#039;sympathetic&#039; would be a better word here.]]></description>
		<content:encoded><![CDATA[<p>Hmmm. OK, so the tone is negative, but it&#8217;s definitely not negative about Jade Goody or her death, is it? If you were to show this statement to someone and ask whether it&#8217;s negative or positive in relation to Jade Goody dying, they wouldn&#8217;t say negative, I&#8217;m sure.</p>
<p>Perhaps &#8216;sympathetic&#8217; would be a better word here.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8182</link>
		<dc:creator><![CDATA[Jon]]></dc:creator>
		<pubDate>Thu, 23 Jul 2009 20:38:13 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8182</guid>
		<description><![CDATA[Hi Brendan,
Very interesting post. I did, however, want to clarify one aspect with regard to SocialMention&#039;s sentiment scores. You misinterpret what sentiment means in this context.

Let&#039;s take your example:
&quot;I used the tool as a test when Jade Goody died. I noticed it would class as ‘negative’ tweets that said “sad that Jade Goody died”&quot;.

This is actually correct behavior.

Sentiment, as it applies in this context, is scored based on the tonality of the post, which in this case is negative. The author is sad, as the post clearly shows; hence, the mention is scored as such.

Perhaps it&#039;s the use of the labels &quot;positive&quot; and &quot;negative&quot; that is misleading.

SocialMention&#039;s sentiment scoring mechanism gauges tonality of the post as a means to flag content - it does not, however, utilize a complicated language analysis ... such a system wouldn&#039;t be very accurate anyway given the extremely complex nature of internet speak.

Jon]]></description>
		<content:encoded><![CDATA[<p>Hi Brendan,<br />
Very interesting post. I did, however, want to clarify one aspect with regard to SocialMention&#8217;s sentiment scores. You misinterpret what sentiment means in this context.</p>
<p>Let&#8217;s take your example:<br />
&#8220;I used the tool as a test when Jade Goody died. I noticed it would class as ‘negative’ tweets that said “sad that Jade Goody died”&#8221;.</p>
<p>This is actually correct behavior.</p>
<p>Sentiment, as it applies in this context, is scored based on the tonality of the post, which in this case is negative. The author is sad, as the post clearly shows; hence, the mention is scored as such.</p>
<p>Perhaps it&#8217;s the use of the labels &#8220;positive&#8221; and &#8220;negative&#8221; that is misleading.</p>
<p>SocialMention&#8217;s sentiment scoring mechanism gauges tonality of the post as a means to flag content &#8211; it does not, however, utilize a complicated language analysis &#8230; such a system wouldn&#8217;t be very accurate anyway given the extremely complex nature of internet speak.</p>
<p>Jon</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Brendan</title>
		<link>http://brendancooper.com/2009/07/23/humans-do-it-better-but-do-they-scale/#comment-8181</link>
		<dc:creator><![CDATA[Brendan]]></dc:creator>
		<pubDate>Thu, 23 Jul 2009 19:56:46 +0000</pubDate>
		<guid isPermaLink="false">http://brendancooper.com/?p=1628#comment-8181</guid>
		<description><![CDATA[AVEs are a terrible metric. AVEs are a terrible metric. AVEs are a terrible metric. 

Does that make you happy? :)

Random sampling? As you say, it&#039;s been around for years. From what I&#039;ve learned recently, it seems to me that, to measure accurately and reliably, you either use humans or you use techniques that have &#039;been around for years&#039; because they&#039;re proven, statistically robust measures.

It doesn&#039;t really matter that we&#039;re dealing with &#039;online&#039; here. There&#039;s nothing that different. It&#039;s just more volume. The same maths models and statistical techniques apply.

And I suppose you could say humans have been around for quite some time too.]]></description>
		<content:encoded><![CDATA[<p>AVEs are a terrible metric. AVEs are a terrible metric. AVEs are a terrible metric. </p>
<p>Does that make you happy? <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Random sampling? As you say, it&#8217;s been around for years. From what I&#8217;ve learned recently, it seems to me that, to measure accurately and reliably, you either use humans or you use techniques that have &#8216;been around for years&#8217; because they&#8217;re proven, statistically robust measures.</p>
<p>It doesn&#8217;t really matter that we&#8217;re dealing with &#8216;online&#8217; here. There&#8217;s nothing that different. It&#8217;s just more volume. The same maths models and statistical techniques apply.</p>
<p>And I suppose you could say humans have been around for quite some time too.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

