Twitter influence: who do you believe?

Two people walk into a room. They both claim to have the definitive ranking for Twitter influencers for your area of interest. One uses Klout, the other, WeFollow. And guess what? Their results differ, in some cases quite wildly.

Which do you believe?

Let’s multiply the problem. Imagine you’re dealing not just with two people who have different results, but eight. Between them they’re using WeFollow, Klout, TweetLevel, Twittergrader, Twinfluence and Twitalyzer, with two of them, bless, still using Followers and Lists. How quaint.

So you have eight people all claiming to have determined who you need to follow, or monitor, or talk to. My take on this? Let’s compare all of them with each other and see if there are any congruences – that is, if I rank according to one metric, then rank according to another, and compare the two, do any of these metrics exactly match? Or nearly match? Because if they do, then it’s probable that they’re more accurate because we’re getting agreement between them. If not, then, well, we’re stuffed really, aren’t we?

So, let’s take a look…

Let’s choose a subject. Say, architecture. I’ve done some social media work in that field so I’m kind of familiar with it, and it’s a nicely defined sector. So, in the manner of Sir Alan Sugar telling his apprentices what they’ve got to do next to massage his over-inflated ego, you tell your eight people to find the top twenty Twitter influencers for architecture. After an hour or so, the results are in.

First off, the person who used Twinfluence goes a bit red in the face and has to admit that they didn’t actually get any results because Twinfluence was down. So, Sir Alan Sugar-like, you tell them they’re fired and they walk out of the room in a hot funk, never to reappear.

Next up, the Twittergrader person tells you that all but two of the candidates scored 100% on the Twittergrader scale. So you cannot determine rank. That’s pretty useless so again, you send them on their way.

Straight away you’re down to six usable, workable sources: WeFollow, Klout, TweetLevel, Twitalyzer, and the two dorks still using Followers and Lists. Being a fairly thorough version of Sir Alan Sugar, you decide to chuck the results into a spreadsheet to see how the various measures compare. You take WeFollow as the base for this because at least WeFollow is explicit, that is, it’s people voting for other people rather than being figured out by an algorithm. At least you understand this. So, you take the WeFollow ranking, and compare that with how you would rank results from the other sources.

This is what you get:

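| Twitterer | WeFollow rank | Followers | Followers rank | Klout | Klout rank | TweetLevel | TweetLevel rank | Lists | Lists rank | Twitalyzer | Twitalyzer rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ArchRecord | 1 | 107,490 | 5 | 41 | 10 | 66 | 6 | 2,136 | 5 | 12.2 | 3 |
| archdaily | 2 | 18,099 | 8 | 50 | 4 | 67 | 5 | 1,763 | 7 | 8.7 | 4 |
| dwell | 3 | 107,712 | 4 | 45 | 6 | 69 | 4 | 3,326 | 4 | 0 | 11 |
| archiCentral | 4 | 15,268 | 10 | 12 | 20 | 47 | 18 | 959 | 13 | 0 | 11 |
| archinect | 5 | 6,570 | 16 | 34 | 13 | 55 | 14 | 729 | 14 | 0 | 11 |
| designmilk | 6 | 159,372 | 2 | 54 | 2 | 71 | 2 | 5,137 | 3 | 29 | 1 |
| wallpapermag | 7 | 212,487 | 1 | 44 | 8 | 71 | 2 | 6,363 | 1 | 0 | 11 |
| DesignObserver | 8 | 148,762 | 3 | 57 | 1 | 73 | 1 | 5,764 | 2 | 22.3 | 2 |
| MetropolisMag | 9 | 9,627 | 13 | 31 | 14 | 53 | 15 | 1,104 | 11 | 0 | 11 |
| architectmag | 10 | 10,225 | 11 | 42 | 9 | 59 | 9 | 1,145 | 10 | 0 | 11 |
| dornobdesign | 11 | 30,754 | 6 | 45 | 6 | 59 | 9 | 1,407 | 8 | 8.2 | 5 |
| AIANational | 12 | 6,925 | 14 | 25 | 16 | 58 | 11 | 619 | 16 | 3 | 8 |
| Interior_Design | 13 | 20,749 | 7 | 27 | 15 | 57 | 12 | 1,343 | 9 | 0 | 11 |
| blueprintmag | 14 | 5,646 | 17 | 22 | 18 | 44 | 20 | 641 | 15 | 0 | 11 |
| archimag | 15 | 2,933 | 19 | 20 | 19 | 45 | 19 | 346 | 19 | 0.8 | 10 |
| casinclair | 16 | 6,647 | 15 | 49 | 5 | 62 | 7 | 606 | 17 | 7.6 | 6 |
| mocoloco | 17 | 9,672 | 12 | 36 | 11 | 53 | 15 | 1,022 | 12 | 0 | 11 |
| designboom | 18 | 15,493 | 9 | 52 | 3 | 61 | 8 | 2,075 | 6 | 4 | 7 |
| architectderek | 19 | 4,774 | 18 | 35 | 12 | 56 | 13 | 312 | 20 | 2.5 | 9 |
| VariousArch | 20 | 2,715 | 20 | 25 | 16 | 49 | 17 | 357 | 18 | 0 | 11 |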

That’s right. None of them agree. There are really wild differences here. ArchRecord, which according to WeFollow is number one in the architecture world, would be ranked 10th if you were using Klout for this. According to Klout, DesignObserver is the top dog, which largely agrees with most of the other sources, but again, not with WeFollow. If we were to rank by Followers, casinclair would be 15th, but by TweetLevel, it would be 7th.
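In case you want to check the Rank columns, here is a rough Python sketch that rebuilds the Klout ranks from the raw Klout scores. It assumes tied scores share a rank and the next rank is skipped, which is what the table appears to do (two scores of 45 are both ranked 6th and the next is 8th). The function name `competition_rank` is my own invention for illustration, not anything Klout provides.

```python
# Klout scores copied from the table above, in WeFollow order.
klout = {
    "ArchRecord": 41, "archdaily": 50, "dwell": 45, "archiCentral": 12,
    "archinect": 34, "designmilk": 54, "wallpapermag": 44,
    "DesignObserver": 57, "MetropolisMag": 31, "architectmag": 42,
    "dornobdesign": 45, "AIANational": 25, "Interior_Design": 27,
    "blueprintmag": 22, "archimag": 20, "casinclair": 49, "mocoloco": 36,
    "designboom": 52, "architectderek": 35, "VariousArch": 25,
}


def competition_rank(scores):
    """Rank highest score first; tied scores share the best rank (assumed rule)."""
    ordered = sorted(scores.values(), reverse=True)
    # list.index returns the first position of a value, so ties collapse
    # onto the same rank and the following rank is skipped.
    return {name: ordered.index(score) + 1 for name, score in scores.items()}


klout_rank = competition_rank(klout)
for name in ("DesignObserver", "dwell", "dornobdesign", "wallpapermag", "ArchRecord"):
    print(name, klout_rank[name])
```

The same function should work for any of the other score columns, for example Twitalyzer, where all the zero scores end up sharing 11th place.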

So we can scoff at the people still using Followers or Lists, but really, if there is very little agreement across the board, does it matter? The Followers and Lists results are kind of in the same ballpark, so even if they’re crude measures, why not use them?

But there are degrees to which they disagree. Let’s compare them to each other to see which are the closest by figuring out how much, on average, a Twitterer’s rank changes when you use each metric:

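WeFollow Rank compared to…

| Twitterer | Followers | Klout | TweetLevel | Lists | Twitalyzer |
|---|---|---|---|---|---|
| ArchRecord | 4 | 9 | 5 | 4 | 2 |
| archdaily | 6 | 2 | 3 | 5 | 2 |
| dwell | 1 | 3 | 1 | 1 | 8 |
| archiCentral | 6 | 16 | 14 | 9 | 7 |
| archinect | 11 | 8 | 9 | 9 | 6 |
| designmilk | 4 | 4 | 4 | 3 | 5 |
| wallpapermag | 6 | 1 | 5 | 6 | 4 |
| DesignObserver | 5 | 7 | 7 | 6 | 6 |
| MetropolisMag | 4 | 5 | 6 | 2 | 2 |
| architectmag | 1 | 1 | 1 | 0 | 1 |
| dornobdesign | 5 | 5 | 2 | 3 | 6 |
| AIANational | 2 | 4 | 1 | 4 | 4 |
| Interior_Design | 6 | 2 | 1 | 4 | 2 |
| blueprintmag | 3 | 4 | 6 | 1 | 3 |
| archimag | 4 | 4 | 4 | 4 | 5 |
| casinclair | 1 | 11 | 9 | 1 | 10 |
| mocoloco | 5 | 6 | 2 | 5 | 6 |
| designboom | 9 | 15 | 10 | 12 | 11 |
| architectderek | 1 | 7 | 6 | 1 | 10 |
| VariousArch | 0 | 4 | 3 | 2 | 9 |
| Average Rank Change | 4.2 | 5.9 | 4.95 | 4.1 | 5.45 |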

The table above shows us how much each Twitterer’s rank changes when we compare it with WeFollow (I’m just interested in change here, not whether it’s up or down, hence all the values are positive. I’m no statistician but this makes sense to me for some fairly ad-hoc reason right now). So if you rank ArchRecord by Followers, its position changes by four places compared to if you’d ranked by WeFollow. And if you look at the top table you can see that makes sense: it’s ranked #1 according to WeFollow, but #5 by Followers.

The average difference is simply the average of these positional differences (again, I’m not a statistician). So, on average, if you rank by Followers, compared to WeFollow, Tweeters would change position by a little over four (ie 4.2) ranking places. Look at the average ranking change for WeFollow compared to Klout: it’s nearly six (5.9)! On average, if you drew up a top 20 ranking according to Klout and compared that with WeFollow, your ranks would differ by six places. That’s not even close.
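If you’d rather not do the sums by hand, the calculation is easy to sketch in Python. This is only an illustration, not the spreadsheet behind the tables: the helper name `average_rank_change` is made up, and the rank lists are typed in from the Rank columns of the first table, in WeFollow order.

```python
# Ranks copied from the first table, listed in WeFollow order (1st to 20th).
wefollow = list(range(1, 21))
followers = [5, 8, 4, 10, 16, 2, 1, 3, 13, 11, 6, 14, 7, 17, 19, 15, 12, 9, 18, 20]
klout = [10, 4, 6, 20, 13, 2, 8, 1, 14, 9, 6, 16, 15, 18, 19, 5, 11, 3, 12, 16]


def average_rank_change(a, b):
    """Mean of the absolute rank differences; direction (up or down) is ignored."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return sum(diffs) / len(diffs)


print(round(average_rank_change(wefollow, followers), 2))  # 4.2
print(round(average_rank_change(wefollow, klout), 2))      # 5.9
```

Taking the absolute value is what keeps a move up from cancelling out a move down, so a low average can only come from rankings that sit close together.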

Anyway, I said we’d compare everything with everything so on to the next few tables, with comments below.

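Followers Rank compared to…

| Twitterer | Klout | TweetLevel | Lists | Twitalyzer |
|---|---|---|---|---|
| ArchRecord | 5 | 1 | 0 | 2 |
| archdaily | 4 | 3 | 1 | 4 |
| dwell | 2 | 0 | 0 | 7 |
| archiCentral | 10 | 8 | 3 | 1 |
| archinect | 3 | 2 | 2 | 5 |
| designmilk | 0 | 0 | 1 | 1 |
| wallpapermag | 7 | 1 | 0 | 10 |
| DesignObserver | 2 | 2 | 1 | 1 |
| MetropolisMag | 1 | 2 | 2 | 2 |
| architectmag | 2 | 2 | 1 | 0 |
| dornobdesign | 0 | 3 | 2 | 1 |
| AIANational | 2 | 3 | 2 | 6 |
| Interior_Design | 8 | 5 | 2 | 4 |
| blueprintmag | 1 | 3 | 2 | 6 |
| archimag | 0 | 0 | 0 | 9 |
| casinclair | 10 | 8 | 2 | 9 |
| mocoloco | 1 | 3 | 0 | 1 |
| designboom | 6 | 1 | 3 | 2 |
| architectderek | 6 | 5 | 2 | 9 |
| VariousArch | 4 | 3 | 2 | 9 |
| Average Rank Change | 3.7 | 2.75 | 1.4 | 4.45 |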

No need to panic, this is doing the same thing as the previous table, but relating ranking by Followers with the other rankings (we don’t need to include WeFollow now because we already did that in the previous table). Again, we’re looking at the absolute change, regardless of whether it’s up or down, then we average those changes at the bottom.

This time the biggest change is Followers compared to Twitalyzer, at 4.45. If two people gave you rankings based on these two metrics, you’d find that on average the positions differed by between 4 and 5 places. That’s still fairly large.

The lowest here is Followers to Lists, at 1.4. In other words, ranks by Followers compared to ranks by Lists would be very similar. Do you find this surprising? I do. I think. More below.

Let’s look at how Klout rankings compare, below.

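Klout Rank compared to…

| Twitterer | TweetLevel | Lists | Twitalyzer |
|---|---|---|---|
| ArchRecord | 4 | 5 | 7 |
| archdaily | 1 | 3 | 0 |
| dwell | 2 | 2 | 5 |
| archiCentral | 2 | 7 | 9 |
| archinect | 1 | 1 | 2 |
| designmilk | 0 | 1 | 1 |
| wallpapermag | 6 | 7 | 3 |
| DesignObserver | 0 | 1 | 1 |
| MetropolisMag | 1 | 3 | 3 |
| architectmag | 0 | 1 | 2 |
| dornobdesign | 3 | 2 | 1 |
| AIANational | 5 | 0 | 8 |
| Interior_Design | 3 | 6 | 4 |
| blueprintmag | 2 | 3 | 7 |
| archimag | 0 | 0 | 9 |
| casinclair | 2 | 12 | 1 |
| mocoloco | 4 | 1 | 0 |
| designboom | 5 | 3 | 4 |
| architectderek | 1 | 8 | 3 |
| VariousArch | 1 | 2 | 5 |
| Average Rank Change | 2.15 | 3.4 | 3.75 |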

This time we’re comparing Klout to the remaining metrics (we don’t need to do WeFollow or Followers because we did them above, remember). Klout compared to TweetLevel gives the lowest average difference in this table, at 2.15, but that’s still not as low as Followers to Lists.

Next up, TweetLevel:

TweetLevel Rank compared to…

| Twitterer | Lists | Twitalyzer |
|---|---|---|
| ArchRecord | 1 | 3 |
| archdaily | 2 | 1 |
| dwell | 0 | 7 |
| archiCentral | 5 | 7 |
| archinect | 0 | 3 |
| designmilk | 1 | 1 |
| wallpapermag | 1 | 9 |
| DesignObserver | 1 | 1 |
| MetropolisMag | 4 | 4 |
| architectmag | 1 | 2 |
| dornobdesign | 1 | 4 |
| AIANational | 5 | 3 |
| Interior_Design | 3 | 1 |
| blueprintmag | 5 | 9 |
| archimag | 0 | 9 |
| casinclair | 10 | 1 |
| mocoloco | 3 | 4 |
| designboom | 2 | 1 |
| architectderek | 7 | 4 |
| VariousArch | 1 | 6 |
| Average Rank Change | 2.65 | 4 |

Again, I’d say these are fairly large. An average change in rank of 2.65 is still nearly twice that of the lowest so far, Followers:Lists, at 1.4.

And finally, Lists rankings:

Lists Rank compared to…

| Twitterer | Twitalyzer |
|---|---|
| ArchRecord | 2 |
| archdaily | 3 |
| dwell | 7 |
| archiCentral | 2 |
| archinect | 3 |
| designmilk | 2 |
| wallpapermag | 10 |
| DesignObserver | 0 |
| MetropolisMag | 0 |
| architectmag | 1 |
| dornobdesign | 3 |
| AIANational | 8 |
| Interior_Design | 2 |
| blueprintmag | 4 |
| archimag | 9 |
| casinclair | 11 |
| mocoloco | 1 |
| designboom | 1 |
| architectderek | 11 |
| VariousArch | 7 |
| Average Rank Change | 4.35 |

Well done, you made it to the last table, where all we have left is ranking by Lists compared to rankings by Twitalyzer. It’s still not looking good is it? 4.35 means that rankings would change over 4 positions on average. So the person you said ranked 8th could in fact be ranked 4th, or even 12th.

I probably should create yet another table summarising all the average rank changes but I can’t be arsed. All we really need to look at are the biggest and, most importantly, lowest average differences.

The biggest is 5.9, which is when you compare how rankings would change, on average, if you rank by WeFollow compared to ranking by Klout. This implies to me that there’s something radically different behind those figures, different enough to make them mutually meaningless.

The lowest is 1.4. And guess which combination that is? It’s Followers to Lists.
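For anyone who does want the summary, here is a rough Python sketch that works out the average rank change for every pair of metrics and sorts them from closest to furthest apart. The assumptions: the ranks are typed in from the Rank columns of the first table, in WeFollow order, and `average_rank_change` is a made-up helper name rather than anything these services offer.

```python
from itertools import combinations

# Ranks copied from the first table, listed in WeFollow order (1st to 20th).
ranks = {
    "WeFollow": list(range(1, 21)),
    "Followers": [5, 8, 4, 10, 16, 2, 1, 3, 13, 11, 6, 14, 7, 17, 19, 15, 12, 9, 18, 20],
    "Klout": [10, 4, 6, 20, 13, 2, 8, 1, 14, 9, 6, 16, 15, 18, 19, 5, 11, 3, 12, 16],
    "TweetLevel": [6, 5, 4, 18, 14, 2, 2, 1, 15, 9, 9, 11, 12, 20, 19, 7, 15, 8, 13, 17],
    "Lists": [5, 7, 4, 13, 14, 3, 1, 2, 11, 10, 8, 16, 9, 15, 19, 17, 12, 6, 20, 18],
    "Twitalyzer": [3, 4, 11, 11, 11, 1, 11, 2, 11, 11, 5, 8, 11, 11, 10, 6, 11, 7, 9, 11],
}


def average_rank_change(a, b):
    """Mean of the absolute rank differences between two rankings."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# One entry per unordered pair of metrics: 15 pairs for 6 metrics.
summary = {
    (m1, m2): round(average_rank_change(ranks[m1], ranks[m2]), 2)
    for m1, m2 in combinations(ranks, 2)
}

# Print the pairs from most similar to least similar.
for (m1, m2), avg in sorted(summary.items(), key=lambda item: item[1]):
    print(f"{m1} vs {m2}: {avg}")
```

The first line printed is the Followers vs Lists pair at 1.4 and the last is WeFollow vs Klout at 5.9, matching the bottom rows of the tables above.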

Now, I’ve spent a lot of time agonising over how to calculate influence. If you do a quick search you’ll find a lot of people saying that Followers is not a good indicator of influence. Others say that perhaps Lists are better. But I don’t buy the other indicators. I don’t understand how they’re calculated and therefore I don’t understand what they mean or, importantly, what action to take. If you look at the Edelman equation for calculating TweetLevel, it’s horrendously complicated. What does it mean? How do I improve it?

But with Followers, I get it. As an analogy with paper circulations, I can say to people that if, say, archinect tweets about you, then around 7,000 people will see it. I get Lists too. They tell me that, for example, over 1,000 people have bothered to add architectmag to a list, which is pretty impressive, when compared to the others in the table.

So Followers vs Lists gives the lowest difference. From one angle you could say that’s just an indicator of the propensity of people to create lists, that is, for every 12 or so followers, one creates a list. But I don’t see any such ratio between number of followers and lists above.
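You can check the ratio yourself with a few lines of Python. This is a quick sketch using the Followers and Lists counts from the first table; the variable names are just for illustration.

```python
# Follower and list counts copied from the first table.
followers = {
    "ArchRecord": 107490, "archdaily": 18099, "dwell": 107712,
    "archiCentral": 15268, "archinect": 6570, "designmilk": 159372,
    "wallpapermag": 212487, "DesignObserver": 148762, "MetropolisMag": 9627,
    "architectmag": 10225, "dornobdesign": 30754, "AIANational": 6925,
    "Interior_Design": 20749, "blueprintmag": 5646, "archimag": 2933,
    "casinclair": 6647, "mocoloco": 9672, "designboom": 15493,
    "architectderek": 4774, "VariousArch": 2715,
}
lists = {
    "ArchRecord": 2136, "archdaily": 1763, "dwell": 3326,
    "archiCentral": 959, "archinect": 729, "designmilk": 5137,
    "wallpapermag": 6363, "DesignObserver": 5764, "MetropolisMag": 1104,
    "architectmag": 1145, "dornobdesign": 1407, "AIANational": 619,
    "Interior_Design": 1343, "blueprintmag": 641, "archimag": 346,
    "casinclair": 606, "mocoloco": 1022, "designboom": 2075,
    "architectderek": 312, "VariousArch": 357,
}

# Followers per list for each account.
ratio = {name: followers[name] / lists[name] for name in followers}

# Print from the fewest followers per list to the most.
for name, value in sorted(ratio.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.1f} followers per list")
```

On these figures the ratio runs from roughly 7 followers per list (designboom) to roughly 50 (ArchRecord), so there is no single fixed ratio tying the two numbers together.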

So I’m going to be a bit heinous here and go against the commonly accepted wisdom. I’m going to say, in a nicely numbered chain of inference, that:

  1. Followers and Lists are often dismissed as indicators of influence
  2. There are lots of Twitter influence metrics out there that are supposedly better
  3. If you take any two – or three, or four, etc – and compare them, often the differences will be fairly major
  4. This implies that no one metric is really any better than any other metric
  5. Except for Followers vs Lists which seem to tally the closest
  6. You can gain actionable insights from Followers and Lists which you cannot from the other metrics
  7. Therefore: Followers and Lists are the best indicators of influence

I’m prepared to believe that some of the super-duper pro systems out there can do this better. Influence also needs time to really identify who influences whom. I know that influence is cause and effect, input versus output, etc. And this is not a scientific test, it doesn’t have a sufficiently large sample, etc.

But, if you need to draw up a list without access to a pro system, this is my take on it. The supposedly more sophisticated metrics don’t cut it.

I know it’s controversial but if anyone else can provide a convincing argument otherwise, I’d like to hear it.

5 thoughts on “Twitter influence: who do you believe?”

  1. Yikes. I hope you don’t mind me basically slagging off your service! I’m unlikely to be in SF any time soon but I’m happy to chat about this – drop me a line at my contact page if you like.

  2. Brendan, bravo!

    A couple of notes about methodology, though.

    Because you started with the top-20 from WeFollow, you’re perpetuating an assumption that those were indeed the top-20. There might be a LOT more correlation with the other metrics, that are thrown out of whack because of the selection.

    Also, you’re not accounting for the gaps between. If your top-3 architects are far and away above the next 17, some of the scores might not account for that gap as accurately.

    But still, you parsed this down to a deep enough level to make a point, and it still sticks. We don’t know WHAT the heck we’re looking at.

    Maybe we can get a stats nerd to automate this, where you can compare several people across multiple engines, and make it easier to calculate the standard deviations that you’re approximating. We’ll generate the Dogpile of Influence, so to speak. That ought to be fun.

  3. I was interested in the overall scores of all measures added together. That approach (though obviously flawed) brings the following results, with the top 6 being almost identical to We Follow’s top 6:

    designmilk
    archrecord
    dwell
    archdaily/designobserver
    architectmag
    Metropolismag
    dornobdesign
    mocoloco
    archimag
    AIAN/wallpapermag
    blueprintmag
    designboom
    interior_design
    variousarch
    architectderek
    casinclair

  4. @Josh – that depends on how you’re bringing the scores together. That’s why you need to be very explicit about your methodology. For example, let’s say you just added Technorati Authority (typically in the hundreds) to Yahoo Inlinks (typically in the thousands). You’d then say that Yahoo Inlinks was the best metric because the rankings hardly changed – but that would just be because Inlinks are ‘bigger’ than Authority.

    So I’m interested in how you did this – blog post maybe?

    @Ike – I was wondering when you’d comment Ike!

    First, the methodology. True, the list is seeded using WeFollow but that’s simply because it’s a nicely categorised system, ie I can go there and specify a topic and it gives me the people, rather than hunt around for a (very) long time looking for people. I knew this would be a problem, which is why I then compared rankings between *every other metric*. That’s why we have so many tables. So even if, say, TweetLevel rankings would be different from WeFollow, it appears they are different from every other metric too.

    Let’s say the algorithms are equivalent but the databases are different, so that, if they all work to the same database, they give the same results. Doesn’t change this: if they’re working off different databases, they’re still just plain old different, and if they’re all different, we still don’t know which is best.

    The methodology is actually the same I used a long time ago to see whether there were correlations between different blog measures (I just spent 15 minutes looking for it to link to it but cannot find it!). Turned out Technorati Authority was the best overall indicator – but we all know what’s happened to Technorati since. So maybe that needs revisiting.

    The idea of an automated list is a good one – and I think that’s what Ad Age is doing with its Power150, right? So yes, maybe someone somewhere could start ranking people on Twitter according to the ‘rank of ranks’. Heck, we have aggregators of aggregators and search engines of search engines nowadays, so why not?

    Oh, and I was calculating standard deviations was I? Well I never!
