13 8 / 2012

How Spreaker Tracks Listening Statistics

UPDATED on December 24th, 2012

I recently joined some discussions with our users, who asked how our statistics work in detail. I realized there is some confusion about how people think we calculate statistics, which revealed the need for a detailed, technical explanation.

Introduction

Listening statistics are the primary metric for a podcaster. Just as a blogger measures success by counting the number of times an article gets read, a podcaster wants to know how many people listen to his or her audio content.

Tracking accurate statistics is not an easy task, but we do our best to be as accurate as we can. In this article, I’m going to explain in detail how statistics are tracked at Spreaker, showing strengths and weaknesses, and debunking some myths and wrong information you can find on the net.

How we track statistics

Listening statistics are tracked on the streaming servers. They’re not tracked client-side in the player, and this approach gives us two main benefits:

  1. we are able to track listens from third-party players that are not under our control, and
  2. it’s much more difficult for anyone to inflate the numbers with fake plays.

This means that we track every single play whose audio is served by our streaming infrastructure, but we’re unable to track statistics for content created with Spreaker and uploaded to other platforms. For instance, if you enable the sharing of your episodes on Soundcloud, we upload your mp3 file to Soundcloud, and Soundcloud users listen to the mp3 served from Soundcloud’s servers, so we’re unable to track those plays.

To summarize, Spreaker tracks statistics for listens from:

  • Spreaker website (desktop and mobile)
  • Spreaker embedded players (including Facebook, Tumblr, your own blog or site)
  • Spreaker mobile apps (iOS and Android)
  • Third party services and apps that rely on Spreaker streaming servers (including iTunes and TuneIn)

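Spreaker does not track statistics from:

  • Episodes created with Spreaker but uploaded to other platforms and served from their servers (for instance Soundcloud)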

Downloads

Downloads are counted separately from listens because a user can download one of your podcasts once and listen to it several times.

Because we cannot track every single listen of a downloaded podcast, if a user downloads a podcast instead of streaming it, we increment the “Total downloads” counter instead of the “Total plays” counter.
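
As a rough illustration of the rule above, here is a minimal Python sketch in which each served request increments exactly one of the two counters. The function name, the in-memory counter and the `is_download` flag are assumptions made for the example, not Spreaker's actual code.

```python
from collections import Counter

# Hypothetical in-memory counters keyed by (episode_id, counter_name).
counters = Counter()


def record_request(episode_id: str, is_download: bool) -> None:
    """Increment exactly one counter for each served request."""
    name = "total_downloads" if is_download else "total_plays"
    counters[(episode_id, name)] += 1


# Two streamed plays and one download of the same episode.
record_request("ep-1", is_download=False)
record_request("ep-1", is_download=False)
record_request("ep-1", is_download=True)

print(counters[("ep-1", "total_plays")], counters[("ep-1", "total_downloads")])
# prints: 2 1
```

A download never touches the plays counter, so however many times the file is replayed offline, it is counted once as a download.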

Difference between Total plays and Unique listeners

Total plays is the number of times your listeners have pressed the “play” button on your episodes and listened to at least a few seconds of audio.

Unique listeners is the number of different (unique) users who have listened to your episodes. This number is always less than or equal to total plays.
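
To make the difference concrete, the following Python sketch computes both numbers from a small list of plays. The `Play` record and the 5-second threshold are assumptions for illustration only; the article only says "a few seconds".

```python
from dataclasses import dataclass

# Assumed threshold: the exact number of seconds is not stated in the article.
MIN_SECONDS = 5


@dataclass(frozen=True)
class Play:
    listener_id: str
    seconds_listened: float


def summarize(plays):
    """Return (total_plays, unique_listeners) for the plays that count."""
    counted = [p for p in plays if p.seconds_listened >= MIN_SECONDS]
    total_plays = len(counted)
    unique_listeners = len({p.listener_id for p in counted})
    return total_plays, unique_listeners


plays = [
    Play("alice", 120),
    Play("alice", 45),   # same listener, second play
    Play("bob", 300),
    Play("carol", 1),    # too short to count as a play
]
total, unique = summarize(plays)
print(total, unique)  # prints: 3 2
```

Each counted play contributes one to total plays, but a listener is counted only once, which is why unique listeners can never exceed total plays.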


How unique listeners are calculated

Tracking unique listeners is a bit tricky because we have to identify who the listener is in order to correlate two different plays from the same person. This is a common problem for most internet statistics services (e.g. Google Analytics for websites).

To track unique listeners we associate a listener id with each play. This listener identifier is calculated as follows:

  1. if the user is logged in to Spreaker, we use their unique user id (high accuracy)
  2. if the user is not logged in to Spreaker or is using a third-party player (e.g. iTunes), we calculate a hash from several pieces of information, including the IP address and browser user agent (low accuracy)

After that, we calculate the number of unique listeners in a specific time range (e.g. daily or monthly unique users) by counting the distinct listener ids among all plays in that range.
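
The two steps above can be sketched in Python as follows. This is only an approximation under stated assumptions: the real hash uses "several" inputs that are not all listed here, and SHA-256, the `user:`/`anon:` prefixes and the function names are choices made for the example.

```python
import hashlib
from collections import defaultdict
from datetime import date, datetime
from typing import Optional


def listener_id(user_id: Optional[int], ip: str, user_agent: str) -> str:
    """Prefer the account id; otherwise fall back to a hash of request data."""
    if user_id is not None:
        return f"user:{user_id}"  # logged-in user: high accuracy
    # Assumption: only IP and user agent are hashed in this sketch.
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode("utf-8")).hexdigest()
    return f"anon:{digest}"  # anonymous or third-party player: low accuracy


def daily_unique_listeners(plays):
    """plays: iterable of (timestamp, listener id). Returns {day: unique count}."""
    ids_per_day = defaultdict(set)
    for timestamp, lid in plays:
        ids_per_day[timestamp.date()].add(lid)
    return {day: len(ids) for day, ids in ids_per_day.items()}


a = listener_id(42, "203.0.113.7", "Mozilla/5.0")
b = listener_id(None, "203.0.113.7", "iTunes/10.6")
plays = [
    (datetime(2012, 8, 13, 9, 0), a),
    (datetime(2012, 8, 13, 21, 0), a),  # same listener, same day
    (datetime(2012, 8, 13, 10, 0), b),
    (datetime(2012, 8, 14, 8, 0), a),
]
uniques = daily_unique_listeners(plays)
print(uniques[date(2012, 8, 13)], uniques[date(2012, 8, 14)])  # prints: 2 1
```

The fallback shows why anonymous tracking is less accurate: two people behind the same IP address with the same browser collapse into one listener, while one person who changes network or browser is counted twice.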

Are statistics updated in real time?

The short answer is: not all of them.

Calculating statistics is a heavy task that requires a lot of computation. For this reason, we track in real time the statistics that simply require an increment (i.e. total plays and total downloads), while we periodically calculate the aggregates of other statistics that require heavier processing (e.g. unique users). These aggregates are usually calculated every few hours.

Generally speaking, the statistics shown on public pages are real-time, while private statistics shown in the dashboard are updated every few hours (at worst, once a day).
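
The split between incremented counters and periodic aggregates can be illustrated with a small Python sketch. The `EpisodeStats` class and its method names are hypothetical, and the periodic job is simulated with an explicit method call rather than a real scheduler.

```python
class EpisodeStats:
    """Toy model: cheap counters update instantly, aggregates only on demand."""

    def __init__(self) -> None:
        self.total_plays = 0        # real-time: a simple increment
        self.unique_listeners = 0   # stale until the next aggregation run
        self._play_log = []         # raw listener ids awaiting aggregation

    def record_play(self, listener_id: str) -> None:
        self.total_plays += 1
        self._play_log.append(listener_id)

    def run_aggregation(self) -> None:
        """Stand-in for the periodic job that runs every few hours."""
        self.unique_listeners = len(set(self._play_log))


stats = EpisodeStats()
for lid in ["alice", "bob", "alice"]:
    stats.record_play(lid)

print(stats.total_plays, stats.unique_listeners)  # prints: 3 0
stats.run_aggregation()
print(stats.total_plays, stats.unique_listeners)  # prints: 3 2
```

Between two aggregation runs the play counter is already up to date while the unique listeners figure still reflects the previous run, which matches the delay you may notice in the dashboard.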


How is popular content ranked?

Ranking on Spreaker is designed to surface content that had many listeners in the last day and week. The algorithm details are secret and subject to change, but we can say which metrics we use to calculate it.

Metrics related to ranking (tracked in the last day and week - more is better):

  • Number of unique listeners and total plays
  • TTSL (total time spent listening)
  • Average play length
  • Number of followers
  • Content with an image, description, tags, …

Metrics not related to ranking:

  • Total number of episodes - it doesn’t matter how many episodes you have broadcast, so if you’re a new user with a huge audience, you can jump to the first position anyway