We'd like to keep Spreaker up 100% of the time. When that doesn't happen, we write about it here.
|Website|So far so good|
|API|So far so good|
|Streaming|So far so good|
|Mobile apps|So far so good|
Due to a mistake in the release process, this morning we released a broken version of our Android Radio app (4.0.4), prone to crashes during the application’s startup.
We worked around the clock to fix the issue as soon as we noticed it, and we’ve already published an updated version (4.0.5) on the Play Store.
This version is already available for automatic update. If you experienced this issue and your application has not been automatically updated yet, please open the “Play Store” app on your Android device and visit the “My Apps” section to update it manually.
We’re very sorry about what happened, and we’re already working on improving our continuous integration pipeline to prevent similar issues from happening again.
Thanks for your patience.
We’re currently having some networking issues in our primary datacenter run by AWS.
UPDATE at 15:00 UTC: networking issues are still ongoing, but should affect only a small number of users. We’re monitoring network connectivity from multiple locations and the issue’s impact is decreasing over time.
UPDATE at 15:15 UTC: AWS just reported that it is “investigating elevated packet loss between some Internet destinations and the EU-WEST-1 Region”.
UPDATE at 15:36 UTC: An external facility providing some connectivity to the AWS EU-WEST-1 Region has experienced power loss. AWS is currently working with the service provider to mitigate impact and restore power.
UPDATE at 16:23 UTC: AWS recovered power in the impacted facility and is continuing to investigate and resolve intermittent packet loss and latency between some Internet destinations and the EU-WEST-1 Region.
RESOLVED at 16:30 UTC: AWS confirmed the issue has been solved.
Playback currently doesn’t work on the latest Firefox when you browse the Spreaker website over HTTPS, due to stronger security policies. We’re working on it and we plan to get it fixed very soon.
In the meantime, we suggest using a different browser (e.g. Google Chrome) or temporarily browsing Spreaker over HTTP.
Thanks for your patience.
UPDATE at 11:00 UTC: the issue has been fixed. We’re monitoring the infrastructure to ensure everything runs smoothly. We’re really sorry for the inconvenience.
Since yesterday, YouTube sharing has not been working. We’re currently fixing it and re-uploading all failed videos to YouTube. This could take some time, due to the huge workload.
Thanks for your patience.
UPDATE at 15:20 UTC: all failed videos have been reprocessed and successfully uploaded to YouTube.
Since yesterday, Twitter login has not been working on our iOS applications. We’re working hard to fix it, and a new release will be uploaded in a few hours. Unfortunately, it could take a few days before it is available for download on your devices, due to the Apple review process.
Now that your Twitter and Spreaker accounts are connected, as soon as the new release is out in the App Store, you will be able to use the 1-click login feature to access your account.
As announced yesterday, Spreaker will be under maintenance for 15 minutes from 8:00 to 8:15 UTC. We’re going to upgrade our database servers, as a countermeasure following yesterday’s issues.
UPDATE at 8:10 UTC: database servers have been upgraded successfully. Spreaker is back. Thanks for your patience.
Spreaker mainly runs on a PostgreSQL database. We currently have two shards, each one in a master-slave streaming replication setup. Each database instance runs on AWS EC2 with four EBS SSD provisioned IOPS disks, in a RAID 0 (stripe) configuration.
Last night, at about 01:05 UTC, we noticed a slowdown of two EBS volumes attached to our master #1 database. The slowdown was intermittent and still acceptable, so we decided to keep an eye on it and wait. Unfortunately, at 02:00 UTC, those volumes suddenly stopped working and the master #1 database went down.
We immediately promoted the slave database to master, redirecting both read and write queries to a single database instance (instead of splitting the load between two instances). Despite the successful slave-to-master switch, the single instance was unable to process all requests and we hit a hardware limit (500Mb/s EBS bandwidth) that led to another slowdown. We started the process of creating a new replica, which took more time than expected. Once it was ready, at 03:20 UTC, the workload was split across master and slave again, and the slowdown disappeared.
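To make the failover above easier to follow, here is a minimal Python sketch of read/write routing across shards and of what changes when a slave is promoted. It is only an illustration under stated assumptions: the `Shard` class, the `pick_host` and `promote_replica` helpers, the host names and the modulo sharding scheme are all hypothetical and do not describe our actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shard:
    master: str             # hypothetical host name of the writable primary
    replica: Optional[str]  # hypothetical host name of the streaming replica


def pick_host(shards: List[Shard], key: int, is_write: bool) -> str:
    """Return the host that should serve a query for the given key."""
    # Assumed modulo sharding; the real sharding scheme is not described here.
    shard = shards[key % len(shards)]
    # Writes always go to the master. Reads go to the replica when one
    # exists, otherwise they fall back to the master as well.
    if is_write or shard.replica is None:
        return shard.master
    return shard.replica


def promote_replica(shard: Shard) -> None:
    """Fail over: the replica becomes the master and no replica is left."""
    if shard.replica is None:
        raise RuntimeError("no replica available to promote")
    shard.master, shard.replica = shard.replica, None
```

After `promote_replica` runs, `pick_host` returns the same host for both reads and writes on that shard, which is the situation that saturated the single instance until a new replica was ready.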
Tomorrow morning, at 8:00 UTC, we’ll put Spreaker in maintenance mode for about 10 minutes, in order to upgrade our database instances. We’ll double the RAM of each instance and migrate to an instance with a 1Gb/s EBS bandwidth cap.
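As a rough illustration of why the bandwidth cap matters, the sketch below compares a combined read and write load against an EBS bandwidth cap. The `ebs_utilization` helper and the 300 Mb/s figures are hypothetical and chosen only for the example; they are not measurements from the incident.

```python
def ebs_utilization(read_mbps: float, write_mbps: float, cap_mbps: float) -> float:
    """Return the fraction of the EBS bandwidth cap used by the combined load."""
    return (read_mbps + write_mbps) / cap_mbps


# Hypothetical load: reads and writes that normally hit two separate
# instances end up on a single one after a failover.
single_instance_old_cap = ebs_utilization(300, 300, 500)   # above 1.0: saturated
single_instance_new_cap = ebs_utilization(300, 300, 1000)  # below 1.0: headroom
```

A value above 1.0 means the instance would need more bandwidth than the cap allows, which is the kind of limit we hit while running on a single instance.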
We’re currently switching the primary database to another server, in order to recover from slow performance issues. During this timeframe, Spreaker is unavailable. We’re really sorry for the inconvenience.
UPDATE at 02:50 UTC: Spreaker is currently available, but still very slow. The primary database has been successfully migrated, but it’s slow to reply due to missing read replicas. We’re currently creating database replicas, which is taking more time than expected.
UPDATE at 03:33 UTC: Spreaker is now working. We’re really sorry for the inconvenience. Tomorrow, we’ll do a deep post-mortem analysis and post a plan to improve the database recovery process, in order to reduce downtime should this happen again.
We’re currently experiencing slow response times, due to a performance issue on our primary database server. We’re investigating it.
Streaming servers are currently under heavy load and you may not be able to listen to Spreaker audio tracks. We’re turning on more servers: it should be fixed in a few minutes.