Server and website down [Solved]

This morning the website and server were down.

We had multiple messages from our customers - not ideal.

Edit to add: to be more specific, most users went into the ‘Grace’ period, so there was no real issue for them.
But a couple of customers had not used our software for some time, and they were locked out when they tried to run it while the website/license servers were down.
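
For anyone wondering why only long-idle users were locked out, here is a minimal sketch of how an offline grace period typically behaves. It is not the vendor's actual logic: the function name `may_run`, the `GRACE_PERIOD` value, and the assumption that a reachable server always approves the license are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical value: the real grace period length is not stated in this thread.
GRACE_PERIOD = timedelta(days=7)


def may_run(last_verified: datetime, server_reachable: bool, now: datetime) -> bool:
    """Decide whether the software may start.

    last_verified    -- time of the last successful online license check
    server_reachable -- whether the license server answered this time
    now              -- current time (passed in so the logic is easy to test)
    """
    if server_reachable:
        # Simplification: assume the server confirms a valid license.
        return True
    # Server is down: allow the run only if the last successful check
    # is still inside the grace period.
    return now - last_verified <= GRACE_PERIOD
```

Under these assumptions, a user whose last check was two days ago keeps working during an outage, while one who last ran the software a month ago is blocked until the server is back.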


Hello,

Yes, I can also confirm that we had issues with the Web API (and possibly the activation system).
It happened between 1am and 9am UTC.

This now seems to be resolved.

We also experienced issues. I suspect this was another maintenance window with unforeseen problems. Advance communication about maintenance would be much appreciated.

We were also affected by the outage. Many of our customers are churches that use the software for their services, and with Easter one day away, some of them were very concerned.

It would be great to have a channel where we could get a status update in situations like this, so we can at least confirm to customers that the problem is being worked on, maybe with a time estimate.

Answer

I apologize for the downtime. It totaled around 3 to 4 hours (depending on the region).

This was unplanned downtime caused by what currently looks like a bug in the kernel networking driver. Unfortunately, it failed in a way that also prevented failover to our other servers.

We need to do more investigation into exactly what failed and why. Then we need to redesign our failover (yet again) to handle these types of failures.

In the meantime, we’ve updated the kernel driver and enabled more watchdog services to notify us of similar failures (until our redesigned failover is up and running).
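
As a rough illustration of what a watchdog of this kind does (not a description of the actual services in use), the sketch below probes a service repeatedly and sends one notification once a number of consecutive probes fail. The names `watch`, `probe`, `notify`, and the `threshold` default are hypothetical, and a real watchdog would also wait between probes.

```python
from typing import Callable


def watch(probe: Callable[[], bool],
          notify: Callable[[str], None],
          checks: int,
          threshold: int = 3) -> int:
    """Run `probe` `checks` times and alert on sustained failure.

    An alert is sent once each time the number of consecutive failed
    probes reaches `threshold`; a successful probe resets the counter.
    Returns the number of alerts sent.
    """
    failures = 0
    alerts = 0
    for _ in range(checks):
        if probe():
            failures = 0  # service is healthy again
        else:
            failures += 1
            if failures == threshold:
                notify(f"service failed {failures} consecutive checks")
                alerts += 1
    return alerts
```

Requiring several consecutive failures before alerting is a common way to avoid paging on a single dropped request.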

All signs point to an unfortunate kernel bug that may already have been fixed in the newer kernel we’re now running. But, as I said, we’ll dig into it further in the coming weeks.
