Microsoft to provide Azure users with 33 percent credit for February outage

By Mary Jo Foley | March 13, 2012, 7:25am PDT

Summary: Microsoft officials have posted a detailed analysis of what led to a widespread leap-year-day outage of its Azure public cloud service.

Microsoft is issuing many of its Windows Azure users with a 33 percent credit “due to the extraordinary nature” of the February 29 cloud-service outage caused by a leap-year bug.

Microsoft officials said all customers of its Azure Compute, Access Control, Service Bus and Caching services will get the credit for the entire billing month for those services, whether or not their service was affected. Microsoft execs shared that information, along with a play-by-play dissection of what caused the widespread outage, in a blog entry posted at 9 pm ET on Friday, March 9.

The widespread Azure outage began around 9 pm ET on February 28. Customers in Europe, North America and other areas were hit by a series of rolling problems over the course of two days. Many said they weren’t able to access the Azure dashboard, which was basically the only means by which Microsoft was sharing information about the status of the different Azure services. The outage was largely resolved by the morning (ET) of March 1.

The leap-year bug caused a first outage, which then led to a secondary outage. Bill Laing, the head of Microsoft’s server and cloud team, explained what happened:

“The leap day bug immediately triggered at 4:00PM PST, February 28th (00:00 UST February 29th) when GAs (guest agents) in new VMs tried to generate certificates. Storage clusters were not affected because they don’t run with a GA, but normal application deployment, scale-out and service healing would have resulted in new VM creation. At the same time many clusters were also in the midst of the rollout of a new version of the FC (fabric controller), HA (host agent) and GA.”
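Laing's quote, as excerpted here, does not spell out the faulty date arithmetic. As a rough illustration only, the Python sketch below shows one common form a leap-day bug can take: computing a "one year later" expiry date by bumping the year and keeping the month and day unchanged. The function names are hypothetical and are not taken from Microsoft's code or its analysis.

```python
from datetime import date


def naive_one_year_later(start: date) -> date:
    # Hypothetical buggy helper: bumps the year, keeps month and day as-is.
    # Raises ValueError for February 29, because the next year has no such day.
    return start.replace(year=start.year + 1)


def safe_one_year_later(start: date) -> date:
    # Hypothetical fix: fall back to February 28 when the target year
    # has no February 29.
    try:
        return start.replace(year=start.year + 1)
    except ValueError:
        return start.replace(year=start.year + 1, day=28)


leap_day = date(2012, 2, 29)

try:
    naive_one_year_later(leap_day)
except ValueError as err:
    print("naive computation failed:", err)

print(safe_one_year_later(leap_day))  # 2013-02-28
```

A test suite that runs date-handling code against February 29 inputs would surface this class of failure before the calendar does.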

Laing said Microsoft is taking steps to prevent future time-related bugs with new testing procedures, improvements in dashboard service availability, and a commitment to provide alternate communication channels when outages happen.


Mary Jo has covered the tech industry for more than 25 years for a variety of publications and Web sites, and is a frequent guest on radio, TV and podcasts, speaking about all things Microsoft-related. She is the author of Microsoft 2.0: How Microsoft plans to stay relevant in the post-Gates era (John Wiley & Sons, 2008).

Disclosure

Mary Jo Foley

Freelance journalist/blogger Mary Jo Foley has nothing to disclose. WYSIWYG (what you see is what you get). I do not own Microsoft stock or stock in any of its partners or competitors. I have no business ventures that are sponsored by/funded by Microsoft or any of its partners or competitors.


Comments

CostCloud
scH4MMER 1 day ago
While an unfortunate occurrence, this event clearly demonstrates how many of the risks of outsourcing part of your infrastructure to the Cloud are offset by service level agreements. When errors are committed by a private IT department, there is no compensation. I'm speaking about the Cloud in general, not just Azure.
1 Vote
It's not easy getting leap years working;-)
Richard Flude Updated - 2 days ago
This company is priceless.

Take the credit, or move to a unix cloud platform?
2 Votes
Unix has no bugs, right?
jdzions 2 days ago
So of course, one outage on Azure is going to chase you to a unix cloud platform, like, say, AWS, which has never had a multi-day outage before.

0 Votes