fleafounder Admin Mon 8/25/14 12:08 PM

On Sunday, August 24 we had a significant outage that lasted 120 minutes. Several hundred live drafts experienced limited connectivity and did not work correctly. I want to take some time to explain what happened, how we're working with the affected leagues, and what we're doing to make sure this doesn't happen again.

What went wrong?

Fleaflicker's infrastructure is as simple as possible, but it still involves six major components. These components are tried-and-tested, rock-solid foundations. But they interact in ways that can be tricky to anticipate, especially under heavy load.

Our live drafts are hosted across 8 servers to accommodate heavy traffic. All the servers need to communicate to coordinate the drafts. Sunday's outage was caused by a bug in this communication layer that only surfaced under extremely high load. We stress-test all our systems before release but this was a situation where testing was not enough. Only real-world interactions could trigger the error. It took our engineers two hours to track down the problem. Once it was identified, we corrected it quickly, but for owners with drafts between 8:30 PM and 10:30 PM Eastern, the damage was done.

How can I salvage my draft?

Commissioners of private leagues have the following options:

Resume your draft (and undo invalid selections)

If your league stopped drafting during the outage, you can pick up where you left off at any time. The commissioner can load the live draft room, undo the invalid auto-selections if necessary, and pause the draft to give all owners a chance to join.

Schedule a new live draft

If your draft completed and you want to clear the results and draft again, the commish can do this by clicking Clear rosters or Drop non-keepers from the settings page, and then scheduling a new draft time.

If your draft still shows as “in progress” and you're unable to clear the rosters, please contact us and we'll fix it for you immediately.

Where do we go from here?

Modern web systems are complex. I know many owners are upset and switching to a different fantasy site. This is understandable. But bigger companies with vastly greater human and financial resources are not immune to these problems. Yahoo had a similar outage Sunday night that was still unresolved as of Monday morning.

We're reviewing all of our server configurations and automatic failover capabilities to make sure they are not affected by this same error.

We always look at technical problems as an opportunity for us to learn and improve. I am the engineer responsible for this particular mistake, so I am especially sorry for the downtime and its impact. For those owners who stay, thank you for your continued support. We are making significant investments and improvements so we can live up to the trust you've placed in us.

-Ori Schwartz,

Founder and lead engineer

rwk1967 Mon 8/25/14 6:32 PM

Well said Ori, and you're correct, the so-called BIG sites also have problems... Yahoo had theirs Sunday too and a MAJOR one two years ago... CBS also had a server crash that took almost a week to correct... NFL.com and others have their glitches too with scoring and such, so nobody is immune..... Glad to see the OLD timer check in and make a statement though!!! Hope the married life is treating you good :)

sleepyjim Mon 8/25/14 2:38 PM

Yeah ok, I understand, but like a week ago we had the same issues drafting, granted on a smaller scale......

Gets old......

Jim

CW-Mac Mon 8/25/14 2:06 PM

Still the best fantasy sports site around. Things happen. You guys fixed it, and now it's time for everyone to move on. You guys keep up the great work. Best site by far.