Postmortem -
Read details
Jan 10, 00:09 EST
Resolved -
This incident has been resolved.
Jan 9, 23:51 EST
Update -
All of the affected newsletters on dedicated pools have been successfully delivered.
Jan 9, 23:40 EST
Update -
All of the affected newsletters on our shared pools have been delivered successfully.
Jan 9, 23:03 EST
Update -
We are confirming that our outbound email deliverability - that is, whether an email sent from Letterhead reaches its inbox - is in good shape. We had marked it "degraded performance" at the beginning of this incident, before we had identified the underlying problem, but want to confirm that it is not, and was not, at issue.
Fundamentally, the problem was with our error monitoring. Occasional failures are common, and Letterhead recovers from them automatically. In the event it doesn't, someone is always around to intervene. This evening there was a monitoring failure that resulted in a small bug related to email-reputational silos we call "subaccounts," but because our system (dubbed the "dead letter office") wasn't aware of it, the issue cascaded until an unrelated memory-usage threshold monitor rang alarm bells. This coincided with reports to our customer support team.
Consequently, a small-potatoes error resulted in a pretty severe throttle of our newsletter transmission pipeline. When a transmission fails for any reason, it will retry after a delay. This built-in delay is designed to help avoid a traffic jam. This evening, that fail-and-retry cycle happened a lot.
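For readers curious how retry delays can add up to a throttle, here is a minimal sketch in Python. It is not Letterhead's code: the function name `simulate_retries`, the tick-based clock, the fixed retry delay, and the per-tick sending capacity are all assumptions made purely for illustration.

```python
import heapq
from typing import Callable, List, Tuple


def simulate_retries(
    arrivals: List[int],
    should_fail: Callable[[int, int], bool],
    retry_delay: int = 5,
    capacity_per_tick: int = 2,
    max_ticks: int = 100,
) -> List[Tuple[int, int]]:
    """Return (job_id, delivered_at_tick) for every job delivered in time.

    arrivals[i] is the tick at which hypothetical job i is first ready.
    should_fail(job_id, tick) reports whether an attempt at that tick fails.
    """
    # Min-heap of (ready_tick, job_id): the earliest-ready job goes first.
    pending = [(tick, job_id) for job_id, tick in enumerate(arrivals)]
    heapq.heapify(pending)
    delivered: List[Tuple[int, int]] = []

    for tick in range(max_ticks):
        attempts = 0
        # Failed attempts still consume sending capacity for this tick.
        while pending and pending[0][0] <= tick and attempts < capacity_per_tick:
            _, job_id = heapq.heappop(pending)
            attempts += 1
            if should_fail(job_id, tick):
                # Assumed policy: re-queue the job after a fixed delay.
                heapq.heappush(pending, (tick + retry_delay, job_id))
            else:
                delivered.append((job_id, tick))
    return delivered


# Four jobs ready at tick 0, with every attempt failing until tick 10.
healthy = simulate_retries([0, 0, 0, 0], lambda job, tick: False)
degraded = simulate_retries([0, 0, 0, 0], lambda job, tick: tick < 10)
print(healthy)
print(degraded)
```

In this toy model, four jobs that would normally finish by tick 1 finish at ticks 10 and 11 when every attempt fails until tick 10. Each failure costs a full retry delay and also uses up capacity that other sends could have used.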
Our queues are still evening out.
To reiterate: a newsletter sent now shouldn't be delayed. The affected newsletters appear to have been sent between 3:00 p.m. EST and about 8:30 p.m. EST.
We very much apologize for the inconvenience. Our team is monitoring through the night.
Jan 9, 22:21 EST
Update -
We are continuing to see improved performance. We are confident your emails will reach their audience, even though some may be delayed. Newsletters sent any time in the last couple of hours do not seem to be delayed, which is good. We apologize for the hassle. We will mark this incident resolved only when everything has returned to normal, and we will follow up with a full post-mortem.
Jan 9, 21:48 EST
Monitoring -
Newsletters are sending at volume and our queues are slowly returning to normal. Delays are still elevated. We have confirmed that the underlying issue is resolved, but given the sheer volume in our system, it will take time to even out.
Jan 9, 20:40 EST
Identified -
We appreciate your patience. We believe we have traced the issue that was creating a queuing bottleneck. We are seeing queues return to normal, but delays are still elevated and it will take time to normalize.
Jan 9, 19:54 EST
Update -
We appreciate your patience.
Jan 9, 19:22 EST
Investigating -
We are seeing an elevated number of sending delays. Our team is currently investigating the situation.
Jan 9, 19:04 EST