[Internode-Bulletins] Internode customer mail server problems (June 28 to July 1st 2005)

internode-bulletins at lists.internode.on.net
Sat Jul 2 16:31:59 CST 2005


Dear Internode customer,

As you will be aware, the Internode customer mail server cluster 
suffered serious performance issues and mail delays in the period from 
June 28th to July 1st 2005.

Our mail service normally delivers around 1.2 million messages a day. 
During this period, the incoming spam and virus traffic volume 
dramatically exceeded normal levels, peaking at an incoming message rate 
more than 50 times the usual level.

The mail cluster continued to deliver customer email throughout, although 
delivery was substantially delayed for many customers. 
The delays were the consequence of the system processing, and rejecting, 
this onslaught of additional incoming message traffic.

As another consequence of the exceptionally high message load, some 
customer email messages were delivered multiple times. The system does 
this to guard against non-delivery of the messages concerned.

We are pleased to say that the problems caused by this incident have now 
been resolved, the backlog of delayed email has now been cleared, and 
normal services have been restored.

We take any sort of disruption to our service very seriously. We have 
had teams of people working solidly to fix this problem, around the 
clock, since it began.

We made a decision to guarantee that email got through eventually – even 
if it took some time to work through the backlog – rather than just 
clear the queues and start from scratch, which would have caused even 
more problems for our customers.

During this incident, we configured and installed additional network 
hardware to permanently block the incoming spam onslaught by creating a 
second tier of high-performance email firewalling for our mail cluster.

This new system is capable of rejecting incoming spam and virus attacks 
of this sort on a sustained basis, and operates in addition to the 
anti-spam/anti-virus software already operational inside the mail 
cluster itself.

In a round-the-clock effort, a separate technical team worked to improve 
the efficiency of the underlying cluster, installing additional mail 
processing cluster service nodes and installing higher performance disk 
server systems. These efforts allowed the server to clear the backlog 
still faster, and will also provide sustained benefits in terms of 
future server performance.

We apologise to every customer who was disrupted by these email problems.

While the source of the problem was beyond our control, our response to 
it was as rapid, and effective, as humanly possible in the circumstances.

We assure you that we will invest whatever it takes to further harden 
our mail system against any future problems of this nature.

We would like to remind you that when service problems are encountered 
at Internode, we post advisory information on the Internode web site and 
regularly update it. You can consult this advisory information via the 
following link:

http://cgi.internode.on.net/advisories/list.html

Finally, if you have questions or comments about this incident, you are 
welcome to write to the Internode management team by sending email to 
feedback at internode.on.net.

Regards,

The Internode Team


