2009-02-12: 00:44 UTC  
New SSL Certificates
Over the next several days we will be replacing the SSL certificates on all web, SMTP, IMAP, and POP3
servers. This is being done in response to the recent publication of a possible attack on MD5-signed
SSL certificates. In short, the researchers have created a Certificate Authority (CA)
signing certificate that can be used to sign end-entity SSL certificates that will appear to have been
issued by the real CA.
The gory details are here.
Exploiting this MD5 vulnerability requires considerable cryptographic knowledge and a significant amount of
computing power to create the fake CA signing certificate. The attacker then has to convince the victim
to connect to the fake server via DNS hijacking, social engineering, or phishing techniques.
Financial institutions would be the likely targets should a fake CA certificate actually be
generated outside of the laboratory.
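As a rough illustration of how one might spot a certificate that is still MD5-signed, the sketch below searches a certificate's DER bytes for the md5WithRSAEncryption object identifier (1.2.840.113549.1.1.4). This is a heuristic byte search rather than a full ASN.1 parse, it only looks at the certificate the server presents, and the function names are made up for this example; it is not the tooling we use.

```python
import ssl

# DER encoding of the OID 1.2.840.113549.1.1.4 (md5WithRSAEncryption).
MD5_WITH_RSA_OID = bytes.fromhex("06092a864886f70d010104")


def uses_md5_signature(der_cert: bytes) -> bool:
    """Heuristic: True if the md5WithRSAEncryption OID appears in the DER bytes."""
    return MD5_WITH_RSA_OID in der_cert


def server_uses_md5(host: str, port: int = 443) -> bool:
    """Fetch the certificate a server presents and apply the heuristic above."""
    pem = ssl.get_server_certificate((host, port))
    return uses_md5_signature(ssl.PEM_cert_to_DER_cert(pem))
```

Calling `server_uses_md5("mail.example.com", 993)` (a placeholder host name) would check an IMAP-over-SSL server in the same way.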
2009-02-09: 15:19 UTC  
Internal routing problem
Apologies for the delay; it's been a trying day.
We use the OSPF routing protocol internally to advertise the IP addresses of each service to the
border routers, providing load balancing and failover. The routers were losing OSPF adjacency and
the assumption was that this was an OSPF bug in the routers or in the routing daemons running
on the physical servers. OSPF bugs are not unheard of.
It appeared that the OSPF processes in the routers were consuming most of the router CPU.
Much time was wasted shutting down
all OSPF daemons and adding static routes to provide access to the IMAP and SMTP servers when
the real problem was elsewhere. With OSPF shut down, the routers were still seeing bursts of 100%
CPU, causing periods of total packet loss.
The problem was isolated to a switch in our first floor rack by disconnecting all trunks to the
first floor and to our upstreams and reconnecting them one by one. Everything was then disconnected from
the first floor switches and reconnected one machine at a time and tested.
This was a time-consuming process.
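The one-at-a-time isolation described above amounts to a linear search. A minimal sketch follows, assuming hypothetical `connect`, `disconnect`, and `network_healthy` callbacks that stand in for plugging cables and watching router CPU; in practice these were manual steps, not a script.

```python
from typing import Callable, Iterable, Optional


def find_culprit(
    devices: Iterable[str],
    connect: Callable[[str], None],
    disconnect: Callable[[str], None],
    network_healthy: Callable[[], bool],
) -> Optional[str]:
    """Reconnect devices one at a time; return the first that breaks the network."""
    for device in devices:
        connect(device)
        if not network_healthy():
            disconnect(device)  # pull it again so the network recovers
            return device
    return None  # every device reconnected without trouble
```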
The culprit was a machine in our
first floor rack that was spewing packets of some sort that were driving the routers to 100% CPU. Counters
on the switches and on the machines themselves were not out of the ordinary, which hid the real problem.
We have redundant routing, trunks, and switches, with two Ethernet interfaces on each server. With this
configuration, the network will survive total hardware failures but not what we experienced today. We are not
new to routing, and this is the first time we have seen a failure like this.
No mail was lost. The network being down will not cause mail to be lost (unless it's an Exchange server, but that's
not our problem). SMTP is a robust queue-and-retry protocol.
Mail is queued until it can be delivered to the next hop and a positive acknowledgement of receipt is received.
It's worked that way for 20 years.
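The queue-and-retry behaviour can be sketched in a few lines. This is a simplified, in-memory illustration with a hypothetical `deliver` callback standing in for the next hop; a real mail server keeps its queue on disk and waits between retry attempts.

```python
from collections import deque
from typing import Callable, Deque, List


def flush_queue(
    queue: Deque[str],
    deliver: Callable[[str], bool],
    max_passes: int = 10,
) -> List[str]:
    """Try to deliver each queued message; keep anything not acknowledged."""
    delivered: List[str] = []
    for _ in range(max_passes):
        if not queue:
            break
        for _ in range(len(queue)):
            message = queue.popleft()
            if deliver(message):       # True stands for a positive acknowledgement
                delivered.append(message)
            else:
                queue.append(message)  # no acknowledgement: requeue and retry later
    return delivered
```

While the next hop is unreachable, `deliver` keeps returning `False` and every message simply stays in the queue, which is why an outage delays mail rather than losing it.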
Webmail is now working.
Mail is back up and beginning to flow - for those using IMAP desktop clients. All other
processes should be coming online in the next few hours, if not before.
Static routes will be put in place within 40 minutes and should fix the problem.
We apologize for this extremely unusual interruption.
It is a routing problem - no mail will be lost.
Router reload did not fix the problem.
Reloading routers now.