Random server events 2012-06-11


Message boards : Server backend and mirrors : Random server events 2012-06-11

Author Message
Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 11412 - Posted: 11 Jul 2012, 16:00:07 UTC

I was moving backup data around today and had inadvertently connected the backup drives directly to mains power instead of to the UPS. In a peculiar mix of random events, a nearby lightning strike took out power for a few seconds, causing a kernel panic on the main server as the backup file systems abruptly disappeared during heavy I/O.
Normally this kind of unlucky event would incite the use of bad language in the local server room, but in this case it turns out that the reboot fixed a problem that had been escalating since June 30th: a lot of Java threads had been spinning out of control, keeping the server under a constant 25% load. After the crash and reboot, things are back to normal.

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 11413 - Posted: 11 Jul 2012, 16:03:51 UTC

It turns out that the first runaway processes were probably caused by the leap second that was added during the night between June 30 and July 1, 2012. It coincides nicely with the start of the CPU drain.

Heh, always learning new stuff:
- Hook up backups through the UPS
- The time can actually be 23:59:60 UTC occasionally
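
To illustrate the 23:59:60 point: here is a minimal Python sketch, not anything that runs on the BURP server. It shows that the standard `datetime` type refuses a seconds value of 60, and one possible way to accept such a timestamp anyway by clamping it to :59 and flagging it. The helper names `datetime_accepts_second` and `parse_utc_allowing_leap_second` are made up for this example, and clamping is just one assumed policy among several.

```python
from datetime import datetime, timezone


def datetime_accepts_second(second):
    """Return True if datetime can represent 2012-06-30 23:59:<second> UTC."""
    try:
        datetime(2012, 6, 30, 23, 59, second, tzinfo=timezone.utc)
        return True
    except ValueError:
        # datetime only allows seconds in the range 0..59
        return False


def parse_utc_allowing_leap_second(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC.

    Returns (moment, is_leap_second). A seconds field of 60 is clamped
    to 59 (an assumed policy for this sketch) and reported via the flag.
    """
    date_part, time_part = text.split(" ")
    year, month, day = (int(p) for p in date_part.split("-"))
    hour, minute, second = (int(p) for p in time_part.split(":"))
    is_leap_second = second == 60
    if is_leap_second:
        second = 59
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return moment, is_leap_second


if __name__ == "__main__":
    print(datetime_accepts_second(59))
    print(datetime_accepts_second(60))
    print(parse_utc_allowing_leap_second("2012-06-30 23:59:60"))
```

The flag is returned alongside the clamped value so that calling code can decide for itself whether the distinction matters.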

Profile DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 392
Credit: 2,168,338
RAC: 22
Message 11424 - Posted: 25 Jul 2012, 8:31:16 UTC

Ah, BURP is finally back again. :-)
What caused this long outage, Janus?
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 11427 - Posted: 26 Jul 2012, 11:35:07 UTC - in response to Message 11424.

I created a new thread for that.

