Planned downtime Mar 27


Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4418
Credit: 2,094,806
RAC: 0
Message 10873 - Posted: 27 Mar 2011, 11:26:26 UTC
Last modified: 27 Mar 2011, 11:34:55 UTC

For about half an hour on Mar 27 (some time around 14:00 UTC), the main storage server will be offline so that storage and wiring can be physically moved around in preparation for an expansion of the storage array. This will make the website somewhat unstable - hopefully everything will be fully back to normal after the Monday maintenance tomorrow.

Once everything is back and functional, the Windows Sunflower clients will be updated to a new, fixed Blender version that is capable of correctly rendering the rabbit in Big Buck Bunny. To push the update to the farm without causing any issues, the Windows render queue must be empty. It currently contains both a 5GB RAM session and a +8GB RAM session. If your client has less than 8GB of memory available, you may see it idling while the remaining high-mem units are cleared out.
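To illustrate why a smaller client may sit idle, here is a rough sketch of memory-based eligibility. It is not the project's actual scheduler code: the `Session` class, the `eligible_sessions` helper and the exact thresholds are hypothetical, and reading "+8GB" as "at least 8GB" is an assumption.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class Session:
    name: str
    min_ram_bytes: int  # hypothetical per-session memory requirement


def eligible_sessions(client_ram_bytes, queue):
    """Return the sessions whose memory requirement the client can meet."""
    return [s for s in queue if s.min_ram_bytes <= client_ram_bytes]


# The two sessions mentioned above; the thresholds are assumed, not measured.
queue = [Session("5GB session", 5 * GB), Session("8GB+ session", 8 * GB)]

for ram_gb in (4, 6, 16):
    names = [s.name for s in eligible_sessions(ram_gb * GB, queue)]
    print(f"{ram_gb} GB client:", names or "idle")
```

Under these assumptions, once the 5GB session is finished, only clients with 8GB or more still have work to pick up, so everything below that idles until the queue is empty.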

Profile Janus
Message 10874 - Posted: 27 Mar 2011, 15:40:45 UTC
Last modified: 27 Mar 2011, 15:41:58 UTC

The storage switch took around an hour longer than expected because the system initially refused to boot up in the new configuration due to an incorrectly configured jumper.

Now the storage server no longer looks like spaghetti and things are looking great for the upcoming upgrades. The first upgrade is a set of new 2TB drives that will be enabled either during the Monday maintenance or later this week (they are hot-pluggable).

Profile Janus
Message 10882 - Posted: 30 Mar 2011, 17:51:10 UTC

The two disks were added today and they are currently happily syncing up.

It turned out that one of the disks (a Samsung HD204UI) had a firmware bug that caused it to sometimes write erroneous data to the platters when accessed with a disk diagnostics tool while writing (!). Luckily, this was discovered during initial load tests of the drive, before it was added to the RAID. The firmware was updated (through DOS - oh the good old days) and the drive has been acting exceptionally well since then.

