Hardware failure

Planned downtime and current issues.

Moderator: drgrussell

drgrussell
Site Admin
Posts: 426
Joined: Sat Feb 12, 2005 8:57 pm
Are you a robot or a human?: Human

Hardware failure

Post by drgrussell » Mon Oct 20, 2008 11:54 am

Just not my month...
10.200.0.9 has failed (suspected power supply failure).
Repair underway (estimated time to fix is 2-3 days).
I may simply replace it with one of the 64-bit replacement servers I have just started testing...

Gordon.

Post by drgrussell » Mon Oct 20, 2008 9:37 pm

I have activated an old server, hosting IPs in the range 10.0.1.*.
It may be old but it works better than the broken server :)

This should tide us over until I have the new replacement servers up and running.

Gordon.

New hardware

Post by drgrussell » Wed Oct 22, 2008 12:56 pm

I am replacing 10.0.6.* and 10.0.7.* with 64-bit servers today.
They get set up on my bench, tested in linuxzoo for a few hours, then relocated to the server room. The relocation can cause users to lose their sessions. The work should be finished by 4pm BST today.

Note that, as a result, all users on the above-named servers will lose their current disk image.

Post by drgrussell » Wed Oct 22, 2008 4:58 pm

10.0.6.*, 10.0.7.*, and 10.0.8.* are now all brand new 64-bit dual quad-core machines with 4GB each. I also noticed a bug on 10.0.19.*, which I fixed. 10.0.1.* has been disabled again (as it caused more problems than it solved). This brings the maximum capacity of virtual machines to 86 simultaneous users. As the biggest group is 50, this should be more than adequate even if a single server fails.
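
The headroom behind that claim can be checked with a little arithmetic. This is only a rough sketch: the totals (86 and 50) come from the post, but the per-server split is not stated, so `lost_capacity` is a hypothetical parameter for the number of slots a failed server would take with it.

```python
# Figures quoted in the post.
TOTAL_CAPACITY = 86   # maximum simultaneous virtual machines
LARGEST_GROUP = 50    # biggest group of users


def spare_capacity(total, demand):
    """Slots left over once the largest group is logged in."""
    return total - demand


def tolerates_loss(total, demand, lost_capacity):
    """True if demand still fits after losing lost_capacity slots.

    lost_capacity is hypothetical: the post does not say how many
    virtual machines each individual server hosts.
    """
    return total - lost_capacity >= demand


headroom = spare_capacity(TOTAL_CAPACITY, LARGEST_GROUP)
print(f"Spare slots with the largest group online: {headroom}")
```

In other words, the largest group still fits as long as the failed server hosts no more than the spare slots computed above.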

Hopefully this will mark the end of this problematic week in the labs! Fingers crossed...

Enjoy.
Gordon.

Edit - Not 2GB each but 4GB each.
Last edited by drgrussell on Fri Oct 24, 2008 3:56 pm, edited 1 time in total.

Post by drgrussell » Fri Oct 24, 2008 2:01 pm

The new servers are running at the wrong speeds. I have traced this to a BIOS problem. Upgrading the BIOS on a test machine of the same spec fixes the speed problem. I will be updating the BIOS of the production servers today. Each server will be down for about 15 minutes during this time...

I will try to do this when no one is logged into the servers in question.

Gordon.


Edit - The BIOS on all three new servers is now upgraded, and they are now running at the right speed (20% faster, believe it or not).
