Start of a new academic year

Planned downtime and current issues.

Moderator: drgrussell

drgrussell
Site Admin
Posts: 426
Joined: Sat Feb 12, 2005 8:57 pm
Are you a robot or a human?: Human

Start of a new academic year

Post by drgrussell » Tue Sep 23, 2008 10:31 am

This year there are many changes planned. We will be teaching more students over the next few months than ever before, so I will be bringing some more servers online to handle the load. These are old development servers, but they should be more than adequate. Capacity should be around 80 virtual machines between now and Christmas.

Some of my servers are really old: the current machines are 1.6 Athlons with 1 GB of memory. I am replacing these with 64-bit dual quad-core servers with 4 GB of memory. 64-bit is problematic for User Mode Linux, so I am going to start by replacing one server and monitoring its performance. The main change you will see is that in dmesg skas3 is replaced with skas0. I also had to update the guest kernel on that machine to 2.6.26. If you have problems with a guest on the new kernel, let me know.
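One way to confirm which mode a guest is using is to search its kernel log for the mode name. The following is a minimal sketch, not part of linuxzoo: the function names `skas_modes` and `read_dmesg` are made up for illustration, and the exact wording of the kernel message may vary between kernel versions, so the search only looks for the mode token itself.

```python
import re
import subprocess


def skas_modes(kernel_log: str) -> list[str]:
    """Return the distinct skas modes mentioned in a kernel log, sorted."""
    # Match "skas0" or "skas3" as a whole word, in any letter case.
    digits = re.findall(r"\bskas([03])\b", kernel_log, flags=re.IGNORECASE)
    return sorted({f"skas{d}" for d in digits})


def read_dmesg() -> str:
    """Read the kernel log by running dmesg (may need privileges on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Inside a guest, `skas_modes(read_dmesg())` would be expected to mention skas0 on the new server and skas3 on the older ones.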

I am replacing my backup power supplies with 3 kW APC devices. My previous backup supplies failed more often than the mains power did! This change is happening today. Some servers may go up and down for a few hours during the rack reorganisation required to install the batteries.

Changes happening soon:
  • I have QEMU running Windows 2003 in the test environment. I will add this to the system over the next few months. For licensing reasons this will only be available to non-free users.
  • I have rewritten the backend of linuxzoo to support other virtual machine styles. This system should also be more efficient and reliable.
  • I hope to add the ability to book more than one virtual machine at some point over the next 6 months. The interface to this will probably be a little strange, as again I am focusing on the backend code.
  • Once the new backend is checked, I will be extending the virtual systems to support not only User Mode Linux and QEMU, but also dynamips, pemu, and VMware. I am also looking at XP and Mac guests.
  • The frontend interface needs to be rewritten. This is the task for next year.

Server failure

Post by drgrussell » Tue Sep 23, 2008 5:15 pm

10.0.5.* failed today while I was working on the new 64-bit server and the backup power supplies.
As a fast solution I have rolled out a new 64-bit server in its place.
Obviously I will be monitoring the new machine to make sure the 32-bit to 64-bit transition has had no significant impact on the user experience.

10.0.6.* failed a few weeks ago, but the bad blocks were all in the swap partition! I remade the swap and it seems to be working again.

I seem to have had a number of hard drive failures, all after 2.5 years of continuous use. Oh well...

Gordon.

IP Pools

Post by drgrussell » Tue Sep 30, 2008 1:26 pm

I edited the system today so that users behind a NAT pool or other IP pool should be able to use telnet to connect to the system. This relies on me knowing that you have a pool (more than one external IP). If it doesn't work for you and you would like your pool added, let me know the details. I already have AOL...
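To illustrate why a pool has to be known in advance, here is a minimal sketch of a pool-aware address check. It assumes that telnet access is tied to the address you logged in from, which is an assumption and not a description of the real linuxzoo code. The pool name, the address ranges (reserved documentation ranges), and the function names are all hypothetical.

```python
import ipaddress

# Hypothetical table of known pools: name -> networks belonging to that pool.
KNOWN_POOLS = {
    "example-pool": [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("198.51.100.0/24"),
    ],
}


def pool_for(address: str):
    """Return the name of the known pool containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, networks in KNOWN_POOLS.items():
        if any(ip in network for network in networks):
            return name
    return None


def same_user_origin(login_address: str, telnet_address: str) -> bool:
    """Accept an exact address match, or two addresses from the same known pool."""
    if ipaddress.ip_address(login_address) == ipaddress.ip_address(telnet_address):
        return True
    pool = pool_for(login_address)
    return pool is not None and pool == pool_for(telnet_address)
```

With a table like this, a user whose connections leave through different external addresses in the same pool is still recognised, while an unknown pool falls back to the exact-match rule.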

Crazy people

Post by drgrussell » Mon Oct 06, 2008 10:45 pm

Today one of the free users launched a portscan at an external site.
The virtual network is designed to collapse when this happens, preventing DoS style attacks. However, I feel I must take steps to prevent possible future problems.

I have increased the firewall security. Free users are forbidden from accessing the internet from the virtual machines except by HTTP and DNS. HTTP goes through Squid, and DNS is redirected to an internal DNS server. Both are throttled.

Normal users on non-free servers are still allowed ports > 1024 to the internet for now.
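The policy above can be summarised as a small decision function. This is only a sketch of the rules as described, not the actual firewall configuration: the function name is hypothetical, the standard ports 80 and 53 are assumed for HTTP and DNS, and it is an assumption that users on non-free servers also keep HTTP and DNS access. Throttling and the Squid and DNS redirection are not modelled.

```python
HTTP_PORT = 80  # assumed standard HTTP port
DNS_PORT = 53   # assumed standard DNS port


def outbound_allowed(free_user: bool, dest_port: int) -> bool:
    """Decide whether a guest may open a connection to the internet."""
    if dest_port in (HTTP_PORT, DNS_PORT):
        # Stated for free users; assumed to hold for non-free servers too.
        return True
    if free_user:
        # Free users get nothing beyond HTTP and DNS.
        return False
    # Users on non-free servers are still allowed ports above 1024.
    return dest_port > 1024
```

For example, `outbound_allowed(True, 8080)` is refused while `outbound_allowed(False, 8080)` is permitted.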

Gordon.

Problem with new 64-bit server

Post by drgrussell » Thu Oct 16, 2008 8:39 pm

The 10.0.5.* virtual machine IDs are now managed by the new 64-bit server. I thought it was going well, but I didn't notice a growing problem of resources being locked under some conditions. This may have left users on that server unable to log in...

I have fixed 3 bugs on that server... it remains to be seen whether this fixes the problem. However, one of the fixes means that any remaining problems should occur only very rarely (before this fix, the only shutdown which worked was the "last ditch" kill of the virtual machines, which uses kill signals).
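The idea of escalating to a "last ditch" kill can be sketched as follows. This is not the linuxzoo code: it assumes each virtual machine is an ordinary host process, and `stop_process` and the optional `request_shutdown` callback are hypothetical names.

```python
import subprocess
import sys


def stop_process(proc, request_shutdown=None, grace=5.0):
    """Stop a process in stages and return the name of the stage that worked."""
    stages = []
    if request_shutdown is not None:
        # Polite request first, e.g. asking the guest to shut itself down.
        stages.append(("graceful", lambda: request_shutdown(proc)))
    stages.append(("terminate", proc.terminate))  # SIGTERM on POSIX
    stages.append(("kill", proc.kill))            # SIGKILL on POSIX, the last ditch
    for name, action in stages:
        action()
        try:
            proc.wait(timeout=grace)
            return name
        except subprocess.TimeoutExpired:
            continue  # this stage did not work in time, so escalate
    raise RuntimeError("process survived every stage")
```

If the earlier stages are broken, every shutdown ends up at the final stage, which matches the symptom described above.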

Thus things should be fine again on that server. I will continue to monitor it.
Just when I thought it was going well too!

Gordon.
