Tuesday, 31 August 2010

Hello World

Hi, Sam here.

When I heard Brendan was going to be starting a blog to help with the VCAP study, I thought it would be a great opportunity for me to stop procrastinating and work on it too.

So, just a brief background on me: I started using VMware back in 2000, deployed my first production ESX site in 2003 (ESX 2.5), and am currently running ESXi 4.0 Update 2 with an upgrade to 4.1 pending in three weeks. We have effectively been running "cloud services" since 2005, offered under the guise of Utility Computing.

In the coming months, I hope to cover quite a bit of the VCAP blueprint as it relates to my current environment. Along with that, I want to cover a few areas that appear to be lacking from what I have heard in podcasts and user group discussions: things like developing a cost model, billing models, virtualised storage, monitoring (both open source and COTS solutions), and some other titbits learned by bloodying my nose on the sharp end of innovation.

So, enough from me for now - talk to you all soon.

Saturday, 7 August 2010

Building an ESXi provisioning environment: Part 1

There are three things I knew I was going to need to build my provisioning environment:
  • A DHCP server
  • A TFTP server
  • An FTP or web server
The DHCP server should be obvious. I need my hosts to auto-configure their network awareness, and they need to know where to pick up the PXE image they're going to use to bootstrap the ESXi install process.

The TFTP server is going to provide the PXE boot image and ESXi bootstrap files.

The FTP or web server is the repository from which the ESXi installable image, kickstart script (the ESXi install script), and post-install configure/build scripts are going to come.

As far as I'm concerned, the quickest and least resource-intensive way to achieve all three is to knock up a small Linux box. I chose Debian.

The DHCP server is of the ISC variety and is provided by the dhcp3-server package.

I went with the tftpd-hpa package, but any TFTP server will work for this - I'm not using any interesting features.

I opted for a web server over an FTP repository for my install media as it lends itself to a configuration script repository better than FTP. Apache's httpd server is my weapon of choice and Debian puts it in the apache2 package.

The PXE boot environment files we need come from the syslinux-common package. You'll need that too.

Install with:

vm-sys-cfg:~# apt-get install dhcp3-server tftpd-hpa apache2 syslinux-common

I've configured my host with the IP address of 10.0.0.1/24, which sets the scene for my DHCP server configuration.
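
For reference, on Debian that static address would normally live in /etc/network/interfaces. Here's a minimal sketch, assuming the provisioning host's build-network NIC is eth0 (adjust to suit your hardware):

# /etc/network/interfaces (fragment) - assumes eth0 faces the build network
auto eth0
iface eth0 inet static
    address 10.0.0.1
    netmask 255.255.255.0

Bring it up with ifup eth0 (or /etc/init.d/networking restart) before going any further.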

My basic dhcpd configuration is:

subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.201 10.0.0.210;
}


I'm not expecting to be building more than ten hosts at once, but it's easy to extend anyway.
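
One Debian-specific gotcha while we're here: dhcp3-server decides which interfaces to listen on from /etc/default/dhcp3-server, so if yours refuses to answer, check that file. Assuming the same eth0 as above, mine would look something like:

# /etc/default/dhcp3-server - interface(s) dhcpd should serve requests on
INTERFACES="eth0"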

The next part is the PXE, or in this case gPXE, configuration. I used the ESXi 4.1 setup guide to, er, guide me through this part, as I'd never heard of gPXE before now. Unfortunately, the VMware setup guide's gPXE example is poorly explained. They spend a lot of time talking about how great gPXE is because you can boot from HTTP or FTP (or iSCSI, or ATA-over-Ethernet, or whatever), but their guide still sets you up to boot from TFTP, just like a regular PXE stack - so what's the point?

The VMware guide includes:

option space gpxe;
option gpxe-encap-opts code 175 = encapsulate gpxe;
option gpxe.bus-id code 177 = string;

class "pxeclients" {
    match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
    next-server 10.0.0.1;
    if not exists gpxe.bus-id {
        filename "/srv/tftp/gpxelinux.0";
    }
}


In a nutshell, this defines a DHCP class called "pxeclients" based on whether or not the vendor-class-identifier option sent by the client contains the string "PXEClient" in its first nine characters. Not surprisingly, the PXE ROM in your network card does announce this.

So inside this class we can specify options and configuration that only apply to PXE clients. I've given the IP address of my Linux host as the next-server option; that's where the PXE client looks for its boot files. The next bit (with the guide's filename changed to a path relative to my TFTP root):

if not exists gpxe.bus-id {
    filename "/gpxelinux.0";
}


basically says: if you've not identified yourself as a gPXE client, retrieve and load the /gpxelinux.0 image.

tftpd-hpa uses /var/lib/tftpboot as its root directory, so copy /usr/lib/syslinux/gpxelinux.0 into it, i.e.

vm-sys-cfg:~# find /var/lib/tftpboot/
/var/lib/tftpboot/
/var/lib/tftpboot/gpxelinux.0
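
Before moving on it's worth checking that TFTP is actually serving that file. The sketch below assumes the lenny-era tftpd-hpa packaging (newer releases use a different format for /etc/default/tftpd-hpa), and any tftp client will do for the test:

# /etc/default/tftpd-hpa - run standalone, rooted at /var/lib/tftpboot
RUN_DAEMON="yes"
OPTIONS="-l -s /var/lib/tftpboot"

vm-sys-cfg:~# /etc/init.d/tftpd-hpa restart
vm-sys-cfg:~# tftp 10.0.0.1
tftp> get gpxelinux.0
tftp> quit

If the get comes back clean, the TFTP side of the house is ready.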


Cool. But what if you are a gPXE client? Then you'll try to pull down your configuration file from where you came from: TFTP. I want to use HTTP as my boot protocol from this point on, so I've changed that stanza to read:

if not exists gpxe.bus-id {
    filename "/gpxelinux.0";
} else {
    filename "http://10.0.0.1/";
}
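
For the sake of completeness, here's roughly what my /etc/dhcp3/dhcpd.conf looks like with all the pieces above assembled (trimmed of the distribution's boilerplate; the authoritative line is my own addition and is the usual choice for a dedicated lab network):

option space gpxe;
option gpxe-encap-opts code 175 = encapsulate gpxe;
option gpxe.bus-id code 177 = string;

authoritative;

subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.201 10.0.0.210;
}

class "pxeclients" {
    match if substring(option vendor-class-identifier, 0, 9) = "PXEClient";
    next-server 10.0.0.1;
    if not exists gpxe.bus-id {
        filename "/gpxelinux.0";
    } else {
        filename "http://10.0.0.1/";
    }
}

Don't forget a /etc/init.d/dhcp3-server restart after editing.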


Now everything that gPXE does from this point will be relative to http://10.0.0.1/, which means I've got to set up my web server file tree.

My apache document root contains this file structure:

vm-sys-cfg:~# cd /var/www
vm-sys-cfg:/var/www# find
.
./index.html
./cfg
./cfg/cfg-00:0c:29:f2:55:a7.sh
./dist
./dist/imagedd.md5
./dist/imagedd.bz2
./ks
./ks/ks.cfg
./boot
./boot/vmkboot.gz
./boot/vmkernel.gz
./boot/ienviron.vgz
./boot/cim.vgz
./boot/sys.vgz
./boot/install.vgz
./boot/mboot.c32
./boot/menu.c32
./pxelinux.cfg
./pxelinux.cfg/default
./gpxelinux.0
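
As with TFTP, a quick check that Apache is happily serving the tree saves head-scratching at boot time. On a stock Debian apache2 install /var/www is already the document root, so something like this from any box on the 10.0.0.0/24 network should come back with a 200:

vm-sys-cfg:~# wget -S --spider http://10.0.0.1/boot/mboot.c32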


The ./pxelinux.cfg/default file will be pulled down by the gPXE image. Mine's been ripped pretty much straight out of the VMware manual:

default 1
prompt 1
menu title VMware VMvisor Boot Menu
timeout 50

label 1
kernel boot/mboot.c32
append boot/vmkboot.gz ks=http://10.0.0.1/ks/ks.cfg --- boot/vmkernel.gz --- boot/sys.vgz --- boot/cim.vgz --- boot/ienviron.vgz --- boot/install.vgz

label 0
localboot 0x80


(Keep the append directive on a single line)

It gives me two options: I can either boot into the ESXi installer (label 1) or boot from the local disk (label 0). The default is the ESXi installer, which will be invoked automatically after five seconds (the syslinux timeout value is in tenths of a second). Cool.

Everything in my ./boot directory came off the root of the ESXi install CD, as did the ./dist/imagedd.bz2 and ./dist/imagedd.md5 files. I could have just dumped everything in the root of /var/www but this way is neater.
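
For the record, I populated those directories by loop-mounting the ESXi installable ISO and copying the files across. A sketch, assuming the ISO is sitting in /tmp (your filename and build number will differ):

vm-sys-cfg:~# mount -o loop /tmp/VMware-VMvisor-Installer-4.1.0-xxxxxx.x86_64.iso /mnt
vm-sys-cfg:~# cp /mnt/*.c32 /mnt/*.gz /mnt/*.vgz /var/www/boot/
vm-sys-cfg:~# cp /mnt/imagedd.bz2 /mnt/imagedd.md5 /var/www/dist/
vm-sys-cfg:~# umount /mnt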

The last part of the magic is the kickstart script referenced by the ks=http://10.0.0.1/ks/ks.cfg part of my PXE boot line.

For now, I've used:

# Accept the VMware End User License Agreement
vmaccepteula

# Set the root password for the DCUI and Tech Support Mode
rootpw p4ssw0rd

# Erase partition table on first local disk
clearpart --firstdisk=local --overwritevmfs

# Choose the first discovered disk to install onto
autopart --firstdisk --overwritevmfs

# Install from the media repository on my web server
install url http://10.0.0.1/dist/

# Set the network to DHCP on the first network adapter
network --bootproto=dhcp --device=vmnic0 --addvmportgroup=0


The install url http://10.0.0.1/dist/ directive says to perform the install from my website.

And there we go. Everything you need to automatically build an army of basic ESXi servers. The resulting ESXi servers will use DHCP for their default vmkernel adapter.

Next time I'll cover my lab DNS setup and automatically deploying a basic per-host configuration on my freshly built ESXi systems using the %firstboot kickstart section.

VCAP and building an ESXi provisioning environment

So I'm starting down the VMware VCAP road, and I want to be thorough about it, because I've read the first-hand reports of people who've sat the VCAP Datacenter Administration beta exam and it sounds tough.

Being thorough means a painstaking walk through the exam blueprint. There's going to be lots to learn, as the VCAP isn't just about basic vCenter and the vSphere hypervisor any more: it means spending time with Orchestrator, vShield Zones, the PowerCLI and much more besides, stuff which I haven't had a need to explore up to now.

But the best place for me to start is with ESXi 4.1, or as it now seems to be called (according to the download), VMvisor. Partly this is because up to now all of my VMware hypervisor work has been with ESX, console operating system and all, and that's for historical reasons (I've been doing this since before ESXi existed). But mostly it's because vSphere 4.1 is the last time we'll see the ESX service console, as all future versions will be ESXi-only.

Now, in the interests of making life easy for myself, I've built a simple provisioning environment to take advantage of the new scripted install mode in ESXi 4.1. This means I can have a central repository of build and configure scripts and spin up an arbitrarily complex ESXi test lab at the push of a couple of PXE-capable network-card buttons. This is a Good Thing™ because each task/step in the exam blueprint isn't necessarily going to be feature-compatible with the preceding or subsequent steps, so being able to tear down and rebuild a lab repeatably and consistently is going to prove essential.

I'm building my ESXi lab on top of ESXi itself. Each 4GB physical host will house at least two 2GB ESXi systems. I'm doing that because it lets me take advantage of memory over-provisioning since I don't have the budget for more than 4GB per host, and I can have as many virtual network cards per virtual-ESXi as I need. Throw in a few Linux VMs as routers and I've got a WAN in a box as well. There'll be more about the lab architecture later.

The test lab doesn't have to be fast; it just has to work. So I don't mind so much if I end up swapping with the virtual-ESXi guests.

Anyway, that's the précis. The build environment comes next.