Lights out without a LOM

2007-05-25

From the first time I played with the LOM (Lights Out Management) interface on a Netra, I've wanted a way of getting that functionality on all of my boxes. I've found that by making use of a serial port, Wake on LAN, and a TFTP server, you can get similar functionality without breaking the bank. This guide assumes that you're attempting to boot x86 hardware with PXE and Wake on LAN support and are using OpenBSD for the netboot server. I'll try to point out any significant differences between OpenBSD and Linux or other Unixes.

First, record the MAC address of the system you want to work with; you'll need it later for DHCP and Wake on LAN. The easiest way to get the MAC is to run /sbin/ifconfig or /sbin/ip link, but if the box isn't running or doesn't have an OS installed, you can pop the cover off and read it off the NIC. Or, for the extremely lazy (myself included), you can have the box attempt to PXE boot and watch your DHCP server logs.
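
For the lazy route, a few lines of Python can pull the MAC out of the DHCP server logs. This is only a sketch: it assumes ISC dhcpd style log lines of the form "DHCPDISCOVER from <mac> via <interface>", and the function name discover_macs is mine, not part of any tool.

```python
import re

# Matches the client MAC in log lines such as
# "DHCPDISCOVER from 00:11:22:33:44:55 via em0".
# The log format is an assumption; adjust the pattern for your DHCP server.
DISCOVER_RE = re.compile(
    r"DHCPDISCOVER from ((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})"
)


def discover_macs(lines):
    """Return the unique MACs seen in DHCPDISCOVER lines, in first-seen order."""
    seen = []
    for line in lines:
        match = DISCOVER_RE.search(line)
        if match:
            mac = match.group(1).lower()
            if mac not in seen:
                seen.append(mac)
    return seen
```

Feed it the lines of whatever file your dhcpd logs to, for example discover_macs(open(path)), where path is your own log file.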

Next, set up your TFTP server by creating /tftpboot and uncommenting the tftp line in /etc/inetd.conf, then restart inetd. Depending on what hardware you're booting and what you want to install or boot, you need to add the proper files to your /tftpboot directory. I keep a number of different kernels and bootloaders on my TFTP server so that I can boot just about any hardware I come across with any distro I want.

Let's assume that you want to install Debian on an x86 box. The Debian folks have been nice enough to create a tarball of everything you need to get going on a TFTP server. Just grab /dists/stable/main/installer-i386/current/images/netboot/netboot.tar.gz from your nearest Debian mirror and unpack it in /tftpboot. In the event that the tarball isn't available for the branch or distro you want, you'll need at least a pxelinux file and a kernel. Other files may be necessary depending on the distribution. Check your documentation for details.

Now you need to add a section to your /etc/dhcpd.conf so that the boot filename and the address of the TFTP server are given to the PXE client when it broadcasts for DHCP.

host netboot {
    hardware ethernet xx:xx:xx:xx:xx:xx;
    fixed-address 10.1.1.201;
    filename "debian/i386/pxelinux.0";
    next-server 10.1.1.1;
}

The first line declares a specific host. The name isn't really important here unless you've got dynamic DNS set up, and that's outside the scope of this article. For now, just pick something that tells you what this section does when you come back to it later. Fill in your MAC address in the hardware ethernet line so that the DHCP server knows which host we're talking about. The fixed-address line is only necessary if you want this host to always get the same IP address. If not, you can leave it out, but make sure you've got a dynamic range set up elsewhere in your dhcpd.conf. The filename line tells the client where the PXE code to run lives, relative to your /tftpboot directory. I keep a number of different images for various distributions and architectures on my server, so all of my boot images are organized by <distro>/<arch>/file. In some cases, paths for config files and whatnot are hardcoded into the PXE loader and I resort to symlinking to various files. Finally, the next-server line tells the client which TFTP server to download the image from.
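
If you end up with more than a couple of netboot clients, generating these stanzas beats copying and pasting them. Here is a minimal sketch; host_stanza and its parameter names are hypothetical, and it only covers the four statements described above.

```python
def host_stanza(name, mac, filename, next_server, fixed_address=None):
    """Render a dhcpd.conf host block for one netboot client."""
    lines = [
        "host %s {" % name,
        "    hardware ethernet %s;" % mac,
    ]
    # fixed-address is optional: leave it out to use the dynamic range.
    if fixed_address:
        lines.append("    fixed-address %s;" % fixed_address)
    lines.append('    filename "%s";' % filename)
    lines.append("    next-server %s;" % next_server)
    lines.append("}")
    return "\n".join(lines)
```

Calling host_stanza("netboot", "xx:xx:xx:xx:xx:xx", "debian/i386/pxelinux.0", "10.1.1.1", "10.1.1.201") gives back the same block shown above, ready to append to dhcpd.conf.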

Once you've saved dhcpd.conf and restarted dhcpd, your box should netboot with its output on the local console, assuming that the BIOS is configured to allow PXE booting. Double check that it grabs a configuration from DHCP and downloads the image from the TFTP server. Once this is working, you're ready to move on to the serial port.
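
If the download step fails, it helps to test the TFTP server on its own before blaming the client. The sketch below sends a TFTP read request (RFC 1350) and reports whether the server starts sending the file. The function names are mine, and it assumes the server listens on the standard UDP port 69 and negotiates no options.

```python
import socket
import struct


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", 1) + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def parse_reply(packet):
    """Return ("data", block, payload), ("error", code, message) or ("other", ...)."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode == 3:  # DATA
        (block,) = struct.unpack("!H", packet[2:4])
        return ("data", block, packet[4:])
    if opcode == 5:  # ERROR
        (code,) = struct.unpack("!H", packet[2:4])
        message = packet[4:].rstrip(b"\0").decode("ascii", "replace")
        return ("error", code, message)
    return ("other", opcode, packet[2:])


def tftp_file_available(server, filename, timeout=3.0):
    """True if the server answers the read request with the first data block."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_rrq(filename), (server, 69))
        # The reply comes from a fresh server port, so remember the peer.
        packet, peer = sock.recvfrom(4 + 512)
        kind, number, _ = parse_reply(packet)
        if kind == "data" and number == 1:
            # Abort with an ERROR packet; we only wanted proof the file is served.
            sock.sendto(struct.pack("!HH", 5, 0) + b"probe done\0", peer)
            return True
        return False
    except OSError:
        # Timeouts and network errors both count as "not available".
        return False
    finally:
        sock.close()
```

With the example config above, tftp_file_available("10.1.1.1", "debian/i386/pxelinux.0") should come back True once inetd is serving /tftpboot.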

This is where things really deviate depending on what distro you're trying to boot. I'll do my best to cover as many scenarios as possible.

Syslinux (pxelinux)

Add a line that looks like this at the top of your pxelinux configuration file (for example pxelinux.cfg/default):

serial 0 9600

Here 0 is the serial port you want to use (0 is COM1) and 9600 is the speed. Unless you've got a good reason, I wouldn't set the speed faster than 9600 baud. While this isn't a rule, it's a bit of a standard that if you plug in a serial console at 9600-8n1, things should work.
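
If you keep per-host settings, it's worth knowing which config file pxelinux will actually read. It looks in a pxelinux.cfg directory next to pxelinux.0, trying a name built from the MAC address first, then the client IP in hex with one digit dropped per attempt, and finally default. The sketch below lists those names in order; the function name is hypothetical and the client UUID lookup that pxelinux can try first is left out.

```python
import socket


def pxelinux_config_candidates(mac, ip):
    """List config file names in the order pxelinux tries them (UUID omitted)."""
    # "01" is the ARP hardware type for Ethernet, followed by the MAC with dashes.
    names = ["01-" + mac.lower().replace(":", "-")]
    # The IPv4 address as eight uppercase hex digits, e.g. 10.1.1.201 -> 0A0101C9.
    hex_ip = socket.inet_aton(ip).hex().upper()
    for length in range(len(hex_ip), 0, -1):
        names.append(hex_ip[:length])
    names.append("default")
    return ["pxelinux.cfg/" + name for name in names]
```

For the host in the DHCP example, the IP based names start at pxelinux.cfg/0A0101C9, so a serial line placed in pxelinux.cfg/default only takes effect if none of the more specific files exist.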

GRUB

Add the following lines to your /boot/grub/menu.lst:

serial --unit=0 --speed=9600 --word=8 --parity=no --stop=1
terminal serial

It is advisable to set up a boot password, because anyone who gets access to your serial console could otherwise boot into single user mode and gain root access to your system. Generate a password hash by running md5crypt at the grub prompt, then add the following to your menu.lst:

password --md5 <hash>

LILO

Add the following line to your lilo.conf to enable serial console access. Once again, a password is advisable.

serial=0,9600n8

OpenBSD

The BSD bootloaders are a bit different from LILO and GRUB. They tend to resemble OpenBoot in some ways, making complicated configurations not so complicated. OpenBSD's pxeboot will attempt to download etc/boot.conf from your TFTP root for additional configuration information. It'll still work if this file doesn't exist. Create /tftpboot/etc/boot.conf and add the following line to it:

set tty com0

This will switch the bootloader's interface to the serial port, giving you complete control of the system. No additional changes are needed because the BSD kernel will get its console configuration from the bootloader. Your system should boot and be happy.

Linux Kernels

Depending on the kernel you're booting, you may also need to specify a serial console in the kernel arguments by appending the following to the kernel command line (the append line in a pxelinux config):

console=ttyS0,9600n8

If you boot the system now, you'll be able to see kernel messages scroll by as the system boots. Depending on the userspace setup, you may or may not get a login prompt on the serial port. If you don't, you'll need to find a way to add the following line to /etc/inittab:

co:2345:respawn:/sbin/getty ttyS0 CON9600 vt102

Once your system is installed and running, you may have to tweak the installed bootloader and/or userspace to get the serial console working again. This probably seems like a lot of extra work without much gain, but once you've done it a few times, it becomes second nature and the benefits of running the system on a serial console are apparent.
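
Since every loader spells the same port and speed differently, I find it handy to generate all of the lines from one place so they can't drift apart. This is just a sketch that reproduces the settings shown in the sections above; the function name and dictionary keys are made up for the example.

```python
def serial_console_settings(unit=0, speed=9600):
    """Return the serial console line(s) for each loader covered above."""
    return {
        "pxelinux": "serial %d %d" % (unit, speed),
        "grub": ("serial --unit=%d --speed=%d --word=8 --parity=no --stop=1\n"
                 "terminal serial" % (unit, speed)),
        "lilo": "serial=%d,%dn8" % (unit, speed),
        # The OpenBSD line only selects the port; the speed is not set here.
        "openbsd": "set tty com%d" % unit,
        "linux": "console=ttyS%d,%dn8" % (unit, speed),
    }
```

Printing serial_console_settings()["linux"] gives the console=ttyS0,9600n8 argument from the Linux Kernels section, and changing unit or speed updates every loader at once.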

The last piece of the puzzle is Wake on LAN. WOL works by having the NIC watch for a 'magic packet' on the local link while the system is powered off (ATX power supplies provide 5 volt standby power to the motherboard in the off state for just this sort of thing). When the magic packet is received, the NIC sends a wake up event to the motherboard, which relays it to the power supply as if the power button had been pushed. You may need to enable these features in the BIOS and/or network card configuration. Not all cards support PXE booting, and some that do call it something different. If in doubt, Google for the model of the card.
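
The magic packet itself is simple: six 0xFF bytes followed by the target MAC address repeated sixteen times. Here is a minimal sketch of building and broadcasting one. The function names are mine, and sending it as a UDP broadcast to port 9 is a common convention rather than a requirement, since the NIC only looks for the byte pattern.

```python
import socket


def build_magic_packet(mac):
    """Six 0xFF bytes followed by the target MAC repeated sixteen times."""
    digits = mac.replace(":", "").replace("-", "")
    if len(digits) != 12:
        raise ValueError("expected a 48-bit MAC address, got %r" % mac)
    # bytes.fromhex also raises ValueError if the digits are not valid hex.
    return b"\xff" * 6 + bytes.fromhex(digits) * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP on the local segment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
    finally:
        sock.close()
```

The packet is always 102 bytes long, which makes it easy to spot in a packet capture if a box refuses to wake up.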

In order to send the magic packet, you need to have at least one machine already running on the network segment. There are dozens of scripts on the net for sending out Wake on LAN packets, but because I'm a Python advocate, I tend to prefer this one. Create a .wake_hosts file in your home directory with one host on each line in the format:

<hostname><tab><mac address>

Put the wake script in your PATH and run wake to send out the magic packet. If all goes well, you'll see your serial console come to life.
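
If you'd rather roll your own than use the script I prefer, reading the host file is the only other piece you need. The sketch below parses the tab-separated format described above; it is not that script's code, and skipping blank lines and '#' comments is my own assumption.

```python
def parse_wake_hosts(text):
    """Map hostnames to MAC addresses from '<hostname><tab><mac address>' lines."""
    hosts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skipping blanks and '#' comments is an assumption.
        name, _, mac = line.partition("\t")
        if mac:
            hosts[name.strip()] = mac.strip().lower()
    return hosts
```

Read the file with open(os.path.expanduser("~/.wake_hosts")).read(), look up the hostname in the resulting dictionary, and hand the MAC to whatever sends your magic packet.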

While the serial console may not be practical for system installs, netbooting certainly is. But once you've got everything running, the serial console can be an invaluable tool in case you manage to lock yourself out of the system by crashing sshd, adding a bad rule to iptables, messing up the routing table, installing a bad kernel, or losing the root password. This approach is limited in comparison to a LOM in that you don't get direct access to the BIOS or more complete diagnostic tools without setting up something like memtest86.

A word on security: in most cases, adding another avenue of access to the system that is not strictly needed is considered bad security practice. However, by installing a boot loader password and securing your terminal server, the chances of a successful attack through the serial console are probably on par with running an SSH daemon. Even if it does open a security hole, I believe the benefits of having a lifeline like this outweigh the risks.
