Of course, it’s probably worse when you get nothing at all, or a “No Operating System Found”. But the unsettling thing for many sysadmins about the grub> prompt is that it doesn’t respond to the usual Linux commands, and the inline help isn’t hugely helpful either. I’m going to demonstrate a few useful GRUB commands for recovering from a failed boot, and explain a few things about GRUB and bootloaders along the way.
A broken GRUB config typically arises when creating an extra OS partition, or when migrating disks. Depending on the operating system, or its setup options, the old GRUB configuration could be rendered invalid. The usual wisdom is to simply boot from a Live installation CD/DVD into recovery mode and fix everything from there. That’s one way, but I’ve always felt a little more self-sufficient being able to fix the problem without an extra tool such as a USB pen or CD. You don’t need to have written down a lot of long kernel version strings either, as I’ll show you. The commands are easier to remember if you understand what is actually happening.
What a boot is
Linux booting is a multistage process. When a computer is powered on, it has nothing in RAM. The operating system itself, however, cannot be loaded and executed directly. Instead, a small piece of machine code is executed from the Master Boot Record (MBR), the first 512 bytes of the first disk, which in turn loads the GRUB program from disk. The GRUB program is known as a “bootloader”, and its purpose is to prepare the way for running the operating system itself.
The term “bootloader”, also called the “bootstrap loader”, is so called after the expression “to pull oneself up by one’s bootstraps”, implying the performing of an impossible feat using only one’s own resources. So in that sense, the expression accurately describes a bootloader.
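GRUB prepares the way by loading the Linux kernel image into memory along with what’s called an initial RAM disk, or initrd: a small temporary root filesystem that the kernel uses until the real one can be mounted. These terms will become relevant soon.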
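To make the MBR stage a little more concrete, here is a minimal shell sketch that checks whether the first 512-byte sector of a disk (or disk image) ends with the 0x55 0xAA boot signature that a BIOS traditionally looks for. The function name mbr_has_boot_signature is my own invention for illustration, and the sketch assumes dd, od and tr are available.

```shell
# Hypothetical helper (not part of GRUB): succeed if the file or device
# given as $1 carries the 0x55 0xAA signature in bytes 510 and 511.
mbr_has_boot_signature() {
    # Skip the first 510 bytes, read the next two and print them as hex.
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}
```

On a real machine you would run it as root against the boot disk, for example mbr_has_boot_signature /dev/sda && echo "signature present". It only inspects the last two bytes of the sector, so it says nothing about whether the boot code itself is healthy.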
What GRUB does
GRUB stands for GRand Unified Bootloader. As well as being able to boot any of a whole host of operating systems (including Windows and other flavours of Unix), it complies with the Multiboot Specification, and it provides a menu that lets the user choose between several kernels, which may reside on different disk partitions.
GRUB, or more specifically, GRUB2, has become the de facto standard on several distros of Linux and is installed by default. I’m going to demonstrate two scenarios where the GRUB configuration has gone wrong on the default volume configuration of two flavours, namely:
- Ubuntu installed on disk partitions
- Fedora installed on LVM volumes
Fixing the boot from the GRUB prompt
For both examples, the same thing happens (never mind how it got that way) – from the BIOS, we see the text “Grub Loading”, and then no menu or boot messages, just a single solitary prompt:
grub>
But the good news is that this means that the MBR is intact, and the GRUB program has been loaded.
The first thing to do is to find out what disk devices are present on the system, as far as GRUB is aware. The goal will be to determine which device contains the Linux kernel we want to run, and to boot up a running OS. Once we get a running system, we can fix the GRUB configuration to get the menu back with the automatic booting.
Booting Ubuntu installed on disk partitions
By default, Ubuntu installs straight to disk partitions, and doesn’t use any volume management. This scenario makes a GRUB recovery easier, because all data is stored unencapsulated, and easily readable by GRUB.
From the grub> prompt, list all disk devices known to the system:
grub> ls
(hd0) (hd0,msdos4) (hd0,msdos3) (hd0,msdos2) (hd0,msdos1)
The goal is to work out which partition stores the Linux kernel image. This is a file called vmlinuz* and on Ubuntu it’s under the /boot directory, but conveniently it’s symbolically linked from /. In general this will be the first partition, so we’ll list the root filesystem (/) like so:
grub> ls (hd0,msdos1)/
Which shows that this is the correct partition (in fact, it could also have been specified simply as (hd0,1)). Then only four commands are needed to boot the kernel. Happily, tab-completion can be used in the GRUB shell to save on typing. When you have several kernels with long version strings, this saves a lot of time and makes things more accurate:
- set root=(hd0,msdos1) – this specifies the partition from which to load the images.
- linux /vmlinuz ro root=/dev/sda2 – load this Linux kernel, with arguments. The root= value is the Linux device name of the partition that holds the root filesystem, so adjust it to your own layout (GRUB’s (hd0,msdos1) is normally /dev/sda1 under Linux).
- initrd /initrd.img – load this initial RAM disk.
- boot – fire it up.
Executing it on a system looks like this:
The result, after some scrolling output, is a running Ubuntu host.
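Since the root= argument needs a Linux device name while GRUB shows its own names, a small translation helper can be handy when preparing notes on a working machine. The sketch below is an illustration only: grub_to_linux_dev is a hypothetical name, and it assumes the disks show up under Linux as /dev/sdX in the same order that GRUB enumerates them, which is common on simple single-controller systems but not guaranteed.

```shell
# Hypothetical helper: translate a GRUB 2 device name such as "(hd0,msdos1)"
# into the usual Linux name, e.g. /dev/sda1.
# ASSUMPTION: GRUB's hd0, hd1, ... correspond to /dev/sda, /dev/sdb, ...
grub_to_linux_dev() {
    spec=${1#\(}                        # strip the leading "("
    spec=${spec%\)}                     # strip the trailing ")"
    disk=${spec%%,*}                    # e.g. "hd0"
    part=""
    case $spec in
        *,*) part=${spec#*,}            # "msdos1", "gpt2" or plain "1"
             part=${part##*[!0-9]} ;;   # keep only the trailing digits
    esac
    case $disk in
        hd*) ;;                         # only hdN disks are handled here
        *)   return 1 ;;
    esac
    index=${disk#hd}
    case $index in
        ''|*[!0-9]*) return 1 ;;        # not of the form hdN
    esac
    [ "$index" -lt 26 ] || return 1     # only sda..sdz in this sketch
    # GRUB counts disks from 0, so hd0 maps to the first letter.
    letter=$(printf 'abcdefghijklmnopqrstuvwxyz' | cut -c "$((index + 1))")
    printf '/dev/sd%s%s\n' "$letter" "$part"
}
```

GRUB 2 numbers disks from zero but partitions from one, which is why (hd0,msdos1) comes out as /dev/sda1. NVMe and virtio disks use different naming schemes under Linux and are deliberately left out.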
Booting Fedora installed on LVM volumes
By default, Fedora manages its disks using LVM volumes, and the OS is installed on an LVM disk. These LVM volumes aren’t immediately readable to GRUB, but this doesn’t matter because the kernel image and initrd reside on the separate non-LVM /boot partition which is readable from the GRUB shell.
So, from the GRUB prompt, listing available disk devices and their contents:
The first partition, (hd0,1) in this case, is actually the MBR itself (that is, the first 512 bytes). The /boot filesystem resides on the second partition, (hd0,2), which is apparent from the vmlinuz and initramfs files.
The tricky thing with LVM is that the GRUB linux line has to specify the path of the root filesystem that the kernel will use. With straight disk partitions this is as simple as, for example, /dev/sda2, but with LVM it’s necessary to cite the full path of the logical volume device file. This becomes slightly difficult without prior knowledge of the LVM configuration, which goes to show why this kind of thing is worth documenting.
One way of finding out the path of the root filesystem is to boot into the recovery shell provided by dracut, by specifying rdshell on the GRUB linux line (newer dracut versions spell this option rd.shell).
When the host fails to boot (due to you specifying an incorrect “root” value), the boot process will interrupt and fall to a dracut shell. From here, one can query the LVM configuration:
# lvm vgscan
# lvm lvscan
Which reveals that the root filesystem is mounted at the volume
From this, it turns out that the root volume of Fedora Linux is found at:
In other words, substitute your short hostname (non-FQDN) for “$(hostname -s)” above. So for the test VM I’m using, which is called “mpvirtfedora”, the root logical volume is at:
So, reboot the server to the GRUB prompt again and, similarly to the Ubuntu example, enter the root, linux and initrd commands:
At which point, everything working out, your Fedora host will boot up and you can breathe a sigh of relief.
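If you only know the volume group and logical volume names reported by lvm lvscan, the device path for the root= argument can be derived from them. The following is a rough sketch rather than anything from the Fedora tooling: lvm_mapper_path is a made-up name, and vg_example and lv_root in the usage line are placeholder names, not the ones from my test VM. It relies on the device-mapper convention that a hyphen inside a name is doubled, because a single hyphen separates the two names.

```shell
# Hypothetical helper: build the /dev/mapper path for a logical volume.
# $1 = volume group name, $2 = logical volume name (as shown by "lvm lvscan").
lvm_mapper_path() {
    # Device-mapper doubles any hyphen that appears inside a name.
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}
```

For example, lvm_mapper_path vg_example lv_root prints /dev/mapper/vg_example-lv_root, which is the form to hand to the kernel as root=. The /dev/VGNAME/LVNAME form normally refers to the same volume on a running system.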
Now that you’ve booted back into your server, fix the GRUB configuration so you don’t get stuck in this position again. Fortunately, this is the easy part, as GRUB2 now has scripts that scan your disks and build a GRUB configuration automatically, without you needing to know the details.
On Ubuntu, just execute:
And on Fedora:
# grub2-mkconfig -o /boot/grub2/grub.cfg
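As a memory aid, the regeneration step can be wrapped in a tiny lookup keyed on the distribution. This is only a sketch under my own assumptions: grub_regen_command is a hypothetical name, the distro IDs are the values found in the ID field of /etc/os-release, update-grub is the stock wrapper that Debian and Ubuntu ship around grub-mkconfig, and the Fedora path is the BIOS-boot location used above.

```shell
# Hypothetical helper: print the command that rebuilds the GRUB 2 config
# for a given distro ID (the ID= value from /etc/os-release).
grub_regen_command() {
    case $1 in
        ubuntu|debian) echo "update-grub" ;;
        fedora)        echo "grub2-mkconfig -o /boot/grub2/grub.cfg" ;;
        *)             return 1 ;;   # unknown distro: print nothing rather than guess
    esac
}
```

On a running system you could source /etc/os-release and call grub_regen_command "$ID" to print the right command, then run it as root once you have checked that it matches your setup.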
These are just simple examples – possibly too simple to be real-world examples – but it’s worth doing a dry run yourself on your own system configuration, so you can satisfy yourself that you can boot manually, unprepared and unaided by notebooks, Google or Live CDs.