
Raid


Raid Recovery

What happens: a computer dies (motherboard or power supply) and it is time to move everything to a new machine.

  1. Get the new machine installed and booted.
  2. Power down, attach the old RAID (or just one of its drives), boot again, and assemble the array (a concrete example follows below):
$ mdadm -A /dev/mdx /dev/sdyx /dev/sdzx
  • Where the x in mdx is NOT the same as any array already running on the new machine.
  • Where sdy and sdz are the old RAID drives and the trailing x is the partition you want to assemble.
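
A concrete (purely hypothetical) example: suppose the old mirror's members show up as /dev/sdc1 and /dev/sdd1 on the new machine, and /dev/md0 is already taken by an existing array:

$ mdadm -A /dev/md1 /dev/sdc1 /dev/sdd1
$ cat /proc/mdstat

  • The second command confirms that the assembled array is actually up before you go any further.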

Now I thought all I would have to do would be to mount it - nope - this computer went down with a power failure - so the file system needed a check first:

$ fsck.jfs -v /dev/mdx

Then

$ mount /dev/mdx /mnt/
  • Remember to use rsync -auHxv for the copy so that permissions, ownership, and hard links are preserved (a sketch follows below).
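
A minimal sketch of the copy itself, assuming (hypothetically) that the recovered data should end up in /home/olddata on the new machine:

$ mkdir -p /home/olddata
$ rsync -auHxv /mnt/ /home/olddata/

  • -a preserves permissions, ownership and timestamps, -H preserves hard links, -x stays on one file system, and -u skips files that are already newer at the destination. The trailing slashes copy the contents of /mnt rather than the directory itself.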


Drive replacement

  • Copy the partition table from the working drive (sda) to the new drive (sdb):
$ sfdisk -d /dev/sda | sfdisk /dev/sdb
  • See how the RAID goes together:
$ cat /proc/mdstat
  • Add the new drive:
$ mdadm --manage /dev/mdx --add /dev/sdby

where x and y are the appropriate numbers (a concrete example follows below).
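
For example, assuming (hypothetically) that /dev/md0 is the array running with a missing member and /dev/sdb1 is the matching partition on the replacement drive:

$ mdadm --manage /dev/md0 --add /dev/sdb1
$ cat /proc/mdstat

  • Watch /proc/mdstat until the rebuild reaches 100% before trusting the array again.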

Copy a working Debian system over to RAID

Overview of steps

  • Install Debian on the first drive (/dev/sda)
  • Create degraded RAID1 arrays on disk 2 (/dev/sdb)
  • Update the initrd
  • Copy the Debian installation over from disk 1 (sda to sdb)
  • Fix fstab on /dev/md2
  • Add disk 1 to the degraded arrays to create the final RAID arrays
  • Update the initrd again
  • Produce /etc/mdadm/mdadm.conf
  • Set up the monitoring daemon

In this example, disk 1 and disk 2 are /dev/sda and /dev/sdb respectively.

The partitions I used are:

/boot  200 MB
swap   1 GB (RAID1 in this example)
/      rest of the drive
A note on RAID systems and swap: if a drive holding an active, unmirrored swap area fails, the system can crash. If that is not a concern, you could also set swap up as plain swap areas or as RAID0 to improve performance (a sketch of the plain-swap alternative follows below). RAID1 also provides faster reads as a side benefit, at the cost of slower writes - but swap tends to be read more often than it is written, so you may still come out ahead.
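
If you decide mirrored swap is overkill, here is a minimal sketch of the plain-swap alternative (hypothetical partition names; these would take the place of the /dev/md1 swap entry used later in this walkthrough):

# mkswap /dev/sda5
# mkswap /dev/sdb5

and in /etc/fstab:

/dev/sda5 none swap sw,pri=1 0 0
/dev/sdb5 none swap sw,pri=1 0 0

  • Equal pri= values make the kernel stripe across both areas (RAID0-like speed for swap), but a failure of either drive while swap is in use can still take the system down.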

Install Debian on first drive

I used the following partitions. /dev/sda is the existing drive and /dev/sdb is the new drive for the rest of this explanation.

Device      Size   Id   Eventual mount point
/dev/sda1   200M   83   /boot/
/dev/sda5   1G     82   swap
/dev/sda6   100G   83   /

Partition Drive2 with fdisk

Device      Size   Id   Eventual mount point
/dev/sdb1   200M   fd   /boot/
/dev/sdb5   1G     fd   swap
/dev/sdb6   100G   fd   /

Drives larger than 2 TB will need parted; see Growing_Partitions_and_file_systems#using_parted. A short parted sketch follows below.
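
A minimal sketch for a hypothetical large drive /dev/sdb (GPT label, a single partition flagged for RAID; adjust the layout to match the partitioning scheme you actually want):

# parted /dev/sdb mklabel gpt
# parted /dev/sdb mkpart primary 1MiB 100%
# parted /dev/sdb set 1 raid on
# parted /dev/sdb print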

Create the raid devices

# mdadm --create --verbose /dev/md0 --level 1 --raid-devices=2 missing /dev/sdb1


The line above is the long form of the first command below; it is shown just for reference.

# mdadm -Cv /dev/md0 -l1 -n2 missing /dev/sdb1
# mdadm -Cv /dev/md1 -l1 -n2 missing /dev/sdb5
# mdadm -Cv /dev/md2 -l1 -n2 missing /dev/sdb6

This creates three degraded RAID1 devices (for /boot, swap, and /), each with the keyword "missing" standing in for the absent first member (why was this so hard to figure out!).

Now a cat of /proc/mdstat will show that your degraded RAID devices are up and running:

# cat /proc/mdstat

Personalities : [raid1]
    md1 : active raid1 sdb5[1]
     979840 blocks [2/1] [_U]
    md0 : active raid1 sdb1[1]
     192640 blocks [2/1] [_U]
    md2 : active raid1 sdb6[1]
     159661888 blocks [2/1] [_U]
    unused devices: <none>


Note how in all three cases one drive is missing ("_") as opposed to up and running ("U").


Some notes about raid metadata, grub2 and large drives

As of this writing (September 29, 2011) GRUB 2 has trouble with large drives and RAID metadata 1.2. The workaround is that your boot drive has to be smaller than 2TB, formatted with an msdos partition table, and using metadata 0.9. If you want to use a large drive, you have to partition it with a GPT (GUID Partition Table); only parted will do this (see Growing_Partitions_and_file_systems for the bit about using parted). Set the raid flag on and create the RAID with metadata 1.2 (0.9 does not support drives over 2TB). Don't forget to run "update-initramfs -u".
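
A minimal sketch of pinning the metadata version explicitly at creation time (hypothetical device names; -e is mdadm's --metadata option): 0.90 for a small /boot array that GRUB can read, 1.2 for a large data array.

# mdadm -Cv /dev/md0 -e 0.90 -l1 -n2 missing /dev/sdb1
# mdadm -Cv /dev/md2 -e 1.2 -l1 -n2 missing /dev/sdb6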


A newer 'feature' of mdadm with 1.2 metadata is that freshly created arrays come up as active (auto-read-only) with resync=PENDING. This is fixed with the following command:

# mdadm --readwrite /dev/mdx

Create the file systems and mount them

# mkfs.jfs /dev/md0
# mkfs.jfs /dev/md2
# mkswap /dev/md1
# mkdir /mntroot
# mkdir /mntboot
# mount /dev/md2 /mntroot
# mount /dev/md0 /mntboot

Copy over everything to the degraded raid

# rsync -auHxv --exclude=/proc/* --exclude=/sys/* --exclude=/boot/* --exclude=/mntboot --exclude=/mntroot/ /* /mntroot/
# mkdir /mntroot/proc /mntroot/boot /mntroot/sys
# chmod 555 /mntroot/proc
# rsync -auHx /boot/ /mntboot/

Update /etc/mdadm/mdadm.conf

Run:

/usr/share/mdadm/mkconf > /mntroot/etc/mdadm/mdadm.conf

Edit the file to remove any old drive definitions.

Be sure that mdadm knows you have updated your mdadm.conf by removing /var/lib/mdadm/CONF-UNCHECKED if it exists.
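
A one-line sketch of that cleanup (-f keeps rm quiet if the file does not exist; prefix the path with /mntroot if you are cleaning the copied system rather than the running one):

# rm -f /var/lib/mdadm/CONF-UNCHECKED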

Make Changes in /mntroot/etc/fstab

The changed lines are the three /dev/md* entries:

# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
/dev/md1 none swap sw 0 0
/dev/md0 /boot jfs defaults 0 2
/dev/md2 / jfs defaults,errors=remount-ro 0 1
/dev/hda /media/cdrom iso9660 ro,user,noauto 0 0
/dev/fd0 /media/floppy auto rw,user,noauto 0 0
/dev/hda /cdrom iso9660 ro,user,noauto 0 0

Attach the original drive's partitions to the existing (degraded) RAID arrays:

# mdadm /dev/md0 -a /dev/sda1
# mdadm /dev/md1 -a /dev/sda5
# mdadm /dev/md2 -a /dev/sda6


# cat /proc/mdstat


Personalities : [raid1]
   md0 : active raid1 sda1[0] sdb1[1]
     192640 blocks [2/2] [UU]
   md1 : active raid1 sda5[0] sdb5[1]
     979840 blocks [2/2] [UU]
   md2 : active raid1 sda6[2] sdb6[1]
     159661888 blocks [2/1] [_U]
     [===>.................]  recovery = 17.9% (28697920/159661888) finish=56.4min speed=38656K/sec
   unused devices: <none>

After a while all devices are in sync:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
     192640 blocks [2/2] [UU]
md1 : active raid1 sda5[0] sdb5[1]
     979840 blocks [2/2] [UU]
md2 : active raid1 sda6[2] sdb6[1]
     159661888 blocks [2/2] [UU]
unused devices: <none>

DO NOT reboot yet!!!

Configure grub to boot from both drives

# mount -o bind /dev /mntroot/dev 
# mount -t proc none /mntroot/proc 
# mount -t sysfs none /mntroot/sys
# chroot /mntroot
# /usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf
# update-initramfs -u -k all

It should be ok to just call grub-install, but you might want to do 'update-grub' first to get a look at the generated /boot/grub/grub.cfg before installing the bootloader.

# update-grub

# grub-install /dev/sda
# grub-install /dev/sdb

Reboot

If it does not work, you should still be able to boot from the old GRUB entry.

At this point the box should be running off the degraded RAID1 devices on the second drive (/dev/sdb).

Check that everything works

If all is OK, change the partition types on the first drive (/dev/sda) to fd (Linux raid autodetect) - but be aware that the original data on /dev/sda will be lost!!
Device      Size   Id   Eventual mount point
/dev/sda1   200M   fd   /boot/
/dev/sda5   1G     fd   swap
/dev/sda6   100G   fd   /
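
One way to change the types from the command line, assuming the partition numbers above (recent sfdisk spells the option --part-type, older releases call it --change-id; interactive fdisk's 't' command does the same job):

# sfdisk --part-type /dev/sda 1 fd
# sfdisk --part-type /dev/sda 5 fd
# sfdisk --part-type /dev/sda 6 fd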



Regenerate /etc/mdadm/mdadm.conf

Edit /etc/mdadm/mdadm.conf and delete everything except the top line:

DEVICE /dev/sda* /dev/sdb*

Then run:

mdadm --examine --scan >> /etc/mdadm/mdadm.conf

Your final file should look like:

DEVICE /dev/sda* /dev/sdb*
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=4d58ade4:dd80faa9:19f447f8:23d355e3
  devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=3f1bdce2:c55460b0:9262fd47:3c94b6ab
  devices=/dev/sda5,/dev/sdb5
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=7dfd7fcb:d65245d6:f9da98db:f670d7b6
  devices=/dev/sdb6,/dev/sda6

Note: you need both the top DEVICE line and the lower devices= lines.

Set up the monitoring daemon

Just run:

dpkg-reconfigure mdadm
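
To check that alert mail actually gets delivered without pulling a drive, mdadm's monitor mode can send a test alert for every array it finds (a sketch; --oneshot makes it run a single pass and exit):

# mdadm --monitor --scan --oneshot --test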

Test, Test, and Test

Test booting from both drives.

Kill a drive and see if you get an email about the event.

Write up a step-by-step procedure to restore from a drive outage (and send a copy this way for this page!).

You should be all finished!

Please send notes of any typos/corrections to the email address below.

Special thanks to Onni Koskinen of Finland, whose gentle yet expert emails removed several glaring errors on this page and resulted in a vastly improved document.

Growing Raid1 Arrays

Moved to its own page: Growing_Partitions_and_file_systems

Re-adding a faulted drive

First, look at /proc/mdstat:

cat /proc/mdstat
Personalities : [raid1]
  md1 : active raid1 sda2[2](F) sdb2[1]
     70645760 blocks [2/1] [_U]
  md0 : active raid1 sda1[0] sdb1[1]
     9767424 blocks [2/2] [UU]
  unused devices: <none>


This shows that array md1 has drive sda2 marked as failed (F).

To re-add it, hot-remove the failed partition and then add it back:

# mdadm /dev/md1 -r /dev/sda2
    mdadm: hot removed /dev/sda2
# mdadm /dev/md1 -a /dev/sda2
    mdadm: re-added /dev/sda2

Now you will see it regenerate in mdstat:

Personalities : [raid1]
md1 : active raid1 sda2[2] sdb2[1]
70645760 blocks [2/1] [_U]
[>....................] recovery = 0.3% (268800/70645760) finish=21.8min speed=53760K/sec
md0 : active raid1 sda1[0] sdb1[1]
9767424 blocks [2/2] [UU]
unused devices: <none>


If you have to re-add a drive more than once, you need to find out why.
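
A quick first step is to look at the drive's SMART data and the kernel log (a sketch; smartctl comes from the smartmontools package):

# smartctl -a /dev/sda
# dmesg | grep -i sda

  • Reallocated or pending sector counts and entries in the SMART error log usually mean the drive itself is on its way out.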

Questions & Answers

  • Will GRUB automatically boot from the good drive in the event of a disk failure?

Yes, IF you install GRUB on both drives and your BIOS rolls over to the next bootable drive.

  • How do you see what is in an initrd file?

For an old-style initrd that is a filesystem image:

mount -o loop /tmp/myinitrd /mnt/myinitrd
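
On current Debian systems the initrd is a compressed cpio archive rather than a filesystem image, so a sketch for listing its contents looks more like this (assumes gzip compression; lsinitramfs from initramfs-tools does the same job):

$ zcat /boot/initrd.img-$(uname -r) | cpio -it | less
$ lsinitramfs /boot/initrd.img-$(uname -r) | less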
