ZFS: zpool replace returns error: cannot replace, devices have different sector alignment

While trying to replace a failed SSD in a zpool, we encountered the following error:  
cannot replace 4233944908106055 with ata-INTEL_SSDSC2BW240A4_CVD02KY2403GN: devices have different sector alignment
  The pool was aligned to 4K sectors – i.e. ashift=12 – whereas the new SSD was aligned to 512-byte sectors. There’s a quick and easy fix for this – no need to use partitioning tools.
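One way to resolve this without repartitioning – a sketch assuming a reasonably recent ZFS implementation whose zpool replace accepts -o ashift, and a hypothetical pool name – is to tell the replace operation to use the pool’s 4K alignment explicitly:

```shell
# Hypothetical pool name 'tank'; the failed device is referenced by the
# GUID from the error message, the new SSD by its /dev/disk/by-id name.
# -o ashift=12 forces 4K alignment on the replacement vdev member.
zpool replace -o ashift=12 tank 4233944908106055 \
    ata-INTEL_SSDSC2BW240A4_CVD02KY2403GN
```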

ZFS: Adding a new mirror to an existing ZFS pool

  Mirrored vdevs are great for performance, and it is quite straightforward to add a mirrored vdev to an existing pool (presumably one with one or more similar vdevs already):  
zpool add [poolname] mirror [device01] [device02] [device03]
  If it’s a two-way mirror, you will only have two devices in the above. An example for ZFS on Ubuntu with a pool named seleucus and two SSDs could look like:  
zpool add seleucus mirror ata-SAMSUNG_SSD_830_Series_S0XYNEAC705640 ata-M4-CT128M4SSD2_000000001221090B7BF9
  As always, it’s good practice to use the device names found in /dev/disk/by-id/ rather than the sda, sdb, sdc etc. names, as the latter can change – the former do not.
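To find those stable names, you can simply list the by-id directory – a quick check on a udev-based Linux system (path assumed):

```shell
# Each entry under /dev/disk/by-id/ is a symlink to the current kernel
# device node (sda, sdb, ...), so one listing shows both names at once.
ls -l /dev/disk/by-id/
```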

Western Digital Green drive resilver rates

  We get asked fairly regularly about resilver rates for ZFS pools – these matter because they determine how quickly a vdev with a faulty disk can rebuild data onto a fresh disk, as well as how quickly you can swap one disk for another. The longer it takes to rebuild the vdev after a disk has died, the longer your pool is operating with reduced redundancy – meaning that if you have already had one disk fail (raidz1) or two disks fail (raidz2), one more failure before the rebuild has finished will cause the vdev and zpool to fail.

  Today we have been tasked with swapping new drives into two 6-disk vdevs, each consisting of a mixture of WD20EARX and WD20EARS drives – Western Digital 2TB Green drives. One array contains 8TB of data, the other 5TB. The 5TB array fluctuates around a 245MB/s resilver rate, and the 8TB around 255MB/s – giving rebuild times of around 6 hours and 9 hours respectively.

  These figures are what we would consider typical for vdevs of that size, given the disks involved. We will post more rebuild rates and add them into a database over time – stay tuned 🙂
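The arithmetic behind those estimates is just data volume divided by resilver rate. A back-of-envelope sketch using the 5TB figure from above (decimal units, 1TB = 1,000,000MB):

```shell
# Rough rebuild-time estimate: data in the vdev / observed resilver rate.
data_mb=5000000    # 5 TB of data, in MB
rate_mb_s=245      # observed resilver rate, MB/s
seconds=$(( data_mb / rate_mb_s ))
printf '%d hours %d minutes\n' $(( seconds / 3600 )) $(( seconds % 3600 / 60 ))
# prints: 5 hours 40 minutes
```

Substituting 8TB at 255MB/s gives roughly 8.7 hours, matching the ~9 hour figure above.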

How to add a drive to a ZFS mirror

Sometimes you may wish to expand a two-way mirror to a three-way mirror, or to turn a basic single-drive vdev into a mirror – to do this we use the zpool attach command. Simply run:  
# zpool attach [poolname] [original drive to be mirrored] [new drive]
An example:  
# zpool attach seleucus /dev/sdj /dev/sdm
  …where the pool is named seleucus, the drive that’s already present in the pool is /dev/sdj, and the new drive being added is /dev/sdm. You can add the force switch like so:  
# zpool attach -f seleucus /dev/sdj /dev/sdm  
to force ZFS to add a device it thinks is in use; this won’t always work, depending on why the drive is showing up as in use.

  Please note that you cannot expand a raidz, raidz1, raidz2 etc. vdev with this command – it only works for basic vdevs or mirrors. The above drive syntax is for Ubuntu; on Solaris or OpenIndiana the drive designations look like c1t0d0 instead, so the command might look like:  
# zpool attach seleucus c1t1d0 c1t2d0  
…instead.

  This is a handy command if you want a three-way mirror but don’t have all three drives to start with – you can get the ball rolling with a two-way mirror and add the third drive down the track. Remember that ZFS performs reads from a mirror in round-robin fashion, so while you get a single drive’s performance for writes, you get approximately the sum of all of the drives in read performance – it’s not hard for a three-way 6Gb/s SSD mirror to crack 1,500MB/s in sequential reads. It’s a fantastic way to get extreme performance for a large number of small VMs.  
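That incremental approach might look like the following sketch – the pool name and device names here are hypothetical examples:

```shell
# Start with a single-drive pool (a basic vdev).
zpool create seleucus ata-SSD_ONE_SERIAL

# Later, attach a second drive to form a two-way mirror:
zpool attach seleucus ata-SSD_ONE_SERIAL ata-SSD_TWO_SERIAL

# Later still, attach a third drive (against any existing member)
# to grow it into a three-way mirror:
zpool attach seleucus ata-SSD_ONE_SERIAL ata-SSD_THREE_SERIAL
```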

ZFS Basics – zpool scrubbing

One of the most significant features of the ZFS filesystem is scrubbing. This is where the filesystem checks itself for errors and attempts to heal any errors that it finds. It’s generally a good idea to scrub consumer-grade drives once a week, and enterprise-grade drives once a month.   How long the scrub takes depends on how much data is in your pool; ZFS only scrubs sectors where data is present so if your pool is mostly empty it will be finished fairly quickly. Time taken is also dependent on drive and pool performance; an SSD pool will scrub much more quickly than a spinning disk pool!   To scrub, run the following command:  
# zpool scrub [poolname]
  Replace [poolname] with the name of your pool. You can check the status of your scrub via:  
# zpool status
  The output will look something like this:  
  pool: seleucus
 state: ONLINE
  scan: scrub in progress since Tue Sep 18 21:14:37 2012
    1.18G scanned out of 67.4G at 403M/s, 0h2m to go
    0 repaired, 1.75% done
config:

        NAME        STATE     READ WRITE CKSUM
        seleucus    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdh     ONLINE       0     0     0
            sdk     ONLINE       0     0     0

errors: No known data errors
Scrubbing runs at a low priority, so if the drives are being accessed while the scrub is in progress, normal I/O takes precedence and the impact on performance is reduced. It’s a good idea to automate the scrubbing process in case you forget – we will do a later post on just how to do that!
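As a preview, a minimal sketch of one way to automate it – a system cron entry (the file path and schedule here are just examples) that scrubs the pool weekly, in line with the consumer-grade-drive guideline above:

```shell
# /etc/cron.d/zpool-scrub  (hypothetical file) -- scrub the pool
# 'seleucus' every Sunday at 02:00. Weekly suits consumer-grade drives;
# monthly would suit enterprise-grade drives.
0 2 * * 0  root  /sbin/zpool scrub seleucus
```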