ZFS on Linux: How to find the arc stats (was arcstat.py)

This has now changed; run the following to view the ARC (Adaptive Replacement Cache) statistics:  
cat /proc/spl/kstat/zfs/arcstats
  You can glean some really useful information about how your RAM is being utilised, and what your required ARC size might be, from the results – this may be a topic for a future post, however!
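As a quick illustration of how those fields can be used, the snippet below computes the overall ARC hit ratio from the hits and misses counters. The counter values here are made-up sample data; on a live system you would run the same awk program over /proc/spl/kstat/zfs/arcstats itself.

```shell
# Hypothetical arcstats fragment (the real file uses the same
# name / type / value layout, with counters in the third column).
arcstats='hits                            4    1163341
misses                          4    42485'

# Hit ratio = hits / (hits + misses), as a percentage.
echo "$arcstats" | awk '
/^hits /   { h = $3 }
/^misses / { m = $3 }
END { printf "ARC hit ratio: %.1f%%\n", 100 * h / (h + m) }'
# prints: ARC hit ratio: 96.5%
```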

ZFS on Linux (Ubuntu) – arcstat.py is now available! How do you run it?

UPDATE: This information is now out of date; see the new post here.

One very handy ZFS-related command which has been missing from the standard ZFS on Linux implementation is arcstat.py. This script provides a great deal of useful information about how effective your Adaptive Replacement Cache (ARC) is. ZFSoL 0.6.2 includes it, and you can now update to that version in Ubuntu with apt-get upgrade. But how do you actually use it once you have upgraded? Easy. Assuming you have Python installed, run the following (this works for 13.04 at least – we will check the other releases and update this post when we do):  
/usr/src/zfs-0.6.2/cmd/arcstat/arcstat.py
  This will provide you with the default readout, e.g. for our system which just rebooted:  
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
21:33:13     3     1     33     1   33     0    0     1    33   2.5G   15G
  As you can see, since the system has just rebooted and hasn't started caching requests yet, the ARC size is quite small – 2.5G. This is an extremely useful tool for getting an idea of how your ARC is performing – we will do a piece on interpreting the results soon!
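For reference, the miss% column is just misses as a share of total ARC accesses (the read column). Recomputing it from the sample line above (read = 3, miss = 1) reproduces the 33 shown:

```shell
# miss% = 100 * miss / read, printed as a whole number like arcstat's output.
awk 'BEGIN { read = 3; miss = 1; printf "miss%%: %d\n", 100 * miss / read }'
# prints: miss%: 33
```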

ZFS: Adding an SSD as a cache drive

ZFS uses any free RAM to cache recently and frequently accessed data, speeding up access times; this cache is called the ARC (Adaptive Replacement Cache). RAM can be read at gigabytes per second, so it is an extremely fast cache. It is possible to add a secondary cache, the L2ARC (level 2 ARC), in the form of solid state drives. An SSD may only sustain around half a gigabyte per second, but that is still vastly more than any spinning disk can achieve, and its IOPS (input/output operations per second) are also typically much, much higher than a standard hard drive's.

If you find that you want more high-speed caching and adding more RAM isn't feasible from an equipment or cost perspective, an L2ARC drive may well be a good solution. To add one, insert the SSD into the system and run the following:  
zpool add [pool] cache [drive]
e.g.:  
zpool add kepler cache ata-M4-CT064M4SSD2_000000001148032355BE  
zpool status now shows:  
        NAME                                          STATE     READ WRITE CKSUM
        kepler                                        ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WMAZA7352713  ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAA637401  ONLINE       0     0     0
            ata-WDC_WD20EARS-00MVWB0_WD-WCAZAC389999  ONLINE       0     0     0
            ata-WDC_WD20EFRX-68AX9N0_WD-WMC300005397  ONLINE       0     0     0
            ata-WDC_WD20EARX-00MMMB0_WD-WCAWZ0842074  ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WMAZA7482193  ONLINE       0     0     0
        cache
          ata-M4-CT064M4SSD2_000000001148032155BE     ONLINE       0     0     0
You can see that the cache drive has been added to the bottom of the pool listing.   The way the ARC works is beyond the scope of this post – suffice it to say for the moment that simply adding an L2ARC is not necessarily going to improve performance in every situation, so do some research before spending money on a good SSD. Check back for a more detailed investigation into how the ARC and L2ARC work in November!
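Once the cache device has warmed up, its effectiveness shows up in the l2_* counters of /proc/spl/kstat/zfs/arcstats. The numbers below are invented sample data to illustrate the arithmetic; on a real system, run the same awk program over the arcstats file instead of the here-string.

```shell
# Hypothetical l2_* counters in the arcstats name / type / value layout.
l2stats='l2_hits                         4    52214
l2_misses                       4    30112
l2_size                         4    21474836480'

# Report how much data the L2ARC holds and how often reads hit it.
echo "$l2stats" | awk '
/^l2_hits /   { h = $3 }
/^l2_misses / { m = $3 }
/^l2_size /   { s = $3 }
END { printf "L2ARC size: %.1f GiB, hit ratio: %.1f%%\n", s / 1024 ^ 3, 100 * h / (h + m) }'
# prints: L2ARC size: 20.0 GiB, hit ratio: 63.4%
```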