A place for Unix Thoughts and Ideas

ZFS on Linux emergency Boot CD

If you are planning on running your Linux system on a ZFS root, having an emergency boot CD is indispensable.

I was able to get a SystemRescueCD image that already had the proper ZFS modules included.


The image can easily be written to a USB drive for convenience.

Alternatively, in a pinch you can use an Ubuntu Live Image as a base to add the ZFS repos and apt-get all the modules.

What I have done for managing ZFS related boot issues:

1. Boot up (use graphical environment or configure the network and change root password for ssh)
2. Import the root pool and set up a chroot environment

mkdir /mnt/ubuntu
zpool import -f -R /mnt/ubuntu rpool
for dir in sys proc dev; do mount --bind /$dir /mnt/ubuntu/$dir; done
chroot /mnt/ubuntu bash -l

3. After this, you can run your GRUB commands to reinstall to your devices and regenerate your grub.cfg. Additionally, you can stop and restart the udevadmin process if you need to recreate devices in /dev for GRUB.

4. When you are done, exit from your chroot environment and unmount the filesystems so that you can export the ZFS pool before rebooting the system.

for dir in sys proc dev; do umount /mnt/ubuntu/$dir; done
zpool export rpool
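
The steps above can be wrapped in a pair of shell functions. This is only a sketch: the function names, the run helper, and the DRY_RUN switch are my own additions for illustration, and the pool name and mount point simply default to the ones used above.

```shell
#!/bin/sh
# Sketch only: POOL and TARGET default to the names used in the steps above.
# Set DRY_RUN=1 to print each command instead of executing it.
POOL=${POOL:-rpool}
TARGET=${TARGET:-/mnt/ubuntu}

# Hypothetical helper: run a command, or just print it in dry-run mode.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Import the pool under TARGET, bind-mount the pseudo filesystems, enter the chroot.
rescue_enter() {
    run mkdir -p "$TARGET" || return 1
    run zpool import -f -R "$TARGET" "$POOL" || return 1
    for dir in sys proc dev; do
        run mount --bind "/$dir" "$TARGET/$dir" || return 1
    done
    run chroot "$TARGET" bash -l
}

# Undo the bind mounts and export the pool so it imports cleanly on the next boot.
rescue_leave() {
    for dir in sys proc dev; do
        run umount "$TARGET/$dir"
    done
    run zpool export "$POOL"
}
```

Running rescue_enter with DRY_RUN=1 first is a cheap way to check the commands before touching the pool. After exiting the chroot shell, rescue_leave cleans up.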


Minding your ZFS pool and filesystem versions, and feature flags

After Code42 dropped Solaris support in the newest update of CrashPlan, I decided it was time to move my home server from Solaris to Linux.

Using OpenZFS on Linux, the plan was to use zfs send/receive to migrate all my datasets from version 35 to version 28 pools while still on Solaris and then reboot my server onto a new Ubuntu-based ZFS root.

This worked fine, except for a handful of filesystems that were created after going to Solaris 11.1, which by default created the ZFS filesystem as version 6.

If you are creating new pools on Solaris and want to retain the ability to migrate them to ZFS on Linux (or any of the OpenZFS platforms), you will want to stay at version 28 of the pool and version 5 of the filesystem. Also, avoid activating any of the feature flags on the OpenZFS side if you want to have the ability to swing back to Solaris.

You will also need to watch the feature flags enabled on your new ZFS root pool, as setting the wrong flag can instantly make it incompatible with GRUB and leave your system unable to boot. To recover, you will have to boot off alternative media and create a new ZFS pool for booting, which, if you split all your filesystems into different datasets, sounds worse than it actually is. (Support for additional features is already in GRUB’s dev tree, so this may not be an issue for long.)

With OpenZFS, you’ll want to create the pool with the -d option to have all features turned off by default. Otherwise the default is to enable all features.

Example (pool creation):
zpool create -o version=28 -O version=5 testpool c3t0d0

Example (filesystem creation):
zfs create -o version=5 testpool/testdata

Example (OpenZFS pool creation):
zpool create -d testpool ata-ST4000DM000-1F2168_Z302HRLS-part1

Example (OpenZFS rpool Creation):
zpool create -f -d -o feature@async_destroy=enabled -o feature@empty_bpobj=enabled -o feature@lz4_compress=enabled -o ashift=12 -O compression=lz4 rpool mirror ${DISK1} ${DISK2}
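
To keep an eye on which flags have crept onto a root pool, something like the following filter can help. It is a sketch under assumptions: the function name is made up, and the allow-list is just the three flags from the rpool example above, not an authoritative list of what GRUB supports. It reads the NAME / PROPERTY / VALUE / SOURCE columns that zpool get all prints.

```shell
#!/bin/sh
# Assumption: these are only the three flags enabled in the rpool example above.
ALLOWED="async_destroy empty_bpobj lz4_compress"

# Read `zpool get all <pool>` output on stdin and print every feature flag
# that is enabled or active but is not in the allow-list.
unexpected_features() {
    awk -v allowed="$ALLOWED" '
        BEGIN {
            n = split(allowed, a, " ")
            for (i = 1; i <= n; i++) ok["feature@" a[i]] = 1
        }
        $2 ~ /^feature@/ && $3 != "disabled" && !($2 in ok) { print $2, $3 }
    '
}

# Usage (on the live system):
#   zpool get all rpool | unexpected_features
```

No output means nothing beyond the allow-list is enabled; any line printed is a flag worth checking before you reboot.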

Creating a 2 Drive Raid4 root aggregate with NetApp Clustered Data ONTAP

Dedicated raid-dp root aggregates for each node of a Clustered Data ONTAP cluster can really eat up the drives on a small system like a FAS2220 and can seem especially wasteful since the only unique information on the drives is the logs.

Here is a little tip I got from NetApp for setting up a 2 Drive Raid-4 Root Aggregate.

By default the system will build a 1+2 raid-dp aggregate. This can be slimmed down to two drives by converting the aggregate from raid-dp to raid-4.

This can easily be done by editing the aggregate in System Manager or via the CLI:

::> storage aggregate modify -raidtype raid4 -aggregate aggr0clus1_01


Running on a 2 disk Raid4 root aggregate is supported and is explicitly mentioned in the Clustered Data ONTAP 8.2 Physical Storage Management Guide.

This can also be reversed through the same method, but instead of specifying raid4, you will specify raid_dp.

NetApp multiple SSH sessions trick

Here is a quick trick I have found useful for when I need to keep tabs on the realtime performance of my NetApp arrays.

7-mode only allows you 1 interactive ssh session.

However, you can launch many more non-interactive ones.


ssh testnetapp "stats show -p hybrid_aggr"
ssh testnetapp "stats show -p flexscale-access"
ssh testnetapp "priv set diag; wafltop show -i 10"

Quickly displaying WWNs for an AIX server

Here is a quick one-liner for displaying the adapter names and WWNs on an AIX host:

# lscfg | awk '/fcs/ {print $2}' | while read hba; do printf "$hba "; lscfg -vp -l $hba | grep Net | cut -d. -f14 |sed 's!\.!!g;s!\(..\)!\1:!g;s!:$!!' ;done
fcs0 C0:50:76:xx:A0:xx:yy:08
fcs1 C0:50:76:xx:A0:xx:yy:0A
fcs2 C0:50:76:xx:A0:xx:yy:0C
fcs3 C0:50:76:xx:A0:xx:yy:0E
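
The sed expression at the end of the one-liner does the formatting: it strips any remaining dots, appends a colon after every pair of characters, and then drops the trailing colon. Pulled out on its own (the function name and the sample WWN are made up for illustration), it looks like this:

```shell
#!/bin/sh
# Turn a bare hex WWN into colon-separated byte pairs,
# using the same sed expression as the one-liner above.
format_wwn() {
    printf '%s\n' "$1" | sed 's!\.!!g;s!\(..\)!\1:!g;s!:$!!'
}

# Example with a made-up WWN:
#   format_wwn C05076AABBCCDD08
```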

Enabling Solaris 11.2 Kernel Zones with VMware Fusion

Here is a quick trick if you are looking to test out Solaris 11.2 Kernel Zones with VMware Fusion.

I’m fairly sure that this trick will also work with the other VMware products.

Out of the box, if you attempt to install a kernel-zone brand zone, you will see the following error:

# zoneadm -z myfirstkz install
Platform does not support the kernel-zone brand.
zoneadm: zone myfirstkz failed to verify

Running virtinfo confirms it:

root@s112test:/dev/lofi# virtinfo
NAME            CLASS     
vmware          current   
non-global-zone supported

1. First, you need to be running an i5/i7-generation processor that supports nested paging.
2. Shut down your VM, then go into the container/folder for the VM and add the following to the bottom of the .vmx file:

vhv.enable = "TRUE"

3. Save the file and then restart your VM
4. Verify Support with virtinfo:

root@s112test:/dev/lofi# virtinfo
NAME            CLASS     
vmware          current   
non-global-zone supported
kernel-zone     supported

5. If you accepted the default drive size of 16GB in VMware Fusion, you will want to add an additional drive and create a new ZFS pool for the zone. The default rpool is too small for a kernel zone, and zone creation will abort.
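
If you want to script the two checks from the steps above, a rough sketch could look like the following. Both function names are hypothetical; the first reads virtinfo output on stdin, and the second looks for the vhv.enable line in a .vmx file whose path you supply.

```shell
#!/bin/sh
# Succeeds if virtinfo output (on stdin) lists kernel-zone as supported.
kz_supported() {
    awk '$1 == "kernel-zone" && $2 == "supported" { found = 1 } END { exit !found }'
}

# Succeeds if the given .vmx file already contains vhv.enable = "TRUE".
vmx_has_vhv() {
    grep -q '^vhv\.enable[[:space:]]*=[[:space:]]*"TRUE"' "$1"
}

# Usage inside the Solaris guest:
#   virtinfo | kz_supported && echo "kernel zones available"
```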

Lull of activity

You have probably noticed that I haven’t updated my blog in quite some time.

I haven’t abandoned this blog, just haven’t had as many new tricks to write about.

I switched jobs at the beginning of 2013, and my new role as an Infrastructure Architect has broadened my daily focus beyond the operation and performance management of Unix & Oracle environments. I’m now working on SAN, NAS, load balancers, backups, security and a variety of other broader topics.

I hope to have some new tricks posted in the near future.

Installation issues with LSI 9201 sas controller

As part of refreshing the storage on my home server which had a stack of 1TB WD GP drives which were starting to show their age in my weekly scrubs, I decided it was time to replace my LSI SAS3081 with a newer generation controller.

On eBay there is no shortage of LSI-based 92xx controllers for around $80-100, so I picked up an LSI SAS9201-8i, which is an IBM rebrand of an LSI 9211-8i.

Upon installing it into my system, the card’s BIOS initialized and showed all my drives, but the controller was not being seen in Solaris 11.1.

I tested the card using an Ubuntu rescue CD and confirmed the card was working, just not with Solaris 11. Since it was detected in Ubuntu, I upgraded the firmware to try to fix the issue, but alas that didn’t work either.

Finally, I decided I would boot from a Solaris 11.1 live CD to eliminate my install as an issue; the card showed up immediately.

I grabbed all the prtconf -v info and driver files from /etc, saved them off on a USB stick, and then rebooted to continue testing.

On my install of Solaris 11.1, a grep of the prtconf output showed the OS could see the device, but it wasn’t attaching to it:

pci1000,72 (driver not attached)
Hardware properties:
name='assigned-addresses' type=int items=15
name='reg' type=int items=20
name='compatible' type=string items=13
value='pciex1000,72.1000.72.3' + 'pciex1000,72.1000.72' + 'pciex1000,72.3' + 'pciex1000,72' + 'pciexclass,010700' + 'pciexclass,0107' + 'pci1000,72.1000.72.3' + 'pci1000,72.1000.72' + 'pci1000,72' + 'pci1000,72.3' + 'pci1000,72' + 'pciclass,010700' + 'pciclass,0107'
name='model' type=string items=1
value='Serial Attached SCSI Controller'

I then decided to use the add_drv command to try to link the PCI information to the mpt_sas driver, figuring I had run into a weird device driver issue, and then I got the biggest clue of all:

root@test_sys:/etc# add_drv -vi "pci1000,72@0" mpt_sas
Cannot find module (mpt_sas).

A search of /kernel/drv quickly showed that while there was an mpt driver, there was no mpt_sas driver module present.

A quick pkg search located the package that provides it:

root@azurite:/kernel/drv/amd64# pkg search mpt_sas
driver_name driver mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-
basename file kernel/drv/amd64/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-
basename file kernel/drv/sparcv9/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-
basename file kernel/drv/amd64/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-
basename file kernel/drv/sparcv9/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-
basename file kernel/kmdb/sparcv9/mpt_sas pkg:/developer/debug/mdb@0.5.11-
basename file kernel/kmdb/amd64/mpt_sas pkg:/developer/debug/mdb@0.5.11-
basename file kernel/kmdb/sparcv9/mpt_sas pkg:/developer/debug/mdb@0.5.11-
pkg.fmri set solaris/driver/storage/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-

After installing the driver package:

pkg install mpt_sas

The card was immediately bound to a driver and now works.

I guess the lesson here is that while just about everything is installed with the base install of Solaris 10, you can make no such assumption on Solaris 11.
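
A quick way to spot the same symptom again is to filter prtconf output for PCI nodes that have no driver bound. This is a small sketch; the function name is my own, and it only matches the "(driver not attached)" marker shown in the output above.

```shell
#!/bin/sh
# Read prtconf output on stdin and print PCI nodes with no driver attached,
# with leading indentation stripped.
unattached_pci() {
    grep 'pci.*(driver not attached)' | sed 's/^[[:space:]]*//'
}

# Usage:
#   prtconf | unattached_pci
```

Any node it prints is a candidate for the same kind of pkg search shown above.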

Taming OS X Time Machine Backups

OS X’s Time Machine backup feature is very simple to enable and to use.

Unfortunately, it is almost too simple: there are no mechanisms for capping the amount of storage used for backups, so it will eventually grow and take over a drive of any size.

Really the best way to work with it is to dedicate a partition to just Time Machine and nothing else.

Time Machine will prune backups as they age and when you run out of space, but depending on that functionality is very limiting.

It turns out that Time Machine has a very handy command line interface called tmutil for listing and deleting backups. It also has some additional compare commands that look like they could be very useful for tracking down changed files.

m-m:~ $ tmutil
Usage: tmutil help <verb>

Usage: tmutil version

Usage: tmutil enable

Usage: tmutil disable

Usage: tmutil startbackup [-b|--block]

Usage: tmutil stopbackup

Usage: tmutil enablelocal

Usage: tmutil disablelocal

Usage: tmutil snapshot

Usage: tmutil delete snapshot_path ...

Usage: tmutil restore [-v] src dst

Usage: tmutil compare [-a@esmugtdrvEX] [-D depth] [-I name]
       tmutil compare [-a@esmugtdrvEX] [-D depth] [-I name] snapshot_path
       tmutil compare [-a@esmugtdrvEX] [-D depth] [-I name] path1 path2

Usage: tmutil setdestination mount_point
       tmutil setdestination [-p] afp://user[:pass]@host/share

Usage: tmutil addexclusion [-p] item ...

Usage: tmutil removeexclusion [-p] item ...

Usage: tmutil isexcluded item ...

Usage: tmutil inheritbackup machine_directory
       tmutil inheritbackup sparse_bundle

Usage: tmutil associatedisk [-a] mount_point volume_backup_directory

Usage: tmutil latestbackup

Usage: tmutil listbackups

Usage: tmutil machinedirectory

Usage: tmutil calculatedrift machine_directory

Usage: tmutil uniquesize path ...

Use `tmutil help <verb>` for more information about a specific verb.

The following is an example of listing my backups and then deleting one.
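
As a rough sketch of how the listbackups and delete verbs from the usage output can be combined, the helper below picks out everything except the newest few backups. The function name is hypothetical, and it assumes tmutil listbackups prints one snapshot path per line, oldest first.

```shell
#!/bin/sh
# Read a backup listing on stdin (one path per line, oldest first) and print
# all but the newest KEEP entries. KEEP defaults to 5.
prune_candidates() {
    keep=${1:-5}
    awk -v keep="$keep" '
        { line[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print line[i] }
    '
}

# Usage (deleting backups typically needs root):
#   tmutil listbackups | prune_candidates 10 |
#       while IFS= read -r snap; do sudo tmutil delete "$snap"; done
```

Run the pipeline without the delete loop first to review what would be removed.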

Programmatically determining the closest server

We recently updated our server naming standard to be location agnostic.

Due to this change, I had to work out a new mechanism to programmatically locate the nearest server for my imaging, build, and update scripts.

My end solution involves using ping to find the average ping time and comparing the average to determine the closest server.

In the case they are the same, it uses the first server.


# Pick the ping syntax for this platform; field 5 of the '/'-split summary line is the average RTT.
case `uname -s` in
SunOS)
        FST=`ping  -vs $FIRSTSR 20 5 | awk -F/ '/^round|^rtt/{printf("%d\n",$5+.5)}'`
        SST=`ping  -vs $SECSR 20 5 | awk -F/ '/^round|^rtt/{printf("%d\n",$5+.5)}'`
        ;;
*)      # assumed: Linux-style ping options on every other platform
        FST=`ping -s 20 -c 5 -v $FIRSTSR | awk -F/ '/^round|^rtt/{printf("%d\n",$5+.5)}'`
        SST=`ping -s 20 -c 5 -v $SECSR | awk -F/ '/^round|^rtt/{printf("%d\n",$5+.5)}'`
        ;;
esac

if [ "$FST" -le "$SST" ]; then
        echo "Using $FIRSTSR for nfs mount"
else
        echo "Using $SECSR for nfs mount"
fi
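
The same idea extends to more than two servers. The sketch below is not part of my production script; the function names and the server names in the usage comment are hypothetical. It assumes the ping summary line has the min/avg/max/... layout, where the average is the fifth '/'-separated field, and it keeps the first server on a tie.

```shell
#!/bin/sh
# Read ping output on stdin and print the average round-trip time,
# rounded to the nearest millisecond.
avg_rtt() {
    awk -F/ '/^round|^rtt/ { printf("%d\n", $5 + .5) }'
}

# Read "host rtt" lines on stdin and print the host with the lowest rtt.
# Lines without an rtt (for example, a failed ping) are skipped.
pick_closest() {
    awk 'NF >= 2 && (!seen || $2 + 0 < min) { min = $2 + 0; host = $1; seen = 1 }
         END { if (seen) print host }'
}

# Usage with Linux-style ping options and made-up server names:
#   for h in server1 server2 server3; do
#       echo "$h `ping -s 20 -c 5 $h | avg_rtt`"
#   done | pick_closest
```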