A place for Unix Thoughts and Ideas
If you are planning on running your Linux system on a ZFS root, having an emergency boot CD is indispensable.
I was able to get a SystemRescueCd image that already had the proper ZFS modules included:
http://ftp.osuosl.org/pub/funtoo/distfiles/sysresccd/
The image can easily be written to a USB drive for convenience.
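For example, from another Linux box you can dd the ISO straight onto the stick (the ISO filename and /dev/sdX below are placeholders; double-check the target device before running this):
dd if=systemrescuecd.iso of=/dev/sdX bs=4M
sync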
Alternatively, in a pinch you can use an Ubuntu Live Image as a base, add the ZFS repos, and apt-get all the modules.
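A rough sketch of that approach from the live session (the PPA and package names are assumptions based on the ZFS-on-Linux packaging of that era, so adjust for your release):
add-apt-repository ppa:zfs-native/stable
apt-get update
apt-get install ubuntu-zfs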
What I have done for managing ZFS-related boot issues:
1. Boot up (use the graphical environment, or configure the network and change the root password for SSH)
2. Import the root pool and set up a chroot environment
mkdir /mnt/ubuntu
zpool import -f -R /mnt/ubuntu rpool
for dir in sys proc dev; do mount --bind /$dir /mnt/ubuntu/$dir; done
chroot /mnt/ubuntu bash -l
3. After this, you can run your grub commands to reinstall GRUB to your devices and regenerate your grub.cfg (an example follows step 4). Additionally, you can stop and restart the udev daemon if you need to recreate devices in /dev for grub.
4. When you are done, exit from your chroot environment and unmount the filesystems so that you can export the ZFS pool prior to rebooting the system.
for dir in sys proc dev; do umount /mnt/ubuntu/$dir; done
zpool export rpool
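A minimal sketch of the GRUB step from step 3, run inside the chroot and assuming your boot disks are /dev/sda and /dev/sdb (the device names are placeholders for your own disks):
grub-install /dev/sda
grub-install /dev/sdb
update-grub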
After Code42 dropped Solaris support in the newest update of CrashPlan, I decided it was time to move my home server from Solaris to Linux.
Using OpenZFS on Linux, the plan was to use zfs send/receive to migrate all my datasets from version 35 to version 28 pools while still on Solaris and then reboot my server onto a new Ubuntu based ZFS root.
This worked fine, except for a handful of filesystems that were created after going to Solaris 11.1, which by default created the ZFS filesystem as version 6.
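For reference, a minimal sketch of that kind of send/receive into a new pool, with placeholder pool and dataset names rather than the ones from my system:
zfs snapshot -r oldpool/data@migrate
zfs send -R oldpool/data@migrate | zfs receive -F newpool/data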
If you are creating new pools on Solaris and want to retain the ability to migrate them to ZFS on Linux (or any of the OpenZFS platforms), you will want to stay at version 28 of the pool and version 5 of the filesystem. Also, avoid activating any of the feature flags on the OpenZFS side if you want the ability to swing back to Solaris.
You will also need to watch the feature flags enabled on your new ZFS root pool, as setting the wrong flag can instantly make it incompatible with GRUB and leave your system unable to boot. To recover, you will have to boot off alternative media and create a new ZFS pool for booting, which, if you have split all your filesystems into different datasets, sounds worse than it actually is. (Support for additional features is already in GRUB's development tree, so this may not be an issue for long.)
With OpenZFS, you’ll want to create the pool with the -d option to have all features turned off by default. Otherwise the default is to enable all features.
Example (pool creation):
zpool create -o version=28 -O version=5 testpool c3t0d0
Example (filesystem creation):
zfs create -o version=5 testpool/testdata
Example (OpenZFS pool creation):
zpool create -d testpool ata-ST4000DM000-1F2168_Z302HRLS-part1
Example (OpenZFS rpool Creation):
zpool create -f -d -o feature@async_destroy=enabled -o feature@empty_bpobj=enabled -o feature@lz4_compress=enabled -o ashift=12 -O compression=lz4 rpool mirror ${DISK1} ${DISK2}
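To confirm what a pool ended up with, it is worth checking the pool and filesystem versions and the feature flags afterwards, for example:
zpool get version testpool
zfs get version testpool/testdata
zpool get all rpool | grep feature@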
Dedicated RAID-DP root aggregates for each node of a Clustered Data ONTAP cluster can really eat up the drives on a small system like a FAS2220, and can seem especially wasteful since the only unique information on those drives is the logs.
Here is a little tip I got from NetApp for setting up a two-drive RAID-4 root aggregate.
By default the system will build a 1+2 RAID-DP aggregate; this can be slimmed down to two drives by converting the aggregate from RAID-DP to RAID-4.
This can easily be done by editing the aggregate in System Manager or via the CLI:
::> storage aggregate modify -raidtype raid4 -aggregate aggr0clus1_01
Running on a two-disk RAID-4 root aggregate is supported and is explicitly mentioned in the Clustered Data ONTAP 8.2 Physical Storage Management Guide.
This can also be reversed through the same method; instead of specifying raid4, you specify raid_dp.
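For example, converting the same aggregate back to RAID-DP:
::> storage aggregate modify -raidtype raid_dp -aggregate aggr0clus1_01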
Here is a quick trick I have found useful for when I need to keep tabs on the real-time performance of my NetApp arrays.
7-Mode only allows you one interactive SSH session.
However, you can launch many more non-interactive ones.
Examples:
ssh testnetapp "stats show -p hybrid_aggr"
ssh testnetapp "stats show -p flexscale-access"
ssh testnetapp "priv set diag; wafltop show -i 10"
Here is a quick one-liner for displaying the adapter names and WWNs on an AIX host:
# lscfg | awk '/fcs/ {print $2}' | while read hba; do printf "$hba "; lscfg -vp -l $hba | grep Net | cut -d. -f14 |sed 's!\.!!g;s!\(..\)!\1:!g;s!:$!!' ;done
fcs0 C0:50:76:xx:A0:xx:yy:08
fcs1 C0:50:76:xx:A0:xx:yy:0A
fcs2 C0:50:76:xx:A0:xx:yy:0C
fcs3 C0:50:76:xx:A0:xx:yy:0E
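The same pipeline broken out over multiple lines, which can be easier to read and tweak (the logic is identical to the one-liner above):
lscfg | awk '/fcs/ {print $2}' | while read hba; do
    printf "$hba "
    lscfg -vp -l $hba | grep Net |
        cut -d. -f14 |
        sed 's!\.!!g;s!\(..\)!\1:!g;s!:$!!'
done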
Here is a quick trick if you are looking to test out Solaris 11.2 Kernel Zones with VMware Fusion.
I’m fairly sure that this trick will also work with the other VMware products.
Out of the box, if you attempt to install a kernel-zone brand zone, you will see the following error:
# zoneadm -z myfirstkz install
Platform does not support the kernel-zone brand.
zoneadm: zone myfirstkz failed to verify
Running virtinfo confirms it:
root@s112test:/dev/lofi# virtinfo
NAME              CLASS
vmware            current
non-global-zone   supported
1. The first thing is that you need to be running on an i5/i7-generation processor that supports nested paging.
2. Shut down your VM, then go into the container/folder for the VM, modify the .vmx file, and add the following to the bottom:
vhv.enable = "TRUE"
3. Save the file and then restart your VM.
4. Verify support with virtinfo:
root@s112test:/dev/lofi# virtinfo
NAME              CLASS
vmware            current
non-global-zone   supported
kernel-zone       supported
5. If you have accepted the default drive size of 16GB in VMware Fusion, you will want to add an additional drive and create a new ZFS pool for your zones, as the default rpool size is too small for a kernel zone and the install will abort during zone creation (a rough sketch follows these steps).
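A rough sketch of that last step; the disk name (c2t1d0), pool name, and volume size here are assumptions, not taken from my system:
# create a pool on the added disk and carve out a boot volume for the zone
zpool create zones c2t1d0
zfs create -V 20g zones/myfirstkz-disk0
# then, in zonecfg for the zone, point its boot device at
# dev:/dev/zvol/dsk/zones/myfirstkz-disk0 instead of the default rpool-backed volume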
You have probably noticed that I haven’t updated my blog in quite some time.
I haven’t abandoned this blog, just haven’t had as many new tricks to write about.
I switched jobs at the beginning of 2013, and my new role as an Infrastructure Architect has broadened my daily focus beyond the operation and performance management of Unix & Oracle environments. I’m now working on SAN, NAS, load balancers, backups, security and a variety of other broader topics.
I hope to have some new tricks posted in the near future.
As part of refreshing the storage on my home server, which had a stack of 1TB WD GP drives that were starting to show their age in my weekly scrubs, I decided it was time to replace my LSI SAS3081 with a newer-generation controller.
On eBay there is no shortage of LSI-based 92xx controllers for around $80-100, so I picked up an LSI SAS9201-8i, which is an IBM rebrand of an LSI 9211-8i.
Upon installing it in my system, the card’s BIOS initialized and showed all my drives, but the controller was not being seen by Solaris 11.1.
I tested the card using an Ubuntu rescue CD and confirmed the card was working, just not with Solaris 11. Since it was detected in Ubuntu, I upgraded the firmware in hopes of fixing the issue, but alas that didn’t work either.
Finally, I decided I would boot from a Solaris 11.1 Live CD to eliminate my install as an issue; the card showed up immediately.
I grabbed all the prtconf -v info and the driver files from /etc, saved them off on a USB stick, and then rebooted to continue testing.
On my install of Solaris 11.1, a grep of the prtconf output showed that the OS could see the device but wasn’t attaching a driver to it:
pci1000,72 (driver not attached)
Hardware properties:
name='assigned-addresses' type=int items=15
value=81010010.00000000.0000a000.00000000.00000100.83010014.00000000.f5040000.00000000.00004000.8301001c.00000000.f5000000.00000000.00040000
name='reg' type=int items=20
value=00010000.00000000.00000000.00000000.00000000.01010010.00000000.00000000.00000000.00000100.03010014.00000000.00000000.00000000.00004000.0301001c.00000000.00000000.00000000.00040000
name='compatible' type=string items=13
value='pciex1000,72.1000.72.3' + 'pciex1000,72.1000.72' + 'pciex1000,72.3' + 'pciex1000,72' + 'pciexclass,010700' + 'pciexclass,0107' + 'pci1000,72.1000.72.3' + 'pci1000,72.1000.72' + 'pci1000,72' + 'pci1000,72.3' + 'pci1000,72' + 'pciclass,010700' + 'pciclass,0107'
name='model' type=string items=1
value='Serial Attached SCSI Controller'
I then decided to use the add_drv command to try to link the PCI information to the mpt_sas driver, figuring I had run into a weird device driver issue, and then I got the biggest clue of all:
root@test_sys:/etc# add_drv -vi "pci1000,72@0" mpt_sas
Cannot find module (mpt_sas).
A search of /kernel/drv quickly identified that while there was an mpt driver, there was no mpt_sas driver module present.
After a quick pkg search:
root@azurite:/kernel/drv/amd64# pkg search mpt_sas
INDEX ACTION VALUE PACKAGE
driver_name driver mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-0.175.1.7.0.4.2
basename file kernel/drv/amd64/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-0.175.1.7.0.4.2
basename file kernel/drv/sparcv9/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-0.175.1.7.0.4.2
basename file kernel/drv/amd64/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-0.175.1.7.0.4.2
basename file kernel/drv/sparcv9/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-0.175.1.7.0.4.2
basename file kernel/kmdb/sparcv9/mpt_sas pkg:/developer/debug/mdb@0.5.11-0.175.1.9.0.1.2
basename file kernel/kmdb/amd64/mpt_sas pkg:/developer/debug/mdb@0.5.11-0.175.1.9.0.1.2
basename file kernel/kmdb/sparcv9/mpt_sas pkg:/developer/debug/mdb@0.5.11-0.175.1.9.0.1.2
pkg.fmri set solaris/driver/storage/mpt_sas pkg:/driver/storage/mpt_sas@0.5.11-0.175.1.7.0.4.2
After installing the driver package:
pkg install mpt_sas
The card was immediately bound to a driver and now works.
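If you want to double-check that the binding took, checks along these lines (not taken from the original troubleshooting session) should now show the driver attached:
grep mpt_sas /etc/driver_aliases
prtconf -D | grep mpt_sas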
I guess the lesson here is that while just about everything is installed with the base install of Solaris 10, you should make no such assumptions on Solaris 11.
OS X’s Time Machine backup feature is very simple to enable and use.
Unfortunately, it is almost too simple: there is no mechanism for capping the amount of storage used for backups, and they will eventually grow to take over a drive of any size.
Really the best way to work with it is to dedicate a partition to just Time Machine and nothing else.
Time Machine will prune backups as they age and when you run out of space, but depending on that functionality is very limiting.
It turns out that Time Machine has a very handy command line interface called tmutil for listing and deleting backups. It also has some additional compare commands that look like they could be very useful for tracking down changed files.
m-m:~ $ tmutil
Usage: tmutil help <verb>
Usage: tmutil version
Usage: tmutil enable
Usage: tmutil disable
Usage: tmutil startbackup [-b|--block]
Usage: tmutil stopbackup
Usage: tmutil enablelocal
Usage: tmutil disablelocal
Usage: tmutil snapshot
Usage: tmutil delete snapshot_path ...
Usage: tmutil restore [-v] src dst
Usage: tmutil compare [-a@esmugtdrvEX] [-D depth] [-I name]
       tmutil compare [-a@esmugtdrvEX] [-D depth] [-I name] snapshot_path
       tmutil compare [-a@esmugtdrvEX] [-D depth] [-I name] path1 path2
Usage: tmutil setdestination mount_point
       tmutil setdestination [-p] afp://user[:pass]@host/share
Usage: tmutil addexclusion [-p] item ...
Usage: tmutil removeexclusion [-p] item ...
Usage: tmutil isexcluded item ...
Usage: tmutil inheritbackup machine_directory
       tmutil inheritbackup sparse_bundle
Usage: tmutil associatedisk [-a] mount_point volume_backup_directory
Usage: tmutil latestbackup
Usage: tmutil listbackups
Usage: tmutil machinedirectory
Usage: tmutil calculatedrift machine_directory
Usage: tmutil uniquesize path ...
Use `tmutil help <verb>` for more information about a specific verb.
The following is an example of listing my backups and then deleting one.
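A minimal sketch of that workflow; the volume and backup names below are placeholders rather than my actual backups:
$ tmutil listbackups
/Volumes/TMBackup/Backups.backupdb/mymac/2014-05-01-083000
/Volumes/TMBackup/Backups.backupdb/mymac/2014-05-02-091500
$ sudo tmutil delete /Volumes/TMBackup/Backups.backupdb/mymac/2014-05-01-083000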
We recently updated our server naming standard to be location agnostic.
Due to this change, I had to work out a new mechanism to programmatically locate the nearest server for my imaging, build, and update scripts.
My end solution involves using ping to find the average round-trip time to each server and comparing the averages to determine the closest one.
In the case they are the same, it uses the first server.
FIRSTSR=server1
SECSR=server2
case `uname -s` in
SunOS)
    FST=`ping -vs $FIRSTSR 20 5 | awk -F/ '/^round|^rtt/{printf("%d\n",$6+.5)}'`
    SST=`ping -vs $SECSR 20 5 | awk -F/ '/^round|^rtt/{printf("%d\n",$6+.5)}'`
    ;;
*)
    FST=`ping -s 20 -c 5 -v $FIRSTSR | awk -F/ '/^round|^rtt/{printf("%d\n",$6+.5)}'`
    SST=`ping -s 20 -c 5 -v $SECSR | awk -F/ '/^round|^rtt/{printf("%d\n",$6+.5)}'`
    ;;
esac
if [ $FST -le $SST ]; then
    echo "Using $FIRSTSR for nfs mount"
else
    echo "Using $SECSR for nfs mount"
fi