Thursday, May 30, 2013

Firefly: failsafe image for illumos-based distros

If anybody is looking for an illumos-based failsafe image, I've created a small ISO (a USB image is also available) on SourceForge. It works the same way as the old Solaris ISOs: it looks for a bootable root pool, imports it under /a, and mounts its bootfs.

I hope someone will find this image useful for recovery purposes on any illumos-based distribution.

25 comments:

Robin Smidsrød said...

I've created a fairly simple netboot recipe for Firefly using iPXE here:

https://gist.github.com/robinsmidsrod/5781653

Hope it might be useful for others.

atar said...

Nice! Are you going to make a SPARC iso as well?

alhazred said...

Unfortunately, I no longer have any SPARC system...

Shadow Cat said...

Hi there, how do I install Firefly, IPS, and Xfce to my hdd?

alhazred said...

Look at http://sourceforge.net/projects/xstreamos/

JimKlimov said...

Hi Alex,

I've recently tried the latest firefly image on a recent OmniOS bloody, and its rpool feature flags are not compatible with those supported by your image. Can you please re-spin with a more recent illumos-gate and/or publish the recipes for others to do so?

Can the image be built by a running system from its binary bits, like the old Solaris 10 failsafe images were generated?

Otherwise, it was not a hassle to run Firefly from the HDD: I created an rpool/ROOT/firefly dataset, copied all the contents of the ISO image into it, and updated GRUB's menu.lst accordingly:


title Firefly Recovery 14/01
bootfs rpool/ROOT/firefly
kernel$ /platform/i86pc/kernel/amd64/unix
module$ /platform/i86pc/amd64/firefly

HTH,
Jim Klimov

JimKlimov said...

Also, it may be an issue especially regarding the "repair boot" intent, that Firefly only seems to provide 64-bit kernel bits in both the ISO and the compressed image (while userspace binaries have the usual 32-bit variants with 64-bit as needed).

JimKlimov said...

Well, in the end (just for kicks) I copied the current kernel from OmniOS into a clone of the firefly dataset, mounted the firefly image and updated the files in its /kernel and /platform with those available in OmniOS (not all drivers were available though), and the resulting BE is indeed bootable and able to import and mount the rpool with its features. Not surprisingly, however, it does not take into account my split-root setup (the rpool/SHARED/var/* stuff became just /a/var, while rpool/ROOT/omnios became /a/ROOT) but this suffices for testing or repairs.

Thanks Alex, great job preparing the bulk of the image and its logic! ;)

Here is what I did to update the image I had (locally installed per my comment above):

:; beadm mount firefly-151013 /a
:; gzcat /a/platform/i86pc/amd64/firefly > /tmp/ff.img

:; mkdir /tmp/a
:; mount /dev/lofi/1 /tmp/a

Added and ran this script in the image root:

####################
:; cat > /tmp/a/update-kernel.sh << 'EOF'
#!/bin/sh

# Update the kernel bits in this image with files from the running system
# (C) 2014 by Jim Klimov

for D in `pwd`/kernel `pwd`/platform; do
cd "$D" && \
find . -type f | while read F; do
RFP="/platform/$F"; RFK="/kernel/$F"; RF=""
[ -s "$RFP" ] && RF="$RFP"
[ -s "$RFK" -a -z "$RF" ] && RF="$RFK"
[ -n "$RF" ] && \
{ echo "+++ Got '$RF'"; cp -pf "$RF" "$F"; } || \
echo "=== No $RFP nor $RFK !"
done
done

EOF
####################

:; cd /tmp/a && chmod +x update-kernel.sh && ./update-kernel.sh

:; cd /
:; umount /tmp/a
:; lofiadm -d /tmp/ff.img

:; cp -pf /platform/i86pc/kernel/amd64/unix /a/platform/i86pc/kernel/amd64/unix
:; cp -pf /platform/i86pc/kernel/kmdb/amd64/unix /a/platform/i86pc/kernel/kmdb/amd64/unix

:; gzip -c -9 < /tmp/ff.img > /a/platform/i86pc/amd64/firefly

:; beadm umount /a
:; rm /tmp/ff.img

If all went well, you are ready to reboot into this updated firefly (via manual selection at boot) as the good old locally-installed failsafe image :)

//Jim

alhazred said...

Glad to hear that it helps you. As soon as I have some free time (laugh), I will try to update it.

JimKlimov said...

Pardon me, the copy-paste above missed an important line:

:; lofiadm -a /tmp/ff.img

This returns the "/dev/lofi/1" (or some other device path) to use in the mount under it.

The filesystem in the image is UFS (mount -F ufs) if this is not detected automagically for some reason.

Also, before the described procedure, the BE apparently had to be cloned before mounting:

:; beadm create -e firefly firefly-151013

(The original "firefly" contained an rsync'ed copy of the ISO contents).

HTH,
//Jim Klimov

Shadow Cat said...

Hello, Jim Klimov, could you please forward this info to my email?

Shadow Cat said...

Hello J.K,

Could you at least post the complete command sequence once more?

By the way ... great blog, Alex.

Keep the Systems Info coming ...


Thank You.

JimKlimov said...

By the way, for some reason (maybe mounting over CIFS on another box that I'm Firefly'ing now) I could not mount either the 0114 or the 0215 ISO as "hsfs"... no despair - "7z" can unpack the ISO contents nicely enough, so lofi-mounting is not required at all in the above procedure to seed the firefly dataset with Alhazred's published build :)

I'll try to (re-)write it up consistently in a subsequent comment, when I test that all went well via 7z :)

@Alex: Is there any chance that you'd publish (or push into illumos) the recipe for creation of Firefly image contents from the bits available on the system, so it might some day be re-integrated with "bootadm update-archive" (as the failsafe images were built in Solaris 10 and SXCE) or some similar new tool, in order that the recovery kernel and drivers can stay up-to-date with each upgrade (in their own BE as I do now, or in the common BE as it was with failsafe images)?
Perhaps just push what source you have to Sourceforge or GitHub and let others figure it out at will? ;) Anyhow, thanks again! :)

JimKlimov said...

The variables in my write-up below are meant to be compatible with my split-rootfs maintenance scripts from https://github.com/jimklimov/illumos-splitroot-scripts - but do not dictate that those are used (although my "beadm-clone.sh" there can simplify getting across any custom zfs attributes, if you used any - like copies or compression).
In fact, I've integrated the procedure outlined below into that project as "beadm-firefly-update.sh", to retain and especially maintain it in a better place than side-line comments ;) Still, copy-pasting from here should "just work" (though it does less error-checking and reporting along this way).
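
The `# || die ...` remarks in the snippets below refer to an error-handling helper that is not shown in these comments. A minimal sketch of what such a helper could look like follows; the name `die` comes from those remarks, but the body is an assumption and not the code from "beadm-firefly-update.sh":

```shell
#!/bin/sh
# Hypothetical helper: print a message to stderr and abort,
# preserving the exit code of the command that just failed.
die() {
    rc=$?
    [ "$rc" -eq 0 ] && rc=1
    echo "FATAL: $*" >&2
    exit "$rc"
}

# Usage: guard a step so that the procedure stops at the first failure.
# (The subshell keeps this demo from terminating the calling shell.)
( mkdir -p "${TMPDIR:-/tmp}/ff-demo.$$" || die "cannot create scratch directory" )
rmdir "${TMPDIR:-/tmp}/ff-demo.$$"
```

With something like this defined, the commented-out `|| die ...` tails can be enabled step by step when the commands are pasted into a script rather than an interactive shell.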

Have to split this post to be under 4Kb per chunk, however...

### Pre-requisite: Download the Firefly ISO image from SourceForge project
### http://sourceforge.net/projects/fireflyfailsafe/files/ to your $DOWNLOADDIR
DOWNLOADDIR="/export/distribs"

### The latest baseline Firefly version from ISO filename e.g.
:; BEOLD=$(basename "`ls --sort=time --time=ctime -1 ${DOWNLOADDIR}/firefly*.iso | head -1`" .iso) # || die ...
#:; [ $? = 0 -a -n "$BEOLD" ] || die ...
### ... example resulting string:
#:; BEOLD="firefly_0215"

### The current BE name, will be used to pick up updated files
### to refresh the FF image, and to partially name the new FF BE
:; CURRENT_BE="`beadm list -H | while IFS=";" read BENAME BEGUID BEACT BEMPT BESPACE BEPOLICY BESTAMP; do case "$BEACT" in *N*) echo "$BENAME";; esac; done`" # || die ...

### The new Firefly BE to be updated with files from BEOLD
:; BENEW="${BEOLD}-${CURRENT_BE}"

### Mountpoints. Current BE is assumed to be at root "/" :)
:; BENEW_MPT="/a"
:; BEOLD_MPT="/b"
### Here we'll lofi-mount the archive file
:; FFARCH_MPT="/tmp/a"
:; FFARCH_FILE="/tmp/ff.img"

# (to be continued)
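
The CURRENT_BE one-liner above is dense, so here is the same parsing spread out as a small function that reads `beadm list -H` style lines from stdin. This is only a sketch: the function name `active_be` and the sample lines are made up for illustration, and only the first three semicolon-separated fields (name, GUID, active flags) are looked at.

```shell
#!/bin/sh
# Print the name of every BE whose "active" field contains N (active now).
# Expects `beadm list -H` style lines on stdin: name;guid;active;mountpoint;...
active_be() {
    while IFS=';' read -r name guid active rest; do
        case "$active" in
            *N*) echo "$name" ;;
        esac
    done
}

# Made-up sample input, so the parser can be tried without beadm:
sample='firefly_0215;;-;-;251000000;static;1392460000
omnios-r151008;;NR;/;4100000000;static;1389700000'

CURRENT_BE=$(printf '%s\n' "$sample" | active_be)
echo "$CURRENT_BE"
```

On a live system the last two statements would presumably become `CURRENT_BE=$(beadm list -H | active_be)`.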

JimKlimov said...

# (needs envvars prepared in the post above)

### Seed the initial image, if needed (as a reference base to clone)
:; if ! beadm list "$BEOLD" ; then
beadm create -d "FireFly FailSafe Recovery $BEOLD (from ISO)" "$BEOLD" && \
beadm mount "$BEOLD" "$BEOLD_MPT" && \
( cd "$BEOLD_MPT" && 7z x "$DOWNLOADDIR/$BEOLD.iso" ) # || die ...
beadm umount "$BEOLD"
fi

### Clone and mount the new FF dataset to refresh the image from Current BE
:; if ! beadm list "$BENEW" ; then beadm create -d "FireFly FailSafe Recovery $BENEW (auto-updated from $BEOLD)" -e "$BEOLD" "$BENEW" ; fi
:; beadm mount "$BENEW" "$BENEW_MPT"
#:; [ -d "$BENEW_MPT" ] && ( cd "$BENEW_MPT" ) || die ...

### Prepare a copy of the Firefly image for modifications
:; gzcat "$BENEW_MPT"/platform/i86pc/amd64/firefly > "$FFARCH_FILE" # || die ...
:; mkdir -p "$FFARCH_MPT"
:; mount -F ufs "`lofiadm -a "$FFARCH_FILE"`" "$FFARCH_MPT" # || die ...

### Embed the update-script into the new image
####################
:; echo '#!/bin/sh

# Update the kernel bits in this image (rooted at "current dir" == `pwd`)
# with files from the running system (rooted at "/")
# (C) 2014 by Jim Klimov

for D in `pwd`/kernel `pwd`/platform; do
cd "$D" && \
find . -type f | while read F; do
RFP="/platform/$F"; RFK="/kernel/$F"; RF=""
[ -s "$RFP" ] && RF="$RFP"
[ -s "$RFK" -a -z "$RF" ] && RF="$RFK"
[ -n "$RF" ] && \
{ echo "+++ Got $RF"; cp -pf "$RF" "$F"; } || \
echo "=== No $RFP nor $RFK !"
done
done
' > "$FFARCH_MPT"/update-kernel.sh
####################

:; [ $? = 0 ] && ( cd "$FFARCH_MPT" && chmod +x update-kernel.sh && ./update-kernel.sh ) # || die

### Clean up...
:; cd /
:; umount "$FFARCH_MPT" && lofiadm -d "$FFARCH_FILE" && rm -rf "$FFARCH_MPT"

### Note that i386 32-bit kernels are not supported by current Firefly
:; cp -pf /platform/i86pc/kernel/amd64/unix "$BENEW_MPT"/platform/i86pc/kernel/amd64/unix # || die
:; cp -pf /platform/i86pc/kernel/kmdb/amd64/unix "$BENEW_MPT"/platform/i86pc/kernel/kmdb/amd64/unix # || die

:; gzip -c -9 < "$FFARCH_FILE" > "$BENEW_MPT"/platform/i86pc/amd64/firefly # || die

:; beadm umount "$BENEW_MPT"
:; rm "$FFARCH_FILE"

:; echo "If all went well, you are ready to reboot into this updated firefly"
:; echo "(via manual selection at boot) just like the good old locally-installed"
:; echo "failsafe image :)"
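
The update-kernel.sh embedded above always takes its replacement files from the running system rooted at "/". As a sketch only, the same loop can be written with both roots passed as arguments, which also makes it possible to try the logic on scratch directories first. The function name `refresh_kernel_bits` and its two parameters are hypothetical, not part of Firefly or of the script above.

```shell
#!/bin/sh
# Hypothetical variant of update-kernel.sh: for every file under
# IMG/kernel and IMG/platform, copy a same-named non-empty file from
# SRC/platform or SRC/kernel (in that order of preference) over it.
refresh_kernel_bits() {
    img=$1
    src=$2
    for d in kernel platform; do
        [ -d "$img/$d" ] || continue
        ( cd "$img/$d" && find . -type f ) | while read -r f; do
            rf=""
            [ -s "$src/platform/$f" ] && rf="$src/platform/$f"
            [ -z "$rf" ] && [ -s "$src/kernel/$f" ] && rf="$src/kernel/$f"
            if [ -n "$rf" ]; then
                echo "+++ Got $rf"
                cp -pf "$rf" "$img/$d/$f"
            else
                echo "=== No $src/platform/$f nor $src/kernel/$f !"
            fi
        done
    done
}

# Dry run on scratch directories (remove "$work" afterwards):
work=$(mktemp -d)
mkdir -p "$work/img/kernel/drv" "$work/src/kernel/drv"
echo old > "$work/img/kernel/drv/zfs"
echo old > "$work/img/kernel/drv/orphan"
echo new > "$work/src/kernel/drv/zfs"
refresh_kernel_bits "$work/img" "$work/src"
```

Against the real image this would be called as `refresh_kernel_bits "$FFARCH_MPT" ""`, where the empty second argument makes the source paths resolve to /platform and /kernel of the running system, as in the original loop.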

JimKlimov said...

Some follow-up thoughts regarding the snippet above: on one hand, "beadm" manages the boot-manager (GRUB) entries for newly created and destroyed BEs. On the other, it seems to just copy the last entry in the menu.lst file (e.g. the one taken from my earlier post above, see comment #6), so the "module$" line points to the "/platform/i86pc/amd64/firefly" image filename in the newly created menu entries. If that block is not inserted manually once (and at the end of the menu.lst file), the menu is likely not going to be nicely updated for this bootfs type. It may make sense to rename "firefly" to "boot_archive" for commonality of the menu-block structure, although it is just as likely that a "bootadm update-archive" would then replace it with its own usual boot archive (or maybe not, since the file-list files are not present in this bootfs).

UPDATE: A quick experiment (cloning and updating the main OS BE, and then creating a new Firefly based on it) has shown that menu.lst is updated with correct snippets for both image types, each (in its proper turn) appended to the end of file. So beadm is smart enough for that. However, during a test with a pristine menu.lst where a Firefly block never existed, it did not guess the block we needed and just copied the usual OS bootfs lines. So at least once it has to be fixed to something like:

title Firefly Recovery 14/01
bootfs rpool/ROOT/firefly
kernel$ /platform/i86pc/kernel/amd64/unix
module$ /platform/i86pc/amd64/firefly


Also note that, as recently discussed on the mailing lists, some boot-loading routines may max out at around 40 BEs, and separate Firefly BEs would get you there twice as fast.

@Alex: looking at the ISO (or bootable firefly dataset) contents, I see that nearly the only unique file there is /platform/i86pc/amd64/firefly - the "unix" binary is copied over from the current kernel... in short, it does not seem "criminal" to save the "firefly" archive straight into the same BE dataset as the "applied" OS. Same as was done in Sol10/SXCE. I can think of some pros and cons regarding reliability (two copies of "unix" vs. one seems better, and protection from breakage of the BE dataset logical structure in case of a common rootfs more often written into) vs. simplicity (one rootfs for each software revision level, less rootfs'es overall), but still wanted to ask if you have some opinion on whether such Firefly images embedded onto rpool should be united with a rootfs or should live in a separate dataset?

alhazred said...

No, I prefer the old Solaris model where each BE has its own failsafe archive.

JimKlimov said...

Both solutions have their pro's and con's, that's why I asked. But yes, I'm rather used to private failsafe's in each BE (also seems to save space on needless boot_archive's generated by beadm in firefly datasets otherwise).

I've tested by copying the firefly archive into an identical path under the main OS BE and fixing up the menu.lst accordingly (and "beadm destroy"ing the firefly dataset for surety) - it boots up as expected of a different "module" archive with same "kernel" and "bootfs" as the production OS image. Subsequent beadm clone-ing of the main OS BE and the Firefly BE (based on the firefly dataset with customized boot-menu block) created proper menu entries as well.

As for automating boot-menu maintenance for "same-bootfs different-module" configuration without changing beadm itself... I am not sure how to portably go about that, beside scripting up an ad-hoc editor of menu.lst :(
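
For what it's worth, such an ad-hoc editor can be quite small. The sketch below appends the Firefly block quoted earlier in this thread to the end of a GRUB menu file, unless a title block with the same bootfs and the firefly module is already present. The function name `ensure_firefly_entry`, its arguments and the sample menu are hypothetical; on a real system the file would usually be /rpool/boot/grub/menu.lst, and a backup copy is advisable before touching it.

```shell
#!/bin/sh
# Append a Firefly entry for BOOTFS to MENU unless one is already there.
ensure_firefly_entry() {
    menu=$1
    bootfs=$2
    title=${3:-"Firefly Recovery"}
    # Look for a title block that has both this bootfs and the firefly module.
    if awk -v fs="$bootfs" '
        $1 == "title"                { b = 0; m = 0 }
        $1 == "bootfs" && $2 == fs   { b = 1 }
        $1 == "module$" && $2 == "/platform/i86pc/amd64/firefly" { m = 1 }
        b && m                       { found = 1 }
        END                          { exit !found }
    ' "$menu"; then
        return 0
    fi
    cat >> "$menu" <<EOF

title $title
bootfs $bootfs
kernel\$ /platform/i86pc/kernel/amd64/unix
module\$ /platform/i86pc/amd64/firefly
EOF
}

# Try it on a made-up menu file instead of the live one:
menu=$(mktemp)
cat > "$menu" <<'EOF'
default 0
timeout 10
title omnios
bootfs rpool/ROOT/omnios
kernel$ /platform/i86pc/kernel/amd64/unix
module$ /platform/i86pc/amd64/boot_archive
EOF
ensure_firefly_entry "$menu" rpool/ROOT/omnios "Firefly Recovery (omnios)"
ensure_firefly_entry "$menu" rpool/ROOT/omnios "Firefly Recovery (omnios)"
cat "$menu"
```

The second call is a no-op, so the function can be run after every BE change. Appending at the end of the file matches the observation above that beadm copies the last entry when it creates new ones.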

As an aside, I hope Firefly would help migrate my other laptop from IDE mode to SATA mode booting, finally (other ways of booting - via USB-CDROM and USB-Flash - ended up reenumerating storage devices, so zfs rpool_mount failed to initialize it due to "bad" device path just as well) ;)

Shadow Cat said...

How do I run Slackware on top of Firefly?

Slackware seem to have an abundance of packages available.

JimKlimov said...

@ShadowCat : are you trolling? Why not go straight for Windows on top of Firefly? ;)

Firefly is a minimized illumos (Solaris-like) distro fit into a read-only image and tailored for one purpose: aid in recovery of botched Solaris-like installations that refuse to boot. It is not a Linux, it is not even a general-purpose OS. It is a tool, very useful in certain situations, but nothing more.

JimKlimov said...

@Alex, a couple of nits about Firefly contents:

1) Is it possible to build-in a more functional shell with proper command-line editing capabilities (I can press "UP" to scroll through history, but can't edit the resulting lines - only "ENTER" to apply them verbatim time and again) and tab-completion of paths? Bash would be nice ;)

2) When I exit the shell, nothing happens. IIRC, the Solaris Failsafes used to unmount and reboot (-p) in this case.

Thanks,
Jim Klimov

JimKlimov said...

Shameless plug: the https://github.com/jimklimov/illumos-splitroot-scripts/blob/master/bin/beadm-firefly-update.sh which initially grew from my comments on this blog, reached a stage where it can be used to manage and update Firefly images "integrated" into the current rootfs or stored in a "standalone" dedicated BE (either of these can be a source/destination for the original/resulting image file), and relevant GRUB entries.

Currently the script is not very user-friendly (it is managed via environment variables passed by the caller) and does not yet support maintenance of alternate OS BEs not rooted at "/" (to do this also during upgrades stashed into an alt-root BE, as the rest of my split-root project scripts encourage and automate), but it is getting close to that point - and then it will become an optional step at the successful finish of beadm-upgrade.sh ;)

JimKlimov said...

Yay, finally, my new script is mature enough to also manage an altroot'ed BE and so it got integrated with `beadm-upgrade.sh` of my "illumos-splitroot-scripts" project - now if an original "firefly" image, dataset or archive is available, and a recent upgrade in a "$BENEW" succeeded, a custom-tailored failsafe image is created in that BE, and the needed GRUB entry is generated just after the main OS entry (because it is the last one in GRUB's "menu.lst" at this moment).
Magic, if I may say so! :)

Also, I found that the Firefly environment does already have bash - it just needs to be executed manually, and works fine both for history and tab-completion. Default shell at the moment is ksh93 which maybe suffers from unexpected (missing?) terminal settings, but overall sucks in terms of interactive usability.

JimKlimov said...

Tested that changing symlinks in the failsafe image (specifically /sbin/sh) to bash does not seem to break anything, and enables the nicer shell by default. So this went into image-modifications automated and perpetrated by the script ;)
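
A rough sketch of that symlink change, applied to a mounted image root. The location of bash inside the image (usr/bin/bash) and the function name `use_bash_as_sh` are assumptions made for illustration; the actual script may do this differently.

```shell
#!/bin/sh
# Repoint IMG/sbin/sh at bash, if the image ships one at IMG/usr/bin/bash.
use_bash_as_sh() {
    img=$1
    [ -x "$img/usr/bin/bash" ] || { echo "no bash in $img" >&2; return 1; }
    rm -f "$img/sbin/sh" && ln -s ../usr/bin/bash "$img/sbin/sh"
}

# Try it on a scratch tree that mimics the image layout:
work=$(mktemp -d)
mkdir -p "$work/sbin" "$work/usr/bin"
printf '#!/bin/sh\n' > "$work/usr/bin/bash"
chmod +x "$work/usr/bin/bash"
ln -s ../usr/bin/ksh93 "$work/sbin/sh"
use_bash_as_sh "$work"
ls -l "$work/sbin/sh"
```

A relative link target is used so that the link resolves the same way whether the image is lofi-mounted under some directory or booted as the root filesystem.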

JimKlimov said...

Yay - (little surprise here, but anyway) - Firefly did help switch over the rpool device paths as I changed hardware setups (dual-booted OI partition with another OS, so the same rpool can be physical IDE or SATA, or VirtualBox IDE or SATA).
This is just like the recommended way of doing similar tasks with Failsafe Boot in Sol10/SXCE back in the day.

FWIW, I booted a VM instance of OI from the other OS as hypervisor, with the Firefly ISO as the boot device (since the OI installation previously booted from hardware now refused to import the rpool from VM device path to try and import the ISO contents from there). The resulting Firefly running image did not let me access the CDROM, but the runtime environment did contain the three pieces needed to make a firefly dataset: the kernel and its kmdb bits, and the mounted /dev/ramdisk:a which I dd'ed back and compressed as the firefly image file on the new dedicated bootfs (recognized and tested mountable as an UFS image).
Rebooted with this new HDD-based Firefly dataset - ok.
Exported the rpool and rebooted into proper OI. Got a stacktrace. (In hindsight, maybe I should have done some devfsadm and/or "bootadm update-archive -R /a" tricks before exporting - but did not do this explicitly in the entirety of the described procedure).
Rebooted again with "-kv" to enable debugging and see what goes wrong - but it booted okay.
Rebooted without "-kv" - still boots okay, now from VM hardware :)

So maybe the switchover is not entirely seamless, but quite workable without need for additional media in the process. And in my case of a constrained laptop, additional media means different hardware added-on, and thus different device paths from those seen in a standalone installation.

Thanks,
Jim Klimov