
Author Topic: HOW TO: 2.4.0 ZFS Install, RAM Disk, Hot Spare, Snapshot, Resilver Root Drive  (Read 9911 times)


Offline pfBasic

  • Hero Member
  • *****
  • Posts: 1021
  • Karma: +139/-22
EDIT: I'm glad to see this was made a sticky! I hope it is helpful, I'll keep this original post updated so that relevant info stays at the top and won't get lost in whatever discussion may come here in the future.

pfSense 2.4 is coming and with it comes ZFS, for some this feature is a non-event but others will find it very useful.

In pfSense, ZFS is good for:

  • Generally rugged filesystem (the forum has posts about users having to reinstall after UFS errors from hard shutdowns or unknown causes; ZFS greatly mitigates this)
  • High availability systems (using software RAID provides redundancy in the event of drive failure)
  • Remote systems (for the same reasons as above, it allows you to "fix" your system remotely if a drive fails without having to physically access it)
  • Cheap install media (installing to USB flash is very cheap, but flash drives are write sensitive; ZFS helps mitigate that issue) EDIT: While I've never encountered them there can be issues with USB https://forum.pfsense.org/index.php?topic=128577.msg709455#msg709455
  • Full System Backup (snapshots; not as useful on pfSense due to the ease of restoring config.xml, but still has its uses)
  • System replication (if setting up many identical systems, you can set one up and quickly distribute the install + config to the others via snapshot)

In pfSense, ZFS is bad for:

  • RAM limited systems (it uses a lot of RAM, general rule of thumb would be 1GB RAM available for ZFS only, but it's not a hard rule)

I'm not an expert of any kind in the IT or networking world, I'm a hobbyist so keep that in mind reading through this.
None of this is my original idea, it comes from various places throughout the internet.
This is probably most commonly useful for someone wishing to install to cheap media (that's why I'm doing it.)

Why flash drives? They are cheap, really cheap. I got 5x8GB for $30.
Why does ZFS help? Flash drives (the cheap ones) are write sensitive; write too many times to them and they break. Operating systems write a lot, so they can destroy a flash drive really fast. ZFS alone doesn't stop this (RAM disks mitigate it greatly); ZFS just gives you software RAID. That way if one disk fails, another is there to keep things running seamlessly.

So why raid and a hot spare? All of your disks are being written to equally in the raid pool, so if one drive fails it isn't unreasonable to think that the others will follow suit relatively soon. With a hot spare already partitioned with boot code installed, you can introduce a fresh disk with little to no writes on it to the pool.
Why bother with flash drives at all if they are so fragile? Looking through FreeNAS forums (another FreeBSD based appliance that actually recommends USB flash installs) you can see many scenarios where people have gotten years of use out of single $5 flash drives as boot drives.

This won't be complex, this guide is for non-IT people like myself. It will cover:

  • The general idea of what's happening in the pfSense 2.4 ZFS Auto Install
  • A few zpool and zfs settings
  • Partitioning, installing bootcode to and adding a hot spare to your zpool
  • Recovering from a boot drive failure
  • Snapshot basics






1. - The General Idea of What's Happening in the pfSense 2.4 ZFS Auto Install -
 
 The auto install feature is quick and straightforward but I'll mention a few things to save you some googling.
 
 Pool Type: Just read the descriptions, they tell you what's going on. If you still have questions ask here. I'm using raidz2 (not because I think I need it).
 I'm betting a 2 disk mirror is best for most people unless you know you have a reason for more.
 
 When you install I recommend inserting one disk at a time and selecting disk info, write that serial number somewhere and assign it a number and letter, then write that number or letter on the physical disk.
 i.e., disk 1, S/N: 1234567890, (write a "1" on the disk) pull disk one out, put disk two in, select Rescan disks, then Disk info, disk 2, S/N: 3126450789, (write a "2" on the disk).
 When a disk fails, zfs will give you the serial number of the failed disk, this just makes it easy to identify and replace the bad disk.
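 If the system is already booted you can skip the one-disk-at-a-time shuffle and read the serial numbers from the console instead; on FreeBSD the `ident` field of `geom disk list` holds the drive serial (device names like da0/da1 will vary with your hardware):

```shell
# List every disk with its serial number (the "ident" field);
# da0/da1 device names depend on your hardware
geom disk list | grep -E 'Geom name|ident'
```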
 
 Name the pool whatever you like, probably something easy to type out though
 
 Leave 4k sector alignment unless you have a reason not to
 
 Encrypt if you need to, I don't
 
 For flash drives I would recommend turning off swap (just select 0 or nothing), I think you lose crash boot dumps though so if you need those leave it, but swap=on means more writes to your flash drive.
 
 Once your install is complete, for flash drives I recommend going to System > Advanced > Miscellaneous and turning RAM Disk on. If you already are using pfSense you can get a ballpark of your space needs
 for /tmp & /var with:
 
Code: [Select]
du -hs /tmp
du -hs /var
 

EDIT: When performing a ZFS install to Flash media on 2.4.0 BETA, if you encounter issues booting (I have not) try adjusting the boot delay in /boot/loader.conf or /boot/loader.conf.local as follows.
Code: [Select]
kern.cam.boot_delay="10000"
Ref. https://redmine.pfsense.org/issues/7309
 





2. - A Few Zpool and ZFS Settings -

You can see your zfs settings per pool with:
Code: [Select]
zfs get all yourpoolname
If you install to a single disk, you can make zfs write two copies of everything to your drive. On flash this is probably a bad idea. The benefit is that if one copy of something you need gets corrupted, it's unlikely that the other will also
be corrupted so ZFS will likely recover from this corruption seamlessly.
Code: [Select]
zfs set copies=2 yourpoolname
You can see your zpool settings & stats with:
Code: [Select]
zpool get all yourpoolname
The only thing I'll mention here is setting autoreplace=on, it's saying that if the pool is degraded and you have a hot spare, resilver to the hot spare without asking you.
I do not recommend turning this on (it's off by default) unless you have set everything up (it doesn't just work on its own) and need it. But we'll talk about it later so I mention it.
Code: [Select]
zpool set autoreplace=on yourpoolname
ZFS can checksum your data to make sure nothing is corrupted; if it finds something corrupted AND has a redundant copy of that data, it will fix the corruption.
This is called running a scrub.
Code: [Select]
zpool scrub yourpoolname
Once the scrub is complete you can check the pool's status and it will tell you if it repaired any errors. Scrubs DO write to your drives even if they don't repair any errors.
You can see for yourself by starting a scrub and running
Code: [Select]
iostat -x -w 1
You will see writes occurring intermittently throughout the scrub, and ending when the scrub is complete (if you have RAM disk enabled in pfSense and swap=off).
Because of this I only scrub monthly via cron.
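For reference, a monthly scrub can be scheduled through the pfSense "Cron" package or a system crontab entry along these lines (the pool name and timing here are just examples, adjust to taste):

```shell
# Run a scrub at 03:00 on the 1st of each month (system crontab format)
0  3  1  *  *  root  /sbin/zpool scrub yourpoolname
```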






3. - Partitioning, Installing Bootcode To And Adding A Hot Spare To Your zpool -

So you've installed pfSense on ZFS and want a hot spare. Since your zpool is a boot pool (ZFS on /) you need to partition and set up your hot spare accordingly.
NOTE: You can't use a hot spare that is smaller than the drives in your pool.
NOTE: Don't resilver to a hot spare unless you read the next section as well.

Take a look at your pool
Code: [Select]
zpool status
You'll notice that your pool is only using the second partition of each disk. We'll do the same. In order to set the sizes equal to those in the pool run:
Code: [Select]
gpart show
If you installed to a 2 way mirror you'll see something like this (values will be different, adjust accordingly).

Code: [Select]
=>      40   8388528  da0  GPT  (4.0G)
        40      1024    1  freebsd-boot  (512K)
      1064       984       - free -  (492K)
      2048   8384512    2  freebsd-zfs  (4.0G)
   8386560      2008       - free -  (1.0M)

=>      40   8388528  da1  GPT  (4.0G)
        40      1024    1  freebsd-boot  (512K)
      1064       984       - free -  (492K)
      2048   8384512    2  freebsd-zfs  (4.0G)
   8386560      2008       - free -  (1.0M)
 

  So to create a hot spare for this:
Code: [Select]
# gpart create -s gpt da2
# gpart add -a 4k -s 512k -t freebsd-boot -l gptboot2 da2 ###This creates p1, you are using 4k alignment, size is 512k, type is freebsd-boot, label is gptboot2, you are partitioning drive da2
# gpart add -b 2048 -s 8384512 -t freebsd-zfs -l zfs2 da2 ###This creates p2, beginning at block 2048 with a size of 8384512 sectors
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da2 ###This writes the bootcode to p1 of your hot spare

You will have to adjust these commands to match your system. If you did it properly then all drives will appear identical in the output of gpart show.

Now simply add p2 of your hot spare to the pool:
Code: [Select]
zpool add yourpoolname spare da2p2
Thanks to @kpa for guiding me through this!






4. - Recovering From A Boot Drive Failure -

If a boot drive fails, your pool will show as Degraded. If your pool sustains more drive failures than your pool type can tolerate, it is lost forever and you have to reinstall.
A degraded pool will reboot just fine.
If your degraded pool is a boot pool (on pfSense with an auto install, it is) and you resilver that pool to a hot spare, it will work until you reboot, at which point it will hang.
To avoid hanging on reboot after resilvering your boot pool, you must remove the bad drive from the pool.
Code: [Select]
zpool detach yourpoolname baddiskname
This makes the hot spare a permanent part of your pool.

If you are smarter than me I'm betting you could automate this with a script, I would think something running frequently in cron along the lines of:
Code: [Select]
check if pool is degraded
if no, exit
if yes, check if resilver complete
if no, exit
if yes, detach baddisk

If anyone does write such a script, please share! ;)
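Nobody has posted one yet, so here is a rough, untested sketch of what such a script might look like. It works by parsing `zpool status` output, so the pool name, state keywords, and matching below are all assumptions you should verify against your own system before trusting it:

```shell
#!/bin/sh
# Rough sketch: detach the failed disk once a resilver to the hot spare finishes.
# POOL and the status keywords below are assumptions; verify on your own system.
POOL="yourpoolname"

STATUS=$(zpool status "$POOL")

# Not degraded? Nothing to do.
echo "$STATUS" | grep -q 'DEGRADED' || exit 0

# Resilver still running? Come back on the next cron pass.
echo "$STATUS" | grep -q 'in progress' && exit 0

# First device reported as failed (UNAVAIL/FAULTED/REMOVED)
BAD=$(echo "$STATUS" | awk '/UNAVAIL|FAULTED|REMOVED/ {print $1; exit}')

[ -n "$BAD" ] && zpool detach "$POOL" "$BAD"
```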

In addition to writing a script to automate detaching the bad disk after resilver, you would need to automate resilvering. ZFS can handle that alone.

Turn autoreplace=on (covered earlier)
AND start zfsd (the ZFS fault management daemon) on boot; without zfsd running, autoreplace=on won't do anything.
Install pfSense package "Shellcmd"
Add a new command:
Command: zfsd
Type: shellcmd

Again, only do this if you figure out a script automating detaching the disk, your system will fail to boot after resilvering to a hot spare until the bad disk is detached from the pool.


To do all of this manually, when you get a degraded pool:
Code: [Select]
zpool replace yourpoolname baddisk hotsparedisk

 After the resilver is complete
 
 
Code: [Select]
zpool detach yourpoolname baddiskname
 Alternatively, you can pull the bad disk physically, add a completely new disk (or the hot spare if you remove it from the pool), partition it as above, and the zpool replace command alone will heal your pool.
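Putting that manual recovery together, the whole sequence with a brand-new disk might look like this (da3 and the partition sizes are just examples matching the gpart output earlier; adjust everything to your system):

```shell
# Partition the replacement disk exactly like the surviving drives
# (da3 and the sizes are examples; match your own gpart show output)
gpart create -s gpt da3
gpart add -a 4k -s 512k -t freebsd-boot -l gptboot3 da3
gpart add -b 2048 -s 8384512 -t freebsd-zfs -l zfs3 da3
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da3

# Swap the new partition in for the failed one and let ZFS resilver
zpool replace yourpoolname baddiskname da3p2
```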
 
 




5. - Snapshot Basics -
 
 Snapshots reference a moment in time of your file system that is read only, you can recover completely or partially from that moment in time.
 When a snapshot is taken it takes up no space because it is the same as your current filesystem.
 As your filesystem changes the snapshot grows proportionately.
 Snapshots probably aren't all that critical for pfSense since the config.xml restore works so well, but it is still handy.
 Since snapshots take up space over time, that means more writes. So if you want to minimize writes you can store snapshots on a separate drive (like another cheap USB).
 If you want it to go to a separate drive you can add the drive, partition it as you like and (if it is also ZFS) create a new pool for it using:
 
Code: [Select]
zpool create yournewpoolname yournewdiskname
 

 
Code: [Select]
zfs snapshot -r yourpoolname@snapshotname ###Creates a recursive snapshot of your entire pool
zfs send -Rv yourpoolname@snapshotname | zfs receive -vF yournewpoolname ###Sends that snapshot to your other drive
zfs destroy -r yourpoolname@snapshotname ###Recursively destroys the entire snapshot of your pool that is stored on the boot pool
zfs list -t snapshot ###Shows you a list of all your snapshots
 

 EDIT: I don't recommend setting up a second zpool as it can cause issues with booting. If you want to send snapshots to a separate device, try a UFS filesystem on it. People smarter than myself can probably get around this; if anyone has a solution please share and I'll add it here!
 To use UFS:
After partitioning the drive follow the instructions here:
https://www.freebsd.org/doc/handbook/disks-adding.html

To send your snapshot to a UFS partition you can modify this for your mount point and copy and paste:
Code: [Select]
zfs snapshot -r yourpoolname@`date "+%d.%b.%y.%H00"` && zfs send -Rv yourpoolname@`date "+%d.%b.%y.%H00"` | gzip > /mnt/sshot/sshot`date "+%d.%b.%y.%H00."`gz && zfs destroy -r yourpoolname@`date "+%d.%b.%y.%H00"` && zfs list -r -t snapshot -o name,creation && du -hs /mnt/sshot/sshot`date "+%d.%b.%y.%H00."`gz

You can write the following to a file in pfsense (NOT in /var or /tmp if using a RAM disk, and you must modify for your mount point), make it executable (chmod +x /usr/myshells/yourfilehere) and add it to a cron job to automate your snapshots on pfSense. Just run the cron job towards the beginning of the hour as the date time group variable changes on the hour and could cause issues if the hour changes before it completes.
Code: [Select]
zfs snapshot -r yourpoolname@`date "+%d.%b.%y.%H00"` && zfs send -R yourpoolname@`date "+%d.%b.%y.%H00"` | gzip > /mnt/sshot/sshot`date "+%d.%b.%y.%H00."`gz && zfs destroy -r yourpoolname@`date "+%d.%b.%y.%H00"`
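If the one-liner is hard to read, the same thing can be unrolled into a script; POOL and MOUNT are placeholders you must change for your pool name and mount point:

```shell
#!/bin/sh
# Unrolled version of the snapshot one-liner above.
# POOL and MOUNT are placeholders; adjust for your pool and mount point.
POOL="yourpoolname"
MOUNT="/mnt/sshot"
STAMP=$(date "+%d.%b.%y.%H00")

zfs snapshot -r "${POOL}@${STAMP}" \
  && zfs send -R "${POOL}@${STAMP}" | gzip > "${MOUNT}/sshot${STAMP}.gz" \
  && zfs destroy -r "${POOL}@${STAMP}"
```

Capturing the date into STAMP once also avoids the hour-rollover issue mentioned above, since every command uses the same timestamp.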




Once the transfer is complete, you can compare file sizes of your actual pool and the backed up snapshot if you want, it should be much less than what the zfs send verbose output estimates after compression.
Code: [Select]
du -hs /mnt/yourmountdirectory/yoursnapshotbackup.gz && zfs get used yourpoolname






That's it for now, I hope this is helpful! I appreciate all comments and recommendations!

You can find more discussion on this topic here:
https://forum.pfsense.org/index.php?topic=120340.0
« Last Edit: June 26, 2017, 12:04:01 pm by pfBasic »


Offline occamsrazor

  • Jr. Member
  • **
  • Posts: 25
  • Karma: +2/-0
Hi,

I'm planning to buy one of these:

https://www.aliexpress.com/item/Latest-New-core-I5-5250U-4-LAN-Home-computer-router-server-support-pfsense-linux-firewall-Cent/32798137911.html

...and have been reading a lot in advance about the installation methods in order to determine my needs. But as I have only ever played with pfsense in a virtual machine I'm confused. I'm just planning ahead to see what will be the best combination of media to install on.

This 2.4 guide suggests using USB keys, but if I have the option to use the internal mSata SSD would that make sense to do so?

If so, and let's say I plan to use a whole bunch of packages including Squid, Suricata, etc., what would a suitable size be? My understanding with ZFS is there would still be a benefit to using ZFS when installed on a single volume... right? I'm not sure I could configure a pair of SSDs on this device. I should add that my ISP speed is low (currently 10Mb) but I am over-speccing this a lot for possible much faster speeds in future, and also in case I decide to repurpose the device as something else. I understand ZFS uses more RAM, will 8GB be enough?

Alternatively would it be better to use a pair of USB keys for the installation? If so what would be a suitable size? Would the SSD then be unused, or would it still be useful for non-boot functions?

Sorry for all the questions but I have to order everything in advance internationally so just want to get the hardware right first time in terms of RAM, SSD, USB. Actual installation will be later. Thanks in advance....

Offline pfBasic

  • Hero Member
  • *****
  • Posts: 1021
  • Karma: +139/-22
I don't recommend USB flash drives on ZFS over SSDs unless you're trying to save money and don't already have an SSD. I might recommend them over an HDD because they are silent and use less power, but the only advantage over an SSD is price.

Using flash drives complicates things, so if you have an SSD definitely use that, and yes there are advantages of ZFS over UFS in a single drive configuration. In fact, single drive would be the recommended configuration for almost all use cases unless you are using USB flash drives.

ZFS does use more RAM than UFS but it's not a huge amount in a firewall implementation. 8GB is way more than enough as far as ZFS is concerned.

Offline occamsrazor

  • Jr. Member
  • **
  • Posts: 25
  • Karma: +2/-0
Thanks a lot - that was exactly the information I was looking for.

Offline sienar

  • Newbie
  • *
  • Posts: 6
  • Karma: +4/-3
I'm assuming the answer is yes, but would the common ZFS suggestion to ensure you have ECC RAM apply to pfSense as well? The FreeNAS folks definitely like to point out the possibility of silently destroying an entire pool with a stuck bit in RAM.

Offline pfBasic

  • Hero Member
  • *****
  • Posts: 1021
  • Karma: +139/-22
No, non-ECC will be just fine. The whole FreeNAS ECC imperative is a pretty questionable argument at best. I'm pretty sure somewhere out there on the internet the developers of ZFS said in so many words that the ZFS needs ECC thing was silly.

You won't get a stuck bit that destroys your system. But for the sake of argument, even if you do, and don't have any snapshots then you just have to reinstall and restore from config on pfSense which should take about five minutes.
If you do keep snapshots regularly then you import the snapshot and mount it.
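For completeness, restoring from one of the gzipped snapshot streams made earlier would look roughly like this (the filename below is a made-up example following the earlier backup naming scheme):

```shell
# Decompress the saved stream and replay it into the pool
# (filename is a hypothetical example from the earlier backup scheme)
gunzip -c /mnt/sshot/sshot01.Jan.18.0000.gz | zfs receive -vF yourpoolname
```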

Now if it's an installation for a customer that needs high availability in a production environment then you probably should use ECC. If for no other reason than to give the customer peace of mind.

In short, if you didn't already have a reason to use ECC, then ZFS on pfSense shouldn't change your mind. But if you want to be convinced otherwise just ask the same question on the FreeNAS forums and I'm sure you'll be flamed for acknowledging that such a thing as non-ECC exists.

Offline occamsrazor

  • Jr. Member
  • **
  • Posts: 25
  • Karma: +2/-0
Quote from: pfBasic
If you install to a single disk, you can make zfs write two copies of everything to your drive. On flash this is probably a bad idea. The benefit is that if one copy of something you need gets corrupted, it's unlikely that the other will also
be corrupted so ZFS will likely recover from this corruption seamlessly.
Code: [Select]
zfs set copies=2 yourpoolname

Thanks for your earlier advice, I now have a nicely working Qotom i5 running 2.4 Beta installed on a 64GB SSD. So for an SSD would you recommend to enable this "two copies" setting? Is there any disadvantage except storage space (of which I have way more than needed)? If I do enable that should I then enable autoreplace, or is that only for if you have a 2nd drive?

Quote from: pfBasic
You can see your zpool settings & stats with:
Code: [Select]
zpool get all yourpoolname

Are there any other settings I should change in my setup? Below is the result of a zpool get all command:

Code: [Select]
NAME   PROPERTY                       VALUE                          SOURCE
zroot  size                           57.5G                          -
zroot  capacity                       1%                             -
zroot  altroot                        -                              default
zroot  health                         ONLINE                         -
zroot  guid                           xxxxxxxxxxxxxxxxxxx            default
zroot  version                        -                              default
zroot  bootfs                         zroot/ROOT/default             local
zroot  delegation                     on                             default
zroot  autoreplace                    off                            default
zroot  cachefile                      -                              default
zroot  failmode                       wait                           default
zroot  listsnapshots                  off                            default
zroot  autoexpand                     off                            default
zroot  dedupditto                     0                              default
zroot  dedupratio                     1.00x                          -
zroot  free                           56.6G                          -
zroot  allocated                      964M                           -
zroot  readonly                       off                            -
zroot  comment                        -                              default
zroot  expandsize                     -                              -
zroot  freeing                        0                              default
zroot  fragmentation                  5%                             -
zroot  leaked                         0                              default
zroot  feature@async_destroy          enabled                        local
zroot  feature@empty_bpobj            active                         local
zroot  feature@lz4_compress           active                         local
zroot  feature@multi_vdev_crash_dump  enabled                        local
zroot  feature@spacemap_histogram     active                         local
zroot  feature@enabled_txg            active                         local
zroot  feature@hole_birth             active                         local
zroot  feature@extensible_dataset     enabled                        local
zroot  feature@embedded_data          active                         local
zroot  feature@bookmarks              enabled                        local
zroot  feature@filesystem_limits      enabled                        local
zroot  feature@large_blocks           enabled                        local
zroot  feature@sha512                 enabled                        local
zroot  feature@skein                  enabled                        local

Offline pfBasic

  • Hero Member
  • *****
  • Posts: 1021
  • Karma: +139/-22
I would set it to 2 personally.

It isn't going to save you from everything, but it's certainly better than nothing.

Check out this article, it's far from a controlled test but I think it does a good job of showing us what multiple copies can and can't do for us.
http://www.openoid.net/testing-the-resiliency-of-zfs-set-copiesn/

There is a performance impact on disk writes (you have to write everything twice). But, in pfSense an SSD is so fast that even writing twice (or three times) I don't think you will notice the difference. I also think that for a pfSense application your SSD will outlive the system even with you writing double (or even triple) copies to disk.

FWIW, setting copies=x only affects future files, not what has already been written.

Since pfSense is so easy and quick to reinstall and restore config.xml, ultimately what we are trying to achieve with copies=x is to avoid the annoyance of having to troubleshoot, reinstall, or have downtime because of a few corrupted files.
From what I've read, multiple copies offers some chance of avoiding those unpleasant situations, but is by no means a guarantee. In my mind, that's valuable enough since I believe the performance & durability costs of using it are likely negligible in pfSense.
« Last Edit: May 18, 2017, 02:12:33 am by pfBasic »

Offline kpa

  • Hero Member
  • *****
  • Posts: 1177
  • Karma: +131/-6
As far as I know, with multiple copies ZFS tries to spread the storage space of the copies around the medium, which is nice for spinning disks because bad blocks, when they appear, tend to cluster around one spot. On SSDs this is not guaranteed at all though.

Offline bingo600

  • Full Member
  • ***
  • Posts: 115
  • Karma: +12/-0
Gents, I'm a total pfSense newbie (Linux user), and I'm waiting for my new Qotom Q355G4 i5 box to arrive.
It will come w. 8G RAM & 64G mSATA, but I'm going to install a Toshiba 240G SATA SSD.
Maybe I'll remove the 64G mSATA, unless someone advises me to keep both disks in there.


I'd like to install the 2.4.? on it straight away, and use ZFS.

If I just keep the 240G SSD in there, do you have any hints for a "single disk ZFS" install?

Would there be any advantage to keeping the 64G mSATA in there, besides complicating the install for a newbie?
Is the "write 2 copies" advisable for an SSD (wear)?

Do I (ZFS) still need TRIM to be enabled?
 

/Bingo
pfSense 2.4.1

QOTOM-Q355G4 Quad Lan.
CPU  : Core i5 5250U
Ram : 8GB Kingston DDR3LV 1600
LAN  : 4 x Intel 211
Disk  : 240G Toshiba Sata SSD

Offline stilez

  • Full Member
  • ***
  • Posts: 101
  • Karma: +4/-2
The guide's very good, and many people will want ZFS. I feel a lot safer with it on my NAS and data stores, and any business is likely to want it.

However it's worth noting that whether it's best for smaller and home systems is down to each person. For example: if you are happy to download or back up your config when it changes; if, when a disk goes, you can just insert a new one and reinstall pfSense with the last config; and if you're not worried about data corruption at rest (because there isn't much of it, maybe, and you have backups), then ZFS adds little except a need for more hardware and an extra HDD/SSD, because a reinstall is about 15 to 40 minutes of downtime while watching the TV.

After all, if data at rest that's actively used by the router for its own purposes (as opposed to files and directories it doesn't use itself) suffers a random bit flip or I/O error, it'll most often be caught anyway: the file won't make sense when read, and that will be clear to the administrator.

If on the other hand you want to be sure that logs and RRD, tables of IPs, Squid caches, leases, or other extensive data stays 100% intact, and there isn't downtime, or your pfSense platform hosts other data and services too, then ZFS may well be very useful.

So I would add a note to any guide about the pros and cons, because a router that isn't holding data whose at-rest integrity is much of a concern is a very different use case from other installations.
« Last Edit: August 24, 2017, 06:49:12 pm by stilez »

Offline pfBasic

  • Hero Member
  • *****
  • Posts: 1021
  • Karma: +139/-22
Yeah ZFS is certainly not a must have. The majority of users would never notice a difference.

It doesn't add a requirement for more hardware though. You can install ZFS to a single disk, you just wouldn't get some of its features.
More RAM maybe - but if you don't already have enough RAM then simply do a UFS install.

The major benefit for your average home user would be added protection against data corruption due to power outages in locales that are prone to them. There are quite a few threads about this on UFS.
The real solution to this is a UPS, but if you can't/don't want to afford a UPS then simply installing to ZFS is a viable stopgap that will very likely (but not certainly) solve this problem.

The other home user benefit would be saving money on hardware. If you are building a budget system you can save a notable amount of $ by installing to a pair of thumb drives instead of a HDD or SSD. Doing this on ZFS allows you to mirror the drives and gives you a bit of redundancy.


But again, I agree that ZFS is by no means a must have for home users. It is a very nice option to have though.

Offline Kreeblah

  • Newbie
  • *
  • Posts: 7
  • Karma: +0/-0
Is it possible to restore a config from a UFS-based system to a ZFS-based one?

I'd like to switch to ZFS once 2.4.0 is released, which I know will require a reinstall, but I've been having a hard time finding whether restoring my old config would cause issues or whether it would be better to do a manual config from scratch.  Does anybody have any information on doing that?

Offline kpa

  • Hero Member
  • *****
  • Posts: 1177
  • Karma: +131/-6
As far as I know it should work and is supported; I'd be very surprised if it didn't, because the only differences are in the storage method.