OpenZFS on Arch Linux: A Practical How-To

Introduction
OpenZFS (often simply called "ZFS") has long been valued for its robust data integrity, flexible storage configurations, and high-performance caching. While Arch Linux doesn't provide ZFS packages in its official repositories due to licensing conflicts, the community has made it possible to install and maintain ZFS through the ArchZFS project.
In this blog post, we'll walk through:
- Installing OpenZFS on Arch Linux (the example commands also work on any other ZFS-enabled workstation or server)
- Creating pools with mirrors, stripes, and RAID-Z variations
- Adding SLOG (Separate Intent Log) and cache (L2ARC)
- Managing devices (offline, online, resilvering)
- Migrating datasets with zfs send and zfs receive
- Best practices for day-to-day administration and maintenance
1. Installing OpenZFS on Arch Linux
1.1 Enable the ArchZFS Repository
Because Arch Linux does not distribute ZFS packages directly, you’ll need to add the archzfs repository to your pacman.conf.
# First, edit pacman.conf
sudo vim /etc/pacman.conf
# Press i to enter insert mode, then add the following lines at the bottom:
[archzfs]
Server = https://archzfs.com/$repo/x86_64
# Press Escape, then type :x! to write the file and exit.
1.2 Synchronize and Install
After adding the repo:
sudo pacman -Sy
sudo pacman -S zfs-linux zfs-utils
Depending on your kernel variant (e.g., linux-lts), install the matching kernel module package:
sudo pacman -S zfs-linux-lts zfs-utils
1.3 Enable ZFS Services
The ZFS packages provide systemd services to automatically import and mount your pools at startup:
sudo systemctl enable zfs-import-cache.service
sudo systemctl enable zfs-mount.service
sudo systemctl enable zfs-import.target
sudo systemctl enable zfs.target
Optional - Reboot once installation is complete:
sudo reboot
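Once the system is back up, you can quickly confirm that the kernel module and userland tools are available, for example:
# Sanity check: prints the userland/module versions and confirms the module is loaded
smalley@demoa:~$ zfs version
smalley@demoa:~$ lsmod | grep zfs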
2. Pool Creation Basics
ZFS organizes storage into pools (zpools). Each pool is composed of one or more vdevs (virtual devices), which can be:
- Single disk (a vdev using one disk)
- Mirror (two or more disks, each holding a full copy of the data)
- RAID-Z variants (RAID-Z1, RAID-Z2, RAID-Z3)
- Striped (multiple disks combined for capacity and performance, but no redundancy)
2.1 Basic Terminology
- Stripe (comparable to RAID-0): No redundancy; data is written across multiple devices for speed, but if one disk fails, the whole pool fails.
- Mirror: Each block is duplicated across two or more disks, giving protection from disk failure and better read performance at the cost of usable capacity.
- RAID-Z1/2/3: Similar to RAID-5/6 concepts but with ZFS improvements, allowing the pool to tolerate 1, 2, or 3 disks failing.
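As a rough sizing example, four 4 TB disks give about 16 TB of raw space in a stripe, about 8 TB usable as two mirrored pairs, roughly 12 TB in RAID-Z1, and roughly 8 TB in RAID-Z2 (before metadata, padding, and reserved space).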
2.2 Creating a Striped Pool
Useful for testing or for data you don’t mind losing if a disk fails:
Always reference disks by a stable identifier under /dev/disk/ (by-id, by-label, by-path, or by-uuid) so device names do not shift after BIOS/UEFI or cabling changes. In these examples we use by-id:
smalley@demoa:~$ ls /dev/disk/by-id/
scsi-3600224806e10a63be9260b6b6048cab1 scsi-360022480f2e2ed849175d59f4ae8a49a wwn-0x600224808c00ff5aa6788569077f7e16
scsi-36002248079f9f66f426ea82fb0957801 wwn-0x600224806e10a63be9260b6b6048cab1 wwn-0x60022480f2e2ed849175d59f4ae8a49a
scsi-36002248085f7a4ffce559da2bfab1561 wwn-0x6002248079f9f66f426ea82fb0957801
scsi-3600224808c00ff5aa6788569077f7e16 wwn-0x6002248085f7a4ffce559da2bfab1561
# Example: stripe using two disks
smalley@demoa:~$ sudo zpool create mypool /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1 /dev/disk/by-id/scsi-36002248085f7a4ffce559da2bfab1561
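To confirm the new pool came up the way you expect, you can check its layout and capacity right away:
# Show vdev layout, health, and overall size of the new pool
smalley@demoa:~$ sudo zpool status mypool
smalley@demoa:~$ sudo zpool list mypool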
2.3 Creating a Mirrored Pool
A two-disk mirror:
smalley@demoa:~$ sudo zpool create mymirror mirror /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1 /dev/disk/by-id/scsi-36002248085f7a4ffce559da2bfab1561
You can also create three-disk or four-disk mirrors by listing more devices.
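For example, a three-way mirror would look roughly like this (the device IDs below are placeholders):
# Placeholder IDs: every block is stored on all three disks
smalley@demoa:~$ sudo zpool create mymirror3 mirror /dev/disk/by-id/diskA /dev/disk/by-id/diskB /dev/disk/by-id/diskC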
2.4 Creating a RAID-Z Pool
For RAID-Z1 with three disks:
smalley@demoa:~$ sudo zpool create myraidz raidz /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1 /dev/disk/by-id/scsi-36002248085f7a4ffce559da2bfab1561 /dev/disk/by-id/scsi-3600224808c00ff5aa6788569077f7e16
For RAID-Z2 or RAID-Z3, simply specify raidz2 or raidz3 and include enough disks:
smalley@demoa:~$ sudo zpool create myraidz2 raidz2 /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1 /dev/disk/by-id/scsi-36002248085f7a4ffce559da2bfab1561 /dev/disk/by-id/scsi-3600224808c00ff5aa6788569077f7e16 /dev/disk/by-id/scsi-36002248079f9f66f426ea82fb0957801
3. Advanced Configurations: Combining Mirrors and Stripes
ZFS also supports mixing mirrored vdevs in a larger stripe.
3.1 Multiple Mirrored Vdevs in One Pool
Suppose you have 4 disks and want two mirrored pairs:
smalley@demoa:~$ sudo zpool create bigpool \
mirror /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1 /dev/disk/by-id/scsi-36002248085f7a4ffce559da2bfab1561 \
mirror /dev/disk/by-id/scsi-3600224808c00ff5aa6788569077f7e16 /dev/disk/by-id/scsi-36002248079f9f66f426ea82fb0957801
This effectively stripes the two mirrored vdevs. You can add more mirror pairs later to expand the pool.
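For instance, growing bigpool later with a third mirrored pair might look like this (placeholder device IDs):
# Placeholder IDs: adds a third mirror vdev, which widens the stripe
smalley@demoa:~$ sudo zpool add bigpool mirror /dev/disk/by-id/diskE /dev/disk/by-id/diskF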
4. Adding SLOG and Cache (L2ARC)
4.1 Separate Intent Log (SLOG)
The ZFS Intent Log (ZIL) handles synchronous writes. By default, the ZIL resides on your main pool. A dedicated SLOG device (e.g., an SSD) can speed up these writes. Note that an SLOG device is not a write cache; it just helps confirm writes quickly.
# Add an SSD (e.g. /dev/nvme0n1 but use the dev disk by-id format) as an SLOG device
smalley@demoa:~$ sudo zpool add mypool log /dev/disk/by-id/scsi-360022480f2e2ed849175d59f4ae8a49a
If the SSD fails, the pool remains intact, but the latest in-flight synchronous writes could be lost if not mirrored. For critical data, consider mirroring the SLOG:
smalley@demoa:~$ sudo zpool add mypool log mirror /dev/disk/by-id/scsi-36002248079f9f66f426ea82fb0957801 /dev/disk/by-id/scsi-36002248079f9f66f426ea89fb0957206
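An SLOG can also be removed later without affecting pool data; for the single log device added earlier, that looks roughly like:
# Remove a log device; the ZIL falls back to the main pool
smalley@demoa:~$ sudo zpool remove mypool /dev/disk/by-id/scsi-360022480f2e2ed849175d59f4ae8a49a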
4.2 Level 2 ARC (L2ARC)
ZFS’s primary read cache (ARC) lives in RAM. L2ARC is a secondary cache placed on faster storage (SSD/NVMe) to hold data evicted from the ARC.
# Add an SSD as L2ARC
smalley@demoa:~$ sudo zpool add mypool cache /dev/disk/by-id/scsi-36002348079f9f66f426ea89fb0957209
Data is still on the main pool; if the L2ARC device fails, no data is lost. L2ARC is best used if you have enough RAM to track the metadata for the cached blocks.
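To see how the log and cache devices are actually being used, zpool iostat gives a per-device breakdown:
# Per-vdev I/O statistics, refreshed every 5 seconds (Ctrl-C to stop)
smalley@demoa:~$ sudo zpool iostat -v mypool 5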
5. Disk Management: Offline, Online, and Replacement
5.1 Taking a Disk Offline
If a disk starts failing or you need to remove it:
smalley@demoa:~$ sudo zpool offline mypool /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1
The pool will show a degraded state. Data remains accessible if sufficient redundancy exists.
5.2 Bringing a Disk Online
After maintenance or replacement, you can bring it back:
smalley@demoa:~$ sudo zpool online mypool /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1
5.3 Replacing a Failed Disk
If a disk fails, replace it with a new one (again referenced by its by-id path):
smalley@demoa:~$ sudo zpool replace mypool /dev/disk/by-id/scsi-3600224806e10a63be9260b6b6048cab1 /dev/disk/by-id/scsi-3600224808c00ff5aa6788569077f7e16
ZFS then resilvers the new disk.
6. Resilvering and RAID-Z Benefits
6.1 Resilvering Explained
Resilvering is ZFS’s process of rebuilding a disk’s data (in mirrors or RAID-Z) after a replacement or reintroduction. It checks blocks on all other disks, recomputes missing data or parity, and writes to the new disk.
smalley@demoa:~$ sudo zpool status mypool
This command shows resilver progress and estimates.
6.2 RAID-Z Level Benefits
- RAID-Z1: Tolerates 1 disk failure
- RAID-Z2: Tolerates 2 disk failures
- RAID-Z3: Tolerates 3 disk failures
Larger arrays often benefit from higher redundancy (Z2 or Z3). You must balance capacity, performance, and fault tolerance.
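For example, a six-disk RAID-Z2 built from 4 TB drives offers roughly 16 TB of usable capacity and keeps serving data even with any two drives failed.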
7. Migrating Data with zfs send and zfs receive
7.1 Creating Snapshots
smalley@demoa:~$ sudo zfs snapshot mypool/mydataset@snap1
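You can list existing snapshots to confirm it was created, for example:
# Show all snapshots known to the system
smalley@demoa:~$ sudo zfs list -t snapshot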
7.2 Sending the Snapshot
smalley@demoa:~$ sudo zfs send mypool/mydataset@snap1 > mydataset_snap1.zfs
You can pipe directly to another system:
smalley@demoa:~$ sudo zfs send mypool/mydataset@snap1 | ssh user@otherserver "zfs receive backup/mydataset"
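The stream saved to a file above can later be restored with zfs receive; the target dataset name here is just an example:
# Restore the saved stream into a new dataset (example target name)
smalley@demoa:~$ sudo zfs receive mypool/restored < mydataset_snap1.zfs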
7.3 Incremental Sends
Send only differences between snapshots:
smalley@demoa:~$ sudo zfs send -i mypool/mydataset@snap1 mypool/mydataset@snap2 | ssh user@otherserver "zfs receive backup/mydataset"
8. Additional Administration and Management Tips
- Scrubs and Health Checks
  - Run zpool scrub mypool regularly to detect and correct data issues.
  - Check status with zpool status mypool.
- Compression and De-duplication
  - Compression (such as lz4) is lightweight and usually worth enabling; see the example below.
  - De-duplication is resource-heavy; only enable it if you have ample RAM.
- Encryption
  - OpenZFS supports native encryption on some platforms; usage varies by distro (see the sketch after this list).
- Snapshots and Rollbacks
  - Create snapshots frequently for backups.
- Backing Up Configuration
  - Keep a copy of /etc/pacman.conf and any ArchZFS config changes.
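For native encryption, a minimal sketch (assuming the pool was created with OpenZFS 0.8 or later so the encryption feature is available; the dataset name is just an example):
# Example only: creates an encrypted child dataset and prompts for a passphrase
smalley@demoa:~$ sudo zfs create -o encryption=on -o keyformat=passphrase mypool/secure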
Make sure the pool's cachefile is set so the pool can be imported automatically at boot:
smalley@demoa:~$ sudo zpool set cachefile=/etc/zfs/zpool.cache mypool
Roll back if necessary:
smalley@demoa:~$ sudo zfs rollback mypool/mydataset@snap1
Enable compression on datasets:
smalley@demoa:~$ sudo zfs set compression=lz4 mypool/mydataset
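To check whether compression is paying off, you can inspect the configured algorithm and the achieved ratio:
# compressratio is a read-only property maintained by ZFS
smalley@demoa:~$ sudo zfs get compression,compressratio mypool/mydataset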
Conclusion
Whether you’re setting up a basic home lab with mirrored HDDs or a mission-critical server environment leveraging RAID-Z2 with dedicated SLOG and L2ARC, ZFS is built for data integrity, flexibility, and performance. Arch Linux users can harness these strengths via the ArchZFS repository, gaining access to advanced storage features that rival any enterprise solution.
Key Takeaways:
- Use mirrors or RAID-Z for redundancy and data protection.
- SLOG devices accelerate synchronous writes, while L2ARC extends read caching beyond system RAM.
- Offline and replace disks carefully, and let ZFS resilver automatically.
- zfs send and zfs receive enable powerful, incremental backups and migrations.
With this guide, you can confidently administer OpenZFS on Linux, from initial installation to advanced day-to-day management. Happy ZFS-ing!