OpenSolaris

ZFS Frequently Asked Questions

How can I get ZFS?

ZFS is available in the following releases:

When will ZFS be available for <insert OS here>?

There are projects under way to port ZFS to FreeBSD and to Linux (using FUSE). For more information on CDDL, see the licensing FAQ.

What does ZFS stand for?

Originally, ZFS was an acronym for "Zettabyte File System." The largest SI prefix we liked ('yotta' was out of the question) was 'zetta'. Since ZFS is a 128-bit file system, it was a reference to the fact that ZFS can store 256 quadrillion zettabytes (where each ZB is 2^70 bytes). Over time, ZFS gained a lot more features besides 128-bit capacity, such as rock-solid data integrity, easy administration, and a simplified model for managing your data.
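
The 256 quadrillion figure can be checked with a little exponent arithmetic. The following is a minimal sketch, assuming each ZB is 2^70 bytes and reading "quadrillion" as 2^50 (an assumption that reproduces the number quoted above). The variable names are illustrative only.

```shell
# Shell arithmetic is limited to 64 bits, so work with exponents
# rather than with the full 128-bit value.
total_bits=128                              # ZFS capacity: 2^128 bytes
zb_bits=70                                  # one zettabyte: 2^70 bytes
zb_count_bits=$((total_bits - zb_bits))     # 2^128 / 2^70 = 2^58 zettabytes

# Assumption: "quadrillion" is read as 2^50 here.
quadrillions=$(( (1 << zb_count_bits) / (1 << 50) ))

echo "2^${zb_count_bits} ZB = ${quadrillions} quadrillion ZB"
```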

Why does ZFS have 128-bit capacity?

Filesystems have proven to have a much longer lifetime than most traditional pieces of software, due in part to the fact that the on-disk format is extremely difficult to change. Given the fact that UFS has lasted in its current form (mostly) for nearly 20 years, it's not unreasonable to expect ZFS to last at least 30 years into the future. At this point, Moore's law starts to kick in for storage, and we start to predict that a single filesystem will hold more data than 64 bits can address. For a more thorough description of this topic, and why 128 bits is enough, see Jeff's blog entry.

What limits does ZFS have?

The limitations of ZFS are designed to be so large that they will never be encountered in any practical operation. ZFS can store 16 exabytes in each storage pool, file system, file, or file attribute. ZFS can store billions of names: files or directories in a directory, file systems in a file system, or snapshots of a file system. ZFS can store trillions of items: files in a file system, file systems, volumes, or snapshots in a pool.

Why doesn't ZFS have an fsck(1M)-like utility?

There are two basic reasons to have an fsck(1M)-like utility.

  • Verify filesystem integrity - Many times, administrators simply want to make sure that there is no on-disk corruption within their filesystems. With most filesystems, this involves running fsck(1M) while the filesystem is offline. This can be time consuming and expensive. Instead, ZFS provides the ability to 'scrub' all data within a pool while the system is live, finding and repairing any bad data in the process. There are future plans to enhance this to enable background scrubbing as well as keep track of exactly which files contained uncorrectable errors.

  • Repair on-disk state - If a machine crashes, the on-disk state of some filesystems will be inconsistent. The addition of journalling has solved some of these problems, but failure to roll the log may still result in a filesystem which needs to be repaired. In this case, there are well known pathologies of errors (such as creating a directory entry before updating the parent link) which can be reliably repaired. ZFS does not suffer from this problem because data is always consistent on disk.

    A more insidious problem occurs with faulty hardware or software. Even those filesystems or volume managers which have per-block checksums are vulnerable to a variety of other pathologies that result in valid but corrupt data. In this case, the failure mode is essentially random, and most filesystems will panic (if it was metadata) or silently return bad data to the app. In either case, an fsck(1M) utility will be of little benefit. Since the corruption matches no known pathology, it will likely be unrepairable. With ZFS, these errors will be (statistically) nonexistent in a redundant configuration. In a non-redundant config, these errors will correctly be detected, but will result in an I/O error when trying to read the block. It is theoretically possible to write a tool to repair such corruption, though any such attempt would likely be a one-off special tool. Of course, ZFS is equally vulnerable to software bugs, but the bugs would have to result in a consistent pattern of corruption to be repaired by a generic tool. During the 5 years of ZFS development no such pattern has been seen.
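
As a rough illustration of the live scrub described above, the following sketch wraps the two commands involved in a small shell function. The function name is made up for this example, and the pool name "tank" is only a placeholder borrowed from the examples elsewhere in this FAQ.

```shell
# scrub_and_report POOL: start a scrub on a live pool, then show its status.
# (Illustrative helper, not part of ZFS.)
scrub_and_report() {
    # zpool scrub returns right away; the scrub itself runs while the pool stays online.
    zpool scrub "$1" || return 1
    # -v prints verbose error information along with scrub progress.
    zpool status -v "$1"
}

# Usage (run as root; "tank" is a placeholder pool name):
#   scrub_and_report tank
```

Running `zpool status` again later shows whether the scrub has completed and whether any errors were found.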

Why does du(1) report different file sizes for ZFS and UFS?

On UFS, du(1) reports the size of the data blocks within the file. On ZFS, du(1) reports the actual size of the file as stored on disk. This includes metadata, as well as compression. This really helps answer the question of "how much more space will I get if I remove this file?" So even when compression is off, you will still see different results between ZFS and UFS.

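
To see the difference on your own system, you can compare a file's logical length with what du(1) reports for it. This is a minimal sketch that runs on any file system; the temporary file and variable names are illustrative, and the numbers you get will depend on the file system and its settings.

```shell
# Create a 1 MiB file of zeros in a temporary directory.
tmpdir=$(mktemp -d)
file="$tmpdir/sample"
dd if=/dev/zero of="$file" bs=1024 count=1024 2>/dev/null

apparent_bytes=$(wc -c < "$file" | tr -d ' ')   # logical length of the file
disk_kb=$(du -k "$file" | awk '{print $1}')     # space charged on disk, in KB

echo "apparent size: $apparent_bytes bytes"
echo "disk usage:    $disk_kb KB"

rm -rf "$tmpdir"
```

On a ZFS file system with compression enabled, a file of zeros like this one would be expected to show far less disk usage than its apparent size.
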
What can I do if ZFS panics on every boot?

ZFS is designed to survive arbitrary hardware failures through the use of redundancy (mirroring or RAID-Z). Unfortunately, certain failures in non-replicated configurations can cause ZFS to panic when trying to load the pool. This is a bug, and will be fixed in the near future (along with several other nifty features, such as background scrubbing). In the meantime, if you find yourself in the situation where you cannot boot due to a corrupt pool, do the following:

  1. boot using '-m milestone=none'
  2. # mount -o remount /
  3. # rm /etc/zfs/zpool.cache
  4. # reboot

This will remove all knowledge of pools from your system. You will have to recreate your pool and restore from backup.

Does ZFS support hot spares?

Yes, the ZFS hot spares feature is available in the Solaris Express Community Release, build 42, the Solaris Express July 2006 release, and the Solaris 10 11/06 release. For more information about hot spares, see the ZFS Administration Guide.

Can devices be removed from a ZFS pool?

You can remove a device from a mirrored ZFS configuration by using the zpool detach command. Removal of a top-level vdev, such as an entire RAID-Z group or a disk in an unmirrored configuration, is not currently supported. This feature is planned for a future release.
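
A minimal sketch of detaching one side of a mirror follows. The helper name and the device name in the usage line are hypothetical; substitute the pool and device shown by `zpool status` on your system.

```shell
# detach_mirror_side POOL DEVICE: remove one device from a mirror in the pool.
# (Illustrative helper, not part of ZFS.)
detach_mirror_side() {
    zpool detach "$1" "$2" || return 1
    zpool status "$1"      # confirm the remaining pool layout
}

# Usage (run as root; c1t1d0 is a hypothetical device name):
#   detach_mirror_side tank c1t1d0
```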

Can I use ZFS as my root file system? What about for zones?

Currently, ZFS file systems cannot be used as a root file system on Solaris 10 releases. However, a small subset of ZFS root and boot support is available in the SX community release for x86 systems. For more information, see ZFS Boot. Please stay tuned for the Solaris 10 ZFS boot schedule.

ZFS can be used as a zone root path in the Solaris Express release, but it cannot be patched or upgraded until those tools recognize ZFS file systems. Zone root paths on ZFS are not supported in the Solaris 10 release. For more information, see the Zones FAQ.

In addition, you cannot create a cachefs cache on a ZFS file system.

Is ZFS supported in a clustered environment?

SunCluster 3.2 supports a local ZFS file system as highly available (HA) in the Solaris 10 11/06 release. This support allows for live failover between systems, with automatic import of pools between systems.

If you use SunCluster 3.2 to configure a local ZFS file system as highly available, review the following caution:

Do not add a configured quorum device to a ZFS storage pool. When a configured quorum device is added to a storage pool, the disk is relabeled and the quorum configuration information is lost. This means the disk no longer provides a quorum vote to the cluster. After a disk is added to a storage pool, you can configure that disk as a quorum device. Or, you can unconfigure the disk, add it to the storage pool, then reconfigure the disk as a quorum device.

Using SunCluster 3.2 with HA-ZFS in the Nevada release is not recommended.

ZFS is not a native cluster, distributed, or parallel file system and cannot provide concurrent access from multiple, different hosts. ZFS works great when shared in a distributed NFS environment.

In the long term, we plan on investigating ZFS as a native cluster file system to allow concurrent access. This work has not yet been scoped.

Which third party backup products support ZFS?

  • EMC Networker 7.3.2 backs up and restores ZFS file systems, including ZFS ACLs.
  • Veritas Netbackup 6.5 backs up and restores ZFS file systems, including ZFS ACLs.
  • IBM Tivoli Storage Manager client software (5.4.1.2) backs up and restores ZFS file systems with both the CLI and the GUI. ZFS ACLs are also preserved.
  • Computer Associates' BrightStor ARCserve product backs up and restores ZFS file systems, but ZFS ACLs are not preserved.

Does ZFS work with SAN-attached devices?

    Yes, ZFS works with either direct-attached devices or SAN-attached devices. However, if your storage pool contains no mirror or RAID-Z top-level devices, ZFS can only report checksum errors but cannot correct them. If your storage pool consists of mirror or RAID-Z devices built using storage from SAN-attached devices, ZFS can report and correct checksum errors.

    For example, consider a SAN-attached hardware-RAID array, set up to present LUNs to the SAN fabric that are based on its internally mirrored disks. If you use a single LUN from this array to build a single-disk pool, the pool contains no duplicate data that ZFS needs to correct detected errors. In this case, ZFS could not correct an error introduced by the array.

    If you use two LUNs from this array to construct a mirrored storage pool, or three LUNs to create a RAID-Z storage pool, ZFS then would have duplicate data available to correct detected errors. In this case, ZFS could typically correct errors introduced by the array.
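
The two redundant layouts just described could be created along these lines. This is only a sketch: the helper names are made up for illustration, and the LUN device names in the usage lines are hypothetical.

```shell
# create_mirrored_pool POOL LUN1 LUN2: two-way mirror built from two LUNs.
create_mirrored_pool() {
    zpool create "$1" mirror "$2" "$3"
}

# create_raidz_pool POOL LUN1 LUN2 LUN3: RAID-Z group built from three LUNs.
create_raidz_pool() {
    zpool create "$1" raidz "$2" "$3" "$4"
}

# Usage (run as root; device names are hypothetical):
#   create_mirrored_pool tank c2t0d0 c2t1d0
#   create_raidz_pool tank c2t0d0 c2t1d0 c2t2d0
```

Either layout gives ZFS its own redundant copy of the data, independent of whatever mirroring the array performs internally.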

    In all cases where ZFS storage pools lack mirror or RAID-Z top-level virtual devices, pool viability depends entirely on the reliability of the underlying storage devices.

    If your ZFS storage pool only contains a single device, whether from SAN-attached or direct-attached storage, you cannot take advantage of features such as RAID-Z, dynamic striping, I/O load balancing, and so on.

    ZFS always detects silent data corruption. Some storage arrays can detect checksum errors, but might not be able to detect the following class of errors:

    • Accidental overwrites or phantom writes
    • Mis-directed reads and writes
    • Data path errors

    Overall, ZFS functions as designed with SAN-attached devices, but if you expose simpler devices to ZFS, you can better leverage all available features.

    In summary, if you use ZFS with SAN-attached devices, you can take advantage of the self-healing features of ZFS by configuring redundancy in your ZFS storage pools even though redundancy is available at a lower hardware level.

Why doesn't ZFS have user or group quotas?

    ZFS file systems can be used as logical administrative control points, which allow you to view usage, manage properties, perform backups, take snapshots, and so on. For home directory servers, the ZFS model enables you to easily set up one file system per user. ZFS quotas are intentionally not associated with a particular user because file systems are points of administrative control.

    ZFS quotas can be set on file systems that could represent users, projects, groups, and so on, as well as on entire portions of a file system hierarchy. This allows quotas to be combined in ways that traditional per-user quotas cannot. Per-user quotas were introduced because multiple users had to share the same file system.

    ZFS file system quotas are flexible and easy to set up. A quota can be applied when the file system is created. For example:

    # zfs create -o quota=20g tank/home/users

    User file systems created in this file system automatically inherit the 20-Gbyte quota set on the parent file system. For example:

    # zfs create tank/home/users/user1
    # zfs create tank/home/users/user2
    # zfs list -r tank/home/users
    NAME                    USED  AVAIL  REFER  MOUNTPOINT
    tank/home/users        76.5K  20.0G  27.5K  /tank/home/users
    tank/home/users/user1  24.5K  20.0G  24.5K  /tank/home/users/user1
    tank/home/users/user2  24.5K  20.0G  24.5K  /tank/home/users/user2
    

    ZFS quotas can be increased while the file systems are active, without any downtime, when the disk space in the ZFS storage pools is increased.
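
Raising a quota on a live file system might look like the following sketch. The helper name is illustrative, and the 40g value in the usage line is an arbitrary example, not a recommendation.

```shell
# raise_quota FILESYSTEM SIZE: set a new quota, then display it.
# (Illustrative helper, not part of ZFS.)
raise_quota() {
    zfs set quota="$2" "$1" || return 1
    zfs get quota "$1"
}

# Usage (run as root; 40g is an arbitrary example value):
#   raise_quota tank/home/users 40g
```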

    Rather than attempt to make user-based quotas fit an administration model that is based on file systems as points of control, the ZFS team is working to improve multiple file system management.

    An alternative to user-based quotas for limiting the disk space used for mail is to use mail server software that includes a quota feature, such as the Sun Java System Messaging Server. This software provides user mail quotas, quota warning messages, and expiration and purge features.

Can I split a mirrored ZFS configuration?

    Currently, ZFS does not support the ability to split a mirrored configuration for cloning or backup purposes. The best method for cloning and backups is to use ZFS clone and snapshot features. For information about using ZFS clone and snapshot features, see the ZFS Administration Guide. See also RFE 6421958, which requests the ability to send snapshots recursively and would improve the replication process across systems.
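
As a rough sketch of the snapshot-based approach, the helpers below take a snapshot, clone it, and copy a snapshot into another pool. The helper names, the snapshot name, and the target pool "backup" are all hypothetical; see the ZFS Administration Guide for the supported procedures.

```shell
# snapshot_and_clone FILESYSTEM SNAPNAME CLONE: snapshot a file system,
# then create a writable clone of that snapshot.
snapshot_and_clone() {
    zfs snapshot "$1@$2" || return 1
    zfs clone "$1@$2" "$3"
}

# replicate_snapshot SNAPSHOT TARGET: copy one snapshot into a new file system,
# which may live in a different pool.
replicate_snapshot() {
    zfs send "$1" | zfs receive "$2"
}

# Usage (run as root; all names are placeholders):
#   snapshot_and_clone tank/home/users/user1 monday tank/home/users/user1-clone
#   replicate_snapshot tank/home/users/user1@monday backup/user1
```

Each snapshot is sent individually in this sketch, since recursive send is still the subject of an RFE.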

    In addition to ZFS clone and snapshot features, remote replication of ZFS file systems is provided by the Sun StorageTek Availability Suite product. AVS/ZFS demonstrations are available here.

    Keep the following cautions in mind if you attempt to split a mirrored ZFS configuration for cloning or backup purposes:

    • Splitting a mirrored ZFS configuration is not supported by ZFS. RFE 6421958 is filed to provide this feature.
    • You cannot remove a disk from a mirrored ZFS configuration, back up the data on the disk, and then use this data to create a cloned pool.
    • If you want to use a hardware-level backup or snapshot feature instead of the ZFS snapshot feature, then you will need to do the following steps:
      1. zpool export pool-name
      2. Hardware-level snapshot steps
      3. zpool import pool-name
    • Any attempt to split a mirrored ZFS storage pool by removing disks or changing the hardware that is part of a live pool could cause data corruption.