Features

From OpenZFS
Revision as of 16:31, 13 September 2013 by Mahrens (talk | contribs)

This page describes some of the more important features and performance improvements that are part of OpenZFS.

Help porting features to platforms whose status is "not yet" would be appreciated.

Feature Flags Overview

See these slides (Jan 2012) for more details.

libzfs_core

See this blog post (Jan 2012) and associated slides and video for more details.

CLI Usability

These are improvements to the command line interface. While the end result is a generally more friendly user interface, getting the desired behavior often required modifications to the core of ZFS.

Listed in chronological order (oldest first).

Pool Comment

OpenZFS has a per-pool comment property which can be set with the zpool set command. The comment can be read even when the pool is not imported, so it remains accessible even if the pool cannot be imported.

illumos Nov 2011
FreeBSD ??
ZFS on Linux Aug 2012
Mac ZFS ??

Size Estimates for zfs send and zfs destroy

This feature enhances OpenZFS's internal space accounting information. This new accounting information is used to provide a -n (dry-run) option for zfs send which can instantly calculate the amount of send stream data a specific zfs send command would generate. It is also used for a -n option for zfs destroy which can instantly calculate the amount of space that would be reclaimed by a specific zfs destroy command.

illumos Nov 2011
FreeBSD ??
ZFS on Linux Jul 2012
Mac ZFS ??

vdev Information in zpool list

OpenZFS adds a -v option to the zpool list command which shows detailed sizing information about the vdevs in the pool:

$ zpool list -v
NAME          SIZE  ALLOC   FREE  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
dcenter      5.24T  3.85T  1.39T         -    73%  1.00x  ONLINE  -
  mirror      556G   469G  86.7G         -
    c2t1d0       -      -      -         -
    c2t0d0       -      -      -         -
  mirror      556G   493G  63.0G         -
    c2t3d0       -      -      -         -
    c2t2d0       -      -      -         -
  mirror      556G   493G  62.7G         -
    c2t5d0       -      -      -         -
    c2t4d0       -      -      -         -
  mirror      556G   494G  62.5G         -
    c2t8d0       -      -      -         -
    c2t6d0       -      -      -         -
  mirror      556G   494G  62.2G         -
    c2t10d0      -      -      -         -
    c2t9d0       -      -      -         -
  mirror      556G   494G  61.9G         -
    c2t12d0      -      -      -         -
    c2t11d0      -      -      -         -
  mirror     1016G   507G   509G         -
    c1t1d0       -      -      -         -
    c1t5d0       -      -      -         -
  mirror     1016G   496G   520G         -
    c1t3d0       -      -      -         -
    c1t4d0       -      -      -         -

illumos Jan 2012
FreeBSD ??
ZFS on Linux Sept 2012
Mac ZFS ??

ZFS list snapshot property alias

Accepts snap as an alias for snapshot in zfs list -t. Functionally identical to the Solaris 11 extension zfs list -t snap.

illumos not yet ported
FreeBSD ??
ZFS on Linux Apr 2012
Mac ZFS ??

ZFS snapshot alias

Accepts zfs snap as an alias for zfs snapshot. Functionally identical to the Solaris 11 extension zfs snap.

illumos not yet ported
FreeBSD ??
ZFS on Linux Apr 2012
Mac ZFS ??

zfs send Progress Reporting

OpenZFS introduces a -v option to zfs send which reports per-second information on how much data has been sent, how long it has taken, and how much data remains to be sent.

illumos May 2012
FreeBSD ??
ZFS on Linux Sept 2012
Mac ZFS ??

Arbitrary Snapshot Arguments to zfs snapshot

illumos Jun 2012
FreeBSD ??
ZFS on Linux August 2013
Mac ZFS ??


Performance

These are significant performance improvements, often requiring substantial restructuring of the source code.

Listed in chronological order (oldest first).

SA based xattrs

Improves performance of Linux-style (short) xattrs. These are not to be confused with Solaris-style Extended Attributes, which are full-fledged files or "forks", like NTFS streams. This work could be extended to also improve the performance, on illumos, of small Extended Attributes whose permissions are the same as the containing file.

Requires a disk format change and is off by default until Filesystem (ZPL) Feature Flags are implemented (not to be confused with zpool Feature Flags).

illumos not yet (needs additional functionality)
FreeBSD ??
ZFS on Linux Oct 2011
Mac ZFS ??

Note that SA based xattrs are no longer used on symlinks as of Aug 2013 until an issue is resolved.

Use the slog even with logbias=throughput

illumos ??
FreeBSD ??
ZFS on Linux Oct 2011
Mac ZFS ??

Asynchronous Filesystem and Volume Destruction

Destroying a filesystem requires traversing all of its data in order to return its used blocks to the pool's free list. Before this feature, the filesystem was not fully removed until all blocks had been reclaimed. If the destroy operation was interrupted by a reboot or power outage, the next attempt to import the pool (probably during boot) would need to complete the destroy operation synchronously, possibly delaying boot for a long time.

With asynchronous destruction the filesystem's data is immediately moved to a "to be freed" list, allowing the destroy operation to complete without traversing any of the filesystem's data. A background process reclaims blocks from this "to be freed" list and is capable of resuming this process after reboots without slowing the pool import process.

The new freeing algorithm also has a significant performance improvement when destroying clones. The old algorithm took time proportional to the number of blocks referenced by the clone, even if most of those blocks could not be reclaimed because they were still referenced by the clone's origin. The new algorithm only takes time proportional to the number of blocks unique to the clone.
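The "to be freed" list described above can be illustrated with a small toy model. This is only a sketch of the idea, not the OpenZFS implementation: the Pool class, its methods, and the block numbers are all hypothetical, and a plain in-memory queue stands in for the on-disk list that survives reboots.

```python
from collections import deque


class Pool:
    """Toy pool (hypothetical): free blocks plus a 'to be freed' list."""

    def __init__(self):
        self.filesystems = {}        # name -> list of block ids
        self.to_be_freed = deque()   # stands in for the persistent on-disk list
        self.free_blocks = set()

    def create(self, name, blocks):
        self.filesystems[name] = list(blocks)

    def destroy(self, name):
        # Hand the whole block list to the deferred-free queue without
        # visiting any individual block, so destroy returns immediately.
        self.to_be_freed.append(self.filesystems.pop(name))

    def reclaim(self, max_blocks):
        """Background step: free up to max_blocks, then yield control."""
        freed = 0
        while self.to_be_freed and freed < max_blocks:
            pending = self.to_be_freed[0]
            while pending and freed < max_blocks:
                self.free_blocks.add(pending.pop())
                freed += 1
            if not pending:
                self.to_be_freed.popleft()
        return freed


pool = Pool()
pool.create("tank/home", range(10))
pool.destroy("tank/home")     # completes without touching the data
first = pool.reclaim(4)       # partial progress, e.g. before a reboot
second = pool.reclaim(100)    # later call resumes where reclaim left off
```

Because the remaining work lives in the list rather than in the destroy call itself, an interrupted reclaim can simply be continued by a later background pass.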

See this blog post for more detailed performance analysis.

Note: The async_destroy feature flag must be enabled to take advantage of this.

illumos May 2012
FreeBSD ??
ZFS on Linux Jan 2013
Mac ZFS ??

empty bpobjs?

illumos Aug 2012
FreeBSD ??
ZFS on Linux Dec 2012
Mac ZFS ??

single copy arc

illumos Sep 2012
FreeBSD ??
ZFS on Linux Dec 2012
Mac ZFS ??

FASTWRITE Algorithm

Improves synchronous IO performance.

illumos not yet ported
FreeBSD ??
ZFS on Linux Oct 2012
Mac ZFS ??

Note that a locking enhancement is being reviewed.

George's massive spacemap performance improvements (the set from 2012)?

also 16a4a8074274d2d7cc408589cf6359f4a378c861 and 9eb57f7f3fbb970d4b9b89dcd5ecf543fe2414d5

illumos Nov 2012
FreeBSD ??
ZFS on Linux May 2013
Mac ZFS ??

nop-write?

illumos Nov 2012
FreeBSD ??
ZFS on Linux Not yet (as of Sept. 13, 2013)
Mac ZFS ??

lz4 compression

Note: The lz4_compress feature flag must be enabled to take advantage of this.

illumos Jan 2013
FreeBSD ??
ZFS on Linux Jan 2013
Mac ZFS ??

synctask rewrite

illumos Feb 2013
FreeBSD ??
ZFS on Linux Sept 2013
Mac ZFS ??

l2arc compression

illumos Jun 2013
FreeBSD ??
ZFS on Linux Aug 2013
Mac ZFS ??

arc shouldn't cache freed blocks

illumos Jun 2013
FreeBSD ??
ZFS on Linux Jun 2013
Mac ZFS ??

Improve N-way mirror performance

Queues read requests to least busy leaf vdev in mirrors.
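As a rough illustration of "least busy" selection, the sketch below picks the mirror child with the fewest pending reads. The function name and the use of a simple pending-request count as the load measure are assumptions made for this example; the actual implementation may weigh load differently.

```python
def pick_mirror_child(pending):
    """Return the index of the leaf vdev with the fewest pending reads.

    `pending` is a list of outstanding-request counts, one per child.
    Ties go to the lowest index.
    """
    return min(range(len(pending)), key=lambda i: pending[i])


# Simulate six reads against a 3-way mirror with no completions.
pending = [0, 0, 0]
chosen = []
for _ in range(6):
    child = pick_mirror_child(pending)
    pending[child] += 1
    chosen.append(child)
```

With no completions, the reads spread evenly across the three children; a child that falls behind would accumulate pending requests and be skipped until it catches up.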

illumos not yet ported
FreeBSD ??
ZFS on Linux Jul 2013
Mac ZFS ??

smoother write throttle

The write throttle (dsl_pool_tempreserve_space() and txg_constrain_throughput()) is rewritten to produce much more consistent delays when under constant load. The new write throttle is based on the amount of dirty data, rather than guesses about future performance of the system. When there is a lot of dirty data, each transaction (e.g. write() syscall) will be delayed by the same small amount. This eliminates the "brick wall of wait" that the old write throttle could hit, causing all transactions to wait several seconds until the next txg opens. One of the keys to the new write throttle is decrementing the amount of dirty data as i/o completes, rather than at the end of spa_sync(). Note that the write throttle is only applied once the i/o scheduler is issuing the maximum number of outstanding async writes. See the block comments in dsl_pool.c and above dmu_tx_delay() for more details.
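The following sketch shows one way a dirty-data-based delay could behave. It is a model under stated assumptions, not the code in dmu_tx_delay(): the function and class names, the 60% threshold, the scale factor, and the delay cap are all illustrative values chosen for this example. The points it tries to capture are that the delay depends only on the current amount of dirty data, and that dirty data is decremented as each i/o completes.

```python
def tx_delay_ns(dirty, dirty_max, min_percent=60,
                scale_ns=500_000, max_ns=100_000_000):
    """Delay (ns) applied to one transaction; all parameters are assumed."""
    min_dirty = dirty_max * min_percent // 100
    if dirty <= min_dirty:
        return 0                      # little dirty data: no throttling
    if dirty >= dirty_max:
        return max_ns                 # at the limit: cap the delay
    # Delay grows smoothly as dirty data approaches the maximum.
    delay = scale_ns * (dirty - min_dirty) // (dirty_max - dirty)
    return min(delay, max_ns)


class DirtyTracker:
    """Tracks dirty bytes; decrements as i/o completes, not at sync end."""

    def __init__(self, dirty_max):
        self.dirty_max = dirty_max
        self.dirty = 0

    def write(self, nbytes):
        delay = tx_delay_ns(self.dirty, self.dirty_max)
        self.dirty += nbytes
        return delay

    def io_done(self, nbytes):
        self.dirty = max(0, self.dirty - nbytes)


tracker = DirtyTracker(dirty_max=100)
d1 = tracker.write(85)    # pool was clean, so no delay
d2 = tracker.write(5)     # 85% dirty: a small delay
tracker.io_done(40)       # completed i/o lowers dirty data right away
d3 = tracker.write(1)     # back under the threshold: no delay
```

At a steady level of dirty data, every transaction sees the same delay, which is the behaviour contrasted above with the old "brick wall of wait".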

The ZFS i/o scheduler (vdev_queue.c) now divides i/os into 5 classes: sync read, sync write, async read, async write, and scrub/resilver. The scheduler issues a number of concurrent i/os from each class to the device. Once a class has been selected, an i/o is selected from this class using either an elevator algorithm (async, scrub classes) or FIFO (sync classes). The number of concurrent async write i/os is tuned dynamically based on i/o load, to achieve good sync i/o latency when there is not a high load of writes, and good write throughput when there is. See the block comment in vdev_queue.c for more details.
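The class-based selection can be sketched as follows. This is a simplified model rather than the logic in vdev_queue.c: the VdevQueue class is hypothetical, classes are assumed to be tried in a fixed priority order, each class has a fixed concurrency limit (the dynamic tuning of async writes is left out), and the elevator is reduced to "lowest offset first".

```python
from collections import deque

SYNC_READ, SYNC_WRITE, ASYNC_READ, ASYNC_WRITE, SCRUB = range(5)
FIFO_CLASSES = {SYNC_READ, SYNC_WRITE}   # sync classes are served in order


class VdevQueue:
    """Hypothetical per-device queue with one pending list per class."""

    def __init__(self, max_active):
        self.max_active = dict(max_active)             # class -> limit
        self.active = {c: 0 for c in range(5)}
        self.pending = {c: deque() for c in range(5)}  # (offset, tag)

    def enqueue(self, cls, offset, tag):
        self.pending[cls].append((offset, tag))

    def next_io(self):
        # Assumed policy: first class with work and spare concurrency wins.
        for cls in range(5):
            queue = self.pending[cls]
            if not queue or self.active[cls] >= self.max_active[cls]:
                continue
            if cls in FIFO_CLASSES:
                io = queue.popleft()       # FIFO for sync classes
            else:
                io = min(queue)            # simplified elevator: lowest offset
                queue.remove(io)
            self.active[cls] += 1
            return cls, io
        return None

    def done(self, cls):
        self.active[cls] -= 1


q = VdevQueue({c: 1 for c in range(5)})
q.enqueue(ASYNC_WRITE, 900, "w1")
q.enqueue(ASYNC_WRITE, 100, "w2")
q.enqueue(SYNC_READ, 500, "r1")
q.enqueue(SYNC_READ, 50, "r2")
a = q.next_io()        # sync read, oldest first: r1
b = q.next_io()        # sync read is at its limit, so async write w2
c = q.next_io()        # both classes at their limits: nothing to issue
q.done(SYNC_READ)
d = q.next_io()        # sync read slot freed: r2
```

Keeping a separate limit per class is what lets a burst of async writes queue up without starving sync reads of device slots.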

illumos Aug 2013
FreeBSD not yet
ZFS on Linux not yet
Mac ZFS not yet

Dataset Properties

These are new filesystem, volume, and snapshot properties which can be accessed with the zfs(1) command's get subcommand. See the zfs(1) manpage for your distribution for more details on each of these properties.

refcompressratio

The compression ratio achieved for all data referenced by (but not necessarily unique to) a snapshot, filesystem, or volume, expressed as a multiplier.

illumos Jun 2011
FreeBSD ??
ZFS on Linux Aug 2012
Mac ZFS ??

clones

For snapshots, this property is a comma-separated list of filesystems or volumes which are clones of this snapshot.

illumos Nov 2011
FreeBSD ??
ZFS on Linux Jul 2012
Mac ZFS ??

written

The amount of referenced space written to this dataset since the previous snapshot.

illumos Nov 2011
FreeBSD ??
ZFS on Linux Jul 2012
Mac ZFS ??

written@<snap>

The amount of referenced space written to this dataset since the specified snapshot. This is the space referenced by this dataset, but not referenced by the specified snapshot.

illumos Nov 2011
FreeBSD ??
ZFS on Linux Jul 2012
Mac ZFS ??

logicalused, logicalreferenced

The amount of space used or referenced, before taking into account compression.

illumos Feb 2013
FreeBSD ??
ZFS on Linux not yet ported
Mac ZFS ??
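The definition of written@<snap> (space referenced by the dataset but not by the specified snapshot) can be made concrete with a toy model. The block names and sizes below are invented, and representing a dataset as a dictionary of blocks is purely illustrative; ZFS does not necessarily compute the property this way internally.

```python
def written_since(dataset_blocks, snapshot_blocks):
    """Bytes referenced by the dataset but not by the given snapshot.

    Both arguments map a (hypothetical) block id to its size in bytes.
    """
    return sum(size for blk, size in dataset_blocks.items()
               if blk not in snapshot_blocks)


snap1 = {"a": 4096, "b": 4096}            # older snapshot
snap2 = {"a": 4096, "c": 8192}            # most recent snapshot
live = {"a": 4096, "c": 8192, "d": 512}   # current dataset contents

written = written_since(live, snap2)         # models "written"
written_at_snap1 = written_since(live, snap1)  # models "written@snap1"
```

In this model, written is simply the special case of written@<snap> where the snapshot is the most recent one.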