ZFS on high latency devices

There are a few requirements, though:


* Larger blocks are better.  A 128K average blocksize is OK for a receive-only pool, though 64K can be made to work adequately.  256K is better, and for a pool that directly takes non-zfs-receive writes, is probably the minimum size.  512K would be better (see the recordsize example below).
based on high latency storage may be much more painful.
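For example (dataset and snapshot names here are hypothetical), recordsize sets the upper bound on the block size of newly written data; values above 128K require the large_blocks pool feature, and zfs send needs -L/--large-block for such blocks to survive a send/receive:

 # Hypothetical dataset/snapshot names; pick a recordsize per the guidance above.
 zfs set recordsize=512K tank/data
 # Preserve blocks larger than 128K when replicating to a backup pool:
 zfs send -L tank/data@snap | zfs receive backup/data
 zfs get recordsize backup/data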


* Reads are usually not a problem.  Writes must be done carefully for the best results.


The best possible solution is a pool that only receives enough writes to fill it once.
try a pool with logbias=throughput, the increased fragmentation will destroy read performance.
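To confirm that a dataset has not been switched to throughput bias (dataset name hypothetical; the default is latency):

 # Hypothetical dataset name; logbias defaults to latency.
 zfs get logbias tank/data
 zfs set logbias=latency tank/data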


* Lots of ARC is a good thing.  Lots of dirty data space can also be a good thing, provided that dirty data stabilizes without hitting the per-pool maximum or the ARC limit.
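On Linux, the sketch below (pool name and sizes are placeholders, not recommendations) shows where to watch per-pool dirty data and where the ARC and dirty data limits live:

 # Per-TxG dirty data for the pool is in the ndirty column (pool name is a placeholder):
 cat /proc/spl/kstat/zfs/tank/txgs
 # Current and maximum ARC size:
 grep -E '^(size|c_max) ' /proc/spl/kstat/zfs/arcstats
 # Placeholder values only -- raise the limits at runtime:
 echo $((32 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_arc_max
 echo $((8 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_dirty_data_max
 # (very large dirty data values may also require raising zfs_dirty_data_max_max at module load)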


Async writes in ZFS flow very roughly as follows (a parameter-setting sketch follows the list):
   
   
* Data
** Dirty data for the pool (must be stable and at about 80% of zfs_dirty_data_max)
* TxG commit
** zfs_sync_taskq_batch_pct (traverses data structures to generate I/O)
** zio_taskq_batch_pct (for compression and checksumming)
** zio_dva_throttle_enabled (ZIO throttle)
* VDEV thread limits
** zfs_vdev_async_write_min_active
** zfs_vdev_async_write_max_active
* Aggregation (set this first)
** zfs_vdev_aggregation_limit (maximum I/O size)
** zfs_vdev_write_gap_limit (write I/O gaps)
** zfs_vdev_read_gap_limit (read I/O gaps)
* Block device scheduler (set this first)
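On Linux the tunables above are exposed under /sys/module/zfs/parameters/. A minimal sketch with placeholder values (not recommendations); some parameters, such as zio_taskq_batch_pct, are only read at module load, so the modprobe.d form is needed for those:

 # Read current values:
 cat /sys/module/zfs/parameters/zfs_vdev_aggregation_limit
 cat /sys/module/zfs/parameters/zfs_vdev_async_write_max_active
 # Set at runtime (placeholder values -- tune against your own latency and I/O-merge results):
 echo 16777216 > /sys/module/zfs/parameters/zfs_vdev_aggregation_limit
 echo 2 > /sys/module/zfs/parameters/zfs_vdev_async_write_min_active
 echo 4 > /sys/module/zfs/parameters/zfs_vdev_async_write_max_active
 # Persist across reboots (and cover load-time-only parameters) in /etc/modprobe.d/zfs.conf:
 # options zfs zfs_vdev_aggregation_limit=16777216 zfs_vdev_async_write_max_active=4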


You must work through this flow to determine whether there are any significant issues and to maximize I/O merging.  The exceptions are:


* zio_taskq_batch_pct (the default of 75% is fine)
* aggregation limit and gap limits (you can reasonably guess these)
* block device scheduler (should be noop or none; see the example below)
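For the scheduler and the aggregation/gap limits, something like the following (device name and sizes are placeholders):

 # Show available schedulers (the current one is in brackets), then disable reordering:
 cat /sys/block/sdX/queue/scheduler
 echo none > /sys/block/sdX/queue/scheduler    # "noop" on older, non-blk-mq kernels
 # Gap limits can be guessed; they let aggregation bridge small holes between I/Os:
 echo 524288 > /sys/module/zfs/parameters/zfs_vdev_write_gap_limit
 echo 262144 > /sys/module/zfs/parameters/zfs_vdev_read_gap_limit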


K is a factor that determines the likely size of free spaces on your pool after