
Performance tuning

=== Deduplication ===
Deduplication uses an on-disk hash table, using [ extensible hashing] as implemented in the ZAP (ZFS Attribute Processor). Each cached entry consumes approximately 512 uses slightly more than 320 bytes of memory. The DDT code relies on ARC for caching the DDT entries, such that there is no double caching or internal fragmentation from the kernel memory allocator. Each pool has a global deduplication table shared across all datasets and zvols on which deduplication is enabled. Each entry in the hash table is a record of a unique block in the pool. (Where the block size is set by the <code>recordsize</code> or <code>volblocksize</code> properties.)
The hash table (also known as the DDT or DeDup Table) must be accessed for every dedup-able block that is written or freed (regardless of whether it has multiple references). If there is insufficient memory for the DDT to be cached in memory, each cache miss will require reading a random block from disk, resulting in poor performance. For example, if operating on a single 7200RPM drive that can do 100 random I/O operations per second, uncached DDT reads would limit overall write throughput to 100 blocks per second, or 400 KB/s with 4 KB blocks.
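The throughput bound above can be sketched as a quick calculation. This is a back-of-the-envelope model only: it assumes every DDT lookup misses the cache and costs exactly one random read, and the 100 IOPS and 4 KB figures are the example numbers from the text, not measured values.

```python
# Upper bound on dedup write throughput when every DDT lookup misses
# the cache: each written block costs one random disk read for its
# DDT entry, so throughput <= (random IOPS) * (block size).

def uncached_ddt_write_limit(iops, block_size_bytes):
    """Bytes per second writable when DDT reads are the bottleneck."""
    return iops * block_size_bytes

# Example from the text: 7200RPM drive, ~100 random IOPS, 4 KiB blocks.
limit = uncached_ddt_write_limit(iops=100, block_size_bytes=4096)
print(f"{limit / 1024:.0f} KiB/s")  # 400 KiB/s
```

The same model shows why the bound scales linearly with block size: with 128 KiB records the same 100 IOPS would allow roughly 12.5 MiB/s.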
The consequence is that sufficient memory to store deduplication data is required for good performance. The deduplication data is considered metadata and therefore can be cached if the <code>primarycache</code> or <code>secondarycache</code> properties are set to <code>metadata</code>. In addition, the deduplication table will compete with other metadata for metadata storage, which can have a negative effect on performance. Simulation of the number of deduplication table entries needed for a given pool can be done using the -S option to zdb. Then a simple multiplication by 320 bytes can be done to get the approximate memory requirements. Alternatively, you can estimate an upper bound on the number of unique blocks by dividing the amount of storage you plan to use on each dataset (taking into account that partial records each count as a full recordsize for the purposes of deduplication) by the recordsize and each zvol by the volblocksize, summing, and then multiplying by 320 bytes.
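The alternative estimate described above can be sketched as follows. This is a rough upper bound under the text's assumptions (about 320 bytes per cached entry, one entry per unique block, partial records rounded up to a full record); the dataset and zvol sizes in the example are illustrative, not recommendations.

```python
# Rough upper bound on DDT memory: one entry per unique block,
# ~320 bytes per cached entry (slightly more in practice).

DDT_ENTRY_BYTES = 320  # approximate per-entry cost from the text

def ddt_entries(bytes_used, block_size):
    # Partial records count as a full recordsize for dedup purposes,
    # so round the block count up (ceiling division).
    return -(-bytes_used // block_size)

def ddt_memory_bytes(filesystems):
    """filesystems: iterable of (bytes_used, block_size) pairs, one per
    dataset (use recordsize) or zvol (use volblocksize)."""
    return sum(ddt_entries(b, bs) for b, bs in filesystems) * DDT_ENTRY_BYTES

# Example: a 1 TiB dataset with 128 KiB records plus a 100 GiB zvol
# with 8 KiB volblocksize (hypothetical sizes for illustration).
fs = [(1 << 40, 128 * 1024), (100 * (1 << 30), 8 * 1024)]
print(f"{ddt_memory_bytes(fs) / (1 << 30):.2f} GiB")  # 6.41 GiB
```

Note how the small-block zvol dominates: 100 GiB at 8 KiB contributes far more entries than 1 TiB at 128 KiB, which is why dedup on small-block zvols is especially memory-hungry.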
=== Metaslab Allocator ===
