Difference between revisions of "Performance tuning"

522 bytes added ,  22:16, 27 June 2020
Expand overprovisioning section to discuss NVMe
m (Added note about Optane / 3D XPoint SSDs to Synchronous I/O)
(Expand overprovisioning section to discuss NVMe)
Line 142: Line 142:
If your workload involves fsync or O_SYNC and your pool is backed by mechanical storage, consider adding one or more SLOG devices. Pools that have multiple SLOG devices will distribute ZIL operations across them. The best choice for SLOG devices is likely an Optane / 3D XPoint SSD. See [[Hardware#Optane_.2F_3D_XPoint_SSDs]] for a description of them. If an Optane / 3D XPoint SSD is an option, the rest of this section on synchronous I/O need not be read. If an Optane / 3D XPoint SSD is not an option, see [[Hardware#NAND_Flash_SSDs]] for suggestions for NAND flash SSDs and also read the information below.


To ensure maximum ZIL performance on NAND flash SSD-based SLOG devices, you should also overprovision spare area to increase IOPS [http://www.anandtech.com/show/6489/playing-with-op]. You can do this with a mix of a secure erase and a partition table trick, such as the following:
==== Overprovisioning spare area ====
 
To ensure maximum ZIL performance on NAND flash SSD-based SLOG devices, you should also overprovision spare area to increase IOPS [http://www.anandtech.com/show/6489/playing-with-op]. Only about 4GB is needed for the SLOG partition, so the rest of the device can be left as overprovisioned storage. The choice of 4GB is somewhat arbitrary. Most systems do not write anything close to 4GB to the ZIL between transaction group commits, so overprovisioning all storage beyond the 4GB partition should be fine. If a workload needs more, make the partition no larger than the maximum ARC size. Even under extreme workloads, ZFS will not benefit from more SLOG storage than the maximum ARC size, which is half of system memory on Linux and 3/4 of system memory on illumos.
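
The sizing rule above can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, "4GB" is treated as 4 GiB, and the ARC limits are the defaults quoted above (a tuned ARC maximum would change the cap).

```shell
# Hypothetical helpers, not part of ZFS.
# slog_cap_bytes MEM_BYTES PLATFORM
#   Upper bound on useful SLOG size = default maximum ARC size.
slog_cap_bytes() {
    case "$2" in
        linux)   echo $(( $1 / 2 )) ;;       # half of system memory
        illumos) echo $(( $1 * 3 / 4 )) ;;   # 3/4 of system memory
        *)       echo "unknown platform: $2" >&2; return 1 ;;
    esac
}

# slog_size_bytes WANTED_BYTES MEM_BYTES PLATFORM
#   Returns the requested size, capped at the maximum ARC size.
slog_size_bytes() {
    slog_cap=$(slog_cap_bytes "$2" "$3") || return 1
    if [ "$1" -gt "$slog_cap" ]; then
        echo "$slog_cap"
    else
        echo "$1"
    fi
}

# Example: 16 GiB of RAM on Linux, default 4 GiB SLOG partition.
slog_size_bytes $(( 4 * 1024 * 1024 * 1024 )) $(( 16 * 1024 * 1024 * 1024 )) linux
```

With 16 GiB of memory on Linux, a 4 GiB request is returned unchanged, while a 12 GiB request would be capped at 8 GiB.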
 
===== Overprovisioning by secure erase and partition table trick =====
 
You can do this with a mix of a secure erase and a partition table trick, such as the following:


# Run a secure erase on the NAND-flash SSD.
Line 153: Line 159:
Alternatively, some devices allow you to change the size that they report. This would also work, although a secure erase should be done before changing the reported size to ensure that the SSD recognizes the additional spare area. On drives that support it, the reported size can be changed with `hdparm -N <sectors>` on systems that have hdparm installed.
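
As a rough sketch of how the `<sectors>` value might be worked out, the snippet below converts a target size to a sector count and prints the resulting command rather than running it. The 512-byte sector size, the 4 GiB target, the `/dev/sdX` placeholder and the helper name are assumptions for illustration. The `p` prefix asks hdparm to make the setting permanent, and some hdparm versions require an extra confirmation flag, so check the man page first.

```shell
# Hypothetical helper: sectors_for_bytes SIZE_BYTES SECTOR_BYTES
sectors_for_bytes() {
    echo $(( $1 / $2 ))
}

# 4 GiB visible area on a drive with 512-byte logical sectors (assumed).
visible=$(sectors_for_bytes $(( 4 * 1024 * 1024 * 1024 )) 512)

# Print the command only; /dev/sdX is a placeholder, not a real device.
echo "hdparm -N p${visible} /dev/sdX"
```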


The choice of 4GB is somewhat arbitrary. Most systems do not write anything close to 4GB to ZIL between transaction group commits, so overprovisioning all storage beyond the 4GB partition should be alright. If a workload needs more, then make it no more than the maximum ARC size. Even under extreme workloads, ZFS will not benefit from more SLOG storage than the maximum ARC size. That is half of system memory on Linux and 3/4 of system memory on illumos.
 
===== NVMe overprovisioning =====
 
On NVMe, you can use namespaces to achieve overprovisioning:
 
# Run a sanitize command as a precaution to ensure the device is completely clean.
# Delete the default namespace.
# Create a new namespace of size 4GB.
# Give the namespace to ZFS to use as a log device, e.g. zpool add <pool> log /dev/nvme1n1
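
The steps above might look roughly like the following with nvme-cli. This is a dry-run sketch that only prints the commands. The controller path, namespace ID 1, controller ID 0, LBA format 0 with 512-byte blocks, the pool name `tank`, and the block-erase sanitize action are all assumptions, and the attach step is not listed above but is usually needed before the namespace appears as a block device. Verify each value against your own device before running anything.

```shell
# Hypothetical dry-run helper: prints commands instead of executing them.
# nvme_slog_plan CONTROLLER POOL SIZE_BYTES BLOCK_BYTES
nvme_slog_plan() {
    ctrl=$1
    pool=$2
    blocks=$(( $3 / $4 ))   # namespace size in logical blocks

    echo "nvme sanitize $ctrl --sanact=2"                # assumed: block erase
    echo "nvme delete-ns $ctrl --namespace-id=1"         # assumed: default NSID is 1
    echo "nvme create-ns $ctrl --nsze=$blocks --ncap=$blocks --flbas=0"
    echo "nvme attach-ns $ctrl --namespace-id=1 --controllers=0"  # assumed: controller ID 0
    echo "zpool add $pool log ${ctrl}n1"
}

# Example: 4 GiB namespace on /dev/nvme1 with 512-byte blocks, pool "tank".
nvme_slog_plan /dev/nvme1 tank $(( 4 * 1024 * 1024 * 1024 )) 512
```

Printing the plan first makes it easier to review the destructive steps (sanitize and namespace deletion) before committing to them.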


=== Whole disks ===