Difference between revisions of "Performance tuning"

3,581 bytes added, 21:32, 27 June 2020
m (→‎InnoDB: typo fix)
m (Fixed ashift url.)
(12 intermediate revisions by the same user not shown)
Line 30: Line 30:
* [http://wiki.illumos.org/display/illumos/ZFS+and+Advanced+Format+disks#ZFSandAdvancedFormatdisks-OverridingthePhysicalBlockSize sd.conf] on illumos
* [https://www.freebsd.org/cgi/man.cgi?query=gnop&sektion=8&manpath=FreeBSD+10.2-RELEASE gnop(8)] on FreeBSD; see for example [http://web.archive.org/web/20151022020605/http://ivoras.sharanet.org/blog/tree/2011-01-01.freebsd-on-4k-sector-drives.html FreeBSD on 4K sector drives] (2011-01-01)
* [http://zfsonlinux.org/faq.html#HowDoesZFSonLinuxHandlesAdvacedFormatDrives -o ashift=] on ZFS on Linux
* [https://openzfs.github.io/openzfs-docs/Project%20and%20Community/FAQ.html#advanced-format-disks-o ashift=] on ZFS on Linux
* -o ashift= also works with both MacZFS (pool version 8) and ZFS-OSX (pool version 5000).


-o ashift= is convenient, but it is flawed in that creating pools whose top-level vdevs have multiple optimal sector sizes requires the use of multiple commands. [http://www.listbox.com/member/archive/182191/2013/07/search/YXNoaWZ0/sort/time_rev/page/2/entry/16:58/20130709002459:82E21654-E84F-11E2-A0FF-F6B47351D2F5/ A newer syntax] that will rely on the actual sector sizes has been discussed as a cross-platform replacement and will likely be implemented in the future.


In addition, [[User:Ryao | Richard Yao]] has contributed a [https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_vdev.c#L108 database of drives known to misreport sector sizes] to the ZFS on Linux project. It is used to automatically adjust ashift without the assistance of the system administrator. This approach is unable to fully compensate for misreported sector sizes whenever drive identifiers are used ambiguously (e.g. virtual machines, iSCSI LUNs, some rare SSDs), but it does a great amount of good. The format is roughly compatible with illumos' sd.conf and it is expected that other implementations will integrate the database in future releases. Strictly speaking, this database does not belong in ZFS, but the difficulty of patching the Linux kernel (especially older ones) necessitated that this be implemented in ZFS itself for Linux. The same is true for MacZFS. However, FreeBSD and illumos are both able to implement this in the correct layer.
In addition, [[User:Ryao | Richard Yao]] has contributed a [https://github.com/openzfs/zfs/blob/master/cmd/zpool/os/linux/zpool_vdev_os.c#L98 database of drives known to misreport sector sizes] to the ZFS on Linux project. It is used to automatically adjust ashift without the assistance of the system administrator. This approach is unable to fully compensate for misreported sector sizes whenever drive identifiers are used ambiguously (e.g. virtual machines, iSCSI LUNs, some rare SSDs), but it does a great amount of good. The format is roughly compatible with illumos' sd.conf and it is expected that other implementations will integrate the database in future releases. Strictly speaking, this database does not belong in ZFS, but the difficulty of patching the Linux kernel (especially older ones) necessitated that this be implemented in ZFS itself for Linux. The same is true for MacZFS. However, FreeBSD and illumos are both able to implement this in the correct layer.


=== Compression ===
Line 63: Line 63:


Changing the recordsize on a dataset will only take effect for new files. If you change the recordsize because your application should perform better with a different one, you will need to recreate its files. A cp followed by a mv on each file is sufficient. Alternatively, send/recv should recreate the files with the correct recordsize when a full receive is done.
==== Larger record sizes ====
Record sizes of up to 16M are supported with the large_blocks pool feature, which is enabled by default on new pools on systems that support it. However, record sizes larger than 1M are disabled by default unless the zfs_max_recordsize kernel module parameter is set to allow sizes higher than 1M. Record sizes larger than 1M are not as well tested as 1M, although they should work. `zfs send` operations must specify -L to ensure that blocks larger than 128KB are sent, and the receiving pools must support the large_blocks feature.
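The constraints above can be sketched as a small Python check. This is only an illustration, not part of ZFS: the function name check_recordsize and its max_recordsize argument (standing in for the zfs_max_recordsize module parameter, assumed here to be 1M) are hypothetical.

```python
ONE_MIB = 1024 * 1024


def check_recordsize(size_bytes, max_recordsize=ONE_MIB):
    """Return a list of notes about using size_bytes as a recordsize.

    max_recordsize is a stand-in for the zfs_max_recordsize module
    parameter; its 1M default here is an assumption taken from the text.
    """
    # recordsize must be a power of two between 512 bytes and 16M.
    is_power_of_two = size_bytes > 0 and (size_bytes & (size_bytes - 1)) == 0
    if not is_power_of_two or not 512 <= size_bytes <= 16 * ONE_MIB:
        return ["invalid: must be a power of two from 512 bytes to 16M"]

    notes = []
    if size_bytes > 128 * 1024:
        notes.append("needs the large_blocks pool feature")
        notes.append("use zfs send -L so blocks larger than 128KB are sent")
    if size_bytes > max_recordsize:
        notes.append("raise zfs_max_recordsize to allow this size")
    return notes


if __name__ == "__main__":
    for size in (128 * 1024, ONE_MIB, 4 * ONE_MIB):
        print(size, check_recordsize(size))
```

A 128K record size returns no notes, 1M returns the large_blocks and send -L notes, and anything above the assumed 1M limit additionally flags zfs_max_recordsize.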


=== zvol volblocksize ===
Line 109: Line 113:


Make sure that you create your pools such that the vdevs have the correct alignment shift for your storage device's sector size. If dealing with flash media, this is going to be either 12 (4K sectors) or 13 (8K sectors). For SSD ephemeral storage on Amazon EC2, the proper setting is 12.
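The alignment shift is the base-2 logarithm of the sector size, which is why 4K sectors map to 12 and 8K sectors map to 13. A minimal Python sketch of that arithmetic follows; the helper name ashift_for_sector_size is hypothetical and not part of any ZFS tool.

```python
def ashift_for_sector_size(sector_bytes):
    """Return the ashift (log2 of the sector size) for a sector size in bytes."""
    # Sector sizes are powers of two; reject anything else.
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    # For a power of two, bit_length() - 1 is exactly log2.
    return sector_bytes.bit_length() - 1


if __name__ == "__main__":
    for size in (512, 4096, 8192):
        print(size, "->", ashift_for_sector_size(size))
```

The resulting value is what would be passed as -o ashift= when the pool is created.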
=== Atime Updates ===
Set either relatime=on or atime=off to minimize IOs used to update access time stamps. For backward compatibility with the small percentage of software that relies on access times, relatime is preferred when available and should be set on your entire pool. atime=off should be used more selectively.


=== Free Space ===
Line 118: Line 126:
=== LZ4 compression ===


Set compression=lz4 on your pools' root datasets so that all datasets inherit it unless you have a reason not to enable it. Userland tests of LZ4 compression of incompressible data in a single thread have shown that it can process 10GB/sec, so it is unlikely to be a bottleneck even on incompressible data. The reduction in IO from LZ4 will typically be a performance win.
Set compression=lz4 on your pools' root datasets so that all datasets inherit it unless you have a reason not to enable it. Userland tests of LZ4 compression of incompressible data in a single thread have shown that it can process 10GB/sec, so it is unlikely to be a bottleneck even on incompressible data. Furthermore, incompressible data will be stored without compression, so reads of it are not subject to decompression even with compression enabled. Writes are so fast that incompressible data is unlikely to see a performance penalty from the use of LZ4 compression. The reduction in IO from LZ4 will typically be a performance win.
 
Note that larger record sizes will increase compression ratios on compressible data by allowing compression algorithms to process more data at a time.
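The effect of record size on compression ratio can be illustrated with a rough Python sketch. It compresses the same data in independent chunks of different sizes, the way each record is compressed on its own. Python's standard library has no LZ4 binding, so zlib is swapped in here purely as a stand-in; the sample data and the function name compressed_size are made up for the illustration, and real ratios will differ.

```python
import zlib


def compressed_size(data, record_size):
    """Total size after compressing data in independent record_size chunks."""
    total = 0
    for offset in range(0, len(data), record_size):
        # Each chunk is compressed on its own, like a separate record.
        total += len(zlib.compress(data[offset:offset + record_size]))
    return total


# Hypothetical compressible sample: repetitive log-like lines.
sample = b"".join(
    b"row %d: status=ok value=%d\n" % (i, i * 7 % 1000) for i in range(50000)
)

if __name__ == "__main__":
    for record_size in (4 * 1024, 128 * 1024, 1024 * 1024):
        size = compressed_size(sample, record_size)
        print(record_size, "ratio %.2f" % (len(sample) / size))
```

Larger chunks give the compressor more context per call and fewer per-chunk overheads, which is the mechanism the note above describes.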
 
=== NVMe low level formatting ===
 
See [[Hardware#NVMe_low_level_formatting]].


=== Pool Geometry ===
Line 160: Line 174:


== Database workloads ==
Setting redundant_metadata=mostly can increase IOPS by at least a few percentage points by eliminating redundant metadata at the lowest level of the indirect block tree. This comes with the caveat that data loss will occur if a metadata block pointing to data blocks is corrupted and there are no duplicate copies, but this is generally not a problem in production on mirrored or raidz vdevs.


=== MySQL ===
Line 174: Line 190:


Make separate datasets for PostgreSQL's data and WAL. Set recordsize=8K on both to avoid expensive partial record writes. Set logbias=throughput on PostgreSQL's data to avoid writing twice.
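To see why matching recordsize to the 8K page size matters, the following Python sketch estimates how many bytes of records must be rewritten when one page is updated. It is a simplified model that assumes aligned writes and ignores compression and caching; the function names are hypothetical, and 128K is used as the comparison point because it is the default recordsize.

```python
def bytes_rewritten(write_bytes, recordsize):
    """Bytes of whole records rewritten for one aligned write of write_bytes."""
    # A partial update still rewrites every record it touches in full.
    records = -(-write_bytes // recordsize)  # ceiling division
    return records * recordsize


def write_amplification(page_bytes, recordsize):
    """Ratio of bytes rewritten to bytes the application asked to write."""
    return bytes_rewritten(page_bytes, recordsize) / page_bytes


if __name__ == "__main__":
    page = 8 * 1024  # PostgreSQL page size from the text
    for recordsize in (8 * 1024, 128 * 1024):
        print(recordsize, write_amplification(page, recordsize))
```

Under these assumptions an 8K page update on a 128K record rewrites sixteen times as much data as on an 8K record, in addition to the read needed to fill in the rest of the record.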
== File servers ==
Create a dedicated dataset for files being served.
See [[Performance_tuning#Sequential_workloads]] for configuration recommendations.
== Sequential workloads ==
Set recordsize=1M on datasets that are subject to sequential workloads. See [[Performance_tuning#Larger_record_sizes]] for caveats to understand before setting 1M record sizes.
Set compression=lz4 as per the general recommendation for [[Performance_tuning#LZ4_compression|LZ4 compression]].
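As a hedged sketch, the two settings above can be turned into zfs set command lines with a few lines of Python. The script only builds and prints the commands rather than running them; the dataset name tank/media and the helper zfs_set_commands are hypothetical.

```python
import shlex

# Properties recommended above for sequential workloads.
SEQUENTIAL_PROPERTIES = {"recordsize": "1M", "compression": "lz4"}


def zfs_set_commands(dataset, properties):
    """Build one 'zfs set property=value dataset' command line per property."""
    return [
        shlex.join(["zfs", "set", f"{name}={value}", dataset])
        for name, value in properties.items()
    ]


if __name__ == "__main__":
    # "tank/media" is a placeholder dataset name; review before running.
    for command in zfs_set_commands("tank/media", SEQUENTIAL_PROPERTIES):
        print(command)
```

Printing the commands first leaves room to review them before applying anything to a real pool.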
== Video games directories ==
Create a dedicated dataset, use chown to make it user accessible (or create a directory under it and use chown on that) and then configure the game store application to place games there. Specific information on how to configure various clients is below.
See [[Performance_tuning#Sequential_workloads]] for configuration recommendations before installing games.
=== Lutris ===
Open the context menu by left clicking on the triple bar icon in the upper right. Go to "Preferences" and then the "System options" tab. Change the default installation directory and click save.
=== Steam ===
Go to "Settings" -> "Downloads" -> "Steam Library Folders" and use "Add Library Folder" to set the directory for Steam to use to store games. Make sure to set it as the default by right clicking on it and clicking "Make Default Folder" before closing the dialogue.
Also, in addition to following the general recommendations for video game directories, set redundant_metadata=mostly for slight space savings that leave a little more storage for games. Even without redundant storage, you can typically restore both game saves and game data from the Steam Cloud when something goes wrong. This makes full metadata redundancy unnecessary.


== Virtual machines ==