Scrub/Resilver Performance

From OpenZFS
Revision as of 18:34, 11 November 2016 by Mahrens (talk | contribs)

Saso Kiselkov of Nexenta gave a talk on Scrub/Resilver Performance at the OpenZFS Developer Summit 2016:

Video, Slides

Since its inception, ZFS has included a facility both to preemptively check the correctness of stored data and to recover from drive failures. This mechanism, however, has always been fairly inefficient, and with the growth of hard drive capacities it is common to see resilver operations take days to weeks. The inefficiency stems primarily from the design of the resilver algorithm, which simply traverses the filesystem tree structures rather than taking into account that disks strongly prefer sequential access in disk block order.

This talk discusses some forthcoming work for OpenZFS, where we have achieved performance improvements on resilver by implementing an intelligent block-sorting pre-scanner that runs ahead of the regular resilver repair code. This allows us to reorder I/O so as to achieve near-sequential resilver throughput even with limited memory usage.
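
The general idea of sorting blocks before issuing I/O can be illustrated with a minimal Python sketch. This is not the OpenZFS implementation: the names `Extent`, `sorted_batches`, `coalesce`, and `max_pending` are hypothetical, and the simple "fill a bounded buffer, sort it, flush it" policy is an assumption chosen only to show how a memory cap and offset ordering can coexist.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical representation of a block on disk: (offset, length) in bytes.
Extent = Tuple[int, int]


def coalesce(extents: List[Extent]) -> List[Extent]:
    """Merge extents that are directly adjacent on disk.

    The input must already be sorted by offset.
    """
    merged: List[Extent] = []
    for offset, length in extents:
        if merged and merged[-1][0] + merged[-1][1] == offset:
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


def sorted_batches(
    discovered: Iterable[Extent], max_pending: int
) -> Iterator[List[Extent]]:
    """Yield batches of extents in ascending disk-offset order.

    `discovered` stands in for blocks found in tree-traversal order.
    At most `max_pending` extents are buffered at once, which bounds memory.
    """
    if max_pending < 1:
        raise ValueError("max_pending must be at least 1")
    pending: List[Extent] = []
    for extent in discovered:
        pending.append(extent)
        if len(pending) >= max_pending:
            # Buffer is full: sort by offset and hand the batch to the I/O stage.
            yield coalesce(sorted(pending))
            pending = []
    if pending:
        yield coalesce(sorted(pending))


if __name__ == "__main__":
    # Blocks in the scattered order a tree traversal might discover them.
    found = [(4096, 512), (0, 512), (512, 512), (8192, 1024), (1024, 512)]
    for batch in sorted_batches(found, max_pending=4):
        print(batch)
```

In this sketch a larger `max_pending` yields longer sequential runs per batch at the cost of more memory, which mirrors the trade-off described above between throughput and memory usage.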