Difference between revisions of "Delphix Brainstorming"

From OpenZFS
== September 18, 2013 ==


Leading up to Delphix's semi-annual Engineering Kick Off (EKO), Delphix employees held a ZFS brainstorming meeting. Below are ideas that came from that meeting, ranging from ideas that can be pursued immediately to more long-term and strategic thoughts.
 
Bold indicates high-priority items.
(?) indicates more investigation needed before defining a project.
 
* ZFS self-tuning
* Estimate performance, consumption after destroys
** Predict performance improvement of freeing up space
* Ping-pong write, re-use blocks for the same file as it gets re-written
* Investigate effectiveness of pre-fetch (?)
** Is lock contention a problem?
* Pure Storage collaboration
* '''Performance tests with compression'''
** '''Using compression histograms gathered from customers'''
* '''Missing metrics on ZFS performance'''
* '''DMU changes for async read, larger block sizes'''
* Trim (issuing trim commands to the storage)
* Write performance
** Multi-block ZIL writer (for RAC) (?)
** How many metaslabs (?)
* DTrace provider
* '''Compressed ARC - ''George Wilson'''''
* '''ARC observability'''
** ARC sizing stats
*** Hit rate, theoretical optimal hit rate based on ghost lists, projection of hit rate given more memory
* '''Channel programs'''
* LUN removal
* Cross-pool cloning / distributed DSL
** Shadow replication, shadow blocks
** Streaming replication, send out blocks in syncing context
** Lightweight replication, remove some responsibility from app stack
* One pool/different vdev “classes”?
* '''Resumable send - ''Max Grossman'''''
* Compressed send (?)
* Data masking / block differencing (?)
** Efficiently store transformed data using a bit function from original data to transformed data
* '''Pool fragmentation analytics'''
** Provide feedback on when to add storage
** Provide “% fragmented” metric
* Data rebalancing/redistribution/defrag/placement
** Do we care about defrag on SSD?
* Testing
** '''Adding coverage for new features'''
** '''Realistic compression ratios in testing'''
** Automated tests for every change
** Better userland test coverage (full stack from IOCTL down)
** Better performance tests
* Scrub should be better (some kind of SLA)
** Some kind of guarantee on how quickly data corruption will be caught?
** Better error reporting
** zfs scrub (on specific filesystems)
** LBA ordered traversals
** Pause/resume scrub (already have stop/restart)
* Fix broken blkptrs (automated?)
** Include estimates on reconstruction time
* '''Dataset property that sets owner of all contained files in constant time'''
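
The "ARC sizing stats" item above can be illustrated with a minimal sketch, which is not something the meeting produced. It assumes four counters are available as plain integers, named here after the <code>hits</code>, <code>misses</code>, <code>mru_ghost_hits</code> and <code>mfu_ghost_hits</code> fields of arcstats. The function name <code>arc_hit_rates</code> is hypothetical. A ghost-list hit is a miss on a block that was evicted recently enough to still be remembered, so counting ghost hits as hits gives a rough estimate of the hit rate a larger cache might have achieved, not a measured value.

```python
def arc_hit_rates(hits, misses, mru_ghost_hits, mfu_ghost_hits):
    """Return (observed, projected) ARC hit rates as fractions in [0, 1].

    All four arguments are cumulative counters sampled at the same moment.
    The projection is only an estimate: it treats every ghost-list hit as an
    access that a larger cache would have served from memory.
    """
    total = hits + misses
    if total == 0:
        # No accesses recorded yet, so no rate can be computed.
        return 0.0, 0.0

    # Ghost hits are a subset of misses; clamp in case of inconsistent samples.
    ghost_hits = min(mru_ghost_hits + mfu_ghost_hits, misses)

    observed = hits / total
    projected = (hits + ghost_hits) / total
    return observed, projected


if __name__ == "__main__":
    observed, projected = arc_hit_rates(800, 200, 30, 20)
    print(f"observed hit rate:  {observed:.1%}")
    print(f"projected hit rate: {projected:.1%}")
```

A large gap between the two numbers would suggest that the workload could benefit from more memory; a small gap suggests that the remaining misses are mostly for data the ARC has not seen recently.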
 
 
[[File:zfs_brainstorming_sep18_2013.jpg|center|500px]]

Latest revision as of 03:34, 7 October 2013
