Delphix Brainstorming
[[File:zfs_brainstorming_sep18_2013.jpg|center|500px]]
Revision as of 16:38, 23 September 2013
September 18, 2013
Leading up to Delphix's semi-annual Engineering Kick Off (EKO), there was a ZFS brainstorming meeting held at Delphix HQ. Below are ideas that came from that meeting, ranging from ideas that are immediately actionable to more long-term and strategic thoughts.
Bold indicates high-priority items. (?) indicates more investigation needed before defining a project.
* ZFS self-tuning
* Estimate performance, consumption after destroys
** Predict performance improvement of freeing up space
* Ping-pong write, re-use blocks for the same file as it gets re-written
* Investigate effectiveness of pre-fetch (?)
** lock contention
* Pure Storage collaboration
* '''Performance tests with compression'''
** '''Using compression histograms gathered from customers'''
* '''Missing metrics on ZFS performance'''
* '''DMU changes for async read, larger block sizes'''
* Trim (issuing trim commands to the storage)
* Write performance
** multi-block ZIL writer (for RAC) (?)
** How many metaslabs (?)
* DTrace provider
* '''Compressed ARC - ''George Wilson'''''
* '''ARC observability'''
** ARC sizing stats
*** Hit rate, theoretical optimal hit rate based on ghost lists, projection of hit rate given more memory
* '''Channel programs'''
* LUN removal
* Cross-pool cloning / distributed DSL
** Shadow replication, shadow blocks
** Streaming replication, send out blocks in syncing context
** Lightweight replication, remove some responsibility from app stack
* one pool/different vdev “classes”?
* '''Resumable send - ''Max Grossman'''''
* Compressed send (?)
* Data masking / block differencing (?)
** Efficiently store transformed data using a bit function from original data to transformed data
* '''Pool fragmentation analytics'''
** Provide feedback on when to add storage
** Provide “% fragmented” metric
* Data rebalancing/redistribution/defrag/placement
** Do we care about defrag on SSD?
* Testing
** '''Adding coverage for new features'''
** '''Realistic compression ratios in testing'''
** Automated tests for every change
** Better userland test coverage (full stack from IOCTL down)
** Better performance tests
* Scrub should be better (some kind of SLA)
** Some kind of guarantee on data corruption, how quickly it will be caught?
** Better error reporting
** zfs scrub (on specific filesystems)
** LBA ordered traversals
** pause/resume scrub (already have stop/restart)
* Fix broken blkptrs (automated?)
** Include estimates on reconstruction time
* '''Dataset property that sets owner of all contained files in constant time'''
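
As a rough illustration of the ARC sizing stats item (hit rate, and a projection of hit rate given more memory based on ghost lists), here is a minimal sketch. It is not taken from the meeting notes. The `ArcCounters` type and its field names are hypothetical, not real kstat names, and the projection assumes that every ghost-list hit is counted as a miss and would have been a cache hit had the cache been large enough to retain that block.

```python
from dataclasses import dataclass


@dataclass
class ArcCounters:
    """Hypothetical counter snapshot; field names are illustrative only."""
    hits: int        # reads served from the cache
    misses: int      # reads that missed the cache (assumed to include ghost hits)
    ghost_hits: int  # misses whose block was still remembered on a ghost list


def hit_rate(c: ArcCounters) -> float:
    """Observed hit rate: hits divided by all accesses."""
    total = c.hits + c.misses
    return c.hits / total if total else 0.0


def projected_hit_rate(c: ArcCounters) -> float:
    """Estimated hit rate if every ghost-listed block had stayed cached.

    Assumption: each ghost hit is a miss that extra memory would have
    turned into a hit, so it moves from the miss column to the hit column.
    """
    total = c.hits + c.misses
    return (c.hits + c.ghost_hits) / total if total else 0.0


sample = ArcCounters(hits=900, misses=100, ghost_hits=40)
print(f"observed:  {hit_rate(sample):.3f}")
print(f"projected: {projected_hit_rate(sample):.3f}")
```

The gap between the two numbers gives a crude indication of how much additional memory might help. A real projection would also need to account for how much extra memory the ghost lists actually represent.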