Projects

== Active projects ==


== Notes from meetings ==


=== Brainstorm, 18th September 2013 ===
 
[[Delphix_Brainstorming | Notes from the meeting that preceded Delphix's semi-annual Engineering Kick Off]] (EKO)
* immediately pursuable ideas, plus long-term and strategic thoughts.


== Inter-platform coordination ideas ==


Ideas for projects that would help coordinate changes between platforms …
 
=== Mechanism for pulling changes from one place to another ===
 
Make it easier to build, test, code review, and integrate ZFS changes into illumos.
 
=== Cross-platform test suite ===
 
One sourcebase, rather than porting STF to every platform?
 
Maybe integrate [https://github.com/behlendorf/xfstests XFS Test Suite].


=== Userland ZFS ===
 
We already have ztest / libzpool and want to:
* expand this to also be able to test more of zfs in userland
* be able to run /sbin/zfs, /sbin/zpool against userland implementation
* be able to run most of testrunner (and/or STF) test suite against userland implementation
 
=== ZFS (ZPL) version feature flags ===
 
Import ZFS on Linux sa=xattr into illumos.
 
=== /dev/zfs ioctl interface versioning ===
 
Ensure that future additions/changes to the interface maintain maximum compatibility with userland tools.
 
Enable FreeBSD Linux jails / illumos lx brandz to use ZFS on Linux utilities.
 
=== Port ZPIOS from ZFS on Linux to illumos ===
 
[http://zfsonlinux.org/example-zpios.html ZPIOS example]
 
This would require a rewrite to not use Linux interfaces.
 
=== Virtual machine images with OpenZFS ===
 
To easily try OpenZFS on a choice of distributions within a virtual machine:
* images could be built for running on public clouds
* images for installing to real hardware.
 
[[Talk:Project_Ideas#Virtual_machine_images | Discuss]] …


== General feature ideas ==
 
=== ZFS channel programs ===
 
Possible Channel Programs:
* Recursive rollback (revert to a snapshot on a dataset and all its children; needs a new command-line flag, since -r is already taken)
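The intended semantics of a recursive rollback can be sketched in plain Python (a toy model of the traversal only, not the channel-program API; the `children_of` map and `rollback_one` callback are hypothetical names):

```python
def recursive_rollback(dataset, children_of, rollback_one):
    """Revert `dataset` and every descendant, deepest children first.

    `children_of` maps a dataset name to its child datasets;
    `rollback_one` performs the per-dataset rollback (in real ZFS, one
    ioctl per dataset). Run as a channel program, the whole traversal
    could execute in the kernel as a single atomic operation instead of
    many separate, racy userland ioctls.
    """
    for child in children_of.get(dataset, []):
        recursive_rollback(child, children_of, rollback_one)
    rollback_one(dataset)

# Toy hierarchy: tank has children tank/a and tank/b; tank/a has tank/a/x.
tree = {"tank": ["tank/a", "tank/b"], "tank/a": ["tank/a/x"]}
order = []
recursive_rollback("tank", tree, order.append)
# order is ["tank/a/x", "tank/a", "tank/b", "tank"]: children before parents.
```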
 
=== Device removal ===
 
Based on indirect vdevs, rather than bprewrite.
 
=== Reflink support ===
 
[http://lwn.net/Articles/331808/ The two sides of reflink() <nowiki>[LWN.net]</nowiki>]
 
=== Unified ashift handling ===
 
[http://www.listbox.com/member/archive/182191/2013/07/search/YXNoaWZ0/sort/subj/page/3/entry/7:58/20130703201427:AEA03DD0-E43E-11E2-A883-F4AAC72FE4D2/ <nowiki>[illumos-zfs]</nowiki> Specifying ashift when creating vdevs] (2013-07-03)
 
=== RAID-Z hybrid allocator ===
 
Preferably compatible with pool version 29 for Solaris 10u11 compatibility.
 
=== Replace larger ZIO caches with explicit pages ===
 
Subproject: document useful kernel interfaces for page manipulation on various platforms
 
=== Improved SPA namespace collision management ===
 
Needed mostly by virtual machine hosts. Work in progress in Gentoo.
 
Temporary pool names in zpool import  
* [http://www.listbox.com/member/archive/182191/2013/07/search/YXNoaWZ0/sort/subj/page/3/entry/6:58/20130701131204:56A77554-E271-11E2-8F75-EDC51164E148/ <nowiki>[illumos-zfs]</nowiki> RFC: zpool import -t for temporary pool names] (2013-07-01)
 
Temporary pool names in zpool create.
 
=== Deduplication improvements ===
 
Potential algorithms:
 
* [http://www.listbox.com/member/archive/182191/2013/02/search/Ymxvb20gZmlsdGVycw/sort/time_rev/page/1/entry/8:16/20130212183221:70E13332-756C-11E2-996D-F0C715E11FC0/ Bloom filter].
* [https://www.usenix.org/system/files/nsdip13-paper6.pdf Cuckoo Filter].
 
Convert synchronous writes to asynchronous writes when an ARC miss occurs during a lookup against the DDT.
 
Use dedicated kmem_cache for deduplication table entries:
* easy to implement
* will reduce DDT entries from 512 bytes to 320 bytes.
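To illustrate the Bloom-filter idea above: a write whose checksum is definitely absent from the filter cannot be a duplicate, so the expensive on-disk DDT lookup can be skipped. A minimal pure-Python sketch (illustrative only, not ZFS code; the sizes and key names are arbitrary):

```python
import hashlib

class BloomFilter:
    """Answers "definitely absent" or "maybe present"; never a false negative."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive k bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

ddt_filter = BloomFilter()
ddt_filter.add(b"checksum-of-existing-block")
if not ddt_filter.might_contain(b"checksum-of-new-block"):
    pass  # definitely not a duplicate: skip the DDT lookup entirely
```

A Cuckoo filter provides the same "definitely absent" guarantee while also supporting deletion, which matters when DDT entries are freed.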
 
=== ZFS Compression / Dedup to favour provider ===
 
Currently, if a customer has 100 MB of quota available and uploads 50 MB of data that compresses/dedups down to 25 MB, the customer's quota is reduced by only 25 MB: the saving favours the customer. As a provider, it is desirable to be able to reverse this logic, so that the customer's quota is reduced by the full 50 MB and the 25 MB saved by compression/dedup is to the provider's benefit. This is similar to how Google/Amazon/Cloud-Feature.acme already handle it: you get 2 GB of quota, and any compression saving is to Google's benefit.
 
* property(?) to charge quota usage by before-compression-dedup size.
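The accounting difference can be sketched in a few lines of Python (illustrative only; the `charge_logical` switch stands in for the hypothetical property):

```python
def quota_charge(logical_bytes, physical_bytes, charge_logical=False):
    """Bytes counted against the customer's quota for one write.

    charge_logical=False models today's behaviour: the post-
    compression/dedup (physical) size is charged, so savings favour the
    customer. charge_logical=True models the proposed property: the
    pre-compression (logical) size is charged, so savings accrue to the
    provider.
    """
    return logical_bytes if charge_logical else physical_bytes

MB = 1_000_000
# Customer uploads 50 MB that stores as 25 MB after compression/dedup:
today = quota_charge(50 * MB, 25 * MB)                          # charges 25 MB
proposed = quota_charge(50 * MB, 25 * MB, charge_logical=True)  # charges 50 MB
```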
 
=== Periodic Data Validation ===
 
Problem: ZFS does a great job detecting data errors due to lost writes, media errors, and storage bugs, but only when the user actually accesses the data. Scrub in its current form can take a very long time and can have highly deleterious impacts on overall performance.
 
Data validation in ZFS should be specified according to data or business needs. Kicking off a scrub every day, week, or month doesn’t directly express that need. More likely, the user wants to express their requirements like this:
* “Check all old data at least once per month”
* “Make sure all new writes are verified within 1 day”
* “Don’t consume more than 50% of my IOPS capacity”
 
 
Note that constraints like these may overlap, but that's fine: the user just needs to indicate priorities, and the system must alert the user of violations.
 
I suggest a new type of scrub. Constraints should be expressed and persisted with the pool. Execution of the scrub should tie into the ZFS IO scheduler. That subsystem is ideally situated to identify a relatively idle system. Further, we should order scrub IOs to be minimally impactful. That may mean having a small queue of outstanding scrub IOs that we’d send to the device, or it might mean that we try to organize large, dense contiguous scrub reads by sorting by LBA.
 
Further, after writing data to disk, there’s a window for repair while the data is still in the ARC. If ZFS could read that data back, then it could not only detect the failure, but correct it even in a system without redundant on-disk data.
 
- ahl
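The arithmetic behind such constraints is simple enough to sketch (illustrative Python; all numbers are hypothetical): the deadline fixes a minimum average verify rate, which the IO scheduler could then check against the user's IOPS ceiling.

```python
def required_scrub_rate(used_bytes, deadline_seconds):
    """Minimum average verify rate (bytes/s) to re-check all data in time."""
    return used_bytes / deadline_seconds

def fits_iops_budget(rate_bytes_per_s, io_size_bytes, device_iops, max_fraction):
    """True if the required rate stays within the user's IOPS ceiling."""
    return rate_bytes_per_s / io_size_bytes <= device_iops * max_fraction

TiB = 2 ** 40
MONTH = 30 * 24 * 3600
# "Check all old data at least once per month" on a 10 TiB pool needs
# only ~4 MiB/s on average:
rate = required_scrub_rate(10 * TiB, MONTH)
# "Don't consume more than 50% of my IOPS capacity", with 128 KiB scrub
# reads against a device good for 500 IOPS:
ok = fits_iops_budget(rate, 128 * 1024, device_iops=500, max_fraction=0.5)
```

Large, dense scrub reads (as suggested above) show up here directly: bigger `io_size_bytes` means fewer IOPS for the same verify rate.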
 
=== Real-time replication ===
https://www.illumos.org/issues/7166


== Lustre feature ideas ==
The [http://www.lustre.org/ Lustre project] supports the use of ZFS as an Object Storage Target. They maintain their [http://wiki.lustre.org/index.php/Architecture_-_ZFS_for_Lustre#ZFS_Features_Required_by_Lustre own feature request page] with ZFS project ideas. Below is a list of project ideas that are well defined, benefit Lustre and have no clear benefit outside of that context.


=== Collapsible ZAP objects ===
 
E.g. fatzap -> microzap downgrades.
 
=== Data on separate devices ===
 
[http://wiki.lustre.org/index.php/Architecture_-_ZFS_for_Lustre#Data_on_Separate_Devices Architecture - ZFS for Lustre] …
 
=== TinyZAP ===
 
[http://wiki.lustre.org/index.php/Architecture_-_ZFS_TinyZAP Architecture - ZFS TinyZAP] …
 
== Awareness-raising ideas ==
 
… awareness of the [[Main_Page | quality, utility, and availability]] of open source implementations of ZFS.
 
=== Quality ===
 
Please add or [[Talk:Projects | discuss]] your ideas.  
 
=== Utility ===
 
==== ZFS and OpenZFS in three minutes (or less) ====
 
A very short and light video/animation to grab the attention of people who don't yet realise why ZFS is an extraordinarily good thing.
 
For an entertaining example of how a little history (completely unrelated to storage) can be taught in ninety seconds, see [http://www.youtube.com/watch?v=sqohqlTnLrE&list=PL2E867DCE2D1CEF00 Hohenwerfen Fortress - The Imprisoned Prince Bishop] ([http://www.salzburg-burgen.at/en/werfen/ context]) (part of the [http://www.zonemedia.at/en/projects/ ZONE Media] portfolio).
 
A very short video for ZFS and OpenZFS might throw in all that's good, using plain English wherever possible, including:
* very close to the beginning, the word ''resilience''
* ''verifiable integrity of data'' and so on
* some basic comparisons (NTFS, HFS Plus, ReFS)
 
– with the 2010 fork in the midst but (blink and you'll miss that milestone) the lasting impression from the video is that '''ZFS is great''' ('''years ahead of the alternatives''') and OpenZFS is rapidly making it better for a broader user base.
 
Hint: there exist many ZFS-related videos but many are a tad dry, and cover a huge amount of content. Aim for two minutes :-) … [[Talk:Projects | discuss…]]
 
=== Availability ===
 
Please add or [[Talk:Projects | discuss]] your ideas.  
 
=== General ===
 
The [http://www.youtube.com/channel/UC0IK6Y4Go2KtRueHDiQcxow OpenZFS channel on YouTube], begun October 2013 – to complement the automatically generated [http://www.youtube.com/channel/HCNjOOYCUqXF8 ZFS] channel.
 
https://twitter.com/DeirdreS/status/322422786184314881 (2013-02) draws attention to ZFS-related content amongst [http://www.beginningwithi.com/2013/04/11/technical-videos/ videos listed by Deirdré Straughan].

Latest revision as of 21:35, 7 October 2020
