Difference between revisions of "Projects/ZFS Channel Programs"

The nvlist object created from this source code would look something like this:


[[File:Example_ZCP_nvlist.png|center|600px]]
 
This structure is built from nested nvpairs of data type DATA_TYPE_NVLIST. The outermost nvlist is analogous to main() in C or Java: it contains the full channel program, and execution starts at the first object in its data list and ends at the last. The blue nvpair directly inside the outermost nvlist represents the iterate snapshots operation, zcp_iterate_snapshots. It contains three arguments, shown in green:
# “dataset”, the ZFS dataset whose snapshots we are iterating over
# “iterator”, the name of the ZCP variable to be added to the current scope which will store the full name of the current snapshot
# “body”, another nvlist containing nvpairs which describe the operations to be performed on each snapshot of the dataset
The body of zcp_iterate_snapshots contains a single ZCP operation in yellow, zcp_destroy_snapshot. zcp_destroy_snapshot takes a single argument “snap” in red, the name of the snapshot to destroy. In this case, the name of that snapshot is resolved using a zcp_resolve operation with data=“snap”, which searches the current scope for the lexically closest variable named “snap”. For this example, that variable was created by the “iterator” argument to zcp_iterate_snapshots. Constructing even this simple ZCP program by hand with the existing nvlist libraries would be onerous, so libraries that help automate the construction of channel programs should be developed early to support development and testing.
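As a rough illustration of what such a helper library could look like, the sketch below models an nvlist as an ordered Python list of (name, value) pairs, where a value is either a string (standing in for DATA_TYPE_STRING) or a nested list (standing in for DATA_TYPE_NVLIST). The helper function names and the dataset name pool/fs are hypothetical, and the layout is inferred from the description above rather than taken from an existing ZCP library.

```python
# Model (an assumption for illustration): an nvlist is an ordered list of
# (name, value) nvpairs; a value is a str or a nested nvlist.

def resolve(var_name):
    """nvlist holding one zcp_resolve nvpair that looks up var_name in scope."""
    return [("zcp_resolve", var_name)]

def destroy_snapshot(snap):
    """nvpair for zcp_destroy_snapshot; snap must evaluate to a string."""
    return ("zcp_destroy_snapshot", [("snap", snap)])

def iterate_snapshots(dataset, iterator, body):
    """nvpair for zcp_iterate_snapshots with its three arguments."""
    return ("zcp_iterate_snapshots", [
        ("dataset", dataset),
        ("iterator", iterator),
        ("body", body),
    ])

# The example program: destroy every snapshot of a (hypothetical) dataset.
program = [
    iterate_snapshots("pool/fs", "snap", [destroy_snapshot(resolve("snap"))]),
]
```

With helpers like these, the nesting of the program mirrors the nesting of the nvlist, which is the property a real construction library would most likely want to preserve.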
 
Once this complete channel program object (i.e. the nvlist) is passed to the kernel, it becomes the responsibility of the ZCP interpreter to check the correctness of the program in open context, in terms of both syntactic structure and filesystem permissions. The syntactic checks will be novel work, but the permissions checks can be handled by new versions of the existing check functions for DSL sync tasks. If these checks pass in open context, the channel program will be queued for execution in the next syncing context, just as DSL sync tasks are.
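The syntactic half of that check could be sketched as a recursive walk over the program. The table of operations, the argument kinds, and the static scope check below are assumptions made for illustration; permission checks are left out, and the nvlist is modelled as a Python list of (name, value) pairs rather than a real kernel nvlist.

```python
# Hypothetical table: operation name -> {argument name: expected kind}.
ZCP_OPS = {
    "zcp_iterate_snapshots": {"dataset": "string", "iterator": "name", "body": "body"},
    "zcp_destroy_snapshot": {"snap": "string"},
}

def check_string_arg(value, scope):
    """A string argument is a literal or a zcp_resolve of an in-scope variable."""
    if isinstance(value, str):
        return []
    if (isinstance(value, list) and len(value) == 1
            and value[0][0] == "zcp_resolve" and isinstance(value[0][1], str)):
        var = value[0][1]
        return [] if var in scope else [f"unresolved variable '{var}'"]
    return ["expected a string or a zcp_resolve nvlist"]

def check_program(nvlist, scope=frozenset()):
    """Return a list of error strings; an empty list means the check passed."""
    if not isinstance(nvlist, list):
        return ["expected an nvlist"]
    errors = []
    for name, args in nvlist:
        spec = ZCP_OPS.get(name)
        if spec is None:
            errors.append(f"unknown operation '{name}'")
            continue
        if not isinstance(args, list):
            errors.append(f"'{name}' must have data type DATA_TYPE_NVLIST")
            continue
        found = dict(args)
        if set(found) != set(spec):
            errors.append(f"'{name}' expects arguments {sorted(spec)}")
            continue
        inner = scope
        for arg, kind in spec.items():
            if kind == "string":
                errors += check_string_arg(found[arg], scope)
            elif kind == "name":
                if isinstance(found[arg], str):
                    inner = scope | {found[arg]}  # variable visible in the body
                else:
                    errors.append(f"'{arg}' of '{name}' must be a string")
        for arg, kind in spec.items():
            if kind == "body":
                errors += check_program(found[arg], inner)
    return errors

def example(var):
    """The example program, resolving the given variable name in its body."""
    body = [("zcp_destroy_snapshot", [("snap", [("zcp_resolve", var)])])]
    return [("zcp_iterate_snapshots",
             [("dataset", "pool/fs"), ("iterator", "snap"), ("body", body)])]

good_errors = check_program(example("snap"))
bad_errors = check_program(example("other"))
```

Here the example program passes, while a variant whose body resolves a variable that no enclosing iterator created is rejected before anything is queued for syncing context.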
 
Once all checks have passed again in syncing context, the channel program can actually be executed. This would be performed by a depth-first traversal of the ZCP program tree, with a call to a function analogous to the current DSL sync task sync function for each operation (e.g. snapshot, destroy, create) as well as custom interpreter code to implement different ZCP control statements (e.g. iterate_snapshots, iterate_children, if_equals). For the example ZCP program described by the nvlist above, a sample execution could be:
 
# Starting at the first item of the nvlist, the ZCP interpreter reads an nvpair with name=“zcp_iterate_snapshots” and recognizes it as a reserved control statement keyword. The check functions will already have ensured that the data type of this nvpair is DATA_TYPE_NVLIST, and that the contained nvlist has “dataset”, “iterator”, and “body” nvpairs. The string value associated with “dataset” is retrieved and used to get a list of snapshots. The interpreter now iterates through these snapshots, at each iteration creating a variable in the local ZCP scope with the name stored in the “iterator” nvpair.
# The ZCP interpreter then retrieves the nvlist stored in the “body” nvpair and begins executing it. This should use the exact same code as Step 1, the only difference being that we now have a “snap” variable added to our local scope.
# The interpreter retrieves the first nvpair in the “body” nvlist, and identifies it using its name: “zcp_destroy_snapshot”. As with any ZCP operation that takes arguments, its data type is DATA_TYPE_NVLIST and so ZCP retrieves the nvlist stored with the “zcp_destroy_snapshot” nvpair.
# In the case of “zcp_destroy_snapshot”, we only have a single argument named “snap”, which must evaluate to a string. In this case, the “snap” argument maps to an nvlist containing a single nvpair which represents a “zcp_resolve” operation. The interpreter looks up the variable name provided to this zcp_resolve (“snap”) in the current scope and returns the string value that was stored there by zcp_iterate_snapshots. After evaluating this zcp_resolve, we return the discovered value to “zcp_destroy_snapshot”, which checks that the returned value has type DATA_TYPE_STRING and then performs the actual destroy.
# Steps 2, 3, and 4 repeat for each snapshot in the list created by zcp_iterate_snapshots in Step 1.
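The steps above can be sketched as a small recursive interpreter. This is only a model under stated assumptions: the real interpreter would run in the kernel in syncing context, whereas here the pool is a plain Python dict mapping dataset names to lists of full snapshot names, the dataset and snapshot names are made up, and scopes are represented with collections.ChainMap so that a lookup finds the innermost variable first.

```python
from collections import ChainMap

def evaluate(value, scope):
    """Evaluate an argument to a string: a literal or a zcp_resolve lookup."""
    if isinstance(value, str):
        return value
    [(op, var_name)] = value            # nvlist holding exactly one nvpair
    if op != "zcp_resolve":
        raise ValueError(f"cannot evaluate '{op}'")
    result = scope[var_name]            # ChainMap searches the innermost scope first
    if not isinstance(result, str):
        raise TypeError("expected a DATA_TYPE_STRING value")
    return result

def execute(nvlist, pool, scope):
    """Run each nvpair of the nvlist in order, recursing into bodies."""
    for name, args in nvlist:
        args = dict(args)
        if name == "zcp_iterate_snapshots":
            dataset = evaluate(args["dataset"], scope)
            # Iterate over a copy so destroys do not disturb the iteration.
            for snap in list(pool[dataset]):
                inner = scope.new_child({args["iterator"]: snap})
                execute(args["body"], pool, inner)
        elif name == "zcp_destroy_snapshot":
            full_name = evaluate(args["snap"], scope)
            dataset = full_name.partition("@")[0]
            pool[dataset].remove(full_name)
        else:
            raise ValueError(f"unknown operation '{name}'")

# Hypothetical in-memory pool standing in for real ZFS state.
pool = {
    "pool/fs": ["pool/fs@s1", "pool/fs@s2", "pool/fs@s3"],
    "pool/other": ["pool/other@keep"],
}
program = [("zcp_iterate_snapshots", [
    ("dataset", "pool/fs"),
    ("iterator", "snap"),
    ("body", [("zcp_destroy_snapshot", [("snap", [("zcp_resolve", "snap")])])]),
])]
execute(program, pool, ChainMap())
```

After the call, the model pool has no snapshots left under pool/fs, and the other dataset is untouched. Control statements and operations share one execute function, which matches the note in Step 2 that the body should be run by the same code as the top-level program.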
 
The key subtlety is that expressing multiple ZFS operations as a single entity, the channel program, lets us guarantee the desired semantics of the whole operation: given a dataset with snapshots S1, S2, …, SN, this program terminates only when all of those snapshots have been destroyed, and it guarantees that the dataset has no other snapshots. Channel programs would also make it quick to extend this program, for instance by adding an operation that creates a fresh snapshot immediately after all of the destroys have completed. This kind of rapid prototyping and deployment of complex ZFS functionality with strong consistency guarantees is missing today and would be a valuable addition.