So today I researched ZFS pretty heavily.
ZFS is a file system developed by Sun Microsystems for Solaris, and to say the least it's pretty amazing.
It's got built-in deduplication, compression, and most of all, built-in redundancy and error correction.
Let's talk about each:
Deduplication works at the level of data blocks. The block size is user settable (I believe it's 128K by default), and dedup is based on checksum matching: if two blocks have the same checksum, only one copy is stored. (Snapshots save space in a related way, more on that below.)
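To make the checksum matching idea concrete, here's a rough sketch in Python. It is a toy model, not how ZFS actually implements dedup: the DedupStore class and its methods are names I made up, and the choice of SHA-256 as the checksum is an assumption for the example.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # 128K, the default block size mentioned above


class DedupStore:
    """Toy block store (hypothetical): identical blocks are kept once, keyed by checksum."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}    # checksum -> block bytes
        self.refcount = {}  # checksum -> how many times the block is referenced

    def write(self, data):
        """Split data into blocks and return the list of checksums that make up the 'file'."""
        pointers = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            key = hashlib.sha256(block).hexdigest()
            if key not in self.blocks:
                self.blocks[key] = block  # first time we see this block: store it
            self.refcount[key] = self.refcount.get(key, 0) + 1
            pointers.append(key)
        return pointers

    def read(self, pointers):
        """Reassemble a 'file' from its list of checksums."""
        return b"".join(self.blocks[k] for k in pointers)


# Tiny 4-byte blocks so the effect is easy to see.
store = DedupStore(block_size=4)
file_a = store.write(b"AAAABBBBAAAA")
file_b = store.write(b"BBBBCCCC")
print(len(store.blocks))  # 3 unique blocks stored for 5 blocks written
```

Five blocks get written, but only three distinct ones end up stored, and both "files" still read back intact.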
The compression features are user selectable. I would think that most implementations of this file system will be on a pretty beefy server, so the extra CPU work for compression shouldn't be a problem. There are a couple of choices, including gzip. The default is a faster algorithm (lzjb) that compresses less, but I chose gzip-9, which gives the most compression. For grins, I took a pretty random batch of documents and such (about 6 GB worth) and threw them in a raidz with dedup and gzip-9 compression. It compressed/deduped down to about 2 GB. Pretty impressive really for a file system.
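If you want a feel for the speed versus size trade-off between compression levels, here's a small Python sketch. It uses Python's zlib module as a stand-in for the gzip-1 through gzip-9 settings (both are deflate-based); the compressed_sizes helper and the sample text are made up for the example, and real-world ratios will depend entirely on your data.

```python
import zlib


def compressed_sizes(data, levels=(1, 6, 9)):
    """Return {level: compressed size in bytes} for each deflate level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


# Highly repetitive sample data, so it compresses far better than real documents would.
sample = b"ZFS compresses each block before writing it to disk. " * 2000
sizes = compressed_sizes(sample)
for level, size in sizes.items():
    print(f"level {level}: {len(sample)} -> {size} bytes")
```

Higher levels spend more CPU time looking for matches, which is the cost you pay for picking gzip-9 over a faster setting.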
Finally, it's got built-in error detection/correction and redundancy. raidz is an option for an initial storage pool. With raidz, every block is spread across the devices together with parity information, so any block can be rebuilt if one device fails or returns bad data. It gives you the kind of safety people look for in a raid1 with the storage capacity of a raid5 (without the write hole). There is also a process called scrubbing, which you run (or schedule) periodically: the file system reads every block in the pool and checks it against its stored checksum. If a block fails the check, it's rebuilt from the redundant data and a good copy is written back.
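Here's a toy Python sketch of how single parity plus checksums lets a scrub find and fix silent corruption. It is heavily simplified and only an illustration: make_stripe and scrub are hypothetical names, the per-chunk SHA-256 checksums are my assumption, and the real raidz on-disk layout and checksum placement are more involved than this.

```python
import hashlib
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(chunks):
    """Store equal-length data chunks plus one XOR parity chunk and a checksum per chunk."""
    parity = reduce(xor_bytes, chunks)
    checksums = [hashlib.sha256(c).hexdigest() for c in chunks]
    return {"data": list(chunks), "parity": parity, "checksums": checksums}


def scrub(stripe):
    """Check every chunk against its checksum; rebuild a bad chunk from the others plus parity."""
    repaired = []
    for i, chunk in enumerate(stripe["data"]):
        if hashlib.sha256(chunk).hexdigest() != stripe["checksums"][i]:
            others = [c for j, c in enumerate(stripe["data"]) if j != i]
            rebuilt = reduce(xor_bytes, others, stripe["parity"])
            # Single parity can only repair one bad chunk per stripe.
            if hashlib.sha256(rebuilt).hexdigest() != stripe["checksums"][i]:
                raise ValueError("more damage than single parity can repair")
            stripe["data"][i] = rebuilt
            repaired.append(i)
    return repaired


stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
stripe["data"][1] = b"BxBB"   # simulate silent corruption on one device
print(scrub(stripe))          # [1]
print(stripe["data"][1])      # b'BBBB'
```

The checksum is what tells you which chunk is wrong; the parity is what lets you rebuild it.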
Oh yes, and snapshotting. ZFS is copy-on-write: a changed block is never overwritten in place. The new data is written to a new spot in the file system, and the pointer for the block is updated to point to the new spot. This allows a snapshot to be taken at any point: the snapshot keeps pointing at the old blocks, which stay in their original place, while any changed data gets written elsewhere. Very cool really.
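The pointer trick is easier to see in code, so here's a minimal Python sketch of copy-on-write with snapshots. The CowVolume class is something I made up for illustration; it models the idea only and says nothing about ZFS's actual data structures.

```python
class CowVolume:
    """Toy copy-on-write volume (hypothetical): writes never overwrite existing blocks."""

    def __init__(self):
        self.storage = []    # append-only list of block contents
        self.live = {}       # block number -> index into storage
        self.snapshots = {}  # snapshot name -> frozen copy of the pointer table

    def write(self, block_no, data):
        self.storage.append(data)                    # new data goes to a new spot
        self.live[block_no] = len(self.storage) - 1  # pointer now points at the new spot

    def read(self, block_no, snapshot=None):
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.storage[table[block_no]]

    def snapshot(self, name):
        self.snapshots[name] = dict(self.live)       # copies pointers only, not data


vol = CowVolume()
vol.write(0, b"old contents")
vol.snapshot("before")
vol.write(0, b"new contents")
print(vol.read(0))                     # b'new contents'
print(vol.read(0, snapshot="before"))  # b'old contents'
```

Taking the snapshot only copies the pointer table, which is why snapshots are cheap and only start costing space as blocks change.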
Now, unfortunately, because of licensing politics, we won't see ZFS hit the mainstream on Linux for a long time. The license it's released under (the CDDL) is not compatible with the GPL that the Linux kernel is released under. Even though people have successfully ported the file system to run natively on Linux, legally it cannot be shipped as part of the kernel.
Right now on Linux it has to be run under FUSE, which stands for Filesystem in Userspace. The downside to this is that file system performance takes a minor hit. FreeNAS is one solution for using this file system natively (it's built on FreeBSD, which has a native ZFS port). I've got a server that I'm thinking about installing the newest FreeNAS 8.0 beta on. I'll give it a shot and let you all know how it goes.