Agile Developer, Berlin, Germany

13.01.2009

Build your own Drobo-Replacement based on ZFS

Filed under: os — pegolon @ 14:34

The Drobo hype

I saw the first Drobo presentation video on YouTube almost two years ago. Since then I have been longing to get one, but 490,- € for an empty box without a single hard drive was too much in my opinion.

As a regular podcast listener I heard about the appeal of the Drobo everywhere: MacBreak Weekly mentions it in every episode, the guys at BitsUndSo are collecting them like crazy, and finally Tim Pritlove got one too. But then he mentioned his strange difficulties with the Drobo on his podcast Mobile Macs, which made me think a little more about the subject.

The Drobo has in my opinion some major drawbacks:

  • it knows nothing about the data it hosts or the filesystem in use, so when it rebuilds it has to copy every part of a hard drive, including the unused parts that contain only noise
  • one has to set an upper limit for the hosted filesystem, because the Drobo presents itself to the host machine as one physical drive of a given size
  • you cannot use all the disk space if you use drives of different sizes
  • it is limited to a maximum of 4 drives
  • if your Drobo fails you cannot access your data

We can do better!

So what would I want from my dream backup device?

  • it should not have an upper storage limit
  • it should be able to heal itself silently
  • it should be network accessible (especially for Macs)
  • the drives should also work when connected to other hardware
  • it should be usable as a Time Machine target
  • it HAS to be cheaper than a Drobo

It doesn’t have to be beautiful or silent, since I want to put it in my storage room and want to forget about it.

Initial thoughts

In my opinion the most modern and future-proof filesystem at the moment is ZFS. Ever since listening to CRE049, Tim Pritlove’s very good German-language Chaos Radio Express episode about ZFS, I have wanted to use it. Unfortunately Apple is very slow with its ZFS plans: Leopard includes ZFS, but only read-only, and the plans for Snow Leopard are very vague. So Mac OS X is not an option at the moment. Since I want it to be cheap, Mac hardware is not an option either.

FreeBSD seems to have a recent version of ZFS, but I gave OpenSolaris a try: since ZFS is developed by Sun, I expect Solaris to be the first OS where new ZFS features appear. Bleeding edge is always best ;-) so I looked further into this setup.

Test driving OpenSolaris

I wanted to investigate a bit more, so before using real hardware I test-drove it in virtualization software. After downloading the current ISO of OpenSolaris 2008.11 I tried to install it on my Mac with VMware Fusion, but at that time I didn’t know that Solaris and OpenSolaris can be treated as the same there, so I had difficulties setting up the VM properly.

So I tried out VirtualBox and hoped that, since it is now owned by Sun, it would work like a charm virtualizing Solaris. I set up a new VM with one boot disk and four RAID disks.

[Screenshot: virtualbox-opensolaris]

I switched on all the fancy CPU extensions. There is only one ISO image for both x86 and x64, so I turned on x64 and it automatically used the 64-bit kernel. The hardware recommendations for ZFS say that it works best on 64-bit with at least 1 GB of RAM.

Installing OpenSolaris from the Live system worked very well. I was very surprised by the polished look thanks to Gnome (although I am a KDE fanboy). ZFS is now used by default as the boot filesystem on Solaris so I had to do nothing to activate it.

When the installation was complete and the system was up and running I made a little tweak to get SSH access to the VM. Since it used NAT, I set up a port forward from port 2222 on my Mac to port 22 in the VM. I edited the XML file of my virtual machine (~/Library/VirtualBox/Machines/Open Solaris/Open Solaris.xml in my case) and inserted the following lines into the ExtraData section:

      <ExtraDataItem name="VBoxInternal/Devices/e1000/0/LUN#0/Config/ssh/HostPort" value="2222"/>
      <ExtraDataItem name="VBoxInternal/Devices/e1000/0/LUN#0/Config/ssh/GuestPort" value="22"/>
      <ExtraDataItem name="VBoxInternal/Devices/e1000/0/LUN#0/Config/ssh/Protocol" value="TCP"/>

After starting the VM I could connect to it with “ssh -l <user> -p 2222 localhost”.
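
The same three keys can also be set from the Mac’s command line with VBoxManage setextradata instead of editing the XML by hand. The following is only a sketch: it prints the commands rather than running them, and it assumes the VM is named “Open Solaris” and uses the e1000 network adapter, as in the XML lines above. The function name portforward_cmds is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the VBoxManage calls that set the same ExtraData keys
# as the XML lines above. Nothing is executed here; the printed lines
# are meant to be pasted into a terminal.
# Assumptions: VM name "Open Solaris", e1000 adapter, rule name "ssh".

portforward_cmds() {
    # $1 = VM name, $2 = host port, $3 = guest port
    key="VBoxInternal/Devices/e1000/0/LUN#0/Config/ssh"
    printf 'VBoxManage setextradata "%s" "%s/HostPort" %s\n' "$1" "$key" "$2"
    printf 'VBoxManage setextradata "%s" "%s/GuestPort" %s\n' "$1" "$key" "$3"
    printf 'VBoxManage setextradata "%s" "%s/Protocol" TCP\n' "$1" "$key"
}

portforward_cmds "Open Solaris" 2222 22
```

As far as I know the keys are read when the VM starts, so the VM should be powered off while they are changed.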

I worked with a Linux system for several years and after that with Mac OS X. I had no problems adapting to the Solaris world, since they took many products from the open source world, like bash, and integrated them. So this system seems very easy to learn.

To get some info about the system I entered the following commands:

Get info about the kernel mode

$ isainfo -kv
64-bit amd64 kernel modules

Listing all system devices

$ prtconf -pv
System Configuration:  Sun Microsystems  i86pc
Memory size: 1061 Megabytes
System Peripherals (PROM Nodes):

Node 0x000001
    bios-boot-device:  '80'
...

Show the system log

$ cat /var/adm/messages

Setting up the storage pool

Since I want to have one huge storage pool which can grow over time, I used RAIDZ.

The following commands were entered as root.

First, list all connected storage devices:

# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c3d0 <DEFAULT cyl 2044 alt 2 hd 128 sec 32>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
       1. c5t0d0 <ATA-VBOX HARDDISK-1.0-8.00GB>
          /pci@0,0/pci8086,2829@d/disk@0,0
       2. c5t1d0 <ATA-VBOX HARDDISK-1.0-8.00GB>
          /pci@0,0/pci8086,2829@d/disk@1,0
       3. c5t2d0 <ATA-VBOX HARDDISK-1.0-8.00GB>
          /pci@0,0/pci8086,2829@d/disk@2,0
       4. c5t3d0 <ATA-VBOX HARDDISK-1.0-8.00GB>
          /pci@0,0/pci8086,2829@d/disk@3,0
Specify disk (enter its number): ^C

The ids at the start of each entry (c5t0d0 to c5t3d0 for the four data disks) are the device ids we need to set up the storage pool. To create a pool named “tank” I entered:

# zpool create -f tank raidz c5t0d0 c5t1d0 c5t2d0 c5t3d0

To show the available pools type:

# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool  3,97G  2,90G  1,07G    73%  ONLINE  -
tank   31,8G   379K  31,7G     0%  ONLINE  -

rpool is the pool on the boot device. Note that zpool list reports the raw capacity: four 8 GB drives give almost 32 GB, parity included. Everything stored on the pool is written redundantly, so it uses about a third more raw space (one parity block for every three data blocks) to keep the data safe when one device fails.
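
The difference between raw and usable space can be checked with a little shell arithmetic. This is a rough sketch that assumes a plain RAIDZ1 where one disk’s worth of space goes to parity and ignores metadata overhead; the function names are made up for illustration.

```shell
#!/bin/sh
# RAIDZ1 keeps one disk's worth of parity, so usable space is (n - 1) disks.

raidz1_usable() {
    # $1 = number of disks, $2 = size of each disk
    echo $(( ($1 - 1) * $2 ))
}

raidz1_raw() {
    # $1 = number of disks, $2 = amount of data stored
    # Raw space consumed including parity, rounded down.
    echo $(( $2 * $1 / ($1 - 1) ))
}

echo "usable: $(raidz1_usable 4 8) GB"    # 4 disks of 8 GB each
echo "raw:    $(raidz1_raw 4 681) MB"     # a 681 MB file on 4 disks
```

This prints 24 GB usable and 908 MB raw. The first number is roughly what zfs list reports as available on the pool, while zpool list shows the raw 31,8G.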

Now I created a filesystem on that pool:

# zfs create tank/home

It is mounted automatically at /tank/home. To list all ZFS filesystems, enter:

# zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
rpool                   3,44G   482M    72K  /rpool
rpool/ROOT              2,71G   482M    18K  legacy
rpool/ROOT/opensolaris  2,71G   482M  2,56G  /
rpool/export             746M   482M    21K  /export
rpool/export/home        746M   482M    19K  /export/home
rpool/export/home/mk     746M   482M  45,4M  /export/home/mk
tank                     682M  22,7G  26,9K  /tank
tank/home                681M  22,7G  29,9K  /tank/home

In this example I copied the OpenSolaris ISO image to my new filesystem. It occupies 681M. On the pool it occupies 911M.

# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool  3,97G  3,44G   545M    86%  ONLINE  -
tank   31,8G   911M  30,9G     2%  ONLINE  -

A very nice feature of ZFS is built-in compression. Filesystem properties are inherited, so if you set compression on tank/home and create a new filesystem inside it, the new one is compressed automatically:

# zfs set compression=on tank/home
# zfs get compression tank/home
NAME       PROPERTY     VALUE      SOURCE
tank/home  compression  on         local
# zfs create tank/home/mk
# zfs get compression tank/home/mk
NAME          PROPERTY     VALUE         SOURCE
tank/home/mk  compression  on            inherited from tank/home

Health insurance

ZFS ensures data validity with internal checksums, so it can detect on the fly whether data is still valid and reconstruct it if necessary.

To get the status of a pool, enter:

# zpool status -v tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0

errors: No known data errors

A scrub is a filesystem check which should be run on a weekly basis with consumer-quality drives. It can be triggered by

# zpool scrub tank

and after some time check the result with

# zpool status -v tank
  pool: tank
 state: ONLINE
 scrub: scrub completed after 0h2m with 0 errors on Tue Jan 13 11:36:38 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0

errors: No known data errors
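
Since I want to forget about the box, the check of the status output could be scripted. The following is a minimal sketch, not a tested monitoring solution: pool_ok is a made-up helper that reads zpool status output on standard input and only looks at the “state” and “errors” lines shown above.

```shell
#!/bin/sh
# pool_ok: read `zpool status` output on stdin; succeed only if the pool
# is ONLINE and reports no known data errors.
pool_ok() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q '^ *state: ONLINE$' &&
        printf '%s\n' "$status" | grep -q '^errors: No known data errors$'
}

# Demo with a shortened copy of the output shown above.
sample=' pool: tank
 state: ONLINE
 scrub: scrub completed after 0h2m with 0 errors on Tue Jan 13 11:36:38 2009
errors: No known data errors'

if printf '%s\n' "$sample" | pool_ok; then
    echo "tank looks healthy"
else
    echo "tank needs attention"
fi
```

On the real system the input would come from the live command, for example zpool status tank | pool_ok, run from cron some time after the weekly zpool scrub tank has finished.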

Gnome Time Machine

When logging into a Gnome session and browsing through the menus I noticed the program “Time Slider Setup”.

[Screenshot: time-slider]

When you activate it, it will create ZFS snapshots on a regular basis. These snapshots take up almost no disk space at first, since only blocks that change afterwards need extra room, and you can travel back in time (not as fancy as with Time Machine, but who cares) with the Gnome file browser Nautilus.

[Screenshot: time-slider-in-action]

That is a killer feature, and if I were still a Linux/Java developer, this would be a good reason for me to switch to OpenSolaris. You don’t have to use a RAID with ZFS to make snapshots, so your filesystem has a built-in time-based versioning system. *cool*
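
Snapshots can also be taken by hand, without Time Slider. Here is a small sketch using the tank/home filesystem from above. By default it only prints the commands, so it can be tried anywhere; the run helper and the “manual-” naming scheme are my own assumptions, not something ZFS prescribes.

```shell
#!/bin/sh
# Sketch: take and list a manual ZFS snapshot. By default the commands are
# only printed; set DRY_RUN=0 on a machine with ZFS to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

snapshot_name() {
    # $1 = filesystem; the "manual-" prefix is just an example naming scheme
    echo "$1@manual-$(date +%Y-%m-%d)"
}

run zfs snapshot "$(snapshot_name tank/home)"
run zfs list -t snapshot
```

The files of a snapshot are then reachable read-only below the hidden .zfs/snapshot directory of the filesystem, here /tank/home/.zfs/snapshot.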

Network accessibility

My next step was to set up Netatalk on Solaris using this guide. At first I had some difficulties and had to install gmake and gcc. The described patches didn’t work correctly because the file /usr/ucbinclude/sys/file.h was missing, so I changed the #if statement from

#if defined( sun ) && defined( __svr4__ )
#include </usr/ucbinclude/sys/file.h>
#else /* sun __svr4__ */

to

#if defined( sun ) && !defined( __svr4__ )
#include </usr/ucbinclude/sys/file.h>
#else /* sun __svr4__ */

in all source files where it occurred:

  • etc/atalkd/main.c
  • etc/cnid_dbd/dbif.c
  • etc/papd/main.c
  • etc/papd/lp.c
  • sys/solaris/tpi.c

This needs further work before it runs with Time Machine.

Getting real

So everything looks promising at first glance. The next logical step would be to buy the hardware components and try everything out. I configured bare systems with midi towers using AMD or Intel processors for approximately 170,- to 190,- €. That’s less than half the price of a Drobo. For each SATA hard drive I would buy a hot-pluggable case. Another option would be to use external USB drives, but that might lead to a quite cluttered and fragile construction.

I need to investigate further to use the correct components. Motherboards tested with OpenSolaris are listed here: http://www.sun.com/bigadmin/hcl/

ZFS has a hotplug feature, so if a device fails it can be replaced without rebooting and without typing in any commands. If that does not work, the replacement can also be done with a few commands on the command line.
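
A manual replacement could look roughly like the following sketch. It only prints the commands unless DRY_RUN=0 is set. The failed disk c5t1d0 is just an example taken from the pool above, and c5t4d0 is a hypothetical id for the new drive; the real ids would come from format.

```shell
#!/bin/sh
# Sketch: replace a failed disk in the pool by hand. By default the commands
# are only printed; set DRY_RUN=0 on the real machine to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

pool=tank
failed=c5t1d0    # example: the disk reported as faulted
spare=c5t4d0     # hypothetical id of the newly attached disk

run zpool status -x                            # show only pools with problems
run zpool replace "$pool" "$failed" "$spare"   # start rebuilding onto the new disk
run zpool status -v "$pool"                    # check the rebuild progress
```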

Next steps

I really need to invest more time with the VM, try to corrupt some disk images and see how ZFS reacts to that. Expanding existing pools with drives of different sizes will also be interesting.

Of course I need to get Netatalk working and try to use it as a Time Machine target. I could use the VM on a different host to simulate the final system.

Stay tuned for my future investigations and please don’t start trolling about the superiority of the Drobo. I think that is a matter of taste.

20 Comments »

  1. Just FYI–Netatalk is available for Drobo and DroboShare which can make it a Time Machine network target. See: http://code.google.com/p/backmyfruitup/ and http://www.drobo.com/droboapps

    Also, don’t forget to factor in the value of your time in trying to create your Drobo replacement :-) Seriously! :-)

    Comment by Tom — 13.01.2009 @ 18:43 | Reply

    • Wow, even the Drobo people are reading my unworthy thoughts.

      I am aware of DroboShare and AFP support, but that adds more to the high price tag.

      This project will take a considerable amount of my time, but I see it as a possibility to learn something new.

      Comment by pegolon — 20.01.2009 @ 13:17 | Reply

  2. Very nice.
    I recently did the same but on a real machine. I wasn’t that comfortable with the terminal, but I got used to it and it works very well. I use SMB and NFS (which are both built in) to share the files to my Macs. SMB seems to have a problem with umlauts but maybe not generally. I’m investigating that :P
    Otherwise it works great. All the little stuff can be remote-configured via SSH, if you want the GUI you can use X to be your X-server for OpenSolaris. Very nice and way faster than VNC (which also works out of the box).
    The flexibility of ZFS is also great. I use 3 1TB-drives in a raidZ and 2 500GB-drives in a mirror put together in one zpool. Further drives can easily be added (but only as whole groups such as another raidZ or mirror, not as a single drive to an existing raidZ) or the drives can be replaced with bigger ones.

    Comment by Georg — 14.01.2009 @ 12:39 | Reply

  3. Couple of comments:

    - OS2008.11 includes native CIFS support, so it may be easiest to have ZFS share your mounts out via CIFS, the Mac is happy to work with them and there is little if any penalty to performance (if you can bear CIFS on the network!), any Windows clients will be covered as well then
    - ZFS snapshots can be made visible in the filesystem through a .zfs directory, hence your clients can see snapshots and recover files manually from the snaps
    - Time Machine will be quite happy to run against a sparsebundle file created in the filesystem (basically a thin provisioned disk image), once it knows where it is, TM will take care of remounting the FS to gain access, back up and unmount automatically … search Google for discussions around TimeTamer and Drobo and also search out TimeMachineEditor which tweaks the plist pref for backup interval
    - ZFS performance is very dependent on heavy ARC utilisation (in memory cache), so factor more memory in to any purchases as a first attempt at improving perf.
    - OS2008.11 has a storage-nas package profile, which minimises an install whilst enabling all the bits you’d like for a NAS (see http://www.opensolaris.com/learn/features/whats-new/200811/, section 3.1.2)

    HTH, my Drobo was a short term solution pending a ZFS life solution … so this might be a path well travelled!

    Craig

    Comment by CraigM — 15.01.2009 @ 0:30 | Reply

  4. My main attraction point in Drobo is the possibility to put in any size drive and the whole raid will adapt to it. How can something similar be simulated with ZFS? Pooling together different pools? RAIDing pools? Never touched ZFS so far, there’s GOT to be a wizard out there who could come up with an idea? How do you people get around the fact that raidZ cannot be grown? How do you combine your raids and pools so you can add disks to your NAS? Thanks for your brilliant insight!

    Comment by simon — 25.01.2009 @ 18:05 | Reply

  5. hmmm – attempted post “discarded” try again with a quick post.

    Comment by joseph — 23.02.2009 @ 20:39 | Reply

  6. ok, that worked so here’s a quick recap.

    I’ve had a drobo and have successfully upgraded to 4*1TB, however I’m less excited about their now-closed forums & support.

    I’ve also had zfs on solaris, opensolaris, & nexenta, but now struggling with the cifs / smb disappearing share issue.

    Solaris smbd issues a message “NbtDatagramDecoder[11]: too small packet” (google for more) about every twelve minutes and eventually the share no longer works and “sharemgr show -vp” hangs.

    so, a head’s up and insight appreciated,
    joseph

    Comment by joseph — 23.02.2009 @ 20:47 | Reply

  7. Just a naive question: in your setup what if your hard disk named c3d0 in pool ‘rpool’ were to go down? Will you be able to use data from the pool named ‘tank’?

    Cheers

    Comment by ducky — 25.02.2009 @ 16:08 | Reply

    • Of course the rpool also has to be mirrored at least to keep the system up and running. But you could also attach the drives to another system and access the zpool from there.

      Comment by pegolon — 25.02.2009 @ 17:03 | Reply

  8. the second option definitely looks worth investigating on my part. Thanks!

    Comment by ducky — 25.02.2009 @ 18:03 | Reply

  9. [...] Build your own Drobo-Replacement based on ZFS (tags: sysadmin zfs solaris backup) [...]

    Pingback by links for 2009-03-09 « Bloggitation — 09.03.2009 @ 8:02 | Reply

  10. I’ve lusted after the Drobo since learning about them last year, but I’ve been disappointed with the lack of performance (we bought one for the office to mess around with; copying to/from it while someone else is accessing a database file leaves much to be desired).

    As a result, I’ve been in the market to build a NAS. This opens up an avenue of opportunity as long as I can get my raid cards to work with opensolaris!

    Now, if I could just grow my array the way the drobo does, I’d be a very happy camper!

    Comment by hamn1egg — 27.03.2009 @ 16:49 | Reply

  11. Actually, if you are using ZFS then you should avoid any hardware RAID cards. They get in the way of ZFS, so it cannot do all its fancy error checking and error repair. The HW RAID interferes. The recommendation is to NOT use HW RAID. Sell the card.

    Comment by Kebabbert — 11.04.2009 @ 16:24 | Reply

  12. [...] Build your own Drobo-Replacement based on ZFS « Agile Developer, Berlin, Germany interesting usage of ZFS to duplicate features in some consumer storage/backups (tags: zfs storage drobo backup) [...]

    Pingback by tecosystems » links for 2009-05-06 — 07.05.2009 @ 21:03 | Reply

  13. So, how is the machine working?
    Got all the pluses of the drobo and none of the con’s?

    Comment by Omer — 27.05.2009 @ 2:16 | Reply

    • Hi, great article. Some little update showing how the whole thing is performing now will be nice! Thanks for the post!

      Comment by Saulo — 22.09.2009 @ 20:49 | Reply

  14. Omer > Well, Drobo gives you support, a good compact design and “just plug it in and it works”, which you won’t find here. Most retail products have the same “pros” that DIY solutions don’t have.

    Good idea, I’m considering building a server/nas of my own, already got all the hardware … now playing with ESXi :)

    Comment by tryon — 15.07.2009 @ 15:03 | Reply

  15. I too am on a similar adventure of using an advanced file system (i.e. ZFS, btrfs, etc.) to make my own hyper-drobo with a less than hyper price tag. As a suggestion, have you looked at nexenta or freenas as a stand-alone all-in-one solution? I understand they both support ZFS and will support btrfs when finalized in the linux kernel. Hope this discussion thread is still alive.

    Comment by lazarus — 25.09.2009 @ 8:16 | Reply

    • Hi Lazarus,
      actually those stand-alone solutions did not offer enough space for 6 hard drives or they did not seem to have the right drivers to run OpenSolaris on them. At the moment I am quite happy with my choice.
      Cheers,
      Markus

      Comment by pegolon — 27.09.2009 @ 7:03 | Reply

  16. Hi,

    I’m also building a new NAS at the moment, using my hardware which I bought two years ago and used with Debian. Now I replaced the four hard drives with six bigger disks and switched to OpenSolaris.

    I also have a Drobo. There are some more pro’s and con’s:
    + power consumption of the Drobo should be lower than my pc hardware
    + management and status (!) software
    + DroboPro may be used with 8 hard disks and single or double redundancy
    - just a storage box
    - no power switch
    - no information about S.M.A.R.T.

    Also I’m looking for a USB LCD device as a status display. Do you know if there exists a small Windows app for retrieving status information?

    Best regards and thank you for the nice article,

    Nils.

    Comment by Nils — 30.10.2009 @ 0:07 | Reply

