Brendan Gregg made an unusual discovery: shouting at a hard drive produces spikes of latency. Will we see soundproof data centers now? Another point for solid-state drives.
I wonder if playing loud music near my computer makes IO slower.
In this demo Roman Strobl shows the new functionality available in OpenSolaris 2008.11. Time Slider gives users instant rollback and point-in-time snapshots. This is made possible by ZFS snapshots, which can now be used easily through Time Slider in the GNOME interface.
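Under the hood these are ordinary ZFS snapshots. As a rough sketch of what Time Slider automates (the dataset name rpool/export/home/neto is just an example):

```
# take a point-in-time snapshot of a home dataset
zfs snapshot rpool/export/home/neto@before-cleanup

# list the existing snapshots
zfs list -t snapshot

# roll the dataset back to that snapshot
zfs rollback rpool/export/home/neto@before-cleanup
```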
More about the new functionality in OpenSolaris 2008.11 can be seen in the live demo “What’s new in OpenSolaris 2008.11”, also with Roman.
Take a look at this interesting post by Don MacAskill, CEO of SmugMug, about his experiences running OpenSolaris servers with ZFS and MySQL.
SmugMug is a digital photo sharing website focused on professional photographers. The site is also known for storing huge amounts of data (photos) using both local filesystems and the Amazon S3 service. On S3 alone there are hundreds of terabytes of data (according to this and this).
A very interesting post, because he describes his experiences in detail from several different points of view.
This week I gave another presentation outside my city. This time it was in Maracanaú at Comsolid, an open source and digital inclusion event.
My first presentation was about the ZFS filesystem and the benefits you can get from it, such as pooled storage and self-healing. As a basis for the examples I used my last post on it, Trying to corrupt data in a ZFS mirror.
My next talk was about OpenSolaris. There were a lot of questions and interest in it. We burned some CDs with OpenSolaris 2008.05 and also handed out other Solaris-family releases, such as Solaris 10.
And my last presentation was a quick talk about high performance computing, a short version of one I had already given before.
It was an interesting event, mainly because the audience was composed primarily of young students with little IT background. It was a challenge to present new concepts like pooled storage to people who aren’t familiar with filesystem management. I tried to keep my talks as simple as I could, focusing on everyday problems and showing that you can avoid them with some open source technologies.
The full album is available at http://flickr.com/photos/silveiraneto/sets/72157605632001295/.
Illustrative image :P
This is the first of a series of posts I’d like to write while I study more about OpenSolaris. The idea is to create simple posts showing a specific feature through practical examples that you can reproduce on your own computer.
One of the most interesting features in OpenSolaris is the 128-bit filesystem ZFS.
For those who are starting with ZFS, the main difference is the abstraction used for volumes. Unlike traditional file systems, which reside on single devices and thus require a volume manager to use more than one device, ZFS filesystems are built on top of virtual storage pools called zpools. A zpool is constructed of virtual devices (vdevs), which are themselves constructed of block devices: files, hard drive partitions, or entire drives (the recommended usage).
In this first experiment we will build a mirrored zpool (RAID-1), then try to corrupt its data and see what happens. In a mirrored pool the data is replicated across several disks, which eliminates the single point of failure: if one disk dies, the data is not lost. You can create a mirror with two or more disks, and a pool can contain many mirrors. For example, a 100GB pool could be made of two 50GB mirrors, each mirror built from two 50GB disks (a two-way mirror’s usable size is that of a single disk). You scale your pool according to your needs and resources.
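As a sketch of that layout, assuming four spare 50GB disks with the hypothetical names c6d0, c6d1, c7d0 and c7d1:

```
# a pool built from two two-way mirrors; usable size is the sum of the two mirrors
zpool create bigpool mirror c6d0 c6d1 mirror c7d0 c7d1
```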
The data-corruption part makes this experiment a little dangerous. You have these options:
- Install OpenSolaris on your disk and have at least two more disks to build the mirrored zpool. I don’t recommend this option, because if you don’t know exactly what you are doing you can lose important data by using the wrong volumes.
- Install OpenSolaris in a virtual machine and create fake volumes for the experiment. If you make a mistake, nothing too bad will happen. That’s the option I’m using here: VirtualBox with OpenSolaris 2008.05. VirtualBox is a free virtual machine that is easy to use and works well with OpenSolaris.
Although there is already a graphical tool for managing ZFS, it is not available in OpenSolaris 2008.05. Besides, for those studying ZFS a bit more deeply, knowing how to manage it with the command line tools is worthwhile.
With OpenSolaris booted, open a terminal and log in as root. List your available devices with echo | format.
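On my virtual machine the output looked something like this (an illustrative transcript only; your controller numbers, disk geometry and device paths will certainly differ):

```
# echo | format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c3d0 <DEFAULT cyl 1945 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
       1. c4d1 <DEFAULT cyl 9726 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@1,0
       2. c5d1 <DEFAULT cyl 7294 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1,1/ide@1/cmdk@1,0
Specify disk (enter its number):
```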
If you are used to Linux, the OpenSolaris naming scheme for devices may look strange. I recommend taking a look at this document.
To create a mirrored pool with the devices c4d1 (80GB) and c5d1 (60GB), just type zpool create ourpool mirror c4d1 c5d1.
Explaining this command word by word:
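In shell-comment form, annotating the same command from above:

```
# zpool      -> the ZFS pool administration tool
# create     -> create a new storage pool
# ourpool    -> the name we chose for the pool
# mirror     -> arrange the devices that follow as a mirror (RAID-1)
# c4d1 c5d1  -> the two disks that form the mirror
zpool create ourpool mirror c4d1 c5d1
```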
Diagram of ourpool. Icons from Everaldo Coelho.
If the command works, it runs silently and returns nothing. To check the pool’s status, run zpool status ourpool.
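On my machine the status looked roughly like this (an illustrative transcript; details such as the scrub line may differ between releases):

```
# zpool status ourpool
  pool: ourpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ourpool     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4d1    ONLINE       0     0     0
            c5d1    ONLINE       0     0     0

errors: No known data errors
```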
This output shows that a pool called ourpool is ONLINE and is made of a single mirror, which in turn is made of the two devices c4d1 and c5d1.
We can list all pools with zpool list.
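Roughly like this on my machine (the rpool numbers below are placeholders from my virtual machine; yours will differ):

```
# zpool list
NAME      SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
ourpool  59.5G   900K  59.5G   0%  ONLINE  -
rpool    15.9G  2.85G  13.0G  18%  ONLINE  -
```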
Ourpool is approximately 60GB in size, of which about 900KB is already used to store metadata. Since we built the mirror from a 60GB and an 80GB volume, the mirror size is determined by the smaller volume. The other pool, rpool, is the pool OpenSolaris creates by default to hold the system.
Now we’ll populate the pool with data. This could be really important data such as database files, your photo collection or personal documents. For illustration I’m using a 100MB empty file called data, created with mkfile 100m data.
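As a small sketch of that step (by default the new pool is mounted at /ourpool):

```
# by default the new pool is mounted at /ourpool
cd /ourpool
# create a 100 MB file named "data" filled with zeros
mkfile 100m data
```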
During the file creation I ran zpool iostat -v ourpool to see the I/O traffic in the pool. Note that there is traffic on both disks, as they form a mirror.
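The numbers below are only illustrative, captured mid-write on my virtual machine; the point is that both c4d1 and c5d1 show write traffic:

```
# zpool iostat -v ourpool
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
ourpool     40.1M  59.5G      0     38      0  3.50M
  mirror    40.1M  59.5G      0     38      0  3.50M
    c4d1        -      -      0     38      0  3.51M
    c5d1        -      -      0     38      0  3.51M
----------  -----  -----  -----  -----  -----  -----
```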
We’ll also create and save an MD5 checksum of data so we can check its integrity later: md5sum data > data.md5. To verify that the checksum matches, we run md5sum --check data.md5.
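Put together, and with the result we expect while the data is intact:

```
# md5sum data > data.md5
# md5sum --check data.md5
data: OK
```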
Now comes the critical part of this simulation. We will simulate a physical defect on the disk. Storage devices fail at some point, but we don’t know when; when it happens it can corrupt your data or stop important applications.
Let’s take 20MB of garbage from /dev/urandom and throw it onto the disk c4d1: dd if=/dev/urandom of=/dev/dsk/c4d1 bs=1024 count=20480. There are more fun (and expensive) ways to cause physical defects in a disk; take a look at this video where they use ZFS and hammers. :)
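The command again, annotated (be very sure of= points at the sacrificial disk; on a real machine this destroys data):

```
# if=    read random bytes from /dev/urandom
# of=    write them straight over the raw disk that backs one side of the mirror
# 20480 blocks of 1 KB = 20 MB of garbage
dd if=/dev/urandom of=/dev/dsk/c4d1 bs=1024 count=20480
```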
There, the damage is done. Let’s look at the pool status: zpool status ourpool.
We see no errors yet, but ZFS makes heavy use of memory caching. Let’s force this cache to be flushed by disabling and re-enabling the pool: first cd / to make sure we are not inside the pool, then zpool export ourpool followed by zpool import ourpool.
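The sequence, for reference:

```
# step out of the pool's mount point, then take the pool offline and bring it back
cd /
zpool export ourpool
zpool import ourpool
```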
Checking its status again: zpool status ourpool.
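After the import, the status on my machine looked roughly like this (an illustrative transcript; the exact message text and the checksum error count will vary):

```
# zpool status ourpool
  pool: ourpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ourpool     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4d1    ONLINE       0     0    23
            c5d1    ONLINE       0     0     0

errors: No known data errors
```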
The pool remains ONLINE, but ZFS has noticed that something is wrong.
Let’s check the data integrity: md5sum --check data.md5.
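And, despite the garbage we wrote onto c4d1:

```
# md5sum --check data.md5
data: OK
```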
This is one of the self-healing characteristics of ZFS. The corruption that occurred on one volume was silently repaired. With a traditional volume manager you would not only have lost your data, you might not even know that corruption had occurred.
At this point the system administrator should be warned and take some action on the defective disk. Here is some advice:
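A minimal sketch of typical follow-up commands (c6d1 below is a hypothetical replacement disk; device names follow the example above):

```
# re-read every block and repair anything that fails its checksum
zpool scrub ourpool

# watch the error counters while deciding what to do
zpool status -v ourpool

# if the disk really is dying, swap it for a healthy one
zpool replace ourpool c4d1 c6d1

# once you trust the device again, reset the error counters
zpool clear ourpool
```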
I also recorded a screencast that summarizes the entire process:
This post is an English translation of this post.