Monday, June 23, 2014

Provisioning Aggregates

In the previous posts we installed our ONTAP simulator and applied some basic configuration settings. We also connected to the cluster via System Manager. From within System Manager we can log in to the cluster, expand "Storage" in the left pane, and take a look at our disks.
It looks like the vsim comes with 2 "shelves" of 14 disks each. You'll also notice the first 3 disks are already assigned to an aggregate, aggr0, which is node miamicl-01's root aggregate. That leaves 11 spares to build another aggregate with. Below the 11 spares you'll notice another "shelf" of disks with a state of "present". These disks haven't been assigned to a node yet, so let's assign them to miamicl-01. Disk ownership can't be assigned from System Manager; it has to be done from the command line. We're just going to assign all of the unowned disks to miamicl-01, but you can also specify individual disks, all unowned disks, or all disks on a specific adapter (a couple of alternative forms are sketched after the output below). See the example:
miamicl::> storage disk assign -disk miamicl-01:v4.* -owner miamicl-01

miamicl::> storage disk show
                     Usable           Container
Disk                   Size Shelf Bay Type        Position   Aggregate Owner
---------------- ---------- ----- --- ----------- ---------- --------- --------
miamicl-01:v4.16     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.17     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.18     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.19     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.20     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.21     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.22     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.24     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.25     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.26     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.27     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.28     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.29     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v4.32     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.16     1020MB     -   - aggregate   dparity    aggr0     miamicl-01
miamicl-01:v5.17     1020MB     -   - aggregate   parity     aggr0     miamicl-01
miamicl-01:v5.18     1020MB     -   - aggregate   data       aggr0     miamicl-01
miamicl-01:v5.19     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.20     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.21     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.22     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.24     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.25     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.26     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.27     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.28     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.29     1020MB     -   - spare       present    -         miamicl-01
miamicl-01:v5.32     1020MB     -   - spare       present    -         miamicl-01
28 entries were displayed.
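A couple of alternative forms of the assign command, in case a wildcard on the adapter isn't what you want. These are from memory, so it's worth checking "storage disk assign ?" on your version before relying on them.

To assign a single disk:

miamicl::> storage disk assign -disk miamicl-01:v5.19 -owner miamicl-01

To sweep up every unowned disk at once (-all is a boolean and -node names the new owner, if memory serves):

miamicl::> storage disk assign -all true -node miamicl-01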


Now we have assigned all the previously unowned disks to node miamicl-01. If we refresh the Disks page in System Manager, we see that every disk is assigned and has been made a spare.
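If you'd rather confirm this from the shell, you can filter the disk listing by owner and container type. This is a hedged sketch; the field names are from memory, so check "storage disk show ?" if your release complains about them:

miamicl::> storage disk show -owner miamicl-01 -container-type spare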


We are ready to carve up an aggregate. How you want to do this really depends on what you're planning to do. Do you want one large aggregate or multiple smaller ones? For my lab it doesn't really matter; it gets torn down and rebuilt every few months anyway. But for this post I will make two, one via System Manager and one via the command line.

From System Manager, expand "Storage" and click on "Aggregates", then click "Create".
This brings up the Create Aggregate Wizard.
Click "Next".
Name your aggregate (it's nice to use a meaningful name instead of the auto-generated one).
Specify the RAID type; RAID-DP is the default and what we'll use in this post.
Click "Next".
Click the "Select Disks" button.


The wizard automatically determines the minimum number of hot spares (in this case 1 disk) and leaves the remaining spares in a group (in this case 24). We select the group. If we had spares of another disk type, there would be a second group in this list.

In this post I'm creating two 12-disk aggregates, so I will drop the number of capacity disks to use from 24 down to 12, leaving another 12 spares to build a second aggregate from via the CLI.
Then I hit "Save and Close".
Now I can change the size of my RAID groups.
In this example I have a 12-disk aggregate. I could put all the disks in one RAID group, but I'm going to split this one into two RAID groups; I'll try to remember to illustrate why for you later. So I change my RAID group size to 6, hit "Save and Close", then hit "Create". (A rough CLI equivalent of what the wizard just did is sketched below.)
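For reference, here is roughly what that wizard is doing under the covers. This is a hedged sketch rather than the exact command System Manager issues; the aggregate name and node are just the ones from my lab, and "storage aggregate create ?" will show the exact parameters on your release:

miamicl::> storage aggregate create -aggregate aggr1_2rg -node miamicl-01 -diskcount 12 -raidtype raid_dp -maxraidsize 6

The -maxraidsize 6 piece is what splits the 12 disks into two 6-disk RAID groups. We'll run a simpler create for the second aggregate next.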

Now let's do it from the command line.

miamicl::*> storage aggregate create -aggregate aggr2 -diskcount 12
[Job 16] Job succeeded: DONE. Warning: Creation of aggregate "aggr2" has been initiated.  12 disks need to be zeroed before they can be added to the aggregate.  The process has been initiated.  Once zeroing completes on these disks, all disks will be added at once.  Note that if the system reboots before the disk zeroing is complete, the aggregate will not exist.

miamicl::*> storage aggregate show
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr0        900MB   43.43MB   95% online       1 miamicl-01       raid_dp,
                                                                   normal
aggr1_2rg   7.03GB    7.03GB    0% online       0 miamicl-01       raid_dp,
                                                                   normal
aggr2           0B        0B    0% creating     0 miamicl-01       raid_dp,
                                                                   initializing
3 entries were displayed.

miamicl::*>

And storage aggregate show now displays aggr0, aggr1_2rg (created from System Manager), and aggr2, which we just made in the CLI and which is still initializing. Time to grab a beer while the disks zero and aggregate creation completes.
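If you'd rather watch the progress than wait it out, you can check on the creation job and the aggregate state. A hedged sketch (the job ID comes from the create output above; verify the exact field names on your release):

miamicl::> job show -id 16
miamicl::> storage aggregate show -aggregate aggr2 -fields state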

OK, the aggrs are created, and I wanted to show you why I made an aggr with 2 RAID groups. Data ONTAP uses RAID-DP, so there are actually 2 parity disks in each RAID group. This allows each RAID group to suffer a double disk failure and remain online. Not that we're too concerned with data protection or losing disks to parity in this post, but I think it's worth demonstrating.

So above, when we created aggr1_2rg using System Manager, we made it out of two six-disk RAID groups. I unowned my one spare and disabled disk autoassign to help demonstrate this (a sketch of those commands follows the output below). Check out how many disks have to be failed to bring the aggregate offline:

Aggregate aggr1_2rg (online, raid_dp, degraded) (block checksums)
  Plex /aggr1_2rg/plex0 (online, normal, active)
    RAID group /aggr1_2rg/plex0/rg0 (double degraded, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   v4.22   v4    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      parity    FAILED          N/A                        1020/ -
      data      v4.24   v4    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v4.25   v4    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v4.29   v4    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      FAILED          N/A                        1020/ -

    RAID group /aggr1_2rg/plex0/rg1 (double degraded, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   FAILED          N/A                        1020/ -
      parity    v5.29   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      FAILED          N/A                        1020/ -
      data      v5.32   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v5.27   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v4.32   v4    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
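For the curious, this is roughly how I set that up. These commands are a hedged sketch from memory, the disk names are just placeholders from my sim, and storage disk fail may want advanced privilege on your release, so check the man pages before pulling disks in anything that matters:

miamicl::> storage disk option modify -node miamicl-01 -autoassign off
miamicl::> storage disk removeowner -disk miamicl-01:v5.19
miamicl::> storage disk fail -disk miamicl-01:v4.26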


So after failing out 4 disks in aggr1_2rg, the aggregate is still online. Aggr2, which only has one RAID group, would only be able to survive half that many disk failures. Out of my 12 1GB disks, aggr1_2rg, with its two RAID groups, has 7GB of usable space, while aggr2, with the same 12 1GB disks lumped into one RAID group, provides an additional 1.7GB of usable space (8.7GB total). There is a pretty good document on the NetApp Support Site that does a way better job of explaining RAID-DP and RAID groups than I ever could. Check it out here.
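The back-of-the-napkin math, assuming RAID-DP's two parity disks per RAID group and roughly 0.87GB of usable space per data disk once right-sizing and reserves take their cut (the exact per-disk figure will vary by release):

One 12-disk RAID group:  12 - 2 parity = 10 data disks, 10 x ~0.87GB = ~8.7GB usable
Two 6-disk RAID groups:  2 x (6 - 2 parity) = 8 data disks, 8 x ~0.87GB = ~7.0GB usable

So splitting into two RAID groups costs about two disks' worth of capacity (~1.7GB here) in exchange for being able to lose two more disks before the aggregate goes offline.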

Upcoming articles will go over provisioning and testing a vserver, or storage virtual machine, for CIFS.
