Netapp Deduplication – quick setup guide

Deduplication refers to the elimination of redundant data in the storage. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. However, indexing of all data is still retained should that data ever be required. Deduplication is able to reduce the required storage capacity since only the unique data is stored. 

 

NetApp supports deduplication in which only unique blocks in the flexible volume are stored, at the cost of a small amount of additional metadata created by the deduplication process. NetApp deduplication removes duplicate 4KB blocks anywhere in the flexible volume and keeps a single unique copy.

The core enabling technology of deduplication is fingerprints. These are unique digital signatures for every 4KB data block in the flexible volume.

 

When deduplication runs for the first time on a flexible volume with existing data, it scans the blocks in the flexible volume and creates a fingerprint database, which contains a sorted list of all fingerprints for used blocks in the flexible volume. After the fingerprint file is created, the fingerprints are checked for duplicates. When duplicates are found, a byte-by-byte comparison of the blocks is done first to make sure that the blocks are indeed identical. If they are, the block’s pointer is updated to the already existing data block, the duplicate data block is released, and the inode is updated.
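The fingerprint-then-verify idea described above can be sketched in a few lines of Python. This is only an illustration, not how Data ONTAP implements it: SHA-256 is an assumed stand-in for the real fingerprint function, a dictionary replaces the sorted fingerprint database, and the `deduplicate` function name is hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # 4KB blocks, as in the description above


def deduplicate(blocks):
    """Return (unique_blocks, pointers); pointers[i] indexes into unique_blocks."""
    fingerprint_db = {}  # fingerprint -> indices of stored blocks with that fingerprint
    unique_blocks = []
    pointers = []
    for block in blocks:
        # Assumption: SHA-256 stands in for the real fingerprint function.
        fingerprint = hashlib.sha256(block).digest()
        match = None
        for idx in fingerprint_db.get(fingerprint, []):
            # Byte-by-byte comparison, so a fingerprint collision never merges
            # two blocks that are actually different.
            if unique_blocks[idx] == block:
                match = idx
                break
        if match is None:
            # First time this content is seen: store it and record its fingerprint.
            unique_blocks.append(block)
            match = len(unique_blocks) - 1
            fingerprint_db.setdefault(fingerprint, []).append(match)
        # Duplicate or not, the logical block now points at the stored copy.
        pointers.append(match)
    return unique_blocks, pointers


if __name__ == "__main__":
    data = [b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"A" * BLOCK_SIZE]
    unique, ptrs = deduplicate(data)
    print(len(unique), ptrs)  # prints: 2 [0, 1, 0]
```

Three logical blocks end up as two stored blocks, and the third pointer is redirected to the first copy, which mirrors the "pointer is updated, duplicate is released" step.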

 

 

Netapp Deduplication commands:

  1. Enable dedup (asis) license.

    fractal-design> license add <a_sis-license-code>

  2. If you have a new flex volume which was just created, follow this step to enable ASIS deduplication

    fractal-design> sis on /vol/demovol
    Deduplication for "/vol/demovol" is enabled.
    Already existing data could be processed by running "sis start -s /vol/demovol"

  3. If the flex volume already contains data, follow this step to process the existing data.

    fractal-design> sis start -s /vol/demovol
  4. Checking the status of deduplication.

    fractal-design> vol status demovol
    Volume          State   Status          Options
    VolArchive      online  raid_dp, flex   nosnap=on
                            sis
    Containing aggregate: 'aggr0'
    fractal-design>

    fractal-design> sis status /vol/demovol
    Path            State   Status      Progress
    /vol/demovol    Enabled Idle        Idle for 00:02:12
    fractal-design>

  5. Check the storage space saved due to deduplication

    fractal-design> df -s /vol/demovol
    Filesystem      used    saved   %saved
    /vol/demovol/   9316052 0       0%
    fractal-design>

  6. To run deduplication on this volume again later, just do a “sis start /vol/demovol”.
  7. Deduplication can be scheduled using the “sis config” command.
  8. Done.
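If you want to track the savings over time, the "df -s" output from step 5 is easy to parse. Here is a minimal Python sketch, assuming you have captured the command output as text (for example over ssh); the `parse_df_s` helper is hypothetical and the numbers are kept in whatever units the filer reported.

```python
def parse_df_s(output):
    """Parse `df -s` text into a list of dicts, one per volume line."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, prompts and anything that is not a 4-column /vol/ row.
        if len(fields) != 4 or not fields[0].startswith("/vol/"):
            continue
        path, used, saved, pct = fields
        rows.append({
            "path": path,
            "used": int(used),    # units as reported by the filer
            "saved": int(saved),
            "pct_saved": int(pct.rstrip("%")),
        })
    return rows


if __name__ == "__main__":
    sample = (
        "fractal-design> df -s /vol/demovol\n"
        "Filesystem      used    saved   %saved\n"
        "/vol/demovol/   9316052 0       0%\n"
        "fractal-design>\n"
    )
    for row in parse_df_s(sample):
        print(row["path"], row["saved"], row["pct_saved"])
```

Fed the sample session from step 5, it yields a single row for /vol/demovol/ with 0 saved and 0%.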

More netapp blog posts at : http://unixfoo.blogspot.com/search/label/netapp

Solaris LDOMs virtualization : setup guide

A Sun Logical Domain (LDom) is a full virtual machine that runs an independent operating system instance and contains virtualized CPU, memory, storage, console, and cryptographic devices. This technology allows you to allocate a system's resources into logical groupings and create multiple, discrete systems, each with its own operating system, resources, and identity, within a single computer system. You can run a variety of application software in different logical domains and keep them independent for performance and security purposes. The LDoms environment can help to achieve greater resource usage, better scaling, and increased security and isolation.

 


Logical & Control domain :
The control domain communicates with the hypervisor to create and manage all logical domain configurations within a server platform. The Logical Domains Manager is used to create and manage logical domains. The Logical Domains Manager maps logical domains to physical resources. Without access to the Logical Domains Manager all logical domain resource levels remain static. The initial domain created when installing Logical Domains software is a control domain and is named primary.



Image from : http://www.sun.com/blueprints/0207/820-0832.pdf

You can download the Logical Domains Manager from http://sun.com/ldoms . Please read the release notes for system firmware and patch requirements. By default, the LDoms software is installed in /opt/SUNWldm/. Make sure the command below works; that confirms the Logical Domains Manager is running.


solfoo23# /opt/SUNWldm/bin/ldm list
Name State Flags Cons VCPU Memory Util Uptime
primary active -t-cv SP 32 16128M 49% 90mm


Creating default services : You need to create the default virtual services that the control domain uses to provide disk services, console access and networking. The commands below create them.

  1. Create Virtual Disk server(vds) : The virtual disk server allows virtual disks to be imported into a logical domain from the control domain.

    solfoo23# ldm add-vds primary-vds0 primary

  2. Create Virtual Console concentrator Server(vcc) : Virtual Console concentrator server provides terminal service to logical domain consoles.

    solfoo23# ldm add-vcc port-range=5000-5100 primary-vcc0 primary

  3. Create Virtual Switch server(vsw) : Virtual Switch server enables networking between virtual network devices in logical domains.

    solfoo23# ldm add-vsw net-dev=e1000g1 primary-vsw0 primary

  4. List the default services created

    solfoo23# ldm list-services primary
    VDS
    NAME VOLUME OPTIONS DEVICE
    primary-vds0

    VCC
    NAME PORT-RANGE
    primary-vcc0 5000-5100

    VSW
    NAME MAC NET-DEV DEVICE MODE
    primary-vsw0 00:11:5a:12:dc:fc e1000g1 switch@0 prog,promisc

Control Domain Creation : The next step is to perform the initial setup of the primary domain, which will act as the control domain. You should specify the resources that the primary domain will use and what will be released for use by other guest domains. In this document, we are creating the control domain with 2 CPUs and 1 GB of RAM.

solfoo23# ldm set-mau 0 primary
solfoo23# ldm set-vcpu 2 primary
solfoo23# ldm set-memory 1024M primary


Now, make the modified configuration permanent by saving it with the add-spconfig option. Use list-spconfig before and after to check the stored configurations.

solfoo23# ldm list-spconfig
factory-default [current]    

solfoo23# ldm add-spconfig initial    

solfoo23# ldm list-spconfig
factory-default [current]
initial [next]
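If you script this step, it helps to check which configuration is marked [current] and which is marked [next] before rebooting. A small Python sketch, assuming the "ldm list-spconfig" output has been captured as text and looks like the listing above; `parse_spconfig` is a hypothetical helper name.

```python
def parse_spconfig(output):
    """Map each stored configuration name to its marker ('current', 'next' or None)."""
    configs = {}
    for line in output.splitlines():
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        name = parts[0]
        # The marker, when present, is printed in square brackets after the name.
        marker = parts[1].strip("[]") if len(parts) > 1 else None
        configs[name] = marker
    return configs


if __name__ == "__main__":
    listing = "factory-default [current]\ninitial [next]\n"
    configs = parse_spconfig(listing)
    print(configs)  # prints: {'factory-default': 'current', 'initial': 'next'}
```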


Reboot the server and it will come up with the "initial" configuration.

Networking between domains : Networking between the control, service and other domains is disabled by default. To enable it, the virtual switch device should be configured as a network device. Log in on the server console and perform the following network configuration steps.

  1. Plumb the virtual switch(vsw0)

    solfoo23# ifconfig vsw0 plumb

  2. Bring down the primary interface

    solfoo23# ifconfig e1000g1 down unplumb

  3. Configure Virtual switch with the primary interface details

    solfoo23# ifconfig vsw0 <ip> netmask <netmask> broadcast + up

  4. Modify the hostname file to make this configuration permanent

    solfoo23# mv /etc/hostname.e1000g1 /etc/hostname.vsw0

  5. Enable Virtual Network terminal server daemon

    solfoo23# svcadm enable vntsd

Now the setup is done. Run "ldm list-bindings primary" and make sure the bindings look ok.

Logical Domain Creation : Now that the system is ready, prepare and plan the logical domain configuration. In this document, we are creating a logical domain named "domfoo" with 2 CPUs and 1GB of memory.

solfoo23# ldm add-domain domfoo
solfoo23# ldm add-vcpu 2 domfoo
solfoo23# ldm add-memory 1G domfoo
solfoo23# ldm add-vnet vnet1 primary-vsw0 domfoo
solfoo23# ldm add-vdsdev /dev/dsk/c1t2d0s2 vol1@primary-vds0
solfoo23# ldm add-vdisk vdisk1 vol1@primary-vds0 domfoo
solfoo23# ldm bind domfoo
solfoo23# ldm set-var auto-boot\?=false domfoo
solfoo23# ldm start-domain domfoo
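When you have to create several guest domains, it is handy to generate this command sequence instead of typing it. The Python sketch below only builds the command strings in the same order as above, it does not run them; the `guest_domain_commands` function and its parameter names are made up for this example.

```python
def guest_domain_commands(name, vcpus, memory, vnet, vswitch,
                          backend, volume, vds, vdisk):
    """Return the ldm commands, in order, that create and start one guest domain."""
    vol = f"{volume}@{vds}"  # volume@virtual-disk-server, e.g. vol1@primary-vds0
    return [
        f"ldm add-domain {name}",
        f"ldm add-vcpu {vcpus} {name}",
        f"ldm add-memory {memory} {name}",
        f"ldm add-vnet {vnet} {vswitch} {name}",
        f"ldm add-vdsdev {backend} {vol}",
        f"ldm add-vdisk {vdisk} {vol} {name}",
        f"ldm bind {name}",
        # The backslash keeps the shell from treating '?' as a wildcard.
        f"ldm set-var auto-boot\\?=false {name}",
        f"ldm start-domain {name}",
    ]


if __name__ == "__main__":
    for cmd in guest_domain_commands(
        name="domfoo", vcpus=2, memory="1G",
        vnet="vnet1", vswitch="primary-vsw0",
        backend="/dev/dsk/c1t2d0s2", volume="vol1",
        vds="primary-vds0", vdisk="vdisk1",
    ):
        print(cmd)
```

With the values used in this document, the printed lines match the commands shown above, so the output can be reviewed before pasting it into the control domain shell.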


You will be able to see the domain using "ldm list-domain"

solfoo23# ldm list-domain
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 2 2G 0.2% 3h 4m
domfoo inactive ----- 2 1G



Connect to the logical domain console by telnetting to the virtual console port.

solfoo23# telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost....
Escape character is '^]'.
Connecting to console "domfoo" in group "domfoo" ....
Press ~? for control options ..
{0} ok


Your LDom is up! You can install the operating system on it using JumpStart. Your LDoms environment is ready!





Netapp performance monitoring : sysstat : Part II

Here are some explanations on the columns of netapp sysstat command.

Cache age : The age in minutes of the oldest read-only blocks in the buffer cache. Data in this column indicates how fast read operations are cycling through system memory; when the filer is reading very large files, the buffer cache age will be very low. The cache age will also be low if reads are random. If read performance is poor, this number may indicate that you need a system with more memory, or that you should analyze the application to reduce the randomness of the workload.



Cache hit : This is the WAFL cache hit rate percentage: the percentage of times that WAFL tried to read a data block from disk and found the data already cached in memory. A dash in this column indicates that WAFL did not attempt to load any blocks during the measurement interval.

CP Ty : Consistency Point (CP) type is the reason that a CP started in that interval. The CP types are as follows:

  • - No CP started during the sampling interval (no writes went to disk during this interval)
  • number Number of CPs started during the sampling interval
  • B Back to back CPs (CP generated CP) (the filer is having a tough time keeping up with writes)
  • b Deferred back to back CPs (CP generated CP) (the back to back condition is getting worse)
  • F CP caused by full NVLog (one half of the NVRAM log was full, and so was flushed)
  • H CP caused by high water mark (rare to see this; one side of the NVRAM log was half full, so the filer decided to write to disk)
  • L CP caused by low water mark
  • S CP caused by snapshot operation
  • T CP caused by timer (filer data is flushed to disk every 10 seconds)
  • U CP caused by flush
  • : Continuation of a CP from the previous interval (a CP is still in progress across 1 second intervals)

The type character is followed by a second character which indicates the phase of the CP at the end of the sampling interval. If the CP completed during the sampling interval, this second character will be blank. The phases are as follows:

  • 0 Initializing
  • n Processing normal files
  • s Processing special files
  • f Flushing modified data to disk
  • v Flushing modified superblock to disk
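To make the two-character encoding concrete, here is a small Python sketch that decodes a "CP ty" value using the two tables above. The `decode_cp_ty` function is a hypothetical helper, not part of any NetApp tool, and it assumes the field has already been cut out of a sysstat output line.

```python
CP_TYPES = {
    "-": "no CP started",
    "B": "back to back CPs",
    "b": "deferred back to back CPs",
    "F": "full NVLog",
    "H": "high water mark",
    "L": "low water mark",
    "S": "snapshot operation",
    "T": "timer",
    "U": "flush",
    ":": "continuation of CP from previous interval",
}

CP_PHASES = {
    "0": "initializing",
    "n": "processing normal files",
    "s": "processing special files",
    "f": "flushing modified data to disk",
    "v": "flushing modified superblock to disk",
}


def decode_cp_ty(field):
    """Return (type, phase); phase is None if the CP completed in the interval."""
    field = field.strip()
    if not field:
        raise ValueError("empty CP ty field")
    if field.isdigit():
        # A plain number is the count of CPs started during the interval.
        return (f"{field} CPs started", None)
    cp_type = CP_TYPES.get(field[0], "unknown")
    phase = CP_PHASES.get(field[1], "unknown") if len(field) > 1 else None
    return (cp_type, phase)


if __name__ == "__main__":
    for value in ("Tf", "T", "B", "3", "-"):
        print(value, decode_cp_ty(value))
```

For example, "Tf" decodes to a timer CP that was still flushing modified data to disk when the interval ended, while a bare "T" is a timer CP that completed inside the interval.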

CP util : The Consistency Point (CP) utilization, the % of time spent in a CP. 100% time in CP is a good thing: it means that all of the CPU time dedicated to writing data was actually used. 75% means that only 75% of the time allocated to writing data was utilized, so 25% of that time was wasted. A good CP percentage has to be at or near 100%.


You can use Netapp SIO tool to benchmark netapp systems. SIO is a client-side workload generator that works with any target. It generates I/O load and does basic statistics to see how any type of storage performs under certain conditions.

Netapp performance monitoring : sysstat - Part I

Netapp sysstat is like vmstat and iostat rolled into one command. It reports filer performance statistics like CPU utilization, the amount of disk traffic, and tape traffic. When run without options, sysstat prints a new line of basic information every 15 seconds. To stop it, you have to use control-C (^c) or set the interval count (-c count) so that sysstat exits after that many intervals. For more detailed information, use the -u option. For information specific to one particular protocol, you can use other options. I'll list them here.

  • -f FCP statistics
  • -i iSCSI statistics
  • -b SAN (blocks) extended statistics
  • -u extended utilization statistics
  • -x extended output format. This includes all available output fields. Be aware that this produces output that is longer than 80 columns and is generally intended for "off-line" types of analysis and not for "real-time" viewing.
  • -m Displays multi-processor CPU utilization statistics. In addition to the percentage of the time that one or more CPUs were busy (ANY), the average (AVG) is displayed, as well as the individual utilization of each processor. This is only useful on multiprocessor systems; it won't work on single processor machines.
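If you collect sysstat output from scripts, a tiny helper that assembles the command line from the options above keeps things readable. This Python sketch only builds the string and does not contact a filer; the `sysstat_command` name and the mode labels are made up here, and the trailing positional interval (in seconds) is an assumption based on the usual sysstat usage rather than something covered in this post.

```python
SYSSTAT_MODES = {
    "fcp": "-f",             # FCP statistics
    "iscsi": "-i",           # iSCSI statistics
    "san": "-b",             # SAN (blocks) extended statistics
    "utilization": "-u",     # extended utilization statistics
    "extended": "-x",        # extended output format
    "multiprocessor": "-m",  # multi-processor CPU utilization statistics
}


def sysstat_command(mode=None, count=None, interval=None):
    """Build a sysstat command line from a mode label, a count and an interval."""
    args = ["sysstat"]
    if count is not None:
        args += ["-c", str(count)]  # stop after this many intervals
    if mode is not None:
        args.append(SYSSTAT_MODES[mode])
    if interval is not None:
        # Assumption: the sampling interval in seconds is the last argument.
        args.append(str(interval))
    return " ".join(args)


if __name__ == "__main__":
    print(sysstat_command("utilization", count=5, interval=1))  # prints: sysstat -c 5 -u 1
    print(sysstat_command())                                    # prints: sysstat
```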


Netapp Simulator – Installation steps

Those who are new to NetApp can use the NetApp Data ONTAP Simulator to get comfortable with NetApp commands. This tool gives you the experience of administering and using a NetApp storage system with all the features of Data ONTAP. The Simulator can be downloaded from http://now.netapp.com/NOW/cgi-bin/simulator (you need NOW access). The simulator has fully functional license keys for all NetApp functionalities.

The simulator can be loaded onto a Red Hat or SuSE Linux box and looks and feels exactly like Data ONTAP. Almost anything you can do with Data ONTAP can be done with the simulator. Without purchasing new hardware or impacting your production environment, you can test functionality, export NFS and CIFS shares etc.

Steps to install Simulator:

  1. Download the simulator iso from netapp and mount it on a linux machine.

    [root@fedora01 ~]# mount -o loop 7.3.2-sim-cdrom-image-v22.iso /mnt
    [root@fedora01 ~]# cd /mnt
    [root@fedora01 mnt]# ls
    disks2.tgz  disks.tgz  doc  license.htm  readme.htm  runsim.sh  setup.sh  sim.tgz  Vmware, Linux and Simulator installation.doc

  2. Install the simulator software, using the setup script as below

    [root@fedora01 mnt]# ./setup.sh

    Script version 22 (18/Sep/2007)
    Where to install to? [/sim]: /data/
    /data/ already exists. This may overwrite files.
    Are you sure? [no]: yes
    Would you like to install as a cluster? [no]: no
    Would you like full HTML/PDF FilerView documentation to be installed [yes]: yes
    Continue with installation? [no]: yes
    Creating /data/
    Unpacking sim.tgz to /data/
    Configured the simulators mac address to be [00:50:56:14:1c:d9]
    Please ensure the simulator is not running.
    Your simulator has 3 disk(s). How many more would you like to add? [0]: 21

    The following disk types are available in MB:
            Real (Usable)
      a -   43   ( 14)
      b -   62   ( 30)
      c -   78   ( 45)
      d -  129   ( 90)
      e -  535   (450)
      f - 1024   (900)

    If you are unsure choose the default option a
    What disk size would you like to use? [a]: e
    Disk adapter to put disks on? [0]:
    Use DHCP on first boot? [yes]:
    Ask for floppy boot? [no]:
    Checking the default route...
    You have a single network interface called eth0 (default route) .
    Which network interface should the simulator use? [default]:
    The recommended memory is 512MB.
    How much memory would you like the simulator to use? [512]:
    Create a new log for each session? [no]: yes
    Adding 21 additional disk(s).
    Complete. Run /data/runsim.sh to start the simulator.
    [root@fedora01 mnt]#

  3. Run /data/runsim.sh and provide the details during the first boot and halt the simulator.

  4. Run /data/runsim.sh to start the Netapp simulator again. You can launch the FilerView with url http://ip-address-of-simulator/na_admin.

    [root@fedora01 data]# /data/runsim.sh
    runsim.sh script version Script version 22 (18/Sep/2007)
    This session is logged in /data/sessionlogs/log-1265126584

    NetApp Release 7.3.2: Thu Oct 15 03:13:41 PDT 2009
    Copyright (c) 1992-2009 NetApp.
    Starting boot on Tue Feb  2 16:03:05 GMT 2010
    Tue Feb  2 16:03:35 GMT [fmmb.current.lock.disk:info]: Disk v4.16 is a local HA mailbox disk.
    Tue Feb  2 16:03:35 GMT [fmmb.current.lock.disk:info]: Disk v4.17 is a local HA mailbox disk.
    Tue Feb  2 16:03:35 GMT [fmmb.instStat.change:info]: normal mailbox instance on local side.
    Tue Feb  2 16:03:37 GMT [raid.cksum.replay.summary:info]: Replayed 0 checksum blocks.
    Tue Feb  2 16:03:37 GMT [raid.stripe.replay.summary:info]: Replayed 0 stripes.
    sparse volume upgrade done. num vol 0.
    Vdisk Snap Table for host:0 is initialized
    Tue Feb  2 16:03:40 GMT [vol.language.unspecified:info]: Language not set on volume vol0. Using language config "C". Use vol lang to set language.
    Tue Feb  2 16:03:40 GMT [rc:notice]: The system was down for 39 seconds
    Tue Feb  2 16:03:56 GMT [dfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk drives
    Tue Feb  2 16:03:56 GMT [sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
    Tue Feb  2 16:03:56 GMT [netif.linkUp:info]: Ethernet ns0: Link up.
    Tue Feb  2 16:03:57 GMT [perf.archive.start:info]: Performance archiver started. Sampling 22 objects and 195 counters.
    add net default: gateway 192.168.0.1
    Tue Feb  2 16:04:03 GMT [mgr.boot.disk_done:info]: NetApp Release 7.3.2 boot complete. Last disk update written at Tue Feb  2 16:03:01 GMT 2010
    Tue Feb  2 16:04:03 GMT [mgr.boot.reason_ok:notice]: System rebooted after a halt command.
    CIFS local server is running.

    Password:
    unixfoo-simulator> Tue Feb  2 16:04:13 GMT [console_login_mgr:info]: root logged in from console


