Xen performance monitoring

The Xen hypervisor does not expose an easy-to-collect set of performance counters. The management machine, "Domain-0", is itself a privileged virtual machine and therefore gets only its own small share of CPUs and RAM. Collecting performance information inside it yields figures for that single VM, not for the whole host. "xentop" reports per-domain information, but combining it with Cacti or any other SNMP-based collection tool is a bit tricky. A good solution is described by Ian P. Christian in his blog post about Xen monitoring, which includes a script that collects the performance details.
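
As a rough illustration of how xentop output could be fed to a collector, here is a minimal sketch that reduces xentop's batch output to one "name cpu% mem(k)" line per domain. It assumes the default column order (NAME, STATE, CPU(sec), CPU(%), MEM(k), ...) and domain names without spaces. The function name xentop_summary is made up for this example, and this is not the script from the blog post mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: summarise `xentop -b` output read from stdin.
# Assumes the default xentop columns: NAME STATE CPU(sec) CPU(%) MEM(k) ...
xentop_summary() {
    awk '
        $1 == "NAME" { next }                          # skip each header line
        NF >= 5      { cpu[$1] = $4; mem[$1] = $5 }    # keep the latest sample
        END {
            for (d in cpu) printf "%s %s %s\n", d, cpu[d], mem[d]
        }
    ' | sort
}

# Example (on Domain-0, as root): two iterations one second apart, so the
# CPU(%) column of the second sample covers a real interval.
#   xentop -b -i 2 -d 1 | xentop_summary
```

The output is plain text, so it can be wrapped by whatever the collection tool expects (for example a script invoked by the SNMP agent).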

Reference


OpenSource Tools for AIX

Here are several websites that have OpenSource Tools for AIX. The "IBM AIX Toolbox" has the most tools, but not necessarily the most current versions. The other links are also worth noting.

Reference : AIXpert Blog

Linux Kernel panic reboot

By default, after a kernel panic the Linux kernel simply halts and waits for a system administrator to hit the restart or power-cycle button. This is because the "kernel.panic" parameter is set to 0 by default.

[root@linux23 ~]# cat /proc/sys/kernel/panic
0
[root@linux23 ~]# sysctl -a | grep kernel.panic
kernel.panic = 0
[root@linux23 ~]#

To change this and make Linux reboot automatically after a kernel panic, set the "kernel.panic" parameter to an integer N greater than zero, where N is the number of seconds to wait before the automatic reboot.
For example, if you set N = 10, the system waits 10 seconds before rebooting. Writing to /proc/sys/kernel/panic takes effect immediately but is lost at the next reboot; to make the setting permanent, add it to /etc/sysctl.conf.

[root@linux23 ~]# echo "10" > /proc/sys/kernel/panic
[root@linux23 ~]# grep kernel.panic /etc/sysctl.conf
kernel.panic = 10
[root@linux23 ~]#

This avoids the need for manual intervention after a kernel panic. Set up a kernel crash dump or netdump as well, so that the debug information from the crash is captured.
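
The permanent part of the change can be scripted. The following is a minimal sketch, not a standard tool: persist_sysctl is a made-up helper that replaces or appends a "key = value" line in a sysctl configuration file (by default /etc/sysctl.conf), assuming one setting per line.

```shell
#!/bin/sh
# Hypothetical helper: set "KEY = VALUE" in a sysctl configuration file,
# replacing any existing line for KEY. FILE defaults to /etc/sysctl.conf.
persist_sysctl() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1

    # Escape the dots so "kernel.panic" is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')

    # Copy every line except an existing setting for KEY, then append the new one.
    if [ -f "$conf" ]; then
        grep -v "^[[:space:]]*${pattern}[[:space:]]*=" "$conf" > "$tmp"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$tmp"

    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (as root): persist the setting, then load it into the running kernel.
#   persist_sysctl kernel.panic 10 && sysctl -p
```

Related keys such as kernel.panic_on_oops are left untouched, because the match requires an "=" right after the key name.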

Netapp Ontap Management SDK

Netapp provides an SDK (the Netapp API) that contains the resources needed to develop third-party applications that monitor and manage Netapp filers. The SDK contains libraries, code samples and bindings in C, C++, Java and Perl for programming against the Data ONTAP management API.

The Ontap SDK contains:
  • SDK Core API library bindings in C, C++, Java and Perl.
  • SDK Core API and Data ONTAP API documentation, sample code, developer tools and design guides.
  • Manage ONTAP SDK Help.
For more detailed information read : http://communities.netapp.com/docs/DOC-1110
You can download the SDK from : http://communities.netapp.com/docs/DOC-1365


Solaris : Auditing File attributes

Solaris has a software registry that maintains information about the installed software packages. The registry is invaluable for auditing the system to determine what software has been changed, installed, removed, or patched. It contains a database of installed files, physically located in the file /var/sadm/install/contents. Each file, special file, and directory installed on the system has an entry in this database. If the attributes of a file are changed after installation, the "pkgchk" command can detect and report it, which makes it a good command for auditing. Here is an example:

solaris98# pkgchk
ERROR: /etc/apache/magic
    file size <12965> expected <12441> actual
    file cksum <8026> expected <33401> actual
ERROR: /etc/apache/mime.types
    file size <14987> expected <9957> actual
    file cksum <46595> expected <27635> actual
ERROR: /etc/auto_master
    file size <113> expected <395> actual
    file cksum <9773> expected <34676> actual
ERROR: /etc/default/dhcpagent
    file size <3394> expected <2826> actual
    file cksum <26394> expected <43621> actual
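
A full pkgchk run can produce a long report. As a minimal sketch, the report format shown above can be reduced to one line per file whose size changed. The function name pkgchk_size_changes is made up for this example, and the sketch assumes path names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: reduce pkgchk error output (read from stdin) to
# "path expected_size actual_size" for every file whose size changed.
pkgchk_size_changes() {
    awk '
        /^ERROR: /   { path = $2 }                              # remember the file
        /file size / { gsub(/[<>]/, ""); print path, $3, $5 }   # strip <> and report
    '
}

# Example (pkgchk is assumed to report errors on stderr, hence 2>&1):
#   pkgchk 2>&1 | pkgchk_size_changes
```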

Some files are expected to change, such as /etc/system, which is edited by sysadmins very often. pkgchk has a -n option that skips checking such volatile or editable files. Though it is tempting to use this option to reduce the amount of output from an audit, it is good to know what got changed.

solaris98# pkgchk -l -p /etc/system
Pathname: /etc/system
Type: editted file
Expected mode: 0644
Expected owner: root
Expected group: sys
Referenced by the following packages:
        SUNWcsr
Current status: installed

solaris98#


If you want to check what has changed on a filesystem, you can combine find and pkgchk. See the example below.

solaris98# find /usr -mount -exec pkgchk -p {} \;
ERROR: /usr
    permissions <0755> expected <0775> actual
WARNING: no information associated with pathname </usr/platform/TSBW>
WARNING: no information associated with pathname <8000>
WARNING: no information associated with pathname </usr/platform/TSBW>
WARNING: no information associated with pathname <Ultra-2i>
..

Reference : http://www.sun.com/blueprints/1299/repairing.pdf

Linux Tips : Useful links

I came across the links below; they have a lot of useful Linux tips, neat and well presented. Check them out.


Netapp Storage Commands

Here are some useful functions of the "storage" command in Netapp.

1) To show all disks on the system : Use "storage show disk -T" to display all the disks attached to the filer, along with each disk's serial number, vendor, model, firmware version and type (SATA/ATA/FCAL).

# rsh filer12 storage show disk -T
DISK                  SHELF BAY SERIAL           VENDOR   MODEL      REV TYPE
--------------------- --------- ---------------- -------- ---------- ---- ------
0d.16                   1    0  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
0d.17                   1    1  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
0d.18                   1    2  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
0d.19                   1    3  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
0d.20                   1    4  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
0d.21                   1    5  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
0d.22                   1    6  xxxxxxxxxxxxxxxx NETAPP   X276 NA07 FCAL
...
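
For a quick inventory, the listing can be summarised per disk type. This is a minimal sketch that assumes the column layout shown above, with the disk id in the first column and TYPE in the last. The function name count_disks_by_type is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: count disks per TYPE from "storage show disk -T"
# output read from stdin. Only lines that start with a disk id such as
# "0d.16" are counted; the TYPE is taken from the last column.
count_disks_by_type() {
    awk '
        $1 ~ /^[0-9]+[a-z]\.[0-9]+$/ { n[$NF]++ }
        END { for (t in n) print t, n[t] }
    ' | sort
}

# Example:
#   rsh filer12 storage show disk -T | count_disks_by_type
```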

2) To see complete information of a particular disk : Use "storage show disk -a <disk-id>" to view the complete information for a Netapp disk. This command gives you the shelf, bay, serial number, disk speed and many other details.

# rsh filer12 storage show disk -a 0d.99
Disk:             0d.99
Shelf:            5
Bay:              13
Serial:           xxxxxxxxxxxxxxxxxxxx
Vendor:           NETAPP
Model:            X276
Rev:              NA07
RPM:              10000
WWN:              xxxxxxxxxxxxxxxxxxa
UID:              xxxxxxxxxxxxxxxxx:00000000:00000000:00000000:00000000
Downrev:          no
Pri Port:         B
Power-on Hours:   N/A
Blocks read:      0
Blocks written:   0
Time interval:    00:00:00
Glist count:      0
Scrub last done:  00:00:00
Scrub count:      0
LIP count:        0
Dynamically qualified:  No
#
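
When only one attribute is needed (for example in a script), the "Name: value" layout above is easy to filter. This is a minimal sketch; disk_field is a made-up helper name, and the field label is assumed to contain no regular-expression metacharacters.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "Name: value" field from
# "storage show disk -a <disk-id>" output read from stdin.
disk_field() {
    field=$1
    # Strip the label and the padding after the colon; the value itself may
    # contain colons (for example "Time interval:    00:00:00").
    sed -n "s/^${field}:[[:space:]]*//p"
}

# Example:
#   rsh filer12 storage show disk -a 0d.99 | disk_field RPM
```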

3) To list all storage adapters on the filer : Use the "storage show adapter -a" command to display all the storage adapters (HBAs) on the filer.

# rsh filer12 storage show adapter -a

Slot:            0a
Description:     Fibre Channel Host Adapter 0a (Dual-channel, QLogic 2322 rev. 3)
Firmware Rev:    3.3.25
FC Node Name:    xxxxxxxxxxxxxxxxxxx
FC Packet Size:  2048
Link Data Rate:  2 Gbit
SRAM Parity:     Yes
External GBIC:   No
State:           Enabled
In Use:          No
Redundant:       Yes

Slot:            0b
Description:     Fibre Channel Host Adapter 0b (Dual-channel, QLogic 2322 rev. 3)
Firmware Rev:    3.3.25
FC Node Name:    xxxxxxxxxxxxxxxxxxx
..

4) To get shelf details of filer : Use "storage show shelf <shelf-id>" command to display the details of the shelf and its partner shelf.


# rsh filer12 storage show shelf 0c.shelf2
Shelf name:    0c.shelf2
Channel:       0c
Module:        A
Shelf id:      2
Shelf UID:     xxxxxxxxxxxxxxxxxxxxxxx
Term switch:   N/A
Shelf state:   ONLINE
Module state:  OK

                               Loop  Invalid  Invalid  Clock  Insert  Stall  Util    LIP
Disk    Disk     Port            up      CRC     Word  Delta   Count  Count  Percent Count
  Id     Bay    State         Count    Count    Count
----------------------------------------------------------------------------------------
[IN  ]          OK                0        0        0      8       0      0    71     0
[OUT ]          OK                0        0        0      0       0      0    52     0
[  32]     0    OK                0        0        0     32       0      0     0     0
[  33]     1    OK                0        0        0     32       0      0     2     0
[  34]     2    OK                0        0        0     24       0      0     0     0
[  35]     3    OK                0        0        0     24       0      0     1     0
[  36]     4    OK                0        0        0      8       0      0     2     0
[  37]     5    OK                0        0        0     24       0      0     4     0
...

More Netapp commands at : http://unixfoo.blogspot.com/search/label/netapp



Netapp volume commands

How to find failed disks on a filer? The "vol status -f" command lists the failed disks on a filer.

# rsh filer12 vol status -f

Broken disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
failed          3a.33   3a    2   1   FC:A   -  FCAL 10000 68000/139264000   68552/140395088
failed          4a.28   4a    1   12  FC:A   -  FCAL 10000 68000/139264000   68552/140395088
#
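
This output lends itself to a simple scheduled check. The following is a minimal sketch under the assumption that failed disks appear as lines whose first column is "failed", as in the listing above. Both function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the device names of the disks that
# "vol status -f" reports as failed (output read from stdin).
failed_disks() {
    awk '$1 == "failed" { print $2 }'
}

# Sketch of a check for cron or a monitoring wrapper: print a message and
# return non-zero when at least one failed disk is listed.
check_failed_disks() {
    disks=$(failed_disks)
    [ -z "$disks" ] && return 0
    # $disks is left unquoted on purpose so the names end up on one line.
    echo "FAILED DISKS:" $disks
    return 1
}

# Example:
#   rsh filer12 vol status -f | check_failed_disks
```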

How to find spare disks on a filer? The "vol status -s" command lists the spare disks on a filer.

# rsh filer12 vol status -s

Spare disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare           2a.45   2a    2   13  FC:A   -  FCAL 10000 68000/139264000   68552/140395088
spare           4a.57   4a    3   9   FC:A   -  FCAL 10000 68000/139264000   68552/140395088
#

To find out the disks in an aggregate : Use "aggr status -r <aggregate-name>" to list all the disks that are part of the aggregate. This command gives the plex, RAID group and disk information.

# rsh filer12 aggr status -r aggr0
Aggregate aggr0 (online, raid_dp) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (normal)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   4a.15   4a    2   13  FC:A   -  FCAL 10000 68000/139264000   69536/142410400
      parity    4a.16   4a    4   12  FC:A   -  FCAL 10000 68000/139264000   68552/140395088
      data      4a.22   4a    4   8   FC:A   -  FCAL 10000 68000/139264000   68552/140395088
#


To find out the volumes on a filer : The "vol status" command lists the volumes on a filer. It gives each volume's name and state (online/offline/restricted).

# rsh filer12 vol status
         Volume State      Status            Options
          vol10 online     raid_dp, flex     no_i2p=on
          vol11 online     raid_dp, flex     no_i2p=on
           root online     raid_dp, flex     root, no_i2p=on
          vol12 offline    raid_dp, flex     no_i2p=on
#
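
To pick out only the volumes that need attention, the state column can be filtered. This is a minimal sketch that assumes the layout shown above, with the volume name in the first column and the state in the second. The function name volumes_not_online is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: list the volumes that "vol status" reports as
# offline or restricted (output read from stdin), as "name state".
volumes_not_online() {
    awk '
        $1 == "Volume" { next }                                 # header line
        $2 == "offline" || $2 == "restricted" { print $1, $2 }
    '
}

# Example:
#   rsh filer12 vol status | volumes_not_online
```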

More Netapp commands at : http://unixfoo.blogspot.com/search/label/netapp
