OCFS2 is a POSIX-compliant shared-disk cluster file system for Linux capable of providing both high performance and high availability. Cluster-aware applications can make use of parallel I/O for higher performance. OCFS2 is mostly used to host Oracle RAC databases on Linux clusters.

The steps below show how to create an OCFS2 filesystem on top of a multipathed SAN LUN and mount it on a Linux cluster.

  1. Identify the nodes that will be part of your cluster.
  2. Export/zone the LUNs on the SAN side and check that they are accessible on all the hosts of the cluster (fdisk -l or multipath -ll).
  3. If you need multipathing, configure multipath and the multipathing policy based on your requirements. For multipath setup, refer to Red Hat's multipath guide.
  4. Create the OCFS2 configuration file (/etc/ocfs2/cluster.conf) on all the cluster nodes.
  5. The sample cluster.conf below is for a 3-node cluster. If the nodes have a dedicated heartbeat IP, use it as ip_address for OCFS2 cluster communication, and set name to the node's short hostname (not the FQDN). Copy the same file to every host in the cluster.

    [root@oracle-cluster-1 ~]# cat /etc/ocfs2/cluster.conf
    node:
            ip_port = 7777
            ip_address = 203.21.2.101
            number = 0
            name = oracle-cluster-1
            cluster = ocfs2

    node:
            ip_port = 7777
            ip_address = 203.21.2.102
            number = 1
            name = oracle-cluster-2
            cluster = ocfs2

    node:
            ip_port = 7777
            ip_address = 203.21.2.103
            number = 2
            name = oracle-cluster-3
            cluster = ocfs2

    cluster:
            node_count = 3
            name = ocfs2

    [root@oracle-cluster-1 ~]#

  6. On each node, check the status of the OCFS2 cluster service and stop "o2cb" if it is already running.

    # service o2cb status
    # service o2cb stop
     
  7. On each node, load the OCFS2 module.

    # service o2cb load
     
  8. Bring the OCFS2 cluster online on all the nodes.

    # service o2cb online

  9. Your OCFS2 cluster is now ready.
  10. Format the SAN device from any one of the cluster nodes.

    # mkfs.ocfs2 -b 4k -C 32k -L oraclerac /dev/mapper/mpath0

    -b : Block size (values are 512, 1K, 2K and 4K bytes per block)
    -C : Cluster size (values are 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K and 1M)
    -L : Label


    Note: Replace /dev/mapper/mpath0 with your device name.

  11. Add the volume and its mount point to /etc/fstab on all the nodes in the cluster.

    For example: /dev/mapper/mpath0 /u01 ocfs2 _netdev 0 0

  12. Mount the /u01 volume using the mount command.

    # mount /u01
     
  13. Enable the o2cb and ocfs2 services at runlevels 3, 4 and 5.

    # chkconfig --level 345 o2cb on ; chkconfig --level 345 ocfs2 on

  14. The /u01 setup on the SAN LUN is complete.
  15. You can now configure an Oracle RAC database on this filesystem.
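
The cluster.conf from step 5 has to be identical on every node, so generating it from a single list of hosts can help avoid copy-and-paste mistakes. The following is a minimal sketch, not part of ocfs2-tools: gen_cluster_conf is a hypothetical helper name, and the tab indentation assumes that o2cb expects each stanza parameter to start with a tab and stanzas to be separated by a blank line.

```shell
#!/bin/sh
# Sketch: emit an o2cb cluster.conf from "hostname ip" pairs read on stdin.
# gen_cluster_conf is a hypothetical helper, not an ocfs2-tools command.
gen_cluster_conf() {
    cluster=$1          # cluster name, e.g. ocfs2
    port=${2:-7777}     # interconnect port, 7777 as in the sample above
    n=0                 # node numbers are assigned in input order, from 0
    while read -r name ip || [ -n "$name" ]; do
        [ -n "$name" ] || continue      # skip blank lines
        printf 'node:\n'
        printf '\tip_port = %s\n' "$port"
        printf '\tip_address = %s\n' "$ip"
        printf '\tnumber = %s\n' "$n"
        printf '\tname = %s\n' "$name"
        printf '\tcluster = %s\n\n' "$cluster"
        n=$((n + 1))
    done
    printf 'cluster:\n'
    printf '\tnode_count = %s\n' "$n"
    printf '\tname = %s\n' "$cluster"
}

# Print a config for the three nodes used in this post.
gen_cluster_conf ocfs2 <<'EOF'
oracle-cluster-1 203.21.2.101
oracle-cluster-2 203.21.2.102
oracle-cluster-3 203.21.2.103
EOF
```

Review the output, then redirect it to /etc/ocfs2/cluster.conf and copy that file to the other nodes.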

The Lustre file system is a distributed, high-performance cluster filesystem designed for I/O performance and scalability in large and complex computing environments. It is well suited to data-intensive applications that require high I/O performance.

Lustre components

MDS – Metadata Server: The MDS makes metadata stored in the metadata targets available to Lustre clients.

MDT – Metadata Target: This stores metadata, such as filenames, directories, permissions, and file layout, on the metadata server.

OSS – Object Storage Server: The OSS provides file I/O service and network request handling for the OSTs. The MDT, OSTs and Lustre clients can run concurrently on a single node. However, a typical configuration is an MDT on a dedicated node, two or more OSTs on each OSS node, and a client on each of a large number of compute nodes.

OST – Object Storage Target: The OST stores data as data objects on one or more OSSs. A single Lustre file system can have multiple OSTs, each serving a subset of file data.

Client: The systems that mount the Lustre filesystem.
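
The mkfs.lustre output in the steps that follow labels each target, for example MGS, unixfoo_cloud-MDTffff and unixfoo_cloud-OSTffff. As a rough illustration of how those labels map back to the components above, here is a small sketch. target_kind is a hypothetical helper, and the pattern (filesystem name, a dash, MDT or OST, then four characters) is inferred from the labels shown in this post.

```shell
#!/bin/sh
# Sketch: map a Lustre target label to the component it belongs to.
# target_kind is a hypothetical helper; the label patterns are inferred
# from the mkfs.lustre output shown in this post.
target_kind() {
    case $1 in
        MGS)        echo "MGS" ;;
        *-MDT????)  echo "MDT (served by an MDS)" ;;
        *-OST????)  echo "OST (served by an OSS)" ;;
        *)          echo "unknown" ;;
    esac
}

for label in MGS unixfoo_cloud-MDTffff unixfoo_cloud-OSTffff; do
    printf '%s: %s\n' "$label" "$(target_kind "$label")"
done
```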

Steps to create Lustre FS

Configure Lustre Management Server (lustre-mgs.unixfoo.biz – Server 1)

  1. Add the disk to volume manager

    [root@lustre-mgs mnt]# pvcreate /dev/sdb
    Physical volume "/dev/sdb" successfully created

    [root@lustre-mgs mnt]# pvs
      PV         VG   Fmt  Attr PSize   Pfree
      /dev/sdb        lvm2 --   136.73G 136.73G

  2. Create lustre volume group

    [root@lustre-mgs mnt]# vgcreate lustre /dev/sdb
      Volume group "lustre" successfully created
    [root@lustre-mgs mnt]# vgs
      VG     #PV #LV #SN Attr   VSize   Vfree
      lustre   1   0   0 wz--n- 136.73G 136.73G

  3. Create logical volume "MGS" (the management server)

    [root@lustre-mgs ~]# lvcreate -L 25G -n MGS lustre

  4. Create the Lustre management filesystem.

    [root@lustre-mgs ~]# mkfs.lustre --mgs /dev/lustre/MGS

       Permanent disk data:
    Target:     MGS
    Index:      unassigned
    Lustre FS:  lustre
    Mount type: ldiskfs
    Flags:      0x74
                  (MGS needs_index first_time update )
    Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
    Parameters:

    checking for existing Lustre data: not found
    device size = 10240MB
    2 6 18
    formatting backing filesystem ldiskfs on /dev/lustre/MGS
            target name  MGS
            4k blocks     0
            options        -J size=400 -q -O dir_index,uninit_groups -F

    mkfs_cmd = mkfs.ext2 -j -b 4096 -L MGS  -J size=400 -q -O dir_index,uninit_groups -F /dev/lustre/MGS
    Writing CONFIGS/mountdata
    [root@lustre-mgs ~]#

  5. Activate the Lustre management filesystem by mounting it (create the /lustre/MGS mount point first if it does not exist).

    [root@lustre-mgs ~]# mount -t lustre /dev/lustre/MGS /lustre/MGS/
    [root@lustre-mgs ~]# df
    Filesystem           1K-blocks      Used Available Use% Mounted on
    /dev/sda2             54558908   5276572  46466144  11% /
    /dev/sda1               497829     29006    443121   7% /boot
    tmpfs                  8216000         0   8216000   0% /dev/shm
    /dev/lustre/MGS       10321208    433052   9363868   5% /lustre/MGS
    [root@lustre-mgs ~]#
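
The five steps above can be collected into one script. The sketch below only prints the commands by default so they can be reviewed first. DRY_RUN, DISK, VG, run and setup_mgs are illustrative names chosen for this example, and the mkdir line is an added assumption because the post does not show the mount point being created.

```shell
#!/bin/sh
# Sketch: steps 1-5 for the management server as one script.
# DRY_RUN, DISK, VG, run() and setup_mgs() are illustrative names only.
DRY_RUN=${DRY_RUN:-1}       # 1 = print commands, 0 = execute them
DISK=${DISK:-/dev/sdb}
VG=${VG:-lustre}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_mgs() {
    # Each step runs only if the previous one succeeded.
    run pvcreate "$DISK" &&
    run vgcreate "$VG" "$DISK" &&
    run lvcreate -L 25G -n MGS "$VG" &&
    run mkfs.lustre --mgs "/dev/$VG/MGS" &&
    run mkdir -p /lustre/MGS &&
    run mount -t lustre "/dev/$VG/MGS" /lustre/MGS
}

setup_mgs
```

Run it once as-is to check the command list, then again as root with DRY_RUN=0 to execute it.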

Configure Lustre Metadata Server (in this guide, the management and metadata servers run on the same host)

  1. Create logical volume "MDT_unixfoo_cloud"

    [root@lustre-mgs ~]# lvcreate -L 25G -n MDT_unixfoo_cloud lustre

  2. Create the Lustre metadata filesystem for the filesystem "unixfoo_cloud".

    [root@lustre-mgs ~]# mkfs.lustre --fsname=unixfoo_cloud --mdt  --reformat --mgsnode=lustre-mgs@tcp0 /dev/lustre/MDT_unixfoo_cloud
       Permanent disk data:
    Target:     unixfoo_cloud-MDTffff
    Index:      unassigned
    Lustre FS:  unixfoo_cloud
    Mount type: ldiskfs
    Flags:      0x71
                  (MDT needs_index first_time update )

    Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
    Parameters: mgsnode=10.217.0.237@tcp mdt.group_upcall=/usr/sbin/l_getgroups
    device size = 20480MB
    2 6 18

    formatting backing filesystem ldiskfs on /dev/lustre/MDT_unixfoo_cloud
            target name  unixfoo_cloud-MDTffff
            4k blocks     0
            options        -J size=400 -i 4096 -I 512 -q -O dir_index,uninit_groups -F
    mkfs_cmd = mkfs.ext2 -j -b 4096 -L unixfoo_cloud-MDTffff  -J size=400 -i 4096 -I 512 -q -O dir_index,uninit_groups -F /dev/lustre/MDT_unixfoo_cloud
    Writing CONFIGS/mountdata
    [root@lustre-mgs ~]#

  3. Activate the metadata filesystem using the mount command.

    [root@lustre-mgs ~]# mkdir /lustre/MDT_unixfoo_cloud
    [root@lustre-mgs ~]# mount -t lustre  /dev/lustre/MDT_unixfoo_cloud /lustre/MDT_unixfoo_cloud
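
In the mkfs.lustre output above, the --mgsnode=lustre-mgs@tcp0 argument shows up in the Parameters line as mgsnode=10.217.0.237@tcp. Checking that value is a quick way to confirm the target points at the intended management server. The following sketch extracts it from saved mkfs.lustre output; mgsnode_of is a hypothetical helper and mdt-mkfs.log is an assumed file name.

```shell
#!/bin/sh
# Sketch: print the mgsnode value recorded in saved mkfs.lustre output.
# mgsnode_of is a hypothetical helper; it reads the named files, or stdin
# when no file is given. It prints nothing if no mgsnode is recorded.
mgsnode_of() {
    sed -n 's/^[[:space:]]*Parameters:.*mgsnode=\([^[:space:]]*\).*/\1/p' "$@"
}

# Example with the Parameters line from the output above.
printf '%s\n' 'Parameters: mgsnode=10.217.0.237@tcp mdt.group_upcall=/usr/sbin/l_getgroups' |
    mgsnode_of
```

To use it on a real run, save the output with something like `mkfs.lustre ... | tee mdt-mkfs.log` and then call `mgsnode_of mdt-mkfs.log`.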

Configure OSTs (servers: oss1, oss2, ...)

  1. Add /dev/md0 to volume manager

    [root@oss1 ~]# pvcreate /dev/md0

  2. Create volume group "lustre"

    [root@oss1 ~]# vgcreate lustre /dev/md0

  3. Create logical volume (OST) for unixfoo_cloud

    [root@oss1 ~]# lvcreate -n OST_unixfoo_cloud_1 -L 100G lustre

  4. Create OST lustre filesystem

    [root@oss1 ~]# mkfs.lustre --fsname=unixfoo_cloud --ost --mgsnode=lustre-mgs@tcp0 /dev/lustre/OST_unixfoo_cloud_1
    mkfs_cmd = mkfs.ext2 -j -b 4096 -L unixfoo_cloud-OSTffff  -J size=400 -i 16384 -I 256 -q -O dir_index,uninit_groups -F /dev/lustre/OST_unixfoo_cloud_1
    Writing CONFIGS/mountdata
    [root@oss1 ~]#

  5. Activate the OST using the mount command

    [root@oss1 ~]# mkdir -p /lustre/unixfoo_cloud_oss1
    [root@oss1 ~]# mount -t lustre /dev/lustre/OST_unixfoo_cloud_1 /lustre/unixfoo_cloud_oss1
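
The post verifies the MGS mount with df but shows no check for the OST. One possible check is to list the entries of type lustre in the mounts table. This is a sketch under the assumption that mounted targets appear in /proc/mounts with filesystem type lustre; lustre_mounts is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list Lustre mounts as "device mountpoint" pairs.
# lustre_mounts is a hypothetical helper. It reads the mounts table given
# as its first argument and defaults to /proc/mounts on the local host.
lustre_mounts() {
    awk '$3 == "lustre" { print $1, $2 }' "${1:-/proc/mounts}"
}

# Usage on an OSS after step 5:
#   lustre_mounts
```

On oss1 this would be expected to list /dev/lustre/OST_unixfoo_cloud_1 and /lustre/unixfoo_cloud_oss1 once the mount has succeeded.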

Mount on the client:

  1. Mount the lustre filesystem unixfoo_cloud

    [root@lustreclient1 ~]# mount -t lustre lustre-mgs@tcp0:/unixfoo_cloud /mnt
    [root@lustreclient1 ~]# df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda2              52G  5.1G   44G  11% /
    /dev/sda1             487M   29M  433M   7% /boot
    tmpfs                 7.9G     0  7.9G   0% /dev/shm
    lustre-mgs@tcp0:/unixfoo_cloud
                           99G  461M   93G   1% /mnt
    [root@lustreclient1 ~]#
  2. Done.
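
The mount above does not survive a reboot. If you want it to, one option is an /etc/fstab entry in the same style as the OCFS2 one earlier in this post. The sketch below appends an entry only when the mount point is not already listed, so it is safe to run more than once. add_fstab_entry is a hypothetical helper, and the defaults,_netdev options are an assumption that is not taken from the steps above.

```shell
#!/bin/sh
# Sketch: append an fstab entry unless the mount point is already listed.
# add_fstab_entry is a hypothetical helper.
# Arguments: fstab-file device mount-point fstype options
add_fstab_entry() {
    fstab=$1 dev=$2 mnt=$3 fstype=$4 opts=$5
    # Look for an uncommented line whose second field is the mount point.
    if awk -v m="$mnt" '$1 !~ /^#/ && $2 == m { found = 1 } END { exit !found }' "$fstab"; then
        echo "already present: $mnt"
    else
        printf '%s %s %s %s 0 0\n' "$dev" "$mnt" "$fstype" "$opts" >> "$fstab"
    fi
}

# Usage on the client (run as root):
#   add_fstab_entry /etc/fstab lustre-mgs@tcp0:/unixfoo_cloud /mnt lustre defaults,_netdev
```

Try it against a copy of /etc/fstab first and compare the result before touching the real file.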
