Patching Oracle Exalogic – part 2

Patching Oracle Exalogic – updating the IB Gateway switches

In my last post I promised to write some more on the details of patching, using patch set update 4 (i.e. the January 2012 patch set update, 13113092) as an example. So let’s get started on patching the infrastructure, by looking at updates for the InfiniBand Gateway switches.

I will demonstrate that these switches can be upgraded in a rolling fashion, without interrupting the network services (except for a few seconds) and keeping the Exalogic online while doing so!

1.2     Patching the Gateway switches

The first thing to note is that infrastructure patching is done as user root, not weblogic. After unzipping the el_infrastructure_10022.zip file (see my previous post on patching) we find the following:


[root@xxxxcn1 ~]# cd /u01/common/patches/todo/13113092/Infrastructure/1.0.0.2.2/
[root@xxxxcn1 1.0.0.2.2]# ls
BaseImage  NM2-36p  NM2-GW  one-command  README.html  README.txt  ZFS_Storage_7320

The first thing to do is some careful preparation: read the provided README.html file thoroughly and check My Oracle Support (MOS) for additional information such as upgrade advisors, for example the “Exalogic January 2012 PSU Infrastructure Upgrade Guide [ID 1392684.1]”. Be sure to also check the “known issues” document for your PSU.

Then we do some version checking to see whether a component update needs to be applied at all. Since the patch set is cumulative, some of the updates may already have been applied earlier.

The README.html file for the infrastructure part says:

If you are running either v1.0.0.0.0 or v1.0.0.1.0 of Exalogic Infrastructure, you must apply all the infrastructure patches/upgrades included in this PSU in the following order:

1. Exalogic Infrastructure
   a. InfiniBand Gateway Switch (NM2-GW)
   b. InfiniBand Switch 36 (NM2-36p)
   c. ZFS Storage Appliance (ZFS_Storage_7320)
      i.   Q3.2
      ii.  ILOM on the storage head
      iii. Q3.3
   d. Base Image v1.0.0.2.2 (rolling update, node at a time)

2. Exalogic Configuration Utility (ECU, previously called one-command)

Summarizing, the order of patching is as follows: first the network switches, then the storage appliance, then the OS on the compute nodes. Since we have a quarter rack configuration, there is no NM2-36p switch installed, so we don’t have to update it. We only have to update the two NM2-GW switches in our rack.
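
The ordering rule above can be written down as a small sketch. This is only an illustration: the component names are the directory names from the patch kit listing, and the function name patch_plan and the idea of passing in a set of installed components are my own assumptions, not something the README provides.

```python
# Patch order as quoted from the README. The identifiers are the directory
# names found in the unzipped patch kit.
PATCH_ORDER = [
    ("NM2-GW", "InfiniBand Gateway Switch"),
    ("NM2-36p", "InfiniBand Switch 36"),
    ("ZFS_Storage_7320", "ZFS Storage Appliance"),
    ("BaseImage", "Base Image on the compute nodes"),
    ("one-command", "Exalogic Configuration Utility"),
]


def patch_plan(installed):
    """Return the components to patch, in README order.

    `installed` is a set of component names present in this rack
    (hypothetical input; fill it in from your own rack inventory).
    """
    return [name for name, _description in PATCH_ORDER if name in installed]


# A quarter rack has no NM2-36p switch, so it drops out of the plan.
quarter_rack = {"NM2-GW", "ZFS_Storage_7320", "BaseImage", "one-command"}
plan = patch_plan(quarter_rack)
```

The point of keeping the order in one list is that a missing component is skipped without disturbing the sequence of the remaining ones.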

1.3     Checking current versions on the switches

Now, we first check the current software versions for the IB gateway switches. The README says the following:

This section contains instructions on upgrading NM2-GW InfiniBand Gateway switches in an Exalogic rack from version 1.1.2-3 (factory default on Exalogic X2-2 racks shipped with either v1.0.0.0.0 or v1.0.0.1.0 of the Exalogic Base Image) to version 1.3.2-1.

After logging in as root, we can use the version command to check the software version:


[root@xxxxgw1 ~]# version
SUN DCS gw version: 1.3.2-1
Build time: Feb 17 2011 10:02:40
FPGA version: 0x33
SP board info:
Manufacturing Date: 2010.12.30
Serial Number: "NCD600077"
Hardware Revision: 0x0006
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
.
[root@xxxxgw2 ~]# version
SUN DCS gw version: 1.3.2-1
Build time: Feb 17 2011 10:02:40
FPGA version: 0x33
SP board info:
Manufacturing Date: 2010.12.31
Serial Number: "NCD600233"
Hardware Revision: 0x0006
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
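
Comparing the reported version against the target by eye works for two switches, but the same check can be scripted. The following is a minimal sketch that assumes the output of the version command has been captured as text (for example over ssh); the function names are hypothetical, and only the "SUN DCS gw version" line shown above is relied upon.

```python
import re


def parse_gw_version(version_output):
    """Extract the SUN DCS gw version as a tuple, e.g. (1, 3, 2, 1)."""
    match = re.search(
        r"SUN DCS gw version:\s*(\d+)\.(\d+)\.(\d+)-(\d+)", version_output
    )
    if match is None:
        raise ValueError("no 'SUN DCS gw version' line found")
    return tuple(int(part) for part in match.groups())


def needs_upgrade(version_output, target):
    """True when the switch reports a version older than `target`."""
    return parse_gw_version(version_output) < target


sample = """SUN DCS gw version: 1.3.2-1
Build time: Feb 17 2011 10:02:40
FPGA version: 0x33
"""
already_at_psu_level = not needs_upgrade(sample, (1, 3, 2, 1))
```

Comparing tuples of integers rather than strings avoids the usual trap where "1.10" sorts before "1.9".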

As it turns out, this particular patch set update is not well suited for demonstrating updates of the InfiniBand Gateway switches in our case, as we already reached the required patch level (1.3.2-1) by applying the October 2011 patch set 12825625. Instead, I will take the upgrade to version 2.0.0.0.0 (patch 13795376) as the example here. For this update, the InfiniBand Gateway switches have to be upgraded to SUN DCS version 2.0.4-1.

First we have to do a number of prerequisite checks, which I will not cover here (but which are important to make sure the update goes through flawlessly). Then we upgrade the two gateway switches in a rolling fashion, so we don’t interrupt network services and users and applications can keep working. We do this by first upgrading the switch that is not the active master switch. Let’s find out which of the two has this role:


[root@xxxxgw1 ~]# getmaster
Local SM enabled and running
20120117 10:03:08 Master SubnetManager on sm lid 27 sm guid 0x2128be561ac0a0
: SUN IB QDR GW switch xxxxgw2
[root@xxxxgw2 ~]# getmaster
Local SM enabled and running
20120117 10:03:20 Master SubnetManager on sm lid 27 sm guid 0x2128be561ac0a0
: SUN IB QDR GW switch xxxxgw2

OK, gateway number 2 (GW02) is the master switch at present. That means we should upgrade the GW01 switch first, have them switch roles and then upgrade GW02 to finish up.
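
The decision "upgrade the non-master first" can be derived from the getmaster output directly. A minimal sketch, assuming the output has been captured as text and that the master is named on the "SUN IB QDR GW switch" line as in the transcript above; the function names are my own.

```python
import re

# The master's hostname follows this fixed text in the getmaster output.
MASTER_RE = re.compile(r"SUN IB QDR GW switch\s+(\S+)")


def master_switch(getmaster_output):
    """Return the hostname of the switch running the master subnet manager."""
    match = MASTER_RE.search(getmaster_output)
    if match is None:
        raise ValueError("no master switch reported in getmaster output")
    return match.group(1)


def upgrade_order(switches, getmaster_output):
    """Order the switches so the current master is upgraded last."""
    master = master_switch(getmaster_output)
    if master not in switches:
        raise ValueError(f"master {master!r} is not one of {switches!r}")
    return [name for name in switches if name != master] + [master]


getmaster_sample = (
    "Local SM enabled and running\n"
    "20120117 10:03:08 Master SubnetManager on sm lid 27 "
    "sm guid 0x2128be561ac0a0\n"
    ": SUN IB QDR GW switch xxxxgw2\n"
)
order = upgrade_order(["xxxxgw1", "xxxxgw2"], getmaster_sample)
```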

1.4     Upgrading GW01

The patch file is loaded via FTP from the Exalogic storage, where we have set up an FTP user called patcher for this in advance. The README for the 2.0.0.0.0 upgrade (very similar to the README for the January 2012 PSU, but a little more elaborate) states the following:

To upgrade the secondary NM2-GW switches, complete the following steps:

1. Switch to the ILOM shell by running the spsh command on the command line:

# spsh

->

2. Ensure that you have created the patches share in the ZFS storage appliance, and enabled the FTP service on the share with the permission for root access, as described in the top-level README file, which is included in the upgrade kit.

Load the firmware upgrade package using the command:

-> load -source ftp://root:<root_password>@<storage_host>//<path_to_NM2-GW_fw_upgrade_binaries_on_patches_share>/sundcs_gw_repository_2.0.4_1.pkg

OK, easy enough, let’s do that:


[root@xxxxgw1 ~]# spsh
Oracle(R) Integrated Lights Out Manager
Version ILOM 3.0 r47111
Copyright (c) 2010, Oracle and/or its affiliates. All rights reserved.
-> load -source ftp://patcher@xxxxsn-priv//export/common/patches/todo/
13795376/Infrastructure/2.0.0.0.0/NM2-GW/2.0.4-1/sundcs_gw_repository_2.0.4_1.pkg
Error: URL should specify IP Address. Hostname is not supported.
Firmware image update failed.
load: Command Failed

Hmm, I guess we should use the IP address of the storage instead of its name. I also found that the password has to be supplied directly in the URL, so we try again, and then it goes through:


-> load -source ftp://patcher:mypassword@<ZFS storage VIP address>//export/
common/patches/todo/13795376/Infrastructure/2.0.0.0.0/NM2-GW/2.0.4-1/
sundcs_gw_repository_2.0.4_1.pkg
Downloading firmware image. This will take few minutes.
NOTE: Firmware upgrade will upgrade firmware on SUN DCS gw Kontron module,
I4 and BridgeX. Upgrade takes few minutes to complete.
ILOM will enter a special mode to load new firmware. No other tasks
should be performed in ILOM until the firmware upgrade is complete.
Are you sure you want to load the specified file (y/n)? y
Setting up environment for firmware upgrade. This will take few minutes.
Starting SUN DCS gw FW update
==========================
Performing operation: I4 A
==========================
I4 fw upgrade from 7.3.0(INI:1) to 7.4.0(INI:1):
Upgrade started...
Upgrade completed.
INFO: I4 fw upgrade from 7.3.0(INI:1) to 7.4.0(INI:1) succeeded
==========================
Performing operation: BX A
==========================
BX fw upgrade from 8.3.3166(INI:4) to 8.4.2740(INI:5):
Upgrade started...
Upgrade completed.
INFO: BX fw upgrade from 8.3.3166(INI:4) to 8.4.2740(INI:5) succeeded
==========================
Performing operation: BX B
==========================
BX fw upgrade from 8.3.3166(INI:4) to 8.4.2740(INI:5):
Upgrade started...
Upgrade completed.
INFO: BX fw upgrade from 8.3.3166(INI:4) to 8.4.2740(INI:5) succeeded
===========================
Summary of Firmware update
===========================
I4 status                :  FW UPDATE - SUCCESS
I4 update succeeded on   :  A
I4 already up-to-date on :  none
I4 update failed on      :  none
BX status                :  FW UPDATE - SUCCESS
BX update succeeded on   :  A, B
BX already up-to-date on :  none
BX update failed on      :  none
=========================================
Performing operation: SUN DCS gw firmware update
=========================================
SUN DCS gw Kontron module fw upgrade from 1.3.2-1 to 2.0.4-1:
Please reboot the system to enable firmware update of Kontron module. The download
of the Kontron firmware image happens during reboot.
After system reboot, Kontron FW update progress can be monitored in browser using
URL [http://GWsystem] OR at OS command line prompt by using command [telnet GWsystem 1234]
where GWsystem is the hostname or IP address of SUN DCS GW.
Firmware update is complete.
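
The two lessons from the failed first attempt (IP address instead of hostname, password inside the URL) can be checked before typing the load command. The sketch below only builds and validates the URL string; the function name and the example address 192.0.2.10 are assumptions for illustration, and passwords containing characters that are special in URLs are not handled.

```python
import ipaddress


def build_load_url(user, password, host_ip, pkg_path):
    """Build the ftp:// URL for the ILOM 'load -source' command.

    Raises ValueError when host_ip is a hostname rather than an IP address,
    mirroring the error the ILOM shell returned in the first attempt.
    """
    ipaddress.ip_address(host_ip)  # raises ValueError for a hostname
    if not pkg_path.startswith("/"):
        raise ValueError("pkg_path must be an absolute path on the share")
    # The extra '/' plus the absolute path gives the '//' seen in the README.
    return f"ftp://{user}:{password}@{host_ip}/{pkg_path}"


url = build_load_url(
    "patcher",
    "mypassword",
    "192.0.2.10",  # placeholder address, not the real storage VIP
    "/export/common/patches/todo/13795376/Infrastructure/2.0.0.0.0/"
    "NM2-GW/2.0.4-1/sundcs_gw_repository_2.0.4_1.pkg",
)
```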


OK, that worked fine. Now exit the service processor shell and reboot the switch:

-> exit
[root@xxxxgw1 ~]# reboot -n
Broadcast message from root (pts/0) (Tue Mar 20 10:55:25 2012):
The system is going down for reboot NOW!
[root@xxxxgw1 ~]# Connection to xxxxgw1.qualogy.com closed by remote host.
Connection to xxxxgw1.qualogy.com closed.

Wait a bit for the GW01 switch to come back up, then log back in to verify the firmware and check the version:


% ssh root@xxxxgw1.qualogy.com
root@xxxxgw1.qualogy.com's password:
Last login: Tue Mar 20 09:22:49 2012 from 192.168.110.219
FW upgrade completed successfully on Tue Mar 20 11:02:32 CET 2012.
Please run the "fwverify" CLI command to verify the new image.
This message will be cleared on next reboot.
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.
[root@xxxxgw1 ~]# fwverify
Checking all present packages:
................................................................................
.............................................................. OK
Checking if any packages are missing:
.................................................................................
........................................................ OK
Verifying installed files:
..................................................................................
.......................................................... OK
[root@xxxxgw1 ~]# version
SUN DCS gw version: 2.0.4-1
Build time: Oct 17 2011 10:04:07
FPGA version: 0x33
SP board info:
Manufacturing Date: 2010.12.30
Serial Number: "NCD600077"
Hardware Revision: 0x0006
Firmware Revision: 0x0000
BIOS version: SUN0R100
BIOS date: 06/22/2010
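
The fwverify result above can also be checked mechanically when the output is captured as text. This is a rough sketch under one assumption I cannot confirm from the transcript: that every check is introduced by a line ending in a colon and, when it passes, ends with a line finishing in "OK". What a failing check prints is not shown here, so the function simply treats anything else as not passed.

```python
def fwverify_passed(fwverify_output):
    """True when every check header in the output is matched by an OK line."""
    checks = 0
    passed = 0
    for line in fwverify_output.splitlines():
        stripped = line.strip()
        if stripped.endswith(":"):
            checks += 1  # e.g. "Checking all present packages:"
        elif stripped.endswith("OK"):
            passed += 1  # progress dots followed by " OK"
    return checks > 0 and checks == passed


fwverify_sample = (
    "Checking all present packages:\n"
    "..........\n"
    "...... OK\n"
    "Checking if any packages are missing:\n"
    "..........\n"
    "..... OK\n"
    "Verifying installed files:\n"
    "..........\n"
    "....... OK\n"
)
verified = fwverify_passed(fwverify_sample)
```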

OK, done! There’s more checking to do, but I’ll skip it here for both clarity and brevity.

1.5     Switching network control from GW02 over to GW01

Now that we have successfully upgraded GW01, we can make it the master switch so that GW02 is freed from network control duty and can be upgraded as well. We do this by temporarily disabling the subnet manager on GW02, forcing a switchover:


[root@xxxxgw2 ~]# disablesm
Stopping partitiond daemon.                                [  OK  ]
Stopping IB Subnet Manager..-.                             [  OK  ]

Check on both GW01 and GW02 after waiting a few seconds:


[root@xxxxgw1 ~]# getmaster
Local SM enabled and running
20120320 10:47:30 Master SubnetManager on sm lid 12 sm guid 0x2128be529ac0a0 :
SUN IB QDR GW switch xxxxgw1 192.168.110.250
[root@xxxxgw2 ~]# getmaster
Local SM not enabled
20120320 10:47:39 Master SubnetManager on sm lid 12 sm guid 0x2128be529ac0a0 :
SUN IB QDR GW switch xxxxgw1 192.168.110.250
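
Before touching GW02 it is worth confirming that both switches agree on the new master. A minimal sketch of that check, again assuming the getmaster output of each switch has been captured as text; the function names and the dictionary layout are my own choices.

```python
import re

MASTER_RE = re.compile(r"SUN IB QDR GW switch\s+(\S+)")


def reported_master(getmaster_output):
    """Hostname of the master named in one getmaster output, or None."""
    match = MASTER_RE.search(getmaster_output)
    return match.group(1) if match else None


def switchover_complete(outputs, expected_master):
    """True when every switch reports `expected_master` as the master.

    `outputs` maps a switch hostname to its captured getmaster output.
    """
    return bool(outputs) and all(
        reported_master(text) == expected_master for text in outputs.values()
    )


outputs = {
    "xxxxgw1": (
        "Local SM enabled and running\n"
        "20120320 10:47:30 Master SubnetManager on sm lid 12 "
        "sm guid 0x2128be529ac0a0 :\n"
        "SUN IB QDR GW switch xxxxgw1 192.168.110.250\n"
    ),
    "xxxxgw2": (
        "Local SM not enabled\n"
        "20120320 10:47:39 Master SubnetManager on sm lid 12 "
        "sm guid 0x2128be529ac0a0 :\n"
        "SUN IB QDR GW switch xxxxgw1 192.168.110.250\n"
    ),
}
safe_to_upgrade_gw02 = switchover_complete(outputs, "xxxxgw1")
```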

1.6     Upgrading GW02

So now GW01 has become the master switch and we can upgrade GW02 in the same way. After completing the upgrade for GW02 and checking the version, we should make sure the subnet manager is re-enabled on GW02 so it can again watch GW01’s back and quickly take over control if the need arises.


[root@xxxxgw2 ~]# enablesm
Starting IB Subnet Manager.                                [  OK  ]
Starting partitiond daemon.                                [  OK  ]

Cool, we have in fact performed a rolling upgrade on the NM2-GW switches, and while we were upgrading them one after the other, the Exalogic stayed online!
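
To recap the sequence, here is a sketch that generates the rolling upgrade as an ordered checklist. It only produces (switch, command) pairs using the commands shown in this post; it does not execute anything, it leaves out the prerequisite and post-upgrade steps I skipped, and the function name and plan layout are assumptions of mine.

```python
def rolling_upgrade_plan(standby, master, load_url):
    """Return the ordered (switch, command) steps for a rolling upgrade.

    `standby` is upgraded first; `master` only after it has handed over
    the subnet manager role.
    """

    def upgrade(switch):
        return [
            (switch, "spsh"),
            (switch, f"load -source {load_url}"),
            (switch, "exit"),
            (switch, "reboot -n"),
            (switch, "fwverify"),
            (switch, "version"),
        ]

    plan = upgrade(standby)
    plan.append((master, "disablesm"))  # force the switchover
    plan.append((standby, "getmaster"))  # both should now name the standby
    plan.append((master, "getmaster"))
    plan += upgrade(master)
    plan.append((master, "enablesm"))  # restore subnet manager redundancy
    return plan


steps = rolling_upgrade_plan("xxxxgw1", "xxxxgw2", "ftp://<user>:<password>@<ip>//<path>")
for switch, command in steps:
    print(f"{switch}: {command}")
```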


Note: usually there are some small post-upgrade steps to do, which I will not cover here.

1.7     Next time

Next time, we will have a look at how the ZFS 7320 storage appliance can be upgraded in a similar fashion, using the rolling upgrade principle.

ABOUT THE AUTHOR

Jos Nijhoff is an experienced Application Infrastructure consultant at Qualogy. Currently he plays a key role as technical presales for Qualogy's exclusive Exalogic partnership with Oracle for the Benelux area. Thus he keeps in close contact with Oracle presales and partner services, but maintains an independent view. He gives technical guidance and designs, reviews, manages and updates the application infrastructure before, during and after the rollout of new and existing Oracle (Fusion) Applications & Fusion Middleware implementations. Jos is also familiar with subjects like high availability, disaster recovery scenarios, virtualization, performance analysis, data security, and identity management integration with respect to Oracle applications.

2 responses to Patching Oracle Exalogic – part 2

  1. Paul Done wrote:

    This is really useful, Jos, capturing all the pertinent info and examples in one place – will be a very handy reference.

  2. Jos Nijhoff wrote:

    Thanks Paul! (For those of you who don’t know Paul: Paul is Technical Director, Engineered Systems for Oracle and member of the EMEA Engineered Systems Architecture Team. He is based in the UK and has his own blog: http://pauldone.blogspot.com.)

    I have met Paul twice, at the Exalogic TOPGUN event in Linlithgow in July 2011 and last month at the next Exalogic TOPGUN event over there.
