Monday, December 5, 2022

What's ODABR snapshot & how to efficiently use it to patch ODA from 18 to 19.x



Intro

Although most of the focus nowadays has shifted to migrating on-premises workloads to the cloud, companies still rely on Oracle engineered systems such as Oracle Database Appliance (ODA) to run their databases on-prem. As a matter of fact, ODA offers a low entry price and flexible CPU licensing, which makes it a good home for workloads that aren’t mature enough to go to the cloud. Until they are, system updates remain the customer’s responsibility. In today’s use case, patching your ODA software from 18.8 to 19.x requires upgrading the OS from Oracle Linux 6 to 7. But how does Oracle make that move seamless and safe in case of failure? This is why I chose to discuss ODABR, a tool that provides rollback capability during the OS upgrade on ODAs.

BACKUP BEFORE YOU PATCH
It is especially interesting to learn how to use it effectively with limited free storage when patching an ODA to 19.6. Read more about the ODA release matrix in the official Oracle blog.


Patching process to ODA 19.6

The upgrade from 18.8 to 19.6 has two main stages:

  1. A first pass to upgrade Linux from OEL 6 to OEL 7.

  2. A second pass to update the ODA binaries (DCS and Grid), as in previous versions.

 

What’s ODABR

ODA Backup & Recovery (ODABR) is a utility developed by Oracle engineer Ruggero Citton that backs up and recovers an ODA node using consistent and incremental system backups on bare metal ODAs, as described in Oracle Support Note ID 2466177.1. ODABR is a prerequisite for the first stage (OS upgrade to OEL 7), because it saves a disk restore point to roll back to if the ODA patching fails (the precheck will even fail if the tool is not installed).

ODA backups

The system node backup includes the following filesystems:

  • / : Root file system

  • /boot : Boot partition

  • /opt : opt file system (OAK/DCS, TFA, OWG, ASR)

  • /u01 : Grid Infrastructure, RDBMS binaries

  • Grid Infrastructure OCR file

There are two types of backups with ODABR, but only one is needed when patching the ODA to 19.6:

  • Consistent backup, guaranteed by the LVM snapshot feature (used during patching)

  • Incremental backup, managed automatically using rsync (physical copy to a specified destination)


LVM snapshot used by ODABR

ODABR simply reuses the Linux LVM snapshot feature, which creates a point-in-time copy of a logical volume: the snapshot is kept for backup purposes while the original volume stays in operation. Only the changes (the delta) made since the snapshot was created are tracked.

  • Snapshot creation is quick and doesn’t require stopping the server.

  • A snapshot only uses the space needed to hold the difference between the two LVs (the delta, stored using Copy-on-Write (CoW)).
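
For readers who have never handled LVM snapshots directly, here is a minimal sketch of the snapshot life cycle using plain LVM commands. It is an illustration only and not necessarily what ODABR runs internally. The helper names are made up for this post, the volume names are the ones this ODA uses, and the functions only print the commands so nothing changes on the system.

```shell
#!/bin/sh
# Hypothetical helpers: they only PRINT standard LVM commands, they never run them.

# Command that would create a CoW snapshot of /dev/<vg>/<lv>.
# $1=volume group  $2=origin LV  $3=snapshot name  $4=CoW size in GB
lvm_snapshot_cmd() {
    echo "lvcreate --snapshot --size ${4}G --name $3 /dev/$1/$2"
}

# Command that would roll the origin LV back to the snapshot content.
# $1=volume group  $2=snapshot name
lvm_rollback_cmd() {
    echo "lvconvert --merge /dev/$1/$2"
}

# Command that would drop the snapshot and keep the current state of the origin.
# $1=volume group  $2=snapshot name
lvm_drop_cmd() {
    echo "lvremove -f /dev/$1/$2"
}

lvm_snapshot_cmd VolGroupSys LogVolRoot root_snap 5
lvm_rollback_cmd VolGroupSys root_snap
lvm_drop_cmd     VolGroupSys root_snap
```

The three functions map to the three things you can do with a restore point: take it, go back to it, or discard it once the change has proven good.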

ODABR installation
Download and install the rpm: >> odabr-2.0.1

[root@odadev1~]# rpm -Uvh odabr-2.0.1-62.noarch.rpm 
odabr-2.0.1.62 has been installed on /opt/odabr succesfully!
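
Since the OS precheck fails when the tool is missing, a quick check before going further can save a round trip. The sketch below assumes ODABR sits at /opt/odabr/odabr, as the install message says. The ODABR_BIN variable and the function name are hypothetical and exist only in this example.

```shell
#!/bin/sh
# Assumption: ODABR lives at /opt/odabr/odabr (path taken from the install output).
# ODABR_BIN is a hypothetical override used only by this sketch.
ODABR_BIN="${ODABR_BIN:-/opt/odabr/odabr}"

# Return 0 when the given path is an executable file, 1 otherwise.
odabr_installed() {
    [ -x "$1" ]
}

if odabr_installed "$ODABR_BIN"; then
    echo "ODABR found at $ODABR_BIN"
else
    echo "ODABR missing at $ODABR_BIN: install the rpm before running the OS prechecks"
fi
```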


Backup Syntax

Usage:
odabr backup [-snap] [-destination <dest path> [-dryrun][-silent]] | [-mgmtdb]
       [-osize <opt snapsize>][-rsize <root snapsize>][-usize <u01 snap size>]

odabr infosnap --- show available snapshots
odabr delsnap --- delete all snapshots

The backup syntax is pretty straightforward, with -snap and -destination (NFS/local path or ssh/rsync) as the main options.
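
As an illustration of the snapshot form of the command, the sketch below builds an "odabr backup -snap" command line with custom snapshot sizes. The wrapper name is hypothetical, the flags are the ones listed in the usage text, and the function only prints the command instead of running it. Sizes are whole numbers of GB.

```shell
#!/bin/sh
# Hypothetical wrapper: builds (and only prints) an "odabr backup -snap" command.
# $1=root snapshot size  $2=opt snapshot size  $3=u01 snapshot size (all in GB)
odabr_snap_cmd() {
    for size in "$1" "$2" "$3"; do
        case "$size" in
            # Reject empty values and anything that is not purely digits.
            ''|*[!0-9]*) echo "sizes must be whole numbers of GB" >&2; return 1 ;;
        esac
    done
    echo "/opt/odabr/odabr backup -snap -osize $2 -rsize $1 -usize $3"
}

odabr_snap_cmd 5 30 70
```

Validating the sizes up front is a small design choice: a typo is caught by the wrapper rather than by the tool in the middle of a maintenance window.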


Patching to 19.6 challenge with limited Free space

 
Before upgrading the OS, ODABR creates LVM snapshots of the file systems, which by default require 190GB of free space:

root LVM snapshot  30GB
opt  LVM snapshot  60GB
u01  LVM snapshot 100GB

But on most older systems, the unused space is lower than that.
Example: a node with only 78GB of unused space, which causes an error during the patching prechecks.

[root@odadev2 ~]# df -Ph / /u01 /opt
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot   30G  6.9G   22G  25% /
/dev/mapper/VolGroupSys-LogVolU01   148G  104G   37G  75% /u01  
/dev/mapper/VolGroupSys-LogVolOpt    59G   38G   19G  68% /opt
=== 78GB available only

PRECHECK ERROR

# odacli create-prepatchreport -v 19.6.0.0.0 -os
# odacli describe-prepatchreport -i 12d61cda-1cef-40b9-ad7d-8e087007da23v

Patch pre-check report
------------------------------------------------------------------------
Job ID: 666f7269-7f9a-49b1-8742-2447e94fe54e
Description: Patch pre-checks for [OS]
Status: FAILED
Created: November 7, 2022 5:30:42 PM CEST
Result: One or more pre-checks failed for [OS]
Pre-Check               Status   Comments
----------------------- -------- --------------------------------------
Validate LVM free space Failed   Insufficient space to create LVM
                                 snapshots on node: odadev1.
                                 Expected free space(GB): 190, available space(GB): 78.
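
The failed check boils down to simple arithmetic: the three default snapshot sizes are added up and compared with the free space. The sketch below reproduces that comparison. It is a hypothetical helper written for this post, not the code odacli actually runs.

```shell
#!/bin/sh
# Hypothetical helper mirroring the "Validate LVM free space" arithmetic.
# $1=free space in GB  $2..$4=root, opt and u01 snapshot sizes in GB
check_snapshot_space() {
    needed=$(( $2 + $3 + $4 ))
    if [ "$1" -ge "$needed" ]; then
        echo "OK: need ${needed}GB, have ${1}GB"
    else
        echo "FAIL: need ${needed}GB, have ${1}GB"
        return 1
    fi
}

# Default snapshot sizes (30 + 60 + 100) against the 78GB of this example.
check_snapshot_space 78 30 60 100 || true
```

With the default sizes the function reports a need of 190GB against 78GB, which is the same pair of numbers the prepatch report prints.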



Workarounds  

When free space is limited, we have two options:

1. Cowboy

My Oracle ACE peer Fernando Simon explains a drastic way to reduce the /u01 footprint in his excellent blog post "Patch ODA from 18.3 to 19.8. Part 2": unmounting the file system, then using both resize2fs and lvreduce to reclaim free space.


2. Manual ODABR backup with custom snapshot size

A snapshot only needs as much storage as the changes made to the logical volume, so the OS upgrade itself will be the main source of the changes stored in the snapshots.
Solution: run a manual backup and specify smaller sizes for the /, /opt, and /u01 snapshots. Note that you still need to run the prepatch report at least once.
Example

With only 98GB of free space, we can adapt the snapshots to smaller sizes (opt=30GB, root=5GB, u01=70GB).

[root@odadev1 ~]# df -Ph / /opt /u01
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot   30G  7.6G   21G  28% /
/dev/mapper/VolGroupSys-LogVolOpt    59G   41G   16G  73% /opt
/dev/mapper/VolGroupSys-LogVolU01   148G   80G   61G  57% /u01

-- Actual free space

[root@odadev1 ~]# pvs
  PV         VG          Fmt  Attr PSize   PFree
  /dev/md1   VolGroupSys lvm2 a--u 446.00g 98.00g
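
LVM snapshots are carved out of the free extents of the volume group, so PFree is the figure to watch rather than the Avail column of df. The sketch below pulls that value out of pvs output. The helper name is hypothetical and it assumes the default pvs column order (PV, VG, Fmt, Attr, PSize, PFree). To keep the example self-contained it is fed the two lines of this node's pvs output instead of a live call.

```shell
#!/bin/sh
# Hypothetical helper: read "pvs" output on stdin and print the PFree value
# of the given volume group, without the trailing unit letter.
# Assumes the default pvs column order: PV VG Fmt Attr PSize PFree.
vg_free() {
    awk -v vg="$1" '$2 == vg { sub(/g$/, "", $6); print $6 }'
}

# Sample input copied from this node; on a live system: pvs | vg_free VolGroupSys
printf '%s\n' \
  '  PV         VG          Fmt  Attr PSize   PFree' \
  '  /dev/md1   VolGroupSys lvm2 a--u 446.00g 98.00g' | vg_free VolGroupSys
```

If a volume group spans several physical volumes, the helper prints one value per PV and the values would need to be added up.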

Note: specify values for the LVM snapshot sizes that are lower than the actual filesystem usage.
- The odacli update-server command will use these custom snapshots during the upgrade instead of automatically creating larger ones, which would take 190GB.

[root@odadev1 ~]# /opt/odabr/odabr backup -snap -osize 30 -rsize 5 -usize 70

¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
 odabr - ODA node Backup Restore - Version: 2.0.1-62
 Copyright Oracle, Inc. 20
 --------------------------------------------------------
 Author: Ruggero Citton <ruggero.citton@oracle.com>
 RAC Pack, Cloud Innovation
¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
SUCCESS: 2022-11-7 12:10:18:...snapshot backup for 'opt' created successfully
SUCCESS: 2022-11-7 12:10:20:...snapshot backup for 'u01' created successfully
SUCCESS: 2022-11-7 12:10:20:...snapshot backup for 'root' created successfully
SUCCESS: 2022-11-7 12:10:20: LVM snapshots backup done successfully


-- Check the created LVM snapshots

[root@odadev02 ~]# /opt/odabr/odabr infosnap
LVM snap name         Status                COW Size              Data%
-------------         ----------            ----------            ------
root_snap             active                5.00 GiB              0.05%
opt_snap              active                30.00 GiB             0.02%
u01_snap              active                70.00 GiB             0.02%

As shown above and below, the snapshots are almost empty at this point: they will only fill up with the changes written during the OS upgrade.

[root@odadev1 ~]# lvs
  LV         VG          Attr       LSize   Pool Origin     Data% Meta% Move Log
  LogVolDATA VolGroupSys -wi-a-----  10.00g
  LogVolOpt  VolGroupSys owi-aos---  60.00g
  LogVolRECO VolGroupSys -wi-a-----  10.00g
  LogVolRoot VolGroupSys owi-aos---  30.00g
  LogVolSwap VolGroupSys -wi-ao----  24.00g
  LogVolU01  VolGroupSys owi-aos--- 150.00g
  opt_snap   VolGroupSys swi-a-s---  30.00g      LogVolOpt  0.01 <- snapshot
  root_snap  VolGroupSys swi-a-s---   5.00g      LogVolRoot 0.04 <- snapshot
  u01_snap   VolGroupSys swi-a-s---  70.00g      LogVolU01  0.02 <- snapshot
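
Because the custom snapshots are smaller than the defaults, it is worth watching the Data% column while the upgrade runs: an LVM snapshot whose CoW space fills up completely becomes invalid and can no longer serve as a restore point. The sketch below flags snapshots at or above a given fill percentage. The helper name and the threshold are my own, and it assumes the infosnap column layout printed by ODABR 2.0.1 (name, status, COW size, unit, Data%).

```shell
#!/bin/sh
# Hypothetical helper: read "odabr infosnap" output on stdin and print the
# snapshots whose Data% is at or above the threshold given as $1.
# Assumed columns: name, status, COW size, unit, Data%.
snaps_over() {
    awk -v limit="$1" '$1 ~ /_snap$/ {
        pct = $5
        sub(/%$/, "", pct)                      # strip the trailing percent sign
        if (pct + 0 >= limit) print $1, pct "%"
    }'
}

# Sample input in the same layout, with one snapshot made up as nearly full.
printf '%s\n' \
  'root_snap   active   5.00 GiB    85.40%' \
  'opt_snap    active   30.00 GiB   12.10%' \
  'u01_snap    active   70.00 GiB   9.75%' | snaps_over 80
```

On a live node the call would look like "/opt/odabr/odabr infosnap | snaps_over 80", where 80 is an arbitrary warning level chosen for the example.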



ODABR tips when patching

  • You can use the odabr "-dryrun" option before choosing the right sizes.

  • When custom snapshots already exist on the system while odacli create-prepatchreport runs, the precheck fails because it expects to create these snapshots itself. However, odacli update-server -c os still continues with the upgrade.

  • Use the --force option during the upgrade to skip the automatic backup.

    # odacli update-server -v 19.6.0.0.0 -c os --local --force
    Verifying OS upgrade
    Current OS base version: 6 is less than target OS base version: 7
    OS needs to upgrade to 7.7

  • Run the ODABR backup right after the repository update in order to avoid extracting the patch a second time.

    $  odacli update-repository -f oda-asm-zipfile1,zipfile2,zipfile3,zipfile4


      You can now follow the rest of the guided steps to patch ODA from 18.8 to 19.9

  • When running the post-upgrade checks, you’ll be asked to delete the snapshots.

    [root@odadev1]# ./odacli update-server-postcheck -v 19.6.0.0.0
    Comp Pre-Check       Status   Comments
    ---- --------------- -------- ---------------------------------
    OS   ODABR snapshot  WARNING  ODABR snapshot found. Run 'odabr delsnap'

    -- Delete the snapshots
    [root@odadev1]# /opt/odabr/odabr delsnap
    INFO: 2022-11-07 20:44:55: Removing LVM snapshots
    SUCCESS: 2022-11-07 20:44:55: ...snapshot for 'opt' removed successfully
    SUCCESS: 2022-11-07 20:44:55: ...snapshot for 'u01' removed successfully
    SUCCESS: 2022-11-07 20:44:56: ...snapshot for 'root' removed successfully
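
Putting the tips together, the sketch below prints the OS upgrade stage as an ordered checklist. The order is one plausible reading of the tips in this section and should be treated as an assumption, not as an official Oracle procedure. The function name is hypothetical, the snapshot sizes are the ones from my example, and the patch file names are left as a placeholder. Nothing is executed: the function only echoes the commands.

```shell
#!/bin/sh
# Dry-run checklist: prints the commands in one plausible order, never runs them.
# Snapshot sizes are the example values from this post; adjust them to your node.
VERSION="19.6.0.0.0"

os_upgrade_steps() {
    echo "odacli update-repository -f <patch zip files>"
    echo "/opt/odabr/odabr backup -snap -osize 30 -rsize 5 -usize 70"
    echo "odacli create-prepatchreport -v $VERSION -os"
    echo "odacli update-server -v $VERSION -c os --local --force"
    echo "odacli update-server-postcheck -v $VERSION"
    echo "/opt/odabr/odabr delsnap"
}

os_upgrade_steps
```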



 
Recovering from a Failed Operating System Upgrade

In case things go south, we can always roll back since we have a restore point.

  1. Download the ODARescue Live Disk ISO image for the 19.6 release to boot the node on which the OS upgrade failed (see Oracle Support Note 2495272.1).
    Then configure the ODA system on Oracle ILOM to boot from the ISO image.

  2. Specify the NFS location of the ISO image, including the IP address and the path with file name.

    -> set /SP/services/kvms/host_storage_device/remote server_URI=nfs://10.10.1.1:/export/iso/ODARescue_LiveDisk.iso

  3. Configure the ISO image from the Oracle ILOM Service Processor (SP) serial console so that you can use the ISO image to boot the Oracle Database Appliance system.

    -> set /SP/services/kvms/host_storage_device/ mode=remote
    -> set /HOST boot_device=cdrom

  4. Reboot the ODA host from ILOM using the ODARescue ISO image.

  5. Log in as the root user with password "welcome1" (the user "odalive" can also be used).

  6. If you decide to revert to the Oracle Linux 6 configuration after troubleshooting, run the command below:

    # odarescue ol6restore
    ol6restore will restore:
    boot/efi partition
    LVM snapshots (root, opt, u01)
    grub v1

    This command restores the Oracle Linux 6 configuration using the snapshots that were taken using ODABR.
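
To avoid typos at the ILOM prompt during an incident, the commands of steps 2 and 3 can be prepared in advance. The sketch below prints them for a given NFS server and ISO path. The helper name is hypothetical, the IP address and path are the example values used in step 2, and the output is meant to be pasted at the ILOM "->" prompt, not run on the host.

```shell
#!/bin/sh
# Hypothetical helper: print the ILOM commands of steps 2 and 3.
# $1=NFS server IP  $2=absolute path of the ISO file on the NFS export
ilom_rescue_boot_cmds() {
    echo "set /SP/services/kvms/host_storage_device/remote server_URI=nfs://$1:$2"
    echo "set /SP/services/kvms/host_storage_device/ mode=remote"
    echo "set /HOST boot_device=cdrom"
}

# Example values from step 2; replace them with your own NFS server and path.
ilom_rescue_boot_cmds 10.10.1.1 /export/iso/ODARescue_LiveDisk.iso
```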

Conclusion

  • ODABR is a very convenient tool that can help you back up and recover your server from OS corruption.
  • We also learned how to reduce the snapshot footprint before upgrading the ODA from 18.8 to 19.6.
  • With this in mind, you can patch your ODA to 19.6 safely even if your free space is lower than 190GB.
  • I hope this helps you learn more about this tool, which got me curious back when I first patched an ODA to 19.6 a couple of years ago.

        Thank you for reading
