Tuesday, August 3, 2021

OVM series part 2: What to collect when opening an SR

Intro

In my last post, I described a few commands available in the Oracle VM (OVM) Manager CLI. This time, the topic is how to identify the relevant information about your OVM environment, as well as the logs for each component and where to find them.
OVM servers and OVM Manager have their own logs, and when issues occur it’s good to have them handy before reaching out to Oracle Support. Checking the details of your configuration beforehand also helps you quickly communicate the right release number and state of each component.

Check the configuration

Oracle VM Manager

The configuration can be checked with the ovm_admin command. The values are usually the installation defaults.

[root@em1 ~]# /u01/app/oracle/ovm-manager-3/bin/ovm_admin --listconfig

Oracle VM Manager Release 3.4.4 Admin tool
Oracle VM Manager Configuration
Database type               : MySQL
Database Server hostname    : localhost
Database name               : ovs <-- Database storing the ovmm metadata
Database Listener port      : 49500
Oracle VM Database user     : ovs
WebLogic Server admin       : weblogic
Oracle VM Manager admin     : admin

Or by checking the configuration file, where you can also find the OVMM UUID and build number, which are useful in case of a reinstall/restore.

[root@em1 ~]# cat /u01/app/oracle/ovm-manager-3/.config
DBTYPE=MySQL
DBHOST=localhost
SID=ovs
LSNR=49500
OVSSCHEMA=ovs
APEX=8080
WLSADMIN=weblogic
OVSADMIN=admin
COREPORT=54321
UUID=0004fb00000100007cc584fa7bf6e57f
BUILDID=3.4.4.1709 <--- exact build number
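
If you want to paste the UUID and build number straight into an SR, a small helper can pull single values out of that file. This is only a minimal sketch: the function name ovmm_config_value is my own invention, and it simply assumes the KEY=VALUE layout shown above.

```shell
# Print the value of one KEY=VALUE setting from an OVM Manager .config file.
# Usage: ovmm_config_value <config-file> <key>
# (hypothetical helper, not part of the OVM tooling)
ovmm_config_value() {
    awk -F= -v k="$2" '
        $1 == k {
            sub(/^[^=]*=/, "")    # drop "KEY="
            sub(/[ \t].*$/, "")   # drop anything after the value itself
            print
            exit
        }' "$1"
}

# Example, run as root on the manager host:
#   ovmm_config_value /u01/app/oracle/ovm-manager-3/.config UUID
#   ovmm_config_value /u01/app/oracle/ovm-manager-3/.config BUILDID
```

An unknown key prints nothing, so an empty result is a hint that the file layout differs from what is assumed here.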

Here you can have a look at the ovmm service settings and the MySQL backup configuration.

[root@em1 ~]# cat  /etc/sysconfig/ovmm
JVM_MEMORY_MAX=4096m
JVM_MAX_PERM=512m
RUN_OVMM=YES
DBBACKUP=/u01/app/oracle/mysql/dbbackup
DBBACKUP_CMD=/opt/mysql/meb-3.12/bin/mysqlbackup
UUID=0004fb00000100007cc584fa7bf6e57f

Oracle VM server

The configuration can be checked through the xm commands, which interact with the Xen hypervisor on the OVM host.

[root@ovm-01 ~]# xm info
host                   : ovm-01
release                : 4.1.12-124.20.3.el6uek.x86_64
version                : #2 SMP Thu Oct 11 17:47:32 PDT 2018
machine                : x86_64
nr_cpus                : 24
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 2
cpu_mhz                : 3059
hw_caps                : bfebfbff:2c100800:00000000:01703f00:029ee3ff:00000000:xxx:000
virt_caps              : hvm hvm_directio
total_memory           : 147447
free_memory            : 75478
free_cpus              : 0
xen_major              : 4
xen_minor              : 4
xen_extra              : .4OVM
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=max:3792M allowsuperpage dom0_vcpus_pin dom0_max_vcpus=20 crashkernel=512M@64M
cc_compiler            : gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18.0.7)
cc_compile_by          : mockbuild
cc_compile_domain      : us.oracle.com
cc_compile_date        : Thu Sep  6 08:24:27 PDT 2018
xend_config_format     : 4
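
When support asks how loaded a server is, the two memory lines are usually what they want. Here is a rough sketch that condenses them into one line. It assumes the field names shown above and that xm reports both values in MiB; the function name xen_mem_summary is made up for this post.

```shell
# Summarize hypervisor memory from "xm info" output read on stdin.
# (hypothetical helper; field names taken from the output above)
xen_mem_summary() {
    awk -F: '
        /^total_memory/ { total = $2 + 0 }   # "+ 0" turns " 147447" into a number
        /^free_memory/  { free  = $2 + 0 }
        END {
            if (total > 0)
                printf "total=%d MiB free=%d MiB used=%d%%\n",
                       total, free, (total - free) * 100 / total
        }'
}

# Example, run as root on an OVM server:
#   xm info | xen_mem_summary
```

With the xm info output above, this prints total=147447 MiB free=75478 MiB used=48%.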

-- List the VMs on the OVM host

[root@ovm-server04 ~]# xm list
Name                                        ID   Mem VCPUs      State   Time(s)
0004fb00000600000f199f55466d4b84            14  8199     1     -b---- 100696.0
0004fb00000600001a440b79c4e6e194            20  4099     2     -b----  36437.3
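
A quick way to turn that listing into numbers you can quote in an SR is to count the guests and add up their memory. This is a sketch under the assumption that the columns are laid out as above (name, ID, Mem, ...); xm_list_summary is a name I picked, and it skips Domain-0 in case your listing includes it.

```shell
# Count guests and sum their memory from "xm list" output read on stdin.
# (hypothetical helper; assumes the column layout shown above)
xm_list_summary() {
    awk 'NR > 1 && $1 != "Domain-0" { n++; mem += $3 }   # NR > 1 skips the header
         END { printf "guests=%d total_mem=%d MiB\n", n, mem }'
}

# Example, run as root on an OVM server:
#   xm list | xm_list_summary
```

For the two guests listed above, this prints guests=2 total_mem=12298 MiB.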

Directory    Purpose
------------ -------------------------------------------------------------------------
/etc/xen     Contains OVM Server configuration files for the OVM Server daemon
             and virtualized guests.

OVM agent

You can check the version by running an rpm query on the agent package, or check the agent's state through its service status.

[root@ovm-04 ~]# rpm -aq ovs-agent
ovs-agent-3.4.5-11.el6.x86_64

[root@ovm-04 ~]# service ovs-agent status



Available logs

Oracle VM Server directories
All log directories are usually located under /var/log, and each log records information for a specific module.

Directory         Purpose
----------------- -----------------------------------------------------------------------
/var/log          Contains the OVM Agent log file, ovs-agent.log.
                  Contains the ovmwatch.log, which logs virtual machine life cycle events.
                  Contains the ovm-consoled.log, which logs remote VNC console access, and all communication with OVM Manager.
/var/log/xen      Contains OVM Server log files.
/var/log/messages Contains OVM Server messages.

Log file                      Purpose
----------------------------- ----------------------------------------------------------------
/var/log/xen/xend.log         All OVM Server daemon actions. Same output as the xm log command.
/var/log/xen/xend-debug.log   More detailed logs of the OVM Server daemon.
/var/log/xen/xen-hotplug.log  Log of hotplug events, written if a device or network script doesn’t start up or become available.
/var/log/xen/qemu-dm.pid.log  Log for each hardware virtualized guest. Replace pid in the file name with the process ID.
/var/log/ovs-agent.log        Log for the OVM Agent.
/var/log/osc.log              Log for the OVM Storage Connect plug-ins.
/var/log/ovm-consoled.log     Log for the OVM virtual machine console.
/var/log/ovmwatch.log         Log for the OVM watch daemon.
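
To have those files handy before opening the SR, you could bundle whichever of them exist into a single archive. The following is a minimal sketch, not an official tool: the function name collect_ovs_logs and its arguments are my own, and the per-guest qemu-dm logs are left out because their names depend on the pid.

```shell
# Bundle the OVM Server logs listed above into one compressed archive.
# Usage: collect_ovs_logs <root-dir> <output.tar.gz>
# <root-dir> is normally "/" (hypothetical helper, not part of OVM).
collect_ovs_logs() (
    root="$1"
    out="$2"
    set --    # reuse the positional parameters as the list of files found
    for f in var/log/ovs-agent.log var/log/osc.log \
             var/log/ovm-consoled.log var/log/ovmwatch.log \
             var/log/xen/xend.log var/log/xen/xend-debug.log \
             var/log/xen/xen-hotplug.log var/log/messages; do
        if [ -e "$root/$f" ]; then set -- "$@" "$f"; fi
    done
    if [ "$#" -eq 0 ]; then
        echo "no log files found under $root" >&2
        return 1
    fi
    # -C makes the paths inside the archive relative to <root-dir>
    tar -czf "$out" -C "$root" "$@"
)

# Example, run as root on an OVM server:
#   collect_ovs_logs / /tmp/ovs-logs-$(hostname).tar.gz
```

Files that are missing on a given server are simply skipped, so the same call works on hosts where some components never logged anything.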

Oracle VM Server Command-Line Tools
Besides tailing ovs-agent.log, you can also use the following Xen diagnostic commands

Command  Purpose
--------- ----------------------------------------------------------
xentop   Displays real-time CPU/Mem information about OVM Server and domains
xm dmesg Displays log information on the hypervisor.
xm log   Displays log information of the OVM Server daemon.

OVM manager

The main log file is AdminServer.log, which is rotated and therefore exists in several numbered versions.

Directory                                                                  File
-------------------------------------------------------------------------- --------------------------
/u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/ AdminServer.log
                                                                           access.log
                                                                           AdminServer-diagnostic.log

You can also use an OVM tool called OvmLogTool.py to generate an error summary log based on all AdminServer.log versions.

[root@ovm-manager01 bin]# cd /u01/app/oracle/ovm-manager-3/ovm_tools/bin
[root@ovm-manager01 bin]# python OvmLogTool.py -s -o summary
processing input file: /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer.log00001
processing input file: /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer.log

[root@ovm-manager01 bin]# ll summary
-rw-r--r--. 1 root root 985903 Jul 25 06:25 summary


SR related diagnostic logs

Oracle Support will probably ask you for the following logs right after you open the SR.

SOSREPORT

This report collects diagnostic and configuration information from the OVM Manager host and all linked OVM servers,
e.g. rpm versions, syslog, network configuration, filesystems, disk partition details, loaded kernel modules, and the status of all services.

[root@em1 ~]# sosreport -v
sosreport (version 3.4)
Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [em1]:
Please enter the case id that you are generating this report for []: 3-236xxxxx

Setting up archive ...
Setting up plugins ...
Running plugins. Please wait ...
Running 94/94: yum...
Creating compressed archive...
Your sosreport has been generated and saved in:/var/tmp/sosreport-em1.3-236xxx-YYYYMMDD.tar.xz

Note: sometimes the report fails to run on one of the OVM servers. In that case, just run it separately on each OVM server.

VMPINFO Diagnostic Tool For Oracle

VMPinfo is a script that collects diagnostic information for OVM, including the OVM Manager and all linked Oracle VM Servers. The script already runs sosreport, so it is enough to run VMPinfo alone, without a separate sosreport.

[root@em1]# cd /u01/app/oracle/ovm-manager-3/ovm_tools/support/
[root@em1 support]# ./vmpinfo3.sh --username=admin listservers
Enter OVM Manager Password:
The following server(s) are owned by this manager: ['ovm-01', 'ovm-02']

[root@ovm-manager01 support]# ./vmpinfo3.sh --username=admin

Enter OVM Manager Password:

Gathering files from all servers. This process may take some time.
Gathering OVM Model Dump files
Gathering sosreport from ovm-01
Gathering sosreport from ovm-02
Data collection complete
Gathering OVM Manager Logs
Clean up metrics
Copying model files
Copying DB backup log files
Running lightweight sosreport
Archiving vmpinfo3-20210725-073817

=======================================================================================
Please send /tmp/vmpinfo3-3.4.6.2424-20210725-073817.tar.gz to Oracle support
=======================================================================================


Check for Database Corruption in OVM Manager

OVM Manager operations and metadata are stored in a MySQL repository database. The database is automatically backed up daily, but if any corruption is detected the backups stop, and changes made after the last backup cannot be recovered in case of a crash.

[root@em1 mysql]# cd  /u01/app/oracle/mysql/data
[root@em1 data] # cat my.cnf |grep log
log-error=/u01/app/oracle/mysql/data/mysqld.err
innodb_log_group_home_dir=/u01/app/oracle/mysql/data
innodb_log_buffer_size=256M
innodb_log_file_size=768M
innodb_flush_log_at_trx_commit=2
innodb_log_files_in_group=2
--
[root@em1 mysql]# tail mysqld.err


Consistency check

1. Look for ONF (Object Not Found) errors in OVMM DB.

[root@em1 ~]# /usr/bin/ovm_shell.sh -u admin
    Password:
    OVM Shell: 3.4.4.1709 Interactive Mode
    --- Run below commands in this order in OVMM shell prompt.
    >>> om = OvmClient.getOvmManager()
    >>> f = om.getFoundryContext()   
    >>> f.fixupScan()                
    [11509]   <--- If the result is not empty then there is corruption and inconsistencies in OVMM DB.

2. Check whether the daily AutoFullBackup DB backups have stopped working.

[root@em1 ~]# ls -ldt /u01/app/oracle/mysql/dbbackup/AutoFullBackup*
/u01/app/oracle/mysql/dbbackup/AutoFullBackup-20210722_222459 <--- must be <=24h old
/u01/app/oracle/mysql/dbbackup/AutoFullBackup-20210721_222433

3. Look for Object Not Found (ONF) and Cluster is null errors in OVM Manager Admin Server logs.

[root@em1 ~]# egrep -iR "cluster is null|ObjectNotFound|inconsistencies" /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/AdminServer*
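
Checks 2 and 3 above lend themselves to a small script if you want to run them regularly. What follows is only a sketch under a few assumptions: the backup directories are named AutoFullBackup-YYYYMMDD_HHMMSS as in the listing above, GNU date is available (as on Oracle Linux), and both function names are made up for this post.

```shell
# Age in whole hours of the newest AutoFullBackup-YYYYMMDD_HHMMSS directory.
# Usage: newest_backup_age_hours <backup-dir> [now-as-epoch-seconds]
newest_backup_age_hours() (
    dir="$1"
    now="${2:-$(date +%s)}"
    newest=""
    # Globs expand in sorted order and the names embed the timestamp,
    # so the last directory seen is the most recent backup.
    for p in "$dir"/AutoFullBackup-*; do
        if [ -d "$p" ]; then newest="$p"; fi
    done
    if [ -z "$newest" ]; then
        echo "no AutoFullBackup directory found in $dir" >&2
        return 1
    fi
    stamp=${newest##*AutoFullBackup-}    # e.g. 20210722_222459
    iso=$(printf '%s\n' "$stamp" |
          sed 's/^\(....\)\(..\)\(..\)_\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/')
    # "date -d" is GNU date syntax; the stamp is read as local time.
    epoch=$(date -d "$iso" +%s) || return 1
    echo $(( (now - epoch) / 3600 ))
)

# Number of matching lines per pattern across all AdminServer* files.
# Usage: count_corruption_hints <log-dir>
count_corruption_hints() (
    logdir="$1"
    for pat in "cluster is null" "ObjectNotFound" "inconsistencies"; do
        # grep -i ignores case, -c counts matching lines; "|| true" keeps
        # a zero count from being treated as a failure.
        n=$(cat "$logdir"/AdminServer* 2>/dev/null | grep -ic "$pat" || true)
        printf '%s: %s\n' "$pat" "$n"
    done
)

# Example, run as root on the manager host:
#   newest_backup_age_hours /u01/app/oracle/mysql/dbbackup
#   count_corruption_hints /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs
```

An age above 24 hours or any non-zero count is not proof of corruption on its own, but it is a good reason to go through the consistency checks above by hand.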


Action: If corruption is confirmed, perform an OVM Manager DB rebuild (regeneration) as described in Doc ID 2038168.1.


Conclusion

This was just a preview of what you might look at when troubleshooting an OVM environment. In my experience, the most common issues are related to job locks or network stack hiccups. For the latter, Support will also ask you to run commands like “netstat -ltnp”, “nc -zv” or “tcpdump” between the manager and the OVM hosts. In the next post we will cover the backup and recovery tools available for OVM.

Thank you for reading
