Saturday, December 23, 2023

What AutoUpgrade won't catch for you when moving to 19c, Part 1: Ghost OLAP


Intro


So far, I have used Oracle AutoUpgrade many times on 3 different operating systems. Yet just when you think you’ve seen it all and reached the highest confidence level, another upgrade will come and bite you in the butt. The truth is that every maintenance operation on any software or platform is unique, and Oracle databases are no exception.
Automation will not solve all your problems, which means organizations will still need humans when things get nasty.


This is my last series on AutoUpgrade troubleshooting, as I anticipate working less on databases in the near future, but I wanted to document a few fixes to save your production upgrade from blowing up. Enjoy!


AutoUpgrade is still the best

Despite some issues, AutoUpgrade remains the best option for upgrading databases to 19c. It’s easy to agree after reviewing the methods available to upgrade/migrate to 19c listed in this migration white paper.


The environment

In my case, I needed to migrate my 12c CDB to 19c, while preserving the Data Guard setup & reducing downtime. 

Platform       Source CDB database (SI)       Target CDB (SI)                 Grid/ASM   Data Guard
Linux RHEL 8   12.1.0.2 Enterprise Edition    19.17.0.0 Enterprise Edition    Yes        Yes


AutoUpgrade

19c jdk              autoupgrade.jar
1.8.0_201     Build.version 22.4.220712 


The Upgrade strategy


While the upgrade process itself isn't covered here, I’ll mention the steps required to reproduce our AutoUpgrade in a Data Guard protected environment. If you want to look further into the steps, check out the excellent article by Daniel Overby Hansen called How to Upgrade with AutoUpgrade and Data Guard.

Overview of upgrade with a data guard
Prerequisites

The following is assumed to be already completed on both primary and standby hosts:

  • Install and patch a new 19c Oracle Database home to the latest RU

  • Install and patch a new 19c Grid Infrastructure home to the latest RU

  • Upgrade the existing 12c Grid Infrastructure to the new one (19c)

The steps

  • Stop Standby Database

  • Upgrade the primary DB

    • Run AutoUpgrade with the appropriate Config file [analyze, fixups, deploy]

  • After Upgrade

    • Restart Data Guard

      • Update the listener and /etc/oratab on the standby host.

      • Upgrade the DB by updating the Oracle Home information (srvctl upgrade database)

      • Re-enable Data Guard

      • Update RMAN catalog to new 19c client’s version

Reproduce the issue

After running AutoUpgrade in analyze mode and clearing all warnings from the prechecks, the deploy unfortunately crashed.


The Configuration

-  The config file shown below can defer redo transport and stop the Data Guard broker automatically if it is in use.

#Global configurations
global.autoupg_log_dir=/u01/install/Autoupgrade/UPG_logs
###################
# Database number 1
###################
upg1.sid=PROD
upg1.source_home=/u01/app/oracle/product/12.2.0.1/dbhome_1
upg1.target_home=/u01/app/oracle/product/19.0.0/dbhome_1
upg1.log_dir=/u01/install/Autoupgrade/UPG_logs/PROD
upg1.run_utlrp=yes
upg1.source_tns_admin_dir=/u01/app/oracle/product/12.2.0.1/dbhome_1/network/admin
upg1.timezone_upg=yes
upg1.restoration=yes
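A mistyped prefix in the config (say, `upgd1` next to `upg1`) is easy to miss, and as far as I understand AutoUpgrade reads each prefix as a separate database entry. Here is a minimal Python sketch that groups the `prefix.key=value` lines by prefix and reports entries that look incomplete. The function names and the list of "required" keys are assumptions made for illustration; this is not part of AutoUpgrade.

```python
from collections import defaultdict


def group_by_prefix(config_text):
    """Group 'prefix.key=value' lines by prefix, skipping comments and blanks."""
    groups = defaultdict(dict)
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        prefix, _, key = name.strip().partition(".")
        groups[prefix][key] = value.strip()
    return dict(groups)


def prefixes_missing(groups, required=("sid", "source_home", "target_home")):
    """Return {prefix: [missing keys]} for every non-global prefix.

    The 'required' tuple is an assumption for this sketch.
    """
    report = {}
    for prefix, params in groups.items():
        if prefix == "global":
            continue
        missing = [key for key in required if key not in params]
        if missing:
            report[prefix] = missing
    return report
```

Feeding it the text of a config file and getting anything other than an empty report would be a hint to re-read the prefixes before launching the deploy.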

1. AutoUpgrade analyze

$ java -jar autoupgrade.jar -config UP19_PROD.cfg -mode analyze

2. AutoUpgrade deploy

The environment was good to go, so I launched the deploy phase.

$ java -jar autoupgrade.jar -config UP19_PROD.cfg -mode deploy
... An hour later
upg> lsj
+----+-------+---------+---------+-------+--------------+--------+------------+
|Job#|DB_NAME| STAGE |OPERATION| STATUS| START_TIME | UPDATED| MESSAGE |
+----+-------+---------+---------+-------+--------------+--------+------------+
| 110| PROD |DBUPGRADE|STOPPED | ERROR | Nov 02 16:42 | |UPG-1400 |
+----+-------+---------+---------+-------+--------------+--------+------------+
upg>
----------------------------------------------
Errors in database [PROD-MYPDB1] Stage [DBUPGRADE]
Operation [STOPPED] Status [ERROR]
Info [ Error: UPG-1400 UPGRADE FAILED [FSUAT]
Cause: Database upgrade failed with errors
REASON: ORA-00604: error occurred at recursive SQL level 1


The OLAP Error

The upgrade phase never finished: most of the PDBs ran into issues halfway through the upgrade (incomplete catalog).


This is just an example of one of the errors received by most of the PDBs in the source 12c CDB:
ORA-00604


When I checked the line reported in the error in the catupgrd log file, I found the excerpt below.


Looking at the status of the components on the impacted PDBs, we can see that OLAP API is INVALID.


Furthermore, if we check the plug-in violations for these PDBs, we find 2 OLAP culprits:

NAME     CAUSE    MESSAGE                                                 STATUS
-------- -------- ------------------------------------------------------- ---------
MYPDB1   OPTION   Database option APS mismatch: PDB installed version     PENDING
                  19.0.0.0.0. CDB installed version NULL.
MYPDB1   OPTION   Database option XOQ mismatch: PDB installed version     PENDING
                  19.0.0.0.0. CDB installed version NULL.


Explanation


Cause

This occurred because their previous upgrade from 11g to 12c didn't properly remove the deprecated 11g OLAP components from the PDBs after the conversion to multitenant. Refer to the preupgrade run note below.


Indeed, looking back at the AutoUpgrade prechecks, we can see that most PDBs have the APS (Analytic Workspace) and XOQ (OLAP API) components present but marked as ‘OPTION OFF’.


Expectation


This is where AutoUpgrade should come in and flag this sort of issue as critical right from the early analyze stage, to help DBAs avoid a crash during a production upgrade. Opening an SR at that point is already a loss in terms of planned downtime.


Solution


We’ll have to manually clean up the OLAP remnants before resuming the upgrade, as described in Doc ID 1940098.1.

I have gathered all the commands, in sequence, in 2 scripts: olap_remove.sql and remove_olap_leftovers.sql.

$ vi olap_remove.sql

col  name new_val pdb_name  noprint
select name from v$pdbs;
spool &pdb_name..log

prompt  ----> Remove OLAP Catalog
@?/olap/admin/catnoamd.sql
prompt  ----> Remove OLAP API
@?/olap/admin/olapidrp.plb
@?/olap/admin/catnoxoq.sql

prompt  ----> Deinstall APS - OLAP AW component
@?/olap/admin/catnoaps.sql
@?/olap/admin/cwm2drop.sql

prompt  ----> cleanup leftovers and Recompile invalids
@remove_olap_leftovers.sql
@?/rdbms/admin/utlrp.sql
spool off

Run the cleanup script for each PDB 

SQL> alter session set container=MYPDB1;
SQL> @olap_remove
SQL> @remove_olap_leftovers.sql
SQL> alter session set container=MYPDB2;
SQL> @olap_remove
SQL> @remove_olap_leftovers.sql

------ Repeat for all PDBs
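With many PDBs, typing the same two commands over and over gets error-prone. A minimal Python sketch, assuming you already have the list of PDB names at hand, can generate a SQL*Plus driver script instead. The function name and the output file name are hypothetical; the sketch only calls olap_remove, since that script already runs remove_olap_leftovers.sql at its end.

```python
def build_driver_script(pdb_names, cleanup_script="olap_remove"):
    """Return SQL*Plus text that switches to each PDB and runs the cleanup script."""
    lines = []
    for pdb in pdb_names:
        lines.append(f"alter session set container={pdb};")
        lines.append(f"@{cleanup_script}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical PDB names and output file, for illustration only.
    script = build_driver_script(["MYPDB1", "MYPDB2"])
    with open("run_olap_cleanup.sql", "w") as handle:
        handle.write(script)
    print(script)
```

The generated file can then be reviewed and run once from the CDB root in SQL*Plus.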


Checks

Once the removal is performed, we should verify that no conflicting OLAP issue is left in the environment.

------ Repeat for all PDBs

1. The status of the components
SQL> select COMP_ID, COMP_NAME, VERSION, STATUS from dba_registry;
---- No OLAP component should be listed (Valid 19c options only)

2. Confirm there are no violations remaining
SQL> SELECT name, cause, message,status
FROM pdb_plug_in_violations
where STATUS != 'RESOLVED' ORDER BY time;

Resume the job

upg> resume -job 110

After this, AutoUpgrade completed successfully and the standby database was re-enabled as expected in the remaining steps.


CONCLUSION

  • While automation tools like AutoUpgrade are powerful, they can't predict and fix all potential bottlenecks.

  • Staying vigilant and utilizing troubleshooting skills remains crucial.

  • However, integrated flagging of known issues would go a long way toward improving the DBA user experience and fostering greater adoption in the future.


   
Thanks for reading  

Friday, December 8, 2023

Tech Content creator toolkit: the cheat sheet

Intro

Some of you are probably at a stage where daily work isn’t enough, and interacting with the tech community is the fulfilling mission you’ve been waiting for. But in order to do so, you need to level up your game. Whether your goal is to be a blogger, speaker, podcaster, or social media maverick, you need the right tools to make your creator’s life easy and your content pop! This cheat sheet, curated while I was looking for better ways to create content, won't fix all your problems, but it'll set you up for a solid creative start.

The tips are arranged by the persona they target (blogger, speaker, podcaster, social media creator).



I. Blogger


A. The platform

That’s the first stop.

1.WordPress

  Almost everybody knows about this 20-year-old web/blogging platform. The following options are available:

  • Self-hosted: Download and install it yourself in an environment shipped with PHP and a MySQL DB

  • WordPress.com: Has paid and free versions that you can start with directly

    • You can start a free website or blog today with everything you need to grow, including:

      • Themes and patterns, SEO, site statistics, social media sharing, built-in newsletters & RSS, brute-force protection, monetization, and spam protection

  2. BlogSpot

   Another OG, BlogSpot (aka Blogger) is a super simple free platform acquired by Google in 2003

  • It provides full control over page format (font size etc.) through its HTML editor

  • Integrated with Google Analytics to track user activity, engagement and more (see tutorial)
     


3. The new kids on the block

Each decade brings its own hype; just as Myspace was once cool, today's tech creators prefer different gems to share stuff. Medium, as one of the first online publishing services, has always been in the lead, but you can see the new kids on the block are steadily climbing. Keep in mind that Medium caters to all topics, whereas the others are mainly for software developers. *This data was sourced from https://trends.google.com/.*

Here is a brief description of each one of them (for a deep-dive comparison, check this article)

 
  3.1 DEV   

   dev.to is the largest online community for software developers that offers a platform for sharing 
   knowledge, with no paywalls or ads, but instead relies on revenue from sponsors, listings, and their shop.  
   They provide a Markdown text editor and a public API for automating publishing workflows.
    Typical use: Exchange knowledge and experience with the largest developer community.
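To give an idea of what automating a publishing workflow through the DEV API can look like, here is a minimal Python sketch that only builds (and does not send) a request creating a draft article. It assumes the `POST https://dev.to/api/articles` endpoint with an `api-key` header as described in the public Forem API docs; the function name and sample values are made up, so check the current API reference before relying on it.

```python
import json
import urllib.request

DEV_API_URL = "https://dev.to/api/articles"  # endpoint assumed from the Forem API docs


def build_draft_request(api_key, title, body_markdown, tags=()):
    """Build (but do not send) a request that would create an unpublished draft."""
    payload = {
        "article": {
            "title": title,
            "body_markdown": body_markdown,
            "published": False,  # keep it a draft until reviewed on the site
            "tags": list(tags),
        }
    }
    return urllib.request.Request(
        DEV_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be a matter of passing the returned object to `urllib.request.urlopen`, with a real API key generated from your DEV account settings.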

   
   3.2 Medium

     Medium is a popular publishing platform that allows all kinds of writers to share and monetize their content
     with a large reader base. While it’s user-friendly for most, it may not be suitable for developers due to
     the lack of Markdown support, syntax highlighting, & API integration.
    Tip: A friend covered the $5/month membership fee with just 3 posts submitted to a specific publication.


       Typical use: To write about diverse topics, monetise your work, and reach an audience faster.

   3.3 Hashnode

    Hashnode is a free blogging platform and community for developers that allows you to publish articles on
    your own domain with a custom blog page. It offers customization options, including a custom CSS
    feature, and supports Markdown for code embeds and syntax highlighting.

  
   Typical use: to completely customise your blog page to represent your brand and link your own domain.

  
  3.4 Hackernoon

     Hackernoon is a retro-looking platform covering topics like software development, startups, AI, and crypto.
     They moved away from being a Medium publication due to limitations in embedding tables and the lack of syntax
     highlighting. If your post becomes a top story, you’ll be lucky enough to get it translated into 13 languages.


    Typical use: work with a pro team of editors & post to a platform accepting only high-quality content.


B. The Write up

The following options will help you craft, enrich and format your content:



1. Open Live Writer

Open Live Writer is an open source editor that lets users author, edit, and publish blog posts. It’s based on a fork of the old, discontinued Windows Live Writer code. Open Live Writer works with many popular blog service providers such as WordPress, Blogger, TypePad, Movable Type, DasBlog and many more.

         

I am actually writing this blog using Open Live Writer.

Note: Direct picture upload from your workstation to Google BlogSpot will fail, but you can paste the image into a GitHub README and then copy it back into the Open Live Writer editor.



2. ChatGPT

There’s no shame in leveraging LLMs to structure and refine your writing, especially for grammar, intros, and conclusions.
I sometimes catch myself doing funny things like asking GPT to rewrite my sentences as Tony Soprano.


3. Generate your Posters using AI

  It’s easier to be creative nowadays, using prompts on GenAI tools to design a better illustration for your topic.

  Don’t try to copy others though, as your post might lose the audience. Stick to relevant stuff.

  • Midjourney: the best quality, but a paid-only option (from $8/month)

  • DALL·E 3: the OpenAI image generator, available for free through its integration in the Microsoft platform

  • Limitation: text rendered inside the images is often inaccurate and flaky, which requires a manual edit.


4. Photopea.com, Photoshop for the poor

Photopea is an online photo editor that lets you edit photos, apply effects and filters, add text, and crop or resize pictures.
It does almost everything Adobe Photoshop does, but in a web browser, which is insane.


For example, I can correct the flaky text in an AI-generated image by rewriting it in Photopea.

5. Carbon, Beautiful code snippet images

Carbon is an online tool to create nice images of your code snippets. You just need to type and your code will be highlighted according to the chosen language (80 programming languages supported).




II. Speaker


A. The Write up

Same as the one discussed above in the blogger section.


B. PowerPoint

I’ll highlight AI and non AI tools in this section

                             

1. Copilot (Windows 11)

If you are lucky enough to have upgraded to Windows 11, Copilot is your new AI creative friend!


Copilot in PowerPoint is an AI-powered assistant that boosts your creativity in your slides. It helps you create, summarize, and organize your presentation, and suggests the best design based on your content.

  1.1 Create presentation from file

   With Copilot in PowerPoint, you can create a presentation from an existing Word document by adding the file link,
   and it will generate slides, apply layouts, and choose a theme for you.


   1.2 Create a presentation from a prompt

     You can also create a new presentation using Copilot based on any prompt > “Create a presentation about xx”.


2. Convert to/from PDF

Sometimes you need to export the final slides to PDF for your audience, or import pages from a PDF white paper.

My go-to website is www.ilovepdf.com as it’s 100% FREE and easy to use.


3. Multilingual presos 

If you are bilingual, or live in a bilingual area like me, translation is part of the job. The go-to platform is obviously Google Translate, but some of you might not know that it can translate whole files (Doc, PDF, PPT, Excel, etc.).

It can even translate images for you! These options will save you manual work.


4. Split the animations into a PDF

   PPspliT

  Preserving the animation steps when converting a PPT to a PDF is important when sharing your deck.
   PPspliT is a slick PowerPoint add-in that splits those animations into several slides before the export.


III. Podcaster  

Though I had to catch up with this one, there are a bunch of things I can share from what I’ve learned so far.


A. Podcast platform

I will share 3 in total, but my favorite is definitely Anchor. For a larger list and comparison, check this article.

1. Anchor

  anchor.fm is what I am using; it is absolutely free even after being snapped up by Spotify. It’s been awesome so far.

  • Excellent for beginners and great for Spotify users

  • Absolutely free and great support (very quick)

  • Unlimited storage: no limit on the number of episodes you upload

  • Provides tools for editing and recording podcasts, as well as access to Spotify tunes

  • Seamless video podcasting


2. Podbean

    Podbean is another popular free podcasting platform for hosting.

  • Great for businesses and enterprises

  • Excellent promotion tools

  • Provides a website

  • Only up to 5 hours of audio and up to 10 episodes on the free plan


3. Acast

   Acast is similar to Anchor and offers quite a lot of features in its free plan, such as creating a website and a podcast player.

  • Unlimited episodes and bandwidth.

  • Basic analytics and marketing tools (e.g. transcription).

  • Video teaser and YouTube distribution.

  • Social Media Management


B. Streaming platform

This can be very diverse depending on your taste but you can choose from options such as:

1. Zoom

No introduction needed

2. StreamYard

    StreamYard is my favorite for audio quality and for the following reasons:

  • Local audio/video recordings (host/guests), so no loss due to weak internet

  • Will do the EQ and noise reduction for you.

  • A trimmer tool to edit and post clips (YouTube Shorts)

  • Live streaming (Twitch, YouTube, Facebook, X) and screen sharing

  • 24/7 live support and a generous free plan (5 hours/month)

3. Riverside

Riverside.fm offers studio-quality local recordings without the downtime

  • Screen sharing

  • 4k quality available

  • Live audience call-in: Listeners can call in with questions and comments, like a radio talk show

  • Streaming to social media

4. Zencastr

Zencastr is another streaming alternative that can get the job done, but it has some flaws.


5. OBS + VDO.ninja
      OBS + VDO.ninja is a powerful combination that offers many benefits for podcasters and live streamers.

  • OBS: Open Broadcaster Software is a free/open-source software for recording & live streaming videos.

  • VDO.ninja: is a free, secure, & ultra-low latency peer-to-peer video bridge that allows users to bring live video from their computer, or friends directly into OBS.


Trade-offs: Here are some of the advantages and trade-offs of using OBS + VDO.ninja:

C. Audio editing

     Editing is another crucial side of podcast recording, as the raw version is often not that clean.

   1. Audacity

     Audacity is a free and open-source tool that can edit, and mix audio for you with basic features like noise gate.

   2. Adobe Audition

     Adobe Audition is a professional audio workstation that lets you do refined work using the industry’s best tools, but
     it’s expensive.

   3. Cubase (Steinberg)

      I personally use Cubase, which is a professional DAW (digital audio workstation) used by musicians, composers and producers. The only reason I use it is because of my previous life in the underground music industry.
   

  Tip: Streamyard will do the eq and noise reduction for you.


D. Video editing YouTube 

1. Veed.IO

 VEED.IO is an online video editing platform offering a range of features to help create quality video content.


Some of the advantages of VEED.IO for podcast video editing include:

  • Effortless podcast editing

  • Pro video editing features

  • AI-powered audio editing

  • The perfect feature for me was the dynamic Sound Wave

IV. Social Media  


A. Text format

     When you post on Twitter or LinkedIn, you often want to highlight text or use bold characters, but it’s not trivial.
     I usually go to yaytext.com and generate all the bold and italic text I need.
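If you are curious how such "bold" text can work on platforms with no formatting, one common trick is to swap plain letters for look-alike characters from the Unicode "Mathematical Bold" block. The Python sketch below illustrates that idea; it is an assumption about the general technique, not a description of how yaytext.com is actually implemented, and the function name is made up.

```python
def to_bold(text):
    """Map ASCII letters and digits to Unicode Mathematical Bold characters."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D400 + ord(ch) - ord("A")))  # bold capitals
        elif "a" <= ch <= "z":
            out.append(chr(0x1D41A + ord(ch) - ord("a")))  # bold small letters
        elif "0" <= ch <= "9":
            out.append(chr(0x1D7CE + ord(ch) - ord("0")))  # bold digits
        else:
            out.append(ch)  # punctuation, spaces, accents stay unchanged
    return "".join(out)


if __name__ == "__main__":
    print(to_bold("Upgrade to 19c today!"))
```

Keep in mind that screen readers and search engines may not treat these characters as regular letters, so use them sparingly.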
 

B. Scheduling platforms

   1. Typefully

      Typefully is a great tool for scheduling social media posts.

  • It offers a generous free plan

  • Allows you to write, schedule, and publish great tweets and threads.

  • Provides analytics and metrics about your account that help grow your following

  • Use AI to rewrite & improve your tweets

  • My favorite is the tweet preview option that’s handy to check before you tweet

Other services with a Free plan

  • Simplified: social media management tool, with insights on performance and engagement.

  • Buffer: social media management tool that experts often turn to for scheduling LinkedIn posts in advance.


Conclusion

  • In one line: The content creator journey is long, but I hope this article helps!

Thanks for reading


OCI FortiGate HA Cluster - Reference Architecture: Code review & Fixes

Intro


OCI Quick Start repositories on GitHub are collections of Terraform scripts and configurations provided by Oracle. These repositories are designed to help organizations quickly deploy common infrastructure setups on the OCI platform.
Each Quick Start focuses on a specific use case or workload, which simplifies provisioning on OCI using Terraform: a sort of IaC-based reference architecture.


Today, we will code review one of those reference architectures, a Fortinet firewall solution deployed in OCI.
Note: This article won’t discuss the architecture, but will rather address its Terraform code flaws and fixes.



Why do some errors never reach your OCI Resource Manager stack?


  • Certain Terraform errors may not reach your RM stack due to its design. For instance, RM allows the hardcoding of specific variables, like availability domains, directly in its interface. This sidesteps the need for these variables to be checked by native conditions in the TF code.

  • Moreover, RM reads these variables from the schema.yaml file, altering the behavior compared to local Terraform CLI execution. This approach can result in certain errors being handled or bypassed within the RM environment, creating a distinction from standard Terraform workflows.



The stack: FortiGate HA Cluster using DRG - Reference Architecture


The stack is a result of the collaboration of both Oracle and Fortinet. This architecture is based on a Hub & Spoke topology, using FortiGate firewall from OCI Marketplace. I actually deployed it while working on one of my projects.


For details of the architecture, see Set up a hub-and-spoke network topology.


The repository


You will find this Terraform config in the main oci-fortinet GitHub repository, but not in the root directory.



The Errors


At the time of writing, the errors were still not fixed despite my opening issues and sharing the fixes. You can see that the last commit dates back 2 years. You will need to clone the repo and cd to the drg-ha-use-case subdirectory.

$ git clone https://github.com/oracle-quickstart/oci-fortinet.git

$ cd use-cae/drg-ha-use-case

$ terraform init


1.  Data source error in Regions with unique AD

  

You will face this issue in a region with only one availability domain (e.g. ca-toronto-1), as the availability domain data source lookup will fail the Terraform execution plan.


CAUSE:  See issue #8

  • In the above error, Terraform complains about the availability domain data source having only one element

  • This impacts 2 of the “oci_core_instance” resource blocks (2 web-vms, 2 db-vms).

  •  Problem?

    • On single-AD regions, the data source list contains 1 element, so the only valid index is 0.
      See data_source.tf line 8-10. This configuration hasn’t been tested in single-AD regions.

      $ vi data_source.tf

      # ------ Get list of availability domains
      8  data "oci_identity_availability_domains" "ADs" {
      9    compartment_id = var.tenancy_ocid
      10 }



  • Reason:

    • In Terraform, count.index always starts at 0: if you have a resource with a count of 4, count.index will be 0, 1, 2, and 3.

    • Let’s take for example the "web-vms" oci_core_instance block in compute.tf > line 235

    • If we evaluate the condition:
      - The variable availability_domaine_name is empty
      - The ADs data source length = 1 element. That means the AD name will be looked up in the
      ADs data source collection with an index value of [0+1] = 1

    • data…ads.availability_domains[1] doesn’t exist, as the collection only contains 1 element
       

Solution 

Complete the full availability domain conditional expression on line 235 and line 276 (web-vms/db-vms)

  • Add the case where data source ads.availability_domains has 1 element (the region has one AD only)



Bad logic 

Seeking the name of the count.index+1 availability domain is still wrong when the region has more than 1 AD

  • Example: say you want to create 3 VMs and your region has 2 availability domains (>1).

    • The first iteration [0] sets count.index+1 = 1 (2nd data source element = AD2)

    • The second iteration [1] sets count.index+1 = 2 (3rd data source element = AD3, which doesn’t exist)

    • The 2nd and 3rd iterations will always fail because there are only 2 ADs (index list [0,1]).
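To make the index arithmetic above concrete, here is a small Python simulation (not Terraform) of which AD each VM would land on. It compares the `count.index + 1` lookup with a wrap-around lookup using a modulo. The modulo variant is only one possible fix, assumed here for illustration and not necessarily the one proposed in the GitHub issue; in Terraform the same idea could be expressed with the `%` operator and `length()`. The AD names and function names are made up.

```python
def plus_one(count_index, ad_count):
    """Index used by the problematic lookup: count.index + 1."""
    return count_index + 1


def wrapped(count_index, ad_count):
    """Alternative lookup: wrap around the available ADs."""
    return count_index % ad_count


def pick_ads(vm_count, ad_names, index_fn):
    """Return the AD picked for each VM, or None where the index is out of range."""
    chosen = []
    for count_index in range(vm_count):
        idx = index_fn(count_index, len(ad_names))
        chosen.append(ad_names[idx] if idx < len(ad_names) else None)
    return chosen


if __name__ == "__main__":
    print(pick_ads(3, ["AD1", "AD2"], plus_one))  # 2 ADs, 3 VMs
    print(pick_ads(2, ["AD1"], plus_one))         # single-AD region
    print(pick_ads(3, ["AD1", "AD2"], wrapped))   # wrap-around lookup
```

Every `None` in the output corresponds to an "invalid index" failure at plan time, while the wrap-around version always returns an existing AD regardless of how many ADs the region has.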



2. Wrong compartment argument in the security list data sources

  

Another issue you will run into is a failure to deploy subnets due to data source collection being empty (no element).


CAUSE:  See issue #9 

  • In the above error, Terraform complains that the {allow_all_security} data source is empty

    • This impacts all FortiGate subnet blocks in the config, as they all share the same security lists.

Reason:

  • In this configuration there are 2 compartments, one for compute and another for network resources

  • If you take a look at the "allow_all_security" block in datasource.tf > line 64-74

  • You’ll notice a wrong compartment ID in the security lists data source (compute instead of network)


  

    Solution 
     

    This was a silly mistake, but it took me a day to figure out while digging through a pile of new Terraform files.

    All you need to do is replace the compute compartment variable with var.network_compartment_ocid

    Edit datasource.tf line 64-74

    # ------ Get the Allow All Security Lists for Subnets in Firewall VCN

    data "oci_core_security_lists" "allow_all_security" {
      compartment_id = var.network_compartment_ocid    # <--- CORRECT compartment
      vcn_id         = local.use_existing_network ? var.vcn_id : oci_core_vcn.hub.0.id
    ...


    3. More code inconsistencies


    I wasn’t done debugging, as I found other misplaced compartment variables in some VNIC attachment data sources

    • See datasource.tf: lines 103-115 & 118-130; you need to replace them with var.compute_compartment_ocid



    Conclusion & recommendations

    • This type of undetected code issue is why I never trust the first deployment in Resource Manager.
      To avoid problems in the future, especially if you decide to migrate out of RM at some point, I suggest the following workflow:

      1. Run locally and fix any code bugs

      2. Run on Resource Manager

      3. Store in a git repo (blueprint with eventual versioning)

    • I hope this was helpful, as the issues I opened in their GitHub repo have been unresolved for over a year.



    Wednesday, November 29, 2023

    Cloud Showdown: Bare Metal vs. VMs in OCI – Pros & Cons

    Intro


    The migration journey to the cloud for a business comes in different shapes and colors. Today, we’ll explore a quick comparison between bare metal and VM platforms, which are two IaaS compute options available in Oracle Cloud Infrastructure. Although specific to OCI, you might find similar benefits and trade-offs in other cloud platforms.


    In this short post, we will revisit what the VM platform has to offer compared to the bare metal option, and recall where the bare metal offering still makes sense.



    Why Opt for Virtualized Platforms over Bare Metal? 


    At present, your organization might utilize bare metal servers to support your critical applications. While BM servers offer high performance and dedicated resources, there are several compelling advantages to migrating to VM-based machines within OCI.

    Side note: Broadcom just acquired VMware and decided to split it, which brings a lot of uncertainty to its customers and partners. So you might as well consider your options.    

    Here’s a small list:

      I. Enhanced Agility


      With VM-based machines in OCI, you can dynamically scale resources up or down, ensuring optimal performance while maximizing cost-efficiency.

      • High-scale VM provisioning
      • No need to wait for a new physical host to deploy more resources, as VMs can be created by the thousands, with a base CPU power of up to 32 cores for Intel and 64 for AMD.

      • Elastic compute shapes

              Only in VMs can you access flex shapes (Intel/AMD) that allow a custom number of CPUs and memory size
               to fit your specific application needs. Example: high-memory but low-CPU workloads (3 CPUs | 112 GB).


      • You can change your VM’s shape without having to rebuild your instances or redeploy your applications.

      • Extended Memory VMs    


        In May 2023, OCI launched Extended memory VM instances to provide more memory and CPU cores that exceed the amount a single physical socket carries (see table below).
        Supported flex shapes:

        • VM.Standard3.Flex , VM.Standard.E3.Flex & VM.Standard.E4.Flex



        • How does that work?
          The extended VMs are given cores and memory across multiple physical sockets.

          However, you should remember to optimize your application layer to be NUMA aware.

          • Extended AMD flex example


          • Extended Intel flex example



      • Block volume performance auto tuning
        Enables Block Volume to adjust the volume's performance between levels you specify, based on the actual monitored performance of the volume: like CPU autoscaling, but for storage. Learn more here.


        How does that work?
         

        • You set the min and max performance based on volume performance units per GB(VPUs/GB)

        • More VPUs will allocate more resources to a volume, increasing IOPS/GB and throughput/GB

        • Block Volume adjusts the performance to the min level as much as possible.

        • As volume load increases, the performance is scaled up as needed, on a best-effort basis.

        • The metrics used to trigger the tuning are

          • Volume throttled operations

          • Volume guaranteed VPUs/GB, IOPS, and throughput

        • Scale to 0: with the detached volume performance auto-tuning feature, Block Volume can even adjust the performance level down to Lower Cost (0 VPUs/GB) when the volume is detached.
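        The behaviour described above can be pictured with a toy model in Python. This is only a sketch of the idea (stay at the minimum, step up under throttling, drop to 0 when detached); it is not OCI's actual algorithm, and the function name, the step of 10 VPUs/GB and the way throttling is measured are all assumptions.

```python
def next_vpus(current, min_vpus, max_vpus, throttled_ops, attached=True, step=10):
    """Toy model of the next VPUs/GB level for an auto-tuned volume."""
    if not attached:
        return 0  # detached-volume tuning: Lower Cost level
    if throttled_ops > 0:
        # Load exceeds the current level: scale up, capped at the configured max.
        return min(max(current, min_vpus) + step, max_vpus)
    # No throttling observed: fall back to the configured minimum.
    return min_vpus


if __name__ == "__main__":
    level = 10
    for throttled in (5, 12, 0):
        level = next_vpus(level, min_vpus=10, max_vpus=30, throttled_ops=throttled)
        print(level)
```

        In this model the cost follows the load: you only pay for the higher VPU levels while the volume is actually being throttled.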



      II. Cost effectiveness

      • Turn off the light service

        You can schedule the shutdown of idle servers when they are not needed (after hours, weekends, etc.) and stop paying for compute to save money (not possible with BM hosts, which stay up even if the underlying VMs are down)

      • Host and hypervisor overhead

        Unlike on BM hosts, the physical and hypervisor layers are taken care of by Oracle Cloud, which leaves a lot more time for your Ops team to focus on application performance and enable the developers.

      • License Compliance Simplified
        Migrating to OCI VM-based machines eliminates the need to pin cores to comply with software license agreements. Oracle provides "Intellectual Property (IP) License Assurance" for VM instances, which means you no longer have to allocate dedicated cores for specific software licenses. This allows you to optimize resource utilization and reduce costs.

      • Bring Your Own License (BYOL)   
        OCI VM-based machines offer the flexibility to leverage your existing licenses through the BYOL program. You can bring your current licenses for Oracle Database, WebLogic Server, and other Oracle products and enjoy cost savings by deploying them on VM instances in OCI. This way, you can maximize your existing investments and minimize licensing costs.
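
        As an illustration of the "turn off the light" idea above, here is a small Python sketch of the scheduling decision only. The business hours (08:00 to 18:00, Monday to Friday) and the function name `should_run` are assumptions; the actual stop/start of the instance would be issued through the console, CLI, or API and is deliberately left out.

```python
from datetime import datetime, time


def should_run(now: datetime,
               start: time = time(8, 0),
               end: time = time(18, 0)) -> bool:
    """True if the instance should be up: weekdays within business hours."""
    is_weekday = now.weekday() < 5        # Monday=0 ... Sunday=6
    in_hours = start <= now.time() < end  # hypothetical business hours
    return is_weekday and in_hours


# 2023-12-22 is a Friday, 2023-12-23 is a Saturday.
print(should_run(datetime(2023, 12, 22, 10, 30)))  # True
print(should_run(datetime(2023, 12, 22, 20, 0)))   # False
print(should_run(datetime(2023, 12, 23, 10, 30)))  # False
```

        A scheduled job could evaluate this check periodically and stop or start the VM when the answer changes.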

      III. VM Infrastructure added value services

      • Optimized Network Virtualization



        OCI's VM-based machines leverage a highly optimized KVM layer that takes full advantage of isolated network virtualization. The network virtualization is separated from the host and hypervisor, ensuring enhanced security and isolation for your applications and data. This architecture provides a robust and reliable foundation for your workloads.

      • Simplified Management and Deployment


        OCI's VM-based machines are fully integrated with Oracle's suite of management and automation tools. This includes all API-based tooling: OCI Console, OCI CLI, and Terraform via Resource Manager.

        These tools simplify provisioning, monitoring, and managing your VM instances, ensuring a seamless migration experience and easing Full stack DR implementation.

      • Enhanced Observability: 

        OCI's VM-based machines have native integration with comprehensive monitoring/observability tools through the Cloud Observability and Management Platform. This platform streamlines logging and offers specialized metrics and insights for WebLogic Server and Oracle Database

        • WebLogic Server Monitoring
          Native monitoring allowing you to track critical metrics of your WebLogic Server instances, such as response times, throughput, JVM memory utilization, and thread pool usage.
          You can set up alerts based on thresholds for these metrics so that you are notified when performance degrades (e.g. response times).

        • Database
          Monitor key database performance metrics, such as CPU usage, memory utilization, I/O latency, and query execution times, from the OCI console, with proactive alerts and logging. OEM is also supported for Enterprise Edition licenses.

      • Native Security Features (not available out of the box on Bare Metal)

        • OS Management Service
          Automates patch management through scheduled patching for your OCI VMs, which keeps your VM instances up to date with the latest OS security patches and reduces the risk of exploitation.

        • Vulnerability Scanning and Security (VSS): 
          Provides comprehensive visibility into the security posture of your VM-based instances.
          VSS scans your instances regularly to identify and report Common Vulnerabilities and Exposures (CVEs) that remain unpatched on the VMs.

        • Audit Capabilities  


          OCI offers built-in audit features that provide comprehensive visibility into the VMs and enable you to track and monitor changes, access, and activities within your environment.

          You can generate audit logs that capture critical events, configuration changes, user authentication, and resource provisioning, allowing you to meet compliance requirements, detect unauthorized activities, and enhance the security of your infrastructure.


      IV. How about Isolation, Compliance for my VMs?

      • Dedicated Virtual Machine Host (Mixed solution)

        What if your company just can’t certify VMs in a multitenant infrastructure due to regulatory reasons, and must comply with isolation and licensing requirements for entire servers (host-based license)?

        Bare metal might be the solution, but you still don’t want the overhead of maintaining the hypervisor layer.

        OCI Dedicated VM hosts address that very issue by letting you run VM instances on dedicated servers, which are single tenant and not shared with other customers.

        • Advantages

          • Simplicity: the entire hypervisor layer is managed & supported by OCI (less overhead)

          • Most OCI VM features are supported: provisioning and managing VMs via the console, API, and CLI

          • A range of dedicated VM host shapes to choose from, such as Intel/AMD and flexible ones

          • Shapes that support flexible hosted VMs billed based on OCPUs & RAM separately.

        • Caveats

          • You are still billed for the entire host upon creation, just like a Bare Metal host

          • Some OCI compute VM features are not supported:

            • Autoscaling, Burstable instances, Capacity reservations

            • Instance shape change, Instance Pools

            • Reboot & live migrations (use manual migration instead)

          • No CPU overcommit possible and less control compared to classic Bare Metal option  
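
        The billing difference between the two models can be sketched with simple arithmetic. All rates below are hypothetical placeholders, not OCI list prices, and the function names are made up for this example. Host capacity limits are ignored.

```python
def flex_vm_hourly(ocpus: int, ram_gb: int,
                   ocpu_rate: float, ram_rate: float) -> float:
    """Flexible VM: OCPUs and memory are priced separately."""
    return ocpus * ocpu_rate + ram_gb * ram_rate


def cheaper_option(n_vms: int, vm_hourly: float, host_hourly: float) -> str:
    """Compare n standalone flex VMs with one dedicated host billed as a whole."""
    # The dedicated host costs the same whether it runs 1 VM or 20.
    return "flex VMs" if n_vms * vm_hourly < host_hourly else "dedicated host"


# Hypothetical hourly rates, for illustration only.
OCPU_RATE = 0.03   # per OCPU
RAM_RATE = 0.002   # per GB of memory
HOST_RATE = 4.0    # whole dedicated VM host

vm_cost = flex_vm_hourly(ocpus=4, ram_gb=64,
                         ocpu_rate=OCPU_RATE, ram_rate=RAM_RATE)
for n in (1, 8, 20):
    print(n, cheaper_option(n, vm_cost, HOST_RATE))
```

        With these made-up numbers the dedicated host only pays off once it is densely packed, which is why it is usually chosen for isolation and licensing reasons rather than for cost.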



      V. BYOL considerations in the cloud


      There are a few things worth noting regarding BYOL licensing in OCI and in the cloud in general.
       
      Scaling and partitioning:


      OCI License Manager

      To simplify license management for both Oracle and 3rd-party software in OCI, Oracle offers a free License Manager service
      which allows you to:

      • Eliminate overhead for software procurement and licensing

      • Easily track and report license utilization

      • Proactively monitor licensing needs and receive notifications


      Flexible shapes recap

      Here’s a sample of flexible shapes like the E series (AMD), but there are more in the OCI flex compute shape reference.




      CONCLUSION

      • This brief overview captures key aspects/trade-offs of the Virtualized platform Vs the Bare Metal option.

      • While there's a plethora of capabilities to explore, this blog focused on the most relevant ones.

      • I strongly believe that, with few exceptions, VMs are the best IaaS option out there for you

      • If your organization is heavily dependent on hardware resources, private cloud is a better place for you.

       

      Check with Eclipsys to help you with licensing