Oracle Exadata SQL Tuning

ora-base

Advanced Oracle Database Engineering, Exadata Optimization, and SQL Performance Tuning.

Proxmox VE 9.1 — Oracle VM Layout: Dual RAC Clusters, OMS, OKV and GoldenGate

This article picks up where the networking and storage setup left off. The Proxmox host (pve01.lab.example.com) is running with two VLAN-aware bridges, a ZFS mirror pool (vmpool), and shared ASM zvols already created. We now create all ten VMs using the qm CLI, configure shared storage for RAC, and verify each group before installing Oracle software.

1. VM Inventory and Resource Plan

| VM ID | Hostname | Role | vCPU | RAM | OS Disk | Notes |
|---|---|---|---|---|---|---|
| 100 | rac1-node1 | RAC Cluster 1, Node 1 | 8 | 32 GB | 80 GB | Shared ASM |
| 101 | rac1-node2 | RAC Cluster 1, Node 2 | 8 | 32 GB | 80 GB | Shared ASM |
| 110 | rac2-node1 | RAC Cluster 2, Node 1 | 8 | 32 GB | 80 GB | Shared ASM |
| 111 | rac2-node2 | RAC Cluster 2, Node 2 | 8 | 32 GB | 80 GB | Shared ASM |
| 120 | oms01 | Enterprise Manager OMS 1 | 4 | 24 GB | 80 GB + 200 GB data | Primary OMS |
| 121 | oms02 | Enterprise Manager OMS 2 | 4 | 24 GB | 80 GB + 200 GB data | Secondary OMS |
| 130 | okv01 | Oracle Key Vault Primary | 4 | 16 GB | 80 GB | TDE key store |
| 131 | okv02 | Oracle Key Vault Secondary | 4 | 16 GB | 80 GB | Paired node |
| 140 | ogg01 | GoldenGate Microservices (Extract) | 4 | 16 GB | 80 GB + 200 GB trail | OGG 23ai MA |
| 141 | ogg02 | GoldenGate Microservices (Replicat) | 4 | 16 GB | 80 GB + 200 GB trail | OGG 23ai MA |

Total: 56 vCPU / 240 GB RAM — host has 32 threads and 256 GB. vCPU is intentionally overcommitted (lab workloads are not all active simultaneously). RAM headroom: 16 GB for the Proxmox host. ...
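The qm-based provisioning described above can be sketched for the first RAC node roughly as follows. This is a minimal sketch, not the article's exact commands: the bridge name (vmbr0), VLAN tag (101), and boot/disk options are assumptions to adjust to the bridge and VLAN layout from the previous article.

```shell
# Sketch: create RAC Cluster 1, Node 1 per the inventory table.
# vmbr0 and VLAN tag 101 are placeholder assumptions.
qm create 100 \
  --name rac1-node1 \
  --cores 8 --memory 32768 \
  --cpu host --ostype l26 \
  --scsihw virtio-scsi-pci \
  --scsi0 vmpool:80 \
  --net0 virtio,bridge=vmbr0,tag=101 \
  --boot order=scsi0
```

The remaining nine VMs follow the same pattern with the IDs, names, and sizes from the table; the shared ASM zvols are then attached to the RAC nodes separately.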

March 3, 2026 · 9 min · mardaff

Proxmox VE 9.1 on AMD EPYC 9124 — Networking and Storage for Oracle Workloads

This article covers the host-level setup of a purpose-built AMD EPYC 9124 tower server for Oracle workload virtualisation using Proxmox VE 9.1. The hardware was purchased with BIOS pre-configured for IOMMU, SR-IOV, and Global C-States. Before creating a single VM, the network bridges, VLANs, and ZFS storage pools must be correct — mistakes here cascade into every VM that follows.

1. Hardware Reference

| Component | Detail |
|---|---|
| Case | Fractal Design Define 7 XL |
| Motherboard | Supermicro H13SSL-N (E-ATX, AMD SP5) |
| CPU | AMD EPYC 9124 — 16C/32T, 3.1 GHz, 125W cTDP |
| RAM | 8× 32 GB DDR5-4800 ECC Registered = 256 GB |
| OS NVMe | 2× Samsung 990 PRO 1 TB M.2 PCIe 4.0 (ZFS mirror) |
| VM NVMe | 4× Samsung PM9A3 1.92 TB U.2 PCIe 4.0 (ZFS pool) |
| 1GbE NICs | 2× onboard (H13SSL-N) |
| 10GbE NICs | 1× Intel X550-T2 PCIe → 2 ports (SR-IOV capable) |
| PSU | SilverStone VERTEX PX-1000 (1000W, 80+ Platinum) |
| BIOS | IOMMU on · SR-IOV on · Global C-States on · cTDP 125W |

2. Proxmox VE 9.1 Installation

2.1 Boot and Install

Download the Proxmox VE 9.1 ISO from https://www.proxmox.com/en/downloads and write to USB: ...
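The article's own USB-write command is truncated here; a typical way to do it looks like the sketch below. The ISO filename and the target device /dev/sdX are placeholders — writing to the wrong device destroys its contents, so verify with lsblk first.

```shell
# Identify the USB stick, then write the installer image to it.
lsblk
sudo dd if=proxmox-ve_9.1-1.iso of=/dev/sdX bs=1M status=progress conv=fsync
```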

March 3, 2026 · 8 min · mardaff

Reading an Exadata AWR Report: A Full Case Study from Symptom to Root Cause

AWR reports are simultaneously the most powerful and most misread performance tool in Oracle’s arsenal. Most DBAs jump straight to Top SQL, copy the SQL_ID, and start tuning — skipping the sections that tell them why that SQL became a problem in the first place. This case study walks through a realistic AWR from a production Exadata system, section by section, using a deliberately constructed problematic scenario. Every number is fabricated for illustrative purposes; every diagnostic conclusion is grounded in real Oracle internals. ...

March 3, 2026 · 15 min · mardaff

AWR and ASH Analysis Methodology: A Structured Approach to Oracle Performance Diagnostics

Performance diagnostics without a methodology is archaeology — you dig until you find something interesting, not necessarily something important. Oracle’s AWR and ASH provide an extraordinarily detailed record of database activity, but interpreting them requires a structured approach. This article presents a top-down methodology that moves from macro to micro: DB Time → Wait Profile → Top SQL → Root Cause.

The Foundation: DB Time

DB Time is the total time all sessions spent working in the database (on CPU or waiting for database resources) during a snapshot period. It is the single most important number in the AWR report. ...
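The DB Time trend across snapshots can be pulled directly from the AWR repository. This is a sketch, assuming SYSDBA access, a single instance, and a Diagnostics Pack license (the DBA_HIST views are licensed content); the time-model value is cumulative since instance start, so deltas spanning a restart are invalid.

```shell
sqlplus -s / as sysdba <<'SQL'
-- DB Time in minutes per AWR snapshot interval.
SELECT s.snap_id,
       ROUND((tm.value - LAG(tm.value) OVER (ORDER BY s.snap_id))
             / 1e6 / 60, 1) AS db_time_min
FROM   dba_hist_snapshot s
JOIN   dba_hist_sys_time_model tm
  ON   tm.snap_id = s.snap_id
 AND   tm.dbid = s.dbid
 AND   tm.instance_number = s.instance_number
WHERE  tm.stat_name = 'DB time'
ORDER  BY s.snap_id;
SQL
```

Comparing db_time_min against the snapshot interval length immediately shows how loaded the database was in each period, which is the starting point of the top-down method.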

March 1, 2026 · 6 min · mardaff

Data Guard Switchover and Failover: A DBA Playbook

Switchover and failover are the two role-transition operations in Oracle Data Guard. Switchover is a planned, graceful transition (both databases remain available throughout). Failover is an emergency response to primary failure. Having a tested, documented playbook for both is a MAA operational requirement — discovering the procedure on the night of a failure is not a plan.

Terminology Recap

| Term | Trigger | Data Loss? | Primary State Afterward |
|---|---|---|---|
| Switchover | Planned maintenance | None | Becomes standby (online) |
| Failover | Primary failure | Possible (ASYNC) or None (SYNC) | Unavailable (failed) |
| Reinstate | After failover | N/A | Failed primary becomes new standby |

Always use the Data Guard Broker (DGMGRL) for role transitions. Direct SQL*Plus procedures are error-prone and lack the consistency checks the Broker provides. ...
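A Broker-driven switchover reduces to a few DGMGRL commands. This is a minimal sketch: the database names (prima, stby) and the connect string are placeholders, and VALIDATE DATABASE should report "Ready for Switchover: Yes" before proceeding.

```shell
# Run VALIDATE before SWITCHOVER; abort if the standby is not ready.
dgmgrl sys@prima <<'EOF'
SHOW CONFIGURATION;
VALIDATE DATABASE stby;
SWITCHOVER TO stby;
SHOW CONFIGURATION;
EOF
```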

March 1, 2026 · 5 min · mardaff

ExaCC I/O Resource Management and Storage Performance Tuning

When multiple VM Clusters share the same Exadata storage cells on ExaCC, storage I/O becomes a shared resource that can interfere between workloads. Oracle’s I/O Resource Manager (IORM) is the mechanism for controlling how storage I/O is distributed across VM Clusters, databases, and consumer groups. Without IORM configuration, a single runaway workload on one VM Cluster can starve others.

IORM Architecture on ExaCC

IORM operates at two levels:

- Inter-VM Cluster IORM (configured via OCI): Controls storage I/O allocation between VM Clusters sharing the same physical Exadata infrastructure.
- Intra-database IORM (configured via DBCA or DBRM): Controls I/O between databases within a single VM Cluster and between consumer groups within a database.

Because ExaCC does not give customers cellcli access, inter-VM Cluster IORM is configured through the OCI Console or API — Oracle’s management layer pushes the configuration to the storage cells. ...
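At the intra-database level, IORM honours the Database Resource Manager plan, so consumer-group I/O control starts with DBMS_RESOURCE_MANAGER. The sketch below is illustrative only: the plan and group names (LAB_IORM_PLAN, BATCH_CG) and the 40% utilization limit are made up for the example, and every plan must include an OTHER_GROUPS directive to validate.

```shell
sqlplus -s / as sysdba <<'SQL'
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA;
  DBMS_RESOURCE_MANAGER.CREATE_PLAN('LAB_IORM_PLAN', 'Demo plan for IORM');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('BATCH_CG', 'Batch workload');
  -- Cap the batch group; IORM on the cells enforces the limit for I/O.
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan              => 'LAB_IORM_PLAN',
    group_or_subplan  => 'BATCH_CG',
    comment           => 'Cap batch at 40%',
    utilization_limit => 40);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE(
    plan              => 'LAB_IORM_PLAN',
    group_or_subplan  => 'OTHER_GROUPS',
    comment           => 'Everything else');
  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA;
END;
/
SQL
```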

March 1, 2026 · 5 min · mardaff

Exadata Cloud@Customer Architecture: Infrastructure, VM Clusters, and Operational Model

Exadata Cloud@Customer (ExaCC) delivers the Oracle Cloud control plane experience on Exadata hardware that physically resides in the customer’s data centre. For organisations with data residency requirements, latency-sensitive applications that cannot tolerate public cloud networking, or regulatory constraints that prohibit off-premises data, ExaCC is the path to Autonomous Database and managed Exadata without public cloud. Understanding ExaCC’s architecture — particularly the control plane / data plane separation and the VM Cluster model — is essential for architects sizing, operating, and troubleshooting the platform. ...

March 1, 2026 · 5 min · mardaff

Exadata X7 to X11M Migration: An In-Depth Hardware Refresh Playbook

Migrating from Exadata X7 to X11M is not a simple hardware swap — it is a generational architectural shift. The X11M introduces RoCE (RDMA over Converged Ethernet) replacing InfiniBand, PCIe Gen5 NVMe replacing earlier NVMe tiers, and a new compute architecture with AMD EPYC processors. This article provides an in-depth, end-to-end playbook for architects and senior DBAs planning this migration.

Reference Environment

All commands in this article use the following concrete hostnames and roles. Every code block identifies which server and user account the command runs from. ...

March 1, 2026 · 11 min · mardaff

GoldenGate Conflict Detection and Resolution in Active-Active Replication

Active-active replication — where two or more databases accept writes simultaneously and replicate to each other — is one of the most powerful but also most operationally complex GoldenGate topologies. Without a robust conflict detection and resolution (CDR) strategy, you will silently corrupt data on one or both sides. This article covers CDR mechanics, built-in GoldenGate resolution handlers, and architectural patterns that minimise conflict occurrence in the first place.

Why Conflicts Occur

In a bidirectional topology, any row can be updated on either site concurrently. A conflict occurs when: ...
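In GoldenGate, CDR is declared per MAP statement in the Replicat parameter file. The fragment below is a sketch with hypothetical names: app.orders and its last_upd_ts timestamp column are assumptions, and USEMAX implements a simple "latest update wins" policy.

```
MAP app.orders, TARGET app.orders,
  COMPARECOLS (ON UPDATE ALL, ON DELETE ALL),
  RESOLVECONFLICT (UPDATEROWEXISTS, (DEFAULT, USEMAX (last_upd_ts)));
```

COMPARECOLS tells Replicat which before-image columns to check for conflicts; RESOLVECONFLICT names the conflict type and the handler that resolves it automatically.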

March 1, 2026 · 5 min · mardaff

GoldenGate Microservices Architecture vs Classic: When to Use Which

Oracle GoldenGate ships in two distinct deployment architectures: the Classic Architecture (the original GGSCI-based model) and the Microservices Architecture (introduced in 18c, fully mature from 21c). Both replicate data using the same trail-file and process model under the hood, but they differ substantially in how processes are deployed, managed, and monitored. Choosing the wrong one for your use case adds unnecessary operational complexity.

The Fundamental Difference

| Dimension | Classic | Microservices |
|---|---|---|
| Management interface | GGSCI command-line | REST API + Web UI |
| Process model | OS-level processes managed by Manager | Microservices managed by Service Manager |
| Configuration storage | Flat parameter files (.prm) | Internal repository (configurable) |
| Deployment unit | Single installation per host | Service-based; multiple deployments per host |
| OCI/Cloud-native fit | Requires custom automation | Native REST/API; integrates with OCI GoldenGate |
| Security | Parameter-file credentials | Credential store + wallet |
| Multi-tenancy (CDB) | Limited | Full PDB-level pipeline isolation |

Classic Architecture: Operational Model

In Classic, the Manager process is the parent for all Extract, Pump, and Replicat processes. You interact with the system via GGSCI (GoldenGate Software Command Interface). ...
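The management-interface difference is easiest to see side by side. A sketch, assuming a Classic install on the PATH and a Microservices Administration Server at a placeholder host/port (ogg01:9011) with an oggadmin user:

```shell
# Classic: process status via GGSCI.
ggsci <<'EOF'
INFO ALL
EOF

# Microservices: equivalent status via the Administration Server REST API.
curl -s -u oggadmin https://ogg01:9011/services/v2/extracts
```

The REST interface is what makes Microservices straightforward to automate from CI pipelines and cloud tooling, where Classic requires wrapping GGSCI in expect-style scripts.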

March 1, 2026 · 4 min · mardaff