Oracle ODI 12c — Detailed Course Notes

Comprehensive notes covering architecture, topology, Designer, Knowledge Modules, SCD, monitoring, load plans, Big Data integration, and deployment for Oracle Data Integrator 12c.

📑 Table of Contents

Week 1 — Architecture & Topology Management Foundations

Overview of Oracle Data Integrator 12c Architecture

Oracle Data Integrator (ODI) 12c is a comprehensive data integration platform that implements the E-LT (Extract, Load, Transform) paradigm, pushing transformations to the target database for maximum performance. Its architecture consists of several key components:

💡 Key Concept: ODI's declarative design separates the "what" (logical mappings) from the "how" (physical execution). This allows the same interface to be executed on different databases with optimal performance.

Understanding ETL vs. ELT with ODI's Declarative Design

Traditional ETL (Extract, Transform, Load) tools extract data, transform it in a middle-tier engine, and then load it into the target. This approach can be inefficient for large volumes because transformations occur outside the database.

ODI adopts an ELT (Extract, Load, Transform) approach, where data is first loaded into a staging area (often the target database) and transformations are performed using native database SQL. This leverages the power of the target database engine, minimizing data movement and significantly improving performance.

ODI's declarative design allows developers to define the logical transformation rules without specifying how they are executed. The Knowledge Modules (KMs) handle the physical implementation, automatically generating optimized SQL code based on the source and target databases. This reduces development effort and ensures optimal performance across heterogeneous environments.

Installing and Configuring ODI Studio, Master Repository, and Work Repository

Installation Steps:

# RCU command line example
$ORACLE_HOME/bin/rcu -silent \
    -createRepository \
    -databaseType ORACLE \
    -connectString localhost:1521:orcl \
    -dbUser SYS \
    -dbPassword sys_password \
    -schemaPrefix ODI \
    -components ODI

Connecting ODI Studio to Repositories:

Setting up the Topology: Physical and Logical Architectures, Data Servers, and Agents

The Topology navigator in ODI Studio is used to define the infrastructure:

-- Example: Defining an Oracle Data Server in Topology
-- Physical Architecture -> Data Servers -> New Data Server
-- Name: ORCL_DEV
-- Technology: Oracle
-- JDBC Driver: oracle.jdbc.OracleDriver
-- JDBC URL: jdbc:oracle:thin:@localhost:1521:orcl
-- User: odi_user
-- Password: odi_password
-- Logical Architecture -> Logical Schema -> Map to Physical Schema

Configuring Contexts, Languages, and Schemas for Multi-Environment Deployments

Contexts are used to manage environment-specific variables:

Languages define the default language for messages and prompts. OD I supports multiple languages for multi-lingual deployments.

Schemas represent the actual database schemas (users) that contain the data. They are linked to logical schemas via contexts.

Managing Database Connections and File Systems

ODI supports a wide range of data sources: Oracle, SQL Server, MySQL, IBM DB2, Teradata, flat files (CSV, fixed-width), XML, JSON, and more. Connections are managed through the Topology navigator. For file systems, you define a "File" data server with a directory path and optionally a file mask.

-- Example: File Data Server for flat files
-- Name: FILE_IN
-- Technology: File
-- Directory: /data/inbound/
-- File Mask: *.csv
-- Logical Schema: FILE_SCHEMA
-- Context: DEV points to /data/inbound/, PROD to /prod/data/inbound/

Navigating the ODI Studio Interface

ODI Studio has four main navigators:

Best Practices for Repository Backup, Versioning, and Environment Setup


Week 2 — Designer: Interfaces, Mappings & Knowledge Modules Design

Creating Projects, Folders, and Models

In the Designer navigator, you organize work into Projects. A project contains Folders that group related objects: interfaces, packages, load plans, variables, etc. Models are created for each data source (database schema or file structure). Models contain Datastores (tables, views, files). Reverse-engineering a model imports the metadata from the source into ODI.

-- Steps to create a Model:
-- 1. In Designer, right-click on Models and select New Model.
-- 2. Provide a name (e.g., "SALES_ODS"), select the Logical Schema (e.g., "SALES_ODS_SCHEMA").
-- 3. Click on Reverse Engineer to import tables, columns, and relationships.
-- 4. Choose the objects to reverse engineer (tables, views, synonyms).

Reverse-Engineering Database Schemas and Working with Datastores

Reverse engineering is the process of reading metadata from the database into ODI. You can select which tables, views, and synonyms to import. After reverse engineering, datastores appear under the model. You can modify datastore properties, add filters, change column definitions, and define primary/foreign keys.

Building Basic and Intermediate Interfaces (Mappings)

An Interface in ODI is a data flow that extracts data from one or more sources, transforms it, and loads it into a target. The graphical flow editor allows you to drag and drop datastores, define joins, filters, and transformations, and specify the target table.

-- Creating a simple Interface:
-- 1. In Designer, right-click on Interfaces and select New Interface.
-- 2. Give it a name (e.g., "LOAD_CUSTOMERS").
-- 3. Drag source datastore(s) from the Model into the Source panel.
-- 4. Drag the target datastore into the Target panel.
-- 5. Draw a link between source and target columns (mapping).
-- 6. Add any filters or transformations in the expression editor.
-- 7. Select the appropriate Knowledge Modules (LKM, IKM, CKM).
-- 8. Save and execute.

Understanding Knowledge Modules (KMs)

KMs are reusable code templates that define how ODI executes specific tasks. There are several types:

Deep Dive into IKM and LKM

LKM defines how data is extracted from the source and loaded into a staging table. For example, LKM SQL to SQL uses a database link to pull data from a remote database into the staging area.

IKM defines how data from the staging table is integrated into the target. Common IKMs include:

Choosing the right KM is critical for performance and functionality. ODI provides a wide range of built-in KMs, and you can also create custom KMs.

Configuring CKM for Data Quality and Error Handling

Check Knowledge Module (CKM) is used to validate data against defined constraints (e.g., not null, unique, foreign key). You can use CKM at different stages:

Errors can be logged to error tables, and you can define actions for handling rejected records.

-- Example: Enabling Flow Control
-- In the Interface properties, set the CKM to "CKM SQL".
-- Under the Flow Control tab, select the checks to perform.
-- Options: check primary keys, referential integrity, data types, etc.
-- Errors will be stored in the error table defined in the model.

Implementing Incremental Data Loading Strategies (CDC and Journalization)

Change Data Capture (CDC) allows you to capture only changed data from the source. ODI supports CDC through Journalization (using database triggers or log-based capture). Key steps:

-- Enabling Journalization
-- 1. Right-click on the source datastore and select "Journalizing".
-- 2. Choose the journalizing mode (Simple or Consistent).
-- 3. For Simple mode, specify the timestamp or version column.
-- 4. ODI will create necessary objects (journal tables, triggers).

Performance Tuning of Interfaces Using Optimized ELT Techniques


Week 3 — Advanced ETL/ELT: SCD & Integration Advanced

Implementing Slowly Changing Dimensions (SCD) in ODI

SCD is a common data warehousing pattern for maintaining historical data. ODI supports Type 1, Type 2, and Type 3 SCD via dedicated KMs.

Using IKM SCD and IKM Incremental Update

IKM SCD is specifically designed for Type 2 SCD. It requires a dimension table with start and end date columns, an active flag, and optionally a surrogate key. The KM handles updating the current row and inserting new rows.

IKM Incremental Update can be adapted for Type 1 SCD by setting appropriate update and insert columns. It performs a merge (upsert) operation.

-- Configuring IKM SCD for Type 2
-- 1. In the Interface, select the target datastore (dimension table).
-- 2. Choose IKM SCD (e.g., IKM Oracle SCD).
-- 3. Define the SCD columns: natural key, start date, end date, active flag.
-- 4. Map the source columns to target columns.
-- 5. In the KM options, set the behavior for updates and inserts.

Handling Complex Source-Target Mappings and Multi-Table Joins

ODI interfaces support complex mappings involving multiple source tables with inner/outer joins, aggregations, and filtering. You can also use subqueries and expressions within the flow. For multi-target inserts, you can use multi-table insert KMs.

Implementing Expressions, Filters, and Transformations

The expression editor in ODI allows you to write SQL functions, case statements, and custom transformations. You can use ODI's built-in functions (e.g., SYSDATE, NVL, DECODE) or database-specific functions.

Working with Variables, Sequences, and User-Defined Functions

-- Defining a variable:
-- 1. In Designer, right-click on Variables and select New Variable.
-- 2. Name: VAR_LOAD_DATE, Type: Date, Default Value: SYSDATE.
-- Using the variable in an interface: set a column to #VAR_LOAD_DATE.

Integrating Multiple Data Sources (Relational, Flat Files, XML, JSON)

ODI can handle heterogeneous sources in a single interface using the Multi-Source Mapping feature. You can join data from an Oracle table with a CSV file and load into a SQL Server table. The LKM handles the extraction from each source.

Building Reusable Mappings and Modular ETL Components

Use Sub-interfaces or Procedures to encapsulate reusable logic. You can also create template interfaces and copy them. Packages allow you to orchestrate multiple interfaces in a sequence.

Implementing Referential Integrity and Constraint Validation

ODI can enforce referential integrity during data loading. Use CKM to check foreign keys and prevent orphan records. You can also define constraints in the model and use the "Control" option in the interface.


Week 4 — Operator, Monitoring & Error Handling Monitoring

Navigating the Operator Navigator

The Operator navigator is the central console for monitoring ODI executions. It displays sessions, tasks, and execution logs. You can:

Understanding the Execution Lifecycle: Design, Flow, and Data Control Phases

When you execute an interface, ODI goes through several phases:

  1. Design phase — checks the interface consistency and generates the execution plan.
  2. Flow phase — executes the LKM (loading) and IKM (integration) steps.
  3. Data Control phase — runs the CKM (check) for data quality.

Each phase can produce logs and errors that you can view in the Operator.

Monitoring Sessions, Steps, and Tasks in Real-Time

In the Operator, you can click on a running session to see its live progress. You can also view the generated SQL code for each step, which is useful for debugging.

Managing Error Logging, Debugging, and Troubleshooting Failed Sessions

Implementing Custom Error Handlers and Recovery Mechanisms

You can use Conditional Steps and Error Handlers in Packages to handle failures. For example, if an interface fails, you can send an email notification or run a recovery procedure.

Optimizing Memory, Temporary Tables, and Bulk-Loading Performance

Implementing Data Quality and Record Deduplication Using CKM

Use CKM's "Distinct" check to eliminate duplicates. You can also create custom check conditions in the model to enforce business rules.

Creating Exception Tables and Reporting Bad Records

ODI can store rejected records in error tables. You can then query these tables to report on data quality issues. The error table structure is generated by the CKM.


Week 5 — Load Plans, Scheduling & Production Automation Scheduling

Designing Load Plans for End-to-End Integration Orchestration

A Load Plan is a high-level orchestration object that sequences multiple packages, interfaces, and procedures. It provides features like parallel execution, conditional branching, and restartability.

Defining Sequences, Conditionals, and Parallel Execution

Configuring ODI Agents for Scheduling and Automating Job Execution

ODI Agents can be scheduled to run load plans or packages at specific times using the built-in scheduler or integration with enterprise schedulers like Oracle Scheduler, Control-M, or Autosys.

You can also use the ODI Console (web interface) for monitoring and scheduling.

Integrating ODI with Enterprise Schedulers

Managing Runtime Parameters, Variables, and Dynamic Context Switching

You can define variables at the project or load plan level that are passed at runtime. Context switching allows you to run the same load plan with different physical resources (e.g., DEV vs PROD).

Implementing Rollback, Restart, and Recovery Strategies

Load plans support restart — if a step fails, you can restart from the failure point after fixing the issue. You can also configure rollback options (e.g., using Oracle Flashback or transaction rollback).

Exporting and Importing Projects Using Smart Export/Import and XML Archives

Use the Smart Export and Smart Import utilities to move ODI objects between environments. You can export a project to an XML file, transfer it, and import it into another repository. This is essential for version control and promotion.

# Export a project
odiparameters.bat -export -path "C:\Projects\ODI_SALES.xml" -project "SALES"

# Import a project
odiparameters.bat -import -file "C:\Projects\ODI_SALES.xml"

Version Control Integration with Git/SVN for Collaborative Development

ODI projects can be stored in XML format and committed to version control systems. Teams can work on different branches and merge changes.


Week 6 — Big Data Integration, Deployment & Capstone Big Data

Integrating ODI with Big Data Ecosystems: Hadoop (Hive, HDFS) and Spark

ODI 12c provides connectors for Hadoop and Spark. You can:

Supported KMs include LKM Hive, IKM Hive, and LKM Spark.

-- Example: Using LKM Hive to load data from a file to Hive
-- 1. Create a Model for Hive datastore.
-- 2. In Interface, source is a file (File model), target is a Hive table.
-- 3. Choose LKM Hive (to load into Hive staging).
-- 4. Choose IKM Hive (to integrate into target Hive table).

Using Oracle Big Data Connectors and Advanced KMs

Oracle Big Data Connectors allow ODI to leverage Hadoop's distributed processing. Advanced KMs include IKM Spark for in-memory transformations and LKM Spark for distributed loading.

Implementing Real-Time Data Integration Using CDC and Web Services

ODI supports real-time integration through:

You can create interfaces that process CDC data and load incrementally.

Deploying ODI Projects Across Development, Test, and Production Environments

Use Smart Export/Import to promote projects. Manage environment-specific configurations using contexts and logical schemas. Use separate repositories for each environment.

Using OdiLite and Standalone Agents for Lightweight Deployments

OdiLite is a lightweight runtime environment for executing ODI jobs without the full Studio. It can be embedded in applications. Standalone agents can be installed on any machine to execute jobs.

Managing Security, Credential Management, and Encryption

Performance Monitoring, Capacity Planning, and High-Availability Configurations

Capstone Project

Project Objective: Design and implement a full data integration pipeline for a real-world data warehousing scenario using Oracle ODI 12c.

Deliverables:

🎯 Capstone Goal: Build a complete enterprise data warehouse integration pipeline from source extraction to target loading with full CDC and scheduling, demonstrating mastery of ODI 12c.

📚 References:

© Copyright Westminster College of AI and Quantum Computing Limited, 2026. | info@wcc.co.uk | https://wcc.co.uk