Oracle Week 2016

Exploring Oracle Database Performance Tuning Best Practices for DBAs and Developers



• 39 years old, married + 3
• 16 years as a DBA, consultant, instructor, and architect
• CEO @ DBcs Ltd.
• Former CTO @ John Bryce Israel
• Oracle Certified Professional
• Microsoft SQL Server Certified Professional

Agenda
• Oracle Database Architecture Overview
• The connection between SQL tuning & Instance tuning
• The connection between database & operating system
• Common bottlenecks - Drill down
• How do you identify the source of the problem?
• Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
• Solutions: where do you start and what order to work?

• Introduction to SQL and Application Tuning
• The Oracle Optimizer:
  • Rule Based Optimization (overview)
  • Cost Based Optimization
  • The Different Modes of the Cost Based Optimizer
  • Execution Plans
  • Data Access Methods
  • Indexes – Types, Classifications, Advantages & Disadvantages
  • Sort Usage Guidelines
• When and What to Tune?
  • Clustering factor
  • Data Types are Important
  • Integrity Constraints are Important
  • Reasons for Inefficient SQL Performance
  • Using Bind Variables
  • Restructuring SQL Statements
  • Shared SQL and Cursors
• Advanced SQL and Application Topics

"You have to be constantly evolving and in some cases

DBAs/Programmers don’t do that because they know how they did it years ago and they want to

keep doing it that way..."

Quote from Thomas Kyte's book

If you want a 10-step guide to tuning a query, buy a piece of software. You are not needed in this process: anyone can put a query in, get a query out, and run it to see if it is faster. There are tons of these tools on the market. They work using rules (heuristics) and can tune maybe 1% of the problem queries out there. They APPEAR to be able to tune a much larger percent, but that is only because the people using these tools never look at the outcome -- hence they continue to make the same basic mistakes over and over and over.

If you want to really be able to tune the other 99% of the queries out there, knowledge of lots of stuff -- physical storage mechanisms, access paths, how the optimizer works -- that's the only way..... Think about it for a moment. If there were a 10-step or even 1,000,000-step process by which any query can be tuned (or even X% of queries for that matter), we would write a program to do it. Oh, don't get me wrong, there are many programs that actually try to do this -- Oracle Enterprise Manager with its tuning pack, SQL Navigator, and others. What they do is primarily recommend indexing schemes to tune a query, suggest materialized views, offer to add hints to the query to try other access plans. They show you different query plans for the same statement and allow you to pick one. They offer "rules of thumb" (what I generally call ROT, since the acronym and the word it maps to are so appropriate for each other) SQL optimizations -- which, if they were universally applicable, the optimizer would do as a matter of fact. In fact, the cost based optimizer does that already: it rewrites our queries all of the time. These tuning tools use a very limited set of rules that sometimes can suggest that index or set of indexes you really should have thought of during your design.

Oracle Database Architecture Overview

Oracle Database Memory Structures: Overview

[Diagram: the SGA - database buffer cache, shared pool, large pool, Java pool, Streams pool, and redo log buffer - accessed by background processes and server processes, alongside the aggregated PGA.]

Database Buffer Cache

• Is a part of the SGA
• Holds copies of data blocks that are read from data files
• Is shared by all concurrent processes

[Diagram: a server process reads blocks from the data files into the database buffer cache in the SGA; the database writer process (DBWn) writes modified buffers back to the data files.]

Redo Log Buffer

• Is a circular buffer in the SGA (based on the number of CPUs)

• Contains redo entries that have the information to redo changes made by operations, such as DML and DDL

[Diagram: a server process places redo entries in the redo log buffer in the SGA; the log writer process (LGWR) writes them to the redo log files.]

Shared Pool

• Is part of the SGA
• Contains:
  • Library cache
    • Shared parts of SQL and PL/SQL statements
  • Data dictionary cache (row cache)
  • Result cache:
    • SQL queries
    • PL/SQL functions
  • Control structures
    • Locks

[Diagram: the shared pool within the SGA, containing the library cache, the data dictionary cache (row cache), the result cache, and control structures such as locks, used by server processes.]

Processing a DML Statement: Example

[Diagram: numbered flow of a DML statement through the database - user process, server process, library cache in the shared pool, database buffer cache, redo log buffer, and DBWn, backed by the data files, control files, and redo log files.]

COMMIT Processing: Example

[Diagram: COMMIT processing - the server process places the commit record in the redo log buffer, LGWR flushes it to the redo log files, and DBWn writes data blocks to the data files later, independently of the commit.]

Program Global Area (PGA)

• The PGA is a memory area that contains:
  • Session information
  • Cursor information
  • SQL execution work areas:
    • Sort area
    • Hash join area
    • Bitmap merge area
    • Bitmap create area
• Work area size influences SQL performance.
• Work areas can be automatically or manually managed.

[Diagram: the PGA of a server process - stack space plus the User Global Area (UGA) holding user session data, cursor status, and SQL areas.]

Background Process Roles

[Diagram: background processes around the SGA (database buffer cache, shared pool, Java pool, fixed SGA, redo log buffer): PMON, SMON, ARCn, DBWn, LGWR, CKPT, MMON, CJQ0, QMNn, RCBG, and MMAN.]

Automatic Shared Memory Management

[Diagram: SGA_TARGET (together with STATISTICS_LEVEL) answers the "which size to choose?" question by letting Oracle size the automatically tuned SGA components - shared pool, large pool, streams pool, buffer cache, and so on.]
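In practice the slide boils down to one setting plus one view; a minimal sketch, assuming an SPFILE-based instance whose sga_max_size is at least as large as the target (the 2G figure is arbitrary):

-- Enable ASMM: one target for all auto-tuned SGA components.
ALTER SYSTEM SET sga_target = 2G SCOPE=BOTH;

-- See how the components are currently carved up.
SELECT component, current_size/1024/1024 AS size_mb
FROM   v$sga_dynamic_components;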

Automated SQL Execution Memory Management

[Diagram: PGA_AGGREGATE_TARGET sizes the aggregated PGA shared by all server and background processes, answering the "which size to choose?" question for SQL execution memory.]
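The PGA side is likewise a single target; a short sketch (the 512M figure is arbitrary) that sets it and checks how the automatic tuning is doing:

ALTER SYSTEM SET pga_aggregate_target = 512M SCOPE=BOTH;

SELECT name, value
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'cache hit percentage');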

Automatic Memory Management

• Sizing of each memory component is vital for SQL execution performance.
• It is difficult to manually size each component.
• Automatic memory management automates memory allocation of each SGA component and aggregated PGA.

[Diagram: MEMORY_TARGET (together with STATISTICS_LEVEL, enforced by MMAN) covers both the SGA - buffer cache, shared pool, large pool, Java pool, Streams pool, other SGA - and the aggregated PGA (private SQL areas, untunable PGA, free memory).]

The connection between SQL tuning & Instance tuning

Database tuning is the process of tuning the actual database, which encompasses the allocated memory, disk usage, CPU, I/O, and underlying database processes. Tuning a database also involves the management and manipulation of the database structure itself, such as the design and layout of tables and indexes. Additionally, database tuning often involves the modification of the database architecture in order to optimize the use of the hardware resources available. There are many other considerations when tuning a database, but these tasks are normally accomplished by the database administrator. The objective of database tuning is to ensure that the database has been designed in a way that best accommodates expected activity within the database.

The connection between SQL tuning & Instance tuning

SQL tuning is the process of tuning the SQL statements that access the database.

These SQL statements include database queries and transactional operations such as inserts, updates, and deletes.

The objective of SQL statement tuning is to formulate statements that most effectively access the database in its current state, taking advantage of database and system resources and indexes.

The connection between SQL tuning & Instance tuning

Both database tuning and SQL statement tuning must be performed to achieve optimal results when accessing the database.

A poorly tuned database can waste the effort invested in SQL tuning, and vice versa.

Ideally, it is best to first tune the database, and then ensure that indexes exist where needed, and then tune the SQL code.

The connection between SQL tuning & Instance tuning

The connection between database & operating system

Question: We are in the process of adopting Oracle and we have many choices of operating system platforms. Which OS is best for Oracle, and how do I compare operating system environments for Oracle databases?

Answer:  That's a very common question. Oracle dominates the database world in part because it runs on over 60 platforms, everything from a Mainframe to a Mac.

Oracle chose Solaris as its preferred OS in 2005, and later decided to build its own Linux distribution, Oracle Linux, custom-tailored to the needs of a typical database. Oracle leverages the advantages of all OS platforms through an independent operating-system interface, customized to each platform.

As to which UNIX dialect is "best", it's often related to the server environment. For example, svmon is only available on IBM AIX.

Some operating systems are better at managing large volumes of data; SUSE, for example, developed a special kernel just for Oracle.

The connection between database & operating system

Data integrity features (T10 Protection Information )

Protection Information enables applications or kernel subsystems to attach metadata to I/O operations, allowing devices that support PI to verify integrity before passing the data further down the stack and physically committing it to disk.

Data Integrity Extensions (DIX) is a hardware feature that enables the exchange of protection metadata between the host operating system and the HBA, and helps prevent corrupt data from being written, allowing a full end-to-end data integrity check.

The connection between database & operating system

Zero downtime updates

Make updates to the Linux Operating System (OS) kernel, while it is running, without a reboot or any interruption.

Only Oracle Linux offers this unique capability, making it possible to keep up with important Linux kernel updates without burdening you with the operational cost and disruption of rebooting for every update to the kernel.

Ksplice allows system administrators to deliver valuable patches for both the Unbreakable Enterprise Kernel as well as the Red Hat compatible kernel with lower costs, less downtime, increased security, and greater flexibility and control.

The connection between database & operating system

Btrfs

File System Btrfs (B-tree file system) is the “next generation file system” for Linux. Pronounced as “Butter FS” or “B-tree FS”, it is a GPL licensed file system first developed by Oracle’s Chris Mason in 2007.

Btrfs provides a number of features that make it a very attractive file system solution for local disk storage.

Btrfs is designed for:
• Large files and file systems from the ground up
• Simplified administration
• Integrated RAID and volume management
• Snapshots
• Checksums for data and metadata

The connection between database & operating system

10 common performance issues

Common bottlenecks - Drill down

"Not every suggestion is a good suggestion. Even if it's from the software provider himself."

-- Aaron Shilo

Common bottlenecks - Drill down

Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a Rebuild (Doc ID 122008.1) which, let's just say, I didn't view in a particularly positive light :-) Mainly because it gave dubious advice, including that indexes should be rebuilt if:

• Deleted entries represent 20% or more of current entries
• The index depth is more than 4 levels

It then detailed a script that ran a Validate Structure across all indexes in the database that didn’t belong in either the SYS or SYSTEM schema.

This script basically read through and sequentially locked all tables (maybe multiple times) in the database in order to list indexes that might not actually need a rebuild, while potentially missing out on some that do. I could write a script that achieved the same result with far less overhead. For example, SELECT index_name FROM dba_indexes WHERE index_name LIKE 'A%' AND owner NOT IN ('SYS', 'SYSTEM') would achieve a very similar result.

Posted by Richard Foote in Doc 122008.1, Doc 989093.1, Index Rebuild, Oracle Indexes

Bad connection management

• The application connects and disconnects for each database interaction.

• This problem is common with stateless middleware in application servers.

• It has over two orders of magnitude impact on performance, and is totally unscalable.

Common bottlenecks - Drill down

Bad use of cursors and the shared pool

• Not using cursors results in repeated parses.
• If bind variables are not used, then there is hard parsing of all SQL statements.
• This has an order of magnitude impact on performance, and it is totally unscalable.
• Use cursors with bind variables that open the cursor and execute it many times (see the sketch below).
• Be suspicious of applications generating dynamic SQL.
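A minimal sketch of the "parse once, execute many" pattern, using the SCOTT-style emp/dept tables that later slides also use: the parameterized cursor is parsed once and re-executed with a new bind value per department.

DECLARE
  CURSOR c_emp (p_dept NUMBER) IS
    SELECT empno, ename FROM emp WHERE deptno = p_dept;
BEGIN
  FOR r_dept IN (SELECT deptno FROM dept) LOOP
    -- Same shared cursor each time; only the bind value changes.
    FOR r_emp IN c_emp(r_dept.deptno) LOOP
      NULL;  -- process each employee row here
    END LOOP;
  END LOOP;
END;
/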

Common bottlenecks - Drill down

Bad SQL

• Bad SQL is SQL that uses more resources than appropriate for the application requirement.
• This can be a decision support system (DSS) query that runs for more than 24 hours, or a query from an online application that takes more than a minute.
• You should investigate SQL that consumes significant system resources for potential improvement.
• ADDM identifies high-load SQL.
• SQL Tuning Advisor can provide recommendations for improvement.

Common bottlenecks - Drill down

Use of nonstandard initialization parameters

• These might have been implemented based on poor advice or incorrect assumptions.
• Most databases provide acceptable performance using only the set of basic parameters.
• In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great deal of problems that can require considerable investigation.
• Likewise, optimizer parameters set in the initialization parameter file can override proven optimal execution plans.
• For these reasons, schemas, schema statistics, and optimizer settings should be managed as a group to ensure consistency of performance.

Common bottlenecks - Drill down

Getting database I/O wrong

• Many sites lay out their databases poorly over the available disks.
• Other sites specify the number of disks incorrectly, because they configure disks by disk space and not I/O bandwidth.

Common bottlenecks - Drill down

Online redo log setup problems

• Many sites run with too few online redo log files and files that are too small.

• Small redo log files cause system checkpoints to continuously put a high load on the buffer cache and I/O system.

• If too few redo log files exist, then the archive cannot keep up, and the database must wait for the archiver to catch up. 

Common bottlenecks - Drill down

All online redo log files should be the same size and configured to switch approximately once an hour during normal activity. They should switch no more frequently than every 20 minutes during peak activity.

There should be a minimum of four online log groups to prevent LGWR from waiting for a group to be available following a log switch. A group may be unavailable because a checkpoint has not yet completed or the group has not yet been archived.

http://docs.oracle.com/cd/B12037_01/server.101/b10726/configbp.htm#1006950
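A rough way to check the once-an-hour guideline against your own instance: count log switches per hour from V$LOG_HISTORY (the 7-day window is arbitrary).

SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
       COUNT(*)                               AS log_switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 7
GROUP  BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER  BY 1;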

Serialization

• Serialization of data blocks in the buffer cache due to lack of free lists, free list groups, transaction slots (INITRANS), or shortage of rollback segments.

• This is particularly common on INSERT-heavy applications, in applications that have raised the block size above 8K, or in applications with large numbers of active users and few rollback segments.

• Use automatic segment-space management (ASSM) and automatic undo management to solve this problem.
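A minimal sketch of both recommendations; the tablespace and file names are hypothetical, and undo_management is a static parameter, so it takes effect after a restart.

CREATE TABLESPACE app_data
  DATAFILE '/u01/oradata/app_data01.dbf' SIZE 1G
  EXTENT MANAGEMENT LOCAL
  SEGMENT SPACE MANAGEMENT AUTO;   -- ASSM: no manual free lists

ALTER SYSTEM SET undo_management = AUTO SCOPE=SPFILE;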

Common bottlenecks - Drill down

Long full table scans

• Long full table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization.

• Long table scans, by nature, are I/O intensive and unscalable.

Common bottlenecks - Drill down

High amounts of recursive (SYS) SQL

• Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place.
• This is unscalable and impacts user response time.
• Use locally managed tablespaces to reduce recursive SQL due to extent allocation.
• Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem.

Common bottlenecks - Drill down

Deployment and migration errors

• In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation.
• Examples of this are missing indexes or incorrect statistics.
• These errors can lead to sub-optimal execution plans and poor interactive user performance.
• When migrating applications of known performance, export the schema statistics to maintain plan stability using the DBMS_STATS package (see the sketch below).
• Although these errors are not directly detected by ADDM, ADDM highlights the resulting high-load SQL.
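A minimal sketch of carrying schema statistics between environments with DBMS_STATS; the APP schema and STATS_CARRIER table names are hypothetical.

BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => 'APP', stattab => 'STATS_CARRIER');
  DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'STATS_CARRIER');
  -- After moving STATS_CARRIER to the target database (e.g. via Data Pump):
  -- DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'STATS_CARRIER');
END;
/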

Common bottlenecks - Drill down

The Oracle Optimizer:

• Rule Based Optimization (overview)
• Cost Based Optimization
• The Different Modes of the Cost Based Optimizer
• Execution Plans
• Data Access Methods
• Indexes – Types, Classifications, Advantages & Disadvantages
• Sort Usage Guidelines

The Oracle Optimizer:

• The optimizer determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query.

• This determination is an important step in the processing of any SQL statement and can greatly affect execution time.

The Oracle Optimizer: SQL Statement Parsing, Overview

[Flow: a parse call triggers a syntactic and semantic check, a privileges check, and allocation of a private SQL area. If a shared SQL area for the statement already exists, it is reused (soft parse) and the statement is executed; if not (hard parse), the parse operation (optimization) runs, the parsed representation is stored in a newly allocated shared SQL area, and the statement is then executed.]

The Oracle Optimizer: Why Do You Need an Optimizer?

SELECT * FROM emp WHERE job = 'MANAGER';

[Diagram: the optimizer weighs the possible access paths for the query above - use the index, or read each row and check - using statistics and schema information. Since only 1% of employees are managers, it picks the index.]

The Oracle Optimizer: Why Do You Need an Optimizer?

SELECT * FROM emp WHERE job = 'MANAGER';

[Diagram: the same decision with different statistics. Since 80% of employees are managers, the optimizer picks a full table scan instead of the index.]

• Using the RBO, the optimizer chooses an execution plan based on the access paths available and the ranks of these access paths. Oracle's ranking of the access paths is heuristic. If there is more than one way to execute a SQL statement, then the RBO always uses the operation with the lower rank. Usually, operations of lower rank execute faster than those associated with constructs of higher rank.

The list shows access paths and their ranking:

• RBO Path 1: Single Row by Rowid
• RBO Path 2: Single Row by Cluster Join
• RBO Path 3: Single Row by Hash Cluster Key with Unique or Primary Key
• RBO Path 4: Single Row by Unique or Primary Key
• RBO Path 5: Clustered Join
• RBO Path 6: Hash Cluster Key
• RBO Path 7: Indexed Cluster Key
• RBO Path 8: Composite Index
• RBO Path 9: Single-Column Indexes
• RBO Path 10: Bounded Range Search on Indexed Columns
• RBO Path 11: Unbounded Range Search on Indexed Columns
• RBO Path 12: Sort Merge Join
• RBO Path 13: MAX or MIN of Indexed Column
• RBO Path 14: ORDER BY on Indexed Column
• RBO Path 15: Full Table Scan

The Oracle Optimizer: Rule Based Optimization (overview)

The CBO performs the following steps:

• The optimizer generates a set of potential plans for the SQL statement based on available access paths and hints.

• The optimizer estimates the cost of each plan based on statistics in the data dictionary for the data distribution and storage characteristics of the tables, indexes, and partitions accessed by the statement.

• The cost is an estimated value proportional to the expected resource use needed to execute the statement with a particular plan. The optimizer calculates the cost of access paths and join orders based on the estimated computer resources, which includes I/O, CPU, and memory.

• Serial plans with higher costs take more time to execute than those with smaller costs. When using a parallel plan, however, resource use is not directly related to elapsed time.

• The optimizer compares the costs of the plans and chooses the one with the lowest cost.

The Oracle Optimizer: Cost Based Optimization

The following features require use of the CBO:

• Partitioned tables and indexes
• Index-organized tables
• Reverse key indexes
• Function-based indexes
• SAMPLE clauses in a SELECT statement
• Parallel query and parallel DML
• Star transformations and star joins
• Extensible optimizer
• Query rewrite with materialized views
• Enterprise Manager progress meter
• Hash joins
• Bitmap indexes and bitmap join indexes
• Index skip scans

The Oracle Optimizer: Cost Based Optimization

• The CBO consists of two pieces of code:
  • Estimator
  • Plan generator
• The estimator determines the cost of optimization suggestions made by the plan generator:
  • Cost: the optimizer's best estimate of the number of standardized I/Os made to execute a particular statement optimization
• The plan generator:
  • Tries out different statement optimization techniques
  • Uses the estimator to cost each optimization suggestion
  • Chooses the best optimization suggestion based on cost
  • Generates an execution plan for the best optimization

The Oracle Optimizer: Cost Based Optimization

• Selectivity is the estimated proportion of a row set retrieved by a particular predicate or combination of predicates.
• It is expressed as a value between 0.0 and 1.0:
  • High selectivity: small proportion of rows
  • Low selectivity: big proportion of rows
• Selectivity computation:
  • If no statistics: use dynamic sampling
  • If no histograms: assume even distribution of rows
• Statistic information:
  • DBA_TABLES and DBA_TAB_STATISTICS (NUM_ROWS)
  • DBA_TAB_COL_STATISTICS (NUM_DISTINCT, DENSITY, HIGH/LOW_VALUE, …)

The Oracle Optimizer: Estimator: Selectivity

Selectivity = (number of rows satisfying a condition) / (total number of rows)

• Cardinality is the expected number of rows retrieved by a particular operation in the execution plan.
• It is a vital figure for determining join, filter, and sort costs.
• Simple example:
  • The number of distinct values in DEV_NAME is 203.
  • The number of rows in COURSES (original cardinality) is 1018.
  • Selectivity = 1/203 = 4.926e-03
  • Cardinality = (1/203) * 1018 = 5.01, rounded up to 6

The Oracle Optimizer: Estimator: Cardinality

SELECT days FROM courses WHERE dev_name = 'ANGEL';

Cardinality = Selectivity * Total number of rows
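The same arithmetic can be reproduced from the data dictionary; a rough sketch using the slide's COURSES/DEV_NAME example (CEIL mirrors the rounding up shown above):

SELECT t.num_rows,
       c.num_distinct,
       ROUND(1 / c.num_distinct, 6)       AS selectivity,
       CEIL(t.num_rows / c.num_distinct)  AS estimated_cardinality
FROM   user_tables             t
JOIN   user_tab_col_statistics c ON c.table_name = t.table_name
WHERE  t.table_name  = 'COURSES'
AND    c.column_name = 'DEV_NAME';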

The Oracle Optimizer: Estimator: Cost

• Cost is the optimizer's best estimate of the number of standardized I/Os it takes to execute a particular statement.
• The cost unit is a standardized single-block random read: 1 cost unit = 1 SRd.
• The cost formula combines three different cost components into standard cost units:

  Cost = (#SRds * sreadtim + #MRds * mreadtim + #CPUCycles / cpuspeed) / sreadtim

  #SRds: number of single-block reads (single-block I/O cost)
  #MRds: number of multiblock reads (multiblock I/O cost)
  #CPUCycles: number of CPU cycles (CPU cost)
  sreadtim: single-block read time
  mreadtim: multiblock read time
  cpuspeed: millions of instructions per second
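The workload values plugged into this formula live in SYS.AUX_STATS$ once system statistics have been gathered (DBMS_STATS.GATHER_SYSTEM_STATS); a quick look:

SELECT pname, pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN'
AND    pname IN ('SREADTIM', 'MREADTIM', 'CPUSPEED', 'MBRC');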

The Oracle Optimizer: The Different Modes of the Cost Based Optimizer

CHOOSE: The optimizer chooses between a cost-based approach and a rule-based approach, depending on whether statistics are available. This is the default value.
  • If the data dictionary contains statistics for at least one of the accessed tables, the optimizer uses a cost-based approach and optimizes with a goal of best throughput.
  • If the data dictionary contains only some statistics, the cost-based approach is still used, but the optimizer must guess the statistics for the objects without any. This can result in suboptimal execution plans.
  • If the data dictionary contains no statistics for any of the accessed tables, the optimizer uses a rule-based approach.

ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics and optimizes with a goal of best throughput (minimum resource use to complete the entire statement).

FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best response time to return the first n rows; n can equal 1, 10, 100, or 1000.

FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows. Note: Using heuristics sometimes leads the CBO to generate a plan whose cost is significantly larger than that of a plan without the heuristic. FIRST_ROWS is available for backward compatibility and plan stability.

RULE: The optimizer chooses a rule-based approach for all SQL statements regardless of the presence of statistics.
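The mode can be switched per session, or per statement with a hint; a tiny sketch:

ALTER SESSION SET optimizer_mode = FIRST_ROWS_10;

SELECT /*+ ALL_ROWS */ * FROM emp;  -- statement-level override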

The Oracle Optimizer: What Is an Execution Plan?

• The execution plan of a SQL statement is composed of small building blocks called row sources for serial execution plans.

• The combination of row sources for a statement is called the execution plan.

• By using parent-child relationships, the execution plan can be displayed in a tree-like structure (text or graphical).

The Oracle Optimizer: Where to Find Execution Plans?

• PLAN_TABLE (EXPLAIN PLAN or SQL*Plus autotrace)

• V$SQL_PLAN (Library Cache)

• V$SQL_PLAN_MONITOR (11g)

• DBA_HIST_SQL_PLAN (AWR)

• STATS$SQL_PLAN (Statspack)

• SQL Management Base (SQL Plan Management Baselines)

• SQL tuning set

• Trace files generated by DBMS_MONITOR

• Event 10053 trace file

• Process state dump trace file since 10gR2
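For the library cache source in particular, DBMS_XPLAN is the convenient wrapper (10g onward); a quick sketch that shows the actual plan of the last statement executed in the session:

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'TYPICAL'));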

The Oracle Optimizer: How To Read?

SQL> explain plan for
  2  select e.empno, e.ename, d.dname
  3  from emp e, dept d
  4  where e.deptno = d.deptno
  5  and e.deptno = 10;

Explained.

SQL> SELECT * FROM table(dbms_xplan.display(null,null,'basic'));

PLAN_TABLE_OUTPUT
------------------------------------------------
Plan hash value: 568005898

------------------------------------------------
| Id  | Operation                    | Name    |
------------------------------------------------
|   0 | SELECT STATEMENT             |         |
|   1 |  NESTED LOOPS                |         |
|   2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|   3 |    INDEX UNIQUE SCAN         | PK_DEPT |
|   4 |   TABLE ACCESS FULL          | EMP     |
------------------------------------------------

1. Operation 0 is the root of the tree; it has one child, Operation 1

2. Operation 1 has two children, Operations 2 and 4

3. Operation 2 has one child, which is Operation 3

The Oracle Optimizer: How To Read?

Operation 0 (SELECT STATEMENT)
  Operation 1 (NESTED LOOPS)
    Operation 2 (TABLE ACCESS BY INDEX ROWID)
      Operation 3 (INDEX UNIQUE SCAN)
    Operation 4 (TABLE ACCESS FULL)

This is the graphical representation of the execution plan.

Reading the tree: to perform Operation 1, you need Operations 2 and 4. Operation 2 comes first; to perform it, you need its child, Operation 3. Operation 4 is then performed for the rows returned by Operation 2.

Oracle supports the following access methods:

• Full Table Scan (FTS)
• Table Access by ROWID
• Index Unique Scan
• Index Range Scan
• Index Skip Scan
• Full Index Scan
• Fast Full Index Scan
• Index Joins
• Hash Access
• Cluster Access
• Bitmap Index

The Oracle Optimizer: Data Access Methods

Guidelines for Managing Indexes
• Create indexes after inserting table data
• Index the correct tables and columns
• Order index columns for performance
• Limit the number of indexes for each table
• Drop indexes that are no longer needed
• Understand deferred segment creation
• Estimate index size and set storage parameters
• Specify the tablespace for each index
• Consider parallelizing index creation
• Consider creating indexes with NOLOGGING
• Understand when to use unusable or invisible indexes
• Consider costs and benefits of coalescing or rebuilding indexes
• Consider cost before disabling or dropping constraints
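A small sketch touching several of these guidelines at once (the index and tablespace names are hypothetical; INVISIBLE requires 11g):

CREATE INDEX emp_dept_idx ON emp (deptno)
  TABLESPACE idx_ts   -- explicit tablespace
  NOLOGGING           -- faster build; take a backup afterwards
  PARALLEL 4          -- parallelize creation on a large table
  INVISIBLE;          -- test the impact before exposing it to the optimizer

ALTER INDEX emp_dept_idx NOPARALLEL;  -- reset the attribute after the build
ALTER INDEX emp_dept_idx VISIBLE;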

The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages

Index Type: Usage

• B-tree: Default, balanced tree index; good for high-cardinality (high degree of distinct values) columns
• B-tree cluster: Used with clustered tables
• Hash cluster: Used with hash clusters
• Function-based: Good for columns that have SQL functions applied to them
• Indexed virtual column: Good for columns that have SQL functions applied to them; a viable alternative to using a function-based index
• Reverse-key: Useful to balance I/O in an index that has many sequential inserts
• Key-compressed: Useful for concatenated indexes where the leading column is often repeated; compresses leaf block entries
• Bitmap: Useful in data warehouse environments with low-cardinality columns; these indexes aren't appropriate for online transaction processing (OLTP) databases where rows are heavily updated
• Bitmap join: Useful in data warehouse environments for queries that join fact and dimension tables
• Global partitioned: Global index across all partitions in a partitioned table
• Local partitioned: Local index based on individual partitions in a partitioned table
• Domain: Specific for an application or cartridge

[Diagram: physical layout of a table and B-tree index]

The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages

When you put indexes on a partitioned table, you have the choice between GLOBAL and LOCAL.

The LOCAL index partitions follow the table partitions: they have the same partition key and type, are created automatically when new table partitions are added, and are dropped automatically when table partitions are dropped.

Beware: LOCAL indexes are usually not appropriate for OLTP access on the table, because one server process may then have to scan through many index partitions. This is the cause of most of the scary performance horror stories you may have heard about partitioning!

A GLOBAL index spans all partitions. It usually has good SELECT performance but is more sensitive to partition maintenance than LOCAL indexes, and needs to be rebuilt more often.
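A minimal sketch contrasting the two choices; the sales table and index names are hypothetical.

CREATE TABLE sales (
  sale_id   NUMBER,
  sale_date DATE,
  amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
  PARTITION p2015 VALUES LESS THAN (DATE '2016-01-01'),
  PARTITION p2016 VALUES LESS THAN (DATE '2017-01-01')
);

-- LOCAL: one index partition per table partition, maintained automatically.
CREATE INDEX sales_date_lix ON sales (sale_date) LOCAL;

-- GLOBAL: spans all partitions; typically better for selective OLTP lookups.
CREATE INDEX sales_id_gix ON sales (sale_id) GLOBAL;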

The Oracle Optimizer: Optimizer Statistics

• Describe the database and the objects in the database
• Information used by the query optimizer to estimate:
  • Selectivity of predicates
  • Cost of each execution plan
  • Access method, join order, and join method
  • CPU and input/output (I/O) costs
• Refreshing optimizer statistics whenever they are stale is as important as gathering them:
  • Automatically gathered by the system
  • Manually gathered by the user with DBMS_STATS
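A minimal sketch of the manual route; the schema and table names are hypothetical.

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP',
    tabname          => 'COURSES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE);  -- also gathers statistics on the indexes
END;
/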

The Oracle Optimizer: Optimizer Statistics

A common misperception is that if no new statistics are gathered (and assuming nothing else is altered in the database), execution plans must always remain the same; that by not collecting statistics, one can somehow ensure and guarantee that the database will simply perform in the same manner and generate the same execution plans.

This is fundamentally not true. In fact, quite the opposite can be true. One might need to collect fresh statistics to make sure vital execution plans don't change. It's the act of not refreshing statistics that can cause execution plans to suddenly change.

explain plan changes with no stat change.sql

The Oracle Optimizer: Types of Optimizer Statistics

• Table statistics:

• Number of rows

• Number of blocks

• Average row length

• Index Statistics:

• B*-tree level

• Distinct keys

• Number of leaf blocks

• Clustering factor

• System statistics

• I/O performance and utilization

• CPU performance and utilization

The Oracle Optimizer: Histograms

• The optimizer assumes uniform distributions; this may lead to suboptimal access plans in the case of data skew.

• Histograms:

• Store additional column distribution information

• Give better selectivity estimates in the case of nonuniform distributions

• With unlimited resources you could store each different value and the number of rows for that value.

• This becomes unmanageable for a large number of distinct values and a different approach is used:

• Frequency histogram (#distinct values ≤ #buckets)

• Height-balanced histogram (#buckets < #distinct values)

• They are stored in DBA_TAB_HISTOGRAMS.
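A short sketch: request a histogram on a skewed column via METHOD_OPT, then inspect the endpoints (schema, table, and column names follow the course example; SIZE 254 is the classic maximum bucket count):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP',
    tabname    => 'COURSES',
    method_opt => 'FOR COLUMNS dev_name SIZE 254');
END;
/

SELECT endpoint_number, endpoint_value
FROM   user_tab_histograms
WHERE  table_name = 'COURSES' AND column_name = 'DEV_NAME'
ORDER  BY endpoint_number;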

The Oracle Optimizer: Frequency Histograms

[Chart: frequency histogram with 10 buckets for 10 distinct values (1, 3, 5, 7, 10, 16, 27, 32, 39, 49) over 40,001 rows. ENDPOINT_VALUE holds the column value; ENDPOINT_NUMBER holds the cumulative cardinality, so the row count for a value is the difference between adjacent endpoint numbers.]

The Oracle Optimizer: Height-Balanced Histograms

[Chart: height-balanced histogram with 5 buckets for 10 distinct values (1, 3, 5, 7, 10, 16, 27, 32, 39, 49) over 40,001 rows, about 8,000 rows per bucket. Bucket endpoint values: 1, 7, 10, 10, 32, 49; value 10 ends two buckets, marking it as a popular value. ENDPOINT_NUMBER is the bucket number and ENDPOINT_VALUE the column value at its endpoint.]

The Oracle Optimizer: Height-Balanced Histograms

In a height-balanced histogram, the ordered column values are divided into bands so that each band contains approximately the same number of rows.

The histogram tells you values of the endpoints of each band.

In the example in the slide, assume that you have a column that is populated with 40,001 numbers.

There will be 8,000 values in each band.

You only have ten distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, and 49.

Value 10 is the most popular value with 16,293 occurrences.

When the number of buckets is less than the number of distinct values, ENDPOINT_NUMBER records the bucket number and ENDPOINT_VALUE records the column value that corresponds to this endpoint.

Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache

Buffer cache

For many types of operations, Oracle Database uses the buffer cache to store data blocks read from disk.

Oracle Database bypasses the buffer cache for particular operations, such as sorting and parallel reads.

To use the database buffer cache effectively, tune SQL statements for the application to avoid unnecessary resource consumption.

To meet this goal, verify that frequently executed SQL statements and SQL statements that perform many buffer gets are well-tuned.

When configuring a new database instance, it is impossible to know the correct size for the buffer cache. Typically, a database administrator makes a first estimate for the cache size, then runs a representative workload on the instance and examines the relevant statistics to see whether the cache is under-configured or over-configured.
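Once a representative workload has run, V$DB_CACHE_ADVICE predicts the physical reads for candidate cache sizes; a rough sketch (an 8K block size is assumed):

SELECT size_for_estimate AS cache_mb,
       size_factor,
       estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT'
AND    block_size = 8192
ORDER  BY size_for_estimate;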

Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache

What is a physical I/O?

Whenever you execute a query, Oracle has to go and fetch data to give you the result of the query execution.

Here, data means the actual data in data blocks. Whenever a new data block is requested, it has to be fetched from the physical datafiles residing on the physical disks.

This fetching of data blocks from the physical disk involves an I/O operation known as physical I/O.

By virtue of this physical I/O, the block has now been fetched and read into the memory area called the buffer cache.

This is a default action.  We know that a data block might be requested multiple times by multiple queries.

Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache

What is a logical I/O?

Once a physical I/O has taken place and the block has been read into memory, the next request for the same data block won't require the block to be fetched from disk, thereby avoiding a physical I/O.

So, to return the results for a select query requesting the same data block, the block is fetched from memory; this is called a logical I/O. Whenever the quantum of logical I/O is calculated, two kinds of reads are considered: consistent reads and current reads.

Jointly, these 2 statistics are known as Logical I/O performed by Oracle.
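A quick way to watch the two kinds of I/O for your own session ('consistent gets' + 'db block gets' = logical reads):

SELECT n.name, s.value
FROM   v$mystat   s
JOIN   v$statname n ON n.statistic# = s.statistic#
WHERE  n.name IN ('consistent gets', 'db block gets', 'physical reads');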

Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache

Consistent reads  

It is a well-known fact that whenever a change is made to a data block, the old data/entry is written to the UNDO/ROLLBACK segments. From the fundamentals of UNDO, we also know that this provides a read-consistent view of the data block to other users trying to read the same block. Consistent reads mean reading the block as of a consistent "point in time": the time when the query/statement began.

A consistent read might or might not involve any UNDO data. UNDO data will be applied when it is necessary to roll back a data block to the required “point in time” when the SQL statement was fired. If on reading the buffer cache, it is found that the data block is already in the required state, no UNDO data is required because the block is already consistent.   

Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache

Consistent reads and array size

Consistent reads could also depend on and vary with the array size setting of SQL*Plus. The default value is 15. Array size is the number of rows fetched in a single read.

The value of array size is an indicator of the number of network round trips made to fetch the required data from Oracle.

A careful adjustment of array size value can improve performance by reducing the network round trips.

A higher array size might be good for query performance (by reducing the network round trips and also the consistent reads), but too high a value also uses more memory. However, array size is not a setting restricted to SQL*Plus; it can be set in many other applications requesting data from an Oracle database.
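A tiny SQL*Plus sketch (100 is an arbitrary but common choice):

SET ARRAYSIZE 100
SELECT * FROM emp;   -- now fetches 100 rows per network round trip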

Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache

How do you identify the source of the problem?

Solving database performance issues sometimes requires the use of operating system (OS) utilities.

These tools often provide information that can help isolate database performance problems. Consider the following situations:

• You’re running multiple databases and multiple applications on one server and want to use OS utilities to identify which database (and corresponding process) is consuming the most operating system resources. This approach is invaluable when one database application is consuming resources to the point of causing other databases on the box to perform poorly.

• You need to verify if the database server is adequately sized for current application workload in terms of CPU, memory, disk I/O, and network bandwidth. An analysis is needed to determine at what point the server will not be able to handle larger (future) workloads.

• You’ve used database tools to identify system bottlenecks and want to double-check the analysis via operating system tools.

How do you identify the source of the problem?

In these scenarios, to effectively analyze, tune, and troubleshoot, you'll need to employ OS tools to identify resource-intensive processes. Furthermore, if you have multiple databases and applications running on one server, when troubleshooting performance issues it's often more efficient to first determine which database and process is consuming the most resources. Operating system utilities help pinpoint whether the bottleneck is CPU, memory, disk I/O, or a network issue. In Linux/Unix environments, once you have the operating system identifier, you can then query the database to show any corresponding database processes and SQL statements.

How do you identify the source of the problem?

Solutions: where do you start and what order to work?

Mapping a Resource-Intensive Process to a Database Process

Problem

It's a dark and stormy night, and the system is performing poorly. You identify an operating system–intensive process on the box. You want to map an operating system process back to a database process. If the database process is a SQL process, you want to display the user of the SQL statement and also the SQL.

Solution

In Linux/Unix environments, if you can identify the resource-intensive operating system process, then you can easily check to see if that process is associated with a database process. The process consists of the following:

1. Run an OS command to identify resource-intensive processes and associated IDs.
2. Identify the database associated with the process.
3. Extract details about the process from the database data dictionary views.
4. If it's a SQL statement, get those details.
5. Generate an execution plan for the SQL statement.
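A minimal sketch of steps 2 through 4: once top or ps has given you the OS PID, map it to the session and its current SQL (:os_pid stands for the PID you identified).

SELECT s.sid,
       s.serial#,
       s.username,
       s.program,
       q.sql_text
FROM   v$process p
JOIN   v$session s ON s.paddr = p.addr
LEFT   JOIN v$sql q ON q.sql_id = s.sql_id
                   AND q.child_number = s.sql_child_number
WHERE  p.spid = :os_pid;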

Solutions: where do you start and what order to work?

Introduction to SQL and Application Tuning

Proactive Tuning Methodology

• Simple design
• Data modeling
• Tables and indexes
• Using views
• Writing efficient SQL
• Cursor sharing
• Using bind variables

Introduction to SQL and Application Tuning

Simplicity in Application Design

• Simple tables
• Well-written SQL
• Indexing only as required
• Retrieving only required information

Introduction to SQL and Application Tuning

Data Modeling

• Accurately represent business practices
• Focus on the most frequent and important business transactions
• Use modeling tools
• Appropriately normalize data (OLTP versus DW)

Introduction to SQL and Application Tuning

Table Design

• Compromise between flexibility and performance:
  • Principally normalize
  • Selectively denormalize
• Use Oracle performance and management features:
  • Default values
  • Constraints
  • Materialized views
  • Clusters
  • Partitioning
• Focus on business-critical tables

Introduction to SQL and Application Tuning

Index Design

• Create indexes on the following:
  • Primary key (automatically created)
  • Unique key (automatically created)
  • Foreign keys (good candidates)

• Index data that is frequently queried (select list).

• Use SQL as a guide to index design.

Introduction to SQL and Application Tuning

Using Views

• Simplifies application design
• Is transparent to the developer
• Can cause suboptimal execution plans

Introduction to SQL and Application Tuning

SQL Execution Efficiency

• Good database connectivity
• Minimizing parsing
• Share cursors
• Using bind variables

Introduction to SQL and Application Tuning

Writing SQL to Share Cursors

• Create generic code using the following:
  • Stored procedures and packages
  • Database triggers
  • Any other library routines and procedures
• Write to format standards (improves readability):
  • Case
  • White space
  • Comments
  • Object references
  • Bind variables

Introduction to SQL and Application Tuning

Performance Checklist

• Set initialization parameters and storage options.
• Verify resource usage of SQL statements.
• Validate connections by middleware.
• Verify cursor sharing.
• Validate migration of all required objects.
• Verify validity and availability of optimizer statistics.

Introduction to SQL and Application Tuning

When and What to Tune?

• Clustering factor
• Integrity Constraints are Important
• Reasons for Inefficient SQL Performance
• Using Bind Variables
• Restructuring SQL Statements
• Shared SQL and Cursors

When and What to Tune?

When and What to Tune? The Clustering Factor

The clustering factor is a number that represents the degree to which data is randomly distributed in a table.

In simple terms, it is the number of "block switches" needed while reading a table using an index.

When and What to Tune?

The diagram on the slide illustrates how scattered the rows of the table are. The first index entry (from the left of the index) points to the first data block, and the second index entry points to the second data block. During an index range scan or full index scan, the optimizer has to switch between blocks and may revisit the same block more than once because the rows are scattered. The number of times the optimizer makes these block switches is what is termed the "clustering factor".

When and What to Tune?

The second image represents a good CF: during an index range scan, the optimizer rarely has to jump to the next data block, because most adjacent index entries point to the same data block. This helps significantly in reducing the cost of your SELECT statements.

The clustering factor is stored in the data dictionary and can be viewed from dba_indexes (or user_indexes); see the sketch below.

Clustering factor.sql
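A quick sketch of that check: compare CLUSTERING_FACTOR with the table's block and row counts. A value near BLOCKS indicates well-clustered data; a value near NUM_ROWS indicates scattered data (EMP is just an example table).

SELECT i.index_name,
       i.clustering_factor,
       t.blocks,
       t.num_rows
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  i.table_name = 'EMP';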

Integrity Constrains are Important

Many people think of constraints as a data integrity thing, and it’s true—they are. But constraints are used by the optimizer as well when determining the optimal execution plan.

The optimizer takes as inputs 

• The query to optimize
• All available database object statistics
• System statistics, if available (CPU speed, single-block I/O speed, and so on: metrics about the physical hardware)
• Initialization parameters
• Constraints

null columns differ from not null.sql
fk adds to query performance.sql

When and What to Tune?

• Reasons for inefficient SQL performance:
  • Stale or missing optimizer statistics
  • Missing access structures
  • Suboptimal execution plan selection
  • Poorly constructed SQL

When and What to Tune?

Richard Morris: Are there issues that crop up again and again?

Tom Kyte: Perhaps the biggest issue is the black box approach of development. A developer will learn everything they can about the procedural language they're using. However, they don't learn about the database that they're using or other packages that might be involved…

Richard Morris: Do you think then that poor education is to blame? That somehow it's got worse over the years rather than getting better?

Tom Kyte: No, it hasn't changed. When I get up on stage at a seminar and I talk about bind variables I start by saying that for 16 years I've been talking about the same thing but each year the problem is the same. Why? Because universities are trying to teach students theory and algorithms and things like that, they're not teaching them how to write production quality code. They don't teach them how to debug or how to instrument, they don't teach them how to defensively program. They just teach them how to write a compiler in Lisp which frankly doesn't translate very well into IT.

Using Bind Variables

Oracle automatically notices when applications send similar SQL statements to the database. The SQL area used to process the first occurrence of the statement is shared; that is, it is used for processing subsequent occurrences of that same statement. Therefore, only one shared SQL area exists for a unique statement. Because shared SQL areas are shared memory areas, any Oracle process can use a shared SQL area. The sharing of SQL areas reduces memory use on the database server, thereby increasing system throughput.

In evaluating whether statements are similar or identical, Oracle considers SQL statements issued directly by users and applications as well as recursive SQL statements issued internally by a DDL statement.

One of the first stages of parsing is to compare the text of the statement with existing statements in the shared pool to see if the statement can be shared. If the statement differs textually in any way, then Oracle does not share the statement. Exceptions to this are possible when the parameter CURSOR_SHARING has been set to SIMILAR or FORCE. (See the sketch below.)
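A small SQL*Plus sketch of the difference: the bind version stays one shared cursor, while each distinct literal produces a textually different statement (and therefore a hard parse).

VARIABLE dept NUMBER
EXEC :dept := 10

SELECT ename FROM emp WHERE deptno = :dept;   -- shareable across values
SELECT ename FROM emp WHERE deptno = 10;      -- one cursor per literal

-- Inspect the resulting cursors in the shared pool:
SELECT sql_text, executions, parse_calls
FROM   v$sql
WHERE  sql_text LIKE 'SELECT ename FROM emp%';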

When and What to Tune?

ADAPTIVE BINDING

DBAs are always encouraging developers to use bind variables, but when bind variables are used against columns containing skewed data they sometimes lead to less than optimum execution plans. This is because the optimizer peeks at the bind variable value during the hard parse of the statement, so the value of a bind variable when the statement is first presented to the server can affect every execution of the statement, regardless of subsequent bind values.

Oracle uses Adaptive Cursor Sharing to solve this problem by allowing the server to compare the effectiveness of execution plans between executions with different bind variable values. If it notices suboptimal plans, it allows certain bind variable values, or ranges of values, to use alternate execution plans for the same statement. This functionality requires no additional configuration.

https://oracle-base.com/articles/11g/adaptive-cursor-sharing-11gr1
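On 11g you can watch the feature at work through two V$SQL columns; a quick sketch (the sql_text filter is hypothetical):

SELECT sql_id, child_number, is_bind_sensitive, is_bind_aware, executions
FROM   v$sql
WHERE  sql_text LIKE 'SELECT /* acs_demo */%';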

When and What to Tune?

• Restructuring SQL Statements: five examples of statements that are candidates for rewriting:

1. SELECT COUNT(*) FROM products p
   WHERE prod_list_price <
         1.15 * (SELECT AVG(unit_cost) FROM costs c
                 WHERE c.prod_id = p.prod_id);

2. SELECT * FROM job_history jh, employees e
   WHERE SUBSTR(TO_CHAR(e.employee_id), 2) =
         SUBSTR(TO_CHAR(jh.employee_id), 2);

3. SELECT * FROM orders WHERE order_id_char = 1205;

4. SELECT * FROM employees WHERE TO_CHAR(salary) = :sal;

5. SELECT * FROM parts_old
   UNION
   SELECT * FROM parts_new;
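The slide leaves the rewrites to the audience; a hedged sketch of the conventional fixes for examples 3 to 5 (avoid implicit conversion on the indexed column, move the conversion to the bind, and skip the duplicate-eliminating sort when it is not needed):

-- 3. Compare strings to strings so an index on order_id_char stays usable:
SELECT * FROM orders WHERE order_id_char = '1205';

-- 4. Convert the bind, not the column:
SELECT * FROM employees WHERE salary = TO_NUMBER(:sal);

-- 5. UNION ALL avoids the sort that UNION performs to eliminate duplicates:
SELECT * FROM parts_old
UNION ALL
SELECT * FROM parts_new;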

Various SQL and PL/SQL techniques to improve performance

Advanced SQL and Application Topics