Oracle to MySQL 2012
Oracle - MySQL
Migration
Marco Tusa
MySQL CTL
MySQL Conference 12 April 2012
© 2012 Pythian 3
Why Pythian
• Recognized Leader:
- Global industry leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and SQL Server
- Work with over 165 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments
• Expertise:
- One of the world’s largest concentrations of dedicated, full-time DBA expertise. Employ 7 Oracle ACEs/ACE Directors
- Hold 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC
• Global Reach & Scalability:
- 24/7/365 global remote support for DBA and consulting, systems administration, special projects or emergency response
Who am I?
• Cluster Technical Leader at Pythian for MySQL technology
• Previously manager of Professional Services, South EMEA, at MySQL/Sun/Oracle
• At MySQL since before Sun acquired it
• Led the team responsible for Oracle & MySQL database services supporting technical systems at the Food and Agriculture Organization of the United Nations (FAO of UN)
• Led developer & system administrator teams at FAO managing the Intranet/Internet infrastructure
• Worked (a lot) in developing countries (Ethiopia, Senegal, Ghana, Egypt …)
• My Profile http://it.linkedin.com/in/marcotusa
• Email [email protected] [email protected]
Why MySQL
I like to start from:*
• Scalability and Flexibility
• High Performance
• High Availability
• Robust Transactional Support
• Web and Data Warehouse Strengths
• Strong Data Protection
• Comprehensive Application Development
• Management Ease
• Open Source Freedom and 24 x 7 Support
• Lowest Total Cost of Ownership
*http://www.mysql.com/why-mysql/topreasons.html
Why MySQL?
MySQL TCO Savings Calculator (now)*
*From www.mysql.com TCO calculator
Why MySQL?
MySQL TCO Savings Calculator (before)*
*From the www.mysql.com TCO calculator, in earlier times
Why MySQL?
All good then? When should I migrate my environment to MySQL?
Cost is not the only aspect to consider:
• Need to use MySQL correctly
• Be aware of existing issues
- a good list of them from Baron*
• Identify the real effort required for the migration
*http://www.xaprb.com/blog/2009/03/13/50-things-to-know-before-migrating-oracle-to-mysql/
12 things to know about MySQL (1)
1 Subqueries are poorly optimized (optimization expected in 5.6: http://dev.mysql.com/doc/refman/5.6/en/from-clause-subquery-optimization.html)
2 There is limited ability to audit (no user reference unless the general log is active).
3 Authentication is built-in. There is no LDAP, Active Directory, or other external authentication capability. (A new PAM module is available for 5.5, but only in the Enterprise edition.)
4 Data integrity checking is very weak, and even basic integrity constraints cannot always be enforced. (replication)
5 Most queries can use only a single index per table; some multi-index query plans exist in certain cases, but the cost is usually underestimated by the query optimizer, and they are often slower than a table scan.
6 Foreign keys are not supported in most storage engines.
12 things to know about MySQL (2)
7 Execution plans are not cached globally, only per-connection.
8 There are no integrated or add-on packages for business intelligence, OLAP cubes, etc.
9 There are no materialized views (although the Event Scheduler can be used to emulate them).
10 Replication is asynchronous and has many limitations and edge cases.
11 DDL such as ALTER TABLE or CREATE TABLE is non-transactional. It commits open transactions and cannot be rolled back or crash-recovered.
12 Each storage engine can have widely varying behavior, features, and properties (positive and negative).
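Item 9 mentions the Event Scheduler as a stand-in for materialized views. A minimal sketch of that idea follows. The tables demo_sales and demo_sales_summary and the event name are hypothetical (they do not come from the slides), and the sketch assumes an account allowed to enable the scheduler:

```sql
-- Hypothetical source table; all names are illustrative only.
CREATE TABLE demo_sales (
  id     INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  region VARCHAR(30) NOT NULL,
  amount DECIMAL(10,2) NOT NULL,
  KEY idx_region (region)
) ENGINE=InnoDB;

-- Summary table standing in for the materialized view.
CREATE TABLE demo_sales_summary (
  region VARCHAR(30) NOT NULL PRIMARY KEY,
  total  DECIMAL(14,2) NOT NULL
) ENGINE=InnoDB;

-- Events only fire while the scheduler is running.
SET GLOBAL event_scheduler = ON;

-- Re-aggregate every hour; REPLACE overwrites each region's row by primary key.
CREATE EVENT demo_refresh_sales_summary
  ON SCHEDULE EVERY 1 HOUR
  DO
    REPLACE INTO demo_sales_summary (region, total)
    SELECT region, SUM(amount) FROM demo_sales GROUP BY region;
```

The summary is only as fresh as the last run, and this sketch does not remove regions that disappear from the source table, so it is an approximation of a materialized view rather than a replacement.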
Getting Started?
Prepare a plan, and do not improvise
• Analyze the source (from application to data design)
• Identify show stoppers
• Identify how to map what to what
• Identify how to organize the target
Most important:
Do not force the migration. If it does not make sense to proceed, STOP!
The Motto
Use the right tool for the job
Most common source cases
• Database is used only to store data; all the logic resides in the application
• Database contains logic such as stored procedures and complex packages
• Database contains data for a data warehouse
• Real-time data and historical records (telephone company)
Define the process
[Process diagram: Analyze, Understand, Match src/dest, Re/Design, Extract src, Convert, Import (schemas, data, logic), Partition, Index, Test/POC, Validate; with a "Something fails" branch]
Mitigating risk of failure (analyze)
When analyzing the source database(s), what should be the outcome?
• An easy-to-understand exclusion list
• Identification of the source type (simple data move; data + intelligence; data mart)
• Detailed per-schema review of complexity
• Detailed assessment of the modifications and effort required for database objects
• Detailed assessment of functions/functionalities used (also in the application)
• Application assessment and review
Mitigating risk of failure (analyze)
Easy to understand excluding list
•Create a rank on the “impedance“
o Apply it to the analyzed schema, e.g.:
Issue | Workaround | Grade* | Notes
Reference to external schemas in a different instance (db link) | | 10 | Not portable
…
Packages | See Writing stored procedures | 9 | Require full recoding
Procedures | See Writing stored procedures | 9 | Require full recoding
Unique key longer than 255 characters | See Key length limitations | 4 |
Views alias | Manually added | 4 | Column aliases must be added manually
Sequences | See Migration of Sequences | 3 | Whenever possible convert to autoincrement
Empty schemas | See empty schema definitions | 2 | Convert to User definition
*The lower the grade the better
Mitigating risk of failure (analyze)
1. Identify and understand differences
- Oracle vs MySQL behavior
- DDL differences MySQL - Oracle
- DML differences
- Data formatting and encoding
- Data set dimensions
2. Identify and understand business logic differences
- map Oracle functions to MySQL
- convert Oracle logic to MySQL (if possible)
3. Realize a Proof of Concept
- involve an experienced Oracle DBA
- involve an experienced MySQL DBA
- involve the developers
- use real data
- use real traffic
Mitigating risk of failure (understand)
Understanding server behavior
Identify different behavior between Oracle and MySQL; some basic differences (cont.)
• Oracle is case insensitive in schema object definitions, while MySQL table names can be case sensitive depending on the platform (remember to set lower_case_table_names)
• Oracle does not supply an implicit DEFAULT value for NOT NULL columns; MySQL does (in non-strict mode)
• Oracle supports fractional seconds (milliseconds); MySQL only from 5.6
• Oracle does not apply silent conversion to data types; MySQL does (set sql_mode)
• Oracle maximum VARCHAR2 size is 4,000 bytes; MySQL VARCHAR is up to 65,535 bytes (subject to the row size limit)
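The silent-conversion point is the one that tends to surprise Oracle users most. Here is a minimal sketch of what sql_mode changes, assuming a hypothetical table demo_strict (not from the slides) and a session allowed to change sql_mode:

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_strict (
  code CHAR(3) NOT NULL,
  qty  TINYINT NOT NULL
) ENGINE=InnoDB;

-- Non-strict mode: MySQL adjusts the values and only raises warnings.
SET SESSION sql_mode = '';
INSERT INTO demo_strict (code, qty) VALUES ('ABCD', 1000);
SHOW WARNINGS;                         -- lists the truncation and the out-of-range value
SELECT code, qty FROM demo_strict;     -- shows what was actually stored

-- Strict mode: the same statement is rejected with an error, closer to Oracle.
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
INSERT INTO demo_strict (code, qty) VALUES ('ABCD', 1000);

-- Check how table names are compared on this server (a startup setting).
SHOW VARIABLES LIKE 'lower_case_table_names';
```

Because the last INSERT is expected to fail, a script run through the mysql client stops there unless it is started with --force.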
Mitigating risk of failure (understand)
Understanding server behavior
Identify different behavior between Oracle and MySQL; some basic differences
1. What is what, understanding the naming conventions
AUTO COMMIT
Enabled by default in MySQL
- with autocommit on, each statement is committed at once and you can't ROLLBACK
- non-transactional storage engines never roll back
- SET AUTOCOMMIT = {0 | 1};
2. Securing the database
Database Authentication/Privileges
- MySQL Privileges (local; no roles)
- Oracle System Privileges (local/external; roles)
3. DUAL is not required in MySQL
- e.g. SELECT 1+1
but it is provided for Oracle compatibility
- SELECT 1+1 FROM DUAL
- SELECT CURRENT_USER() FROM DUAL
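The autocommit behavior can be sketched in a few statements. The table demo_tx is hypothetical and the example assumes InnoDB, since a non-transactional engine would keep every row regardless of ROLLBACK:

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_tx (
  id   INT NOT NULL PRIMARY KEY,
  note VARCHAR(20)
) ENGINE=InnoDB;

-- Default (autocommit = 1): every statement is committed on its own.
INSERT INTO demo_tx VALUES (1, 'committed');
ROLLBACK;                        -- no effect, row 1 is already committed

-- An explicit transaction suspends autocommit until COMMIT or ROLLBACK.
START TRANSACTION;
INSERT INTO demo_tx VALUES (2, 'pending');
ROLLBACK;                        -- row 2 is discarded

-- Or switch the whole session to Oracle-like behavior.
SET autocommit = 0;
INSERT INTO demo_tx VALUES (3, 'pending');
COMMIT;

SELECT id FROM demo_tx ORDER BY id;   -- should list ids 1 and 3

-- DUAL is optional: both forms are accepted.
SELECT 1 + 1;
SELECT 1 + 1 FROM DUAL;
```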
Mitigating risk of failure (understand)
Understanding DDL differences
Key length limitations
Oracle handles indexes with a key length of up to 40% (plus some overhead) of the database block size (db_block_size); this can be a problem with MySQL.
MySQL limits a primary key or index key to 767 bytes (InnoDB) or 1,000 bytes (MyISAM).
Because MySQL's utf8 reserves 3 bytes per character, a primary key or any other key on an InnoDB table can cover at most 255 characters.
Workaround, for InnoDB only: innodb_large_prefix with the DYNAMIC or COMPRESSED row format.
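Two ways of living with the key length limit, as a sketch. The table names are hypothetical. The first option (a prefix index) is not mentioned on the slide, and the server variables in the second option are those of the MySQL 5.5/5.6 generation discussed here, which later releases no longer have:

```sql
-- Option 1: index only a prefix. 255 characters x 3 bytes = 765 bytes,
-- which fits under the 767-byte InnoDB limit.
CREATE TABLE demo_key_prefix (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  long_name VARCHAR(500) NOT NULL,
  KEY idx_long_name (long_name(255))
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Option 2: enable large prefixes (needs the Barracuda file format and
-- one file per table), then use the DYNAMIC row format.
SET GLOBAL innodb_file_format = 'Barracuda';
SET GLOBAL innodb_file_per_table = 1;
SET GLOBAL innodb_large_prefix = 1;

CREATE TABLE demo_key_large (
  id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  long_name VARCHAR(500) NOT NULL,
  UNIQUE KEY uq_long_name (long_name)   -- 500 x 3 = 1500 bytes
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC DEFAULT CHARSET=utf8;
```

A prefix index cannot enforce uniqueness on the full value, so option 1 suits lookups while option 2 is the one to consider when an Oracle unique key has to be preserved.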
Mitigating risk of failure (understand)
Understanding DDL differences
autoincrement/sequence
Oracle uses sequences, while MySQL is bound to AUTO_INCREMENT.
An AUTO_INCREMENT column must be NOT NULL and indexed (normally as part of the primary key).
Oracle can read sequence values directly; MySQL needs to use the function LAST_INSERT_ID().
LAST_INSERT_ID() is maintained per connection and is thus safe for concurrent use.
Do not use “SELECT MAX(id)+1 FROM tab”: two concurrent sessions can read the same value.
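A short sketch of the sequence-to-AUTO_INCREMENT mapping. The tables demo_city and demo_city_note and the Oracle sequence name in the comments are hypothetical:

```sql
-- Oracle, for comparison (not MySQL syntax):
--   INSERT INTO city (id, name) VALUES (city_seq.NEXTVAL, 'Rome');
--   INSERT INTO city_note (city_id, note) VALUES (city_seq.CURRVAL, 'capital');

CREATE TABLE demo_city (
  id   INT NOT NULL AUTO_INCREMENT,
  name VARCHAR(50) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE demo_city_note (
  city_id INT NOT NULL,
  note    VARCHAR(100) NOT NULL,
  KEY idx_city (city_id)
) ENGINE=InnoDB;

-- NULL (or omitting the column) asks MySQL to generate the next value.
INSERT INTO demo_city (id, name) VALUES (NULL, 'Rome');

-- LAST_INSERT_ID() returns the value generated by this connection only.
INSERT INTO demo_city_note (city_id, note) VALUES (LAST_INSERT_ID(), 'capital');
```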
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
Given how much the presence of stored procedures and triggers matters in a migration, it is worth discussing them in a little more detail.
Procedure and trigger differences
• one trigger per event (and action time) per table in MySQL; all the different actions need to be grouped into it
• no packages; the workaround is to use a fake schema
• different behavior by storage engine, and depending on whether it is transactional or not
• security assignments and security definer/invoker
• before 5.5, very basic error handling and no “signal”. So version 5.5 is almost mandatory if you need decent error handling.
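Two of these points can be shown together: several Oracle triggers collapsed into one MySQL trigger, and SIGNAL (5.5 and later) used for error handling. The table demo_town and the trigger name are hypothetical, and DELIMITER is a mysql command-line client directive rather than SQL:

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_town (
  id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name       VARCHAR(50) NOT NULL,
  population INT NOT NULL
) ENGINE=InnoDB;

DELIMITER //
CREATE TRIGGER demo_town_bi
BEFORE INSERT ON demo_town
FOR EACH ROW
BEGIN
  -- Action 1 (could have been its own trigger in Oracle): normalize input.
  SET NEW.name = TRIM(NEW.name);

  -- Action 2: validate, raising a user-defined error with SIGNAL.
  IF NEW.population < 0 THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'population cannot be negative';
  END IF;
END//
DELIMITER ;

-- This insert is expected to be rejected by the trigger.
INSERT INTO demo_town (name, population) VALUES ('  Nowhere ', -1);
```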
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
MySQL stored programs can often add to application functionality and developer efficiency, and there are certainly many cases where the use of a procedural language such as the MySQL stored program language can do things that a non-procedural language like SQL cannot.
There are also a number of reasons why a MySQL stored program approach may offer performance improvements over a traditional SQL approach:
• It provides a procedural approach (SQL is a declarative, non-procedural language)
• It reduces client-server traffic
• It allows us to divide and conquer complex statements
But…
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
One graph tells more than 1,000 words:
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
IF and CASE Statements
When constructing IF and CASE statements, try to minimize the number of comparisons that these statements are likely to make by testing for the most likely scenarios first.
For instance, in the code in the next slide, the first statement maintains counts of various percentages.
Assuming that the input data is evenly distributed, the first IF condition (percentage>95) will match about once in every 20 executions.
On the other hand, the final ELSE branch will be taken in three out of four executions. This means that in 75% of the cases every comparison in the statement has to be evaluated before the matching branch is reached.
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
Non Optimized
-- Least likely condition is tested first
IF (percentage>95) THEN
  SET Above95=Above95+1;
ELSEIF (percentage>=90) THEN
  SET Range90to95=Range90to95+1;
ELSEIF (percentage>=75) THEN
  SET Range75to89=Range75to89+1;
ELSE
  SET LessThan75=LessThan75+1;
END IF;
Optimized
-- Most likely condition is tested first
IF (percentage<75) THEN
  SET LessThan75=LessThan75+1;
ELSEIF (percentage>=75 AND percentage<90) THEN
  SET Range75to89=Range75to89+1;
ELSEIF (percentage>=90 AND percentage<=95) THEN
  SET Range90to95=Range90to95+1;
ELSE
  SET Above95=Above95+1;
END IF;
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
It looks simple, and the effect is significant:
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
Using Recursion
A recursive routine is one that invokes itself.
Recursive routines often offer elegant solutions to complex programming problems, but they also tend to consume large amounts of memory.
They are also likely to be less efficient and less scalable than implementations based on iterative execution.
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
Recursive
CREATE PROCEDURE rec_fib(n INT, OUT out_fib INT)
BEGIN
  DECLARE n_1 INT;
  DECLARE n_2 INT;
  IF (n=0) THEN
    SET out_fib=0;
  ELSEIF (n=1) THEN
    SET out_fib=1;
  ELSE
    -- Two recursive calls per level
    CALL rec_fib(n-1,n_1);
    CALL rec_fib(n-2,n_2);
    SET out_fib=(n_1 + n_2);
  END IF;
END
Not Recursive
CREATE PROCEDURE nonrec_fib(n INT, OUT out_fib INT)
BEGIN
  DECLARE m INT DEFAULT 0;
  DECLARE k INT DEFAULT 1;
  DECLARE i INT;
  DECLARE tmp INT;
  SET m=0;
  SET k=1;
  SET i=1;
  -- After each pass m holds the i-th Fibonacci number
  WHILE (i<=n) DO
    SET tmp=m+k;
    SET m=k;
    SET k=tmp;
    SET i=i+1;
  END WHILE;
  SET out_fib=m;
END
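To try the two procedures side by side, note that MySQL disables stored procedure recursion by default (max_sp_recursion_depth is 0), so the recursive version fails until the limit is raised. A usage sketch, assuming both procedures above have already been created in the current schema; the user variables @fib_rec and @fib_iter are arbitrary names:

```sql
-- Allow nested CALLs of the same procedure for this session.
SET SESSION max_sp_recursion_depth = 20;

CALL rec_fib(10, @fib_rec);
CALL nonrec_fib(10, @fib_iter);

-- Both variables should hold the 10th Fibonacci number, 55.
SELECT @fib_rec, @fib_iter;
```

Timing the two calls with increasing values of n is a simple way to reproduce the kind of comparison the slides refer to.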
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
The difference is quite impressive and evident
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
When you need to retrieve only a single row from a SELECT statement, using the INTO clause is far easier than declaring, opening, fetching from, and closing a cursor. But does the INTO clause generate some additional work for MySQL, or could the INTO clause be more efficient than a cursor?
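For reference, the two forms being compared look like this. The table demo_lookup and both procedure names are hypothetical, and the sketch only shows the syntax; it makes no claim about which form is faster:

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_lookup (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(50) NOT NULL
) ENGINE=InnoDB;

DELIMITER //

-- Single-row retrieval with SELECT ... INTO.
CREATE PROCEDURE demo_get_name_into(IN p_id INT, OUT p_name VARCHAR(50))
BEGIN
  SELECT name INTO p_name FROM demo_lookup WHERE id = p_id;
END//

-- The same retrieval with an explicit cursor.
CREATE PROCEDURE demo_get_name_cursor(IN p_id INT, OUT p_name VARCHAR(50))
BEGIN
  DECLARE c_name CURSOR FOR
    SELECT name FROM demo_lookup WHERE id = p_id;
  OPEN c_name;
  FETCH c_name INTO p_name;
  CLOSE c_name;
END//

DELIMITER ;

INSERT INTO demo_lookup VALUES (1, 'Rome');
CALL demo_get_name_into(1, @n1);
CALL demo_get_name_cursor(1, @n2);
SELECT @n1, @n2;
```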
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
Trigger Overhead
Every database trigger is associated with a specific DML operation (INSERT, UPDATE, or DELETE) on a specific table; the trigger code will execute whenever that DML operation occurs on that table.
Furthermore, all MySQL 5.x triggers are of the FOR EACH ROW type, which means that the trigger code will execute once for each row affected by the DML operation.
Given that a single DML operation might potentially affect thousands of rows, should we be concerned that our triggers might have a negative effect on DML performance?
Absolutely yes!
Mitigating risk of failure (understand)
Understanding Function/Trigger differences
When using triggers, ALWAYS be sure to have the right indexes.
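Why indexes matter so much here: the trigger body runs once per affected row, so any statement inside it that scans a table is multiplied by the number of rows touched. A sketch with hypothetical tables demo_orders and demo_customer_totals:

```sql
CREATE TABLE demo_orders (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  amount      DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE demo_customer_totals (
  customer_id INT NOT NULL,
  total       DECIMAL(12,2) NOT NULL DEFAULT 0,
  -- Without this index the UPDATE in the trigger scans the whole table
  -- once for every inserted order row.
  KEY idx_customer (customer_id)
) ENGINE=InnoDB;

DELIMITER //
CREATE TRIGGER demo_orders_ai
AFTER INSERT ON demo_orders
FOR EACH ROW
BEGIN
  UPDATE demo_customer_totals
     SET total = total + NEW.amount
   WHERE customer_id = NEW.customer_id;
END//
DELIMITER ;
```

Running EXPLAIN on the statements inside a trigger body, with representative values in place of NEW.*, is a reasonable way to check that they use an index.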
Mitigating risk of failure (match)
Understanding DDL differences
Identify conversion between Oracle and MySQL for
• Tables
• Views
• Procedures
• Functions
• Packages
• Triggers
• Sequences, synonyms etc.
E.g. data types:
MySQL Data Type | Oracle Data Type
BIGINT | NUMBER(19, 0)
BIT | RAW
BLOB | BLOB, RAW
CHAR | CHAR
DATE | DATE
DATETIME | DATE
DECIMAL | FLOAT (24)
DOUBLE | FLOAT (24)
DOUBLE PRECISION | FLOAT (24)
ENUM | VARCHAR2
FLOAT | FLOAT
INT | NUMBER(10, 0)
INTEGER | NUMBER(10, 0)
LONGBLOB | BLOB, RAW
LONGTEXT | CLOB, RAW
MEDIUMBLOB | BLOB, RAW
MEDIUMINT | NUMBER(7, 0)
MEDIUMTEXT | CLOB, RAW
NUMERIC | NUMBER
REAL | FLOAT (24)
SET | VARCHAR2
SMALLINT | NUMBER(5, 0)
TEXT | VARCHAR2, CLOB
TIME | DATE
TIMESTAMP | DATE
TINYBLOB | RAW
TINYINT | NUMBER(3, 0)
TINYTEXT | VARCHAR2
VARCHAR | VARCHAR2, CLOB
YEAR | NUMBER
Mitigating risk of failure (match)
Understanding DML differences
1. Join syntax
2. SQL_mode
3. Data comparison using collation
4. Other common differences
• SQL macro differences
• NVL() --> IFNULL()
• ROWNUM --> LIMIT
• SEQ.CURRVAL --> LAST_INSERT_ID()
• SEQ.NEXTVAL --> NULL (inserted into an AUTO_INCREMENT column)
• No DUAL necessary (SELECT NOW())
• No DECODE() --> IF() or CASE
• JOIN (+) syntax --> INNER|OUTER LEFT|RIGHT
• No hierarchical queries (CONNECT BY PRIOR)
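A few of these mappings written out, with the Oracle form kept as a comment above each MySQL statement. The tables demo_emp and demo_dept and their columns are hypothetical:

```sql
CREATE TABLE demo_dept (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(30) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE demo_emp (
  id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name     VARCHAR(50) NOT NULL,
  nickname VARCHAR(50) NULL,
  status   CHAR(1) NOT NULL,
  dept_id  INT NULL,
  KEY idx_dept (dept_id)
) ENGINE=InnoDB;

-- Oracle: SELECT NVL(nickname, name) FROM emp WHERE ROWNUM <= 10;
SELECT IFNULL(nickname, name) FROM demo_emp LIMIT 10;

-- Oracle: SELECT DECODE(status, 'A', 'Active', 'I', 'Inactive', 'Unknown') FROM emp;
SELECT CASE status
         WHEN 'A' THEN 'Active'
         WHEN 'I' THEN 'Inactive'
         ELSE 'Unknown'
       END AS status_label
  FROM demo_emp;

-- Oracle: SELECT e.name, d.name FROM emp e, dept d WHERE e.dept_id = d.id(+);
SELECT e.name, d.name AS dept_name
  FROM demo_emp e
  LEFT JOIN demo_dept d ON d.id = e.dept_id;

-- Oracle: SELECT SYSDATE FROM dual;
SELECT NOW();
```

Note that ROWNUM is applied before ORDER BY in Oracle while LIMIT is applied after it in MySQL, so queries that combine the two need to be checked individually rather than converted mechanically.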
Mitigating risk of failure (convert)
Data export & Index redesign
• Re-organize the schema/table, do not just convert data types
- Storage engines
- Index full redesign
• Data organization
- Sharding
- Partition
• Logic rewrite
- Inside MySQL
- Move to application
Mitigating risk of failure (POC)
Realize a Proof of Concept
Don’t work alone
Involve an experienced Oracle DBA
Involve an experienced MySQL DBA
Involve the developers
Use real data
Use real traffic
Take one source for each type; start with the easy one
Go back to the analysis phase if you have to
What should my migration doc contain?
General document
• Description of the main differences between platforms
• Description of the workarounds found
• Explanation of what to do to avoid most common issues
• Code-writing instructions
• Common function mapping
• List of the blocking issue(s)
• List and explanation of what cannot be migrated and why
What should my migration doc contain?
Per schema document
• Overview of the effort for the migration
Schema Name: Test
Objects | Number | Time (min) | Hrs | Cost (0,50 cent/min)
Table | 200 | 320 | 5,3 | 2,65
Views | 50 | 5 | 0,08 | 0,04
Procedure | 500 | 5000 | 83,33 | 41,67
Function | 12 | 200 | 3,33 | 1,67
Trigger | 200 | 2500 | 41,67 | 20,83
Package | 3 | 5 | 0,08 | 0,04
Total Time | | 8030 | 133,80 | 66,90
What should my migration doc contain?
Per schema document
• Effort per table, like:
Schema Name: Test
Table: City
Rows: 2000
Estimated min: 10
Attribute | Data type source | Dim source | Data type dest | Dim dest
Name | VARCHAR2 | 50 | varchar | 50
lat | FLOAT | | FLOAT |
long | FLOAT | | FLOAT |
population | Number | 10,0 | INT | 10
SqKm | Number | 7,0 | MEDIUMINT |
Country | CHAR | 3 | CHAR | 3
What should my migration doc contain?
Trigger section
• Effort per schema
• Effort per Table:
Schema Name: Test
Events | Before | Time (min) | After | Time (min)
Insert | 50 | 600 | 20 | 300
Update | 50 | 600 | 20 | 300
Delete | 50 | 600 | 10 | 100
Total | 150 | | 50 |
Packages | 3 | | 3 |
Total Time | | 1800 | | 700
Schema Name: Test Table
Total 24
Action time | Trigger name source | Insert* | Update* | Delete* | Trigger name dest
Before | Ins_change_ID | 20 | | | Ins_actions
Before | Ins_change_ISO | 10 | | |
Before | upd_population | | 15 | | upd_population
After | del_died_male | | | 10 | del_died_male
After | avr_pop_calculation | | 15 | | avr_pop_calculation
Total | | 30 | 30 | 10 |
*time in minutes for conversion
What should my migration doc contain?
Procedure - function section
• Effort per schema
• Effort per Table:
Schema Name: Test
Package | Number | Time (min) | Cost (0,50 cent/min)
Pack1 | 112 | 1200 |
Pack2 | 200 | 2000 |
Pack3 | 200 | 2000 |
Total | 521 | |
Packages | 3 | |
Total Time | | 5200 |
Schema Name: Test Table
Total 3
Procedure name | Code rows | Impedance** | Time* | Package | Comments
Proc_1 | 200 | 4 | 480 | Pckg1 | Complex error handling
Func_1 | 50 | 0 | 120 | Pckg2 | No problem
Proc_2 | 300 | 10 | - | Pckg1 | Use of connect by prior
Total | | | 600 | |
*time in minutes for conversion
**The lower the better; 10 means not portable
What should my migration doc contain?
Document from the Proof of Concept per source type
• Expected results
• Real values from the test
• Issues found
• Workarounds identified
• Time/Effort per schema
• Breakdown per object (Table, View, Trigger, SP)
• Redefine expectations
• Review efforts and costs
• Be ready to drop something from the migration list
Should I do all this manually?
No! There are tools on the market, but:
• Choose your product carefully!
• Better a simple one than something too complex
• Always double check before applying
• Nothing will replace human/professional knowledge/experience
http://www.pythian.com/news/
http://www.facebook.com/pages/The-Pythian-Group/163902527671
@pythian
http://www.linkedin.com/company/pythian
1-877-PYTHIAN
To contact us…
To follow us…
Thank you and Q&A
@pythianjobs