Several Ways to Check Whether the Data in Two Oracle Tables Is Consistent

by Maclean.liu, www.oracledatabase12g.com


Post on 11-May-2015


About Me

• Email & Gtalk: [email protected]

• Blog: www.oracledatabase12g.com

• QQ: 47079569, QQ Group: 23549328

• Oracle Certified Database Administrator Master 10g and 11g

• Over 6 years of experience with Oracle DBA technology

• Over 7 years of experience with Linux technology

• Member, Independent Oracle Users Group

• Member, All China Oracle Users Group

• Presents on advanced Oracle topics: RAC, Data Guard, Performance Tuning, and Oracle internals.


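About two weeks ago, during a night shift, a developer needed to check whether 2 tables replicated by SharePlex (a data replication product) held the same data on the source and the target. Thinking about it afterwards, SharePlex itself should have a data verification feature, but I still wanted to work out, from the database side, several usable ways to compare data between tables of the same structure.

Note the preconditions for the comparison methods below:

1. The tables to be compared have the same structure.

2. During the comparison, the data in the source and target tables is static, with no DML changes.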
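The first precondition can be checked before any data comparison. The following is a minimal sketch, assuming the tables are TEST1 locally and TEST2 behind a database link named maclean (the names used in the demonstrations in this article) and that each table is owned by the user connected on its side. It lists column definitions that exist locally without an identical definition on the remote side.

```sql
-- Columns of the local TEST1 that have no identical definition in the remote TEST2.
-- Including column_id assumes the columns appear in the same order in both tables.
select column_id, column_name, data_type, data_length, nullable
  from user_tab_columns
 where table_name = 'TEST1'
minus
select column_id, column_name, data_type, data_length, nullable
  from user_tab_columns@maclean
 where table_name = 'TEST2';
```

Swapping the two halves of the query checks the other direction. No rows in either direction suggests that the two structures match at the level of these attributes.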

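Method 1:

If at least one of the databases involved is version 11g, and the table has a primary key index or a unique index on NOT NULL columns, then congratulations! You can use DBMS_COMPARISON, a PL/SQL package introduced in 11g specifically for data comparison, to verify the data, as the following demonstration shows:

The source is version 11gR2: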

conn maclean/maclean
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.oracledatabase12g.com & www.askmaclean.com

drop table test1;
create table test1 tablespace users as
  select object_id t1, object_name t2 from dba_objects where object_id is not null;
alter table test1 add primary key(t1);
exec dbms_stats.gather_table_stats('MACLEAN','TEST1',cascade=>TRUE);

create database link maclean connect to maclean identified by maclean using 'G10R21';

Database link created.

The source database above is version 11.2.0.3 and the source table is test1(t1 number primary key, t2 varchar2(128)). It connects through a dblink to the target, which runs version 10.2.0.1.

conn maclean/maclean

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

create table test2 tablespace users as
  select object_id t1, object_name t2 from dba_objects where object_id is not null;
alter table test2 add primary key(t1);
exec dbms_stats.gather_table_stats('MACLEAN','TEST2',cascade=>TRUE);

The target is version 10.2.0.1, with table test2(t1 number primary key, t2 varchar2(128)).

Note that both tables must have the same primary key index or pseudo primary key index (a pseudo primary key is a unique key whose member columns are all NOT NULL).

Create the comparison object and run the check:

begin
  dbms_comparison.create_comparison(comparison_name    => 'MACLEAN_TEST_COM',
                                    schema_name        => 'MACLEAN',
                                    object_name        => 'TEST1',
                                    dblink_name        => 'MACLEAN',
                                    remote_schema_name => 'MACLEAN',
                                    remote_object_name => 'TEST2',
                                    scan_mode          => dbms_comparison.CMP_SCAN_MODE_FULL);
end;
/

PL/SQL procedure successfully completed.

SQL> set linesize 80 pagesize 1400

SQL> select * from user_comparison where comparison_name='MACLEAN_TEST_COM';

COMPARISON_NAME                COMPA SCHEMA_NAME
------------------------------ ----- ------------------------------
OBJECT_NAME                    OBJECT_TYPE       REMOTE_SCHEMA_NAME
------------------------------ ----------------- ------------------------------
REMOTE_OBJECT_NAME             REMOTE_OBJECT_TYP
------------------------------ -----------------
DBLINK_NAME
--------------------------------------------------------------------------------
SCAN_MODE SCAN_PERCENT
--------- ------------
CYCLIC_INDEX_VALUE
--------------------------------------------------------------------------------
NULL_VALUE
--------------------------------------------------------------------------------
LOCAL_CONVERGE_TAG
--------------------------------------------------------------------------------
REMOTE_CONVERGE_TAG
--------------------------------------------------------------------------------
MAX_NUM_BUCKETS MIN_ROWS_IN_BUCKET
--------------- ------------------
LAST_UPDATE_TIME
---------------------------------------------------------------------------
MACLEAN_TEST_COM               TABLE MACLEAN
TEST1                          TABLE             MACLEAN
TEST2                          TABLE
MACLEAN
FULL

ORA$STREAMS$NV


           1000              10000
20-DEC-11 01.08.44.562092 PM

After dbms_comparison.create_comparison creates the comparison, the new comparison appears in the user_comparison view.

That completes the creation of the comparison, but no actual check has run yet. We use event 10046 to trace the comparison process:

conn maclean/maclean
set timing on;
alter system flush shared_pool;

alter session set events '10046 trace name context forever,level 8';

set serveroutput on

DECLARE
  retval dbms_comparison.comparison_type;
BEGIN
  IF dbms_comparison.compare('MACLEAN_TEST_COM', retval, perform_row_dif => TRUE) THEN
    dbms_output.put_line('No Differences');
  ELSE
    dbms_output.put_line('Differences Found');
  END IF;
END;
/

Differences Found     =====> the result is Differences Found, meaning the data in the two tables differs

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.87
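The compare function also hands back scan information through its second argument. As a hedged sketch, the block below prints the scan_id field of that record and then looks up the scans of this comparison in user_comparison_scan. The column names are the ones I understand that view to expose in 11gR2, so verify them against your release before relying on them.

```sql
set serveroutput on

declare
  retval     dbms_comparison.comparison_type;
  consistent boolean;
begin
  consistent := dbms_comparison.compare(
                  comparison_name => 'MACLEAN_TEST_COM',
                  scan_info       => retval,
                  perform_row_dif => TRUE);
  -- scan_id identifies the scan started by this call
  dbms_output.put_line('scan_id = ' || retval.scan_id);
end;
/

-- Status and difference count of each scan recorded for this comparison
-- (column names assumed from the 11gR2 data dictionary)
select scan_id, parent_scan_id, status, current_dif_count
  from user_comparison_scan
 where comparison_name = 'MACLEAN_TEST_COM'
 order by scan_id;
```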

=========================== 10046 tkprof result =========================

SELECT MIN("T1"), MAX("T1")
FROM "MACLEAN"."TEST1"

SELECT MIN("T1"), MAX("T1")
FROM "MACLEAN"."TEST2"@MACLEAN

SELECT COUNT(1)
FROM "MACLEAN"."TEST1" s
WHERE ("T1" >= :scan_min AND "T1" <= :scan_max )

SELECT COUNT(1)
FROM "MACLEAN"."TEST2"@MACLEAN s
WHERE ("T1" >= :scan_min AND "T1" <= :scan_max )

SELECT q.wb1, min(q."T1") min_range1, max(q."T1") max_range1,
       count(*) num_rows, sum(q.s_hash) sum_range_hash
FROM (SELECT /*+ FULL(s) */
             width_bucket(s."T1", :scan_min1, :scan_max_inc1, :num_buckets) wb1,
             s."T1",
             ora_hash(NVL(to_char(s."T1"), 'ORA$STREAMS$NV'), 4294967295,
               ora_hash(NVL((s."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) s_hash
      FROM "MACLEAN"."TEST1" s
      WHERE (s."T1">=:scan_min1 AND s."T1"<= :scan_max1) ) q
GROUP BY q.wb1
ORDER BY q.wb1

SELECT /*+ REMOTE_MAPPED */
       q.wb1, min(q."T1") min_range1, max(q."T1") max_range1,
       count(*) num_rows, sum(q.s_hash) sum_range_hash
FROM (SELECT /*+ FULL(s) REMOTE_MAPPED */
             width_bucket(s."T1", :scan_min1, :scan_max_inc1, :num_buckets) wb1,
             s."T1",
             ora_hash(NVL(to_char(s."T1"), 'ORA$STREAMS$NV'), 4294967295,
               ora_hash(NVL((s."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) s_hash
      FROM "MACLEAN"."TEST2"@MACLEAN s
      WHERE (s."T1">= :scan_min1 AND s."T1"<=:scan_max1) ) q
GROUP BY q.wb1
ORDER BY q.wb1

SELECT /*+ FULL(P) +*/ * FROM "MACLEAN"."TEST2" P

SELECT /*+ FULL ("A1") */
       WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS),
       MIN("A1"."T1"), MAX("A1"."T1"), COUNT(*),
       SUM(ORA_HASH(NVL(TO_CHAR("A1"."T1"), 'ORA$STREAMS$NV'), 4294967295,
             ORA_HASH(NVL("A1"."T2", 'ORA$STREAMS$NV'), 4294967295, 0)))
FROM "MACLEAN"."TEST2" "A1"
WHERE "A1"."T1" >= :SCAN_MIN1 AND "A1"."T1" <= :SCAN_MAX1
GROUP BY WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS)
ORDER BY WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS)

SELECT ROWID, "T1", "T2"
FROM "MACLEAN"."TEST2" "R"
WHERE "T1" >= :1 AND "T1" <= :2

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |   126 |  3528 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                      |             |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| TEST2       |   126 |  3528 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C006255 |   227 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER(:1)<=TO_NUMBER(:2))
   3 - access("T1">=TO_NUMBER(:1) AND "T1"<=TO_NUMBER(:2))

SELECT ll.l_rowid, rr.r_rowid, NVL(ll."T1", rr."T1") idx_val
FROM (SELECT l.rowid l_rowid, l."T1",
             ora_hash(NVL(to_char(l."T1"), 'ORA$STREAMS$NV'), 4294967295,
               ora_hash(NVL((l."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) l_hash
      FROM "MACLEAN"."TEST1" l
      WHERE l."T1">=:scan_min1 AND l."T1"<=:scan_max1 ) ll
FULL OUTER JOIN
     (SELECT /*+ NO_MERGE REMOTE_MAPPED */
             r.rowid r_rowid, r."T1",
             ora_hash(NVL(to_char(r."T1"), 'ORA$STREAMS$NV'), 4294967295,
               ora_hash(NVL((r."T2"), 'ORA$STREAMS$NV'), 4294967295, 0)) r_hash
      FROM "MACLEAN"."TEST2"@MACLEAN r
      WHERE r."T1">= :scan_min1 AND r."T1"<=:scan_max1 ) rr
ON ll."T1"=rr."T1"
WHERE ll.l_hash IS NULL OR rr.r_hash IS NULL OR ll.l_hash <> rr.r_hash

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |              |   190 |   754K|     9  (12)| 00:00:01 |        |      |
|*  1 |  VIEW                           | VW_FOJ_0     |   190 |   754K|     9  (12)| 00:00:01 |        |      |
|*  2 |   HASH JOIN FULL OUTER          |              |   190 |   754K|     9  (12)| 00:00:01 |        |      |
|   3 |    VIEW                         |              |   190 |  7220 |     4   (0)| 00:00:01 |        |      |
|*  4 |     FILTER                      |              |       |       |            |          |        |      |
|   5 |      TABLE ACCESS BY INDEX ROWID| TEST1        |   190 |  5510 |     4   (0)| 00:00:01 |        |      |
|*  6 |       INDEX RANGE SCAN          | SYS_C0013098 |   341 |       |     2   (0)| 00:00:01 |        |      |
|   7 |    VIEW                         |              |   126 |   495K|     4   (0)| 00:00:01 |        |      |
|   8 |     REMOTE                      | TEST2        |   126 |  3528 |     4   (0)| 00:00:01 | MACLE~ | R->S |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("LL"."L_HASH" IS NULL OR "RR"."R_HASH" IS NULL OR "LL"."L_HASH"<>"RR"."R_HASH")
   2 - access("LL"."T1"="RR"."T1")
   4 - filter(TO_NUMBER(:SCAN_MIN1)<=TO_NUMBER(:SCAN_MAX1))
   6 - access("L"."T1">=TO_NUMBER(:SCAN_MIN1) AND "L"."T1"<=TO_NUMBER(:SCAN_MAX1))

Remote SQL Information (identified by operation id):
----------------------------------------------------

   8 - SELECT ROWID,"T1","T2" FROM "MACLEAN"."TEST2" "R" WHERE "T1">=:1 AND "T1"<=:2
       (accessing 'MACLEAN' )

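As the trace shows, the process does not avoid full table scans of TEST1 and TEST2. Fortunately, the rows that actually take part in the HASH JOIN FULL OUTER are only the small amount of data fetched through the index, so efficiency is still quite good.

In addition, user_comparison_row_dif shows which rows actually differ, for example: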

SQL> set linesize 80 pagesize 1400
SQL> select *
  2    from user_comparison_row_dif
  3   where comparison_name = 'MACLEAN_TEST_COM'
  4     and rownum < 2;

COMPARISON_NAME                   SCAN_ID LOCAL_ROWID        REMOTE_ROWID
------------------------------ ---------- ------------------ ------------------
INDEX_VALUE
--------------------------------------------------------------------------------
STA LAST_UPDATE_TIME
--- ---------------------------------------------------------------------------
MACLEAN_TEST_COM                       42 AAATWGAAEAAANBrAAB AAANJrAAEAAB8AMAAd
46
DIF 20-DEC-11 01.18.08.917257 PM
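Each row in user_comparison_row_dif carries both a local and a remote rowid, so the kind of difference can be read off from which of the two is missing. A minimal sketch, assuming the MACLEAN_TEST_COM comparison from the demonstration and the usual reading of a NULL rowid in this view (the row is absent on that side):

```sql
-- Classify each recorded difference by which side holds the row
select index_value,
       case
         when local_rowid  is null then 'only in remote table'
         when remote_rowid is null then 'only in local table'
         else 'in both tables, content differs'
       end as dif_type
  from user_comparison_row_dif
 where comparison_name = 'MACLEAN_TEST_COM'
   and status = 'DIF'
 order by index_value;
```

INDEX_VALUE is stored as a character string, so the ordering here is textual and not numeric.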

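The above used the dbms_comparison package to carry out a simple data comparison. The method applies to 11g and later, requires a primary key index or a unique index on NOT NULL columns on the table, and does not support comparing columns of the following data types: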

• LONG

• LONG RAW

• ROWID

• UROWID

• CLOB

• NCLOB

• BLOB

• BFILE

• User-defined types (including object types, REFs, varrays, and nested tables)

• Oracle-supplied types (including any types, XML types, spatial types, and media types)

To compare tables that contain columns of these types, specify the column_list parameter in create_comparison so that those columns are excluded.
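As a sketch of that exclusion, suppose a hypothetical table DOCS(id number primary key, title varchar2(100), body clob) exists on both sides. The table, its columns, and the comparison name are illustrative assumptions and are not part of the demonstration in this article. The column list must still contain the index column.

```sql
begin
  dbms_comparison.create_comparison(
    comparison_name    => 'DOCS_NO_LOB_COM',  -- hypothetical comparison name
    schema_name        => 'MACLEAN',
    object_name        => 'DOCS',             -- hypothetical table with a CLOB column
    dblink_name        => 'MACLEAN',
    remote_schema_name => 'MACLEAN',
    remote_object_name => 'DOCS',
    column_list        => 'ID,TITLE');        -- BODY (CLOB) is deliberately left out
end;
/
```

Differences that exist only in the excluded BODY column go undetected by such a comparison.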

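The advantage of Method 1, dbms_comparison, is that it provides detailed comparison information and is efficient when a suitable index exists. Its drawbacks are the database version requirement (at least 11gR1) and the lack of support for LONG, CLOB, and similar columns.

Method 2:

Compare the data with a MINUS query

This is arguably the simplest method to carry out, for example: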

select * from test1 minus select * from test2@maclean;

-----------------------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       | 75816 |  3527K|       |  1163  (40)| 00:00:14 |        |      |
|   1 |  MINUS              |       |       |       |       |            |          |        |      |
|   2 |   SORT UNIQUE       |       | 75816 |  2147K|  2984K|   710   (1)| 00:00:09 |        |      |
|   3 |    TABLE ACCESS FULL| TEST1 | 75816 |  2147K|       |   104   (1)| 00:00:02 |        |      |
|   4 |   SORT UNIQUE       |       | 50467 |  1379K|  1800K|   453   (1)| 00:00:06 |        |      |
|   5 |    REMOTE           | TEST2 | 50467 |  1379K|       |    56   (0)| 00:00:01 | MACLE~ | R->S |
-----------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

   5 - SELECT "T1","T2" FROM "TEST2" "TEST2" (accessing 'MACLEAN' )

Select *
  from (select 'MACLEAN.TEST1' "Row Source", a.*
          from (select /*+ FULL(Tbl1) */ T1, T2
                  from MACLEAN.TEST1 Tbl1
                minus
                select /*+ FULL(Tbl2) */ T1, T2
                  from MACLEAN.TEST2@"MACLEAN" Tbl2) A
        union all
        select 'MACLEAN.TEST2@"MACLEAN"', b.*
          from (select /*+ FULL(Tbl2) */ T1, T2
                  from MACLEAN.TEST2@"MACLEAN" Tbl2
                minus
                select /*+ FULL(Tbl1) */ T1, T2
                  from MACLEAN.TEST1 Tbl1) B)
 Order by 1;

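The MINUS clause causes both tables to be fully scanned (full table scan) on the local side and requires a SORT. If the tables being compared hold a large amount of data, the sort becomes very expensive, so this method is not efficient.

The advantage of Method 2, MINUS, is that it is simple to run and particularly well suited to verifying data between small tables. Its drawback is that the SORT can make it very slow on large data volumes, and it likewise does not support large objects such as LOB and LONG.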
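One thing MINUS does not show is a difference in the number of duplicate rows, because it operates on distinct rows. That cannot happen when both tables have a primary key, as TEST1 and TEST2 do, but for tables without one a GROUP BY over a UNION ALL is a commonly used alternative. This is a different technique from the MINUS query, sketched here under the assumption that the same TEST1 and TEST2@maclean tables are being compared:

```sql
-- cnt1 / cnt2: how many times each (t1, t2) combination occurs on each side.
-- Only combinations whose counts differ are returned.
select t1, t2,
       count(src1) as cnt1,
       count(src2) as cnt2
  from (select t1, t2, 1 as src1, to_number(null) as src2
          from test1
        union all
        select t1, t2, to_number(null), 1
          from test2@maclean)
 group by t1, t2
having count(src1) <> count(src2);
```

This still reads both tables in full and aggregates all of their rows, so it is not a cure for the cost problem on large tables.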

Method 3:

Use a NOT EXISTS clause, for example:

select * from test1 a where not exists (select 1 from test2 b where a.t1 = b.t1 and a.t2 = b.t2);

no rows selected

Elapsed: 00:00:00.06

------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 75816 |  7996K|       |   691   (1)| 00:00:09 |
|*  1 |  HASH JOIN ANTI    |       | 75816 |  7996K|  3040K|   691   (1)| 00:00:09 |
|   2 |   TABLE ACCESS FULL| TEST1 | 75816 |  2147K|       |   104   (1)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| TEST2 | 77512 |  5979K|       |   104   (1)| 00:00:02 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."T1"="B"."T1" AND "A"."T2"="B"."T2")

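In principle, with large data volumes the HASH JOIN ANTI used by NOT EXISTS should outperform the MINUS operation. But when the tables to be compared live in 2 different databases (a distributed query), HASH JOIN ANTI cannot be used, and the very inefficient FILTER operation is used instead: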

select * from test1 a where not exists (select 1 from test2@maclean b where a.t1 = b.t1 and a.t2 = b.t2);

no rows selected

Elapsed: 00:01:05.76

--------------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 75816 |  2147K|   147K  (1)| 00:29:31 |        |      |
|*  1 |  FILTER            |       |       |       |            |          |        |      |
|   2 |   TABLE ACCESS FULL| TEST1 | 75816 |  2147K|   104   (1)| 00:00:02 |        |      |
|   3 |   REMOTE           | TEST2 |     1 |    29 |     2   (0)| 00:00:01 | MACLE~ | R->S |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM "B" WHERE "B"."T1"=:B1 AND "B"."T2"=:B2))

Remote SQL Information (identified by operation id):
----------------------------------------------------

   3 - SELECT "T1","T2" FROM "TEST2" "B" WHERE "T1"=:1 AND "T2"=:2 (accessing 'MACLEAN' )

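The execution plan above shows that the FILTER operation is very expensive.

The advantage of Method 3 is that it is simple to run, and when the tables to be compared are in the same database it is more efficient than the MINUS approach. For a distributed query, however, efficiency drops sharply because of the FILTER operation. NOT EXISTS likewise does not support large objects such as CLOB.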
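One way around the FILTER plan, offered as a sketch and not something measured in this article, is to copy the remote table into a local staging table first so that the anti-join runs inside a single database. TEST2_LOCAL is a hypothetical name, and the copy itself costs time and space. Note also that the predicate a.t2 = b.t2 never matches when t2 is NULL on both sides, so such rows are reported as differences.

```sql
-- Pull the remote rows across the link once
-- (TEST2_LOCAL is a hypothetical staging table)
create table test2_local as
  select t1, t2 from test2@maclean;

-- Both tables are now local, so the optimizer can consider a hash anti-join
select *
  from test1 a
 where not exists (select 1
                     from test2_local b
                    where a.t1 = b.t1
                      and a.t2 = b.t2);

-- Remove the staging copy when finished
drop table test2_local purge;
```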

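Method 4:

GUI tools such as Toad and PL/SQL Developer provide a compare table data feature. Here we take Toad as the example and show how to verify data with it:

Open Toad and connect to the database -> click Database -> Compare -> Data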

Enter the source table and target table information in the Source 1 and Source 2 dialogs.

Because Toad uses the MINUS operation under the hood, raising SORT_AREA_SIZE helps the performance of the compare. With automatic PGA management it does not need to be set.

First compare the row counts of the 2 tables: click Execute to compute the counts.

Use MINUS to find the rows that exist in one table but not in the other.

Toad's compare data feature is implemented with MINUS, so it has no efficiency advantage, but the graphical interface saves the trouble of writing SQL statements. This method likewise does not support LOB, LONG, and similar columns.

Method 5:

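This is an ingenious approach. Concatenate all the columns of a row, use dbms_utility.get_hash_value to hash the concatenated value, then sum the hash values obtained from all rows. If the hash sums of the 2 tables are equal, the data in the 2 tables is judged to be consistent.

Put simply, it looks like this: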

create table hash_one as select object_id t1, object_name t2 from dba_objects;

select dbms_utility.get_hash_value(t1||t2,0,power(2,30)) from hash_one where rownum < 3;

DBMS_UTILITY.GET_HASH_VALUE(T1||T2,0,POWER(2,30))
-------------------------------------------------
                                         89209477
                                        757190129

select sum(dbms_utility.get_hash_value(t1||t2,0,power(2,30))) from hash_one;

SUM(DBMS_UTILITY.GET_HASH_VALU
------------------------------
                40683165992756

select sum(dbms_utility.get_hash_value(object_id||object_name,0,power(2,30))) from dba_objects;

SUM(DBMS_UTILITY.GET_HASH_VALU
------------------------------
                40683165992756
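Plain concatenation is ambiguous: the pairs (1, '23') and (12, '3') both concatenate to '123' and therefore hash identically. A separator between columns, which the generated statements later in this article also use, reduces that risk. A minimal sketch, assuming TEST1 and TEST2@maclean from the earlier methods and assuming that dbms_utility.get_hash_value returns the same value for the same input on both database versions (worth confirming on a table known to be identical first):

```sql
-- Row count and hash sum for each side; the two rows should agree when the data is identical
select 'TEST1' as src,
       count(*) as row_cnt,
       sum(dbms_utility.get_hash_value(t1 || '|' || t2, 0, power(2, 30))) as hash_sum
  from test1
union all
select 'TEST2@MACLEAN',
       count(*),
       sum(dbms_utility.get_hash_value(t1 || '|' || t2, 0, power(2, 30)))
  from test2@maclean;
```

Including count(*) alongside the hash sum gives a cheap second signal when the two sides disagree.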

For tables with many columns, building the concatenation of all columns by hand can be tedious. The following SQL quickly generates the statement we need:

Run it in a tool such as PL/SQL Developer. In sqlplus it may fail with ORA-00923: FROM keyword not found where expected.

select 'select sum(dbms_utility.get_hash_value(' || column_name_path ||
       ',0,power(2,30)) ) from ' || owner || '.' || table_name || ';'
  from (select owner,
               table_name,
               column_name_path,
               row_number() over(partition by table_name order by table_name, curr_level desc) column_name_path_rank
          from (select owner,
                       table_name,
                       column_name,
                       rank,
                       level as curr_level,
                       ltrim(sys_connect_by_path(column_name, '||''|''||'), '||''|''||') column_name_path
                  from (select owner,
                               table_name,
                               column_name,
                               row_number() over(partition by table_name order by table_name, column_name) rank
                          from dba_tab_columns
                         where owner = UPPER('&OWNER')
                           and table_name = UPPER('&TABNAME')
                         order by table_name, column_name)
                connect by table_name = prior table_name
                       and rank - 1 = prior rank))
 where column_name_path_rank = 1;

Usage example:

SQL> @get_hash_col
Enter value for owner: SYS
Enter value for tabname: TAB$

'SELECTSUM(DBMS_UTILITY.GET_HASH_VALUE('||COLUMN_NAME_PATH||',0,POWER(2,30)))FROM
--------------------------------------------------------------------------------
select sum(dbms_utility.get_hash_value(ANALYZETIME||'|'||AUDIT$||'|'||AVGRLN||'|'||AVGSPC||'|'||AVGSPC_FLB||'|'||BLKCNT||'|'||BLOCK#||'|'||BOBJ#||'|'||CHNCNT||'|'||CLUCOLS||'|'||COLS||'|'||DATAOBJ#||'|'||DEGREE||'|'||EMPCNT||'|'||FILE#||'|'||FLAGS||'|'||FLBCNT||'|'||INITRANS||'|'||INSTANCES||'|'||INTCOLS||'|'||KERNELCOLS||'|'||MAXTRANS||'|'||OBJ#||'|'||PCTFREE$||'|'||PCTUSED$||'|'||PROPERTY||'|'||ROWCNT||'|'||SAMPLESIZE||'|'||SPARE1||'|'||SPARE2||'|'||SPARE3||'|'||SPARE4||'|'||SPARE5||'|'||SPARE6||'|'||TAB#||'|'||TRIGFLAG||'|'||TS#,0,1073741824) ) from SYS.TAB$;

Use the generated SQL to compute the table's sum(hash) value:

select sum(dbms_utility.get_hash_value(ANALYZETIME || '|' || AUDIT$ || '|' || AVGRLN || '|' ||
                                       AVGSPC || '|' || AVGSPC_FLB || '|' || BLKCNT || '|' ||
                                       BLOCK# || '|' || BOBJ# || '|' || CHNCNT || '|' ||
                                       CLUCOLS || '|' || COLS || '|' || DATAOBJ# || '|' ||
                                       DEGREE || '|' || EMPCNT || '|' || FILE# || '|' ||
                                       FLAGS || '|' || FLBCNT || '|' || INITRANS || '|' ||
                                       INSTANCES || '|' || INTCOLS || '|' || KERNELCOLS || '|' ||
                                       MAXTRANS || '|' || OBJ# || '|' || PCTFREE$ || '|' ||
                                       PCTUSED$ || '|' || PROPERTY || '|' || ROWCNT || '|' ||
                                       SAMPLESIZE || '|' || SPARE1 || '|' || SPARE2 || '|' ||
                                       SPARE3 || '|' || SPARE4 || '|' || SPARE5 || '|' ||
                                       SPARE6 || '|' || TAB# || '|' || TRIGFLAG || '|' || TS#,
                                       0,
                                       1073741824))
  from SYS.TAB$;

SUM(DBMS_UTILITY.GET_HASH_VALU
------------------------------
                 1646389632463

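Method 5 judges whether the data in the tables is consistent by summing the hash of each whole row. It needs only a full table scan of each of the 2 tables, which makes it the most efficient of these methods, and it still gives fairly high accuracy.

However, the hash approach has the following shortcomings:

1. The concatenation of all columns in a row may exceed 4000 bytes, in which case ORA-01489 (result of string concatenation is too long) is raised. In other words, this approach requires that no row in the table is longer than 4000 bytes. In practice a single row rarely exceeds 4000 bytes, and the avg_row_len (average row length) statistic in dba_tables can help decide: if avg_row_len << 4000, overflow is generally not a problem.

2. The hash approach can only tell whether the data is consistent. It cannot provide further useful details, such as which rows differ.

3. The hash approach is likewise no help with LOB and LONG columns.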
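The first two limitations can be softened a little. The sketch below first reads avg_row_len for the demonstration table TEST1, then splits the hash sum into buckets by mod(t1, 16) so that a mismatch can at least be narrowed to a subset of rows. The bucket count of 16 and the choice of T1 as the bucketing column are illustrative assumptions. Run the second query against both tables and compare the results bucket by bucket.

```sql
-- 1. Average row length from optimizer statistics (only as fresh as last_analyzed)
select owner, table_name, avg_row_len, last_analyzed
  from dba_tables
 where owner = 'MACLEAN'
   and table_name = 'TEST1';

-- 2. Row count and hash sum per bucket instead of one total for the whole table
select mod(t1, 16) as bucket,
       count(*)    as row_cnt,
       sum(dbms_utility.get_hash_value(t1 || '|' || t2, 0, power(2, 30))) as hash_sum
  from test1
 group by mod(t1, 16)
 order by bucket;
```

A bucket whose count or hash sum differs between the two sides can then be examined with one of the row-level methods, restricted to that bucket.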

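© 2011, www.oracledatabase12g.com. All rights reserved. This article may be reproduced, but the source address must be credited with a link; otherwise legal action will be pursued.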