oracle data buffer cache

Post on 18-Dec-2014

Category:

Technology

DESCRIPTION

Oracle Database internal info

TRANSCRIPT

1

A brief look at how the data buffer cache is managed

2

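Contents

1. How does Oracle find the buffer it needs?

2. How does Oracle manage the blocks in the data buffer cache?

3. How does Oracle decide which blocks should be written to the data files, and how are they written?

4. Some wait events related to the data buffer cache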

3

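Introduction

• The largest memory area in the instance

• Sized by the db_cache_size parameter

• Allocation unit: granule

• Provides three types of cache: default, keep, and recycle

• Buffer caches with several block sizes (2, 4, 8, 16, or 32 KB) serve data blocks of the corresponding block size, configured with db_nk_cache_size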

4

Locating a buffer

• Where a buffer is stored: the hash bucket for a particular block header is determined based on the modulus of the Data Block Address (DBA) and the value of the _DB_BLOCK_HASH_BUCKETS parameter. For example, hash bucket = MOD(DBA, _DB_BLOCK_HASH_BUCKETS).

• A buffer descriptor structure is first built in the PGA and passed, together with the required lock mode, to the search function; the hash algorithm then finds the corresponding bucket.

Search function: kcbget(descriptor, lock_mode)
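The MOD rule quoted above can be sketched in a few lines of Python. This is only an illustration of the simplified formula on this slide; the bucket count used here is a hypothetical value, and the function name is not an Oracle API.

```python
def hash_bucket(dba: int, n_buckets: int) -> int:
    """Return the bucket index as MOD(DBA, _DB_BLOCK_HASH_BUCKETS)."""
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    return dba % n_buckets


# Hypothetical bucket count, chosen only for illustration.
N_BUCKETS = 1024

# DBA taken from the hash chain dump later in the deck (rdba 0x00402432).
print(hash_bucket(0x00402432, N_BUCKETS))
```

Two blocks whose DBAs differ by a multiple of the bucket count land in the same bucket, which is why each bucket carries a chain of buffer headers rather than a single one.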

5

Buffer cache diagram

[Figure: buffer cache layout, labeled with _db_block_hash_buckets and _db_block_hash_latches]

6

BH structure

Views: X$BH, V$BH

BH (0x0x5cfce8c4) file#: 45 rdba: 0x0b417f8f (45/98191) class 1 ba: 0x0x5c73e000

set: 5 dbwrid: 0 obj: 198268 objn: 198268

hash: [6893c08c,6893c08c] lru: [61ff9348,57fee018]

LRU flags: hot_buffer

ckptq: [NULL] fileq: [NULL]

st: XCURRENT md: NULL rsop: 0x(nil) tch: 2

LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [255] RRBA: [0x0.0.0]
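The rdba field in the dump above is shown both as a hex value and as a (file/block) pair. A minimal sketch of that decoding, assuming the usual smallfile layout of a 10-bit relative file number followed by a 22-bit block number; the function name is made up for illustration.

```python
from typing import Tuple


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a 32-bit relative DBA into (relative file#, block#)."""
    file_no = rdba >> 22          # upper 10 bits
    block_no = rdba & 0x3FFFFF    # lower 22 bits
    return file_no, block_no


print(decode_rdba(0x0B417F8F))  # (45, 98191), as printed in the BH dump
```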

7

Hash chain+BH

CHAIN: 4889 LOC: 0x0x5a6e4100 HEAD: [51fdf58c,56fcc874]

BH (0x0x51fdf58c) file#: 1 rdba: 0x00402432 (1/9266) class 1 ba: 0x0x51a1a000

set: 12 dbwrid: 0 obj: 3807 objn: 3807

hash: [56fcc874,5a6e4100] lru: [51fdf8c4,51fdeff4]

LRU flags:

ckptq: [NULL] fileq: [NULL]

st: XCURRENT md: NULL rsop: 0x(nil) tch: 7

LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [255] RRBA: [0x0.0.0]

buffer tsn: 0 rdba: 0x00402432 (1/9266)

scn: 0x0000.000071f9 seq: 0x01 flg: 0x04 tail: 0x71f90601

frmt: 0x02 chkval: 0x42ef type: 0x06=trans data

8

Searching the hash chain

For each buffer in the chain:

- Ignore buffers that do not match the RDBA

- Wait for READING buffers, then return them

- Skip CR (consistent read) buffers

- If the CUR (current) buffer is held in a compatible mode, then use it

- Otherwise, if all other users are CR state objects

– Make it a CR copy and create a new EXLCUR copy of the buffer

– Or wait for the current buffer to be released

- If no usable buffers exist in the cache, read from disk

The cache buffers chains latch must be held during the search.
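The search rules above can be sketched as a small decision function. This is a simplified model under stated assumptions: the Buffer class, the two lock modes, and the returned action names are all hypothetical, latching is not modelled, and the "clone or wait" branch is collapsed into one outcome.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Buffer:
    rdba: int
    state: str                       # "CR", "CUR" or "READING"
    held_mode: Optional[str] = None  # None, "SHARE" or "EXCL" (hypothetical modes)


def compatible(held: Optional[str], wanted: str) -> bool:
    """An unheld buffer suits any request; two SHARE holders can coexist."""
    return held is None or (held == "SHARE" and wanted == "SHARE")


def search_chain(chain: List[Buffer], rdba: int, wanted: str) -> Tuple[str, Optional[Buffer]]:
    """Walk one hash chain and return (action, buffer)."""
    for buf in chain:
        if buf.rdba != rdba:
            continue                         # different block, ignore
        if buf.state == "READING":
            return "wait_for_read", buf      # another session is reading it in
        if buf.state == "CR":
            continue                         # skip consistent-read copies
        if compatible(buf.held_mode, wanted):
            return "use", buf                # current buffer is usable as is
        return "wait_or_clone", buf          # make a CR copy, or wait for release
    return "read_from_disk", None            # nothing usable in the cache


chain = [Buffer(1, "CUR"), Buffer(2, "CR"), Buffer(2, "CUR", "SHARE")]
print(search_chain(chain, 2, "SHARE")[0])
print(search_chain(chain, 3, "SHARE")[0])
```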

9

Working sets

The default value of _db_block_lru_latches is the number of DBWR processes × 8 (8 being the maximum number of buffer pools allowed).

10

Working sets

• Each list that is shown above will have sublists called the auxiliary write list (AUX) and a MAIN list. For example, the LRU-P list will have a LRUP-AUX and a LRUP-MAIN list.

• LRU-XR, LRU-XO and LRU-P are also called write lists. Buffers are linked to them because of a specific write action.

• These lists are candidates for immediate write-outs by the DBWR.

• They enable write prioritization.

• A BH can be on only one of the LRU or the LRUW, but it can be on several write lists.

11

Working sets

Dump of buffer cache at level 10

(WS) size: 501 wsid: 1 state: 0

(WS_REPL_LIST) main_prev: 5a6ff9bc main_next: 5a6ff9bc aux_prev: 58fffd08 aux_next: 58fa4048 curnum: 501 auxnum: 501

cold: 5a6ff9bc hbmax: 0 hbufs: 0

(WS_WRITE_LIST) main_prev: 5a6ff9d8 main_next: 5a6ff9d8 aux_prev: 5a6ff9e0 aux_next: 5a6ff9e0 curnum: 0 auxnum: 0

(WS_XOBJ_LIST) main_prev: 5a6ff9f4 main_next: 5a6ff9f4 aux_prev: 5a6ff9fc aux_next: 5a6ff9fc curnum: 0 auxnum: 0

(WS_XRNG_LIST) main_prev: 5a6ffa10 main_next: 5a6ffa10 aux_prev: 5a6ffa18 aux_next: 5a6ffa18 curnum: 0 auxnum: 0

(WS) fbwanted: 0

(WS) bgotten: 0 sumwrt: 0 sumscan: 0

(WS) numscan: 0 hotscan: 0 dmoves: 0

MAIN RPL_LST Queue header (NEXT_DIRECTION)[NULL]

MAIN RPL_LST Queue header (PREV_DIRECTION)[NULL]

AUXILIARY RPL_LST Queue header (NEXT_DIRECTION)[58fa4048,58fffd08]

0x58fa4000=>0x58fa42f0=>0x58fa45e0=>0x58fa48d0=>0x58fa4bc0=>0x58fa4eb0=>0x58fa51a0=>0x58fa5490

0x58fa5780=>0x58fa5a70=>0x58fa5d60=>0x58fa6050=>0x58fa6340=>0x58fa6630=>0x58fa6920=>0x58fa6c10

0x58fa6f00=>0x58fa71f0=>0x58fa74e0=>0x58fa77d0=>0x58fa7ac0=>0x58fa7db0=>0x58fa80a0=>0x58fa8390

12

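How a reusable buffer is chosen

• 8i: scan from the tail of the list and sacrifice the contents pointed to by the buffer headers at the cold end.

• 9i: when a block needed by a query has to be read from disk and linked onto the LRU chain:

1. Scan from the tail of the list, first the auxiliary list, then the main list.

2. Apply the LRU algorithm together with the touch count.

3. Hot blocks move to the main list and are inserted at its midpoint. Once the auxiliary list is empty, the same algorithm is run on the LRU to find blocks that can be sacrificed, and those are moved to the auxiliary list.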

13

LRU algorithm

IF (touch count of scanned buffer > _db_aging_hot_criteria) THEN
    Give buffer another chance (do not select as a victim)
    IF (_db_aging_stay_count >= _db_aging_hot_criteria) THEN
        Halve the buffer's touch count
    ELSE
        Set the buffer's touch count to _db_aging_stay_count
    END IF
ELSE
    Select buffer as a victim
END IF
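The aging rule above translates directly into a small function. The sketch below is a plain Python rendering of that pseudocode; the default values given for the two hidden parameters are assumptions for illustration and should be checked against the instance in question.

```python
from typing import Tuple


def age_buffer(touch_count: int,
               hot_criteria: int = 2,   # assumed value of _db_aging_hot_criteria
               stay_count: int = 0      # assumed value of _db_aging_stay_count
               ) -> Tuple[bool, int]:
    """Return (is_victim, new_touch_count) for one scanned buffer."""
    if touch_count > hot_criteria:
        # Hot buffer: spare it, but cool it down so it must earn its place again.
        if stay_count >= hot_criteria:
            return False, touch_count // 2
        return False, stay_count
    # Cold buffer: select it as a victim, touch count unchanged.
    return True, touch_count


print(age_buffer(7))                               # hot, reset to stay_count
print(age_buffer(2))                               # not above the criteria: victim
print(age_buffer(8, hot_criteria=2, stay_count=4)) # hot, touch count halved
```

Note that a buffer whose touch count merely equals the criteria is still treated as cold, because the comparison is strict.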

14

LRUW

• LRUW list:

The LRU-W (write) list is used to hold buffers that aged out of the LRU but need to be written to disk before they can be reused.

15

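How a block is modified

• Updating a block: suppose we want to modify the contents of the block that BH2 points to.

1) Oracle unlinks BH2 from the auxiliary LRU list and inserts it into the middle of the main LRU list, that is, between BH1 and BH4, and increments BH2's touch count.

2) BH2 is marked as pinned (under latch protection).

3) The contents of the in-memory data block that BH2 points to are updated.

4) Once the update is done, the pin is released.

5) BH2 is moved from the main LRU list to the main LRUW list.

6) If another process now issues an update to the in-memory block that BH2 points to, BH2 is pinned, updated, and unpinned again.

7) When DBWR starts, it moves BH2 to the auxiliary LRUW list while scanning the main LRUW list.

8) DBWR writes the data block for BH2 on the auxiliary LRUW list to the data file.

9) After the write to the data file is confirmed, BH2 is moved from the auxiliary LRUW list to the auxiliary LRU list.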
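The list transitions in steps 1 to 9 can be followed with a toy model. This is a sketch under simplifying assumptions: the class and list names are invented, "insert in the middle" is reduced to a plain append, and latches and real I/O are not modelled.

```python
class WorkingSet:
    """Toy model of one working set with main/auxiliary LRU and LRUW lists."""

    def __init__(self):
        self.lists = {"LRU_AUX": [], "LRU_MAIN": [], "LRUW_MAIN": [], "LRUW_AUX": []}
        self.touch = {}
        self.pinned = set()

    def where(self, bh):
        for name, members in self.lists.items():
            if bh in members:
                return name
        return None

    def _move(self, bh, dst):
        src = self.where(bh)
        if src is not None:
            self.lists[src].remove(bh)
        self.lists[dst].append(bh)

    def update(self, bh):
        if self.where(bh) == "LRU_AUX":
            self._move(bh, "LRU_MAIN")                    # step 1: relink
            self.touch[bh] = self.touch.get(bh, 0) + 1    # step 1: touch count
        self.pinned.add(bh)                               # step 2: pin
        # step 3: the block contents would be changed here
        self.pinned.discard(bh)                           # step 4: unpin
        self._move(bh, "LRUW_MAIN")                       # step 5 (no-op move in step 6)

    def dbwr_scan(self):
        for bh in list(self.lists["LRUW_MAIN"]):
            self._move(bh, "LRUW_AUX")                    # step 7

    def dbwr_write(self):
        written = []
        for bh in list(self.lists["LRUW_AUX"]):
            written.append(bh)                            # step 8: write to data file
            self._move(bh, "LRU_AUX")                     # step 9: reusable again
        return written


ws = WorkingSet()
ws.lists["LRU_AUX"].append("BH2")
ws.update("BH2")
ws.update("BH2")          # second update while already dirty (step 6)
ws.dbwr_scan()
print(ws.dbwr_write(), ws.where("BH2"))
```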

16

checkpoint

Checkpoints (checkpoint queue latch)

• To ensure that the data blocks whose redo was generated up to a certain point in the redo log (RBA) are written to disk

• Checkpoint structure includes:

– Checkpoint SCN

– Checkpoint RBA

– Thread that allocated the checkpoint

– Enabled thread bitmap

– Timestamp

About DDL:

– The Oracle server ensures that the DDL is successfully mini-checkpointed before the DROP (which depends on the SCN and seq# of the data blocks within the object).

17

CKPTQ & FQ

Pre-Oracle8 DBWR scanned the entire cache to find buffers with checkpoint bit set.

CKPTQ and FQs eliminate this scan.

– When the buffer is first modified, it is inserted into the CKPTQ in RBA order

– The buffer is also inserted into the appropriate FQ

When a checkpoint is initiated, DBWR writes all buffers on the queue until the checkpoint RBA is less than the head of the CKPTQ RBA.

A full checkpoint occurs when the command alter system checkpoint is issued, or when the database is shut down normally (any shutdown other than shutdown abort).
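The role of the CKPTQ can be illustrated with a queue ordered by first-dirty RBA. The sketch below is a simplification: RBAs are modelled as plain integers, file queues are left out, and the class and method names are hypothetical.

```python
from collections import deque


class CheckpointQueue:
    """Buffers ordered by the RBA of their first modification."""

    def __init__(self):
        self._queue = deque()   # (first_dirty_rba, buffer_id), ascending RBA
        self._queued = set()

    def mark_dirty(self, buffer_id, rba):
        # Only the FIRST modification links the buffer; later changes keep its place.
        if buffer_id in self._queued:
            return
        if self._queue and rba < self._queue[-1][0]:
            raise ValueError("RBAs are expected to be non-decreasing")
        self._queue.append((rba, buffer_id))
        self._queued.add(buffer_id)

    def checkpoint(self, checkpoint_rba):
        """Write from the head until the checkpoint RBA is less than the head's RBA."""
        written = []
        while self._queue and self._queue[0][0] <= checkpoint_rba:
            _, buffer_id = self._queue.popleft()
            self._queued.discard(buffer_id)
            written.append(buffer_id)
        return written


ckptq = CheckpointQueue()
ckptq.mark_dirty("A", 10)
ckptq.mark_dirty("B", 20)
ckptq.mark_dirty("A", 25)   # already queued, position unchanged
ckptq.mark_dirty("C", 30)
print(ckptq.checkpoint(20))
```

Because the queue is already in RBA order, the writer never has to look past the first buffer that is newer than the checkpoint, which is the scan that the pre-Oracle8 design could not avoid.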

18

DBWR

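Trigger conditions:

1. While the LRU list is being scanned for a buffer header that can be overwritten, the number of buffer headers already scanned reaches a limit (set by the hidden parameter _db_block_max_scan_pct; 40 in my database).

2. While DBWR searches the main LRUW list for buffer headers that have been updated and are waiting to be written to the data files, the number of buffer headers found exceeds a limit (set by the hidden parameter _db_writer_scan_depth_pct; 25 in my database).

3. The total number of dirty blocks on the main and auxiliary LRUW lists exceeds a limit, set by the hidden parameter _db_large_dirty_queue (25 in my database).

4. A full checkpoint triggers DBWR.

5. Taking a tablespace offline triggers DBWR.

6. Issuing alter tablespace … begin backup, which puts the tablespace into hot backup mode, triggers DBWR.

7. Making a tablespace read-only triggers DBWR.

8. Dropping an object (for example, dropping a table) triggers DBWR.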
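As a rough illustration of the first trigger, the check below treats _db_block_max_scan_pct as a percentage of the buffers in the working set. That interpretation is an assumption based on the parameter name, and the function is a sketch, not Oracle's actual logic.

```python
def scan_limit_reached(scanned: int, working_set_size: int, max_scan_pct: int = 40) -> bool:
    """True once the scanned headers reach max_scan_pct percent of the working set."""
    # Integer arithmetic avoids rounding surprises at the boundary.
    return scanned * 100 >= working_set_size * max_scan_pct


# With the 501-buffer working set from the earlier dump and a limit of 40:
print(scan_limit_reached(200, 501), scan_limit_reached(201, 501))
```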

19

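Writing data

DBWR copies the buffer headers of the dirty blocks to be written into a structure called the write batch. The DBWR process for each working set can copy buffer headers into this structure. Actual I/O happens only when the number of buffer headers in the write batch reaches a certain quota.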
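The batching behaviour described above can be sketched as follows. The class name, the batch limit, and the recorded "I/O calls" are all hypothetical; the point is only that buffer headers accumulate until the quota is reached and are then written together.

```python
class WriteBatch:
    """Collect dirty buffer headers and issue one write per full batch."""

    def __init__(self, limit: int):
        if limit <= 0:
            raise ValueError("limit must be positive")
        self.limit = limit
        self.pending = []
        self.io_calls = []      # each entry stands for one physical write request

    def add(self, buffer_header) -> None:
        self.pending.append(buffer_header)
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            self.io_calls.append(list(self.pending))
            self.pending.clear()


batch = WriteBatch(limit=3)
for bh in ["BH1", "BH2", "BH3", "BH4"]:
    batch.add(bh)
print(batch.io_calls, batch.pending)
```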

20

Wait events

Buffer busy waits

P1 = file#, P2 = block_id, P3 = reason code

Reason codes 130 and 220 are the most common.

Free buffer waits

If a session spends a lot of time on the free buffer waits event, it is usually due to one or a combination of the following five reasons:

•Inefficient SQL statements

•Not enough DBWR processes

•Slow I/O subsystem

•Delayed block cleanouts.

•Small buffer cache

21

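Wait events

Latch free

– Cache buffers chains

The hot block problem

The p1raw column of v$session_wait can be used to tell whether latch free waits are caused by a hot block. If p1raw stays the same across waits, the sessions are waiting on the same latch address, and the system has a hot block.

– Cache buffers lru chains

– Checkpoint queue latch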
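The p1raw check for hot blocks amounts to counting how often each latch address shows up among latch free waits. The sketch below works on rows that have already been sampled from v$session_wait as (event, p1raw) pairs; the sample addresses and the min_waiters threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hot_latch_addresses(rows: Iterable[Tuple[str, str]],
                        min_waiters: int = 2) -> List[Tuple[str, int]]:
    """Return (p1raw, wait_count) for latch addresses seen at least min_waiters times."""
    counts = Counter(p1raw for event, p1raw in rows if event == "latch free")
    return [(addr, n) for addr, n in counts.most_common() if n >= min_waiters]


# Hypothetical samples collected over several polls of v$session_wait.
samples = [
    ("latch free", "000000006893C08C"),
    ("latch free", "000000006893C08C"),
    ("db file sequential read", "000000000000002D"),
    ("latch free", "000000005A6E4100"),
    ("latch free", "000000006893C08C"),
]
print(hot_latch_addresses(samples))
```

An address that keeps appearing points at one child latch, and therefore at the hash chains it protects, as the place to look for the hot block.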
