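Oracle 10g introduced ASM. In 10g, when a failure (a cable fault, controller fault, HBA fault, or any other fault) makes a disk inaccessible, Oracle drops that disk. 11g introduced the disk group attribute disk_repair_time, which is part of the "Oracle ASM Fast Mirror Resync" feature. With this feature, when a failure occurs that is not a failure of the disk itself, and the fault is fixed and the disk is back online within disk_repair_time, Oracle resynchronizes only the extents that could not be written to that disk during the outage instead of all the data on the disk, which avoids the performance cost of a full resync. If the fault is still not repaired when disk_repair_time expires, Oracle drops the disk. The default is 3.6 hours, which suits most environments; adjust it to fit your own situation.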
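Two conditions must be met to use this feature:
1. The disk group's COMPATIBLE attributes must be 11.1 or higher (the disk group COMPATIBLE setting affects the disk group format, metadata, AU, and so on).
2. The disk group's redundancy mode must be Normal or High.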
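The two conditions can be written as a small check. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that both compatible.asm and compatible.rdbms must reach 11.1 (the ORA-15283 error later in this article points at compatible.rdbms) and that redundancy is spelled the way lsdg prints it (NORMAL, HIGH, EXTERN).

```python
def version_tuple(version: str) -> tuple:
    """Turn a dotted version such as '11.2.0.0.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def fast_resync_available(compatible_asm: str,
                          compatible_rdbms: str,
                          redundancy: str) -> bool:
    """Hypothetical helper: True when both conditions from the text hold."""
    minimum = (11, 1)
    # Assumption: both compatibility attributes must be at least 11.1.
    versions_ok = (version_tuple(compatible_asm) >= minimum
                   and version_tuple(compatible_rdbms) >= minimum)
    # Condition 2: only mirrored disk groups qualify.
    redundancy_ok = redundancy.upper() in {"NORMAL", "HIGH"}
    return versions_ok and redundancy_ok


if __name__ == "__main__":
    # Attribute values as they appear in this article's lsattr output.
    print(fast_resync_available("11.2.0.0.0", "10.1.0.0.0", "NORMAL"))
    print(fast_resync_available("11.2.0.0.0", "11.2", "NORMAL"))
```

Comparing tuples of integers avoids the string-comparison trap where "9.2" sorts after "11.1".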
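Note: if the disk itself has failed (with Normal/High redundancy), the disk must be dropped; after a replacement disk is added, Oracle rebalances automatically. With external redundancy, a disk failure takes the disk group offline and the database has to be restored from backup. Exadata uses at least Normal redundancy, mirrored at the ASM level, so any single cell node going down does not affect normal use of the database.
This article creates a disk group db2 in 11g ASM, takes one disk in the group offline, and observes what happens to the disk once disk_repair_time has elapsed.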
ASMCMD> lsattr -G db2 -l
Name                     Value
access_control.enabled   FALSE
access_control.umask     066
au_size                  1048576
cell.smart_scan_capable  FALSE
compatible.asm           11.2.0.0.0
compatible.rdbms         10.1.0.0.0
disk_repair_time         3.6h
sector_size              512
ASMCMD> setattr -G db2 disk_repair_time 5h
ORA-15032: not all alterations performed
ORA-15242: could not set attribute disk_repair_time
ORA-15283: ASM operation requires compatible.rdbms of 11.1.0.0.0 or higher (DBD ERROR: OCIStmtExecute)
Note: the compatible attribute (here compatible.rdbms) must be at least 11.1.
ASMCMD> setattr -G db2 compatible.rdbms 11.2
ASMCMD> lsattr -G db2 -l
Name                     Value
access_control.enabled   FALSE
access_control.umask     066
au_size                  1048576
cell.smart_scan_capable  FALSE
compatible.asm           11.2.0.0.0
compatible.rdbms         11.2
disk_repair_time         3.6h
sector_size              512
For the test, set disk_repair_time to 5 minutes. m means minutes and h means hours; if no unit is given, the default is hours.
ASMCMD> setattr -G db2 disk_repair_time 5m
ASM alert log:
SQL> /* ASMCMD */ALTER DISKGROUP DB2 SET ATTRIBUTE 'disk_repair_time' = '5m'
SUCCESS: /* ASMCMD */ALTER DISKGROUP DB2 SET ATTRIBUTE 'disk_repair_time' = '5m'
ASMCMD> lsattr -G db2 -l
Name                     Value
access_control.enabled   FALSE
access_control.umask     066
au_size                  1048576
cell.smart_scan_capable  FALSE
compatible.asm           11.2.0.0.0
compatible.rdbms         11.2
disk_repair_time         5m
sector_size              512
ASMCMD>
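The unit rules described above (m for minutes, h for hours, hours when no unit is given) are easy to get wrong when comparing the attribute with the countdown in seconds that the alert log prints. The following Python sketch converts an attribute value to seconds; the function name is hypothetical and the parsing covers only the forms used in this article.

```python
from decimal import Decimal

# Seconds per unit; only the two units mentioned in the article are handled.
_UNIT_SECONDS = {"h": 3600, "m": 60}


def repair_time_seconds(value: str) -> int:
    """Convert a disk_repair_time value such as '3.6h' or '5m' to seconds."""
    text = value.strip().lower()
    unit = "h"  # per the article, a bare number is interpreted as hours
    if text and text[-1] in _UNIT_SECONDS:
        unit = text[-1]
        text = text[:-1]
    # Decimal keeps 3.6 * 3600 exact instead of relying on float rounding.
    return int(Decimal(text) * _UNIT_SECONDS[unit])


if __name__ == "__main__":
    for sample in ("3.6h", "5m", "5h", "5"):
        print(sample, "->", repair_time_seconds(sample), "seconds")
```

The results line up with the alert log excerpts in this article: 5m corresponds to the (300) secs countdown, 3.6h to (12960) secs, and 5h to (18000) secs.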
View the disk group information and the disk headers
ASMCMD> lsdsk -G db2
Path
/dev/oracleasm/disks/ASMDISK11
/dev/oracleasm/disks/ASMDISK13
ASMCMD> lsdsk -G db2 --statistics
Reads  Write  Read_Errs  Write_Errs  Read_time  Write_Time  Bytes_Read  Bytes_Written  Voting_File  Path
191    1068   0          0           .965827    39.09044    1794048     4374528        N            /dev/oracleasm/disks/ASMDISK11
166    1068   0          0           1.040196   38.521265   684032      4374528        N            /dev/oracleasm/disks/ASMDISK13
ASMCMD>
[oracle@ohs1 ~]$ kfed read /dev/oracleasm/disks/ASMDISK11
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 1549816565 ; 0x00c: 0x5c6052f5
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:ORCLDISKASMDISK11 ; 0x000: length=17
kfdhdb.driver.reserved[0]: 1145918273 ; 0x008: 0x444d5341
kfdhdb.driver.reserved[1]: 827020105 ; 0x00c: 0x314b5349
kfdhdb.driver.reserved[2]: 49 ; 0x010: 0x00000031
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DB2_0000 ; 0x028: length=8
kfdhdb.grpname: DB2 ; 0x048: length=3
kfdhdb.fgname: DB2_0000 ; 0x068: length=8
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 33036942 ; 0x0a8: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.crestmp.lo: 3147311104 ; 0x0ac: USEC=0x0 MSEC=0x20a SECS=0x39 MINS=0x2e
kfdhdb.mntstmp.hi: 33036942 ; 0x0b0: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.mntstmp.lo: 3163762688 ; 0x0b4: USEC=0x0 MSEC=0xcc SECS=0x9 MINS=0x2f
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 2447 ; 0x0c4: 0x0000098f
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 33036942 ; 0x0e4: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.grpstmp.lo: 3147119616 ; 0x0e8: USEC=0x0 MSEC=0x14f SECS=0x39 MINS=0x2e
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
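kfed prints each header timestamp as two lines of hexadecimal fields (HOUR, DAYS, MNTH, YEAR on the .hi line and USEC, MSEC, SECS, MINS on the .lo line). The following Python sketch turns those annotations into a readable datetime. It only reads the decoded fields that kfed already prints; it does not decode the raw 32-bit values, and the function name is hypothetical.

```python
import re
from datetime import datetime


def kfed_timestamp(hi_fields: str, lo_fields: str) -> datetime:
    """Combine the NAME=0x.. annotations of a .hi/.lo timestamp pair."""
    values = {}
    for name, hex_value in re.findall(r"([A-Z]+)=0x([0-9a-fA-F]+)",
                                      hi_fields + " " + lo_fields):
        values[name] = int(hex_value, 16)
    return datetime(
        values["YEAR"], values["MNTH"], values["DAYS"],
        values["HOUR"], values["MINS"], values["SECS"],
        # datetime wants microseconds: milliseconds * 1000 plus microseconds
        values["MSEC"] * 1000 + values["USEC"],
    )


if __name__ == "__main__":
    # kfdhdb.crestmp.hi / kfdhdb.crestmp.lo annotations from the output above
    created = kfed_timestamp("HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0",
                             "USEC=0x0 MSEC=0x20a SECS=0x39 MINS=0x2e")
    print(created.isoformat())
```

For the creation timestamp shown above this gives 20 June 2016, 14:46:57.522, the same day as the alert log entries in this test.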
[oracle@ohs1 ~]$ kfed read /dev/oracleasm/disks/ASMDISK13
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483649 ; 0x008: disk=1
kfbh.check: 1549816567 ; 0x00c: 0x5c6052f7
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:ORCLDISKASMDISK13 ; 0x000: length=17
kfdhdb.driver.reserved[0]: 1145918273 ; 0x008: 0x444d5341
kfdhdb.driver.reserved[1]: 827020105 ; 0x00c: 0x314b5349
kfdhdb.driver.reserved[2]: 51 ; 0x010: 0x00000033
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 1 ; 0x024: 0x0001
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DB2_0001 ; 0x028: length=8
kfdhdb.grpname: DB2 ; 0x048: length=3
kfdhdb.fgname: DB2_0001 ; 0x068: length=8
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 33036942 ; 0x0a8: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.crestmp.lo: 3147311104 ; 0x0ac: USEC=0x0 MSEC=0x20a SECS=0x39 MINS=0x2e
kfdhdb.mntstmp.hi: 33036942 ; 0x0b0: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.mntstmp.lo: 3163762688 ; 0x0b4: USEC=0x0 MSEC=0xcc SECS=0x9 MINS=0x2f
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 2447 ; 0x0c4: 0x0000098f
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 33036942 ; 0x0e4: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.grpstmp.lo: 3147119616 ; 0x0e8: USEC=0x0 MSEC=0x14f SECS=0x39 MINS=0x2e
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
ASMCMD> iostat -G db2
Group_Name  Dsk_Name  Reads    Writes
DB2         DB2_0000  1859584  5001216
DB2         DB2_0001  749568   5001216
ASMCMD> offline -G db2 -D DB2_0001
Diskgroup altered.
ASMCMD>
The ASM alert log shows that the disk will be dropped in 5 minutes
WARNING: Disk 1 (DB2_0001) in group 1 will be dropped in: (300) secs on ASM inst 1
Mon Jun 20 15:55:23 2016
After 5 minutes, ASM force-drops the disk (alert log)
SQL> alter diskgroup DB2 drop disk DB2_0001 force /* ASM SERVER */
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Jun 20 16:01:36 2016
GMON updating for reconfiguration, group 1 at 26 for pid 33, osid 8507
NOTE: cache closing disk 1 of grp 1: (not open) DB2_0001
NOTE: group DB2: updated PST location: disk 0000 (PST copy 0)
NOTE: group 1 PST updated.
Mon Jun 20 16:01:36 2016
NOTE: membership refresh pending for group 1/0xe0485f13 (DB2)
GMON querying group 1 at 27 for pid 19, osid 5801
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DB2
SUCCESS: refreshed membership for 1/0xe0485f13 (DB2)
NOTE: starting rebalance of group 1/0xe0485f13 (DB2) at power 1
SUCCESS: alter diskgroup DB2 drop disk DB2_0001 force /* ASM SERVER */
SUCCESS: PST-initiated drop disk in group 1(3762839315))
Starting background process ARB0
Mon Jun 20 16:01:39 2016
ARB0 started with pid=34, OS id=9355
NOTE: assigning ARB0 to group 1/0xe0485f13 (DB2) with 1 parallel I/O
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0xe0485f13 (DB2)
NOTE: Attempting voting file refresh on diskgroup DB2
Mon Jun 20 16:01:43 2016
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
GMON updating for reconfiguration, group 1 at 28 for pid 34, osid 9361
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DB2
NOTE: group DB2: updated PST location: disk 0000 (PST copy 0)
NOTE: group 1 PST updated.
WARNING: offline disk number 1 has references (51 AUs)
NOTE: membership refresh pending for group 1/0xe0485f13 (DB2)
Mon Jun 20 16:01:49 2016
GMON querying group 1 at 29 for pid 19, osid 5801
NOTE: cache closing disk 1 of grp 1: (not open) _DROPPED_0001_DB2
Mon Jun 20 16:01:49 2016
SUCCESS: refreshed membership for 1/0xe0485f13 (DB2)
NOTE: Attempting voting file refresh on diskgroup DB2
The db2 disk group now shows Offline_disks as 1
ASMCMD> lsdg
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N      512     4096   1048576  2447      2394     0                1197            1              N             DB2/
MOUNTED  EXTERN  N      512     4096   1048576  4894      4786     0                4786            0              N             OHSDBA/
MOUNTED  NORMAL  N      512     4096   1048576  7341      6410     2447             1981            0              Y             SYSTEMDG/
ASMCMD>
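Spotting a non-zero Offline_disks value by eye is error-prone when there are many disk groups. As a rough sketch, the following Python function scans captured lsdg text and reports the groups with offline disks. It assumes the whitespace-separated layout shown in this article, with one header row and no spaces inside any field; the function name is hypothetical.

```python
def groups_with_offline_disks(lsdg_output: str) -> dict:
    """Map disk group name -> Offline_disks for every group where it is > 0."""
    lines = [line for line in lsdg_output.splitlines() if line.strip()]
    if not lines:
        return {}
    header = lines[0].split()
    result = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip prompts such as "ASMCMD>" and anything malformed
        row = dict(zip(header, fields))
        offline = int(row["Offline_disks"])
        if offline > 0:
            # lsdg prints the name with a trailing slash, e.g. "DB2/"
            result[row["Name"].rstrip("/")] = offline
    return result


if __name__ == "__main__":
    sample = """\
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL N 512 4096 1048576 2447 2394 0 1197 1 N DB2/
MOUNTED EXTERN N 512 4096 1048576 4894 4786 0 4786 0 N OHSDBA/
MOUNTED NORMAL N 512 4096 1048576 7341 6410 2447 1981 0 Y SYSTEMDG/
ASMCMD>
"""
    print(groups_with_offline_disks(sample))
```

Matching fields to header names rather than fixed positions keeps the sketch working if the column order differs in another release.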
After the offline, read the disk header again (the header has not changed)
[oracle@ohs1 ~]$ kfed read /dev/oracleasm/disks/ASMDISK13
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483649 ; 0x008: disk=1
kfbh.check: 1549816567 ; 0x00c: 0x5c6052f7
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:ORCLDISKASMDISK13 ; 0x000: length=17
kfdhdb.driver.reserved[0]: 1145918273 ; 0x008: 0x444d5341
kfdhdb.driver.reserved[1]: 827020105 ; 0x00c: 0x314b5349
kfdhdb.driver.reserved[2]: 51 ; 0x010: 0x00000033
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 1 ; 0x024: 0x0001
kfdhdb.grptyp: 2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DB2_0001 ; 0x028: length=8
kfdhdb.grpname: DB2 ; 0x048: length=3
kfdhdb.fgname: DB2_0001 ; 0x068: length=8
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 33036942 ; 0x0a8: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.crestmp.lo: 3147311104 ; 0x0ac: USEC=0x0 MSEC=0x20a SECS=0x39 MINS=0x2e
kfdhdb.mntstmp.hi: 33036942 ; 0x0b0: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.mntstmp.lo: 3163762688 ; 0x0b4: USEC=0x0 MSEC=0xcc SECS=0x9 MINS=0x2f
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 2447 ; 0x0c4: 0x0000098f
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 33036942 ; 0x0e4: HOUR=0xe DAYS=0x14 MNTH=0x6 YEAR=0x7e0
kfdhdb.grpstmp.lo: 3147119616 ; 0x0e8: USEC=0x0 MSEC=0x14f SECS=0x39 MINS=0x2e
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
SQL> select name, path from v$asm_disk;
NAME                                     PATH
---------------------------------------- ------------------------------------------------------------
                                         /dev/oracleasm/disks/ASMDISK14
                                         /dev/oracleasm/disks/ASMDISK13
_DROPPED_0001_DB2
DB2_0000                                 /dev/oracleasm/disks/ASMDISK11
OHSDBA_0001                              /dev/oracleasm/disks/ASMDISK10
OHSDBA_0000                              /dev/oracleasm/disks/ASMDISK9
DATA_PGOLD_0004                          /dev/oracleasm/disks/ASMDISK8
DATA_PGOLD_0003                          /dev/oracleasm/disks/ASMDISK7
DATA_PGOLD_0002                          /dev/oracleasm/disks/ASMDISK6
DATA_PGOLD_0001                          /dev/oracleasm/disks/ASMDISK5
DATA_PGOLD_0000                          /dev/oracleasm/disks/ASMDISK4
NAME                                     PATH
---------------------------------------- ------------------------------------------------------------
SYSTEMDG_0002                            /dev/oracleasm/disks/ASMDISK3
SYSTEMDG_0001                            /dev/oracleasm/disks/ASMDISK2
SYSTEMDG_0000                            /dev/oracleasm/disks/ASMDISK1
14 rows selected.
SQL>
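After the drop, try to bring the disk online again
ASMCMD> online -G db2 -D DB2_0001
ORA-15032: not all alterations performed
ORA-15054: disk "DB2_0001" does not exist in diskgroup "DB2" (DBD ERROR: OCIStmtExecute)
ASMCMD>
Try undrop disks. No error is reported, but the disk is still not restored: UNDROP DISKS only cancels drops that are still pending, so it has no effect on a disk that has already been force-dropped.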
[oracle@ohs1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 20 16:50:17 2016
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup db2 undrop disks;
Diskgroup altered.
SQL>
ASM alert log:
SQL> alter diskgroup db2 undrop disks
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=2
Mon Jun 20 16:50:35 2016
GMON updating for reconfiguration, group 2 at 17 for pid 28, osid 16254
NOTE: cache closing disk 1 of grp 2: (not open) _DROPPED_0001_DB2
NOTE: group DB2: updated PST location: disk 0000 (PST copy 0)
NOTE: group 2 PST updated.
Mon Jun 20 16:50:36 2016
NOTE: membership refresh pending for group 2/0xfb327d85 (DB2)
GMON querying group 2 at 18 for pid 19, osid 13443
NOTE: cache closing disk 1 of grp 2: (not open) _DROPPED_0001_DB2
SUCCESS: refreshed membership for 2/0xfb327d85 (DB2)
NOTE: starting rebalance of group 2/0xfb327d85 (DB2) at power 1
SUCCESS: alter diskgroup db2 undrop disks
Starting background process ARB0
Mon Jun 20 16:50:38 2016
ARB0 started with pid=29, OS id=16331
NOTE: assigning ARB0 to group 2/0xfb327d85 (DB2) with 1 parallel I/O
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 2/0xfb327d85 (DB2)
NOTE: Attempting voting file refresh on diskgroup DB2
Mon Jun 20 16:50:41 2016
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=2
GMON updating for reconfiguration, group 2 at 19 for pid 29, osid 16336
NOTE: cache closing disk 1 of grp 2: (not open) _DROPPED_0001_DB2
NOTE: group DB2: updated PST location: disk 0000 (PST copy 0)
NOTE: group 2 PST updated.
WARNING: offline disk number 1 has references (51 AUs)
NOTE: membership refresh pending for group 2/0xfb327d85 (DB2)
Mon Jun 20 16:50:48 2016
GMON querying group 2 at 20 for pid 19, osid 13443
NOTE: cache closing disk 1 of grp 2: (not open) _DROPPED_0001_DB2
Mon Jun 20 16:50:48 2016
SUCCESS: refreshed membership for 2/0xfb327d85 (DB2)
NOTE: Attempting voting file refresh on diskgroup DB2
ASMCMD> lsdg
State    Type    Rebal  Sector  Block  AU       Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N      512     4096   1048576  12235     10514    0                10514           0              N             DATA_PGOLD/
MOUNTED  NORMAL  N      512     4096   1048576  2447      2394     0                1197            1              N             DB2/
MOUNTED  EXTERN  N      512     4096   1048576  4894      4786     0                4786            0              N             OHSDBA/
MOUNTED  NORMAL  N      512     4096   1048576  7341      6410     2447             1981            0              Y             SYSTEMDG/
ASMCMD>
NAME                           PATH
------------------------------ ------------------------------------------------------------
                               /dev/oracleasm/disks/ASMDISK14
                               /dev/oracleasm/disks/ASMDISK13
_DROPPED_0001_DB2
DB2_0000                       /dev/oracleasm/disks/ASMDISK11
OHSDBA_0001                    /dev/oracleasm/disks/ASMDISK10
OHSDBA_0000                    /dev/oracleasm/disks/ASMDISK9
DATA_PGOLD_0004                /dev/oracleasm/disks/ASMDISK8
DATA_PGOLD_0003                /dev/oracleasm/disks/ASMDISK7
DATA_PGOLD_0002                /dev/oracleasm/disks/ASMDISK6
DATA_PGOLD_0001                /dev/oracleasm/disks/ASMDISK5
DATA_PGOLD_0000                /dev/oracleasm/disks/ASMDISK4
NAME                           PATH
------------------------------ ------------------------------------------------------------
SYSTEMDG_0002                  /dev/oracleasm/disks/ASMDISK3
SYSTEMDG_0001                  /dev/oracleasm/disks/ASMDISK2
SYSTEMDG_0000                  /dev/oracleasm/disks/ASMDISK1
14 rows selected.
SQL>
Because the header of the disk that was offlined and then dropped has not changed, adding the original disk back fails
SQL> alter diskgroup db2 add disk '/dev/oracleasm/disks/ASMDISK13';
alter diskgroup db2 add disk '/dev/oracleasm/disks/ASMDISK13'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15033: disk '/dev/oracleasm/disks/ASMDISK13' belongs to diskgroup "DB2"
SQL>
Clear the ASM disk header and add the disk again
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.00966501 seconds, 106 kB/s
[oracle@ohs1 ~]$ kfed read /dev/oracleasm/disks/ASMDISK13
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
B7F60200 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
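The difference between the two header states (KFBTYP_DISKHEAD with KFDHDR_MEMBER before clearing, KFBTYP_INVALID afterwards) can be checked from captured kfed output. The following Python sketch parses the "name: value ; offset: note" lines and summarizes the state. It assumes one field per line, as kfed prints them; the function names and the returned strings are hypothetical.

```python
import re

# name: value ; 0xNNN: note   (the value may be empty or follow the colon directly)
_FIELD = re.compile(r"^([\w.\[\]]+):\s*(.*?)\s*;\s*0x[0-9a-fA-F]+:\s*(.*)$")


def parse_kfed(text: str) -> dict:
    """Return {field name: (value, note)} for every field line in kfed output."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line.strip())
        if match:
            name, value, note = match.groups()
            fields[name] = (value, note.strip())
    return fields


def header_state(text: str) -> str:
    """Summarize whether the block still looks like an ASM disk header."""
    fields = parse_kfed(text)
    block_type = fields.get("kfbh.type", ("", ""))[1]
    if block_type != "KFBTYP_DISKHEAD":
        return "no ASM header"
    status = fields.get("kfdhdb.hdrsts", ("", ""))[1]
    group = fields.get("kfdhdb.grpname", ("", ""))[0]
    return f"{status} of {group}"


if __name__ == "__main__":
    before = ("kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD\n"
              "kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER\n"
              "kfdhdb.grpname: DB2 ; 0x048: length=3\n")
    after = "kfbh.type: 0 ; 0x002: KFBTYP_INVALID\n"
    print(header_state(before))
    print(header_state(after))
```

Lines that are not fields, such as the hex dump row and the KFED-00322 message, do not match the pattern and are ignored.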
[oracle@ohs1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 20 17:13:37 2016
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup db2 add disk '/dev/oracleasm/disks/ASMDISK13';
Diskgroup altered.
SQL>
Note: after a disk is dropped, its header is unchanged. If the disk itself is healthy and you want to reuse it, the disk header must be cleared before the disk can be added again.
Oracle ASM Fast Mirror Resync
http://docs.oracle.com/cd/E11882_01/server.112/e18951/asmdiskgrps.htm#OSTMG10044
disk group attribute
http://docs.oracle.com/cd/E11882_01/server.112/e18951/asmdiskgrps.htm#OSTMG10045
http://docs.oracle.com/cd/E11882_01/server.112/e18951/asmdiskgrps.htm#OSTMG137
http://docs.oracle.com/database/121/HABPT/config_storage.htm#HABPT4813
http://www.oratea.com/2017/05/19/disk_repair_time%E4%BB%8B%E7%BB%8D/
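Introduction to DISK_REPAIR_TIME
1. Introduction
When an ASM disk is dropped, ASM starts a rebalance so that the extents that lived on the dropped disk become redundant again. A rebalance takes a long time and generates heavy I/O on the other disks. Sometimes, however, a disk is only temporarily offline for maintenance or some other reason. Rebalancing for a short outage, and then rebalancing again when the disk is added back, is hard to accept. ASM therefore provides fast mirror resync: while a disk is OFFLINE, ASM tracks the extents on that disk that change, and when the disk comes back, ASM quickly resynchronizes only those changed extents.
Leaving a disk OFFLINE for a long time is risky, because under NORMAL redundancy the extents on the OFFLINE disk have only one remaining copy. ASM therefore provides the disk group attribute DISK_REPAIR_TIME, which sets how long a disk in the disk group may stay OFFLINE before it is dropped and ASM starts rebalancing. The default is 3.6 hours.
For planned OFFLINE maintenance, such as upgrading an Exadata cell node, 3.6 hours may not be enough, so the value may need to be raised beforehand.
--With the disk group in a normal state, the command to set the value is: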
SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time'= '36h';
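But if a disk is already offline, how do you set how long before that offline disk is dropped? The command is slightly different.
2. Determining how long until a disk is dropped
--Check the ASM alert log for the countdown before the disk is dropped, for example: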
WARNING: Disk 0 (DATA_CD_00_DMORLCEL08) in group 1 will be dropped in: (12960) secs on ASM inst 1
WARNING: Disk 1 (DATA_CD_01_DMORLCEL08) in group 1 will be dropped in: (12960) secs on ASM inst 1
WARNING: Disk 2 (DATA_CD_02_DMORLCEL08) in group 1 will be dropped in: (12960) secs on ASM inst 1
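Reading the countdown out of a long alert log by hand is tedious. The following Python sketch extracts the remaining seconds per disk from warning lines in the format shown above. The pattern is based only on these sample lines, so treat it as an assumption rather than a guaranteed log format; the function name is hypothetical.

```python
import re

# Matches lines such as:
# WARNING: Disk 0 (DATA_CD_00_DMORLCEL08) in group 1 will be dropped in: (12960) secs on ASM inst 1
_COUNTDOWN = re.compile(
    r"WARNING: Disk (\d+) \((\S+)\) in group (\d+) "
    r"will be dropped in: \((\d+)\) secs"
)


def drop_countdowns(alert_text: str) -> dict:
    """Map disk name -> remaining seconds, keeping the last value seen per disk."""
    remaining = {}
    for match in _COUNTDOWN.finditer(alert_text):
        disk_name = match.group(2)
        remaining[disk_name] = int(match.group(4))
    return remaining


if __name__ == "__main__":
    sample = (
        "WARNING: Disk 0 (DATA_CD_00_DMORLCEL08) in group 1 will be dropped in: (12960) secs on ASM inst 1\n"
        "WARNING: Disk 1 (DATA_CD_01_DMORLCEL08) in group 1 will be dropped in: (12960) secs on ASM inst 1\n"
    )
    for name, seconds in drop_countdowns(sample).items():
        print(f"{name}: {seconds} s ({seconds / 3600:.1f} h)")
```

Because later matches overwrite earlier ones, feeding the whole log gives the most recent countdown for each disk.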
--Check DISK_REPAIR_TIME
SQL> column name format a30
SQL> column value format a30
SQL> select name,value from v$asm_attribute where group_number=1 and name like '%disk_repair_time%';
NAME VALUE
------------------------------ ------------------------------
disk_repair_time 3.6h
3. Extending DISK_REPAIR_TIME
--If a failgroup has failed and all of its disks are already OFFLINE, the following command extends DISK_REPAIR_TIME
SQL> ALTER DISKGROUP
--Then check the ASM alert log to confirm that the change took effect, for example:
WARNING: Disk 2 (DATA_CD_02_DMORLCEL08) in group 1 will be dropped in: (18000) secs on ASM inst 1
WARNING: Disk 3 (DATA_CD_03_DMORLCEL08) in group 1 will be dropped in: (18000) secs on ASM inst 1
WARNING: Disk 4 (DATA_CD_04_DMORLCEL08) in group 1 will be dropped in: (18000) secs on ASM inst 1
WARNING: Disk 5 (DATA_CD_05_DMORLCEL08) in group 1 will be dropped in: (18000) secs on ASM inst 1
--If only one disk is offline, the command is:
SQL> ALTER DISKGROUP
--Check the disk names
SQL> col path format a59
SQL> set lines 200
SQL> set pagesi 400
SQL> select path, name, header_status, mode_status, mount_status, state, failgroup from v$asm_disk order by path;
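4. Dropping a disk immediately
--If the disk is expected to take a long time to repair, you can drop it immediately and start the rebalance rather than waiting for DISK_REPAIR_TIME to expire.
If a failgroup has failed, the command is: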
ALTER DISKGROUP
If a single disk has failed, the command is:
ALTER DISKGROUP