DataGuard可以提供Oracle数据库的冗灾、数据保护、故障恢复等,实现数据库快速切换与灾难性恢复。在生产数据库的保证"事务一致性"时,使用生产库的物理全备份创建备库,备库会通过生产库传输过来的归档日志或重做条目自动维护备用数据库。
DataGuard数据同步技术有以下优势:
1) Oracle数据库自身内置的功能,与每个Oracle新版本的新特性都完全兼容,且不需要另外付费。
2) 配置管理较简单,不需要熟悉其他第三方的软件产品。
3) 物理Standby数据库支持任何类型的数据对象和数据类型;
4) 逻辑Standby数据库处于打开状态,可以在保持数据同步的同时执行查询等操作。
5) 在最大保护模式下,可确保数据的零丢失。
下面我们给各位来整理一些关于Oracle 10g dataguard常用的维护命令
基本常用
1.停止Standby
select process, status from v$managed_standby; --查看备库是否在应用日志进行恢复
alter database recover managed standby database cancel;
shutdown immediate;
2.切换到只读模式
-----由shutdown模式切换到只读模式-------
startup nomount;
alter database mount standby database;
alter database open read only;
-----由应用日志模式切换到只读模式-------
alter database recover managed standby database cancel; -- 取消日志应用
alter database open read only;
3.切换回管理恢复模式
startup nomount;
alter database mount standby database;
alter database recover managed standby database disconnect from session; -- 启动日志应用
alter database recover managed standby database using current logfile disconnect from session;
4.主库和备库之间角色切换
4.1 主库切换为备库
alter database commit to switchover to physical standby;
alter database commit to switchover to physical standby with session shutdown;-- 主库有会话连接的时候
shutdown immediate
startup nomount;
alter database mount standby database;
alter database recover managed standby database disconnect from session;
4.2 从库切换为主库
alter database commit to switchover to primary;
shutdown immediate;
startup;
alter system switch logfile;
5.备库自动使用主库传过来的日志进行恢复
alter database recover automatic standby database;
6.更改保护模式
alter database set standby database to maximize protection;
alter database set standby database to maximize availability;
alter database set standby database to maximize performancen;
7.取消自动恢复模式
alter database recover managed standby database cancel;
alter database recover managed standby database finish;
alter database recover managed standby database finish force;
第一部分 日常维护
一 正确打开主库和备库
1 主库:
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;
2 备库:
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
二 正确关闭顺序
1 备库:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL>SHUTDOWN IMMEDIATE;
2 主库
SQL>SHUTDOWN IMMEDIATE;
三 备库Read-Only模式打开
当前主库正常OPEN状态
备库处于日志传送状态.
1 在备库停止日志传送
SQL> recover managed standby database cancel;
2 备库Read-only模式打开
SQL> alter database open read only;
3 备库回到日志传送模式
SQL> recover managed standby database disconnect from session;
Media recovery complete.
SQL> select status from v$instance;
STATUS
------------
MOUNTED
四 日志传送状态监控
1 主库察看当前日志状况
SQL> select sequence#,status from v$log;
SEQUENCE# STATUS
---------- ----------------
51 ACTIVE
52 CURRENT
50 INACTIVE
2 备库察看RFS(Remote File Service)接收日志情况和MRP应用日志同步主库情况
SQL> SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS
2 FROM V$MANAGED_STANDBY;
PROCESS STATUS THREAD# SEQUENCE# BLOCK# BLOCKS
--------- ------------ ---------- ---------- ---------- ----------
ARCH CONNECTED 0 0 0 0
ARCH CONNECTED 0 0 0 0
RFS RECEIVING 0 0 0 0
MRP0 WAIT_FOR_LOG 1 52 0 0
RFS RECEIVING 0 0 0 0
可以看到备库MPR0正等待SEQUENCE#为52的redo.
3 察看备库是否和主库同步
SQL> SELECT ARCHIVED_THREAD#, ARCHIVED_SEQ#, APPLIED_THREAD#, APPLIED_SEQ#
2 FROM V$ARCHIVE_DEST_STATUS;
ARCHIVED_THREAD# ARCHIVED_SEQ# APPLIED_THREAD# APPLIED_SEQ#
---------------- ------------- --------------- ------------
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
1 51 1 50
可以看到备库已经将SEQUENCE#51的日志归档,已经将SEQUENCE#50的redo应用到备库.
由于已经将SEQUENCE#51的日志归档,所以SEQUENCE#51以前的数据不会丢失.
4 察看备库已经归档的redo
SQL> SELECT REGISTRAR, CREATOR, THREAD#, SEQUENCE#, FIRST_CHANGE#,
2 NEXT_CHANGE# FROM V$ARCHIVED_LOG;
REGISTR CREATOR THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
------- ------- ---------- ---------- ------------- ------------
SRMN SRMN 1 37 572907 573346
RFS ARCH 1 38 573346 573538
RFS ARCH 1 39 573538 573623
RFS ARCH 1 40 573623 573627
RFS ARCH 1 41 573627 574326
RFS ARCH 1 42 574326 574480
RFS ARCH 1 43 574480 590971
RFS ARCH 1 44 590971 593948
RFS FGRD 1 45 593948 595131
RFS FGRD 1 46 595131 595471
FGRD FGRD 1 46 595131 595471
REGISTR CREATOR THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
------- ------- ---------- ---------- ------------- ------------
RFS ARCH 1 47 595471 595731
RFS ARCH 1 48 595731 601476
RFS ARCH 1 49 601476 601532
RFS ARCH 1 50 601532 606932
RFS ARCH 1 51 606932 607256
5 察看备库已经应用的redo
SQL> SELECT THREAD#, SEQUENCE#, FIRST_CHANGE#, NEXT_CHANGE#
2 FROM V$LOG_HISTORY;
THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 1 366852 368222
1 2 368222 369590
1 3 369590 371071
1 4 371071 372388
1 5 372388 376781
1 6 376781 397744
1 7 397744 407738
1 8 407738 413035
1 9 413035 413037
1 10 413037 413039
1 11 413039 413098
THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 12 413098 428161
1 13 428161 444373
1 14 444373 457815
1 15 457815 463016
1 16 463016 476931
1 17 476931 492919
1 18 492919 505086
1 19 505086 520683
1 20 520683 530241
1 21 530241 545619
1 22 545619 549203
THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 23 549203 552403
1 24 552403 553230
1 25 553230 553398
1 26 553398 553695
1 27 553695 554327
1 28 554327 557569
1 29 557569 561279
1 30 561279 561385
1 31 561385 566069
1 32 566069 566825
1 33 566825 570683
THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 34 570683 571627
1 35 571627 571867
1 36 571867 572907
1 37 572907 573346
1 38 573346 573538
1 39 573538 573623
1 40 573623 573627
1 41 573627 574326
1 42 574326 574480
1 43 574480 590971
1 44 590971 593948
THREAD# SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#
---------- ---------- ------------- ------------
1 45 593948 595131
1 46 595131 595471
1 47 595471 595731
1 48 595731 601476
1 49 601476 601532
1 50 601532 606932
1 51 606932 607256
可以看到备库已经将SEQUENCE#为51的归档文件应用到备库.
6 察看备库接收,应用redo数据过程.
SQL> SELECT MESSAGE FROM V$DATAGUARD_STATUS;
MESSAGE
--------------------------------------------------------------------------------
ARC0: Archival started
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC1: Archival started
ARC1: Becoming the heartbeat ARCH
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[1]: Assigned to RFS process 19740
RFS[1]: Identified database type as 'physical standby'
Primary database is in MAXIMUM PERFORMANCE mode
Attempt to start background Managed Standby Recovery process
MESSAGE
--------------------------------------------------------------------------------
MRP0: Background Managed Standby Recovery process started
Managed Standby Recovery not using Real Time Apply
Clearing online redo logfile 7 /oraguard/redo1/redo_7_1.log
Clearing online redo logfile 7 complete
Media Recovery Waiting for thread 1 sequence 47
RFS[1]: No standby redo logfiles created
Redo Shipping Client Connected as PUBLIC
-- Connected User is Valid
RFS[2]: Assigned to RFS process 19746
RFS[2]: Identified database type as 'physical standby'
Primary database is in MAXIMUM PERFORMANCE mode
MESSAGE
--------------------------------------------------------------------------------
Committing creation of archivelog '/arch/1_47_552308270.arc'
Media Recovery Log /arch/1_47_552308270.arc
Media Recovery Waiting for thread 1 sequence 48
MRP0: Background Media Recovery cancelled with status 16037
MRP0: Background Media Recovery process shutdown
Managed Standby Recovery Canceled
Attempt to start background Managed Standby Recovery process
MRP0: Background Managed Standby Recovery process started
Managed Standby Recovery not using Real Time Apply
Media Recovery Waiting for thread 1 sequence 48
RFS[1]: No standby redo logfiles created
MESSAGE
--------------------------------------------------------------------------------
Committing creation of archivelog '/arch/1_48_552308270.arc'
Media Recovery Log /arch/1_48_552308270.arc
Media Recovery Waiting for thread 1 sequence 49
RFS[1]: No standby redo logfiles created
Committing creation of archivelog '/arch/1_49_552308270.arc'
Media Recovery Log /arch/1_49_552308270.arc
Media Recovery Waiting for thread 1 sequence 50
RFS[1]: No standby redo logfiles created
Committing creation of archivelog '/arch/1_50_552308270.arc'
Media Recovery Log /arch/1_50_552308270.arc
Media Recovery Waiting for thread 1 sequence 51
MESSAGE
--------------------------------------------------------------------------------
RFS[1]: No standby redo logfiles created
Committing creation of archivelog '/arch/1_51_552308270.arc'
Media Recovery Log /arch/1_51_552308270.arc
Media Recovery Waiting for thread 1 sequence 52
可以看到RFS接收到sequence#为51的归档文件并存至备库归档目录/arch/1_51_552308270.arc.
Oracle自动应用文件/arch/1_51_552308270.arc进行备库与主库同步
Oracle继续等待主库sequence 52的归档文件
五 备库归档目录维护
1 找到备库归档目录
SQL> show parameter log_archive_dest_1
NAME TYPE
------------------------------------ --------------------------------
VALUE
------------------------------
log_archive_dest_1 string
LOCATION=/arch
VALID_FOR=(ALL_LOGFILES,ALL_RO
LES)
DB_UNIQUE_NAME=ora2
log_archive_dest_10 string
2 维护策略
每周2,4,7删除已经应用的归档文件
具体参见附录二
第二部分 主库正常切换
一 人工干预主库正常切换
1 在主库端检验数据库可切换状态
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO STANDBY
1 row selected
SWITCHOVER_STATUS:TO STANDBY表示可以正常切换.
如果SWITCHOVER_STATUS的值为SESSIONS ACTIVE,表示当前有会话处于ACTIVE状态
2 开始主库正常切换
如果SWITCHOVER_STATUS的值为TO STANDBY 则:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;
如果SWITCHOVER_STATUS的值为SESSIONS ACTIVE 则:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
成功运行这个命令后,主库被修改为备库
3 重启先前的主库
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
4 在备库验证可切换状态
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO_PRIMARY
1 row selected
5 将目标备库转换为主库
如果SWITCHOVER_STATUS的值为TO STANDBY 则:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
如果SWITCHOVER_STATUS的值为SESSIONS ACTIVE 则:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
成功运行这个命令后,备库被修改为主库
6 重启目标备库
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP;
7 先前主库启动日志传送进程
SQL> alter database recover managed standby database disconnect;
总结: 这样主库的一次正常切换完成.切换后的状态,原先的主库变为备库,原先的备库变为主库.
二 通过运行脚本实现主库正常切换
1 主库切换为备库
在主库上运行脚本
/admin/dataGuard/switchover/primary_to_standby.sh
2 备库切换为主库
在备库上运行脚本
/admin/dataGuard/switchover/standby_to_primary.sh
脚本1成功运行后,再运行脚本2,不能同时运行两个脚本.
经过这次切换后原来的主库变为备库,原先的备库变为主数据并且OPEN对应用提供服务.
3 复原最初状态
在原备库上运行脚本
/admin/dataGuard/switchover/primary_to_standby.sh
成功完成后
在原主库上运行脚本
/admin/dataGuard/switchover/standby_to_primary.sh
第三部分 主库灾难切换
一 人工干预主库灾难切换
二 通过运行脚本实现主库灾难切换
SQL>alter database recover managed standby database cancel;
SQL>shutdown immediate
SQL>startup mount
SQL>ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;
SQL>alter database recover managed standby database finish;
-- switch
SQL>alter database commit to switchover to primary with session shutdown;
-- open
SQL>shutdown immediate
SQL>startup
附:
一 有选择察看redo传送与应用情况
select message from v$dataguard_status
where message_num>&message_num;
二 备库归档目录维护脚本
在crontab 中定制每日执行removeCommand.sh即可。
流程:每日11:50PM执行removeCommand.sh
假设今日2005-04-05 则删除04-04和04-03两日已应用归档日志.保留今日已应用归档日志
[oracle@db_gurid admin]$ crontab -l
50 23 * * * sh /oraguard/admin/removeCommand.sh>>removeArch.log
##################
[oracle@db_gurid admin]$ cat removeCommand.sh
#!/bin/sh
export ORACLE_BASE=/ora10g/app
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0/db_1
export ORACLE_SID=ora2
cd /oraguard/admin
$ORACLE_HOME/bin/sqlplus /nolog<
@/oraguard/admin/removeArch.sql
EOF
chmod +x /oraguard/admin/removeArch.sh
/oraguard/admin/removeArch.sh>>removeArch2.log
##################
[oracle@db_gurid admin]$ cat removeArch.sql
set feed off
set heading off
set echo off
spool removeArch.sh
select 'rm '||name from v$archived_log where applied='YES' and completion_time>trunc(sysdate-3) and completion_time
第四部份物理Dataguard日常操作管理(一)
3.1 Dataguard 常用参数
日志的传输以及应用可以算作是Dataguard的核心所在.在我们搭建DG的过程中,如何配置优化日志传输服务,关系到整个DG体系的性能以及可用性.而且,不同的保护模式也需要不用的参数组合.10g下,影响配置日志传输的参数主要有以下几个:
1. ARCH/LGWR
设置日志的传送模式,默认使用arch传送.传送发生在日志切换边沿,最大可用和最大保护模式下,需要使用lgwr来传送日志.使用lgwr传送日志,需要备库建立standby logfile,并且支持日志的实时应用.
2. SYNC /ASYNC
该参数表示网络I/O的操作方式, SYNC表示网络I/O将与重做日志的写入同步进行,等待网络i/o完成收到响应后继续下一个写操作.而ASYNC表示日志的传送是异步的,oracle利于LNS进程,接收lgwr发送过来的重做日志信息放入缓冲区,并异步传送到备机,也可以手动指定缓冲区的大小
最大保护和最大可用模式下,需要设置为SYNC AFFIM模式.
3. AFFIMAFFIRM
该参数是LGWR传送模式下的一个属性,表示重做日志的磁盘I/O模式, AFFIM表示同步并且发送成功写操作状态到主数据库, NOAFFIRM表示主库无需等待备库的日志写成功.
4. MANDATORY /OPTIONAL
该参数表示归档的模式,默认值为OPTIONAL. MANDATORY表示强制归档,如果归档不成功会引起主库的归档等待.
5. REOPENREOPEN
该参数表示归档文件收到错误信息后,是否重试以及重试的最小间隔时间.
6. MAX_FAILURE/ NOMAX_FAILUR
该参数表示由于故障而被关闭的目标文件的最大重试次数.超过设定次数,将不再重试.
NOMAX_FAILUR表示不断重试
7. NET_TIMEOUT
该参数表示在网络出现异常或者某些情况下,主数据库的LNS进程放弃网络连接之前的最大等待时间.
8. DELAYDELAY
该参数表示日志的应用模式,delay表示延时指定时间应用传送过来的日志,nodelay表示不延时.
该参数作用也可以通过发布alter database recover managed standby database delay 来实现.
9. VALID_FOR
VALID_FOR的引入,为了更好的实现主备平滑切换,用于说明目标文件何时可用以及归档的重做日志类型.
VALID_FOR属性由2部分组成:archive_source(online_logfile,standby_logfile,all_logfiles)和database_role(primary_role,standby_role,all_role).
online_logfile: 表示归档联机重做日志
standby_logfile:表示归档备用数据库的重做日志/接受的重做日志
all_logfiles: online_logfile && standby_logfile
primary_role: 仅当数据库角色为主库时候归档生效
standby_role: 仅当数据库角色为备库时候归档生效
all_role: 任意角色归档均生效
3.2 选择数据保护模式
表3.2 不同保护模式下LOG_ARCHIVE_DEST_n参数属性设置
最大保护 最高可用 最高性能
REDO写进程 LGWR LGWR LGWR或ARCH
网络传输模式 SYNC SYNC LGWR进程时SYNC或ASYNC,ARCH进程时SYNC
磁盘写操作 AFFIRM AFFIRM AFFIRM或NOAFFIRM
是否需要 Standby Redolog YES YES 可没有,但推荐有
下面我们进入实践部分将一个Data Guard配置从最高性能模式改为最高可用性模式,以下操作如非特别注明,均是在Primary数据库端执行。
3.2.1 更改DG最大可用模式
主库上执行以下命令查看主库保护模式
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
----------------------------- -------------------------------
MAXIMUM PERFORMANCE MAXIMUM PERFORM
A
当前主库的模式为最大性能模式,也是建立DG默认模式,停止备库的日志应用,之前建立DG的时候主备库都增加了standby_redolog,在此步骤直接做保护模式转换
主库上执行以下语句
SQL>alter system set log_archive_dest_2='SERVICE=syw01
LGWR SYNC
AFFIRM
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=SYW01'
重启主库至mount状态并在主库上执行下列语句
SQL> alter database set standby database to maximize availability;
性能模式,
下面我们就测试一下。首先断开物理Standby数据库网络,这时primary与standby无法正常通信和传送日志
SQL> alter database open
考虑主备切换同时修改standby,在备库上执行
SQL>alter system set log_archive_dest_2='SERVICE=syw
LGWR SYNC AFFIRM
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
DB_UNIQUE_NAME=SYW'
注:MAXIMIZE后可跟{PROTECTION | AVAILABILITY | PERFORMANCE},分别对应最大保护、最高可用性和最高性能。
再次查看主库,备库保护模式
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
3.2.2 测试最高可用状态
当数据库处于最高可用性模式时,如果Standby无法访问,Primary应该会自动切换成最
高
重新查询主库的保护级别:
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY RESYNCHRONIZATION
保护级别已经自动变成待同步状态。
警告日志有如下信息
Thread 1 advanced to log sequence 143
Current log# 2 seq# 143 mem# 0: /u01/app/oracle/oradata/syw/redo02.log
Sun Aug 28 11:17:06 2011
ARC3: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
ARC3: Standby redo logfile selected for thread 1 sequence 142 for destination LOG_ARCHIVE_DEST_2
Sun Aug 28 11:20:57 2011
系统已更改。
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Sun Aug 28 11:20:57 2011
Errors in file /u01/app/oracle/admin/syw/bdump/syw_lgwr_30471.trc:
ORA-16198: 远程归档期间内部通道上超时
LGWR: Network asynch I/O wait error 16198 log 2 service 'syw01'
Current log# 2 seq# 143 mem# 0: /u01/app/oracle/oradata/syw/redo02.log
LGWR: Failed to archive log 2 thread 1 sequence 143 (16198)
Sun Aug 28 11:21:00 2011
LGWR: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'syw01' (error 16198
在主库执行更新插入操作
SQL> insert into t values(3);
已创建 1 行。
SQL> commit;
提交完成。
执行日志切换
SQL> alter system switch logfile;
再次查看主库保护级别
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY RESYNCHRONIZATION
此时primary依旧是日志待同步状态,现在将网线连接上,使主库与备库正常通信
备库警告日志出现如下信息
Sun Aug 28 11:29:10 2011
Media Recovery Log /sywdg/arch1/1_143_758642906.dbf
Media Recovery Log /sywdg/arch1/1_144_758642906.dbf
Media Recovery Log /sywdg/arch1/1_145_758642906.dbf
Media Recovery Log /sywdg/arch1/1_146_758642906.dbf
Media Recovery Waiting for thread 1 sequence 147 (in transit)
Sun Aug 28 11:29:35 2011
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to MAXIMUM AVAILABILITY level
RFS[24]: Successfully opened standby log 4: /u01/app/oracle/oradata/stdby_redo04.log'
此时主库已更改成最大可用模式,与此同时在备库上执行
ABILITY MAXIMUM AVAILABILIT
SQL> alter database recover managed standby database cancel;
数据库已更改。
SQL> alter database open;
数据库已更改。
SQL> select * from t;