Recovery Information and Scenarios

Recovery Guidelines
Performing Recovery (Different Recover commands)
Archive Log Information
Control File Information
Online Redo Log Information
System Tablespace Information
Temp Tablespace Information
Application Tablespace Information
  Loss of Index Datafile
  Loss of Data Datafile
Read-Only Tablespace
Archive Log Files
Recovery Scenarios
  Recover Full DB or Disaster Recovery
  Media Failure, DB not in ARCH Log Mode, Restore ALL Files
  Recover a Particular Datafile when DB is Closed (Closed DB Recovery)
  Recover a Particular Tablespace when DB is Open (Open DB Recovery with Offline Tablespace)
  Recover a Particular Datafile when DB is Open (Open DB Recovery with Offline Datafile)
  Recover a Lost Datafile without Valid Backup
  Recover Missing Archive Logs or Redo Log Files (Cancel-Based Recovery)
  Time-Based Recovery (Example: recover from drop table)
  Change-Based Recovery
  Recover Datafile in Backup Mode
  Loss of Control File
  Loss of ALL Control Files
  Loss of Redo Log Files - Explanation
  Loss of "Inactive" Online Redo Log Group
  Loss of "Active" or "Current" Online Redo Log Group
  Loss of UNDO Tablespace
  Loss of spfile/pfile
Recovery Guidelines
First of all, you need to understand the concepts of recovery:

Instance Failure
Instance failure is a hardware, software, or system failure that
prevents an instance from continuing work. It can be caused by a CPU
failure, an operating system failure, a power outage, failure of one
of the ORACLE background processes or a failure to access a required
database file when the file is not lost or damaged.
Instance Recovery
Instance recovery is automatic; restarting the database performs the instance recovery. It involves two steps:
1) Rolling forward, which reapplies
   o data that has not been recorded in the database files
   o the contents of rollback segments
2) Rolling back, which undoes
   o transactions that have been explicitly rolled back
   o transactions that have not been committed
Any resources held by transactions in process at the time of the
failure will be released. Instance recovery is not necessary if the
database is shut down normally.
Media Failure
Media failure is a hardware, software, or system failure that
prevents reading or writing to files that are required to operate
the database. Media failure is a failure caused by the loss of the
control file, database files or redo log file.
To recover up to the last point of failure, the database must be
operating in ARCHIVELOG mode. In addition, you must have the latest
backup of the database, all online redo log files, all archived logs,
and the current control file.
If the DB is operating in ARCHIVELOG mode, there are options for
recovery depending on the type of failure. Broadly there are
two types of media recovery:
• Complete media recovery
• Incomplete media recovery
Complete Media Recovery:
A complete media recovery recovers all lost changes to the
database. Use it to recover the entire database or a single
corrupted data file. The dynamic view V$RECOVER_FILE shows which
files require recovery. Complete media recovery can be accomplished
only if all online and archived redo log files are
available. It can be performed on offline data files while the
database is open, and the database must be opened with the RESETLOGS
option if a backup control file or a new control file was created
for the recovery.
Types of complete media recovery are:
1. Closed Database Recovery
2. Open Database Recovery with Offline Tablespace
3. Open Database Recovery with Offline Datafile
Incomplete Media Recovery:
In some situations complete media recovery may not be possible or
may not be desired. This can happen because all the files needed for
a complete recovery are not available (for example, all online redo
log files are lost and you were not duplexing your log
groups). You might want to perform an incomplete recovery if a
user drops a table and you want to recover to the point before the
table was dropped.
Incomplete media recovery is also called point-in-time
recovery. A point-in-time recovery that is not continued to a
complete recovery must be terminated with the OPEN RESETLOGS
option. The database must be closed during incomplete recovery
operations.
There are three types of incomplete media recovery:
1. Cancel-Based Recovery
2. Time-Based Recovery (Example: recover from a dropped table)
3. Change-Based Recovery
Performing Recovery (Different Recovery Commands)
You can use different commands to recover your database:
1) RECOVER DATABASE
This command is used only with a current control file. The database
must be mounted, but not OPEN. Oracle compares the control file with
the datafile headers and brings the datafiles up to date by applying
archived redo logs until the control file entries match the datafile
headers; the online redo logs are then applied to make the datafiles
current. Once the recovery is complete, open the database with: ALTER DATABASE OPEN;
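Assuming the damaged files have already been restored from backup at the OS level, the full sequence might be sketched as:

```sql
-- Sketch of a closed-database complete recovery.
STARTUP MOUNT;
RECOVER DATABASE;
-- Apply the suggested archived logs when prompted, or enter AUTO.
ALTER DATABASE OPEN;
```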
2) RECOVER DATAFILE ‘filename’
This command is used when the database is up and cannot be brought
down. It can also be used when the database is in the mounted state,
but the tablespace which contains these datafiles must be taken
offline. Issue RECOVER DATAFILE 'filename'. You will be prompted for
log files, and the changes will be applied only to these files.
Once the media recovery is complete, the tablespace can be brought
online. This command allows 'multi-tasking' recovery: different
datafiles can be recovered in parallel using different sessions or
terminals, which is very useful when there are several datafiles to
be recovered.
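A minimal sketch of this open-database scenario, using a hypothetical file name:

```sql
-- Take the damaged file offline while the rest of the database stays open.
ALTER DATABASE DATAFILE '/u01/oradata/app01.dbf' OFFLINE;
-- Restore the file from the hot backup at the OS level, then:
RECOVER DATAFILE '/u01/oradata/app01.dbf';
ALTER DATABASE DATAFILE '/u01/oradata/app01.dbf' ONLINE;
```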
3) RECOVER TABLESPACE ‘tsname’
The tablespace must be offline and the database must be in the OPEN
state. This command recovers a single tablespace to the current
state. It cannot be used on the SYSTEM tablespace or on a tablespace
containing rollback segments whose status is "in use". If keeping
the database OPEN is not required, you can instead recover with
standard recovery (RECOVER DATABASE).
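A sketch with a hypothetical tablespace name:

```sql
ALTER TABLESPACE app_data OFFLINE IMMEDIATE;
-- Restore the tablespace's datafiles from the hot backup, then:
RECOVER TABLESPACE app_data;
ALTER TABLESPACE app_data ONLINE;
```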
4) RECOVER DATABASE UNTIL CANCEL
Manual recovery after media failure enables you to control how many
redo log files are applied to the database. This can be used to undo
an inadvertent change to the database by stopping recovery before
the change took place. This option is needed for recovery with a
control file backup (an old copy) when the current control file is
not available. The database must be MOUNTED but not OPEN. Once you
issue the RECOVER DATABASE UNTIL CANCEL command, you will be
prompted for logs, beginning with the earliest redo log file
recorded in the header of each database file. The recovery process
will continue to prompt for redo log files until CANCEL is typed
when prompted for the next redo log file; recovery can be cancelled
at the prompt for any redo log.
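The sequence might be sketched as:

```sql
STARTUP MOUNT;
RECOVER DATABASE UNTIL CANCEL;
-- Apply suggested logs; enter CANCEL at the prompt for the first
-- log you do not want applied (or do not have).
ALTER DATABASE OPEN RESETLOGS;
```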
5) RECOVER DATABASE UNTIL TIME <date&time>
This is the same as RECOVER DATABASE UNTIL CANCEL, except that
recovery stops at a specified point in time within a log file.
The format of <date&time> is YYYY-MM-DD:hh:mm:ss
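A sketch, with an illustrative timestamp:

```sql
STARTUP MOUNT;
RECOVER DATABASE UNTIL TIME '2004-05-31:13:00:00';
ALTER DATABASE OPEN RESETLOGS;
```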
6) RECOVER DATABASE ..... USING BACKUP CONTROLFILE
This can be used for recovery with an old copy (backup) of the
control file. Everything else is similar to the other RECOVER
DATABASE commands.
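Combined with UNTIL CANCEL, a sketch might look like this (the backup control file copies must already be restored into place):

```sql
STARTUP MOUNT;
RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE;
ALTER DATABASE OPEN RESETLOGS;
```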
Options for Opening the Database
As a security measure, before starting the recovery back up the
datafiles, online logs, and control file. If space is a constraint,
then at least back up the online logs and control files.
Open the database with: ALTER DATABASE OPEN [NO]RESETLOGS
RESETLOGS
Before using the RESETLOGS option, take a cold backup of the
database: once RESETLOGS is used, the existing redo log files cannot
be used at all. The RESETLOGS option clears all the online redo logs
and modifies all the online data files to indicate that no recovery
is needed. After resetting the redo logs, none of the existing log
files or data file backups can be used. The log sequence number in
the control file is reset, which is very important for recovery
purposes: recovery will be applied only to log files whose sequence
number is greater than the log sequence number in the control
file. Be very cautious when using the RESETLOGS option. One more
important factor to remember is that all datafiles must be online,
otherwise they will become unusable once the database is up.
NORESETLOGS
The NORESETLOGS option allows the online redo logs to be applied and
does not clear the redo log files during startup; it leaves the
online logs intact. It is only used in the scenario where MANUAL
RECOVERY is started and CANCELled, and then RECOVER DATABASE is
started.
Provide Archive Log Information
Manually
If the archive log destination is not defined in the parameter file,
Oracle has to be informed about the location of the archived logs:
ALTER SYSTEM ARCHIVE LOG START TO '<location>';
When recovering, you can also specify the archive log location:
RECOVER FROM '<location>' DATABASE;
Automatically
Before the recovery, turn automatic recovery on:
SET AUTORECOVERY ON;
Then type your recovery option, for example:
RECOVER DATAFILE '<datafile path/name or file no.>';
Alternatively, enter AUTO when prompted for a redo log file:
AUTO;
Or use:
RECOVER AUTOMATIC DATAFILE '<datafile path/name or file no.>';
Some Useful Views
V$RECOVER_FILE - which files need recovery
V$RECOVERY_LOG - archived logs needed for recovery
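Typical queries against these views (column names can vary between Oracle versions):

```sql
-- Which datafiles need recovery, and why:
SELECT file#, error, change#, time FROM v$recover_file;
-- Which archived logs are needed to complete the recovery:
SELECT thread#, sequence#, archive_name FROM v$recovery_log;
```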
Control File Information
Control files are key components to recovery. A control file that
reflects the physical structure and current status of the database
must always be available. The control file should be backed up at
the completion of every structural change. For example, after adding
a datafile or log file and after the completion of a hot backup to
the database, the control file should be backed up.
Because a control file is an unreadable binary file, Oracle provides
a method to create a readable SQL script. Such a script is generated
by:
alter database backup controlfile to trace as '/path/Today_Date';
Multiple backup copies of control files from different times should
be kept to enable point-in-time recovery. The trace files should
have a naming standard which includes the timestamp.
At least three control files should be used. Each should be on
different controllers and disks.
In the event of losing a control file and its mirrored copy due to
user error or corruption, the DBA can easily restart the instance
with the remaining valid control files. If all control files are
lost, the DBA needs to startup nomount the instance, recreate the
control file by running a valid create control file script, and open
the instance.
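Assuming a previously generated create controlfile script exists (the path here is hypothetical), the sequence is roughly:

```sql
STARTUP NOMOUNT;
-- Runs CREATE CONTROLFILE ... from the trace generated earlier:
@/path/create_controlfile.sql
RECOVER DATABASE;
ALTER DATABASE OPEN;   -- OPEN RESETLOGS if the RESETLOGS variant was used
```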
Failure to write to any of the control files will result in the
database aborting or crashing.
If you lose a single control file and have at least one good control
file, we recommend that you:
o Shutdown the instance if it is not down yet.
o Check the background trace files or alert log file to determine which control file is corrupted.
o Copy a good control file over the bad one, or modify the init.ora to reference only good control files.
o Startup.
If you lose all your control files, then you should issue the
CREATE CONTROLFILE command.
The create control file command should already be in a script; it
can easily be generated while the database is up by issuing ALTER
DATABASE BACKUP CONTROLFILE TO TRACE AS '/path/Today_Date';. The
output from the NORESETLOGS option is dramatically different from
the output with RESETLOGS (NORESETLOGS is the default).
The NORESETLOGS option produces a script that does not contain the
thread numbers of each group. This may be a serious problem if you
modify this script to contain the RESETLOGS option, because it will
ignore the other threads.
Risks and disadvantages of the create controlfile command:
o Previous log history is lost.
o Database-specific details must be recreated by looking at the file headers.
o Data file information is obtained from the data files; if the data files are corrupted or not from a single backup, then there may be problems.
o Certain optimizing SCN structures are ignored with a backup controlfile.
o The possibility of user error is high when there is no create controlfile script already in place.
o Check whether you are in archivelog mode.
When should you back up your controlfile?
o After the completion of a hot backup.
o After a database structural change (i.e. adding a data file, changing file sizes, adding log files).
o After a tablespace status change (i.e. READ-ONLY to READ-WRITE or vice versa).
After backing up your control file, you should also create or adjust
any create controlfile scripts. Your control file is a key component
in recovery: it should always represent the physical configuration
and the file status at the time you open the database. In most
situations, it is best to recover with the current control file.
What should you check when using the backup controlfile option
during recovery?
Make sure that the contents of the controlfile reflect the physical
structure of the database you are recovering to.
ALTER DATABASE OPEN RESETLOGS is needed when using a backup
controlfile. A backup is always recommended after RESETLOGS because
the logs are cleared and the sequence numbers are reinitialized.
In the event that media recovery needs to be performed, the current
control file should be used if it reflects the physical structure of
the database that you are planning to recover to. The current
control file preserves crucial timestamp and stop-SCN information
which may optimize and ease recovery, while the "backup control
file" syntax needs to ignore some timestamp information found in the
control file.
Online Redo Log Information
Online redo logs are crucial for instance and media recovery. They
contain the redo needed to roll forward the database and the undo
needed to roll back any uncommitted transactions. The loss of the
current online redo log, or the loss of an unarchived online redo
log, will result in an outage.
An unarchived log cannot be dropped unless NOARCHIVELOG mode is
enabled; unfortunately, in order to switch to this mode, one needs
to shut down and remount the instance.
To protect from the above outages, Oracle
recommends multiplexing of the redo log files over disk
mirroring if both techniques cannot be used. Oracle
multiplexing (multiple members per group) protects against
user error (e.g. an accidental delete) while hardware
mirroring does not. Ideally, each member of the on-line redo
group should be on different drives and different controllers.
Smaller checkpoint intervals
will enable crash recovery to complete faster although it
imposes a performance overhead of writing dirty buffers and
updating file headers. Oracle, however, always does a
checkpoint on a log switch. A typical recommended setting is a
checkpoint every 30 minutes.
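Checkpoint frequency is controlled by two parameters; the values below are only illustrative and should be tuned to the redo generation rate (on releases where these parameters are dynamically modifiable):

```sql
-- Force a checkpoint at most every 30 minutes (1800 seconds).
ALTER SYSTEM SET log_checkpoint_timeout = 1800;
-- 0 disables interval-driven checkpoints, leaving log-switch checkpoints.
ALTER SYSTEM SET log_checkpoint_interval = 0;
```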
Furthermore, the customer should monitor the frequency of
checkpoints and archiver wait problems. Although the database is up
and running, if it hangs due to checkpoint or archiver wait problems
it is essentially not available. If a checkpoint is initiated and
not completed before the log needs to be recycled, the database will
temporarily freeze all user activity until the checkpoint is
finished.
Async IO should be available and enabled by default; with Async IO,
this problem does not usually appear. If Async IO is not available,
multiple database writers (DBWR) and a checkpoint process (CKPT) can
reduce checkpoint wait errors, but this is usually only a temporary
fix. If this happens frequently, it may be because the size and/or
number of log files is small relative to the rate of changes being
made in the database. A similar situation arises when the log writer
waits on the archiver to complete archiving of a redo log. Again,
the solution is to add more log file groups or increase the size of
the log file groups to allow the archiver process more time to do
its job. One can also increase LOG_ARCHIVE_BUFFER_SIZE to increase
the size of a single read from the online redo logs by the archiver
(ARCH). Multiple archivers are a common solution to archive
waits. Checkpoints should always be monitored, by querying
V$SYSSTAT, to determine if activity predictions are accurate or
changing.
Oracle recommends that log file status be
monitored very closely, especially for the STALE or INVALID
status. The INVALID log file errors do appear in the alert.log
as an IO error and would be detected by alert.log parsing. The
STALE status indicates that the state of any pending write on
that log is unknown or the log is not complete (a log being
written to prior to a shutdown abort). This potential problem
does not appear in the alert.log and can only be monitored by
V$LOGFILE. If frequent STALE or INVALID status is found on a
running database, it may be indicative of hardware problems.
Log files have the highest IO activity and
they should be placed on their own set of disks if possible
and definitely separated from the archive files.
Standards:
- A naming convention should be established.
- Multiplexing should be mandatory.
- Redo logs (groups) should be isolated from other disk files.
- Redo log members should be the same size.
- Redo logs should be sized to switch once every thirty minutes, or depending on your availability standards.
- Log checkpoint interval and log checkpoint timeout should be set.
- Sufficient redo logs (groups) should be created to prevent archiver-induced file-busy waits or checkpoint-busy waits.
What do you do when you lose or discover a stale or invalid redo
log member of a group?
- Investigate the problem and check if it is still accessible to Oracle.
- If problems persist, drop the invalid or stale member and recreate the log member somewhere else if possible. Stale logs can also be caused by switching a log; stale implies that the log is not completely full or complete.
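A sketch, with hypothetical member names and group number:

```sql
ALTER DATABASE DROP LOGFILE MEMBER '/u02/oradata/redo01b.log';
ALTER DATABASE ADD LOGFILE MEMBER '/u03/oradata/redo01b.log' TO GROUP 1;
```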
What if you lose an entire group that has been archived?
- Attempt to drop the group and recreate a new group.
- If this is successful, then initiate a hot backup, because recovery cannot skip missing logs. If not, then this implies that this is the current or next log.
- In that case you need to restore and perform incomplete database recovery to the log group preceding the crashed one, using until-cancel or until-time recovery.
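The drop-and-recreate path might be sketched as follows, with hypothetical names:

```sql
ALTER DATABASE DROP LOGFILE GROUP 2;
ALTER DATABASE ADD LOGFILE GROUP 2
  ('/u02/oradata/redo02a.log', '/u03/oradata/redo02b.log') SIZE 50M;
-- Then take a fresh hot backup immediately.
```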
What if you lose a group that has not been archived but is not the
current or next group?
- Shut down the database.
- Startup mount, alter database noarchivelog, and drop the log group.
- Alter database archivelog and then startup. From testing, we sometimes see that if the lost log had not been checkpointed, the database will not allow you to start up in archivelog mode; one must start up in noarchivelog mode, shut down, and then restart.
- Add subsequent groups.
- We strongly suggest running a hot backup at this time, because recovery cannot recover past missing logs.
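The steps above can be sketched as (group number is hypothetical):

```sql
SHUTDOWN ABORT;
STARTUP MOUNT;
ALTER DATABASE NOARCHIVELOG;
ALTER DATABASE DROP LOGFILE GROUP 2;
ALTER DATABASE OPEN;        -- crash recovery runs here
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
-- Take a hot backup immediately afterwards.
```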
What do you do if you lose the current
online redo log?
- You need to restore and perform
incomplete database recovery.
Online Redo Logs
Online redo logs are needed for recovery.

Preventive Steps:
- Oracle multiplexing of log groups.
- Never place online redo logs on the same disks as archive logs and application data files.
- Make sure the log switch rate is matched by DBWR activity and ARCH activity.
- Make sure the log group size is acceptable to availability requirements.
- Minimum of 3 log groups.
- Implement multiple archivers.

Detection Mechanisms:
- Check V$LOG for invalid or stale statuses.
- Parse the alert.log and background trace files for redo log errors.
- Always check V$LOG and V$LOGFILE to determine which log is current and which logs have been archived.
- Monitor log switch and checkpoint rates.
- Monitor whether the archiver keeps up with the log writer.
Types of Outage

Outage: Loss of one, but not all, of the online redo log members of a particular group.
Steps:
- Drop the log file member.
- Add a log file member.
Early Detection:
No detection is possible until a log switch is performed either exiting or entering the file.
Error Detection:
The alert log records the following error stack when switching out of the redo log group: ORA-00316, ORA-00312, ORA-00321, ORA-00312. In addition, the LGWR process will generate the same error stack. Error 312 reports the file affected; the full syntax is:
ORA-00312: online log 1 thread 1: 'file name'
The view V$LOG should be monitored for statuses of 'STALE' or 'INVALID'.
Outage: Loss of an inactive, archived redo log group.
Notes:
1. All recovery times are quite static except for the crash recovery.
2. The addition of a new redo log group can be done afterwards.
Steps:
- Shutdown abort.
- Startup mount.
- Drop the problem redo log.
- Alter database open (crash recovery).
Early Detection:
No detection is possible until the LGWR process writes to the redo log.
Error Detection:
The LGWR process will generate the error stack OPIRIP, the general ORA-00447 followed by more specific errors; in this test case, ORA-00313, ORA-00312 and ORA-07366. The failure of LGWR will cause other background processes to fail with general error messages, and finally the alert log will report a background process failure (which may not be the LGWR process).
Outage: Loss of an inactive redo log group that has not been archived.
Notes:
1. All recovery times are quite static except for the crash recovery.
2. The addition of a new redo log group can be done afterwards.
3. You cannot drop the unarchived redo log without setting noarchivelog mode (ORA-00350).
4. You cannot set archivelog after dropping the problem redo log group, since instance recovery is required (ORA-00265).
Steps:
- Shutdown abort.
- Startup mount.
- Alter database noarchivelog.
- Drop the problem redo log.
- Alter database open (crash recovery).
- Shutdown normal.
- Startup mount.
- Alter database archivelog.
- Alter database open.
Early Detection:
No detection until the ARCH process archives the redo log.
Error Detection:
The ARCH process will generate an archival stoppage message - "ARCH: Archival stopped, error occurred. Will continue retrying" - followed by informative error messages (ORA-00255, ORA-00312) reporting the problem online redo log, and more specific error message(s) telling the cause (in the test case, ORA-00286). The same set of error messages will also appear in the alert log file together with the archival stoppage message.
Other Online Redo Log Outages

Outage: Loss of the current online redo log group.
Detection:
The LGWR process will generate the error stack OPIRIP, the general ORA-00447 followed by more specific errors; in this test case, ORA-00313, ORA-00312 and ORA-07366. The failure of LGWR will cause other background processes to fail with general error messages, and finally the alert log will report a background process failure (which may not be the LGWR process). V$LOG and V$LOGFILE will indicate if this is the current log. If so, we must switch to CRF.
Steps:
1. Restore and commence incomplete recovery.
2. Commence the Disaster Recovery Plan.

Outage: Silent redo log corruption.
Detection:
Error (ORA-600 [3020]) during application of an archive log on the standby database site.
Steps:
1. Rebuild the standby database if one is present.
2. Investigate the primary for any problems; check the alert.logs.

Outage: Internal redo corruption.
Detection:
ORA-600 [3020] implies that this change in the redo log is corrupted or inconsistent with the changes in the data block. All ORA-600 errors during application or writing of redo logs may be evidence of a corruption in the online redo log.
Steps:
1. If it does not affect the primary, then refresh or rebuild the standby database.
2. If the primary is down, commence the disaster recovery plan.
System Tablespace Information
The system tablespace is the most crucial
of all tablespaces because it contains the data dictionary
tables and objects. Loss of the system data file will result
in database recovery or switching to a disaster recovery site.
The system tablespace should be configured and sized properly. By
convention, no user objects or temporary segments are created in
SYSTEM. This should be enforced using privileges and roles, with
periodic monitoring to catch any deviations. Basically, the RESOURCE
role and the UNLIMITED TABLESPACE privilege should not be granted to
a user. A typical size of 300 MB - 500 MB may be required if PL/SQL
and/or stored procedures are used. Free space and fragmentation
should be monitored.
Loss of a system data file:
- The file needs to be restored and rolled forward; the database will be unavailable.
- Or you can recreate the database.
- Or restore from backup and roll forward.
- Or commence the disaster recovery plan.
System Tablespace
The system tablespace contains the data dictionary tables.

Preventive Steps:
- No non-data-dictionary objects should be found in SYSTEM.
- Monitor space usage.
- Hardware mirroring with redundant controllers.
- No one should have privileges to write to the system tablespace.
- No auditing.
- Spare disks in case of disk failures.

Detection Mechanisms:
- Monitor thresholds on space usage within SYSTEM.
- Monitor for non-system objects found in the system tablespace.
- Parse the alert.log and background trace files for ORA-600 errors with your monitoring tool.
- Monitor disk failures to check if they affect system data files.
System Tablespace Outages

Outage: Lack of space or system tablespace fragmentation.
Comments / Detection Mechanisms:
This should be monitored very closely, and the monitor tool should ensure that space is sufficient in all tablespaces. However, if this occurs, the customer needs to add a datafile.
Steps:
Add another data file.

Outage: Loss of the system tablespace.
Comments / Detection Mechanisms:
The monitor tool should track disk failures and correlate them with the corresponding data files.
Steps:
1. Restore and recover.
2. Commence the disaster recovery plan.

Outage: Corruption of a data dictionary object.
Comments / Detection Mechanisms:
ORA-1578 on a data dictionary object, and ORA-600, may be an indication of data dictionary corruption. These errors should be parsed from the alert.log by the monitoring tool.
Steps:
1. Restore and recover.
2. Commence the disaster recovery plan.
Temp Tablespace Information
Although the loss of the temporary tablespaces will not result in
losing access to data, the application, in particular operations
that require sort space or temp space, may fail.
At least two temporary tablespaces should be available, with
pre-created alter user scripts established, to prevent system
tablespace fragmentation or application failure in case of loss of
a temporary tablespace.
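One such pre-created switch-over script might look like this (names are hypothetical; older releases use CREATE TABLESPACE ... TEMPORARY instead of CREATE TEMPORARY TABLESPACE):

```sql
CREATE TEMPORARY TABLESPACE temp2
  TEMPFILE '/u04/oradata/temp02.dbf' SIZE 500M;
ALTER USER app_user TEMPORARY TABLESPACE temp2;
```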
Multiple temporary tablespaces will be
beneficial in distributing sort allocations for different
users and improve the performance of concurrent sorts.
Furthermore, if several index rebuilds have to occur due to
corruption or maintenance, multiple temp tablespaces will
allow the parallelism to restore and create the indexes
quickly. For index rebuilds, TEMP space of up to twice the
size of indexed data may need to be allocated. As this TEMP
space may not be required during normal operation, it is
useful to keep spare disk space.
Following the initial index building phase, temporary segment usage
is limited to sort activity. For performance reasons, the
SORT_AREA_SIZE setting should be high enough to keep disk sorts to a
minimum, assuming there is enough real memory available. All complex
queries which may potentially require disk sorts will then write to
the temp tablespace.
Temporary Tablespace
Temporary segments are used for sorting and temporary scratch work when the allotted space in memory is insufficient.

Preventive Steps:
- Temporary tablespaces should be hardware mirrored with redundant controllers.
- Alter user scripts should be available to switch users to an alternative temporary tablespace.
- Enough space should be available if more is needed for temp.
- Test and measure the amount of temp space needed as your application changes.
- Extra disks should be available.
Temporary Tablespace Outages or Temporary-Tablespace-Related Problems

Outage: Cannot allocate extents, or tablespace fragmentation.
Detection Mechanisms / Comments:
These errors do not appear in the alert.log; thus they must be trapped within the application. ORA-1547 or ORA-1562 during a sort operation indicates a temp tablespace configuration problem. The customer should configure the TEMP tablespace to avoid problems like this; fixing it once detected involves adding another data file or creating another tablespace.
Steps:
1. Add another data file.

Outage: Losing a temporary tablespace data file.
Steps:
1. Run alter user scripts to switch temporary tablespace usage for users.
2. Create another temp tablespace if possible.
Types of Outage

Outage: Loss of a temporary tablespace data file.
Steps:
- Create another temporary tablespace.
- Alter users to use the new temporary tablespace.
- Drop the current temporary tablespace.
Early Detection / Error Detection:
The monitor tool needs to detect disk failures and map OS files to Oracle files.
Application Tablespace Information
All application data files should be hardware mirrored on
different controllers and disks for additional resilience.
More importantly, the application should have integrity checks
and error checking mechanisms to prevent errors and to
automatically recover in case of errors. Security policies and
proper prevention techniques will be essential to avoid
dropping, deleting or modifying critical data. We also
recommend partitioning of data and implementing business rules
and database integrity constraints to detect for application
data inconsistencies.
Even with built in prevention, user errors,
application bugs, hardware failures, etc still occur which
will result in some type of recovery. There are several
possible solutions to resolve these problems:
- Recreate the tablespace and its corresponding objects (only relevant for indices), or
- Restore from backup and recover the tablespace, or
- Implement an object-level recovery plan, or
- Initiate the disaster recovery plan.
Case 1: Loss of an index data file
Tablespace recovery and object recovery are the two options
available. Tablespace recovery entails restoring the affected
tablespace from the hot backup and rolling forward. Object recovery
consists of recreating the tablespace and recreating the
indices. Pre-created create index scripts (implemented in parallel
or with degrees of parallelism) should be kept ready and run
concurrently, touching different temporary tablespaces, to expedite
recovery. In addition, large temporary tablespaces should be made
available to allow for successful index creation.
Although both recovery options allow for full availability of the
database and the tables, this is still a serious outage due to the
slow performance.
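The object-recovery path might be sketched as follows, with hypothetical tablespace, index, and file names:

```sql
DROP TABLESPACE app_idx INCLUDING CONTENTS;
CREATE TABLESPACE app_idx
  DATAFILE '/u05/oradata/app_idx01.dbf' SIZE 1000M;
-- Pre-created index scripts, run concurrently with parallel degrees:
CREATE INDEX app.emp_name_idx ON app.emp (last_name)
  TABLESPACE app_idx PARALLEL (DEGREE 4);
```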
Loss of an Index Data File

Option: Recreate the index tablespace.
Steps:
- Drop the index tablespace.
- Create another index tablespace.
- Create all indices in the tablespace in parallel.
Early Detection:
The monitor tool should detect disk failure. If the failure is in an index tablespace, automated steps are needed to recreate the indices.
Error Detection:
The monitor tool should detect disk failures.

Option: Restore from backup and recover.
Steps:
- Offline the tablespace.
- Restore from hot backup.
- Set autorecovery on.
- Bring up recovery processes.
- Alter database recover automatic tablespace.
- Alter tablespace online.
Early Detection:
The monitor tool should detect disk failure. If the failure is in an index tablespace, automated steps are needed to restore and recover.
Error Detection:
The monitor tool should detect disk failures.

Option: Switch to the disaster recovery plan.
Steps:
Switch to the standby database or replicated database.
Early Detection / Error Detection:
The monitor tool needs to detect disk failures and map OS files to Oracle files.
Case 2: Loss of application table datafiles
Hopefully, the customer implements some type of object-level backup
and recovery strategy to reduce MTTR. If not, they must either
restore from backup and recover, or implement their disaster
recovery plan.
Object-level recovery only works when there are mechanisms within
the application to resynchronize from a determined point in
time. When the subset has been restored, the subset should be able
to resynchronize with the rest of the database via the
application. The database should be partitioned in such a way that
recovery can occur on subsets of the database while other
functionality is not impeded.
Loss of Data File
|
Steps
|
Early Detection
|
Error Detection
|
Recreate Table Tablespace
|
Only applicable if exports or unloads of
the tables are taken.
Only applicable if there is an object
level recovery plan.
(snapshots, unloader, replicated
database, etc)
|
Monitor tool should detect disk failure.
If the failure is in a table tablespace,
automated steps are needed to restore and recover it.
|
I/O errors or ORA-1100 series errors.
|
Restore from Backup and Recover
|
- Offline tablespace.
- Restore from hot backup.
- Set autorecovery on.
- Bring up recovery processes.
- Alter database recover automatic tablespace.
- Alter tablespace online.
|
Monitor tool should detect disk failure.
If the failure is in a table tablespace,
automated steps are needed to restore and recover it.
|
Monitor tool should detect disk
failures.
|
Commence disaster recovery plan
|
Only applicable if there is a standby
database or a replicated database. This can also be
accomplished by breaking three-way mirrors to maintain
hot backups on site.
|
Monitor tool needs to detect disk
failure and map OS files to Oracle files.
|
Monitor tool needs to detect disk
failures and map OS files to Oracle files.
|
Types of Outage
|
Steps
|
Early Detection
|
Error Detection
|
Cannot allocate extents or tablespace
fragmentation
|
Add datafile
|
Monitor Space Usage and fragmentation of
tablespaces.
|
ORA-1547 and other such errors need to be
trapped within the application.
|
Loss of Index due to user error or
corruption.
|
Drop index.
Create index.
|
Monitor alert.log for ORA-1578 errors
that affect index data.
Periodic index validation with the
analyze command.
|
ORA-1578 while using or validating an
index.
|
Single Table Loss or Corruption of
Single Table.
|
Object level recovery plan.
Restore and recover.
|
Monitor alert.log for ORA-1578 errors
that affect table data.
Tables can be analyzed to check for block
corruption.
|
ORA-1578 while accessing or analyzing
table data.
|
Reorganize Table - Scheduled Outage.
|
Drop / Create
Unload/
Direct Load
Use of Parallelism would be helpful.
|
Monitor and alarm when objects exceed
more than 20 extents or some threshold.
|
ORA-1556 max extent error should be
trapped within the application.
|
Reorganize Index - Scheduled Outage
|
Drop/ Create
|
Monitor and alarm when objects exceed
more than 20 extents or some threshold.
|
ORA-1556 max extent error should be
trapped within the application.
|
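The periodic validation mentioned in the table can be done with the ANALYZE command; the table name below is illustrative:

```sql
-- Validates the structure of the table and, with CASCADE, its indexes;
-- block corruption surfaces as ORA-1578
ANALYZE TABLE app.orders VALIDATE STRUCTURE CASCADE;
```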
Read-Only
tablespaces
Read-only tablespaces contain static information
that is accessible only for viewing and
not for modification. A read-only tablespace needs to be backed
up once it becomes read-only. The recovery session will get an
error if a file is read-only and a backup controlfile is being
utilized for recovery. All read-only tablespace files must be
taken offline prior to incomplete recovery. This is what makes
backing up a read-only tablespace right after changing it to
read-only so important. The moral of the story is that you need to
restore the read-only tablespace from your previous backup and
offline the read-only tablespace files prior to recovery with
a backup controlfile. After opening the database, you can then
online the read-only data files.
If you use the current control file, then
recovery will complete successfully even if the tablespace
changes from read-write to read-only through the redo logs.
Read-only tablespace files are still online files and recovery
will still apply changes to these files. Changing a file from
read-write to read-only generates some redo, and during
recovery we apply that redo; this is how the flags are changed.
Read-only tablespaces also reduce the amount of
work during checkpoints: only read-write files are
checkpointed and updated, not read-only files.
Recovery will ignore read-only files when possible, thus
streamlining recovery even further.
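As a sketch of the sequence described above (the file name is illustrative), recovering with a backup controlfile while a read-only tablespace is present would look like:

```sql
STARTUP MOUNT
-- Take the read-only file offline before incomplete recovery
ALTER DATABASE DATAFILE '/u03/oradata/PROD/ro_data01.dbf' OFFLINE;
RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE;
ALTER DATABASE OPEN RESETLOGS;
-- Once the database is open, bring the read-only file back online
ALTER DATABASE DATAFILE '/u03/oradata/PROD/ro_data01.dbf' ONLINE;
```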
Read-Only Data
|
Preventive Steps
|
Detection
Mechanisms
|
Index segments should only hold indexes.
|
1. Partition data according to
functionality.
2. Hardware mirroring and redundant
controllers
|
1. Monitor disks and for IO errors
|
Data segments should hold both tables
and clusters.
|
1. Partition data according to
functionality.
2. Hardware mirroring and redundant
controllers
|
1. Monitor disks and for IO errors
|
Read-Only Data
|
Steps
|
Accidental drop of read-only data.
|
1. Restore from backup
2. commence Disaster Recovery Plan.
|
Loss of a read-only data file.
|
1. Restore from backup
2. commence Disaster Recovery Plan.
|
Archive Log Files
Archive log files provide the means to roll
forward from a previous backup of the database. Since archive
log files are applied in order, it is extremely important that
all the archive log files be available in order to roll
forward to the desired point in time. For that reason, Oracle
recommends that archive files are also mirrored to protect
from disk failure. For extra protection, copies of the archive
files can be kept onsite and offsite in case of some large
disaster. One of the most
common errors found on customer sites is a poorly organized
archiving procedure which may be backing up incomplete archive
files or not detecting corrupted tapes. A scripted
methodology and well-tested tape management tool may be
required to safeguard against corrupted archive files. Prior to
backing up or copying an archive file, the procedure should
check the V$LOG view to verify that the corresponding online redo
log has been completely archived.
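For example, a backup procedure could run a query like the following and copy only those archive files whose log sequence shows ARCHIVED = YES:

```sql
SELECT sequence#, archived, status FROM v$log;
```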
Another common problem is that the archive
destination becomes full or inadequate to support the number
of archive files being created. First, adequate disk space
should be provided to allow for sufficient time for the log
files to be backed up to tape and to allow at least a day’s
worth of archive logs to be onsite or all archive logs
starting from the previous database backup (cold or hot) to
remain onsite. Second, multiple archival destinations should
be created with automated switchover of destination when a
certain threshold for available free space is reached.
Finally, the archive procedure should dynamically monitor the
archive destinations for lack of free space.
In the event of incomplete or complete
database recovery, caution must be exercised to prevent
confusion between the current set of archive files and the
archive files from the previous version of the database. For
example, after successful incomplete recovery, the archive log
sequence numbers will reset to one. Therefore, new archive
files will be created with sequence numbers starting from one;
these should not be confused with the older archive
logs with the same number. Another example is restoring from
an old cold backup and starting up without any roll forward
media recovery. The archive logs generated from that point
will begin with the number that was stored in the control
file. Again, it is necessary to distinguish between the
current database’s logs and the previous versions of those
logs.
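One way to tell the two sets apart is to record the resetlogs stamp of the current incarnation, which can be queried as follows:

```sql
SELECT resetlogs_change#, resetlogs_time FROM v$database;
```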
What if you lose an archive log or set of logs?
- A complete backup of the database
is necessary and should be accomplished as soon as
possible: cold backup or hot backup.
Archive Log Files
|
Preventive Steps
|
Detection
Mechanisms
|
Archive log files are required for media
recovery and for standby databases.
|
- Scripted procedure to archive to the
archive destination and to archive to tape.
- Tape management tool with error
checking mechanisms should be utilized.
- Hardware mirroring of archive files.
- Two sets of backups.
|
|
Archive Log Files
and Archivelog
|
Detection
Mechanisms
|
Steps
|
Archiver stuck due to lack of free space
in the archive destination.
|
Monitor free space in the archive
destination and alarm when close to reasonable threshold
(70%)
|
1. Free up space in the archive
destination
2. Switch to different archived
destination.
|
Loss of archive file(s) due to media
failure or user error.
|
Monitor tool should check for disk
failures. Alerts should be available when media failure
affects archive logs. When this occurs, automatic
procedures should switch to another archive destination.
|
1. Backup primary.
2. Refresh standby database if one
exists.
|
Archiver cannot write to the archive
destination due to media failure.
|
Monitor tool should check for disk
failures. Alerts should be available when media failure
affects archive logs. When this occurs, automatic
procedures should switch to another archive destination.
|
1. Switch archive destination.
|
Archive logs are not successfully backed
up.
|
Checksums must be in place during the
backup. Investigate problem and retry.
|
If the backup is not successful because of a
bad archive log file, one needs to back up the primary and
recreate the standby site.
|
Archiver hangs because someone
deactivated archiving or took the database out of ARCHIVELOG mode.
|
This should be checked before every open
command.
The archive log list command
will show the current archiving status.
|
shutdown immediate
startup mount;
alter database archivelog;
alter database open;
|
Recovery Scenarios
Recover
Full DB or Disaster Recovery
This example shows how to restore all of the datafiles of the target
database. RMAN will go to its last good backup set and restore the
datafiles to the state they were in when that backup set was
created. When restoring database files with RMAN, it reads the
datafile header and makes the determination as to whether the file
needs to be restored. The recovery is done by allocating a channel
for I/O and then issuing the RMAN restore database command. The
database must not be open when restoring or recovering the entire
database.
SQL> startup mount;
or
RMAN> startup force mount;
then connect to rman and perform:
RMAN> restore database;      # Restores datafiles
RMAN> recover database;      # Recovers database, also restores archived redo logs as required
RMAN> alter database open;   # Open the DB
Another Method:
1. Assume we have a full backup under /my_rman (that directory
contains all the backup pieces plus control files, etc.)
2. If possible, generate your init.ora file. You can
place a copy of your init.ora file under $ORACLE_HOME/dbs.
You can also regenerate it from one of your backup files.
Search for a small file with a name like:
CCOM_c-2304532610-20090319-00_ctl.bckp.
Then type the following:
strings CCOM_c-2304532610-20090319-00_ctl.bckp | head -60 and you
will see the first lines of the init.ora file.
Edit the init.ora file and add your local directory information
under:
*.control_files='/db/data/CCOM/control01.ctl','/db/data/CCOM/control02.ctl','/db/data/CCOM/control03.ctl'
*.background_dump_dest='/u01/app/oracle/admin/CCOM/bdump'
*.core_dump_dest='/u01/app/oracle/admin/CCOM/cdump'
*.user_dump_dest='/u01/app/oracle/admin/CCOM/udump'
*.log_archive_dest_1='LOCATION=/db/logs/CCOM/ARCH'
And add at the end the lines that map your OLD paths to your NEW
paths. Note that db_file_name_convert takes all of its old/new
pairs in a single parameter setting; repeating the parameter on
separate lines would simply override the earlier values:
*.db_file_name_convert=('/data/u01/app/oracle/oradata/CCOM/','/db/data/CCOM/','/data/u02/app/oracle/oradata/CCOM/','/db/data/CCOM/','/data/u03/app/oracle/oradata/CCOM/','/db/data/CCOM/')
*.log_file_name_convert=('/opt/app/oracle/oradata/CCOM/','/db/data/CCOM/')
Save that file under $ORACLE_HOME/dbs/initCCOM.ora
3. Start RMAN
$ rman target / nocatalog
4. Start the DB (nomount) and restore the pfile
RMAN> set DBID=2304532610;
RMAN> startup force nomount pfile='/u01/app/oracle/product/10.2.0/db_1/dbs/initCCOM.ora';
OR
run
{
restore spfile to pfile
'/u01/app/oracle/product/10.2.0/db_1/dbs/initCCOM.ora' from
'/my_rman/CCOM_c-2304532610-20100515-00_ctl.bckp';
shutdown abort;
}
5. Start the DB with the pfile that we just restored:
RMAN> set DBID=2304532610;
RMAN> startup force nomount pfile='/u01/app/oracle/product/10.2.0/db_1/dbs/initCCOM.ora';
6. Restore the control files
run
{
restore controlfile from '/my_rman/CCOM_c-2304532610-20100515-00_ctl.bckp';
alter database mount;
}
7. Restore the data files
catalog start with
'/db/data/BCKP';
run
{
set newname for datafile 01 to '/db/data/CCOM/system01.dbf';
set newname for datafile 02 to '/db/data/CCOM/undotbs01.dbf';
set newname for datafile 03 to '/db/data/CCOM/sysaux01.dbf';
set newname for datafile 04 to '/db/data/CCOM/users01.dbf';
set newname for datafile 05 to '/db/data/CCOM/CC_dat_A01.dbf';
set newname for datafile 06 to '/db/data/CCOM/CC_dat_A02.dbf';
set newname for datafile 07 to '/db/data/CCOM/CC_dat_A03.dbf';
set newname for datafile 08 to '/db/data/CCOM/CC_dat_A04.dbf';
set newname for datafile 09 to '/db/data/CCOM/CC_dat_A05.dbf';
set newname for datafile 10 to '/db/data/CCOM/CC_dat_A06.dbf';
set newname for datafile 11 to '/db/data/CCOM/CC_dat_A07.dbf';
set newname for datafile 12 to '/db/data/CCOM/CC_dat_B01.dbf';
set newname for datafile 13 to '/db/data/CCOM/CC_dat_B02.dbf';
set newname for datafile 14 to '/db/data/CCOM/CC_dat_B03.dbf';
set newname for datafile 15 to '/db/data/CCOM/CC_dat_B04.dbf';
set newname for datafile 16 to '/db/data/CCOM/CC_indx_A01.dbf';
set newname for datafile 17 to '/db/data/CCOM/CC_indx_A02.dbf';
set newname for datafile 18 to '/db/data/CCOM/CC_indx_A03.dbf';
set newname for datafile 19 to '/db/data/CCOM/CC_indx_A04.dbf';
set newname for datafile 20 to '/db/data/CCOM/CC_indx_B01.dbf';
set newname for datafile 21 to '/db/data/CCOM/CC_indx_B02.dbf';
set newname for datafile 22 to '/db/data/CCOM/CC_indx_B03.dbf';
set newname for datafile 23 to '/db/data/CCOM/CC_indx_B04.dbf';
set newname for datafile 24 to '/db/data/CCOM/CC_dat_A08.dbf';
set newname for datafile 25 to
'/db/data/CCOM/CC_ReadOnly_01.dbf';
set newname for datafile 26 to
'/db/data/CCOM/CC_ReadOnly_02.dbf';
set newname for datafile 27 to
'/db/data/CCOM/CC_ReadOnly_03.dbf';
restore database;
switch datafile all;
}
recover database;
Note: this will
create the online redo logs in the same location as that on host
A. If this directory location does not exist, then this will fail
with:
ORA-00344: unable to recreate online log <name>
The workaround is to rename the logfiles prior to opening the
database, in another SQL*Plus window:
SQL> alter database rename file
'<host A location>' to '<host B location>';
8. Start the DB
RMAN> alter database open
resetlogs;
Example
to Restore the Full DB to Another Machine when the Original one
is lost
Let’s say, I have a full RMAN Backup including archivelog backups
maintained on Disks. Let's assume that all the backup files are in
recoverable conditions and the RMAN Backups include the following
files:
1. RMAN Full Backup files
2. RMAN Archivelog files
3. Controlfile and SPFILE autobackup
Let’s say the Production Database Server goes completely offline,
due to a major crash and the server requires a complete rebuild.
The only options are to rebuild the entire
server, or to recover the complete environment and the
database (from the RMAN backups) onto another server.
Before we turn to the RMAN backups, two
pre-conditions need to be fulfilled:
1. A Server (the same server or a new one) should be made
available in a state where Oracle Product can be installed; i.e.,
it should be brought online with a functional Operating System and
necessary Environment Configurations, same as how it was before
the crash.
2. Same version of Oracle Product (including any patch sets) needs
to be installed with which the RMAN Backup was taken. This is
critical for the recovery to be successful.
The following steps would outline the procedures for
recovering the Database using only RMAN Backup:
1. Create the Oracle Service
2. Create the relevant folder hierarchy under <ORACLE_BASE>
directory
3. Restore the SPFILE from autobackup, and use the SPFILE to start
the instance in NOMOUNT state
4. Restore the controlfile from autobackup, and MOUNT the database
5. Restore the Database from RMAN backup files
6. Recover the Database from RMAN backup files, and OPEN the
database with RESETLOGS
1. Create the Oracle Service
We need to create an Oracle Service using 'oradim' utility,
without passing any PFILE information
C:\>oradim -new -sid DBTEST -intpwd DBTEST
Instance created.
2. Create the relevant folder hierarchy under <ORACLE_BASE>
directory
Now, we need to create the relevant directories for Oracle
Database, as follows:
a. Create BDUMP, CDUMP, and UDUMP directories under the
'<ORACLE_BASE>\admin' directory,
b. Create <DATABASE_NAME> directory under
'<ORACLE_BASE>\oradata' directory; in our case DBTEST.
3. Restore the SPFILE from autobackup, and use the SPFILE to start
the instance in NOMOUNT state
Here, we need to connect to the Recovery Manager using the ‘rman’
utility, and then start the Recovery process.
First we need to set the Database ID (DBID) of the Database
C:\>set oracle_sid=DBTEST
C:\>rman target /
RMAN> set dbid=1106578690;
RMAN> startup force
nomount;
(here you will get a couple of errors; don't worry about
them)
RMAN> SET
CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO
'\\testdb\dbtest\%F';
Next, we need to restore the SPFILE from the
autobackup file and then use it to restart the instance in NOMOUNT
state. You can approach this in two ways, as demonstrated below:
a. Restoring the SPFILE from the autobackup to a non-default
location:
We can restore the SPFILE to a non-default location by using any
one of the below 2 methods:
RMAN> restore spfile to
'C:\spdbtest01.ora' from autobackup;
OR,
RMAN> restore spfile to
'C:\spdbtest02.ora' from
"\\testdb\dbtest\c-1106578690-20100408-00";
Now, we need to restart the instance in NOMOUNT
state using the restored parameter file.
RMAN> shutdown immediate
Remember that we have just restored the SPFILE to a
non-default location, so to start the instance we need a PFILE
that references the SPFILE. For this, we need to create a text file,
'init<SID>.ora', and then mention the path and name of the
restored SPFILE in it. In our case, the ‘initDBTEST.ora’ would
contain only the following statement:
SPFILE=C:\spdbtest01.ora
Next, we need to start the instance in NOMOUNT state using the
newly created PFILE (referencing the SPFILE).
RMAN> startup force pfile='C:\initDBTEST.ora' nomount
b. Restoring the SPFILE from the autobackup to the
default location:
We can restore the SPFILE to the default location
(“<ORACLE_HOME>/database” in Windows) by using any one of
the below 2 methods:
RMAN> restore spfile from
autobackup;
OR,
RMAN> restore spfile from
"\\testdb\dbtest\c-1106578690-20100408-00";
The SPFILE has been restored in the DATABASE
directory of the default home location. Oracle will now
automatically look for an SPFILE in this location when you try to
start the instance
RMAN> shutdown immediate
Start the instance in NOMOUNT state. Here, we do not
need to reference any PFILE or SPFILE, which makes the task a lot
simpler.
RMAN> startup force
nomount;
4. Restore the controlfile from autobackup, and
MOUNT the database
Once the database is restarted in NOMOUNT state, we need to
restore the controlfiles so as to restore and recover the database
using the RMAN Backups.
Again here, we can restore the controlfiles by using any one of
the below 2 methods:
RMAN> startup nomount;
RMAN> set
dbid=1106578690;
RMAN> restore controlfile from autobackup;
OR,
RMAN> restore controlfile
from "\\testdb\dbtest\c-1106578690-20100408-00";
5. Restore the Database from RMAN backup files
With the restored controlfiles, we are now in a position to
restore the Database. We need to mount the Instance first and then
restore the database files from the backup.
RMAN> alter database
mount;
RMAN> restore database;
6. Recover the Database from RMAN backup files, and OPEN the
database with RESETLOGS
Once restoration of the database is complete, we need to recover
the database by applying the archivelogs from the backup.
RMAN> recover database;
RMAN-00571:
===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS
===============
RMAN-00571:
===========================================================
RMAN-03002: failure of recover command at 04/08/2010 14:45:21
RMAN-06054: media recovery requesting unknown log: thread 1 seq 33 lowscn 27801294
Don't panic when you see the above Error Message.
You just need to open the database with RESETLOGS option, as there
are no more archivelogs to apply.
RMAN> alter database open
resetlogs;
Media
Failure,
DB not in ARCH Log Mode, Restore ALL Files
Failure: loss of a disk, a data
file, or corruption
Recovery: restore all
Oracle files:
- Data files
- Control files
- Redo log files
- Password file (optional)
- Parameter file (optional)
Advantage
- Easy to perform with low risk of error
- Recovery time is the time it takes to restore all files.
Disadvantage
- Data is lost and must be reapplied manually
- The entire database is restored to the point of the last whole
closed backup
Recovery Process
shutdown abort
Restore all files using operating system
startup
If restoring to a different location, you need to rename ALL the files
(control files, redo log files, and database files) and modify your
init.ora file to show the new path for the files.
startup mount
select name from v$datafile;
select name from v$controlfile;
select member from v$logfile;
alter database rename
file '/server1/oradata/user_01.dbf' to
'/server2/oradata/user_01.dbf';
...
...
alter database open;
Recover
a Particular Datafile when DB is Closed (Closed DB Recovery)
If your system suffers a media failure and the database stays down,
you need to perform a closed database recovery. You will get
error messages ORA-1157 and ORA-1110, and possibly
an operating system error. Follow these steps to execute a closed
database recovery:
a. Make sure the
database is shutdown.
b. Correct the media
problem if possible.
c. Restore the most
recent backup of only those data files that were damaged by the
media failure.
There is no need to restore any undamaged data files or any online
redo log files. If the hardware problem has been repaired, and
damaged data files can be restored to their original locations, do
so. If the hardware problem still exists, restore the data files to
another location. (This location will need to be recorded in
the control file later in this procedure.)
d. Start SQL*Plus and
connect to Oracle as SYS.
e. Start the instance
and mount it, but do not open the database.
startup mount
f. Make sure the
tablespaces that contain the damaged data files are offline.
select d.name, t.name,
d.status from v$datafile d, v$tablespace t where d.ts# = t.ts#;
alter database datafile '/server1/oradata/user_01.dbf'
offline;
alter tablespace MY_DATA offline;
If you restored the lost files to alternate locations, the new
location of these files must be recorded in the control file.
select name from v$datafile;
select name from v$controlfile;
select member from v$logfile;
alter database rename
file '/server1/oradata/user_01.dbf' to
'/server2/oradata/user_01.dbf';
...
...
g. Query the
V$DATAFILE view and make sure all the data files you want to recover
are online. If a data file is offline, issue the ALTER
DATABASE command with the DATAFILE ONLINE option.
select name, status from
V$DATAFILE;
ALTER DATABASE DATAFILE
'my_data.dbf' ONLINE;
h. To start closed
database recovery, use the RECOVER command. Oracle will now start
the roll forward by applying the archived redo log files and the
online redo log file. If AUTORECOVERY is set to ON, the
applying of the log files is automatic. If it is not set to
ON, you will be prompted for each log file.
recover database;
i. When recovery is complete, open the database
ALTER DATABASE OPEN;
Recover
a Particular Tablespace when DB is Open (Open DB Recovery with
Offline Tablespace)
If your system suffers a media failure but the database stays up,
you can perform an open database recovery and recover only the
tablespaces that are damaged. Users can continue to access the
online tablespaces and data files that were not damaged by the
failure. Oracle automatically takes damaged data files offline.
Follow these steps to execute an open database recovery with offline
tablespaces:
a. The database should
be started and open.
b. Take all
tablespaces containing damaged data files offline. You can query the
V$DATAFILE view to see which data files are offline.
select name, status from
V$DATAFILE;
ALTER TABLESPACE
tablespace_name OFFLINE immediate;
c. Correct the problem that caused the media failure.
(If the problem cannot be corrected in a reasonable amount of time,
your other option is to restore the damaged files to another
location.)
(If the media problem can be corrected, restore the most recent
backup files of only the data files damaged by the media
failure. Remember that the database is open. Do NOT try
to restore undamaged data files, log files, or control files.)
d. If you restored the lost
files to alternate locations, the new location of these files must
be recorded in the control file.
select name from v$datafile;
select name from v$controlfile;
select member from v$logfile;
alter database rename
file '/server1/oradata/user_01.dbf' to
'/server2/oradata/user_01.dbf';
e. Use the RECOVER
TABLESPACE command to start offline tablespace recovery of all
damaged data files in one or more offline tablespaces. Oracle will
now start the roll forward by applying the archived redo log files
and the online redo log file. If AUTORECOVERY is set to ON,
the applying of the log files is automatic. If it is not set
to ON, you will be prompted for each log file.
RECOVER TABLESPACE tablespace_name;
f. The damaged
tablespaces of the open database are now recovered up to the point
of failure. You can bring the offline tablespaces online:
ALTER TABLESPACE
tablespace_name ONLINE;
Recover
a Particular Datafile when DB is Open (Open DB Recovery with
Offline Datafile)
If your system suffers a media failure but the database stays up,
you can perform an open database recovery and recover only the data
files that are damaged. Users can continue to access the
online tablespaces and data files that were not damaged by the
failure. Oracle automatically takes damaged data files offline.
Follow these steps to execute an open database recovery with offline
datafiles:
a. The database should
be started and open.
b. Take all
damaged data files offline. You can query the
V$DATAFILE view to see which data files are offline.
select name, status from
V$DATAFILE;
ALTER DATABASE DATAFILE
'/path/file_name' OFFLINE;
c. Correct the problem that caused the media failure.
(If the problem cannot be corrected in a reasonable amount of time,
your other option is to restore the damaged files to another
location.)
(If the media problem can be corrected, restore the most recent
backup files of only the data files damaged by the media
failure. Remember that the database is open. Do NOT try
to restore undamaged data files, log files, or control files.)
d. If you restored the lost
files to alternate locations, the new location of these files must
be recorded in the control file.
select name from v$datafile;
select name from v$controlfile;
select member from v$logfile;
alter database rename
file '/server1/oradata/user_01.dbf' to
'/server2/oradata/user_01.dbf';
e. Use the RECOVER
DATAFILE command to start offline datafile recovery of all damaged
data files. Oracle will now start the roll forward by applying the
archived redo log files and the online redo log file. If
AUTORECOVERY is set to ON, the applying of the log files is
automatic. If it is not set to ON, you will be prompted for
each log file.
RECOVER DATAFILE
'/path/file_name';
f. The damaged
datafiles of the open database are now recovered up to the point of
failure. You can bring the offline datafiles online:
ALTER DATABASE DATAFILE '/path/file_name' ONLINE;
Recover
a Lost Datafile without Valid Backup
This scenario covers recovering a lost datafile when you don’t have
a valid backup of your database, but the database is running in
archivelog mode and you have all the necessary archive logs.
If you don’t have a valid backup and your database is
running in noarchivelog mode, your datafile is lost forever.
BUT, if you have all the archive log files since the datafile was
created, then you can still recover it without a backup:
I will create a new tablespace for my test and name it DIEGO_TEST.
create tablespace DIEGO_TEST
datafile '/u01/oradata/DIEGO/DIEGO_TEST.dbf';
Now create a table MY_TEST in the DIEGO_TEST tablespace,
filled with rows copied from ALL_OBJECTS:
create table MY_TEST
tablespace DIEGO_TEST
as select rownum id, a.* from
all_objects a;
select tablespace_name,
segment_name
from dba_segments where
segment_name = 'MY_TEST';
TABLESPACE_NAME SEGMENT_NAME
--------------- ------------
DIEGO_TEST      MY_TEST
You can see in the alert.log file that several log switches happened
during the insert process, you can also force it:
alter system switch logfile;
To simulate disaster I will delete DIEGO_TEST datafile using OS
command and flush buffer cache several times.
!rm /u01/oradata/DIEGO/DIEGO_TEST.dbf
alter system flush
buffer_cache;
alter system flush
buffer_cache;
alter system flush
buffer_cache;
Now my query throws an error because datafile 8 is missing.
select count(*) from MY_TEST;
ERROR at line 1:
ORA-01116: error in opening
database file 8
ORA-01110: data file 8:
'/u01/oradata/DIEGO/DIEGO_TEST.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file
or directory
Additional information: 3
How to recover this datafile when you don’t have valid backup to
restore it?
The trick is to create a new, empty datafile in place of DIEGO_TEST.dbf and
fill it with data from the archive logs using the following commands:
alter database datafile 8
offline drop;
alter database create datafile
'/u01/oradata/DIEGO/DIEGO_TEST.dbf' as
'/u01/oradata/DIEGO/DIEGO_TEST.dbf';
recover datafile 8;
At this point Oracle will start asking for Archive Log Files. Once
all of them are applied just put that datafile online again:
alter database datafile 8
online;
select * from MY_TEST;
Recover Missing Archive Logs
or Redo Log Files (Cancel-Based Recovery)
This situation is most likely to occur when archive log files or redo log
files needed for recovery are lost or damaged and cannot be
restored. In this situation, you would apply all logs until
you reached the missing files and then cancel the recovery. Follow
these steps to execute a cancel-based recovery:
a. If the database is still
open, shut down the database using the SHUTDOWN command with the
ABORT option.
b. Make a full backup of the
database, including all datafiles, a control file, and the parameter
files in case an error is made during the recovery.
c. Correct the problem that
caused the media failure. (If the problem cannot be corrected,
the datafiles must be restored to an alternate location. If
this is the case, the ALTER TABLESPACE tablespace_name RENAME
DATAFILE command must be used to change the location of the datafile
in the control file.)
d. If the current control
files do not match the physical structure of the database at the
time you want to recover to, restore a backup of the control file
that matches the database's physical file structure at the point in
time you want to recover to. Replace all current control files
of the database with the one you want to use for recovery. If
you do not have a backup copy of the control file, you can create a
new one.
e. Restore backups of all
datafiles. Make sure the backups were taken before the point
in time you are going to recover to. Any datafiles added after
the point in time you are recovering to should not be
restored. They will not be used in the recovery and will have
to be recreated after recovery is complete. Any data in the
datafiles created after the point of recovery will be lost.
Note: Make sure read-only tablespaces are offline before you start
recovery so recovery does not try to update the datafile headers.
f. Start the instance and mount the database:
startup mount
g. If you restored files to an alternative location,
change the location now in the control file by using the ALTER
DATABASE RENAME FILE command.
ALTER DATABASE RENAME FILE '/server1/oradata/user_01.dbf' TO
'/server2/oradata/user_01.dbf';
h. Use the RECOVER DATABASE UNTIL CANCEL command to
begin cancel-based recovery. If a backup of the control file is
being used, make sure to specify the USING BACKUP CONTROLFILE clause.
Oracle will now start the roll forward by applying the archived redo
logfiles and the online redo logfiles, prompting you for each
logfile. If you used a backup control file, you must enter the
names of the online redo logfiles.
i. Continue applying redo logfiles until the most recent,
undamaged logfile has been applied. Enter "CANCEL" instead of the
logfile name to cancel the recovery. Oracle will respond with a
recovery successful message.
j. Use the ALTER DATABASE OPEN command with the RESETLOGS or
NORESETLOGS option.
Use the RESETLOGS option if you used a backup of the control file in
recovery, or if the recovery was incomplete. Use the NORESETLOGS
option if the recovery was complete. If you are using a standby
database and must reset the logs, the standby database will have to
be re-created.
You can check the ALERT file to see if your incomplete recovery was
actually a complete recovery.
If the recovery was a complete recovery, the message in the ALERT
file is as follows:
RESETLOGS after complete recovery through change scn
If the recovery was incomplete, the following message is recorded:
RESETLOGS after incomplete recovery UNTIL CHANGE scn
Alter database open;
or
Alter database open resetlogs;
k. After opening the database using the RESETLOGS option, perform a
normal shutdown and a full database backup. If you do not do this,
any changes made after the recovery and before the next full backup
are unrecoverable. If you did not reset the logs, the database
is still recoverable.
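Putting the recovery steps together, a typical cancel-based session can be sketched as follows (whether USING BACKUP CONTROLFILE is needed depends on whether you restored a control file backup):
RECOVER DATABASE UNTIL CANCEL USING BACKUP CONTROLFILE;
-- apply each suggested archived log; when Oracle prompts for the missing log, type CANCEL
ALTER DATABASE OPEN RESETLOGS;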
Time-Based
Recovery (example: recover from a dropped table)
This recovers the database to the time specified by a date. The
date must be a character literal in the format
'YYYY-MM-DD:HH24:MI:SS'.
a. If the database is still
open, shut down the database using the SHUTDOWN command with the
ABORT option.
b. Make a full backup of the
database, including all datafiles, a control file, and the parameter
files in case an error is made during the recovery.
c. Restore backup control
files (if necessary) and backup data files and bring them online.
d. Perform media recovery on
the restored backup using the RECOVER DATABASE statement with the
UNTIL TIME option.
Note: Files in read-only tablespaces should be offline if you are
using a control file backup. Otherwise, the recovery will try to
update the headers of the read-only files.
e. Start the instance and
mount the database:
startup mount
f. If you restored files
to an alternative location, change the location now in the control
file by using the ALTER DATABASE RENAME FILE command.
ALTER DATABASE RENAME FILE
'/server1/oradata/user_01.dbf' to '/server2/oradata/user_01.dbf';
g. Obtain the names of
all data files requiring recovery by:
- Checking the list of data files that normally accompanies the
control file being used.
- Querying the V$DATAFILE view.
h. Make sure that all
data files of the database are online. All data files of the
database must be online unless an offline tablespace was taken
offline normally. For example:
ALTER DATABASE DATAFILE
'/server1/oradata/user_01.dbf' ONLINE;
i. Issue the RECOVER
DATABASE UNTIL TIME statement to begin time-based recovery. The time
is always specified using the following format, delimited by single
quotation marks: 'YYYY-MM-DD:HH24:MI:SS'.
The following statement recovers the database up to a specified time
using a control file backup:
RECOVER DATABASE UNTIL TIME
'2007-12-31:12:47:30' USING BACKUP CONTROLFILE;
or
RECOVER DATABASE UNTIL
TIME '2007-12-31:12:47:30';
j. Apply the necessary
redo log files to reconstruct the restored data files. Unless the
application of files is automated, Oracle supplies the name it
expects to find from LOG_ARCHIVE_DEST and requests you to stop or
proceed with applying the log file. If the control file is a backup,
you must supply names of online logs.
k. Apply redo log
files until the last required redo log file has been applied to the
restored data files. Oracle automatically terminates the recovery
when it reaches the correct time, and returns a message indicating
whether recovery is successful.
l. Open the DB. Because time-based recovery is incomplete recovery,
the database is normally opened with the RESETLOGS option:
Alter database open resetlogs;
Change-Based
Recovery
This recovers the database to a transaction-consistent state
immediately prior to the system change number (SCN) specified by an
integer:
a. If the database is still
open, shut down the database using the SHUTDOWN command with the
ABORT option.
shutdown abort;
b. Back up the database as
a precaution and correct any media failures.
c. Restore backup data
files.
d. Start the instance and
mount the database:
startup mount
e. If one or more
damaged data files were restored to an alternative location,
change the location now in the control file by using:
ALTER DATABASE RENAME FILE '/server1/oradata/user_01.dbf' TO
'/server2/oradata/user_01.dbf';
f. To determine the SCN needed for recovery, you can query
the V$LOG_HISTORY view.
Select * from v$log_history where rownum < 3;

THREAD#  SEQUENCE#  TIME               LOW_CHANGE#  HIGH_CHANGE#  ARCHIVE_NAME
      1      47213  06/26/98 12:47:55    116098950     116098954  /u01/arch/log_47213.arc
      1      47212  06/26/98 12:47:33    116098947     116098949  /u01/arch/log_47212.arc
The LOW_CHANGE# column represents the SCN at the beginning of the
log file and the HIGH_CHANGE# represents the SCN at the end of the
log file.
g. Begin change-based
recovery, specifying the SCN for recovery termination. The SCN is
specified as a decimal number without quotation marks. For example,
to recover until SCN 116098954 issue:
RECOVER DATABASE UNTIL CHANGE
116098954;
h. Oracle begins the
roll forward phase of media recovery by applying the necessary redo
log files (archived and online) to reconstruct the restored data
files. Unless the application of files is automated, Oracle supplies
the name it expects to find from LOG_ARCHIVE_DEST and requests you
to stop or proceed with applying the log file. If the control file
is a backup file, you must supply names of online logs.
i. Oracle applies the
redo log files until the last required redo log file has been
applied to the restored data files. Oracle automatically terminates
the recovery when it reaches the correct system change number, and
returns a message indicating whether recovery is successful.
Recover
DataFile in Backup Mode
In this scenario, the DB crashed during the backup process, so you
need to perform the following steps:
a. Start up the database; if it is already running, shut it down
and then start it up again. It will ask for media recovery:
ORA-01113 : file 3 needs media recovery
ORA-01110 : data file 3 : '/clover/test.dbf'
So finally you can only mount it.
b. Query v$backup to
determine those datafiles that are in the ACTIVE state:
select FILE#, STATUS from
V$Backup;
c. Place the datafiles out
of hot backup mode by using
Alter database datafile 3 end backup;
or
Alter database datafile '/clover/test.dbf' end backup;
Because the datafile headers were frozen in backup mode, the
datafiles were not synchronized with the rest of the database.
d. Open the DB
Alter database open;
Loss of Control
File
Control file problems are usually not detected while the database is
up and running. If the control file is lost or damaged in such a way
that Oracle can't recognize it, a subsequent database startup will
result in ORA-205 (error in identifying control file "%s"), along
with an operating system level error.
When only one control file is lost, the following steps can be used
to recover:
a. Perform a create pfile from spfile;
b. Shut down the instance if
it's still running.
c. Find the cause of the
loss of the control file. Is it due to a hardware problem (disk or
controller)?
d. If the hardware isn't a
problem, copy a good copy of an existing control file to the
location of the lost control file and skip to step g
e. If the hardware is the
problem, copy a good copy of an existing control file to a reliable
location.
f. Edit init.ora, updating the CONTROL_FILES parameter to reflect
the new location of the control file. Also perform
create spfile from pfile;
g. Start up the database.
In summary:
1) Copy a good control file to the location of the missing control
file
2) Remove references to the missing control file from CONTROL_FILES
initialization parameter
The solution is simple, provided you have at least one GOOD control
file: replace the corrupted control file with a copy of a good one
using operating system commands.
Remember to rename the copied file. The database should now start up
without any problems.
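As a sketch, if the lost control file was the copy on /u02 and its replacement now lives on /u03 (paths are illustrative), the CONTROL_FILES parameter in init.ora would change like this:
# before
control_files = ('/u01/oradata/control01.ctl', '/u02/oradata/control02.ctl')
# after
control_files = ('/u01/oradata/control01.ctl', '/u03/oradata/control02.ctl')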
Another option is to remove the references from init.ora file about
that control file.
Another option is just to recover it from the RMAN Backup:
RMAN> list backup of controlfile;
RMAN> restore controlfile to '/example/control02.ctl' from autobackup;
Loss of
ALL Control Files
Option 1: All
Control Files are Lost and no Backup is Available
Under this case, your only option is to start the instance in
NOMOUNT and re-create the control file manually with NORESETLOGS.
(The CREATE CONTROLFILE script below can be generated in advance
with: alter database backup controlfile to trace;) Example:
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE
DATABASE "DIE" NORESETLOGS ARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\REDO01.LOG' SIZE 50M,
GROUP 2
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\REDO02.LOG' SIZE 50M,
GROUP 3
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\REDO03.LOG' SIZE 50M
DATAFILE
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\SYSTEM01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\UNDOTBS02.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\SYSAUX01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\USERS01.DBF',
'C:\ORACLE\PRODUCT\10.2.0\ORADATA\DIE\FG_DATA.DBF';
alter database open;
Option 2: All Control Files are Lost and
Autobackup is Configured
What if you lose all your control files? In that case you have no
option but to use a backup control file. The recovery needs to be
performed from within RMAN, and requires that all logs (archived
and current online logs) since the last backup are available. The
logs are required because all datafiles must also be restored from
backup. The database will then have to be recovered up to the time
the control files went missing. This can only be done if all
intervening logs are available.
--
Connect to RMAN
C:\> rman target /
-- Start DB Instance in NOMOUNT
RMAN> startup nomount
-- Set DBID, get this from the name of the
controlfile autobackup.
-- For example, if autobackup
name is CTL_SP_BAK_C-1507972899-20050124-00 then the DBID is
1507972899.
-- This step will not be required if the instance is started up
from RMAN or if you are using a catalog.
RMAN> set dbid 1235583
-- Restore ControlFile from Autobackup
RMAN> restore controlfile from autobackup;
or
RMAN> restore controlfile;
-- Now that control files have
been restored, mount the database.
RMAN> alter database mount;
--Database must be recovered
because all datafiles have been restored from backup
RMAN> recover database;
-- Recovery completed. The
database must be opened with RESETLOGS because a backup control
file was used.
RMAN> alter database open resetlogs;
Several points are worth emphasising.
- Recovery using a backup controlfile should be done only if a
current control file is unavailable.
- All datafiles must be restored from backup. This means the
database will need to be recovered using archived and online
redo logs. These MUST be available for recovery until the time
of failure.
- As with any database recovery involving RESETLOGS, take a
fresh backup immediately.
- Technically the above is an example of complete recovery -
since all committed transactions were recovered. However, some
references consider this to be incomplete recovery because the
database log sequence had to be reset.
After recovery using a backup controlfile, all temporary files
associated with locally-managed tablespaces are no longer available.
You can check that this is so by querying the view V$TEMPFILE - no
rows will be returned. Therefore tempfiles must be added (or
recreated) before the database is made available for general use. In
the case at hand, the tempfile already exists so we merely add it to
the temporary tablespace. This can be done using SQL*Plus or any tool
of your choice:
SQL> alter tablespace temp add tempfile 'D:\oracle_data\datafiles\ORCL\TEMP01.DBF';
Check that the file is available by querying V$TEMPFILE.
Option 3: Restore Control
File When FRA is Configured
Problem:
- All Control Files are lost
- Flash Recovery Area (FRA) is configured
- Autobackup feature is disabled
Solution(s):
- Start database instance in NOMOUNT mode
RMAN> startup nomount
- Restore control file
RMAN> restore controlfile from autobackup
db_recovery_file_dest='C:\oracle\product\10.2.0\flash_recovery_area'
db_name='DIE';
- MOUNT the database
RMAN> alter database mount;
- Recover database
RMAN> recover database;
- Open the database with RESETLOGS option
RMAN> alter database open resetlogs;
Loss of Redo
Log Files - Explanation
The STATUS column of V$LOG view reflects the status of the log
group:
CURRENT: The log group that is currently being written to by the log
writer.
ACTIVE: The log group is required for crash recovery and may or may
not have been archived.
INACTIVE: The log group isn’t needed for crash recovery and may or
may not have been archived.
UNUSED: The log group has never been written to as it was recently
created.
The STATUS column of V$LOGFILE reports the status of an online redo
log file member:
INVALID: The log file member is inaccessible, or it has been
recently created.
NULL: The log file member is being used by the database.
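Before choosing a recovery path, you can check both views; for example:
select group#, sequence#, status, archived from v$log;
select group#, member, status from v$logfile;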
Recovering
from Redo Log File Failures - Flowchart
Losing a Member of Multiplexed Redo Log Files
Problem:
A member of multiplexed redo log group is lost
Database is UP
Solution(s):
Fix the media or
Drop the affected Redo Log File and Create a new one in a
different location
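The drop-and-recreate approach can be sketched as follows (the group number and file names are illustrative):
alter database drop logfile member '/u01/oradata/redo01b.log';
alter database add logfile member '/u02/oradata/redo01b.log' to group 1;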
Loss of "INACTIVE"
Online Redo Log Group
- Status of the Redo Log Group is INACTIVE
- Shutdown the instance.
- Mount the instance.
- If the lost log file was archived, check the v$log view.
- Issue the alter database clear logfile command.
- If there is an off-line datafile (check v$datafile) that
requires the cleared unarchived log to bring it online, the
keywords unrecoverable datafile are required. The datafile and
its entire tablespace must be dropped from the database because
the redo necessary to bring it online is being cleared, and
there is no copy of it.
- Issue the alter database backup controlfile command.
- Delete the operating system redo log files by exiting to the
operating system (for example, with the HOST command) and deleting
the redo log files for the damaged redo log group.
- Drop the damaged redo log file group.
- Add a redo log group, using the information noted on the
Database Configuration Checklist.
- Add a redo log file member using the information of the
Database Configuration Checklist.
- Determine if a full off-line backup is required and perform
one if necessary.
- Ensure the instance is started and the database is open.
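The clear-logfile step above can be sketched as follows (the group number is illustrative):
alter database clear logfile group 2;
-- if the group had not yet been archived:
alter database clear unarchived logfile group 2;
-- if an offline datafile would need the cleared redo (that datafile must then be dropped):
alter database clear unarchived logfile group 2 unrecoverable datafile;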
Loss of "Current"
Online Redo Log Group
- Start the instance if necessary.
- Attempt to alter the database and drop the redo log group. You
will receive an ORA-00350 error stating that the redo log group
needs to be archived.
- Shutdown the instance.
- Review the trace files.
- Review the alert.log file.
- Copy the datafiles and redo log files from the backup
directory into their respective paths.
- Mount the instance.
- Query the v$log view to determine the sequence# of the current
log group.
- Run archive log list.
- Recover the database until cancel, cancelling the recovery
operation one log file sequence# before the current log group.
- Once recovery is complete open the database with the resetlogs
command.
- Determine if a full off-line backup is required and perform
one if necessary.
- Delete the trace files in the $HOME/trace directory.
- View the alert.log file for the recovery that was applied.
- Remove the archive log files from the archive directory.
- Remove the alert.log and trace files from the trace directory.
- Ensure the instance is started and the database is open.
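The until-cancel portion of the procedure above can be sketched as follows (the sequence numbers are illustrative):
select group#, sequence#, status from v$log;
recover database until cancel;
-- type CANCEL when Oracle asks for the sequence belonging to the lost current group
alter database open resetlogs;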
Loss of UNDO
Tablespace
If you get an ORA-00600 or the following error message on your
Oracle database:
...
ORA-00600: internal error code, arguments: [3020], [356515961], [1],
[4606], [2], [16], [], []
ORA-10567: Redo is inconsistent with data block (file# 85, block#
121)
ORA-10564: tablespace UNDO
...
Please note that the following rescue scenario applies when the
rollback segments are corrupt but the UNDO datafile still exists. If
your UNDO datafile is corrupt or lost, go to the section Rescue
Tablespace.
1. Shutdown the Database if it is still running
sqlplus "/ as sysdba"
create pfile from spfile;
shutdown abort;
2. Verify the Rollback Segments in the UNDO Tablespace
The names of the Rollback Segments in the UNDO
Tablespace are _SYSSMU1$, _SYSSMU2$, _SYSSMU3$, etc. One
Rollback Segment in the SYSTEM Tablespace is called SYSTEM; we use
exactly this Rollback Segment to rescue the UNDO Tablespace.
cd <DIRECTORY-OF-SYSTEM-TABLESPACE>
strings <SYSTEM-TABLESPACE-FILE>.dbf | grep _SYSSMU | cut -d $ -f 1 | sort -u
Probably you get the following Rollback Segments:
_SYSSMU1$, _SYSSMU2$, _SYSSMU3$, _SYSSMU4$, _SYSSMU5$,
_SYSSMU6$, _SYSSMU7$, _SYSSMU8$, _SYSSMU9$, _SYSSMU10$,
_SYSSMU11$, _SYSSMU12$, _SYSSMU13$, _SYSSMU14$, _SYSSMU15$,
_SYSSMU16$, _SYSSMU17$, _SYSSMU18$, _SYSSMU19$, _SYSSMU20$
3. Comment out the existing undo parameters in the PFILE and add
the lines shown below
# undo_management = auto
# undo_retention = 10800
# undo_tablespace = undotbs1
undo_management = manual
rollback_segments = (system)
_corrupted_rollback_segments =
('_SYSSMU1$','_SYSSMU2$','_SYSSMU3$','_SYSSMU4$','_SYSSMU5$','_SYSSMU6$','_SYSSMU7$','_SYSSMU8$','_SYSSMU9$','_SYSSMU10$','_SYSSMU11$','_SYSSMU12$','_SYSSMU13$','_SYSSMU14$','_SYSSMU15$','_SYSSMU16$','_SYSSMU17$','_SYSSMU18$','_SYSSMU19$','_SYSSMU20$')
4. Startup in RESTRICT Mode with edited PFILE, then drop the
Rollback Segments and UNDO Tablespace
sqlplus "/ as sysdba"
startup restrict pfile='/home/oracle/oracle/initTEST.ora';
drop rollback segment "_SYSSMU1$";
drop rollback segment "_SYSSMU2$";
drop rollback segment "_SYSSMU3$";
drop rollback segment "_SYSSMU4$";
drop rollback segment "_SYSSMU5$";
drop rollback segment "_SYSSMU6$";
drop rollback segment "_SYSSMU7$";
drop rollback segment "_SYSSMU8$";
drop rollback segment "_SYSSMU9$";
drop rollback segment "_SYSSMU10$";
drop rollback segment "_SYSSMU11$";
drop rollback segment "_SYSSMU12$";
drop rollback segment "_SYSSMU13$";
drop rollback segment "_SYSSMU14$";
drop rollback segment "_SYSSMU15$";
drop rollback segment "_SYSSMU16$";
drop rollback segment "_SYSSMU17$";
drop rollback segment "_SYSSMU18$";
drop rollback segment "_SYSSMU19$";
drop rollback segment "_SYSSMU20$";
5. Check for any other corrupted rollback segments
select segment_name, status, tablespace_name from dba_rollback_segs;

SEGMENT_NAME                   STATUS           TABLESPACE_NAME
------------------------------ ---------------- ------------------------------
SYSTEM                         ONLINE           SYSTEM

If no segments other than SYSTEM remain, proceed to the next step;
otherwise go back to step 2.
6. Remove the OLD UNDO Tablespace and create a new one:
drop tablespace UNDOTBS1 including contents and datafiles;
CREATE UNDO TABLESPACE undotbs2 DATAFILE '/home/oracle/undotbs2.dbf' SIZE 500M;
At this point, no UNDOTBS1 Tablespace exists anymore, and the SYSTEM
Rollback Segment is the only active Rollback Segment.
7. Now go back to your init.ora PFILE, uncomment the lines that
were commented out in step 3, and comment out the lines that were
added there
undo_management = auto
undo_retention = 10800
undo_tablespace = undotbs2
# undo_management = manual
# rollback_segments = (system)
# _corrupted_rollback_segments =
# ('_SYSSMU1$','_SYSSMU2$','_SYSSMU3$','_SYSSMU4$','_SYSSMU5$','_SYSSMU6$','_SYSSMU7$','_SYSSMU8$','_SYSSMU9$','_SYSSMU10$','_SYSSMU11$','_SYSSMU12$','_SYSSMU13$','_SYSSMU14$','_SYSSMU15$','_SYSSMU16$','_SYSSMU17$','_SYSSMU18$','_SYSSMU19$','_SYSSMU20$')
8. Finally, we are ready to restart the DB and assign the correct
new Undo tablespace called UNDOTBS2
shutdown immediate;
create spfile from pfile;
startup;
alter system set undo_tablespace=undotbs2 scope=both;
select segment_name, status, tablespace_name from dba_rollback_segs;
SEGMENT_NAME                   STATUS           TABLESPACE_NAME
------------------------------ ---------------- ------------------------------
SYSTEM                         ONLINE           SYSTEM
_SYSSMU21$                     ONLINE           UNDO
_SYSSMU22$                     ONLINE           UNDO
_SYSSMU23$                     ONLINE           UNDO
Loss
spfile/pfile files
You have lost your PFILE/SPFILE and have no backup of it.
Case A = Loss of
pfile/spfile with no Backup and DB is up:
a) Check the database alert.log bottom-up
When an Oracle instance starts, all non-default parameters are
recorded in the database alert.log, so you can copy those parameters
into a new init.ora file.
b) You can query the V$PARAMETER2 view to re-create your pfile, or
query V$SPPARAMETER to re-create your spfile:
column name format a30
column display_value format a40
select name, display_value from v$parameter2 where isdefault = 'FALSE';
or
column name format a30
column display_value format a40
select name, display_value from v$spparameter where isspecified = 'TRUE';
c) If DB is up, you can create PFILE/SPFILE from memory (Oracle
11g):
create pfile = 'C:\inittest.ora' from memory;
create spfile = 'C:\sp_test.ora' from memory;
Case B = Loss of
pfile/spfile , DB is down and AutoBackup is configured:
Start DB Instance in NOMOUNT without a parameter file
RMAN> startup nomount;
Set the DBID
RMAN> set dbid 1235583
Restore SPFILE from Autobackup
RMAN> restore spfile from autobackup;
Case C = Loss of
pfile/spfile , DB is down and FRA is enabled:
Start DB Instance in NOMOUNT without a parameter file
RMAN> startup nomount;
Restore SPFILE from FRA
RMAN> restore spfile from autobackup
db_recovery_file_dest='/u01/app/oracle/oradata/DEV/fra'
db_name='DEV';
Case D = Loss of
pfile/spfile , DB is down and Recovery Catalog is configured:
Start DB Instance in NOMOUNT without a parameter file
RMAN> startup nomount;
Restore SPFILE from the Recovery Catalog
RMAN> restore spfile;