Backup and Recovery Overview
Objectives  After completing this lesson, you should be able to do the following:  •   Describe the basics of database backup, restore, and recovery  •   List the types of failure that may occur in an Oracle environment  •   Define a backup and recovery strategy
Backup and Recovery Issues  •   Protect the database from numerous types of failures  •   Increase Mean-Time-Between-Failures (MTBF)  •   Decrease Mean-Time-To-Recover (MTTR)  •   Minimize data loss  Overview  One of a database administrator’s (DBA) major responsibilities is to ensure that the database is available for use. The DBA can take precautions to minimize failure of the system. Despite these precautions, it is naive to think that failures will never occur. The DBA must make the database operational as quickly as possible after a failure and minimize the loss of data.  To protect the data from the various types of failures that can occur, the DBA must back up the database regularly. Without a current backup, the DBA cannot get the database up and running after a file loss without losing data.  Backups are critical for recovering from different types of failures. The task of validating backups cannot be overemphasized. Assuming that a backup exists without actually verifying it can prove very costly if the backup turns out to be missing or invalid.
Categories of Failures  Statement failure  User process failure User error  Instance failure Media failure Network failure  Categories of Failures  Each type of failure requires a varying level of involvement by the DBA to recover effectively from the situation. In some cases, recovery depends on the type of backup strategy that has been implemented. For example, a statement failure requires little DBA intervention, whereas a media failure requires the DBA to employ a tested recovery strategy.
Causes of Statement Failures  •   Logic error in an application  •   Attempt to enter invalid data into the table  •   Attempt an operation with insufficient privileges  •   Attempt to create a table but exceed allotted quota limits  •   Attempt an  INSERT  or  UPDATE  to a table, causing an extent to be allocated, but with insufficient free space available in the tablespace  Statement Failure  Statement failure occurs when there is a logical failure in the handling of a statement in an Oracle program. Types of statement failures include:  •  A logical error occurs in the application.  •  The user attempts to enter invalid data into the table, perhaps violating integrity constraints.  •  The user attempts an operation with insufficient privileges, such as an  INSERT  on a table for which the user holds only the  SELECT  privilege.  •  The user attempts to create a table but exceeds the user’s allotted quota limit.  •  The user attempts an  INSERT  or  UPDATE  on a table, causing an extent to be allocated, but insufficient free space is available in the tablespace.  Note:  When a statement failure is encountered, it is likely that the Oracle server or the operating system will return an error code and a message. The failed SQL statement is automatically rolled back, then control is returned to the user program. The application developer or DBA can use the Oracle error codes to diagnose and help resolve the failure.
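As an illustration of the note above, the following PL/SQL sketch shows how an application might trap a statement failure and report the Oracle error code. It is an assumption-laden example rather than part of the lesson: the table and its columns are hypothetical, and the error numbers in the comments are standard Oracle codes that correspond to some of the failure types listed.

```sql
-- Sketch only: employees(employee_id, last_name) is a hypothetical table.
SET SERVEROUTPUT ON

BEGIN
  INSERT INTO employees (employee_id, last_name)
  VALUES (100, 'King');
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN
    -- ORA-00001: unique constraint violated (invalid data)
    DBMS_OUTPUT.PUT_LINE('Duplicate key: ' || SQLERRM);
  WHEN OTHERS THEN
    -- Other run-time statement failures arrive here, for example:
    --   ORA-01400: cannot insert NULL into a NOT NULL column
    --   ORA-01536: space quota exceeded for tablespace
    --   ORA-01653: unable to extend table in tablespace
    DBMS_OUTPUT.PUT_LINE('Statement failed: ' || SQLCODE || ' ' || SQLERRM);
END;
/
```

In either branch only the failed INSERT is undone; any earlier work in the same transaction is left for the program to commit or roll back.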
Resolutions for Statement Failures  •   Correct the logical flow of the program.  •   Modify and reissue the SQL statement.  •   Provide the necessary database privileges.  •   Change the user’s quota limit by using the  ALTER USER  command.  •   Add file space to the tablespace.  •   Enable resumable space allocation.
Statement Failure Resolution  DBA intervention after statement failures will vary in degree, depending on the type of failure, and may include the following:  •  Fix the application so that logical flow is correct. Depending on your environment this may be an application developer task rather than a DBA task.  •  Modify the SQL statement and reissue it. This may also be an application developer task rather than a DBA task.  •  Provide the necessary database privileges for the user to complete the statement successfully.  •  Add file space to the tablespace. Ideally, the DBA monitors space so that this does not happen; however, in some cases it may be necessary to add file space. A DBA can also use the  RESIZE  and  AUTOEXTEND  options for data files.  •  Oracle9i provides a means for suspending, and later resuming, the execution of large database operations in the event of space allocation failures. This enables an administrator to take corrective action, instead of the Oracle database server returning an error to the user. After the error condition is corrected, the suspended operation automatically resumes. This feature is called resumable space allocation and the statements that are affected are called resumable statements.
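The DBA-side resolutions above might look like the following in SQL*Plus. This is a sketch under stated assumptions, not a prescribed procedure: the user names, tablespace name, file paths, sizes, and the statement name are all hypothetical. The last two statements relate to the suspend-and-resume feature described above, which the ALTER SESSION syntax names RESUMABLE.

```sql
-- Sketch only: user names, tablespace names, file paths, and sizes are hypothetical.

-- Provide a missing object privilege.
GRANT INSERT ON hr.employees TO scott;

-- Raise a user's quota on a tablespace.
ALTER USER scott QUOTA 50M ON users;

-- Add file space to a tablespace with a new data file.
ALTER TABLESPACE users
  ADD DATAFILE '/u01/oradata/orcl/users02.dbf' SIZE 100M;

-- Alternatively, enlarge an existing data file or let it grow automatically.
ALTER DATABASE DATAFILE '/u01/oradata/orcl/users01.dbf' RESIZE 500M;
ALTER DATABASE DATAFILE '/u01/oradata/orcl/users01.dbf'
  AUTOEXTEND ON NEXT 10M MAXSIZE 2000M;

-- Suspend rather than fail on space errors, for up to one hour.
-- The session needs the RESUMABLE system privilege.
ALTER SESSION ENABLE RESUMABLE TIMEOUT 3600 NAME 'nightly load';

-- The DBA can list suspended statements and the reason they are waiting.
SELECT name, status, error_msg FROM dba_resumable;
```

Whether to add a file, resize, or rely on AUTOEXTEND is a site decision; automatic growth removes the immediate failure but moves the limit to the underlying file system.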
Causes of User Process Failures  •   The user performed an abnormal disconnect in the session.  •   The user’s session was abnormally terminated.  •   The user’s program raised an address exception, which terminated the session.  Causes of User Process Failures  A user’s process may fail for a number of reasons; however, the more common causes include:  •  The user performed an abnormal disconnect in the session. For example, a user issues a [Ctrl] + [Break] in SQL*Plus while connected to a database in a client-server configuration.  •  The user’s session was abnormally terminated. One possible scenario is the user rebooted the client while connected to a database in a client-server configuration. •  The user’s program raised an address exception which terminated the session. This is common if the application does not properly handle exceptions when they are raised.
Resolution of User Process Failures  •   The PMON process detects an abnormally terminated user process.  •   PMON rolls back the transaction and releases any resources and locks being held by it.  User Process Failure and DBA Action  The DBA will rarely need to take action to resolve user process errors. The user process cannot continue to work, although the Oracle server and other user processes will continue to function.  PMON Background Process  The PMON background process is usually sufficient for cleaning up after an abnormally terminated user process.  When the PMON process detects an abnormally terminated server process, it rolls back the transaction of the abnormally terminated process, and releases any resources and locks it has acquired.
Possible User Errors  SQL> DROP TABLE employees;  SQL> TRUNCATE TABLE employees;  SQL> DELETE FROM employees; SQL> COMMIT;  SQL> UPDATE employees  SET salary = salary * 1.5;  SQL> COMMIT;  User Errors  DBA intervention is usually required to recover from user errors.  Common Types of User Errors  •   The user accidentally drops or truncates a table. •   The user deletes all rows in a table.  •   The user commits data, but discovers an error in the committed data.
Resolution of User Errors  •   Train the database users.  •   Recover from a valid backup.  •   Import the table from an export file.  •   Use LogMiner to determine the time of error.  •   Recover with a point-in-time recovery.  •   Use LogMiner to perform object-level recovery.  •   Use Flashback to view and repair historical data.  Minimizing User Errors  A key issue in any database and application environment is to make sure that users are properly trained and are aware of database availability and integrity implications. A DBA should understand the types of applications and business operations that may result in loss of data from user errors and how to implement recovery measures for those situations. Some recovery situations may be quite extensive, such as restoring the database to a point in time just prior to the error, exporting the lost data, and then importing that data back into the database from which it was lost.  Oracle9i provides a new feature called Flashback, which lets you view and repair historical data. Flashback offers the ability to perform queries on the database as of a certain wall clock time or user-specified system change number (SCN).
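A Flashback query of the kind described above can be sketched as follows. The example assumes Oracle9i Release 2 or later (where the AS OF clause is available), automatic undo management, and enough undo retention to cover the interval; the 15-minute interval and the employee number are invented for illustration.

```sql
-- Sketch only: assumes the AS OF clause is available and undo data still
-- covers the chosen interval. Names and values are hypothetical.

-- View salaries as they were 15 minutes ago (wall clock time).
SELECT employee_id, salary
FROM   employees AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE);

-- Record the current SCN, for example before a risky batch job.
-- Requires EXECUTE privilege on the DBMS_FLASHBACK package.
SELECT DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER AS current_scn FROM dual;

-- Repair: re-insert a row that was deleted by mistake.
INSERT INTO employees
  (SELECT *
   FROM   employees AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
   WHERE  employee_id = 100);
```

Queries of this kind generally cannot see past DDL on the table, so they help with a mistaken DELETE or UPDATE but not with DROP TABLE or TRUNCATE TABLE, which still call for a backup, an export file, or point-in-time recovery.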
 
Instance Failure  An instance failure may occur for numerous reasons:  •  A power outage occurs that causes the server to become unavailable.  •  The server becomes unavailable due to hardware problems such as a CPU failure, memory corruption, or an operating system crash.  •  One of the Oracle server background processes (DBWn, LGWR, PMON, SMON, CKPT) experiences a failure.  To recover from instance failure, the DBA:  •  Starts the instance by using the  STARTUP  command. The Oracle server will automatically recover, performing both the roll forward and rollback phases.  •  Investigates the cause of failure by reading the instance  alert.log  file and any other trace files that were generated during the instance failure.
Recovery from Instance Failure  •   No special recovery action is needed from the DBA.  •   Start the instance.  •   Wait for the “Database opened” notification.  •   Notify users.  •   Check the alert file to determine the reason for the failure.  Instance Recovery  Instance recovery restores a database to its transaction-consistent state just prior to instance failure. The Oracle server automatically performs instance recovery when the database is opened if it is necessary.  No recovery action needs to be performed by you. All required redo information is read by SMON. To recover from this type of failure, start the database:  SQL>  CONNECT / AS sysdba;  Connected.  SQL> STARTUP;  Database opened.  After the database has opened, notify users that any data that they did not commit must be re-entered.
Instance Recovery (continued)  Note:  •  There may be a time delay between starting the database and the “Database opened” notification. This is the roll forward phase, which takes place while the database is mounted.  - SMON performs the roll forward process by applying changes recorded in the online redo log files from the last checkpoint.  - Rolling forward recovers data that has not been recorded in the database files, but has been recorded in the online redo log, including the contents of rollback segments.  •  Rollback can occur while the database is open, because either SMON or a server process can perform the rollback operation. This allows the database to be available for users more quickly.  Oracle9i DBA Fundamentals II  6-13
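Once the instance is back up, the DBA could confirm the state of the database and find where the alert file is written with a few checks such as these. The queries are a sketch that assumes a privileged connection; the parameter name is the one used in Oracle9i.

```sql
-- Sketch only: run as a privileged user after STARTUP completes.

-- Confirm that the instance is open for use.
SELECT instance_name, status FROM v$instance;
SELECT name, open_mode FROM v$database;

-- Locate the directory that holds the alert file and background trace files.
-- SHOW PARAMETER is a SQL*Plus command, not a SQL statement.
SHOW PARAMETER background_dump_dest
```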
Causes of Media Failures  •   Head crash on a disk drive  •   Physical problem in reading from or writing to database files  •   File was accidentally erased  Media Failure  Media failure involves a physical problem when reading from or writing to a file that is necessary for the database to operate. Media failure is the most serious type of failure because it usually requires DBA intervention.  Common Types of Media-Related Problems  •  The disk drive that held one of the database files experienced a head crash.  •  There is a physical problem reading from or writing to the files needed for normal database operation.  •  A file was accidentally erased.
Resolutions for Media Failures  •   The recovery strategy depends on which backup method was chosen and which files are affected.  •   If available, apply archived redo log files to recover data committed since the last backup.  Media Failure Resolution  A tested recovery strategy is the key component to resolving media failure problems. The ability of the DBA to minimize down time and data loss as a result of media failure depends on the type of backups that are available. A recovery strategy, therefore, depends on the following:  •  The backup method you choose and which files are affected.  •  Whether the database operates in ARCHIVELOG mode. If archiving is used, you can apply archived redo log files to recover data committed since the last backup.
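The two dependencies above can be illustrated with a short SQL*Plus sketch. It assumes a database running with archiving enabled and a damaged data file that does not belong to the SYSTEM tablespace; the file path is hypothetical, and the restore step itself happens outside SQL, at the operating-system level or through a backup tool.

```sql
-- Sketch only: the file path is hypothetical.

-- Is the database archiving its redo logs?
-- The result is ARCHIVELOG or NOARCHIVELOG.
SELECT log_mode FROM v$database;

-- SQL*Plus summary of the archiving status.
ARCHIVE LOG LIST

-- Which data files need media recovery, and why?
SELECT file#, error FROM v$recover_file;

-- Possible sequence for one damaged data file while the database stays open:
ALTER DATABASE DATAFILE '/u01/oradata/orcl/users01.dbf' OFFLINE;
-- (restore users01.dbf from the most recent backup here)
RECOVER DATAFILE '/u01/oradata/orcl/users01.dbf'
ALTER DATABASE DATAFILE '/u01/oradata/orcl/users01.dbf' ONLINE;
```

Without archived redo logs, the RECOVER step has nothing to apply beyond the online logs, which is why changes made after the last backup are normally lost in that configuration.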
Defining a Backup and Recovery Strategy  •   Business requirements  •   Operational requirements •   Technical considerations •   Management concurrence  Questions for the DBA  Whatever backup strategy you choose, it is important to obtain agreement from all appropriate levels of management. For example, if your company wants to avoid taking physical image copies of the files to minimize the usage of disk space, management must be aware of the ramifications of this decision.  Here are some questions to consider when selecting a backup strategy: • Given the expectation of system availability, does management understand the tradeoffs of the backup strategy that is chosen?  •   Are there dedicated resources available which will be needed to ensure a successful backup and  recovery strategy?  •  Is the importance of taking backups and preparing recovery procedures clearly understood?  Performing a thorough analysis of the business, operational, and technical requirements provides management with the information needed to support an effective backup and recovery strategy.
Business Requirements  •   Mean-Time-To-Recover  •   Mean-Time-Between-Failures  •   Evolutionary process  Business Impact  You should understand the impact that down time has on the business. Management must quantify the cost of down time and the loss of data and compare this with the cost of reducing down time and minimizing data loss.  MTTR  Database availability is a key issue for a DBA. In the event of a failure the DBA should strive to reduce the Mean-Time-To-Recover (MTTR), so that the database is unavailable for the shortest possible amount of time. By anticipating the types of failures that can occur and using effective recovery strategies, the DBA can ultimately reduce the MTTR.  MTBF  Protecting the database against various types of failures is also a key DBA task. To do this, a DBA must increase the Mean-Time-Between-Failures (MTBF). The DBA must understand the backup and recovery structures within an Oracle database environment and configure the database so that failures occur infrequently.  Evolutionary Process  A backup and recovery strategy evolves as business, operational, and technical requirements change. It is important that both the DBA and appropriate management review the validity of a backup and recovery strategy on a regular basis.
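The effect of the two measures on availability can be made concrete with the standard steady-state availability formula. The formula is not stated in the lesson and is added here only as an illustration; the figures in the worked example are invented.

```latex
\[
\text{Availability} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}
\]

% Worked example with invented figures: MTBF = 1000 hours in both cases,
% MTTR = 2 hours on the left and 0.5 hours on the right.
\[
\frac{1000}{1000 + 2} \approx 0.9980
\qquad
\frac{1000}{1000 + 0.5} \approx 0.9995
\]
```

In this example, cutting the recovery time from two hours to half an hour raises availability from about 99.80% to about 99.95% without any change in how often failures occur.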
Operational Requirements  •   24-hour operations  •   Testing and validating backups •   Database volatility  24-Hour Operations  Backups and recoveries are always affected by the type of business operation that you provide, particularly in a situation where a database must be available 24 hours a day, 7 days a week for continuous operation. Proper database configuration is necessary to support these operational requirements because they directly affect the technical aspects of the database environment.  Testing Backups  DBAs can ensure that they have a strategy that enables them to decrease the MTTR and increase the MTBF by having a plan in place to test the validity of backups regularly. A recovery is only as good as the backups that are available. Here are some questions to consider when selecting a backup strategy:  •   Can you depend on system administrators, vendors, backup DBAs, and other critical personnel when you need help?  •   Can you test your backup and recovery strategies at frequently scheduled intervals?  •  Are backup copies stored at an off-site location?  •  Is a plan well documented and maintained?
Database Volatility  Other issues that impact operational requirements include the volatility of the data and structure of the database. Here are some questions to consider when selecting a backup strategy:  •  Are tables frequently updated?  •  Is data highly volatile? If so, you must perform backups more frequently than a business where data is relatively static.  •  Does the structure of the database change often? •  How often do you add data files?
Technical Considerations  •   Resources: hardware, software, manpower, and time  •   Physical image copies of the operating system files  •   Logical copies of the objects in the database •   Database configuration  •   Transaction volume which affects desired frequency of backups  Physical Image Copies  Certain technical requirements are dictated by the types of backups that are required. For example, if physical image copies of data files are required, this may significantly impact available storage space.  Logical Copies  Creating logical copies of objects in the database may not have as significant storage requirements as physical image copies; however, system resources may be affected because logical copies are performed while the database is being accessed by users.  Database Configuration  Database configuration affects how backups are performed and the availability of the database. Depending on the database configuration, system resources, such as disk space required to support a backup and recovery strategy, may be limited.
Transaction Volume  Transaction volume also affects system resources. If 24-hour operations require regular backups, the load on system resources is increased.  Technical Requirements  Here are some questions to consider when selecting a backup strategy:  •   How much data do you have?  •   Do you have the machine power and capacity to support backups?  •   Is the data easily recreated?  •   Can you reload the data into the database from a flat file?  •   Does the database configuration support resiliency to different types of failures?
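Two of the questions above, how much data there is and how heavy the transaction volume is, can be approached with data dictionary queries. The following is one possible sketch, assuming access to the DBA_ and V$ views; using redo log switches per day as a proxy for transaction volume is an assumption of this example, not a method given in the lesson.

```sql
-- Sketch only: requires access to the DBA_ and V$ views.

-- How much data do you have? Total size of all data files, in megabytes.
SELECT ROUND(SUM(bytes) / 1024 / 1024) AS total_mb
FROM   dba_data_files;

-- How heavy is the transaction volume? Redo log switches per day give a
-- rough indication of how much change is being generated.
SELECT   TRUNC(first_time) AS switch_day, COUNT(*) AS log_switches
FROM     v$log_history
GROUP BY TRUNC(first_time)
ORDER BY TRUNC(first_time);
```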
Disaster Recovery Issues  •   How will your business be affected in the event of a major disaster?  -   Earthquake, flood, or fire  -   Complete loss of machine  -   Malfunction of storage hardware or software  -   Loss of key personnel, such as the database administrator  •   Do you have a plan for testing your strategy periodically?  Natural Disaster  Perhaps your data is so important that you must ensure resiliency even in the event of a complete system failure. Natural disasters and other issues can affect the availability of your data and must be considered when creating a disaster recovery plan. Here are some questions to consider when selecting a backup and recovery strategy:  •  What will happen to your business in the event of a serious disaster such as: - Flood, fire, earthquake, or hurricane  - Malfunction of storage hardware or software  •  If your database server fails, will your business be able to operate during the hours, days, or even weeks it might take to get a new hardware system? • Do you store backups at an off-site location?
Solutions  •  Off-site backups  •  Oracle9i Data Guard, which protects critical data by automating the creation, management, and monitoring of a standby database environment  •  Geomirroring  •  Messaging  •  TP monitors  Loss of Key Personnel  In terms of key personnel, consider the following questions:  •  How will a loss of personnel affect your business?  •  If your DBA leaves the company or is unable to work, will you be able to continue to run the database system?  •  Who will handle a recovery situation if the DBA is unavailable?
Summary  In this lesson, you should have learned how to: •   Evaluate potential failures in your environment •   Develop a strategy dictated by business, operational, and technical requirements •   Consider a test plan for a backup and recovery strategy
 
 
 
 

Backup And Recovery

  • 2. Objectives After completing this lesson, you should be able to do the following: • Describe the basics of database backup, restore, and recovery • List the types of failure that may occur in an Oracle environment • Define a backup and recovery strategy
  • 3. Backup and Recovery Issues • Protect the database from numerous types of failures • Increase Mean-Time-Between-Failures (MTBF) • Decrease Mean-Time-To-Recover (MTTR) • Minimize data loss Overview One of a database administrator’s (DBA) major responsibilities is to ensure that the database is available for use. The DBA can take precautions to minimize failure of the system. In spite of the precautions, it is naive to think that failures will never occur. The DBA must make the database operational as quickly as possible in case of a failure and minimize the loss of data. To protect the data from the various types of failures that can occur, the DBA must back up the database regularly. Without a current backup, it is impossible for the DBA to get the database up and running if there is a file loss, without losing data. Backups are critical for recovering from different types of failures. The task of validating backups cannot be overemphasized. Making an assumption that a backup exists without actually checking it’s existence can prove very costly if it is not valid.
  • 4. Categories of Failures Statement failure User process failure User error Instance failure Media failure Network failure Categories of Failures Each type of failure requires a varying level of involvement by the DBA to recover effectively from the situation. In some cases, recovery depends on the type of backup strategy that has been implemented. For example, a statement failure requires little DBA intervention, whereas a media failure requires the DBA to employ a tested recovery strategy.
  • 5. Causes of Statement Failures • Logic error in an application • Attempt to enter invalid data into the table • Attempt an operation with insufficient privileges • Attempt to create a table but exceed allotted quota limits • Attempt an INSERT or UPDATE to a table, causing an extent to be allocated, but with insufficient free space available in the table space Statement Failure Statement failure occurs where there is a logical failure in the handling of a statement in an Oracle program. Types of statement failures include: • A logical error occurs in the application. • The user attempts to enter invalid data into the table, perhaps violating integrity constraints. • The user attempts an operation with insufficient privileges, such as an insert on a table using only SELECT privileges. • The user attempts to create a table but exceeds the user’s allotted quota limit • The user attempts an INSERT or UPDATE on a table, causing an extent to be allocated, but insufficient free space is available in the table space. Note: When a statement failure is encountered, it is likely that the Oracle server or the operating system will return an error code and a message. The failed SQL statement is automatically rolled back, then control is returned to the user program. The application developer or DBA can use the Oracle error codes to diagnose and help resolve the failure.
  • 6. Resolutions for Statement Failures • Correct the logical flow of the program. • Modify and reissue the SQL statement. • Provide the necessary database privileges. • Change the user’s quota limit by using the ALTER USER command. • Add file space to the table space. • Enable presumable space allocation.
  • 7. Statement Failure Resolution DBA intervention after statement failures will vary in degree, depending on the type of failure, and may include the following: • Fix the application so that logical flow is correct. Depending on your environment this may be an application developer task rather than a DBA task. • Modify the SQL statement and reissue it. This may also be an application developer task rather than a DBA task. • Provide the necessary database privileges for the user to complete the statement successfully. • Add file space to the table space. Technically, the DBA should make sure this does not happen; however, in some cases it may be necessary to add file space. A DBA can also use the RESIZE and AUTOEXTEND options for data files. • Oracle9 i provides a means for suspending, and later resuming, the execution of large database operations in the event of space allocation failures. This enables an administrator to take corrective action, instead of the Oracle database server returning an error to the user. After the error condition is corrected, the suspended operation automatically resumes. This feature is called presumable space allocation and the statements that are affected are called presumable statements.
  • 8. Causes of User Process Failures • The user performed an abnormal disconnect in the session. • The user’s session was abnormally terminated. • The user’s program raised an address exception, which terminated the session. Causes of User Process Failures A user’s process may fail for a number of reasons; however, the more common causes include: • The user performed an abnormal disconnect in the session. For example, a user issues a [Ctrl] + [Break] in SQL*Plus while connected to a database in a client-server configuration. • The user’s session was abnormally terminated. One possible scenario is the user rebooted the client while connected to a database in a client-server configuration. • The user’s program raised an address exception which terminated the session. This is common if the application does not properly handle exceptions when they are raised.
  • 9. Resolution of User Process Failures • The PMON process detects an abnormally terminated user process. • PMON rolls back the transaction and releases any resources and locks being held by it. User Process Failure and DBA Action The DBA will rarely need to take action to resolve user process errors. The user process cannot continue to work, although the Oracle server and other user processes will continue to function. PMON Background Process The PMON background process is usually sufficient for cleaning up after an abnormally terminated user process. When the PMON process detects an abnormally terminated server process, it rolls back the transaction of the abnormally terminated process, and releases any resources and locks it has acquired.
  • 10. Possible User Errors SQL> DROP TABLE employees; SQL> TRUNCATE TABLE employees; SQL> DELETE FROM employees; SQL> COMMIT; SQL> UPDATE employees SET salary = salary * 1.5; SQL> COMMIT; User Errors DBA intervention is usually required to recover from user errors. Common Types of User Errors • The user accidentally drops or truncates a table. • The user deletes all rows in a table. • The user commits data, but discovers an error in the committed data.
  • 11. Resolution of User Errors • Train the database users. • Recover from a valid backup. • Import the table from an export file. • Use LogMiner to determine the time of error. • Recover with a point-in-time recovery. • Use LogMiner to perform object-level recovery. • Use Flashback to view and repair historical data. Minimizing User Errors A key issue in any database and application environment is to make sure that users are properly trained and are aware of database availability and integrity implications. A DBA should understand the types of applications and business operations that may result in loss of data from user errors and how to implement recovery measures for those situations. Some recovery situations may be quite extensive, such as restoring the database to a point in time just prior to the error, exporting the lost data, and then importing that data back into the database from which it was lost. Oracle9i provides a new feature called Flashback, which lets you view and repair historical data. Flashback offers the ability to perform queries on the database as of a certain wall clock time or user-specified system change number (SCN).
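As a hedged sketch of a Flashback query, the statements below read a table as it stood 30 minutes earlier and reinsert rows that have since been deleted. The employees table comes from the earlier slide; the employee_id column and the 30-minute window are assumptions. The AS OF clause is available from Oracle9i Release 2 (earlier 9i releases use the DBMS_FLASHBACK package instead), and the query works only while the required undo data is still retained.

```sql
-- View the table as it was 30 minutes ago.
SELECT employee_id, salary
FROM   employees AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '30' MINUTE);

-- Reinsert rows that existed then but are missing now
-- (assumes the table has no LOB columns, which MINUS cannot compare).
INSERT INTO employees
  SELECT * FROM employees AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '30' MINUTE)
  MINUS
  SELECT * FROM employees;

COMMIT;
```

This approach suits an accidental DELETE or UPDATE that was committed. It does not help after DROP TABLE or TRUNCATE TABLE, because a Flashback query cannot see back past DDL on the table; those cases need a backup, an export file, or point-in-time recovery.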
  • 13. Instance Failure An instance failure may occur for numerous reasons: • A power outage occurs that causes the server to become unavailable. • The server becomes unavailable due to hardware problems such as a CPU failure, memory corruption, or an operating system crash. • One of the Oracle server background processes (DBWn, LGWR, PMON, SMON, CKPT) experiences a failure. To recover from instance failure, the DBA: • Starts the instance by using the STARTUP command. The Oracle server will automatically recover, performing both the roll forward and rollback phases. • Investigates the cause of failure by reading the instance alert.log file and any other trace files that were generated during the instance failure.
  • 14. Recovery from Instance Failure • No special recovery action is needed from the DBA. • Start the instance. • Wait for the “Database opened” notification. • Notify users. • Check the alert file to determine the reason for the failure. Instance Recovery Instance recovery restores a database to its transaction-consistent state just prior to instance failure. The Oracle server automatically performs instance recovery, if it is necessary, when the database is opened. No recovery action needs to be performed by you. All required redo information is read by SMON. To recover from this type of failure, start the database: SQL> CONNECT / AS sysdba; Connected. SQL> STARTUP; Database opened. After the database has opened, notify users that any data that they did not commit must be re-entered.
  • 15. Instance Recovery (continued) Note: • There may be a time delay between starting the database and the “Database opened” notification. This is the roll forward phase that takes place while the database is mounted. - SMON performs the roll forward process by applying changes recorded in the online redo log files from the last checkpoint. - Rolling forward recovers data that has not been recorded in the database files, but has been recorded in the online redo log, including the contents of rollback segments. • Rollback can occur while the database is open, because either SMON or a server process can perform the rollback operation. This allows the database to be available for users more quickly.
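After the STARTUP shown earlier completes, a DBA might confirm the instance state and find where the alert file is written. The following is a small sketch using a standard dynamic view and the Oracle9i parameter name for the background trace directory; it is offered as an illustration rather than a required step.

```sql
-- Confirm that the instance is open after STARTUP.
SELECT instance_name, status FROM v$instance;

-- SQL*Plus command: show the directory that holds the alert file
-- and background process trace files (Oracle9i parameter name).
SHOW PARAMETER background_dump_dest
```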
  • 16. Causes of Media Failures • Head crash on a disk drive • Physical problem in reading from or writing to database files • File was accidentally erased Media Failure Media failure involves a physical problem when reading from or writing to a file that is necessary for the database to operate. Media failure is the most serious type of failure because it usually requires DBA intervention. Common Types of Media-Related Problems • The disk drive that held one of the database files experienced a head crash. • There is a physical problem reading from or writing to the files needed for normal database operation. • A file was accidentally erased.
  • 17. Resolutions for Media Failures • The recovery strategy depends on which backup method was chosen and which files are affected. • If available, apply archived redo log files to recover data committed since the last backup. Media Failure Resolution A tested recovery strategy is the key component to resolving media failure problems. The ability of the DBA to minimize down time and data loss as a result of media failure depends on the type of backups that are available. A recovery strategy, therefore, depends on the following: • The backup method you choose and which files are affected. • Whether the database operates in ARCHIVELOG mode. If archiving is used, you can apply archived redo log files to recover data committed since the last backup.
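To make the dependence on archiving concrete, here is a hedged outline of recovering a single lost data file while the rest of the database stays open. It assumes the database runs in ARCHIVELOG mode, that the damaged file does not belong to the SYSTEM tablespace, and that a backup copy of the file exists; the file path is a placeholder, and the restore itself is an operating system copy that is not shown.

```sql
-- Check whether redo logs are being archived.
SELECT log_mode FROM v$database;

-- Identify data files that need media recovery and why.
SELECT file#, error, change# FROM v$recover_file;

-- Take the damaged file offline (path is a placeholder).
ALTER DATABASE DATAFILE '/u01/oradata/orcl/users01.dbf' OFFLINE;

-- ... restore the file from backup with operating system commands ...

-- SQL*Plus command: apply archived and online redo to the restored file.
RECOVER DATAFILE '/u01/oradata/orcl/users01.dbf';

-- Bring the recovered file back online.
ALTER DATABASE DATAFILE '/u01/oradata/orcl/users01.dbf' ONLINE;
```

In NOARCHIVELOG mode this sequence is not available: without archived redo, changes made since the last backup generally cannot be reapplied, so the usual option is to restore the whole database from the last consistent backup.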
  • 18. Defining a Backup and Recovery Strategy • Business requirements • Operational requirements • Technical considerations • Management concurrence Questions for the DBA Whatever backup strategy you choose, it is important to obtain agreement from all appropriate levels of management. For example, if your company wants to avoid taking physical image copies of the files to minimize the usage of disk space, management must be aware of the ramifications of this decision. Here are some questions to consider when selecting a backup strategy: • Given the expectation of system availability, does management understand the tradeoffs of the backup strategy that is chosen? • Are there dedicated resources available which will be needed to ensure a successful backup and recovery strategy? • Is the importance of taking backups and preparing recovery procedures clearly understood? Performing a thorough analysis of the business, operational, and technical requirements provides management with the information needed to support an effective backup and recovery strategy.
  • 19. Business Requirements • Mean-Time-To-Recover • Mean-Time-Between-Failure • Evolutionary process Business Impact You should understand the impact that down time has on the business. Management must quantify the cost of down time and the loss of data and compare this with the cost of reducing down time and minimizing data loss. MTTR Database availability is a key issue for a DBA. In the event of a failure the DBA should strive to reduce the Mean-Time-To-Recover (MTTR). This strategy ensures that the database is unavailable for the shortest possible amount of time. Anticipating the types of failures that can occur and using effective recovery strategies, the DBA can ultimately reduce the MTTR. MTBF Protecting the database against various types of failures is also a key DBA task. To do this, a DBA must increase the Mean-Time-Between-Failures (MTBF). The DBA must understand the backup and recovery structures within an Oracle database environment and configure the database so that failures do not often occur. Evolutionary Process A backup and recovery strategy evolves as business, operational, and technical requirements change. It is important that both the DBA and appropriate management review the validity of a backup and recovery strategy on a regular basis.
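One way to tie an MTTR goal to the database itself, not covered in this lesson and shown here only as an assumption-laden sketch, is the Oracle9i FAST_START_MTTR_TARGET parameter, which sets a target for instance recovery time in seconds. The value 300 is an arbitrary example. The parameter addresses instance recovery only, not media recovery.

```sql
-- Ask the instance to aim for roughly 5 minutes of instance recovery time.
ALTER SYSTEM SET fast_start_mttr_target = 300;

-- Compare the effective target with the current estimate (both in seconds).
SELECT target_mttr, estimated_mttr FROM v$instance_recovery;
```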
  • 20. Operational Requirements • 24-hour operations • Testing and validating backups • Database volatility 24-Hour Operations Backups and recoveries are always affected by the type of business operation that you provide, particularly in a situation where a database must be available 24 hours a day, 7 days a week for continuous operation. Proper database configuration is necessary to support these operational requirements because they directly affect the technical aspects of the database environment. Testing Backups DBAs can ensure that they have a strategy that enables them to decrease the MTTR and increase the MTBF by having a plan in place to test the validity of backups regularly. A recovery is only as good as the backups that are available. Here are some questions to consider when selecting a backup strategy: • Can you depend on system administrators, vendors, backup DBAs, and other critical personnel when you need help? • Can you test your backup and recovery strategies at frequently scheduled intervals? • Are backup copies stored at an off-site location? • Is a plan well documented and maintained?
  • 21. Database Volatility Other issues that impact operational requirements include the volatility of the data and structure of the database. Here are some questions to consider when selecting a backup strategy: • Are tables frequently updated? • Is data highly volatile? If so, you must perform backups more frequently than a business where data is relatively static. • Does the structure of the database change often? • How often do you add data files?
  • 22. Technical Considerations • Resources: hardware, software, manpower, and time • Physical image copies of the operating system files • Logical copies of the objects in the database • Database configuration • Transaction volume which affects desired frequency of backups Physical Image Copies Certain technical requirements are dictated by the types of backups that are required. For example, if physical image copies of data files are required, this may significantly impact available storage space. Logical Copies Creating logical copies of objects in the database may not have as significant storage requirements as physical image copies; however, system resources may be affected because logical copies are performed while the database is being accessed by users. Database Configuration Database configuration affects how backups are performed and the availability of the database. Depending on the database configuration, system resources, such as disk space required to support a backup and recovery strategy, may be limited.
  • 23. Transaction Volume Transaction volume also affects system resources. If 24-hour operations require regular backups, the load on system resources is increased. Technical Requirements Here are some questions to consider when selecting a backup strategy: • How much data do you have? • Do you have the machine power and capacity to support backups? • Is the data easily recreated? • Can you reload the data into the database from a flat file? • Does the database configuration support resiliency to different types of failures?
  • 24. Disaster Recovery Issues • How will your business be affected in the event of a major disaster? - Earthquake, flood, or fire - Complete loss of machine - Malfunction of storage hardware or software - Loss of key personnel, such as the database administrator • Do you have a plan for testing your strategy periodically? Natural Disaster Perhaps your data is so important that you must ensure resiliency even in the event of a complete system failure. Natural disasters and other issues can affect the availability of your data and must be considered when creating a disaster recovery plan. Here are some questions to consider when selecting a backup and recovery strategy: • What will happen to your business in the event of a serious disaster such as: - Flood, fire, earthquake, or hurricane - Malfunction of storage hardware or software • If your database server fails, will your business be able to operate during the hours, days, or even weeks it might take to get a new hardware system? • Do you store backups at an off-site location?
  • 25. Solutions • Off-site backups • Oracle9i Data Guard, which protects critical data by automating the creation, management, and monitoring aspects of a standby database environment • Geomirroring • Messaging • TP monitors Loss of Key Personnel In terms of key personnel, consider the following questions: • How will a loss of personnel affect your business? • If your DBA leaves the company or is unable to work, will you be able to continue to run the database system? • Who will handle a recovery situation if the DBA is unavailable?
  • 26. Summary In this lesson, you should have learned how to: • Evaluate potential failures in your environment • Develop a strategy dictated by business, operational, and technical requirements • Consider a test plan for a backup and recovery strategy