An enterprise-wide business continuity testing policy should be established by the board and senior management and should set expectations for business lines and support functions to follow in implementing testing strategies and test plans. The policy should establish a testing cycle that increases in scope and complexity over time. As such, the testing policy should continuously improve by adapting to changes in business conditions and supporting expanded integration testing.
The testing policy should incorporate the use of a BIA and risk assessment for developing enterprise-wide and business line continuity testing strategies. The policy should identify key roles and responsibilities and establish minimum requirements for the institution's business continuity testing, including baseline requirements for the frequency and scope of testing and for the reporting of test results.
Testing policies will vary depending on the size and risk profile of the institution. While all institutions should develop testing policies on an enterprise-wide basis and involve essential employees in the testing process, some considerations differ depending on whether the institution relies on service providers (serviced institutions) or whether it processes its work internally (in-house).
A serviced institution's testing policy should include guidelines addressing tests between the financial institution and its service provider. (Refer to the following guidance included in the FFIEC IT Examination Handbook for additional information: the June 2004 "Outsourcing Technology Services Booklet," in the section entitled "Related Topics," and the June 2004 "Management Booklet," in the section entitled "Management Considerations for Technology Service Providers.") Serviced institutions should test communication and connectivity procedures to be followed when either the financial institution's or service provider's systems, at their primary or alternate sites, are inoperable. Serviced institutions should participate in tests with their critical service providers to ensure that institution employees fully understand the recovery process.
The testing policy for in-house institutions should address the active involvement of personnel when systems and data files are tested. In-house institutions often send their back-up media to a recovery site to be processed by the back-up service provider's employees. This is not a sufficient test of an institution's BCP and is considered ineffective because financial institution employees are not directly involved in the testing process. As a result, the institution cannot verify that tests were conducted properly and institution personnel may not be familiar with recovery procedures and related logistics in the event of a true disaster.
Once an institution develops the testing policy, this policy is typically implemented through the development of testing strategies that include the testing scope and objectives and test planning using various scenarios and testing methods.
The testing policy should include enterprise-wide testing strategies that establish expectations for individual business lines across the testing life cycle of planning, execution, measurement, reporting, and test process improvement. (Business lines include all internal and external supporting functions, such as IT and facilities management.) The testing strategy should include the following:
- Expectations for business lines and support functions to demonstrate the achievement of business continuity test objectives consistent with the BIA and risk assessment;
- A description of the depth and breadth of testing to be accomplished;
- The involvement of staff, technology, and facilities;
- Expectations for testing internal and external interdependencies; and
- An evaluation of the reasonableness of assumptions used in developing the testing strategy.
Testing strategies should include the testing scope and objectives, which clearly define what functions, systems, or processes are going to be tested and what will constitute a successful test. The objective of a testing program is to ensure that the business continuity planning process is accurate, relevant, and viable under adverse conditions. Therefore, the business continuity planning process should be tested at least annually, with more frequent testing required when significant changes have occurred in business operations. Testing should include applications and business functions that were identified during the BIA. The BIA determines the recovery point objectives (RPOs) and recovery time objectives (RTOs), which then help determine the appropriate recovery strategy. Validation of the RPOs and RTOs is important to ensure that they are attainable.
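As an illustration only, and not part of the guidance itself, the validation of RPOs and RTOs described above could be recorded along the following lines. The class names, field names, and sample values are hypothetical; an institution would substitute the objectives set in its own BIA and the measurements taken during its own tests.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjective:
    """Targets set by the BIA for one business function (hypothetical structure)."""
    function: str
    rto: timedelta  # maximum acceptable time to restore the function
    rpo: timedelta  # maximum acceptable window of data loss


@dataclass
class MeasuredRecovery:
    """What was observed for the same function during a test (hypothetical structure)."""
    function: str
    recovery_time: timedelta      # elapsed time until the function was restored
    data_loss_window: timedelta   # age of the most recent data that could be recovered


def unmet_objectives(objective, measured):
    """Return the objectives ("RTO", "RPO") that the test result did not meet."""
    misses = []
    if measured.recovery_time > objective.rto:
        misses.append("RTO")
    if measured.data_loss_window > objective.rpo:
        misses.append("RPO")
    return misses


# Illustrative values only; they are not drawn from the guidance.
wire = RecoveryObjective("wire transfers", rto=timedelta(hours=4), rpo=timedelta(minutes=15))
observed = MeasuredRecovery("wire transfers",
                            recovery_time=timedelta(hours=5),
                            data_loss_window=timedelta(minutes=10))
print(unmet_objectives(wire, observed))  # ['RTO']
```

An empty result indicates that both objectives were met in that test. A non-empty result would suggest either that the recovery strategy needs to be strengthened or that the objective itself should be revisited.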
Testing objectives should start simply, and gradually increase in complexity and scope. The scope of individual tests can be continually expanded to eventually encompass enterprise-wide testing and testing with vendors and key market participants. Achieving the following objectives provides progressive levels of assurance and confidence in the plan. At a minimum, the testing scope and objectives should:
- Not jeopardize normal business operations;
- Gradually increase the complexity, level of participation, functions, and physical locations involved;
- Demonstrate a variety of management and response proficiencies under simulated crisis conditions, progressively involving more resources and participants;
- Uncover inadequacies so that testing procedures can be revised;
- Consider deviating from the test script to interject unplanned events, such as the loss of key individuals or services; and
- Involve a sufficient volume of all types of transactions to ensure adequate capacity and functionality of the recovery facility.
The testing policy should also include test planning, which is based on the predefined testing scope and objectives established as part of management's testing strategies. Test planning includes test plan review procedures and the development of various testing scenarios and methods. Management should evaluate the risks and merits of various types of testing scenarios and develop test plans based on identified recovery needs. Test plans should identify quantifiable measurements of each test objective and should be reviewed prior to the test to ensure they can be implemented as designed. Test scenarios should include a variety of threats, event types, and crisis management situations and should vary from isolated system failures to wide-scale disruptions. Scenarios should also promote testing alternate facilities with the primary and alternate facilities of key counterparties and third-party service providers. Comprehensive test scenarios focus attention on dependencies, both internal and external, between critical business functions, information systems, and networks. (Integrated testing moves beyond the testing of individual components to include testing with internal and external parties and the supporting systems, processes, and resources. Refer to Appendix E: "Interdependencies" and Appendix H: "Testing Program - Governance and Attributes" for additional information.) As such, test plans should include scenarios addressing local and wide-scale disruptions, as appropriate. Business line management should develop scenarios to effectively test internal and external interdependencies, with the assistance of IT staff members who are knowledgeable regarding application data flows and other areas of vulnerability. Institutions should periodically reassess and update their test scenarios to reflect changes in the institution's business and operating environment.
Test plans should clearly communicate the predefined test scope and objectives and provide participants with relevant information, including:
- A master test schedule that encompasses all test objectives;
- Specific description of test objectives and methods;
- Roles and responsibilities for all test participants, including support staff;
- Designation of test participants;
- Test decision makers and succession plans;
- Test locations; and
- Test escalation conditions and test contact information.
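As an illustration only, the items listed above could be tracked in a simple record that flags any element a draft test plan has not yet addressed. The class name, field names, and sample entries below are hypothetical, and the last listed item is split into two fields for convenience.

```python
from dataclasses import dataclass, field, fields


@dataclass
class BcpTestPlan:
    """One field per item in the list above (hypothetical structure)."""
    master_schedule: list = field(default_factory=list)
    objectives_and_methods: list = field(default_factory=list)
    roles_and_responsibilities: dict = field(default_factory=dict)
    participants: list = field(default_factory=list)
    decision_makers_and_succession: list = field(default_factory=list)
    locations: list = field(default_factory=list)
    escalation_conditions: list = field(default_factory=list)
    contact_information: dict = field(default_factory=dict)


def missing_elements(plan):
    """Return the names of plan elements that are still empty, in list order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]


# Illustrative draft plan; the entries are placeholders.
draft = BcpTestPlan(
    master_schedule=["Q1 tabletop exercise", "Q3 functional drill"],
    objectives_and_methods=["Recover item processing at the alternate site"],
    participants=["Operations", "IT"],
    locations=["Alternate processing site"],
)
print(missing_elements(draft))
```

For the draft above, the check reports the roles and responsibilities, decision makers and succession plans, escalation conditions, and contact information as outstanding. A check of this kind only confirms that each element is present; it does not assess whether the content is adequate.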
Test Plan Review
Management should prepare and review a script for each test prior to testing to identify weaknesses that could lead to unsatisfactory or invalid tests. (Refer to Appendix H: "Testing Program - Governance and Attributes" for additional information on test scripts.) As part of the review process, the testing plan should be revised to account for any changes to key personnel, policies, procedures, facilities, equipment, outsourcing relationships, vendors, or other components that affect a critical business function. In addition, as a preliminary step to the testing process, management should perform a thorough review of the BCP (checklist review). A checklist review involves distributing copies of the BCP to the managers of each critical business unit and requesting that they review portions of the plan applicable to their department to ensure that the procedures are comprehensive and complete.
Testing methods can vary from simple to complex depending on the preparation and resources required. Each bears its own characteristics, objectives, and benefits. The type or combination of testing methods employed by a financial institution should be determined by, among other things, the institution's age and experience with business continuity planning, size, complexity, and the nature of its business.
Testing methods include both business recovery and disaster recovery exercises. Business recovery exercises primarily focus on testing business line operations, while disaster recovery exercises focus on testing the continuity of technology components, including systems, networks, applications, and data. To test split processing configurations, in which two or more sites support part of a business line's workload, tests should include the transfer of work among processing sites to demonstrate that alternate sites can effectively support customer-specific requirements and work volumes and site-specific business processes. A comprehensive test should involve processing a full day's work at peak volumes to ensure that equipment capacity is available and that RTOs and RPOs can be achieved.
More rigorous testing methods and greater frequency of testing provide greater confidence in the continuity of business functions. While comprehensive tests do require greater investments of time, resources, and coordination to implement, detailed testing will more accurately depict a true disaster and will assist management in assessing the actual responsiveness of the individuals involved in the recovery process. Furthermore, comprehensive testing of all critical functions and applications will allow management to identify potential problems; therefore, management should use one of the more thorough testing methods discussed in this section to ensure the viability of the BCP before a disaster occurs. Examples of testing methods in order of increasing complexity include:
Tabletop Exercise/Structured Walk-Through Test
A tabletop exercise/structured walk-through test is considered a preliminary step in the overall testing process and may be used as an effective training tool; however, it is not a preferred testing method. Its primary objective is to ensure that critical personnel from all areas are familiar with the BCP and that the plan accurately reflects the financial institution's ability to recover from a disaster. It is characterized by:
- Attendance of business unit management representatives and employees who play a critical role in the BCP process;
- Discussion about each person's responsibilities as defined by the BCP;
- Individual and team training, which includes a walk-through of the step-by-step procedures outlined in the BCP; and
- Clarification and highlighting of critical plan elements, as well as problems noted during testing.
Walk-Through Drill/Simulation Test
A walk-through drill/simulation test is somewhat more involved than a tabletop exercise/structured walk-through test because the participants choose a specific event scenario and apply the BCP to it. However, this test also represents a preliminary step in the overall testing process that may be used for training employees, but it is not a preferred testing methodology. It includes:
- Attendance by all operational and support personnel who are responsible for implementing the BCP procedures;
- Practice and validation of specific functional response capabilities;
- Focus on the demonstration of knowledge and skills, as well as team interaction and decision-making capabilities;
- Role playing with simulated response at alternate locations/facilities to act out critical steps, recognize difficulties, and resolve problems in a non-threatening environment;
- Mobilization of all or some of the crisis management/response team to practice proper coordination without performing actual recovery processing; and
- Varying degrees of actual, as opposed to simulated, notification and resource mobilization to reinforce the content and logic of the plan.
Functional Drill/Parallel Test
Functional drill/parallel testing is the first type of test that involves the actual mobilization of personnel to other sites in an attempt to establish communications and perform actual recovery processing as set forth in the BCP. The goal is to determine whether critical systems can be recovered at the alternate processing site and whether employees can actually deploy the procedures defined in the BCP. It includes:
- A full test of the BCP, which involves all employees;
- Demonstration of emergency management capabilities of several groups practicing a series of interactive functions, such as direction, control, assessment, operations, and planning;
- Testing medical response and warning procedures;
- Actual or simulated response to alternate locations or facilities using actual communications capabilities;
- Mobilization of personnel and resources at varied geographical sites, including evacuation drills in which employees test the evacuation route and procedures for personnel accountability; and
- Varying degrees of actual, as opposed to simulated, notification and resource mobilization in which parallel processing is performed and transactions are compared to production results.
Full-Interruption/Full-Scale Test
A full-interruption/full-scale test is the most comprehensive type of test. In a full-scale test, a real-life emergency is simulated as closely as possible. Therefore, comprehensive planning should be a prerequisite to this type of test to ensure that business operations are not negatively affected. The institution implements all or portions of its BCP by processing data and transactions using back-up media at the recovery site. It involves:
- Enterprise-wide participation and interaction of internal and external management response teams with full involvement of external organizations;
- Validation of crisis response functions;
- Demonstration of knowledge and skills as well as management response and decision-making capability;
- On-the-scene execution of coordination and decision-making roles;
- Actual, as opposed to simulated, notifications, mobilization of resources, and communication of decisions;
- Activities conducted at actual response locations or facilities;
- Actual processing of data using back-up media; and
- Exercises generally extending over a longer period of time to allow issues to fully evolve as they would in a crisis and to allow realistic role-playing of all the involved groups.
Roles and Responsibilities
Execution, Evaluation, Independent Assessment, and Reporting of Test Results