Appendix G: Business Continuity Plan Components
An enterprise-wide business continuity plan (BCP) should be developed to prevent the interruption of normal operations and to allow for the resumption of business processes in a timely manner. In addition, a comprehensive BCP should provide guidelines for emergency responses, extended back-up operations, and post-disaster recovery. All financial institutions are required to establish a comprehensive BCP regardless of whether they process their work internally or outsource their processing to a service provider. If a financial institution uses a service provider to process its daily transactions, management should ensure that it has incorporated applicable guidelines from the vendor's BCP into the financial institution's plan. The guidelines in this appendix will address the components that should be implemented as part of the business continuity planning process to ensure an effective BCP.
Defining the Business Continuity Strategy
The business continuity strategy represents a critical aspect of the BCP and is derived from the information collected during the business impact analysis (BIA) process. The following components should be considered when defining the business continuity strategy and developing the BCP:
- Technology issues;
- Electronic payment systems;
- Liquidity concerns;
- Financial disbursement;
- Manual operations; and
- Other considerations.
When developing the continuity strategy, consideration should be given to both short-term and long-term goals and objectives.
Short-term goals and objectives may include:
- Critical personnel, facilities, computer systems, operations, and equipment;
- Priorities for processing, recovery, and mitigation;
- Maximum downtime before recovery of operations; and
- Minimum resources required for recovery.
Long-term goals and objectives may include:
- Management's enterprise-wide strategic plan;
- Coordination of personnel and activities;
- Budgetary considerations; and
- Supervision of third-party resources.
Human resources represent one of the most critical BCP components, and often, personnel issues are not fully integrated into the enterprise-wide plan. Based on the BIA, the BCP should assign responsibilities to management, specific personnel, teams, and service providers. The planning group should comprise representatives from all departments or organizational units, and the BCP should be prepared by the individuals responsible for carrying out the assigned tasks. In addition, the plan should specifically identify the integral personnel that are needed for successful implementation of the BCP, and succession plans should assign responsibilities to back-up personnel in the event integral employees are not available. Additionally, vendor support needs should be identified. The BCP should address:
- How will management prepare employees for a disaster, reduce the overall risks, and shorten the recovery window?
- How will decision-making succession be determined in the event management personnel are unavailable?
- How will management continue operations if employees are unable or unwilling to return to work due to personal losses, closed roads, or unavailable transportation?
- How will management contact employees in the event personnel are required to evacuate to another area during non-business hours?
- Will the financial institution have the resources necessary to transport personnel to an offsite facility that is located a significant distance from their residence?
- Who will be responsible for contacting employees and directing them to their alternate locations?
- Who will be responsible for leading the various BCP Teams (e.g., Crisis/Emergency, Recovery, Technology, Communications, Facilities, Human Resources, Business Units and Processes, Customer Service)?
- Who will be the primary contact with critical vendors, suppliers, and service providers?
- Who will be responsible for security (information and physical)?
One of the first things that many financial institutions realize during a disaster is that recovery cannot take place without adequate personnel. Recovery efforts are typically more successful when management attempts to solicit and meet the immediate needs of their employees. Ideally, advance plans should be established regarding living arrangements for displaced employees and their families, such as securing blocks of hotel rooms or maintaining rental contracts for small homes, within and outside the local area. If an emergency lodging program is offered by the financial institution, management should be aware of the business needs of each employee to ensure that proper communication channels and alternative telecommunications options are available, particularly if employees are required to work at their hotel or at an alternate location.
Management should plan for basic necessities and services for its staff members who have been displaced during a disaster. If possible, management should establish plans to obtain water, food, clothing, child care, medical supplies, and transportation prior to the disruptive event. On-site medical support, mobile command centers, and access to company vehicles and other modes of transportation should also be provided, if available. Management's efforts to maintain good employee relations will likely contribute to the commitment and loyalty of financial institution personnel and their desire to assist with the timely recovery of operations.
Since personnel are critical to the recovery of the financial institution, business continuity training should be an integral part of the BCP. During a disaster, a well-trained staff is more likely to remain calm, realize the potential threats that may affect the financial institution, and safely implement required procedures without endangering their lives or the lives of others. A comprehensive training program should be developed for all employees, conducted at least annually, and kept up-to-date to ensure that everyone understands their current role in the overall recovery process. In addition, an audit trail should be maintained to document management's training efforts.
Cross Training and Succession Planning
Cross-training of personnel and succession planning are also important elements of the business continuity planning process. Management should cross-train employees throughout the organization and assign back-up personnel for key operational positions. The financial institution should also plan to shift employees to other corporate sites, branches, back-up locations, or service provider facilities outside of the disaster area and prior to the development of transportation problems, if possible.
To ensure adequate staffing at the alternate site, financial institutions may decide to locate staff at the back-up facility on a permanent basis or hire employees who live outside the primary business area and closer to the alternate facility. If employees are unable to return to work, management may use formal agreements with temporary agencies and headhunting services to provide temporary staffing solutions.
BCP Team Assignments
Planning should also consider human resources necessary for decision making and staffing at alternate facilities under various scenarios. Typically, a recovery team is established to perform this function, and their primary responsibility is to recover predefined critical business functions at the alternate back-up site. They will be responsible for retrieving materials from the off-site storage location, such as data files, supplies, equipment, and software. Once these materials have been obtained, the recovery team will install the necessary hardware, software, telecommunications equipment, and data files required for recovery.
Key personnel should also be identified to make decisions regarding the renovation or rebuilding of the primary facility after the immediate disaster has ended. These tasks usually require personnel beyond what is necessary for ongoing business continuity efforts. Personnel responsible for returning the primary facility to normal operations are usually assigned to a salvage team, which should be separate from the recovery team. The salvage team must be certain that all pending danger is over and that employees can safely return to the primary facility. Once personal security is ascertained, the salvage team will be responsible for supervising the retrieval and cleaning of equipment, the removal of debris, and the recovery of spoiled media and reports. The salvage team is also given the authority to resume normal operations at the primary facility, which is a significant task since numerous areas must be closely reviewed to ensure that operations will function properly.
Once the salvage team approves the resumption of normal operations, the recovery team is assigned the responsibility of returning production to the primary facility. However, before restoration tasks can be performed and employees return to the primary facility, the salvage team should perform an inventory of all property and ensure that the on-site investigation is complete. The BCP should address guidelines for transferring operations from the back-up site to the primary facility with minimum disruption. In addition, records should be maintained detailing associated costs and property valuations for documenting budgetary changes, general ledger records, and insurance claims.
Finally, the business continuity planning coordinator or planning committee should be given responsibility for regularly conducting employee awareness training and performing annual tests of the BCP. In addition, the BCP should be updated at least annually, and more frequently after significant changes to business operations or if training and testing reveal gaps in the policy guidelines.
Communication is a critical aspect of a BCP and should include communication with employees, emergency personnel, regulators, vendors/suppliers (detailed contact information), customers (notification procedures), and the media (designated media spokesperson). Alternate telecommunications capabilities should be implemented to prevent any single point of failure that could disrupt operations. Policy guidelines should also address alternate methods of telecommunications in the event primary providers are unable to supply necessary services, and regular audits should confirm the adequacy of these diverse systems.
Communicating With Employees
One of the most important activities of business continuity planning involves communicating with employees. Employees should be promptly notified of a pending disaster, and specific evacuation instructions should be provided and included in the BCP. Management must be able to communicate with personnel located in isolated areas or dispersed across multiple locations, and management should be aware of each employee's evacuation plans to ensure that they can be contacted in a timely manner during a disaster. While manually dialed telephone call trees may be a viable communication tool in some instances, emergency notification systems should be evaluated to determine their cost effectiveness. With either method, management should ensure that contact information is current and easily accessible. Synchronization with human resource departments and company mail systems may prove helpful in maintaining the currency of contact information. Employee notification solutions may also include the following:
- An in-bound hotline number for employees to retrieve up-to-date voice messages from any location or a website accessible only by employees that provides important information regarding the operational status of the financial institution and contact numbers for financial institution personnel;
- A two-way polling phone system that confirms all employees have been contacted, with confirmed delivery of messages;
- Remote access provided to employees through the use of laptops, software, and Internet-based solutions by utilizing dial-up connections, cable modems, virtual private networks (VPNs), integrated services digital networks (ISDNs), digital subscriber lines (DSLs), or wireless capabilities;
- Ultra forward service, which allows incoming calls to be rerouted to a pre-determined alternate location;
- Custom redirect service, which allows management to determine where incoming calls are answered and redirect calls to various locations or pre-established phone numbers;
- Provisioning local phone services to one office from two different telecommunications provider locations to provide phone system redundancy; and
- Adding a back-up Internet Service Provider (ISP) and balancing the traffic between the two ISPs over separate communication paths.
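As an illustration only, the two-way polling approach described in the list above can be sketched in Python. The `Employee` record, the `send` callable, and the `max_rounds` limit are hypothetical names introduced for this example; an actual emergency notification system would supply its own interface for placing calls and recording acknowledgments.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    phones: list[str]        # contact numbers, most preferred first
    confirmed: bool = False  # set once the employee acknowledges


def poll_employees(employees, send, max_rounds=3):
    """Contact each unconfirmed employee until confirmed or rounds run out.

    `send(name, phone)` is a hypothetical callable that returns True when
    the employee acknowledges the message on that number.
    Returns the names of employees who never confirmed.
    """
    for _ in range(max_rounds):
        pending = [e for e in employees if not e.confirmed]
        if not pending:
            break
        for emp in pending:
            # Try each number in order and stop at the first acknowledgment.
            for phone in emp.phones:
                if send(emp.name, phone):
                    emp.confirmed = True
                    break
    return [e.name for e in employees if not e.confirmed]
```

The returned list gives management the names that still require follow-up by another method, such as a manually dialed call tree.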
Interfacing With External Groups
Financial institutions often forget about the need to include BCP guidelines regarding their interaction with external groups such as local and state municipal employees and city officials. Management should implement BCP guidelines addressing escalation procedures and include contact information for communicating with these various groups. Consideration should be given to the proximity of the financial institution to police, fire, and medical facilities, and the timeliness of their response should be factored into BCP recovery strategies.
Given the importance of the on-going operation of the financial system, financial institutions should be able to communicate with their industry counterparts. Current contact information should be maintained and should be easily accessible to facilitate conference calls and meetings between financial sector trade associations, financial authority working groups, emergency response groups, and international exchange organizations. These groups should assess the potential impact of major operational disruptions, coordinate recovery efforts, and promptly respond to failures in critical communication systems.
A significant part of any BCP and related test plan should involve dealing with the media. When a disruptive event occurs that could affect the financial institution's ability to continue operations, the public must be informed. Before a disaster strikes, management should prepare a response that has been approved by the board and the shareholders. In addition, employees should be instructed to refer any questions to the financial institution's media contact. The chosen spokesperson should be adequately informed and credible, have strong communication skills, and be accessible to the media so that inaccurate information is not broadcast to the public, which could potentially harm the reputation of the financial institution. Only confirmed information should be provided, and the spokesperson should discuss what the financial institution is doing to mitigate any potential threats. In order to ease customers' concerns regarding the security of their deposit funds, it is a good idea to conduct regular media briefings until the emergency has ended.
The technology issues that should be addressed in an effective BCP include:
- Hardware - mainframe, mid-range, servers, network, end-user;
- Software - applications, operating systems, utilities;
- Communications (network and telecommunications);
- Data files and vital records;
- Operations processing equipment; and
- Office equipment.
These technology issues play a critical role in the recovery process; therefore, comprehensive inventories should be maintained to ensure that all applicable components are considered during plan development. Planning should include identifying critical business unit data that may only reside on individual workstations, which may or may not adhere to proper back-up schedules. Additionally, the plan should address vital records, necessary back-up methods, and appropriate back-up schedules for these records.
The BCP team or coordinator should also identify and document end-user requirements. For example, employees may be able to work on a stand-alone personal computer (PC) to complete most of their daily tasks, but they may require a network connection to fulfill other critical duties. Consequently, management should consider providing employees with laptops and remote access capabilities using software or a VPN connection.
When developing the BCP, institutions should exercise caution when identifying non-critical assets. An institution's telephone banking, Internet banking, or automated teller machine (ATM) systems may not seem mission critical when systems are operating normally. However, these systems may play a critical role in the BCP and be a primary delivery channel to service customers during a disruption. Similarly, an institution's electronic mail system may not appear to be mission critical, but may be the only system available for employee or external communication in the event of a disruption.
Data Center Recovery Alternatives
Financial institutions should make formal arrangements for alternate processing capability in the event their data processing site becomes inoperable or inaccessible. The type of recovery alternative selected will vary depending on the criticality of the processes being recovered and the recovery time objectives (RTOs). For example, financial industry participants whose operations are critical to the functioning of the overall financial system and other financial industry participants should establish high recovery objectives, such as same-day business resumption. Conversely, less stringent recovery objectives may be acceptable for other entities. Considerations such as the increased risk of failed transactions, liquidity concerns, solvency, and reputation risks should be factored into the decision making process. The scope of the recovery plan should address alternate measures for core operations, facilities, infrastructure systems, suppliers, utilities, interdependent business partners, and key personnel. Recovery plan alternatives may take several forms and involve the use of another data center or a third-party service provider. A legal contract or agreement should evidence recovery arrangements with a third-party vendor. The following are acceptable alternatives for data center recovery. However, institutions will be expected to describe their reasons for choosing a particular alternative and why it is adequate based on their size and complexity.
- Hot Site (traditional "active/back-up" model)-A hot site is fully configured with compatible computer equipment and typically can be operational within several hours. Financial institutions may rely on a service provider for back-up facilities. The traditional active/back-up model requires relocating at least core employees to the alternative site. This model also requires data files to be transferred off-site on at least a daily basis. Large institutions that operate critical real-time processing operations or critical high-volume processing activities should consider mirroring or vaulting their data to the alternate site on a continuous basis using either synchronous or asynchronous data replication. If an institution is relying on a third party to provide the hot site, there remains a risk that the capacity at the service provider may not be able to support their operations in the event of a regional or large-scale event. In addition, there are also security concerns when using a hot site since the applications may contain production data. Consequently, management should ensure that the same security controls that are required at the primary site are also replicated at the hot site. Smaller, less complex institutions may contract for a "mobile hot site," i.e., a trailer outfitted with the necessary computer hardware that is towed to a predetermined location in the event of a disruption and connected to a power source.
- Duplicate Facilities/Split Operations ("active/active" model)-Under this scenario, two or more separate, active sites provide inherent back-up to one another. Each site has the capacity to absorb some or all of the work of the other site for an extended period of time. This strategy can provide almost immediate resumption capacity, depending on the systems used to support the operations and the operating capacity at each site. The maintenance of excess capacity at each site and added operating complexity can have significant costs. Even using the "active/active" model, current technological limitations preclude wide geographic diversity of data centers that use real-time, synchronous data mirroring back-up technologies. Other alternatives beyond synchronous mirroring are available to allow for greater distance separation; however, there is a risk that a small amount of transaction data may be lost in transit between the primary and alternate centers at the moment of the business disruption. Depending on the type of lost data and the cost of identifying and reprocessing it, the risk of losing a small amount of data in transit may be overshadowed by the ability to restore the institution to full business service in a short amount of time. This trade-off is not a technology decision; it is a business decision.
- Warm Site-Warm sites provide resumption capacity somewhere between that of a hot and cold site. The facility will be equipped with electricity; heating, ventilation, and air conditioning systems; computers; and external communication links. However, applications may not be installed, and there may be a limited number of available workstations. Therefore, management will need to deliver workstations for remote processing, and production data will need to be restored from back-up media. This recovery option is less costly, more flexible, and requires fewer resources to maintain than a hot site. Conversely, it will take longer to begin processing at the warm site and recover operations. However, if critical transaction processing is not required, this alternative may be acceptable.
- Cold Site-Cold sites are locations that are part of a longer-term recovery strategy. A cold site provides a back-up location without equipment, but with power, air conditioning, heat, electrical, network and telephone wiring, and raised flooring. An example of a situation when a cold site can be a viable alternative is when a financial institution has recovered at another location, such as a hot site, but needs a longer-term location while their data center is being rebuilt. Institutions may rely on the services of a third party to provide cold site facilities or may house such a facility at another location, such as a branch or other operations center. A variation of this recovery option is the rolling/mobile back-up site, which provides the same facility arrangements, but with mobility advantages. While cold sites represent a low cost solution, they typically can take up to several weeks to activate. Therefore, this type of facility is usually not considered an adequate primary recovery option because of the time it takes to start production and resume operations. In addition, it is difficult to perform a recovery test using this type of facility since parallel processing would take a great deal of time and effort to complete.
- Tertiary Location-Some financial institutions have identified the need to have a third location or a "back-up to the back-up." These tertiary locations provide an extra level of protection in the event neither the primary location nor the secondary location is available. Moreover, a tertiary location becomes the primary back-up location in the event the institution has declared a disaster and is operating out of its contingency or secondary site.
- Multiple Centers or Dual Sites-Multiple centers distribute processing among various facilities for redundancy. These facilities could be owned by one entity or represent a reciprocal agreement with other financial institutions or businesses. The cost of this recovery option is predictable and allows for resource sharing among the various facilities; however, if the facilities are not geographically distributed in different locations, an area-wide disaster could render all of the sites useless. In addition, this type of facility could be more difficult to manage and administer. Management should also understand that implementing a reciprocal agreement might not always provide an optimal back-up solution due to limited excess capacity.
- Service Bureaus-Financial institutions may contract with a service bureau to provide full processing capabilities. This recovery option will provide immediate availability, testing opportunities, and the possibility of additional services provided. Conversely, the disadvantage of this option is the associated costs and the likelihood of strained resources during an area-wide disaster.
- In-house or Vendor Supplied Hardware-This recovery option provides the supply of needed hardware to replace damaged equipment either through internal means or by contracting with an outside supplier to provide critical components using overnight delivery services. Depending on the amount of damaged equipment and the complexity of the damaged systems, this recovery option may be similar to a cold site and take several days or weeks to implement.
- Prefabricated Building-Financial institutions may contract for the construction of a prefabricated building at a predefined location to house back-up processing functions. While this alternative is not considered an adequate recovery option by itself, it may be considered an acceptable solution when used as a redundant or dual site recovery option or in combination with subscription services that provide immediate availability.
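The relationship between RTOs and the choice of recovery alternative can be illustrated with a short Python sketch. The activation times below are assumptions chosen only to reflect the qualitative ordering given in the list above (active/active is almost immediate, a hot site takes several hours, a warm site falls between hot and cold, and a cold site can take several weeks); an institution would substitute figures from its own contracts and tests.

```python
# Assumed activation times in hours; illustrative values, not guidance.
ASSUMED_ACTIVATION_HOURS = {
    "active/active": 0.5,
    "hot site": 8,
    "warm site": 72,
    "cold site": 21 * 24,
}


def alternatives_meeting_rto(rto_hours, activation_hours=ASSUMED_ACTIVATION_HOURS):
    """Return the alternatives whose assumed activation time fits the RTO.

    Results are ordered slowest first, since slower alternatives are
    generally described as less costly to maintain.
    """
    fits = [
        (hours, name)
        for name, hours in activation_hours.items()
        if hours <= rto_hours
    ]
    return [name for hours, name in sorted(fits, reverse=True)]
```

For example, under these assumed figures a 24-hour RTO leaves only the hot site and active/active alternatives. Management would still need to weigh cost, capacity, and security before selecting one.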
Some financial institutions enter into agreements, commonly referred to as "Reciprocal Agreements," with other institutions to provide equipment back-up. This arrangement is usually made on a best effort basis, whereby institution "A" promises to serve as a back-up for institution "B" as long as institution "A" has time available, and vice versa. In most cases, reciprocal agreements are unacceptable because the institution agreeing to provide back-up has insufficient excess capacity to enable the affected institution to process its transactions in a timely manner. If an institution chooses to enter into a reciprocal agreement and can establish that such an arrangement will provide an acceptable level of back-up, the agencies expect such an agreement to be in writing and to obligate institution "A" to make available sufficient processing capacity and time. The agreement should also specify that each institution would be notified if the other institution implements equipment and software changes, and provisions should be included addressing each institution's right to conduct annual tests at the reciprocal site.
Back-up Recovery Facilities
The recovery site should be tested at least annually and when equipment or application software is changed to ensure continued compatibility. Additionally, the recovery facility should exhibit a greater level of security protection than the primary operations site since the people and systems controlling access to the recovery site will not be as familiar with the relocated personnel using it. This security should include physical and logical access controls to the site as well as the computer systems. Further, the BCP and recovery procedures should be maintained at the alternative and off-site storage locations.
Regardless of which recovery strategy is utilized, the recovery plan should address how any backlog of activity or lost transactions will be recovered. The plan should identify how transaction records will be brought current from the time of the disaster and the expected recovery timeframes.
The back-up site should mirror operational functionality. Consequently, duplicate check processing, imaging services, ATMs, telephone banking platforms, call centers, commercial cash management services, and electronic funds transfer systems should be duplicated for immediate activation at the back-up site.
Alternative workspace capacity is just as important as alternative data processing capabilities. Management should arrange for workspace facilities and equipment for employees to conduct ongoing business functions.
When determining the physical location of an alternate processing site, management should consider geographic diversity. In addition, alternate sites should not rely on the same critical infrastructure system that provides utility services such as electricity, telecommunications, transportation, and water. While geographic diversity is important for all financial institutions, this is a particularly important factor for financial industry participants whose rapid recovery is critical to the financial industry. Financial institutions should consider the geographic scope of disruptions and the implications of a citywide or regional disruption. The distance between primary and back-up locations should consider RTOs and business unit requirements. Locating a back-up site too close to the primary site may not insulate it sufficiently from a regional disaster. Alternatively, locating the back-up site too far away may make it difficult to relocate the staff necessary to operate the site. If relocation of staff is necessary to resume business operations at the alternate site, consideration should be given to their willingness to travel, the modes of transportation available, and if applicable, lodging and living expenses for employees that relocate. When evaluating the locations of alternate processing sites, it is also important to subject the secondary sites to a threat scenario analysis.
Back-up and Storage Strategies
Institution management should base software and data file back-up decisions on the criticality of the software and data files to the financial institution's operations. In establishing back-up priorities, management should consider all types of information and the potential impact from the loss of such files. This includes financial, regulatory, and administrative information, and operating, application, and security software. In assigning back-up priority, management should perform a risk assessment that addresses whether:
- The loss of these files would significantly impair the institution's operations;
- The files are being used to manage corporate assets or to make decisions regarding their use;
- The files contain updated security and operating system configurations that would be necessary to resume operations in a secure manner;
- The loss of the files would result in lost revenue; and
- Any inaccuracy or data loss would result in significant impact on the institution (including reputation) or its customers.
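One possible way to record the risk assessment above is a simple checklist score, sketched here in Python. The criterion names and the equal weighting of each criterion are assumptions made for illustration; management may reasonably weight the criteria differently.

```python
# Hypothetical identifiers for the five risk assessment criteria above.
CRITERIA = (
    "impairs_operations",
    "manages_corporate_assets",
    "holds_security_or_os_config",
    "causes_lost_revenue",
    "significant_impact_if_inaccurate",
)


def backup_priority(answers):
    """Count the criteria answered True; a higher count means higher priority."""
    unknown = set(answers) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(bool(answers.get(c, False)) for c in CRITERIA)


def rank_files(assessments):
    """Sort file names by descending priority score, ties broken by name."""
    return sorted(
        assessments,
        key=lambda name: (-backup_priority(assessments[name]), name),
    )
```

A file that meets more of the criteria sorts ahead of one that meets fewer, which gives a starting order for assigning back-up frequency and method.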
The frequency of file back-up also depends on the criticality of the application and data. Critical data should be backed up using the multiple generation (i.e., "grandfather-father-son") method and rotated to an off-site location at least daily. Online/real-time or high volume systems may necessitate more aggressive back-up methods such as electronic vaulting, remote journaling, disk shadowing or data mirroring, hierarchical storage management (HSM), storage area network (SAN), or network-attached storage (NAS) to ensure appropriate back-up of operations.
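The multiple generation method can be sketched as a rule that labels each back-up date. The schedule assumed here (a weekly "father" taken each Friday, a monthly "grandfather" taken on the last Friday of the month, and a "son" on every other day) is one common convention and is not prescribed by this appendix.

```python
from datetime import date, timedelta


def gfs_generation(day: date) -> str:
    """Label a back-up date under an assumed Friday-based GFS schedule."""
    if day.weekday() != 4:  # Monday is 0, so Friday is 4
        return "son"
    # A Friday is the last of its month when the next Friday falls in
    # a different month.
    if (day + timedelta(days=7)).month != day.month:
        return "grandfather"
    return "father"
```

Under this convention, retention periods would typically lengthen with each generation, so that daily copies are reused soonest and monthly copies are kept longest.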
Electronic vaulting represents a batch process that periodically transfers copies of modified files to an offsite back-up location. Conversely, remote journaling refers to the real-time transfer of transaction logs or journals to a remote location. These logs and journals are used to recover transaction and database changes since the most recent back-up. As a result, this back-up process allows the alternate site to be fully operational at all times. Disk shadowing or data mirroring uses two separate disks or multiple servers, to which either data images or identical information is written simultaneously. These back-up processes ensure data redundancy and the availability of duplicate disks or hardware.
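The way journals are used to bring records current from the most recent back-up can be shown with a minimal Python sketch. The data layout is an assumption for illustration: balances are held as integer amounts per account, and each journal entry carries a sequence number so that entries already reflected in the back-up can be skipped.

```python
def recover(snapshot, snapshot_seq, journal):
    """Rebuild balances from a back-up plus journaled transactions.

    snapshot:     dict of account -> balance as of the most recent back-up
    snapshot_seq: sequence number of the last transaction in the snapshot
    journal:      iterable of (seq, account, delta) held at the remote site
    """
    balances = dict(snapshot)  # copy so the back-up itself is not modified
    for seq, account, delta in sorted(journal):
        if seq <= snapshot_seq:
            continue  # already reflected in the back-up
        balances[account] = balances.get(account, 0) + delta
    return balances
```

Any transaction that had not reached the remote journal at the moment of the disruption would be missing from the result, which is why the recovery plan should also address how lost transactions will be identified and reprocessed.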
Additional back-up options include HSM, SAN, and NAS. HSM uses optical disks, magnetic disks, or tapes to dynamically manage the back-up and retrieval of files to devices that vary in speed and cost. For example, the faster devices or media are used to hold the information that will be accessed more frequently, and the files that are not needed as often are stored on the slower devices or media. SAN represents several storage systems that are connected to form a single back-up network. This back-up option provides the ability for several devices to communicate with each other and with the various storage devices, which prevents dependence on a single connection. NAS systems usually contain one or more hard disks that are arranged into logical, redundant storage containers, much like traditional file servers. NAS provides readily available storage resources and helps alleviate the bottlenecks associated with access to storage devices. NAS environments are designed to facilitate the movement of data and allow any application or client to use any operating system to send data to or receive data from a NAS device.
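The HSM placement rule described above (frequently accessed files on faster media, rarely accessed files on slower media) could be expressed as a simple age-based policy. The sketch below is illustrative only; the tier ordering, the 30-day and 180-day thresholds, and the `choose_tier` function are assumptions, and real HSM products apply their own policies.

```python
from datetime import date

# Hypothetical tiers, fastest first: (media name, maximum days since last access).
TIERS = [("magnetic_disk", 30), ("optical_disk", 180)]
SLOWEST_TIER = "tape"  # everything older than the last threshold lands here


def choose_tier(last_accessed: date, today: date) -> str:
    """Return the storage tier for a file based on how long it has been idle."""
    idle_days = (today - last_accessed).days
    for media, max_idle_days in TIERS:
        if idle_days <= max_idle_days:
            return media
    return SLOWEST_TIER
```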
Back-up tape storage remains an effective solution for many financial institutions. However, when an institution uses this type of media for its primary back-up storage, back-up tapes should be sent to the off-site storage facility as soon as possible, should not reside at their originating location overnight, should not be returned to the originating location until they are replaced with the current day's back-up tapes, and should be properly secured to prevent damage or unauthorized access. Back-up media, especially tapes, should be periodically tested to ensure that they are still readable. Tapes repeatedly used or subjected to extreme variations in temperature or humidity may become unreadable, in whole or part, over time.
Back-up of operating system software and application programs must be performed whenever they are modified, updated, or changed.
Data File Back-up
One of the most critical components of the back-up process involves the financial institution's data files, regardless of the platform on which the data is located. Institutions must be able to generate a current master file that reflects transactions up to the time of the disruption. Data files should be backed up both onsite and off-site to provide recovery capability. Retention of current data files, or older master files and the transaction files necessary to bring them current, is important so that processing can continue in the event of a disaster or other disruption. The creation and rotation of core processing data file back-up should occur at least daily, more frequently if the volume of processing or online transaction activity warrants. Less critical data files may not need to be backed up as frequently. In either case, back-up data files should be transported off-site in a timely manner and should not be returned to the originating location until new back-up files are off-site. Retaining multiple versions of the back-up files off-site on a "grandfather-father-son" rotating basis is recommended so that if the newest daily incremental files ("sons") are not readable, the weekly full sets ("fathers") are there as the next best alternative, and if the "fathers" are not readable, the end-of-month back-up files ("grandfathers") are available to restore business processes.
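The "grandfather-father-son" rotation and its fallback order can be sketched as below. This is a simplified illustration, not a prescribed schedule: taking the weekly full set on Fridays is an assumption, as are the function names and the `is_readable` callback. The fallback mirrors the text above (sons, then fathers, then grandfathers) and does not model the steps needed to apply incremental files on top of a full set.

```python
import calendar
from datetime import date


def gfs_generation(day: date, weekly_full_weekday: int = 4) -> str:
    """Classify the back-up taken on `day`.

    weekly_full_weekday follows date.weekday() numbering (0 = Monday);
    the default of Friday is an assumption for illustration.
    """
    last_day_of_month = calendar.monthrange(day.year, day.month)[1]
    if day.day == last_day_of_month:
        return "grandfather"  # end-of-month back-up
    if day.weekday() == weekly_full_weekday:
        return "father"       # weekly full set
    return "son"              # daily incremental


def pick_restore_source(candidates, is_readable):
    """Return (generation, backup_id) for the first readable back-up, or None.

    candidates maps a generation name to back-up identifiers, newest first.
    """
    for generation in ("son", "father", "grandfather"):
        for backup_id in candidates.get(generation, []):
            if is_readable(backup_id):
                return generation, backup_id
    return None
```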
Software back-up for all hardware platforms consists of four basic areas: operating system software, application software, utility programs, and databases. An inventory of all software and related documentation should have adequate off-premises storage. Even when using a standard software package from one vendor, the software can vary from one location to another. Differences may include parameter settings and modifications, security profiles, reporting options, account information, or other options chosen by the institution during or subsequent to system implementation. It is also common for financial institutions to request customized software programs from their software vendor. Therefore, a comprehensive back-up of all critical software is essential.
The operating system software should be backed up with at least two copies of the current version. One copy should be stored in the tape and disk library for immediate availability in the event the original is impaired; the other copy should be stored in a secure, off-premises location. Duplicate copies should be tested periodically and recreated whenever there is a change to the operating system.
Application software, which includes both source (if the institution has it in its possession) and object versions of all application programs, should be maintained in the same manner as the operating system software. Back-up copies of the programs should be updated as program changes are made. In the event management does not have the source code in its possession, a software escrow agreement should be established whereby a third party maintains the source code, back-up copies of the compiled code, manuals, and other supporting materials in a secure location. This formal agreement between the financial institution, the software vendor, and the escrow agent allows the financial institution access to the source code if the software vendor goes out of business or is unable to fulfill its contractual obligations. The BCP should identify this issue and applicable audit controls that protect the bank's interest in the source code.
Utility programs are used to assist in the operation of a computer by configuring or maintaining systems, making changes to stored or transmitted data, or compressing data. Utility programs should be maintained in the same manner as operating system software and application software to ensure that back-up copies are readily available when needed.
Databases represent the collection of data that may be stored on any type of computer storage medium. For example, a financial institution may maintain a database on its network file server that contains employee information used for processing payroll. Back-up copies of the database should be maintained off-site, and management should assess the criticality of the database to determine how frequently the database should be backed up.
Given the increased reliance on the distributed processing environment, adequate back-up resources and procedures for local area networks and wide area networks are important. As such, management should ensure that all critical networks and related software and data files are backed up appropriately to ensure timely recovery of operations.
Depending on the size of the financial institution and the nature of anticipated risks and exposures, the time spent backing up data is minimal compared with the time and effort necessary for restoration. Files that can be backed up within a short period of time may require days, weeks, or months to recreate from hardcopy records, assuming hardcopy records are available. Comprehensive and clear procedures are necessary to recover critical networks and systems. Procedures should, at a minimum, include:
- Frequency of update and retention cycles for back-up software and data;
- Periodic review of software and hardware for compatibility with back-up resources;
- Periodic testing of back-up procedures for effectiveness in restoring normal operations;
- Guidelines for the labeling, listing, transportation, and storage of media;
- Maintenance of data file listings, their contents, and locations;
- Hardware, software, and network configuration documentation;
- Controls to minimize the risks involved in the transfer of back-up data, whether by electronic link or through the physical transportation of diskettes and tapes to and from the storage site; and
- Controls to ensure data integrity, client confidentiality, and the physical security of hardcopy output, media, and hardware.
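One way to support the data file listing and data integrity items in the list above is to record a checksum for each back-up file and re-verify it after transfer or during periodic testing. The following is a minimal sketch under stated assumptions: the use of SHA-256, the function names, and the in-memory manifest format are illustrative choices, not requirements drawn from this appendix.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large back-up files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir: Path) -> dict:
    """Map each file's path (relative to backup_dir) to its SHA-256 checksum."""
    return {
        p.relative_to(backup_dir).as_posix(): sha256_of(p)
        for p in sorted(backup_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(backup_dir: Path, manifest: dict) -> list:
    """Return relative paths that are missing or whose contents have changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = backup_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

The manifest itself would need to be stored and protected separately from the media it describes; otherwise a damaged or altered back-up could carry an equally altered manifest.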
The off-site storage location should be environmentally controlled, fire-resistant, and secure, with procedures for restricting physical access to authorized personnel. Management should keep in mind that using a timed vault for off-site storage may present a problem if an unexpected emergency requires immediate retrieval during non-business hours. Consequently, a secure method for storing vault combinations and keys should be established to ensure that off-site storage items are accessible when needed. Financial institutions are discouraged from allowing employees to store back-up data files at their residence due to potential security concerns. Moreover, the off-site premises should be an adequate distance from the computer operations location so that both locations will not be affected by the same event.
In addition to a copy of the BCP, duplicate copies of all necessary procedures, including end-of-day, end-of-month, and end-of-quarter procedures and procedures covering relatively rare and unique issues, should be stored at the off-site locations. Supporting documentation should be kept current as well. For example, most networks change over time as software, service packs, and patches are installed and configurations are altered. Therefore, documentation supporting the current network environment is crucial. Another back-up alternative to consider would be to place the critical information on a secure shared network drive, with the data backed up during regularly scheduled network back-up. However, this shared drive should be in a different physical location that would not be affected by the same disruption. Management needs to maintain a certain level of non-networked (e.g., hardcopy) material in the event the financial institution's or service provider's computer systems are not available for a period of time. For example, a hard copy of current customer information should be maintained at the main facility and at an off-site location to ensure that employees have the information they need to perform manual operations and serve the financial institution's customers.
Reserve supplies, such as forms, manuals, letterhead, etc., should also be maintained in appropriate quantities at an off-site location, and management should maintain a current inventory of what is held in the reserve supply.
The BCP should address site relocation for short-, medium-, and long-term disaster and disruption scenarios. Continuity planning for recovery facilities should consider location, size, capacity (computer and telecommunications), and required amenities necessary to recover the level of service required by the critical business functions. This includes planning for workspace, telephones, workstations, network connectivity, etc. When determining an alternate processing site, management should consider scalability, in the event a long-term disaster becomes a reality.
Because banking is a service industry, one of the most critical components of the BCP involves the physical presence where customers can go to conduct business. Based on past experience during disaster situations, successful sharing of banking facilities with other financial institutions has benefited each bank by providing an operational facility to service customers' needs, establish basic operations during the recovery process, and instill confidence in the financial institution's business continuity efforts. Therefore, management may consider establishing formal agreements with local and out-of-area businesses and financial institutions to use their facilities in the event of a disaster. Alternatively, management may also plan to enlist the assistance of state and local agencies to expedite building permits and inspections for temporary facilities. Close communication with regulatory authorities is imperative to ensure that approval requirements for additional branch facilities are properly followed. In addition, prior notification may expedite the recovery process.
If possible, the plan should include logistical procedures for moving personnel to the recovery location prior to a pending emergency. It is particularly important that recovery team members inspect the site before a disaster strikes to determine what items they will need to transport to the facility to ensure timely recovery of operations. Once the institution returns to its original facility, the BCP should be reassessed to determine if these alternate plans warrant adjustment.
Electronic Payment Systems (EPS)
The BCP should address alternate arrangements in the event EPS, such as ATM systems and electronic funds transfer (EFT) systems, are inoperable. When mainframe systems are down, ATM switches cannot communicate with host systems to validate withdrawal requests. Therefore, management should consider plans for pre-established withdrawal limits based on the institution's relationship with the customer. In addition, the financial institution should prepare for an increase in potential branch traffic when ATM systems are down. Pre-established agreements with various cash delivery services within and outside of the local area should also be considered to ensure that ATMs are adequately stocked with cash to meet potential customer demands when service returns.
BCP guidelines should also address alternate plans for retrieving and transmitting EFTs when payment systems are disrupted. Alternate solutions may include manual procedures for calling in or faxing wire and automated clearinghouse requests to correspondent banks. In addition, web-based systems or third-party software may be used to conduct EFTs.
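A pre-established withdrawal limit of the kind described above might be applied as a stand-in check when the host system cannot be reached. The sketch below is purely illustrative: the relationship tiers, the dollar limits, and the `authorize_offline` function are hypothetical, and each institution would set its own limits and authorization rules.

```python
from decimal import Decimal

# Hypothetical daily stand-in limits keyed by customer relationship tier.
STAND_IN_LIMITS = {
    "new": Decimal("100"),
    "established": Decimal("300"),
    "premier": Decimal("500"),
}


def authorize_offline(tier, requested, already_withdrawn_today):
    """Approve a withdrawal only if the day's total stays within the tier limit.

    Unknown tiers receive a limit of zero, so their requests are declined.
    """
    limit = STAND_IN_LIMITS.get(tier, Decimal("0"))
    return requested > 0 and already_withdrawn_today + requested <= limit
```

Withdrawals approved this way would still need to be logged so that entries can be posted once systems are recovered.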
Management should also ensure that redundant EPS are included at recovery sites for immediate activation, and thorough documentation should be maintained to ensure timely posting of applicable entries when systems are recovered.
Management should ensure that the BCP addresses liquidity and cash concerns, and annual budget projections should include an analysis of potential cash needs to cover emergencies. During a disaster, power and communication systems may fail, requiring the use of cash to purchase supplies and necessary services due to inoperable ATM, debit, and credit card systems. Funding the short-term needs of employees and customers should be considered when determining the amount of cash to have on hand during a disaster. If management is aware of an approaching emergency, cash limits for various locations within and outside the potential disaster area should be assessed to determine how much cash is needed. Management should also establish agreements with cash providers, delivery services, and transportation providers, within and outside trade areas that are subject to a common disaster, to ensure timely delivery of cash. Management should ensure that borrowing lines have been pre-established and funds are readily available during an emergency. Customer notification regarding the security of depositors' funds is also important since a perceived liquidity crisis could evolve if customer confidence is impaired.
Alternate methods of obtaining delivery of the financial institution's cash letter should also be considered since typical transmission methods may be unavailable during an emergency. For example, document imaging systems using remote capture technology may provide an alternative method for the electronic delivery and processing of a financial institution's cash letter.
The BCP should address guidelines regarding purchase authorities beyond approved policy limits and expense reimbursement options for financial institution personnel during a disaster. In addition, management should also consider distributing higher limit credit cards or establishing a separate checking account, which designates individuals who can sign checks in the event of an emergency or who have authorized debit card access that could be utilized to purchase emergency supplies.
Management should determine whether automated tasks could be conducted manually if automated systems are inoperable. For example, if the network, mainframe, or Internet is not functioning, management should determine if employees could fulfill their daily duties using traditional, non-technical procedures. The BCP should provide specific guidelines addressing manual procedures for critical functions, such as back-office operations, loan operations, and customer support. Management should maintain back-up records to ensure that customer account information (account numbers, customer names, addresses, account status, and account balances) is readily accessible during a disaster. The BCP should also address the distribution of hard copy documents, equipment, and supplies, as necessary. The BCP should also include instructions for dealing with customer requests during downtime, keeping track of daily transactions, reconciling general ledger accounts, documenting operational tasks, and posting manual entries after system recovery. Furthermore, to ensure that the institution's staff understands how to perform these manual procedures, the BCP should include employee training and testing guidelines.
Each financial institution is different and processes will vary. However, management should consider how to accomplish the following:
- Prevention and preparedness, including the determination of adequate insurance coverage based upon threats and the resulting loss potential identified in the BIA;
- Awareness programs designed to prepare customers for a disaster, using various methods such as statement stuffers, web postings, and advertisements;
- Reconciliation of recovery times with business unit requirements;
- Disaster declaration and plan implementation processes;
- Understanding of local, state, and federal emergency preparedness requirements and related programs available to manage disasters;
- Recovery progress reports; and
- Regularly reviewing, evaluating, auditing, testing, modifying, and maintaining the BCP based on changes in personnel and their responsibilities, changes in business operations, and gaps identified in the BCP based on test results and audit recommendations.
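The item above on reconciling recovery times with business unit requirements amounts to comparing, function by function, how quickly operations must be restored against how quickly they can be. A minimal sketch follows, assuming both sets of figures are available in hours; the function name, the data layout, and the sample values are hypothetical.

```python
def recovery_gaps(required_hours, achievable_hours):
    """Return {function: shortfall} for functions that miss their requirement.

    A shortfall of None means no achievable recovery time has been
    documented for that function.
    """
    gaps = {}
    for function, required in required_hours.items():
        achievable = achievable_hours.get(function)
        if achievable is None:
            gaps[function] = None
        elif achievable > required:
            gaps[function] = achievable - required
    return gaps


# Illustrative figures only.
required = {"wire transfers": 4, "payroll": 48, "loan operations": 24}
achievable = {"wire transfers": 8, "payroll": 24}
print(recovery_gaps(required, achievable))
```

Functions that appear in the result are candidates for either additional recovery resources or a revised requirement agreed with the business unit.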