key considerations for business resiliency
Post on 01-Sep-2014
Key Considerations for Business Resiliency
Steve Suther, Product Manager, CISM
March 18, 2010
Agenda
► Business Resiliency, what is it?
– Crisis Management
– Incident Response
– Business Continuance
– Disaster Recovery
► Testing Methods
► “Return To Normal”
► Heterogeneous Approach
► Final Thoughts
Business Resiliency – What is It?
► Consolidation of multiple common elements into a single program
– Command and Control
– Incident Response
– Business Continuance
– Disaster Recovery
► Provides the organization with the ability to deal with business-impacting events in a structured and organized fashion
► Proactive instead of reactive approach
Crisis Management
► Umbrella for all other capabilities
► Comprised of senior leadership and key stakeholders
► Responsible for crisis identification, classification, management, and resolution
► Uses pre-determined scenarios for guideline development
► Utilizes generic and specific guidelines for activities
Crisis Management
► People, processes, procedures, and facilities to identify, analyze, and react appropriately to business-impacting events
► Formulates action plans for pre-determined and unidentified scenarios
► Addresses critical initial 72 hours
– Requires the most advanced and prescriptive planning
Command and Control
Crisis Management
► Identify key leaders and stakeholders and their functional knowledge
– Documentation, responsibilities, financial and signature authority, contacts, etc.
► Do not assume senior leadership will be available in a crisis
– Validate or develop delegation of authority
– Ensure multiple backups are identified and briefed
– Geographically separated whenever possible
Leadership Identification and Availability
Crisis Management
► Everyone will want to know what happened and what is being done to resolve the situation
► Misinformation will run rampant if a clear communication plan is not established and utilized
– Rumors perceived as reality
► Zero hour communications should be pre-established and approved
– Generic language for initial communications
– Include when and how future updates will be provided
Communication Plan – Initial
Crisis Management
► Communication should be performed through multiple platforms
– Web, blog, telephone, press release, and in-person briefings
► Update schedule should be structured
– Initial updates more frequent than future updates
► Updates should be provided on schedule even if there is nothing new to report
– Provides confidence in organization's capability to resolve issue
– No update introduces mistrust and perception of possible deception
► Consistency for both internal and external communications
– Information will leak
– Internal updates should include authentication layer for accountability and traceability
► Interactive updates important at least once every 24 hours during initial 72 hours
Communication Plan - Ongoing
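The structured update schedule described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `update_schedule` and the 4-hour and 12-hour intervals are assumptions, not values from the presentation. The only figures taken from the slides are the 72-hour initial window and the 24-hour upper bound between updates.

```python
from datetime import datetime, timedelta


def update_schedule(start, initial_interval_hours=4, later_interval_hours=12,
                    initial_window_hours=72, total_hours=120):
    """Return planned update times after `start`.

    Updates are more frequent during the initial window and less
    frequent afterwards. All interval values are illustrative
    assumptions; only the 72-hour window comes from the slides.
    """
    times = []
    elapsed = 0
    while elapsed < total_hours:
        # Use the shorter interval while still inside the initial window.
        if elapsed < initial_window_hours:
            step = initial_interval_hours
        else:
            step = later_interval_hours
        elapsed += step
        times.append(start + timedelta(hours=elapsed))
    return times


# Hypothetical incident declared at 09:00 on the presentation date.
schedule = update_schedule(datetime(2010, 3, 18, 9, 0))
for when in schedule[:3]:
    print(when.isoformat())
```

Publishing the computed times with the zero hour communication lets recipients know when the next update is due, even when that update only says that nothing has changed.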
Crisis Management
► Critical to have an external entity assist in crisis management activities
– Do not use regular public relations firm
– Establish retainer relationship
– Ensure call center and communication capabilities available
► Provide zero hour communication plans in advance
– No content approvals required
► Educate firm about your business and your industry
– Identify industry hot buttons and key issues
Communication Plan - External Assistance
Command and Control
► Identify internal and external legal resources
– External counsel involvement assists with public opinion
► Establish legally documented delegations of authority
– Enable expanded signature authority
– Provide proof of authority to internal and external parties
► Develop declaration and completion of incident documents
– Enable special powers for designated individuals to be legally recognized
– Ensure powers are removed at the end of the incident
Legal Considerations
Command and Control
► Establish rally points for command and control activities
– Physical site
– Conference bridge
► Ensure sites include redundant capabilities for power, communication, and life safety
► Establish multiple rally points
– Geographically separated if possible
► Identify single points of failure and scheduled refresh for supplies and equipment
– Base requirements on recovery time and point objectives
Information Infrastructure
Command and Control
► Contain essential information for crisis management
– Contact information
– Processes and procedures
– Forms
– Communication plans
► Require highest level of data protection controls
– Access control and encryption
► Important to constantly update
– Electronic versions ideal for data synchronization
• Directory and data store synchronization
► Store electronic versions in secure distant location
Grab and Go Books
Tiered Response Model
► Each tier invokes different capabilities and resource availability
– Minimizes disruption to normal business activities
► Command and control oversees incident response, business continuance, and disaster recovery
– Operational response overseen by operations management
– Trust people to do their jobs
Incident Response
► Events and incidents require different levels of investigation and response
► Events highlight business impacting activities to investigate
– Can lead to incidents
► Incidents require structured and focused response
– Identify, analyze, remediate, and document
– Formal documentation
Events versus Incidents versus Investigations
Incident Response
► Incident identification process classifies response type
– Operational or forensic
► Operational response focuses on return to normal activities
– Minimal disruption to business activities
► Forensic response focuses on preservation and integrity of evidence (e.g., e-Discovery)
– Required for litigation activities
– Potential for business disruption
Operational versus Forensic Response
Incident Response
► Important to identify incident completion
– Reduce or discontinue incident response resource usage
► Completion of physical incidents easier to identify than logical incidents
– Dormant attack code and multi-phase attacks
► Reduce to operational response instead of discontinuing efforts completely
– Operational response team can monitor situation for “flare-ups”
► Engage legal counsel for opinion in forensic response
– Evidence preservation
– Chain of custody
Recognition of Incident Completion
Business Continuance
► Focuses on ability of enterprise to operate effectively while encountering a business-debilitating incident
► Based on business processes, not facilities and technology
► Includes partial and complete business disruptions
Overview
Business Continuance
► Mapping of revenue streams is the traditional approach to identifying key business processes
– Revenue required for business survival
► Other considerations
– Compliance requirements
– Contractual arrangements
• Service level agreements
– Customer expectations
– Public and customer opinions
Key Business Process Identification
Business Continuance
► Businesses are typically customers and consumers of other businesses
► Contractual availability requirements may exist
– Service Level Agreements (SLAs)
– Legal and financial consequences if requirements are not met can be significant
► Important to establish secondary capabilities to minimize impact to partners and vendors
– Reciprocal arrangements with similar organizations
– Establish arrangements in advance
Partner and Vendor Impact
Business Continuance
► Enumerates impact of loss of some or all business process capabilities
► Typically performed through surveys and questionnaires
– Highlight obvious processes and impacts
– Often miss critical considerations and data points
► Business process mapping key to success
– Provides visual depiction of all business process elements and dependencies
Business Impact Analysis
Business Continuance
► Identify information infrastructure and data elements
– People, processes, procedures, technical infrastructure and data used in business process
► Account for partial loss as well as full loss
– Ensures response is measured and appropriate
► Perform Threat and Vulnerability Analysis
– High likelihood and business impact
Business Impact Analysis (continued)
Business Continuance
► Recovery point objectives
– Establish key business process and information infrastructure requirements for business resumption
► Recovery time objectives
– Establish time windows to reach established recovery points
► Define level of effort and investment for recovery efforts
► Provide realistic metrics
– Prioritization schedule for recovery activities
– When efforts can be reduced
– When efforts should be discontinued
– When it is not appropriate to recover business process
Business Impact Analysis – Recovery Objectives
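A prioritization schedule of the kind this slide calls for could be derived from the recovery time objectives. The following is a minimal sketch under stated assumptions: the `ProcessObjective` class, the 1 to 5 `impact` score, and the sample processes and hours are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class ProcessObjective:
    name: str
    rto_hours: float  # time window to reach the established recovery point
    impact: int       # hypothetical business impact score, 1 (low) to 5 (high)


def recovery_order(objectives):
    """Order processes for recovery: shortest time window first,
    with higher business impact breaking ties."""
    return sorted(objectives, key=lambda o: (o.rto_hours, -o.impact))


# Illustrative data only; real values would come from the business impact analysis.
objectives = [
    ProcessObjective("payroll", 72, 4),
    ProcessObjective("order intake", 4, 5),
    ProcessObjective("customer support", 24, 3),
    ProcessObjective("accounts receivable", 24, 4),
]

ordered = [o.name for o in recovery_order(objectives)]
print(ordered)
```

Sorting on the time window first keeps the schedule tied to the stated objectives, while the impact score only decides between processes that share the same window.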
Business Continuance
► Inventory of skills and knowledge required for recovery and operation activities
– Move knowledge from brain to documentation
► Map to Human Resource skill inventories
– Regularly identify gaps in available staff capabilities
► Should be simple enough to provide to staffing organizations
► Focused on needs during recovery, not for normal business activities
– Job descriptions are not sufficient
Competency Models
Business Continuance
► Be careful of the “Hero” assumption
– Staff will typically be less effective during incident
– Staff may not be willing or able to participate
► Gain commitment to participate in advance
– Brief staff on expectations and requirements
► Contract with third party staffing firms in advance for key skill areas
– Provide competency models to third parties
► Ensure deep bench of staff available for key skills
– Maintain current contact database of candidates
Staff Availability
Business Continuance
► Availability of funds key to success
– Payroll and capital expense plan key during incident
– Reserves for initial costs and finance plan
– Utilize insurance for long term financial coverage
► Ensure contingencies in place for financial mechanisms
– Confidence built after first payments made
– Accounts payable and receivable capabilities need to be a high priority
Financial Planning and Reserves
Business Continuance
► Most unavailable workforce scenarios affect regional areas
– Pandemic
– Natural disaster
– Hazardous material incident
► Limited services available from infrastructure providers
– Limited internet bandwidth from service providers
– Limited telephony capabilities
► Develop remote capabilities which utilize minimal bandwidth
– Limited use of graphics
– Text based services
– Minimal file transfers
• Off hour scheduling
– No screen scrape applications
Unavailable / Remote Workforce
Disaster Recovery
► Focused on physical and technical infrastructure
► Typically utilize mirrored capabilities in separate location
– Data centers
– Data replication
– Working space
► Typically overlook logical disruptions
– Account for physical disruptions only
– Assume staff capabilities and business process resiliency
Disaster Recovery
► Recovery location should be determined by using recovery time and point objective analysis
► Recover in place typically preferred if metrics can be met
– Least disruptive to organization
– Typically most cost effective
– Fastest return to normal
► Ensure remote site appropriately configured and available
– Providers with shared service model may not have availability
– Data and infrastructure synchronization
► Focus on business process impact
– What will cause the least disruption to business activities
Recover Remote Or In Place
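The preference for recovering in place when the metrics can be met can be expressed as a simple decision rule. This sketch is an assumption-laden simplification: the function name, its parameters, and the sample hour estimates are hypothetical, and a real decision would also weigh cost and business process impact as the slide notes.

```python
def choose_recovery_location(in_place_hours, remote_hours, rto_hours):
    """Pick a recovery location from estimated recovery times.

    Prefers recovering in place whenever the recovery time objective
    can be met there, falls back to the remote site, and reports when
    neither option meets the objective. Inputs are estimates in hours.
    """
    if in_place_hours <= rto_hours:
        return "in place"
    if remote_hours <= rto_hours:
        return "remote"
    return "objective cannot be met"


# Illustrative estimates only.
print(choose_recovery_location(in_place_hours=6, remote_hours=12, rto_hours=8))
print(choose_recovery_location(in_place_hours=20, remote_hours=12, rto_hours=16))
```

The third outcome matters as much as the first two: if neither location meets the objective, either the objective or the level of investment needs to be revisited before an incident occurs.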
Disaster Recovery
► Logical attacks are more likely than physical attacks
– Delayed attack scenario using malicious code
– Integrity attacks
– Denial of service
– Cryptographic attacks
► Insiders are most dangerous adversaries
– Trust but verify
► Simple countermeasures can counteract wide range of logical threats
– Integrity checks
– N+1 access controls for sensitive environments
Overlooked Threat Scenarios
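One simple form of the integrity check mentioned on this slide is to record a cryptographic hash of each critical file and compare against it later. The sketch below uses SHA-256 from Python's standard `hashlib` module. The helper names, the manifest layout, and the sample file are assumptions made for illustration; the presentation does not prescribe a specific mechanism.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a known-good digest for each file."""
    return {str(p): sha256_of(p) for p in paths}


def changed_files(manifest):
    """List files whose current digest differs from the recorded one."""
    return [p for p, expected in manifest.items() if sha256_of(p) != expected]


# Demonstration with a throwaway file (hypothetical name and contents).
workdir = Path(tempfile.mkdtemp())
plan = workdir / "recovery_plan.txt"
plan.write_text("contact list v1")

manifest = build_manifest([plan])
unchanged = changed_files(manifest)   # nothing modified yet

plan.write_text("contact list v2 (modified)")
tampered = changed_files(manifest)    # the modified file is reported
print(unchanged, tampered)
```

For the check to be meaningful, the manifest itself has to be stored where an attacker who can alter the files cannot also alter the recorded digests.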
Disaster Recovery
► Access and Availability of Facilities
– Replenishment capabilities challenged in regional disaster
► Local authorities may not allow access to facilities
– Arrangements for clearance need to be made in advance
• Physical access
• Telephony and networking
► Network, power, and cooling at shared facilities may not be adequate during regional outage at shared provider
► Backup of backup facilities should be identified
– Reduced capabilities compared to primary recovery facility
• Virtualization and Software as a Service providers
Overlooked Threat Scenarios (continued)
Table Top Versus Actual Tests
► Table top tests identify obvious challenges
– Provide false sense of security
– Typically not representative of actual incident
► Actual tests
– Full test should be performed with key personnel at least once per year
• Perform during time with least disruption
• Should be unannounced
– Test key elements on regular basis
• Communication plan
• Activation of alternative environments
► Lessons learned activity essential to testing activities
– Lessons learned exercise will enumerate areas for improvement
– Important to document findings and update plans appropriately
Return to Normal Considerations
► Easier to activate capabilities than to deactivate them
– Crisis will drive cooperation for plan inception
– The longer capabilities are in place, the harder it is to back out of them
► Detailed plans for return to normal as important as inception plans
– Utilize same methods and practices as other elements of business resiliency capabilities
• Business impact analysis
• Recovery point and time objectives
► Phased approach to de-escalation provides checkpoints and minimal business disruption
– Ensure operational effectiveness checks are made at each level of reduction
Heterogeneous Approach
– Consistent language, methods, practices, processes, and procedures
► Develop test cases which utilize multiple elements
– Wide scale disruption of services
– Case studies of scenarios which have affected similar organizations
► Do not develop capabilities independent of each other
– Focus on business processes
► Multiple capabilities often used during business disrupting event
– Interdependencies between elements become apparent quickly
► Cooperative development will minimize costs and ensure interoperability between elements
Final Thoughts
► Business Resiliency is maturation and consolidation of traditionally separate capabilities
– Command and Control, Incident Response, Disaster Recovery, and Business Continuance
► In order to be effective, advance planning must be utilized
– Perform threat and vulnerability analysis
– Define recovery time and point objectives
– Validate assumptions
– Consider details
– Develop capabilities heterogeneously
► Capabilities will not account for all possibilities
– Develop capabilities which are flexible enough to adapt to any scenario but detailed enough for full recovery of key business processes
► Business Resiliency capabilities will constantly evolve
– Business evolution will drive capability evolution as well
– Ability to adapt is key to success
Business Continuity Management
Overview
► Centralize business continuity and disaster recovery plans, business impact analyses and recovery tasks.
► Prioritize business processes based on the impact to your business in the event of process disruption or failure.
► Test plans to identify process gaps and determine the time it will take to restore processes and infrastructure.
► Track crisis events in real time.
► Implement rapid response plans, contacting emergency responders through phased notification plans.
► Report on plan testing, gap analyses and remediation efforts using real-time reports and graphical dashboards.
Benefits
► Automate and streamline your plan creation, review, testing and activation.
► Reduce effort and expense through a “create once, use many times” approach.
Automate your approach to business continuity and disaster recovery planning, and enable rapid, effective crisis management in one solution.
Business Continuity Management Dashboard
Steve Suther
RSA, the Security Division of EMC
eGRC Solution Manager
Steve.Suther@archer.com