ETA Expert Insights: What Does the Epic Facebook Outage Mean to PCI?
By ETA Risk, Fraud & Security Committee
Jim Bibles, Aperia Solutions • Lisa Conroy, FIS Global • Tom Gilligan, Aliaswire • Sam Hall, SecureTrust • Gurpal Singh, Finix Payments • Sam Pfanstiel, Viking Cloud, a Sysnet Global Solutions company
ETA’s Risk, Fraud & Security working group recently discussed the impact of the Oct. 4 Facebook outage with respect to PCI. This brought up several good discussion points, which may be valuable to others within the community:
- Does an outage like this have any bearing on PCI? The CIA Triad of information security refers to Confidentiality, Integrity, and Availability. Issues like outages, denial-of-service attacks, and ransomware impact an organization’s availability, which may indeed have disastrous effects on a business (to wit, Facebook’s loss of $40B in market capitalization due to the outage). PCI DSS, however, is not directly concerned with availability and is focused primarily on confidentiality of cardholder data — and to some degree on integrity of authentication credentials and certificates used to provide that confidentiality. This doesn’t mean that availability isn’t important, only that being PCI compliant provides little assurance that you can withstand an outage like the one that struck Facebook.
- The Facebook outage also meant that many users were unable to use Facebook’s single-sign-on (SSO), a security control for authentication. This highlights the fact that outages may themselves impact security controls. While PCI DSS doesn’t directly address this issue, other PCI standards such as P2PE and SSF have specific controls that are relevant. When a security control fails, these controls require the entity to ensure that the failure does not negatively impact the environment and that any fallback controls address the lapse in security. For instance, if the loss of SSO availability leads to a fallback to another authentication method that has not been evaluated for issues such as default passwords, encryption, or multifactor authentication, this would reduce security for entities subject to these PCI standards.
- Does this mean we shouldn’t use cloud services for important security components? Not necessarily. Cloud services can offer scalability, flexibility, and reduced costs, which may allow an organization to reinvest in other security controls or other risk-mitigation efforts that would more than compensate for losses that may result from an unlikely outage. Organizations should conduct a risk assessment and include loss of availability as one possible risk scenario. They should estimate the impact and likelihood, implement fallback measures to reduce residual risk, and decide for themselves whether their cloud strategy can withstand an incident like the one that happened on Oct. 4, whether it strikes the organization itself or one of its security or infrastructure vendors.
- The other point that this outage raises is the importance of vendor risk management (VRM), whether by reducing risk through choosing more resilient partners or by transferring risk to the vendor through a service-level agreement (SLA). One key aspect of VRM is due diligence when selecting a vendor. Below are some good questions to ask vendors:
- What are their disaster recovery (DR) plan and business continuity plan (BCP)?
- Are their plans tested regularly?
- Will they support an SLA that would offset lost revenues in such incidents?
- Have they experienced outages in the past?
- What is their actual recovery time?
- Can they provide references you can contact to confirm they honored their SLAs?
- Finally, as part of final execution of a contract, and in support of their SLAs, ask any vendor or Third-Party Processor (TPP) to show you their DR plan and BCP. While they may not be willing to share the full internal versions, they should be able to share a public version that substantiates their claim of having them. Trust, but verify.
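The risk assessment described above (estimate likelihood and impact, then apply fallback measures to reduce residual risk) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-5 scoring scales, the `mitigation_factor` field, the tolerance threshold, and the example scenarios are all hypothetical and are not taken from PCI DSS or any other standard.

```python
from dataclasses import dataclass


@dataclass
class RiskScenario:
    """One availability risk scenario (all scales are hypothetical)."""
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)
    # Assumed fraction of risk removed by fallback measures (0.0 to 1.0).
    mitigation_factor: float = 0.0

    def inherent_risk(self) -> int:
        # Simple likelihood x impact score before any fallback measures.
        return self.likelihood * self.impact

    def residual_risk(self) -> float:
        # Risk remaining after fallback measures are applied.
        return self.inherent_risk() * (1.0 - self.mitigation_factor)


def exceeds_tolerance(scenario: RiskScenario, tolerance: float) -> bool:
    """True when residual risk is above the organization's chosen tolerance."""
    return scenario.residual_risk() > tolerance


# Illustrative scenarios only; the numbers are made up for this sketch.
scenarios = [
    RiskScenario("SSO provider outage", likelihood=2, impact=4, mitigation_factor=0.5),
    RiskScenario("Cloud hosting region outage", likelihood=1, impact=5, mitigation_factor=0.25),
    RiskScenario("Payment gateway vendor outage", likelihood=2, impact=5),
]

TOLERANCE = 6.0  # hypothetical risk appetite

for s in sorted(scenarios, key=lambda r: r.residual_risk(), reverse=True):
    flag = "REVIEW" if exceeds_tolerance(s, TOLERANCE) else "ok"
    print(f"{s.name}: inherent={s.inherent_risk()} residual={s.residual_risk():.2f} [{flag}]")
```

A scenario flagged for review is one where the organization may want stronger fallback measures, a more resilient vendor, or an SLA that transfers some of the loss.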
Although it is almost impossible to guarantee that an outage will not affect your organization, acknowledging that even the tech giants can experience issues reminds us that we must be prepared for the unthinkable. We must ensure that our risk and security programs are not caught off guard, even if PCI itself doesn’t require 100% uptime.
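When reviewing a vendor's SLA, it can help to translate an uptime percentage into the downtime it actually permits. The sketch below does that arithmetic and checks a hypothetical six-hour outage against two example commitments. The function names, the 30-day measurement period, and the outage duration are assumptions for illustration; real SLAs define their own measurement windows and exclusions.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted by an uptime commitment over a period."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


def sla_breached(outage_minutes: float, uptime_percent: float, period_days: int = 30) -> bool:
    """True when a single outage alone exceeds the period's downtime allowance."""
    return outage_minutes > allowed_downtime_minutes(uptime_percent, period_days)


# Hypothetical six-hour outage measured against two example commitments.
outage = 6 * 60
for commitment in (99.9, 99.0):
    budget = allowed_downtime_minutes(commitment)
    status = "breached" if sla_breached(outage, commitment) else "within allowance"
    print(f"{commitment}% uptime allows {budget:.1f} min per 30 days: {status}")
```

Under these assumptions, a 99.9% commitment leaves roughly 43 minutes of allowance per 30 days, so a single multi-hour outage would exceed it, while a 99.0% commitment would absorb the same outage.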
Special thanks to Sam Pfanstiel for summarizing the above meeting points.
Interested in being a part of these types of discussions? Join us! ETA committee applications open on Nov. 1.
Have any questions about how committee participation works? Email Sarah Brown-Campello, Manager of Industry Affairs, at [email protected].