When public and private sector institutions depend on your data for critical defense and investigation contexts, there is very little room for error. And as a first principle of security operations, in particular, and information technology, in general, it is critical to set up, maintain, and re-evaluate standardized and replicable processes for situations like security reviews of new software requests from the rest of the company. This is especially true with software-as-a-service (SaaS) offerings that provide less visibility than traditional on-premises software.
Enter the artificial intelligence hype cycle and the glut of generative AI (GenAI) functions being shoehorned into nearly every application on the market, as well as entirely new GenAI-powered services not available previously. This new wrinkle of GenAI complicates the security review process for a number of reasons. Among them:
- The novelty of the GenAI business sector compared with more traditional models brings with it uncertainty about some technology impacts and direct concerns about others.
- The complicated nature of GenAI data processing relies on underlying, often interweaving data flows in and out of multiple companies, increasing the exposure of your data to multiple entities.
- The extractive nature of many GenAI services pushes all available data into training datasets as pristine training data grows scarcer, often with a blanket disregard for user consent, since GenAI models cannot consume their own output as training data.
Given these and other concerns, our security operations team has completed a large number of software security evaluations according to a general framework that seems to be working well so far. We’ll go into it here in the hopes that it can help other organizations inform their own software review processes and reinforce their security posture in an uncertain and fast-changing environment.
(For those SecOps practitioners looking solely for advice related to generative AI, you may want to skip directly to the section titled “Grab a Coffee and Hit the Books,” as it discusses specifics like data processing addenda.)
The Importance of Self-Evaluation
Any security review, whether it concerns an internal vulnerability, a geopolitical concern, or an external new software request, must be rooted in a regular and accurate assessment of your own organization, company, or institution. There is no one-size-fits-all risk profile: look at the industry you’re in and the threats it faces, as well as the data and services you’re responsible for protecting. A shoe retailer charged with protecting employee, customer, and commerce data has a very different footprint here than a company that provides intelligence feeds to investigators and defenders. A public safety agency has a very different footprint than a private incident response company. Understand where you sit and what the environment looks like around you, and let that inform your security posture.
Before taking on the review of external software, also ensure you understand the new request and its use cases. What data will be actively exposed to this new software, and how critical is it? Is it the crown jewels of customer data, proprietary source code, or sensitive business operations, or is it aggregated open-source data available elsewhere? This is especially important when it comes to SaaS offerings that integrate with multiple other services. If the software wraps around Salesforce or GitHub, you’ve got to go deep.
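To make this scoping step repeatable, it can help to capture each request as a structured intake record that names the exposed datasets and their sensitivity. Below is a minimal sketch in Python; the tier names, fields, and the ExampleGenAIApp product are illustrative placeholders, not a prescribed taxonomy:

```python
from dataclasses import dataclass, field
from enum import Enum

class Sensitivity(Enum):
    # Illustrative tiers; substitute your own data classification scheme.
    PUBLIC = 1        # aggregated open-source data available elsewhere
    INTERNAL = 2      # routine business operations data
    RESTRICTED = 3    # proprietary source code, sensitive operations
    CROWN_JEWELS = 4  # customer data, investigation material

@dataclass
class SoftwareRequest:
    product: str
    use_case: str
    data_exposed: dict[str, Sensitivity]  # dataset name -> sensitivity
    integrations: list[str] = field(default_factory=list)

    def max_sensitivity(self) -> Sensitivity:
        """The most sensitive exposed dataset drives review depth."""
        return max(self.data_exposed.values(), key=lambda s: s.value)

req = SoftwareRequest(
    product="ExampleGenAIApp",  # hypothetical product
    use_case="Summarize support tickets",
    data_exposed={"support_tickets": Sensitivity.RESTRICTED},
    integrations=["Salesforce"],
)
print(req.max_sensitivity())  # Sensitivity.RESTRICTED
```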
Can I Get a Vibe Check?
Start with open-source research around the software and company. Ensure the product is actively developed or at least maintained as needed, with security updates prioritized. If it’s stale and the last update was a year ago, it should not be a contender for active use within any of your workflows. Threat environments and vulnerability ecosystems simply change too fast at this point.
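For projects hosted on GitHub, this staleness check is easy to script against the public repositories API. A minimal sketch, assuming the project lives on GitHub and treating one year without a push as stale (a placeholder threshold you should tune to your own risk profile):

```python
from datetime import datetime, timedelta, timezone

import requests

STALE_AFTER = timedelta(days=365)  # placeholder; tune to your risk profile

def is_stale(owner: str, repo: str) -> bool:
    """Flag a repository whose last push exceeds the staleness threshold."""
    resp = requests.get(f"https://api.github.com/repos/{owner}/{repo}", timeout=10)
    resp.raise_for_status()
    pushed_at = datetime.fromisoformat(
        resp.json()["pushed_at"].replace("Z", "+00:00")
    )
    return datetime.now(timezone.utc) - pushed_at > STALE_AFTER

# Example with a hypothetical repository name.
print(is_stale("example-org", "example-repo"))
```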
Evaluate the company’s response to security issues, whether in the media or on GitHub. Were they responsive or dismissive? Does their update cadence dovetail with your own risk profile, or is it too spaced out to trust for your purposes? Also evaluate company maturity. If the vendor lacks a trust center and data processing documentation, or if its documentation or support system runs through Discord, that is not a mature solution. Not every company possesses the resources for these steps, but always remember: not every company is going to fit well with your company’s risk management posture.
Grab a Coffee and Hit the Books
Now it’s time to focus on the deeper documentation found in places like a trust center. Any vendor you’re evaluating should be able to make available documents like a SOC 2 or ISO 27001 certificate, recent penetration testing reports or attestations, and elements of its business continuity/disaster recovery (BCDR) plan. There are reasons some businesses may not be able to produce all of these, but if they cannot provide any, allow your skepticism to deepen.
For the above-mentioned documents, check effective dates to ensure they’re not years old, and if the company provides multiple years of something like pentest reports, take time to leaf through them. See whether the same vulnerabilities show up year after year, or whether the company is serious and responsive enough to remediate discovered vulnerabilities during the course of the testing period itself, which is no small feat. Pay attention to whether the software company retains formal pentesting firms, hops from one firm to another across multiple years, or only engages in unfocused, less effective testing like advertised testing periods on bug bounty platforms. All of this speaks, on some level, to what’s going on beneath the surface.
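One way to keep that year-over-year comparison honest is to reduce each report to a set of findings and diff the sets. The finding titles below are hypothetical, and real reports rarely share clean identifiers, so expect to normalize titles by hand first:

```python
# Hypothetical finding titles extracted from three successive pentest reports.
findings_by_year = {
    2022: {"IDOR in /api/export", "Stored XSS in comments", "Weak session expiry"},
    2023: {"IDOR in /api/export", "Weak session expiry"},
    2024: {"IDOR in /api/export"},
}

years = sorted(findings_by_year)
# A non-empty intersection across every report is a remediation-culture red flag.
carried_over = set.intersection(*(findings_by_year[y] for y in years))
print(f"Unresolved across {years[0]}-{years[-1]}: {carried_over}")
```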
For generative AI in particular, one of the most crucial documents you’ll have access to is the company’s data processing addendum (DPA), which should be public and easy to access and understand. This is a legal document, usually established as an addendum to a service’s terms and conditions, that covers data processing according to one or several jurisdictional standards such as the GDPR. The DPA should also list all data subprocessors the company has contracted with, their locations, and a general description of their function in relation to your data. Pay attention to the geolocation and breadth of data exposure, and ensure it meets or exceeds your risk management needs. Some DPAs list five or six subprocessors; some list dozens. Some companies contract subprocessors only in the U.S. or E.U.; some include countries you may not want within miles of your data.
For extra credit, analyze the DPA of each subprocessor in turn to map the first- and second-order exposure of your data. It’s not usually a pleasant sight.
Reading through the DPA, pay special attention to the standards to which data is held. More mature organizations will stick to U.S. and E.U. best practices, especially the GDPR, whereas companies you should avoid will use boilerplate language that points to non-U.S./non-E.U. data processing and “equivalent standards.” If the implication of a DPA is that your data can be sent off to completely different regions of the world with no sworn legal protections, it is time to find a different solution. The DPA will also provide background on the standards the software company holds its subprocessors to. In this section, what you want to see is language along the lines of “a written agreement with each subprocessor that imposes data protection obligations no less protective of personal data than those set out in our agreement with our clients.” Addenda lacking that kind of language often provide purposeful loopholes for subprocessors that are actually “data partners” -- ones probably extracting data for unstated purposes without your consent.
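If you do trace subprocessors of subprocessors, even a crude script keeps the exposure map legible. The sketch below walks a hand-built, entirely hypothetical subprocessor tree and flags entries outside an approved-region allow-list of your choosing:

```python
# Hypothetical subprocessor tree assembled by hand from published DPAs.
# Each entry maps a subprocessor to its country and its own subprocessors.
SUBPROCESSORS = {
    "VendorA": ("US", ["CloudHost", "GenAIWrapper"]),
    "CloudHost": ("US", []),
    "GenAIWrapper": ("IE", ["UpstreamModelCo"]),
    "UpstreamModelCo": ("SG", []),
}

APPROVED_REGIONS = {"US", "IE", "DE", "FR"}  # placeholder allow-list

def walk(name, depth=1, seen=None):
    """Recursively flag nth-order subprocessors outside approved regions."""
    seen = seen if seen is not None else set()
    if name in seen:  # guard against cyclic vendor relationships
        return
    seen.add(name)
    country, children = SUBPROCESSORS.get(name, ("??", []))
    status = "ok" if country in APPROVED_REGIONS else "FLAG"
    print(f"[{status}] order {depth}: {name} ({country})")
    for child in children:
        walk(child, depth + 1, seen)

walk("VendorA")  # UpstreamModelCo (SG) gets flagged at third order
```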
More than any other SaaS segment I’ve performed security reviews for, services with generative AI components have complex and problematic data subprocessor lists. Tracing back through third- and fourth-order subprocessor lists, you quickly find that many of the smaller companies are just white-label packages for the larger GenAI firms, and that most of the larger GenAI firms are connected with one another. You also find recurring patterns and recurring single points of deep exposure, such as the data warehouse Snowflake -- if that name sounds familiar, it’s because multiple Snowflake datastores were continuously scraped by unauthorized third parties, sometimes for months, resulting in a swarm of pivot compromises for companies storing data there, as well as for those relying on those companies as vendors.
Before completing a security review, ensure you understand the data exposure created by the new software’s subprocessor list, as well as specifics on data geolocation and any possible data shifts outside of approved regions. Also ensure the DPA specifies how and when the subprocessor list will be updated, and how notifications occur. Forthright companies email customers about subprocessor changes, with prescribed periods to opt out before the changes take effect. Questionable companies require you to somehow monitor the DPA page for updates yourself.
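If you’re stuck with a vendor in the “monitor the page yourself” camp, even that can be semi-automated. Here is a minimal sketch that fingerprints the DPA page so a scheduled job can flag changes; the URL is hypothetical, and a production version would persist the baseline and strip dynamic page elements before hashing:

```python
import hashlib

import requests

DPA_URL = "https://example-vendor.com/legal/dpa"  # hypothetical URL

def page_fingerprint(url: str) -> str:
    """Return a SHA-256 fingerprint of the page body."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return hashlib.sha256(resp.content).hexdigest()

# Run on a schedule (cron, CI job); compare against the last stored
# fingerprint and alert the review team when the page changes.
print(page_fingerprint(DPA_URL))
```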
Another specific callout is training data. If you are left with any questions whatsoever as to whether your data will be extracted or analyzed for training datasets, ask the specific question and get the answer in writing. More than a few companies provide robust-looking data policies that leave specific loopholes in place and avoid answering when asked -- make it a key piece of your inquiry, and make it clear that approval hinges upon the company’s answer.
Repeat the process for any plugins, extensions, or other add-ons your internal use case inquiry identified. If you thoroughly vet the web app but the Gmail plugin is trivially compromised, your data is still gone.
Conclusion
At a high level, the process of evaluating GenAI software requests takes the following form, adjusted to fit your organization’s particular needs:
- Define your own organization’s risk profile and threat environment.
- Understand the software use case and request, as well as what data the software will touch and the scope of exposure.
- Confirm the software is in active development or at least active maintenance, and not stale/unmaintained.
- Research the vulnerability history of the software and company and their responsiveness to security issues.
- Access trust center materials to understand the deeper context (SOC 2, ISO 27001, pentest reports, BCDR).
- Analyze the company’s data processing addendum and its DPA update notification protocol.
- Repeat these analysis steps for any plugins or extensions.
- Ask specific questions about the extraction of data for training datasets.
- If no chart is provided, chart out the data flows between your systems and theirs to understand how complex the paths are and how much attack surface they represent (a minimal sketch of one approach follows this list).
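On that last point, a rough chart beats no chart. One low-effort approach is to keep the flows as an edge list and emit Graphviz DOT; every system name below is a hypothetical placeholder:

```python
# Hand-built edge list of data flows between your systems and the vendor's.
flows = [
    ("CRM", "VendorApp", "customer records"),
    ("VendorApp", "GenAIWrapper", "prompts + context"),
    ("GenAIWrapper", "UpstreamModelCo", "prompts"),
    ("VendorApp", "Snowflake", "analytics events"),
]

# Emit Graphviz DOT; render with `dot -Tpng flows.dot -o flows.png`.
lines = ["digraph data_flows {", "  rankdir=LR;"]
for src, dst, label in flows:
    lines.append(f'  "{src}" -> "{dst}" [label="{label}"];')
lines.append("}")
print("\n".join(lines))
```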
We are in a liminal period -- the old signs have fallen, and new trails must be blazed as generative AI software and features crowd most markets. But we aren’t yet at a place of stability or certainty. While we move through what’s likely to be the horseless carriage phase of generative AI, security operations and similar teams must move carefully and deliberately, ask hard questions, and analyze dull documents. Establishing flexible frameworks for software security reviews, with special attention to trust-related and data processing documentation, eases that burden and helps inform critical business decisions as we all adapt to changing conditions while seeking to arm our colleagues with the best technology available.