Leveraging large language models (LLMs) for corporate security and privacy

“Once a new technology rolls over you, if you’re not part of the steamroller, you’re part of the road.” – Stewart Brand

The digital world is vast and ever-evolving, and central to this evolution are large language models (LLMs) like the recently popularized ChatGPT. They are both disrupting and potentially revolutionizing the corporate world. They’re vying to become a Swiss Army knife of sorts, eager to lend their capabilities to a myriad of business applications. However, the intersection of LLMs with corporate security and privacy warrants a deeper dive.

LLMs and privacy

In the corporate world, LLMs can be invaluable assets. They’re being applied to customer service, internal communication, data analysis, predictive modeling, and much more, changing how we collectively do business. Picture a digital colleague that’s tirelessly efficient, supplementing and accelerating your work. That’s what an LLM brings to the table.

But the potential of LLMs stretches beyond productivity gains. We now must consider their role in fortifying our cybersecurity defenses. (There’s a dark side to consider too, but we’ll get to that.)

LLMs can aid cybersecurity efforts

LLMs can be trained to identify potential security threats, thus acting as an added layer of protection. Moreover, they’re fantastic tools for fostering cybersecurity awareness, capable of simulating threats and providing real-time guidance.

Yet, with the adoption of LLMs, privacy concerns inevitably emerge. These AI models can handle sensitive business data, and hence, need to be handled with care. The key is striking the right balance between utility and privacy, without compromising either.

The silver lining here is that we have the tools to maintain this balance. Techniques like differential privacy can ensure that LLMs learn from data without exposing individual information. Additionally, the use of robust access controls and stringent audit trails can aid in preventing unauthorized access and misuse.
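
To make the differential privacy point concrete, here is a minimal sketch of the Laplace mechanism, the basic building block behind most differential privacy schemes. The record set, the predicate, and the epsilon value are illustrative assumptions only; real LLM training pipelines would rely on purpose-built approaches (e.g., DP-SGD) rather than a toy query like this.

```python
import numpy as np

def dp_count(records, predicate, epsilon=1.0):
    """Return a differentially private count of records matching `predicate`.

    Laplace mechanism sketch: a counting query has sensitivity 1, so adding
    Laplace noise with scale 1/epsilon satisfies epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Illustrative only: release a noisy statistic about support tickets that
# mention a hypothetical product codename, rather than the raw ticket text.
tickets = ["refund request", "codename-x bug", "login issue", "codename-x crash"]
noisy = dp_count(tickets, lambda t: "codename-x" in t, epsilon=0.5)
print(f"Noisy count: {noisy:.1f}")
```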

How do we adopt LLMs while safeguarding our corporate ecosystem?

It starts with understanding the capabilities and limitations of these models. Next, the integration process should be gradual and measured, keeping in mind the sensitivity of different business areas. There are some applications that should always maintain human oversight and governance: LLMs haven’t passed the bar and aren’t doctors.

Privacy should never take a backseat when training LLMs with business-specific data. Be transparent with stakeholders about the kind of data being used and the purpose behind it. Lastly, don’t skimp on monitoring and refining the LLM’s performance and ethical behavior over time. Some specifics to consider here:

  • Obviously, LLM interfaces and corporate offerings are simple to block – trivial, even – and should be blocked by default, so that the company can choose which ones to allow (a minimal egress-allowlist sketch follows this list). This means a policy is needed on correct usage, and a project should exist to sanction the right services. Ignoring that latter part will lead to rogue or maverick use of LLMs, so don’t ignore it.
  • Over time, new LLM services will pop up to satisfy suppressed demand (especially in organizations that outright ban LLMs and never subscribe to sanctioned services). So I suspect an RBL-like list, or some similar means of detection, will arise for blocking access to new, emergent, or outright shady LLM services. For that matter, expect cybercriminals to intentionally stand up such services, with everything from real models to mechanical Turks behind the scenes.
  • If interaction with LLMs is prohibited, people will start using phones and personal systems to interact with them. The motivations for bypassing bans (performance, time savings, revenue generation, and so on) are there, so it will happen.
  • LLM-directed traffic will morph over (not much) time. It will rapidly begin to look like other, newer traffic types to get around blocking, behavioral detection, and pattern spotting, and it will migrate to new protocols. QUIC is an obvious choice, but even the shape of the traffic could look very different if resistance to the use of these services is high.
  • Other services will act as proxies and connect to GPT via API or WebSocket, meaning anything could be a gateway. Worse, many users may not know where they sit in the “API supply chain,” especially in “human interaction” services like customer support. This is the big one. I would recommend that companies flag their own services to end users as being backed by humans or by machines, to encourage similar behavior elsewhere (an example of where a standard is a great thing to champion; a sketch of such a flag follows this list).
  • Now for the biggest concern. Part of the problem is keeping IP and specific information out of third-party hands, sure, and data loss prevention (DLP) can help there. But there are also traffic analysis attacks and data analysis attacks (especially at scale), and information that can be inferred from LLM use. The data never has to leave to imply something about what exists or is happening within a company’s walls (the big lesson of the Big Data era was that Big Data stores can create PII and other sensitive data; this is even worse than that). This means the secrecy rules that used to govern carbon-based-to-carbon-based life form conversations now have to apply to carbon-to-silicon interactions too: Loose lips sink ships when talking to anything!
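
On the blocking and allowlisting points above, here is a minimal sketch of what an egress decision might look like in a forward proxy or DNS filter. The domain names and the three-way allow/deny/review policy are illustrative assumptions, not a vetted list or a recommendation of specific services.

```python
from urllib.parse import urlparse

# Illustrative policy: sanctioned LLM endpoints are allowed, known public
# LLM interfaces are denied, and anything unknown is flagged for review.
# All domain names here are placeholders, not a curated blocklist.
SANCTIONED = {"llm.internal.example.com"}
KNOWN_LLM_SERVICES = {"chat.openai.com", "gemini.google.com"}

def egress_decision(url: str) -> str:
    """Return 'allow', 'deny', or 'review' for an outbound request."""
    host = urlparse(url).hostname or ""
    if host in SANCTIONED:
        return "allow"
    if host in KNOWN_LLM_SERVICES:
        return "deny"     # blocked pending policy approval
    return "review"       # unknown host: candidate for an RBL-like feed

print(egress_decision("https://chat.openai.com/chat"))         # deny
print(egress_decision("https://llm.internal.example.com/v1"))  # allow
```

The “review” bucket is where an RBL-like feed of newly spotted LLM services would plug in as those services emerge.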
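
And on the human-or-machine flag: no such standard exists today, so the header name below is an invented placeholder, but a sketch of the idea on a plain HTTP support endpoint might look like this.

```python
# Hypothetical sketch: annotate every response from a support endpoint with a
# header telling the end user whether a human or a machine produced the answer.
# "X-Responder" is an invented header name, not an existing standard.
from http.server import BaseHTTPRequestHandler, HTTPServer

class SupportHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Thanks for contacting support. How can we help?"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("X-Responder", "machine")  # or "human" when a person replies
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), SupportHandler).serve_forever()
```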

Don’t resist change, adapt to it

Looking ahead, the incorporation of LLMs in the corporate landscape is a tide that’s unlikely to recede. The sooner we adapt, the better equipped we’ll be to navigate the challenges and opportunities that come with it. LLMs like ChatGPT are set to play pivotal roles in shaping the corporate and security landscapes. It’s an exciting era we’re stepping into, and as with any journey, preparedness is key. So, buckle up and let’s embrace this “AI-led” future with an open mind and a secure blueprint.

One last critical comment: The genie is out of the bottle, so to speak, which means that cybercriminals and nation-states will weaponize AI and derivative tools for offensive measures. Resist the temptation to ban these tools outright: pen testers and red teamers need access to them so that our blue teams and defenses are prepared. This is why we have Kali Linux, for instance. We cannot hamstring purple teaming with bans on LLM and AI tools, now or in the future.
