
GPT-5.4-Cyber: OpenAI's Model for Security Experts Only

April 17, 2026 · 7 min read · 1,318 words
Tags: AI, AI Security, OpenAI, Anthropic, IBM

[Image: IBM Technology podcast thumbnail for the GPT-5.4-Cyber episode. Screenshot from YouTube.]

Key insights

  • GPT-5.4-Cyber is fine-tuned specifically for cybersecurity, not just a version of GPT-5.4 with looser guardrails. That is a qualitative change, not a policy tweak
  • OpenAI's broader TAC program and Anthropic's tight Glasswing consortium take opposite paths to the same conclusion: open access to the most capable cyber models is ending
  • The 1995 SATAN debate about dual-use security tools has never been settled. Every decade the same argument returns with a new name and higher stakes
  • Security by obscurity is already lost. WormGPT and jailbroken models circulate on underground forums, so gating frontier models helps defenders but cannot stop attackers
Source: YouTube
Published: April 16, 2026
Channel: IBM Technology
Host: Matthew Kosinski (IBM)
Guests: Jeff Crume and Martin Keen (IBM)

This is an AI-generated summary. The source video may include demos, visuals and additional context.


In Brief

Earlier this week, OpenAI released GPT-5.4-Cyber, a variant of its flagship model that the company calls cyber-permissive. In plain English: it is allowed to do things a normal GPT refuses, inside the narrow lane of defensive security work. You cannot just sign up for it. Access runs through OpenAI's Trusted Access for Cyber (TAC) program, which vets individual researchers and whole teams before the highest tier unlocks.

On a bonus episode of IBM's Security Intelligence podcast, host Matthew Kosinski sits down with two IBM master inventors: Jeff Crume and Martin Keen. They work through a question the announcement itself does not answer clearly: who decides what counts as legitimate security work, and what happens when the gatekeepers disagree?

The broader backdrop is that Anthropic took a very different route only weeks ago with Project Glasswing and Claude Mythos: a tight consortium of approved partners. OpenAI's TAC is wider, more automated, and open to individuals. Different philosophies, same underlying admission: the era of open access to frontier cyber-capable models is ending.

What GPT-5.4-Cyber actually is

Two things are going on here, and they matter in different ways.

The first is that OpenAI lowered the refusal boundary for this model. A refusal boundary is the point where a chatbot goes from answering to politely saying no. Standard GPT models will refuse detailed questions about malware, exploit code, or binary reverse engineering. GPT-5.4-Cyber refuses less often on those topics, as long as the user is inside the TAC program.

The second, and more interesting, point is that this is not just a guardrail setting. Keen flags it late in the episode: OpenAI says the model was fine-tuned specifically for cybersecurity work. Fine-tuning means taking a base model and continuing to train it on a focused dataset so that it gets noticeably better in that domain. GPT-5.4-Cyber, then, is not simply GPT-5.4 with the handcuffs off; it is a model that has been taught to think like a security engineer.
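To make the distinction concrete, here is a deliberately tiny illustration of what "continuing to train on a focused dataset" means. The one-parameter model, the data, and the numbers are all invented for illustration; this has nothing to do with how OpenAI actually trained GPT-5.4-Cyber.

```python
# Toy fine-tuning sketch: a one-parameter "model" is first trained on a
# broad dataset, then training continues on a narrow domain dataset.
# The fine-tuned model ends up fitting the domain far better than the base.

def train(weight, data, lr=0.1, steps=200):
    """Gradient descent on mean squared error for a constant predictor."""
    for _ in range(steps):
        grad = sum(2 * (weight - y) for y in data) / len(data)
        weight -= lr * grad
    return weight

def mse(weight, data):
    """Mean squared error of a constant predictor on a dataset."""
    return sum((weight - y) ** 2 for y in data) / len(data)

general_data = [1.0, 2.0, 3.0, 4.0]   # stand-in for broad pre-training data
domain_data = [9.0, 10.0, 11.0]       # stand-in for a narrow "cyber" domain

base = train(0.0, general_data)        # base model: fits the broad data
fine_tuned = train(base, domain_data)  # fine-tune: continue on domain data

print(round(base, 2), round(fine_tuned, 2))  # 2.5 10.0
print(mse(fine_tuned, domain_data) < mse(base, domain_data))  # True
```

The point of the toy: fine-tuning does not flip a switch on the base model, it moves the model's parameters toward the new domain, which is why a cyber-fine-tuned model is genuinely better at the work rather than merely less polite about it.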

Access works through a tiered verification process. Individuals and enterprises apply through chatgpt.com/cyber. Approved users at the highest tier get GPT-5.4-Cyber; lower tiers get softer versions of existing models with reduced friction around dual-use cyber questions.
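A tiered scheme like this can be pictured as mapping each access tier to a refusal threshold, with a query answered only if its assessed risk stays under the caller's threshold. Every name and number below is invented for illustration; OpenAI has not published how TAC decides anything internally.

```python
# Hypothetical sketch of tiered gating: higher tiers tolerate riskier
# queries before refusing. Tiers, scores, and thresholds are all made up.

REFUSAL_THRESHOLDS = {
    "public": 0.2,     # consumer-grade model: refuses most dual-use queries
    "verified": 0.6,   # lower TAC tier: reduced friction on dual-use topics
    "tac_top": 0.9,    # highest tier: GPT-5.4-Cyber-level permissiveness
}

def should_answer(query_risk: float, tier: str) -> bool:
    """Answer only if the query's risk score is below the tier's threshold."""
    return query_risk < REFUSAL_THRESHOLDS[tier]

# The same exploit-analysis question (risk 0.7) gets different outcomes:
print(should_answer(0.7, "public"))    # False
print(should_answer(0.7, "verified"))  # False
print(should_answer(0.7, "tac_top"))   # True
```

The sketch captures the announced behavior, that identical questions are answered or refused depending on who is asking, without claiming anything about the real mechanism.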

Two paths to gated access

Six weeks ago, you could reasonably say every big AI lab was racing in the same direction: the newest model, released to everyone, benchmarks first. Not anymore. The two most visible labs in the US have now drawn very different lines around their strongest cyber-capable models.

How the two programs compare:

  • Access model: OpenAI runs automated, tiered vetting through the TAC program; Anthropic runs a consortium of selected partners (Glasswing / Mythos)
  • Net cast: OpenAI's is wide, covering individuals and teams, thousands of defenders; Anthropic's is narrow, a small number of approved organizations
  • Decision: OpenAI uses a verification algorithm plus human review; Anthropic uses direct partnership agreements
  • Philosophy: OpenAI bets that defense at scale beats secrecy; Anthropic bets that containment of dangerous capability beats breadth

Keen's reading on this shift is blunt: "We've kind of already reached the end of the open model approach." The labs keep telling us these systems are now too capable for unrestricted release. What remains is an argument about who gets keys and how you prove you deserve them.

The two approaches look opposite. They end up at the same destination. Frontier cyber-capable models are no longer something a random developer pulls off an API page.

The SATAN debate, 30 years later

Crume's contribution to the episode is a reminder that almost nothing here is new. His analogy is Groundhog Day, the film where Bill Murray wakes up in the same day over and over again. Cybersecurity, he argues, keeps having the same argument.

Exhibit A, from 1995: a tool called SATAN, short for Security Administrator Tool for Analyzing Networks, built by Dan Farmer and Wietse Venema. SATAN was one of the first open vulnerability scanners, software that points itself at a network and lists its weak spots. The intent was defensive: sysadmins could audit their own systems.

The reaction was exactly what you would expect. The community tore itself in half. One camp said SATAN handed attackers a turnkey burglar's kit. The other said defenders needed the same tools attackers already had, or they would lose. The debate was loud enough that Silicon Graphics fired Farmer over it.

Three decades later, Crume's point is that GPT-5.4-Cyber is the same conversation at a different scale. A hammer doesn't know whether it is driving a nail or cracking a skull. A vulnerability scanner doesn't check who is holding it. Neither does a language model that knows how to read compiled software and find memory corruption bugs.

Responsible disclosure as the middle path

Between "only our secret club can touch this" and "let everyone have it", the security industry already has a well-worn compromise: responsible disclosure. If you find a vulnerability, you tell the vendor first. You give them a fixed window, often 90 days, to ship a patch. If they do nothing, you go public, which forces their hand.
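The disclosure timeline described above is simple enough to state as code. This is a minimal sketch of the generic 90-day convention, with invented dates; it is not any vendor's or lab's actual policy.

```python
# Responsible-disclosure timeline: the reporter notifies the vendor, a
# fixed window (commonly 90 days) runs, and the finding may go public
# once the vendor patches or the window lapses. Dates are illustrative.

from datetime import date, timedelta

DISCLOSURE_WINDOW = timedelta(days=90)

def disclosure_deadline(reported: date) -> date:
    """Date on which the finding goes public if the vendor has not patched."""
    return reported + DISCLOSURE_WINDOW

def may_publish(reported: date, today: date, patched: bool) -> bool:
    """Publication is allowed once a patch ships or the window expires."""
    return patched or today >= disclosure_deadline(reported)

reported = date(2026, 4, 16)
print(disclosure_deadline(reported))                    # 2026-07-15
print(may_publish(reported, date(2026, 5, 1), False))   # False: window open
print(may_publish(reported, date(2026, 7, 20), False))  # True: deadline passed
```

The structural idea Crume is borrowing is the deadline itself: the process pressures the defending party to act, rather than trusting either permanent secrecy or immediate release.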

Crume suggests that same principle could shape how AI labs handle powerful cyber-capable models. Not a permanent lockup, not a free-for-all. A structured process that pressures defenders to actually defend. TAC is arguably a first step in that direction: apply, prove yourself, get access in return for accountability.

Why "security by obscurity" is already lost

The hardest pill in the conversation comes late. Crume states it plainly: keeping cyber-capable AI locked down is security by obscurity, the idea that you are safe as long as the attacker does not know what you know. That strategy does not work, and the security industry has spent decades proving it.

Why? Because the bad guys are not waiting in line for OpenAI approval. There is already an ecosystem of uncensored, criminal language models for sale on underground forums. WormGPT appeared on Hack Forums in 2023 as an explicitly malicious GPT variant, trained on malware samples and phishing templates. It was shut down and reborn several times, most recently as variants running on jailbroken Grok and Mixtral models, sold for 60 to 100 euros a month.

The lesson: gating frontier models changes the shape of the defender advantage, but it does not create one by itself. Attackers have lower-quality tools than GPT-5.4-Cyber, but they have tools, and the capability gap is closing fast.

What this is really about

GPT-5.4-Cyber is less interesting for what it can do today than for what it signals. Three shifts are happening at once.

Fine-tuning over flag-flipping. Labs used to talk about toggling guardrails off for approved users. Now they are training new models specifically for the domain. The difference matters: a model that has been trained on vulnerability research is better at vulnerability research, not just less polite about it.

Gated access is the new default. Both OpenAI and Anthropic have concluded that frontier cyber-capable models cannot ship to everyone through a public API. Individual subscribers keep the consumer-grade models. The sharper tools move behind verification.

The race does not stop. Crume's closing frame is that this is a temporary advantage for the defenders who get access first. As Keen points out, nobody has come out with the model to end all models. The next lab will release a better one, and the next, and the next. Which means the point of TAC and Glasswing is to stay a step ahead of the criminal underground, not to freeze the capability in place.

The cybersecurity community, true to form, cannot agree on whether this is the right move. Keen ends the episode with Crume's running joke: ask three security professionals the same question and you get five answers. Three decades after SATAN, we still cannot agree on whether the latest scanner is a gift or a weapon. Only now the scanner can write its own exploits.

Glossary

  • Fine-tuning: Continuing to train an existing model on a focused dataset so it gets better at a specific domain
  • Guardrail: A rule that stops an AI model from answering dangerous or disallowed questions
  • Refusal boundary: The line where the model stops answering and starts saying no. Raising or lowering it changes how permissive the model feels
  • Cyber-permissive: OpenAI's own term for a model tuned to accept cybersecurity queries that standard models refuse
  • Responsible disclosure: Telling a vendor about a vulnerability first and giving them a fixed window to patch it before going public
  • Security by obscurity: The assumption that a system is safe because attackers don't know how it works. Generally considered unreliable as a sole defense
  • Vulnerability scanner: A tool that automatically probes a system for known weaknesses. SATAN (1995) was one of the first
  • Reverse engineering: Taking a finished product apart to understand how it was built. In security, often applied to compiled software to find bugs
  • Consortium: A formal group of organizations with a shared agreement. Anthropic's Glasswing partners form one
  • TAC (Trusted Access for Cyber): OpenAI's vetting program that approves individuals and teams for access to cyber-capable models, including GPT-5.4-Cyber

