GPT-5.3-Codex crossed a threshold in February 2026: it became the first model in OpenAI's history to receive a "high" rating on the company's cybersecurity preparedness framework, indicating a meaningfully elevated capability to assist with offensive security tasks.
What "High" Cybersecurity Risk Means
- 🔍 Vulnerability Discovery: Identifying exploitable weaknesses at a level comparable to skilled human researchers.
- ⚙️ Exploit Assistance: Generating or adapting proof-of-concept exploit code with minimal human scaffolding.
- 🕵️ Social Engineering: Crafting highly contextualised phishing content that bypasses conventional detection.
- 🔐 Privilege Escalation: Reasoning about post-compromise lateral movement and privilege-escalation paths.
OpenAI's Safety Response
1. Safety Training: Enhanced RLHF targeting cybersecurity harm categories, with adversarial red-team training.
2. Real-Time Monitoring: Continuous output monitoring, with anomaly detection tuned for offensive-security task sequences.
3. Trusted Access Tiers: Full capability gated behind verified identity and use-case disclosure.
4. Policy Enforcement: Automated suspension of API access upon detection of malicious-use patterns.
Opportunity for Security Teams: the same capabilities cut both ways. AI-assisted penetration testing, code review, and threat modelling translate directly into a stronger defence posture when deployed through controlled channels.
What Development Teams Must Do Now
- Audit API keys and secrets in all repositories
- Review authentication and authorisation logic for privilege-escalation paths
- Update dependency inventories — vulnerable libraries are easier to exploit at scale
- Implement structured logging of all privileged actions
- Brief teams on evolving social-engineering tactics: AI-generated phishing is significantly harder to detect
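The first checklist item can be bootstrapped with a lightweight scan while a dedicated tool is being adopted. Below is a minimal Python sketch; the patterns and rule names are illustrative, not exhaustive, and a production audit should use a purpose-built scanner such as gitleaks or trufflehog with a full ruleset:

```python
import re
from pathlib import Path

# Illustrative patterns for common credential formats. A real audit
# should rely on a dedicated scanner with a maintained ruleset.
PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub token": re.compile(r"ghp_[0-9A-Za-z]{36}"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_file(path: Path) -> list[tuple[str, int]]:
    """Return (rule name, line number) hits for one file."""
    hits = []
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        return hits
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((name, lineno))
    return hits

def scan_repo(root: str) -> dict[str, list[tuple[str, int]]]:
    """Walk a working tree and collect potential secrets per file."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and ".git" not in path.parts:
            hits = scan_file(path)
            if hits:
                results[str(path)] = hits
    return results
```

Note that this scans only the current working tree; secrets already committed to history also need a history-aware scan and rotation of any exposed keys.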
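For the dependency-inventory item, Python's standard library can snapshot what is actually installed in an environment. This is a sketch under assumed conventions: `dependency_inventory` is a hypothetical helper name, and the advisory sources named in the docstring are suggestions rather than anything the checklist prescribes:

```python
from importlib import metadata

def dependency_inventory() -> dict[str, str]:
    """Return installed package name -> version, as a baseline to check
    against vulnerability advisories (e.g. with pip-audit or the OSV
    database)."""
    inventory = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip metadata-less artefacts left by broken installs
            inventory[name] = dist.version
    return inventory
```

Regenerating this snapshot in CI and diffing it against the previous run makes unexpected dependency changes visible before they ship.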
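The structured-logging item can be sketched with the standard `logging` module emitting one JSON object per privileged action, which most log pipelines can ingest directly. The field names (`actor`, `action`, `target`) are illustrative assumptions, not a prescribed schema:

```python
import json
import logging
import sys
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line for log pipelines."""
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
            # Fields attached via logging's `extra=` argument, if present.
            "actor": getattr(record, "actor", None),
            "action": getattr(record, "action", None),
            "target": getattr(record, "target", None),
        }
        return json.dumps(entry)

audit_log = logging.getLogger("audit")
_handler = logging.StreamHandler(sys.stdout)
_handler.setFormatter(JsonFormatter())
audit_log.addHandler(_handler)
audit_log.setLevel(logging.INFO)

def log_privileged_action(actor: str, action: str, target: str) -> None:
    """Record a privileged operation as a structured audit event."""
    audit_log.info(
        "privileged_action",
        extra={"actor": actor, "action": action, "target": target},
    )
```

Usage is a single call at each privileged code path, e.g. `log_privileged_action("alice", "role_grant", "prod-db")`; because every event is machine-parseable, anomaly queries over the audit trail become straightforward.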
Our custom software development and cloud/DevOps practices incorporate security review as a standard phase.