Should I use AI to write any authentication code at all?

Yes, with two constraints. First, scope it to mechanical glue — wiring an existing library's API into your codebase — rather than implementing the security primitive from scratch. Second, verify every line against the library's official documentation, not the model's explanation of it. Auth code where the model is implementing the protocol is where most of the failures concentrate.

How do I keep up with the actual security best practices when they change?

Subscribe to the relevant authorities directly: the OWASP cheat sheet series for application security, the MDN security pages for browser primitives, the CNCF security blog for cloud and Kubernetes. These get updated when the recommendations change; the model's training data does not.

What about AI tools that specifically claim to do secure code review?

They catch a useful subset — the patterns that show up frequently in CVE databases and static-analysis rules. They miss the category this post is about, which is subtly-wrong logic that looks correct. Treat them as one input among several, not as a replacement for reading the security primitive's documentation and the diff itself.

Catching Plausible-But-Wrong Security Advice From LLMs

The LLM is willing to write you authentication code. The code will look correct. It will compile, it will run, it will pass the tests the model wrote alongside it. The places where it gets subtly wrong are exactly the places an audit would catch and a casual review would not — and security is the one category where subtle wrongness shows up as a CVE rather than a customer complaint.

Where to point a security review

AI-written code usually compiles, runs, and passes its own tests — so a security pass can't rely on 'does it work'. It has to target the categories where subtle wrongness turns into a CVE. These are the buckets the patterns in this post fall into.

1. Token storage — localStorage vs httpOnly cookies

Ask the model where to store an authentication token in a single-page app and it will frequently default to localStorage. The code is short, the example compiles, and tutorials across the public internet have done exactly this for a decade. The problem: any cross-site scripting vulnerability anywhere in your app gives an attacker access to the token, because any script running in your page's origin, including injected script, can read localStorage.

// Model output — looks reasonable, ships an XSS-readable session
async function login(email, password) {
  const res = await fetch("/api/login", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, password }),
  });
  const { token } = await res.json();
  localStorage.setItem("auth_token", token);  // ← any XSS reads this
}

The correct shape is an httpOnly, Secure, SameSite cookie set by the server on the login response. The client never sees the token in JavaScript; the browser attaches it to subsequent requests automatically. The model knows this when asked directly; it does not volunteer it.
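As a minimal sketch of that shape, the server-side piece comes down to the attributes on the Set-Cookie header. The cookie name `session`, the one-hour lifetime, and the choice of `SameSite=Lax` are illustrative assumptions, not recommendations from this post; check them against the OWASP session management cheat sheet for your own app.

```javascript
// Sketch: build the Set-Cookie header value for a session token.
// Cookie name, lifetime, and SameSite value are assumptions for illustration.
function buildSessionCookie(token, maxAgeSeconds = 3600) {
  return [
    `session=${encodeURIComponent(token)}`,
    "HttpOnly",      // not readable from document.cookie or any page script
    "Secure",        // only sent over HTTPS
    "SameSite=Lax",  // not attached to most cross-site requests
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
  ].join("; ");
}

// In a plain Node handler this would be used roughly as:
//   res.setHeader("Set-Cookie", buildSessionCookie(token));
console.log(buildSessionCookie("abc123"));
// prints: session=abc123; HttpOnly; Secure; SameSite=Lax; Path=/; Max-Age=3600
```

With this in place the login response body does not need to contain the token at all, so there is nothing for client-side code to store.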

Detection move: any AI-generated auth code that mentions localStorage or sessionStorage for the access token is a red flag. Ask the model the follow-up: "What is the XSS risk of this approach?" It will tell you. Then ask for the cookie-based version and verify against the OWASP session management cheat sheet.

2. Encryption setups — wrong cipher modes, IV reuse, padding bugs

The model will produce encryption code that uses the right cipher family with the wrong mode, or the right mode with a parameter that destroys its security guarantee. AES-GCM with a reused IV is the canonical example: the code compiles, the tests pass, and one repeat of the IV across two messages encrypted under the same key undermines confidentiality and integrity simultaneously.

// "Almost right" — IV derived deterministically from the message id
const crypto = require("node:crypto");
const iv = crypto.createHash("sha256").update(messageId).digest().subarray(0, 12);
const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
// If two messages ever share an id, GCM's security argument collapses.
// Correct: iv = crypto.randomBytes(12); store iv alongside ciphertext.

Other shapes in the same category: CBC mode with a static IV; PKCS#7 padding implemented manually with an off-by-one in the padding length; ECB mode used by accident because it was the default in a code example the model trained on.

Detection move: for any AI-generated encryption code, read the WebCrypto MDN page or the documentation for the specific crypto library being used, paying attention to the security notes section. Verify the IV generation, the mode of operation, the key derivation, and the authentication tag handling all match the documented correct pattern. Reused IV is the single most common error to look for.
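For comparison when reading a diff, here is a minimal sketch of the documented pattern using Node's built-in crypto module: a fresh random IV per call, stored next to the ciphertext, with the authentication tag checked on decrypt. The function names and the `iv | tag | ciphertext` packing layout are assumptions made for this example, and key management is out of scope.

```javascript
const crypto = require("node:crypto");

const IV_BYTES = 12;   // 96-bit IV, the size used in the snippet above
const TAG_BYTES = 16;  // 128-bit GCM authentication tag

// Hypothetical helper: returns iv | tag | ciphertext as one Buffer.
function encrypt(key, plaintext) {
  const iv = crypto.randomBytes(IV_BYTES); // fresh per call, never derived from data
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv, { authTagLength: TAG_BYTES });
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Hypothetical helper: throws if the tag does not match (tampered or wrong key).
function decrypt(key, packed) {
  const iv = packed.subarray(0, IV_BYTES);
  const tag = packed.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = packed.subarray(IV_BYTES + TAG_BYTES);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv, { authTagLength: TAG_BYTES });
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const key = crypto.randomBytes(32); // demo key only; real keys come from a key store
const first = encrypt(key, "hello");
const second = encrypt(key, "hello");
console.log(first.equals(second));   // false: same plaintext, different IVs
console.log(decrypt(key, first));    // hello
```

The property to look for in review is that the IV comes from the random source on every call. Anything computed from the message, a counter that can reset, or a constant deserves a closer look.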

3. CSP headers — almost-right directives that allow inline scripts

The model produces Content-Security-Policy headers that read as strict and ship as cosmetic. The most common shape: a thoughtful script-src list of trusted origins followed by 'unsafe-inline', which defeats the allowlist because inline scripts are now allowed unconditionally.

// Looks strict, ships as cosmetic: 'unsafe-inline' allows any <script>...</script>,
// defeating the allowlist. (The header value is wrapped here for readability.)
Content-Security-Policy:
  default-src 'self';
  script-src 'self' https://cdn.example.com 'unsafe-inline';

Other shapes: script-src * as a "temporary" debug measure; missing object-src 'none' allowing Flash-era plugin injection; default-src 'self' overridden by a looser script-src further down. The pattern is the same — the directive looks restrictive, one parameter undoes it.

Detection move: run the proposed CSP through Google's CSP Evaluator (csp-evaluator.withgoogle.com) or Mozilla's Observatory before accepting it. Both tools flag the common "almost right" shapes. Cross-check against the MDN CSP reference for any directive you are not sure about.
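One common way to drop 'unsafe-inline' while still allowing specific inline scripts is a per-response nonce, as the OWASP CSP cheat sheet describes. The following is a rough sketch of that idea using only Node's standard library. The helper names are hypothetical, the directive list is an assumption rather than a complete policy, and the result should still go through an evaluator.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: a fresh, unguessable nonce for each HTTP response.
function newNonce() {
  return crypto.randomBytes(16).toString("base64");
}

// Hypothetical helper: a policy with no 'unsafe-inline' in script-src.
// Inline scripts run only if they carry the matching nonce="..." attribute.
function buildCsp(nonce) {
  return [
    "default-src 'self'",
    `script-src 'self' https://cdn.example.com 'nonce-${nonce}'`,
    "object-src 'none'",
    "base-uri 'none'",
  ].join("; ");
}

const nonce = newNonce();
console.log(buildCsp(nonce));
// The same nonce value must be written into each trusted inline tag:
//   <script nonce="...">/* trusted inline code */</script>
```

The nonce has to be regenerated for every response. A nonce that is hard-coded or cached with the page gives an attacker a value to reuse, which puts the policy back in the cosmetic category.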

4. Auth flow ordering — verifying after using

The model produces JWT-handling code that reads the claims out of a token, uses them for an authorization decision, and only then verifies the signature. The code path produces the same output when the signature is valid; it produces a security vulnerability when the signature is forged.

// Model output — order matters, and the model got it wrong
const decoded = jwt.decode(token);                 // ← unverified read
if (decoded.role !== "admin") return forbidden();  // ← used for auth
jwt.verify(token, publicKey);                      // ← verified too late

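// Correct: verify FIRST, with the algorithm pinned by the server, then read
// claims from the verified payload. (RS256 is assumed here because the key is a public key.)
const verified = jwt.verify(token, publicKey, { algorithms: ["RS256"] });
if (verified.role !== "admin") return forbidden();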

Other shapes in this category: verifying the signature but not checking the exp claim; accepting none as a valid algorithm; trusting the alg field from the token header to pick the verification algorithm (allowing algorithm confusion attacks).

Detection move: in any auth code involving signed tokens, trace the data flow and confirm that no claim value is read or used before the signature has been verified with a key the server controls. The keyword to find in the diff is jwt.decode — any use of it before jwt.verify is a defect.
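To make the ordering concrete, the sketch below spells out the checks for an HS256 token using only Node's built-in crypto module. It is for illustration of the data flow only: in real code a maintained JWT library's verify call should do this work, and every name here (`verifyThenRead`, `signForDemo`, `EXPECTED_ALG`) is hypothetical.

```javascript
const crypto = require("node:crypto");

const EXPECTED_ALG = "HS256"; // chosen by the server, never taken from the token

function hmac(secret, signingInput) {
  return crypto.createHmac("sha256", secret).update(signingInput).digest();
}

// Demo-only signer so the sketch can be exercised end to end.
function signForDemo(claims, secret) {
  const enc = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
  const signingInput = `${enc({ alg: EXPECTED_ALG, typ: "JWT" })}.${enc(claims)}`;
  return `${signingInput}.${hmac(secret, signingInput).toString("base64url")}`;
}

function verifyThenRead(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [headerB64, payloadB64, signatureB64] = parts;

  // 1. Check the signature with the server's algorithm and key before parsing any claim.
  const expected = hmac(secret, `${headerB64}.${payloadB64}`);
  const actual = Buffer.from(signatureB64, "base64url");
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    throw new Error("invalid signature");
  }

  // 2. Reject any header that names a different algorithm, including "none".
  const header = JSON.parse(Buffer.from(headerB64, "base64url").toString("utf8"));
  if (header.alg !== EXPECTED_ALG) throw new Error("unexpected algorithm");

  // 3. Only now read the claims, and require an unexpired exp.
  const claims = JSON.parse(Buffer.from(payloadB64, "base64url").toString("utf8"));
  if (typeof claims.exp !== "number" || claims.exp <= nowSeconds) {
    throw new Error("token expired");
  }
  return claims;
}

const token = signForDemo({ role: "admin", exp: 2000 }, "demo-secret");
console.log(verifyThenRead(token, "demo-secret", 1000).role); // admin
```

The authorization decision belongs after this function returns, on the object it returns. If a diff reads `role` from anything other than the verified result, that is the defect described above.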

The verification routine

The discipline that catches all four shapes is the same: never trust the model's confidence on security code; always read the official documentation for the primitive being used; compare line by line. The routine is short and runs every time:

Name the primitive — JWT verification, AES-GCM encryption, CSP header construction, cookie session.
Open the canonical reference — MDN, OWASP, the RFC, the library's documentation. Not a tutorial. Not a Stack Overflow answer.
Compare each line of the AI-generated code against the documented pattern. Note every parameter, every order dependency, every default value.
Run an automated tool where one exists — CSP evaluators, JWT debuggers, header scanners. Treat them as a backstop, not a primary check.
For anything you cannot fully reason about, escalate to a security-focused colleague before merge.

When we review AI-generated security code in a ShareCode code space, we keep the canonical reference open next to the diff and read them side by side — the shape that gets caught most this way is the jwt.decode used before jwt.verify, because a second reader tracing the data flow spots the ordering long before a casual top-to-bottom skim would.

Why this category in particular

Security is a domain where the model has likely read far more plausible-looking code than correct code. Public code is dominated by tutorials, which historically ship insecure defaults to keep the example short. Stack Overflow answers from 2014 sit in the same corpus as last year's OWASP cheat sheet, with nothing marking the older advice as superseded. When the model averages across this data, it produces output that looks like the average tutorial, which is to say more like the insecure tutorials than the secure ones, because the insecure ones are more common.

A second reason is that security failures are silent. A functional bug surfaces as a stack trace; a security bug surfaces as a CVE or a breach disclosure, often months later. The feedback loop the model would need to "learn" what works in security is precisely the loop that does not exist in the training data.

The review checklist

For any AI-generated code that touches security, six concrete items to verify before merge:

Tokens are stored in httpOnly, Secure, SameSite cookies — not in localStorage or any JavaScript-accessible storage.
Encryption uses a documented authenticated construction (AES-GCM, ChaCha20-Poly1305) with a fresh IV or nonce for every encryption under a given key.
CSP headers have no 'unsafe-inline', no 'unsafe-eval', no wildcards in script-src, and have been run through a CSP evaluator.
Token signatures are verified before any claim value is read or used.
Algorithm selection for verification is server-controlled — never from the token header.
Password hashing uses a purpose-built password hashing algorithm (Argon2id or scrypt, which are memory-hard, or bcrypt), never plain SHA-256 or MD5.

The checklist is short on purpose. A short checklist gets run. A 50-item checklist gets skimmed.
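The password-hashing item is the only one on the list without an example earlier in the post, so here is a minimal sketch using scrypt from Node's built-in crypto module. The function names, the `salt:hash` storage format, and the use of Node's default scrypt cost parameters are assumptions for illustration; the OWASP password storage cheat sheet is the place to confirm current parameter recommendations.

```javascript
const crypto = require("node:crypto");

const SALT_BYTES = 16;
const KEY_BYTES = 64;

// Hypothetical helper: returns "saltHex:hashHex" for storage.
// Uses Node's default scrypt cost parameters (an assumption; tune per current guidance).
function hashPassword(password) {
  const salt = crypto.randomBytes(SALT_BYTES); // unique salt per password
  const hash = crypto.scryptSync(password, salt, KEY_BYTES);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Hypothetical helper: recomputes the hash with the stored salt and compares
// in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, "hex"), KEY_BYTES);
  return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}

const stored = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", stored)); // true
console.log(verifyPassword("wrong guess", stored));                  // false
```

In review, the things to compare against the reference are the algorithm, the per-password random salt, and the cost parameters. A bare `createHash("sha256")` over the password is the shape the checklist item exists to catch.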

The habit that compounds

Security is the category where AI assistance helps the least and reading the official docs helps the most. The model can accelerate the mechanical parts — wiring an existing auth library into your routes, formatting a header, scaffolding the login form — but it cannot reliably make the decisions that determine whether the result is actually secure. The compounding habit is small and unglamorous: every time you reach for AI on a security-adjacent task, open the canonical reference first and keep it next to the editor while you read the diff. The docs are slower than the model. They are also right more often.

References & Sources

The primary sources, specifications, and documentation behind this article.

Session Management Cheat Sheet
OWASP · OWASP Cheat Sheet Series
States plainly not to store tokens in localStorage and to prefer httpOnly, Secure, SameSite cookies — the canonical reference for shape 1.
cheatsheetseries.owasp.org
SubtleCrypto: encrypt() method
MDN Web Docs · Mozilla
The AES-GCM example uses a fresh random IV per encryption — the documented pattern the reused-IV bug in shape 2 violates.
developer.mozilla.org
Content Security Policy (CSP)
MDN Web Docs · Mozilla
Warns that 'unsafe-inline' defeats much of the purpose of a CSP — the reference for the almost-right directive in shape 3.
developer.mozilla.org
Content Security Policy Cheat Sheet
OWASP · OWASP Cheat Sheet Series
Recommends nonce/hash-based strict CSP over unsafe-inline, with the directive pitfalls the detection move checks for.
cheatsheetseries.owasp.org
JSON Web Token Best Current Practices
Yaron Sheffer, Dick Hardt, Michael B. Jones · IETF RFC 8725 · 2020
The standard for JWT handling: verify before use, reject 'none', and pin the algorithm server-side — directly covers shape 4.
datatracker.ietf.org
Password Storage Cheat Sheet
OWASP · OWASP Cheat Sheet Series
Recommends Argon2id / scrypt / bcrypt over fast hashes like SHA-256 — backing for the password-hashing item in the review checklist.
cheatsheetseries.owasp.org

About the writers

Author

Kishan Vaghani

Founder & Lead Engineer, ShareCode

Founder of ShareCode. Writes the engineering deep-dives on this site — WebRTC, Firebase Auth, real-time sync, and the production patterns behind the editor itself.

Real-time collaboration & CRDTs · WebRTC & low-latency media · Firebase authentication & security rules · Next.js & full-stack JavaScript

Kajal Pansuriya

Developer Educator, ShareCode

Developer educator at ShareCode. Writes the tutorial track — Python, JavaScript debugging, coding-interview prep, and the everyday code-quality habits that hold up in real codebases.

Python fundamentals & teaching beginners · JavaScript debugging & DevTools · Coding-interview preparation · Clean code & code review

