I Audited Vibe-Coded Applications: Here Are the Security Nightmares I Found

In February 2025, Andrej Karpathy casually coined a term that would define one of the biggest debates in modern software development: "vibe coding." The idea is simple — you describe what you want, an AI generates the code, and you accept it without fully understanding or reviewing what it produced. You go with the vibes.

The concept resonated so deeply that it now has its own Wikipedia page. Ironically, Karpathy himself has since suggested retiring the term in favor of "agentic engineering" — a more precise description of AI-assisted development done responsibly. But the genie is out of the bottle. Developers everywhere are shipping AI-generated code at unprecedented speed.

The question nobody was asking loudly enough: how secure is that code? I audited multiple vibe-coded applications — projects built primarily by accepting AI-generated code with minimal human review. What I found was deeply concerning.

The Numbers Are Alarming

The data paints a troubling picture. A comprehensive study by Black Duck found that AI co-authored code contains 75% more misconfigurations than code written entirely by humans. Research published on Towards Data Science revealed that AI-generated code has a 2.74x higher rate of security vulnerabilities compared to human-written code. Perhaps most striking: 24.7% of AI-generated code contains at least one security flaw — roughly one in four suggestions shipping with a vulnerability baked in.

These are not hypothetical risks. They are measurable, reproducible patterns showing up across languages, frameworks, and AI models.

The Top 5 Security Nightmares

During my audit, I encountered the same categories of vulnerabilities over and over again.

1. SQL Injection — The Classic That AI Refuses to Learn

SQL injection has been a known vulnerability for over two decades. It is in every security textbook, every OWASP list, every beginner tutorial. And yet, AI consistently generates vulnerable code by building queries with string concatenation instead of parameterized queries — because that is what a huge percentage of its training data contains. Stack Overflow answers from 2010, tutorial blogs, and quick-start guides all use string concatenation because it is simpler to explain.

The result: an attacker passes a malicious string as a query parameter and dumps your entire database. The fix — parameterized queries where the database driver handles escaping automatically — is straightforward, but AI reaches for the insecure pattern by default.
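To make the pattern concrete, here is a minimal sketch in Node-style JavaScript. The parameterized form uses the `$1` placeholder style of node-postgres; other drivers use `?`, but the principle is the same: the query text and the values travel separately, so user input is never parsed as SQL.

```javascript
// Vulnerable: user input is spliced directly into the SQL string.
// A payload like "' OR '1'='1" changes the query's meaning.
function findUserVulnerable(email) {
  return "SELECT * FROM users WHERE email = '" + email + "'";
}

// Safer: query text and values are separate; the driver binds the
// value, so even a malicious payload stays an inert string.
function findUserParameterized(email) {
  return {
    text: "SELECT * FROM users WHERE email = $1",
    values: [email],
  };
}
```

The object returned by the parameterized version is the shape node-postgres accepts directly; the vulnerable version is shown only to make the difference visible.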

2. Hardcoded Secrets — AI Puts Your Keys in Plain Sight

This is the most embarrassingly common flaw I found. When you ask an AI to integrate with an API, it almost always puts credentials directly in the source code — Stripe secret keys, AWS credentials, database passwords, all hardcoded as string literals.

I found live API keys in three out of five applications I audited. One had a Stripe secret key committed to a public GitHub repository. Another had AWS root credentials — not IAM, root — embedded in a frontend JavaScript file served to every visitor.

Secrets should always come from environment variables or a secrets manager. Never from source code. This is non-negotiable.
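A minimal Node sketch of the rule: read secrets from the environment and fail fast at startup when one is missing. The variable name in the usage comment is illustrative, not a real key.

```javascript
// Load secrets from the environment, never from source code.
// Failing fast at startup beats a runtime error deep in a request handler.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical usage — the variable name is illustrative:
// const stripeKey = requireEnv("STRIPE_SECRET_KEY");
```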

3. Missing Input Validation — AI Trusts Everything

AI-generated code almost never validates input. It assumes every request is well-formed, every user is honest, and every payload conforms to expectations.

Consider a money transfer endpoint: AI generates something that blindly takes fromAccount, toAccount, and amount from the request body and runs the database queries. No check that the amount is positive. No check that the sender owns the account. No check for sufficient balance. No database transaction to ensure atomicity. A negative amount reverses the transfer. A race condition enables double-spending.

The AI-generated version might be 10 lines. The secure version — with schema validation, ownership checks, balance verification, and transactional atomicity — is closer to 55. That gap represents every attack vector the AI did not think about.
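Here is a sketch of the missing checks in plain JavaScript. The `accounts` map is a hypothetical stand-in for real database rows; in production, these checks and the balance updates must run inside a single database transaction to prevent the race condition described above.

```javascript
// Validate a transfer request before touching any balances.
// `accounts` maps accountId -> { ownerId, balance } (illustrative shape).
function validateTransfer({ fromAccount, toAccount, amount }, accounts, userId) {
  // Reject non-numeric, negative, or zero amounts.
  if (typeof amount !== "number" || !Number.isFinite(amount) || amount <= 0) {
    return { ok: false, error: "amount must be a positive number" };
  }
  const from = accounts.get(fromAccount);
  // The authenticated user must own the source account.
  if (!from || from.ownerId !== userId) {
    return { ok: false, error: "sender does not own the source account" };
  }
  if (!accounts.has(toAccount)) {
    return { ok: false, error: "unknown destination account" };
  }
  if (from.balance < amount) {
    return { ok: false, error: "insufficient balance" };
  }
  return { ok: true };
}
```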

4. Insecure Dependencies — AI Imports Yesterday's Vulnerabilities

When AI generates code, it recommends packages based on training data that may be months or years out of date. This means it frequently suggests packages with known CVEs, deprecated APIs, or abandoned projects.

Every AI-suggested package should be vetted before installation. Run npm audit or pip-audit immediately. Check the package's maintenance status, download trends, and whether it has open security issues. Never accept a dependency list from an AI without auditing it — you inherit every vulnerability in every package you install.

5. Broken Authentication — AI Builds Doors That Look Locked

Authentication is among the hardest things to get right. AI-generated auth code often looks correct on the surface but contains critical flaws. In a single login endpoint, I counted six distinct security flaws: different error messages that reveal whether an email exists, passwords compared in plain text with no hashing, a weak hardcoded JWT secret, tokens with 30-day expiry, no rate limiting, and tokens returned in the response body where XSS can steal them.

It compiled. It passed basic tests. A user could log in and get a token. And it was catastrophically insecure. Secure authentication requires bcrypt for password hashing, consistent generic error messages, rate limiting on login attempts, JWT secrets from environment variables, short token expiry, and httpOnly secure cookies. AI generates none of this by default.

Why AI Gets Security Wrong

The training data problem. AI models are trained on massive code corpora that include Stack Overflow answers, tutorials, and blog posts — code written for demonstrations, not production. Tutorial code optimizes for clarity and brevity, not security. The AI learns insecure patterns because that is what most examples show.

Optimizing for "works" not "secure." When you prompt an AI to build a login system, it generates code that accomplishes the functional requirement. It does not think about threat models, attack vectors, or defense in depth.

No threat modeling context. A human security engineer thinks about deployment environment, data sensitivity, threat actors, and compliance requirements. An AI has none of this context. It generates the same code whether you are building a personal blog or a banking application.

The Vibe Coding Workflow Problem

The fundamental issue is not that AI generates insecure code — it is that the developer does not review what they ship. When you accept AI-generated code without understanding it, you assume the AI considered all the same things you would have. It did not.

The false confidence of "it compiles and passes tests" is particularly dangerous. Functional correctness and security correctness are entirely different dimensions. Code can pass every unit test while being trivially exploitable. Tests check that valid inputs produce correct outputs. They almost never check that malicious inputs are properly rejected, that rate limiting works, that secrets are handled correctly, or that error messages do not leak internal state.

A Secure AI-Assisted Development Workflow

Vibe coding does not have to be insecure. But it requires intentional practices.

Security-first prompting. The way you prompt an AI dramatically affects the security of its output. Instead of "Build a login endpoint for my Express app," specify: "Build a secure login endpoint. Use bcrypt with cost factor 12. Implement rate limiting at 5 attempts per 15 minutes. Use consistent error messages to prevent enumeration. Set tokens in httpOnly secure cookies. Follow OWASP authentication best practices." When you specify security requirements, AI models produce dramatically better output.

Mandatory code review checklist. Every piece of AI-generated code should clear a minimum bar before acceptance:

  • Are database queries parameterized? No string concatenation in SQL.
  • Are secrets loaded from environment variables?
  • Is all user input validated for type, range, and length?
  • Are dependencies audited for known CVEs?
  • Is authentication implemented with established libraries? No custom crypto.
  • Are error messages generic? No stack traces in responses.
  • Is there rate limiting on sensitive endpoints?
  • Are cookies set with secure and httpOnly flags?

Automated security scanning. Integrate static analysis (SAST), dynamic testing (DAST), and software composition analysis (SCA) into your CI/CD pipeline. Tools like Semgrep, CodeQL, Snyk, and OWASP ZAP catch what human review misses. Run npm audit or pip-audit on every build and fail on high-severity findings. None of this is optional if you are shipping AI-generated code.

The human-in-the-loop requirement. A human who understands security must review every piece of AI-generated code that touches authentication, authorization, data access, or external integrations. This does not mean you cannot use AI to write code faster. It means you treat AI output the same way you would treat code from a junior developer — assume it works, but verify it is secure.

The Path Forward: Discipline Over Vibes

Vibe coding does not have to be a security disaster. AI is an extraordinarily powerful tool for writing code faster. But speed without review is recklessness, not productivity.

The developers who will thrive in the AI era are those who use AI to generate code faster but review every line for security implications, understand common vulnerability patterns well enough to spot them in generated code, integrate automated scanning into every stage of their pipeline, and treat AI output as a first draft — not a finished product.

The most dangerous code is code that works perfectly — until an attacker finds it. AI-generated code is particularly dangerous because it looks confident, compiles cleanly, and passes basic tests while hiding vulnerabilities that only a security-conscious human would catch.

The vibes might feel great. But your users deserve better than vibes. They deserve code that was actually reviewed.

Related Posts

Prompt Injection in 2026: Still OWASP's Number One LLM Vulnerability

Prompt injection appears in 73% of production AI deployments and remains OWASP's top LLM vulnerability. Here is a developer's complete guide to understanding and defending against it.

AI Writes the Code Now. What Is Left for Software Engineers?

With 51,000+ tech layoffs in 2026 and AI writing production code, the future of software engineering is being redefined. Here is what actually matters now.

The Productivity Panic: Why AI Coding Tools Are Burning Out Developers

Bloomberg, Harvard Business Review, and UC Berkeley all agree: AI coding tools are making developers more stressed, not less. The expectations tripled, but real productivity barely moved.