AI-Assisted Coding

When the Vibes Go Wrong

Common failure patterns in AI-generated code, and the checks that catch them before they matter.

The Problem of Overconfidence

AI-generated code does fail. The hard part is that the failures often look like working output. The formatting is clean and the structure looks plausible. A human collaborator tends to signal uncertainty, hedging or flagging the parts they are less sure about. AI does none of that, so working code and fabricated code are delivered in the same polished tone.

There is a structural reason for this. The model does not check code the way a programmer or a compiler (the program that translates code into something the machine can run) would. It predicts what tokens (roughly, the next word or symbol) are likely to come next based on patterns in its training data, which is enough to produce plausible-looking output but not enough to guarantee the function exists, the library is real, or the logic is sound. Correct code and broken code are produced by the same prediction process, so the confidence level never varies.

AI confidence level vs. actual code quality:

Correct code: 100%
Buggy code: 100%
Fabricated code: 100%

The confidence dial is always at maximum, whether the code works or not.

Common Failure Patterns

AI-generated code tends to break in predictable ways. Once you know what to look for, you can check for those patterns specifically.

Common: AI recommends packages or APIs that don't actually exist

Subtle: Security vulnerabilities that survive casual review because the code runs fine

Hallucinated APIs

The AI generates code that imports libraries or calls functions that do not actually exist. The result can look completely plausible, with naming conventions that feel right and no visible sign that the package was never published. (A "package" or "library" is a bundle of pre-written code that other programs can reuse, distributed through a public registry where anyone can look it up.)

Fabricated import:

import { sanitize } from 'string-sanitizer';

// Looks reasonable. The name follows real conventions.
// But this package doesn't exist in npm (the JavaScript
// package registry). The AI invented it from patterns
// in its training data.

This is one of the easier failures to catch because the code will simply refuse to run. It can still waste time, though, if you build several layers of logic around the phantom library before discovering it does not exist. A quick check of the package registry early on heads that off.
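
The registry check can also be scripted. The sketch below assumes the public npm registry answers a request for `https://registry.npmjs.org/<name>` with a 404 when nothing is published under that name, and that the name is unscoped. The `packageExists` function, its `fetchImpl` parameter, and the package names are illustrative, not part of any library; the stand-in fetch lets the example run without network access.

```javascript
// Sketch: ask the public npm registry whether a package name is published.
// Assumes an unscoped name and, for real lookups, a runtime with a
// global fetch (Node 18+). `fetchImpl` is an illustrative parameter.
async function packageExists(name, fetchImpl = fetch) {
  const response = await fetchImpl(`https://registry.npmjs.org/${name}`);
  if (response.status === 404) return false; // nothing published under this name
  if (response.ok) return true;
  throw new Error(`Registry returned unexpected status ${response.status}`);
}

// Offline demonstration: a stand-in for fetch that "knows" one package.
const fakeFetch = async (url) =>
  url.endsWith('/known-package')
    ? { ok: true, status: 200 }
    : { ok: false, status: 404 };

packageExists('known-package', fakeFetch).then((found) =>
  console.log('known-package:', found)
);
packageExists('missing-package', fakeFetch).then((found) =>
  console.log('missing-package:', found)
);
```

Calling `packageExists('some-name')` without the second argument would query the real registry. A name that exists is only the first hurdle; the function you were told to import still has to be in that package's documentation.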

Security Vulnerabilities That Still Work

This pattern is harder to catch precisely because the code is correct, or at least appears to be. It does what you asked, produces the expected output, and passes casual inspection. But it may leave security holes behind it: a way for a malicious user to inject commands into a database query, a missing login check, user-submitted data that gets rendered on the page without being cleaned first. There is no error message to tip you off, and the "does it run?" test that catches hallucinated APIs will not help here.

SQL injection risk:

// The AI wrote this line to look up a user by name.
// It builds a database query by dropping the user's
// input directly into the command:
const query = `SELECT * FROM users WHERE name = '${userInput}'`;

// It works fine with normal input. But a malicious
// user could type: ' OR 1=1 --
// That rewrites the query so its condition is always
// true, returning every row in the users table. Safe
// code would use a placeholder that keeps user input
// separate from the command.
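
The placeholder approach can be sketched as follows. The `$1` syntax is an assumption borrowed from PostgreSQL-style drivers (others use `?`), and `buildUserLookup` is a hypothetical helper; a real project would hand the text and values to its database driver's query function rather than print them.

```javascript
// Sketch: keep the SQL command and the user's input in separate pieces.
// `buildUserLookup` is a hypothetical helper; `$1` is a PostgreSQL-style
// placeholder and is an assumption about the driver in use.
function buildUserLookup(userInput) {
  return {
    text: 'SELECT * FROM users WHERE name = $1', // the command never changes
    values: [userInput],                          // the input travels as data
  };
}

const normal = buildUserLookup('alice');
const hostile = buildUserLookup("' OR 1=1 --");

// The command text is identical for both inputs, so the hostile
// string cannot rewrite the query.
console.log(normal.text === hostile.text);
console.log(hostile.values);
```

Because the input is never spliced into the command, the database treats the hostile string as an odd-looking name to search for, not as SQL.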

Happy-Path-Only Error Handling

AI tends to generate code for the scenario where everything goes right: the database responds, the form field has a value, the file is exactly where it should be. When conditions are less cooperative, the code may fail silently or crash outright. You can often surface this by asking the AI directly: "What happens if the input is missing or malformed?" That single question forces it to confront the cases it skipped on the first pass.
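
As an illustration of what an answer to that question might look like in code, here is a minimal sketch of a parser that handles missing and malformed input explicitly. The `parseQuantity` function and its result shape are hypothetical, chosen only for this example.

```javascript
// Sketch: handle the unhappy paths explicitly instead of assuming a value.
// `parseQuantity` and its { ok, value, error } result shape are hypothetical.
function parseQuantity(raw) {
  if (raw === undefined || raw === null || String(raw).trim() === '') {
    return { ok: false, error: 'quantity is missing' };
  }
  const value = Number(String(raw).trim());
  if (!Number.isInteger(value) || value < 0) {
    return { ok: false, error: `quantity is not a non-negative whole number: ${raw}` };
  }
  return { ok: true, value };
}

console.log(parseQuantity('12'));      // the happy path
console.log(parseQuantity(''));        // missing input
console.log(parseQuantity('twelve'));  // malformed input
console.log(parseQuantity('-3'));      // out-of-range input
```

The caller now has to decide what to do with a failed result, which is the decision a happy-path-only version silently skips.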

Outdated Patterns

AI models are trained on code from a specific time window, so they sometimes generate code using deprecated functions (ones the language or library has officially retired), outdated library versions, or patterns that the community has since replaced. The code might work today, but if the approach was abandoned because of a known security flaw, the AI will cheerfully reintroduce it. Asking the AI to check the current documentation for the libraries it used is a straightforward way to surface this kind of drift, and models with web access can do the lookup themselves.
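
One concrete instance from Node.js: the `new Buffer(...)` constructor is deprecated, and the documentation points to `Buffer.from` and `Buffer.alloc` instead, in part for safety reasons. Older training data still contains plenty of the retired form. The sketch below shows the replacement calls; the variable names are illustrative.

```javascript
// Deprecated pattern that older code (and older training data) still uses:
//   const legacy = new Buffer('hello');
// Current replacements state the intent explicitly.
const text = Buffer.from('hello', 'utf8'); // bytes copied from a string
const zeroed = Buffer.alloc(4);            // four bytes, zero-filled

console.log(text.toString('hex')); // 68656c6c6f
console.log(zeroed);
```

Deprecation warnings and the library's current documentation are the two places this kind of drift tends to show up first.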

The Fix-It Loop

When you repeatedly ask the AI to "fix this" without explaining the actual problem, each round gives the model another chance to take the path of least resistance. Sometimes that means removing the safety check that surfaced the error. The error message disappears, the code looks cleaner, and a protection you needed is gone. Models are getting better at holding context across rounds, but the risk still increases when you go several iterations without reviewing what actually changed.
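
A small, hypothetical before-and-after shows what this looks like. Both functions are invented for illustration: the first refuses bad input, and the second is the kind of "fix" that makes the error disappear by deleting the check.

```javascript
// Before: the function refuses to average an empty list.
function averageStrict(values) {
  if (values.length === 0) {
    throw new Error('no values to average');
  }
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

// After a vague "fix this": the error is gone, and so is the check.
function averageSilenced(values) {
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

console.log(averageSilenced([2, 4])); // 3
console.log(averageSilenced([]));     // NaN, with no error to tip you off

try {
  averageStrict([]);
} catch (error) {
  console.log(error.message); // the protection the "fix" removed
}
```

Reviewing the diff between rounds, not just whether the error went away, is what catches this kind of change.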

Why Security Problems Slip Through

The failure patterns above are not equally easy to catch. Hallucinated APIs reveal themselves the moment you try to install them, and outdated patterns often surface through deprecation warnings. Security problems produce no warning. The code runs, the output looks right, and you only find out a protection is missing when someone exploits it.

AI models are trained on feedback that rewards acceptance, meaning code the user does not reject. Security features such as authentication flows, input validation, and permission checks add friction by design. An AI optimizing for acceptance may quietly strip away exactly the protections you need most, especially during a fix-it loop where you are pasting error messages and asking for quick resolutions. Documented examples give a sense of the pattern: a user reports a database connection error and the AI resolves it by making the database publicly accessible; a user asks to simplify a login flow and the AI removes the password requirement; a user asks to handle an API key error and the AI hardcodes the key directly in the source code, making it visible to anyone who views the file. Each of these "fixes" eliminates the error message. Each also eliminates a protection.
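
For the API key case, the usual alternative to hardcoding is to read the key from the environment and fail loudly when it is absent. The sketch below assumes a Node.js runtime; `getApiKey` and the variable name `SERVICE_API_KEY` are hypothetical.

```javascript
// Sketch: read a secret from the environment instead of the source file.
// SERVICE_API_KEY is a hypothetical variable name chosen for this example.
function getApiKey(env = process.env) {
  const key = env.SERVICE_API_KEY;
  if (!key) {
    // Fail loudly rather than falling back to a key pasted into the code.
    throw new Error('SERVICE_API_KEY is not set');
  }
  return key;
}

// Demonstration with a stand-in environment object.
console.log(getApiKey({ SERVICE_API_KEY: 'example-value' }));
```

If an AI "fix" replaces a lookup like this with a literal string, the error message goes away and the key ends up in every copy of the file.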

A Checking Routine

AI-generated code needs to be checked. The checks below focus on behavior and results, so they work even if you cannot read every line of the output. You do not need to memorize the specific tool names mentioned here; it is enough to know that these categories of tools exist and that you can ask the AI to set them up for you.

1. Define the job clearly

The vaguer your request, the more the AI has to guess, and guessing is where problems start. Before you prompt, write down what data goes in, what should come out, and what the boundaries are. If you are working with a specific library, say so; if your data has quirks, mention them. Breaking a complex project into small, testable pieces also helps, because smaller requests are easier to verify and debug than whole applications requested at once.

It also helps to make the tool read what is already there. If you are adding to an existing project, notebook, or codebase (the full collection of files that make up a piece of software), point it at the relevant files and ask it to understand the structure before generating anything. Left on its own, it will happily add a second CSV parser when one is already in use, or use one naming convention in a project that already uses another. You can say something as simple as "Read through the existing scripts in this project and tell me what libraries and patterns are already in use," or "This project already uses requests for HTTP calls, so stick with that."

2. Challenge the choices

Push back on the AI's decisions and make it explain itself. Ask why it chose a particular library, what tradeoffs the approach involves, what assumptions it is making about your data, and what could go wrong. If the answers are vague or circular, expect bugs to surface later. The question "Is there a simpler way to do this?" is particularly useful, since AI-generated code tends toward over-engineering.

Even if you cannot audit every line of code, the output is still available for verification. If the AI wrote a script to clean your dataset, spot-check the results: did it drop rows it should have kept, did it mangle any values, does a chart match what you know about the data? Compare row counts before and after, inspect a few records you already understand, and make the code prove itself on a case you could check by hand.
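
The spot checks described above can be sketched in a few lines. Everything here is illustrative: `cleanRows` stands in for whatever cleaning script the AI produced, and the three sample rows are made up so that the right answer is known in advance.

```javascript
// Hypothetical cleaning step: trim names and drop rows with no name.
function cleanRows(rows) {
  return rows
    .map((row) => ({ ...row, name: String(row.name ?? '').trim() }))
    .filter((row) => row.name !== '');
}

// A tiny dataset where the correct result is known by hand.
const raw = [
  { id: 1, name: ' Ada ', score: 91 },
  { id: 2, name: '', score: 78 },
  { id: 3, name: 'Grace', score: 88 },
];
const cleaned = cleanRows(raw);

// Check 1: how many rows were dropped, and is that the number you expected?
console.log(`rows before: ${raw.length}, after: ${cleaned.length}`);

// Check 2: inspect a record you already understand.
console.log(cleaned.find((row) => row.id === 1));

// Check 3: columns the cleaning step should not touch are unchanged.
const scoresIntact = cleaned.every(
  (row) => row.score === raw.find((original) => original.id === row.id).score
);
console.log('scores intact:', scoresIntact);
```

Only one row here has an empty name, so a drop of exactly one row is expected; any other count would be a reason to look closer before trusting the script on the full dataset.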

3. Set up real tests

Go beyond informal "run it and hope" verification. Ask the AI to set up the testing and code-quality tools that professional developers use to catch bugs, and then ask it to check its own claims against current documentation. You can phrase these as direct instructions:

"Set up a test suite for this project and write tests for the data cleaning function." (A test suite is a collection of small checks that run your code with known inputs and verify it produces the expected outputs.)
"Add a linter and fix any issues it finds." (A linter is a tool that scans code for common mistakes, style problems, and potential bugs without running it.)
"Search the package registry and confirm that this library actually exists. Show me its documentation page."
"Look up the OpenAlex API docs and verify that the /works endpoint accepts these parameters."

The point of these tools is to make verification a separate, deliberate step. A linter or test suite catches whole categories of bugs automatically, and current documentation catches the fabricated or outdated details that a model will present with full confidence.
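
To make the idea of a test concrete, here is a minimal sketch using the assertion module that ships with Node.js. It assumes a CommonJS script, and `normalizeEmails` is a hypothetical function under test. A real project would more likely use a test runner, which is the part you can ask the AI to set up.

```javascript
const { strictEqual, deepStrictEqual } = require('node:assert/strict');

// Hypothetical function under test: tidies a list of email strings.
function normalizeEmails(values) {
  return values
    .map((value) => String(value).trim().toLowerCase())
    .filter((value) => value.includes('@'));
}

// Each check pairs a known input with the output you expect.
// A failed check throws, so a mismatch cannot pass unnoticed.
deepStrictEqual(normalizeEmails([' Ada@Example.com ']), ['ada@example.com']);
deepStrictEqual(normalizeEmails(['not-an-email', 'x@y.org']), ['x@y.org']);
strictEqual(normalizeEmails([]).length, 0);

console.log('all checks passed');
```

The value is less in any single check than in being able to rerun all of them after every AI edit.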

4. When things break, describe the failure precisely

How you describe a failure to the AI shapes the quality of the fix. This is where the fix-it loop from earlier becomes a practical concern: if you just paste an error and say "fix this," the model may quietly remove the check that surfaced the error. Share the full error message, describe what you expected versus what actually happened, include a sample of the real data, and say what you have already tried. Without that precision, the model will cheerfully fix the wrong problem.

Version control gives you a way to undo a bad fix. Commit working code after each successful change, and ideally before asking the AI to make another consequential edit, so a bad fix can be reversed without losing the working version you started from.

Before You Accept the Output

A short list of questions worth running through before accepting AI-generated code into your project.

Acceptance Checklist

Does this code actually run without errors?
Do the imported libraries and functions actually exist? (Check the package registry for your language: PyPI for Python, npm for JavaScript, or the library's own docs.)
Have I tested the output with data I already know the answer to?
Do the results make sense? Are the numbers, counts, or patterns plausible given what I know?
Does the code handle edge cases (unusual or boundary conditions) in my data, like empty fields, missing values, or unexpected formats?
Are there any hardcoded secrets, API keys, or credentials in the code?
Can I describe what this code does at a high level, even if I can't read every line?
Have I documented what I asked the AI and what it produced, so I can retrace it later?

None of these checks assume you can read code fluently. They assume you are willing to test the output against cases you already understand and remember that the tool will sometimes be confidently wrong.
