Your AI Is Not Dumb, Your Prompt Is

The engineering mindset behind better AI outputs

AHMED ASHRAF·2025-12-19·10 min

Introduction


Most developers waste 30% of their time rewriting prompts that should've worked the first time. I've been there. You ask an LLM for an Angular component, and it gives you something that almost works—wrong lifecycle hooks, missing dependency injection, or code that looks like it's from AngularJS days. The problem isn't the AI. It's how we're talking to it.

After taking Sabrina Goldfarb's Prompt Engineering course on Frontend Masters, I've documented what actually works for frontend developers—the structure, context, and constraints that turn vague requests into production-ready Angular code. I've applied these techniques daily in my workflow, and they've transformed how I interact with AI tools. No fluff. Just practical prompt engineering fundamentals that will save you hours of debugging and rewrites. If you're tired of playing "prompt roulette" every time you need help with code, this breakdown is for you.

Key takeaways

  1. What is Prompt Engineering?

  2. Temperature & Top P

  3. Token Limits & Context Windows

  4. Standard Prompt


What is Prompt Engineering?

Prompt Engineering: Definition and Nature

Prompt engineering (PE) is defined by OpenAI as the process of writing effective instructions for a model such that it consistently generates content that meets your requirements.

Prompting to get the desired output is considered a mix of art and science.

  • The science involves techniques and best practices that have been empirically proven to work, often based on research papers.

  • The art acknowledges that individual outputs will differ, even if users enter the exact same prompt into the exact same model, demonstrating that the output is not strictly determined.

What Prompt Engineering Is Not:

  • It is not magic; it cannot change the fundamental limitations of LLMs or suddenly make them deterministic.

  • It is not the only tool available (other tools include RAG or MCP), but it is the most accessible tool. Focusing only on prompting can achieve 70% or 80% of the desired results.

  • While PE cannot make LLMs deterministic, utilizing systematic approaches can lead to measurable results and more consistent outputs.


What are Large Language Models (LLMs)

LLMs are Large Language Models, systems commonly referred to as AI, such as GPT-4 and Claude. The "Large" component refers to the substantial amount of data they were trained on, and the "Language" part relates to their training on natural language.

LLMs as Pattern Predictors:

  • LLMs are pattern predictors that generate one token at a time.

  • They predict the next most likely token based on the input.

  • Because they generate output token by token, there is no planning ahead; LLMs only "think" while they are actively typing.


Nondeterminism

Nondeterminism is a crucial characteristic of LLMs.

  • A calculator is deterministic: for a given input (like 2 + 2), it always returns the same output (4).

  • LLMs are nondeterministic: they may not always predict the single next most likely token.

  • If the same prompt is entered multiple times, different answers will likely result, even if using the exact same model at the exact same time.

  • For example, asking "What color is the sky" might return "blue," "gray," or a paragraph describing orange and pink sunsets.

  • Generally, if you enter the same prompt ten times, you will receive a different answer nearly every time.


Training and Data Cutoff Dates

LLMs are trained on data collected up to a specific training cutoff date.

  • Information collected after that cutoff date may be slightly less reliable.

  • Developers should be aware of this, especially when using new frameworks or languages.

  • LLMs are constantly updated, but a cutoff date always exists.

  • Many LLMs now have tool access, such as web search, allowing them to retrieve information newer than their cutoff.


The Transformer Architecture

LLMs work similarly to a phone’s autocomplete feature, which predicts the next token.

  • Traditional autocomplete remembers about five or ten tokens.

  • The transformer architecture allows attention over thousands of tokens.

  • Introduced in the 2017 paper "Attention Is All You Need".

  • The attention mechanism helps models understand which tokens matter most.


Scaling Laws

Alongside transformers, scaling laws showed that increasing model size by 10X led to models becoming roughly 100X more capable.

  • This explains rapid AI adoption.

  • Context windows scaled from ~4,000 tokens to over a million.


Temperature & Top P

Temperature and Top P control randomness and determinism in LLM outputs.


1. Temperature

Temperature controls how often the model chooses the most likely next token.

  • Default: 1

  • 0 → near-deterministic

  • 2 → chaotic and unusable

Optimal Uses

| Temperature | Use Case | Description |
| --- | --- | --- |
| Low (0–0.5) | Factual answers, code, data extraction | Precise, low-risk outputs |
| High (up to 1.3) | Creative writing, brainstorming | More varied and friendly outputs |

Availability: Adjustable via APIs, not standard chat UIs.
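To build intuition for what temperature actually does, here is a minimal sketch of temperature-scaled softmax over token logits. The logit values are made up for illustration; real models operate over vocabularies of tens of thousands of tokens, but the mechanism is the same: dividing logits by a low temperature sharpens the distribution toward the top token, while a high temperature flattens it.

```typescript
// Illustrative only: how temperature rescales token logits before sampling.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical logits for "blue", "gray", "orange"
const logits = [2.0, 1.0, 0.0];

const cold = softmaxWithTemperature(logits, 0.2); // near-deterministic: top token dominates
const hot = softmaxWithTemperature(logits, 2.0);  // flatter: more randomness in sampling
```

At temperature 0.2 the "blue" token ends up with over 99% of the probability mass; at 2.0 it drops to around half, which is why high temperatures feel chaotic.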


2. Top P

Top P is a cumulative probability cutoff.

  • 1 → all tokens considered

  • Lower values remove low-probability tokens

Example:

  • Blue (75%)

  • Gray (20%)

  • Orange (5%)

Top P = 0.5 → only "blue" considered.

Combined Use:
Often used with temperature in production systems.


Analogy

  • Temperature = how adventurous the choice is.

  • Top P = how many menu items are even allowed.


Token Limits & Context Windows

I. Tokenization

  • Tokens ≈ 0.75 words

  • Case-sensitive

  • Spaces count

  • Code tokenizes differently
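The 0.75-words-per-token rule of thumb gives a quick napkin estimate of prompt size. This sketch is only an approximation; a real tokenizer (like OpenAI's tiktoken) splits on subwords, is case-sensitive, and tokenizes code very differently:

```typescript
// Rough rule of thumb from above: 1 token ≈ 0.75 English words.
// Real tokenizers differ, especially on code and punctuation.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words / 0.75);
}

const prompt = "Generate a standalone Angular component for a login form";
const approx = estimateTokens(prompt); // 9 words ≈ 12 tokens
```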


II. Memory and the Context Window

  • LLMs have no true memory

  • They rely on the context window

Key Points:

  • Entire conversation is resent every turn

  • Older messages drop silently

  • Early instructions can be forgotten

  • Long contexts increase hallucination risk

Codebases:

  • Large pasted codebases overwhelm context

  • Prefer minimal relevant files
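The "older messages drop silently" behavior can be sketched as a trimming loop. This is a simplified model of what chat providers do, not any specific implementation: the system message is pinned, and the oldest conversational messages are evicted first once the budget is exceeded.

```typescript
// Sketch: trimming a conversation to a token budget. The system message is
// never dropped; the oldest user/assistant messages go first.
type Message = { role: "system" | "user" | "assistant"; content: string };

function trimToBudget(
  messages: Message[],
  budget: number,
  countTokens: (s: string) => number
): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let total = messages.reduce((sum, m) => sum + countTokens(m.content), 0);
  // Evict the oldest non-system messages until we fit.
  while (total > budget && rest.length > 0) {
    const dropped = rest.shift()!;
    total -= countTokens(dropped.content);
  }
  return [...system, ...rest];
}

// Demo with a crude word-count "tokenizer" (stand-in for a real one)
const countWords = (s: string) => s.split(/\s+/).filter(Boolean).length;

const convo: Message[] = [
  { role: "system", content: "Be concise." },       // 2
  { role: "user", content: "first question here" }, // 3
  { role: "assistant", content: "first answer" },   // 2
  { role: "user", content: "second question" },     // 2
];

const trimmed = trimToBudget(convo, 6, countWords); // drops the oldest user message
```

This is exactly why early instructions "get forgotten" in long chats: they were evicted, not ignored.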


What Is the System Message?

The system message is invisible instructions that define:

  • Who the AI is

  • How it behaves

  • What it must avoid

What is its Use?

  • Consistency

  • Safety

  • Customization

  • Provider Control


Relation to the Context Window

  • System message always exists

  • Takes context space

  • Never dropped

  • User messages drop first

Notebook Analogy:

  • Pages 1–10: System message

  • Pages 11–100: Conversation

  • Old user pages erased first

The system message:

  • Defines AI behavior

  • Is invisible but permanent

  • Is set by the provider in chat UIs, and adjustable via APIs

  • Is stronger than user prompts

  • Can be jailbroken

  • Should never contain secrets
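When you move from a chat UI to the API, you set the system message yourself via the `system` role. Here is a sketch of an OpenAI-style chat request body; the model name is a placeholder, and the system prompt content is just an example of constraining the model for Angular work:

```typescript
// Sketch of an OpenAI-style chat request body. The "system" role is how API
// users supply their own system message (model name is illustrative).
type ChatMessage = { role: "system" | "user"; content: string };

const body = {
  model: "gpt-4o",   // placeholder model name
  temperature: 0.2,  // low temperature: we want predictable code output
  messages: [
    {
      role: "system",
      content:
        "You are a senior Angular developer. Use standalone components, " +
        "strict TypeScript, and never use deprecated AngularJS patterns.",
    },
    {
      role: "user",
      content: "Generate a login form component with reactive forms.",
    },
  ] as ChatMessage[],
};
```

Because the system message consumes context space on every turn, keep it tight: behavioral rules and constraints, never secrets.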


The Standard Prompt

  • Simplest form of prompting

  • Direct question or instruction

  • Better structure = better output

Standard Prompt Examples

  1. "What color is the sky?"

    • Demonstrates nondeterminism

    • Typical answer: blue (scattering)

    • Other tokens: orange, pink, gray, etc.

  2. "Why is thunder so scary?"

    • Reasons: surprise, loudness, instincts, danger

    • Follow-up question encouraged


Outro

No more back-and-forth. No more "close but not quite" components that need heavy refactoring. The difference between frustrating AI interactions and productive ones comes down to how you structure your prompts. Give it the right context, constraints, and examples—and you'll get code that actually ships. These techniques have saved me countless hours of debugging AI-generated code. But here's the thing—this is just Part 1.

In Part 2, I'm diving into advanced prompt engineering techniques: chain-of-thought prompting, few-shot learning for complex components, and how to build reusable prompt templates that scale across your entire codebase.

What's your biggest challenge when prompting AI for code? Drop it in the comments—I might cover it in Part 2.

If this helped you, share it with a fellow developer who's still fighting with their prompts.

References:
Write Better Prompts for Cursor, Claude & Copilot | Frontend Masters