Introduction
Most developers waste 30% of their time rewriting prompts that should've worked the first time. I've been there. You ask an LLM for an Angular component, and it gives you something that almost works—wrong lifecycle hooks, missing dependency injection, or code that looks like it's from AngularJS days. The problem isn't the AI. It's how we're talking to it.
After taking Sabrina Goldfarb's Prompt Engineering course on Frontend Masters, I've documented what actually works for frontend developers—the structure, context, and constraints that turn vague requests into production-ready Angular code. I've applied these techniques daily in my workflow, and they've transformed how I interact with AI tools. No fluff. Just practical prompt engineering fundamentals that will save you hours of debugging and rewrites. If you're tired of playing "prompt roulette" every time you need help with code, this breakdown is for you.
Key takeaways
What is Prompt Engineering?
Temperature & Top P
Token Limits & Context Windows
Standard Prompt
What is Prompt Engineering?
Prompt Engineering: Definition and Nature
Prompt engineering (PE) is defined by OpenAI as the process of writing effective instructions for a model such that it consistently generates content that meets your requirements.
Prompting to get the desired output is considered a mix of art and science.
The science involves techniques and best practices that have been empirically proven to work, often based on research papers.
The art acknowledges that individual outputs will differ, even if users enter the exact same prompt into the exact same model, demonstrating that the output is not strictly determined.
What Prompt Engineering Is Not:
It is not magic; it cannot change the fundamental limitations of LLMs or suddenly make them deterministic.
It is not the only tool available (other tools include RAG or MCP), but it is the most accessible tool. Focusing only on prompting can achieve 70% or 80% of the desired results.
While PE cannot make LLMs deterministic, utilizing systematic approaches can lead to measurable results and more consistent outputs.
What are Large Language Models (LLMs)
LLMs are Large Language Models, systems commonly referred to as AI, such as GPT-4 and Claude. The "Large" component refers to the substantial amount of data they were trained on, and the "Language" part relates to their training on natural language.
LLMs as Pattern Predictors:
LLMs are pattern predictors that generate one token at a time.
They predict the next most likely token based on the input.
Because they generate output token by token, there is no planning ahead; LLMs only "think" while they are actively typing.
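To make "one token at a time" concrete, here is a toy sketch: a hypothetical bigram table stands in for the model, and greedy decoding picks the single most likely continuation at each step, with no lookahead. Every token and probability here is invented purely for illustration.

```typescript
// Toy illustration of token-by-token generation (not a real LLM).
// The "model" is a hypothetical bigram table: for each token, the
// possible next tokens and their probabilities.
const bigrams: Record<string, [string, number][]> = {
  "the": [["sky", 0.6], ["sun", 0.4]],
  "sky": [["is", 1.0]],
  "is": [["blue", 0.7], ["gray", 0.3]],
};

// Greedy decoding: always take the single most likely next token.
// There is no planning ahead — each step sees only what has been
// generated so far.
function generate(start: string, maxTokens: number): string[] {
  const out = [start];
  for (let i = 0; i < maxTokens; i++) {
    const next = bigrams[out[out.length - 1]];
    if (!next) break; // no known continuation
    out.push(next.reduce((a, b) => (b[1] > a[1] ? b : a))[0]);
  }
  return out;
}
```

Calling `generate("the", 3)` walks the table one token at a time and produces `the sky is blue` — each word is chosen without knowing what the sentence will become.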
Nondeterminism
Nondeterminism is a crucial characteristic of LLMs.
A calculator is deterministic: for a given input (like 2 + 2), it should always return the same correct output (4).
LLMs are nondeterministic: they may not always predict the single next most likely token.
If the same prompt is entered multiple times, different answers will likely result, even if using the exact same model at the exact same time.
For example, asking "What color is the sky?" might return "blue," "gray," or a paragraph describing orange and pink sunsets.
Enter the same prompt ten times and you will almost certainly get ten slightly different answers.
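The source of this nondeterminism is sampling: rather than always emitting the top token, the model draws from a probability distribution. Here is a minimal sketch, using hypothetical probabilities that mirror the "sky" example (the random source is injected so the behavior is easy to follow).

```typescript
// Sampling sketch: the model draws from the probability distribution
// over next tokens, so repeated runs can differ. The distribution
// below is hypothetical.
const skyTokens: [string, number][] = [
  ["blue", 0.75], ["gray", 0.20], ["orange", 0.05],
];

// Pick a token by walking the cumulative distribution until the
// random draw is covered.
function sample(dist: [string, number][], rand: () => number): string {
  let r = rand();
  for (const [tok, p] of dist) {
    r -= p;
    if (r <= 0) return tok;
  }
  return dist[dist.length - 1][0];
}
```

With `Math.random` as the random source, repeated calls return "blue" most of the time, but occasionally "gray" or "orange" — the same prompt, different answers.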
Training and Data Cutoff Dates
LLMs are trained on data collected up to a specific training cutoff date.
Information collected after that cutoff date may be slightly less reliable.
Developers should be aware of this, especially when using new frameworks or languages.
LLMs are constantly updated, but a cutoff date always exists.
Many LLMs can now use tools such as web search, allowing them to retrieve information newer than their training cutoff.
The Transformer Architecture
LLMs work similarly to a phone’s autocomplete feature, which predicts the next token.
Traditional autocomplete remembers about five or ten tokens.
The transformer architecture allows attention over thousands of tokens.
Introduced in the 2017 paper "Attention Is All You Need".
The attention mechanism helps models understand which tokens matter most.
Scaling Laws
Alongside transformers, scaling laws showed that increasing model size by 10X led to models becoming 100X more capable.
This explains rapid AI adoption.
Context windows scaled from ~4,000 tokens to over a million.
Temperature & Top P
Temperature and Top P are two settings that control how random or deterministic an LLM's output is.
1. Temperature
Temperature controls how strongly the model favors the most likely next token.
Default: 1 (balanced)
0 → near-deterministic
2 → chaotic and mostly unusable
Optimal Uses
| Temperature | Use Case | Description |
|---|---|---|
| Low (0–0.5) | Factual answers, code, data extraction | Precise, low-risk outputs |
| High (up to 1.3) | Creative writing, brainstorming | More varied and friendly outputs |
Availability: Adjustable via APIs, not standard chat UIs.
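Under the hood, temperature divides the model's raw scores (logits) before they are turned into probabilities, which is why low values sharpen the distribution and high values flatten it. A minimal sketch, using made-up logits:

```typescript
// Temperature sketch: logits are divided by the temperature before
// softmax. Low temperature -> the top token dominates (near-
// deterministic); high temperature -> flatter, more random choices.
// The logits passed in are hypothetical.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

For logits `[2, 1, 0]`, a temperature of 0.5 pushes the top token's probability toward 1, while a temperature of 2 spreads probability much more evenly across all three.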
2. Top P
Top P is a cumulative probability cutoff.
1 → all tokens considered
Lower values remove low-probability tokens
Example:
Blue (75%)
Gray (20%)
Orange (5%)
Top P = 0.5 → only "blue" considered.
Combined Use:
Often used with temperature in production systems.
Analogy
Temperature = how adventurous the choice is.
Top P = how many menu items are even allowed.
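The menu analogy translates directly into code. This sketch implements the cumulative cutoff: sort tokens by probability, keep the smallest set whose cumulative probability reaches Top P, and renormalize. The probabilities are the hypothetical "sky" ones from earlier.

```typescript
// Top P (nucleus) sketch: only tokens inside the cumulative-probability
// cutoff remain on the "menu"; everything else is removed before
// sampling. Probabilities are hypothetical.
function topPFilter(dist: [string, number][], topP: number): [string, number][] {
  const sorted = [...dist].sort((a, b) => b[1] - a[1]);
  const kept: [string, number][] = [];
  let cumulative = 0;
  for (const [tok, p] of sorted) {
    kept.push([tok, p]);
    cumulative += p;
    if (cumulative >= topP) break; // nucleus reached
  }
  const total = kept.reduce((s, [, p]) => s + p, 0);
  // Renormalize so the surviving tokens' probabilities sum to 1.
  return kept.map(([tok, p]): [string, number] => [tok, p / total]);
}
```

With blue at 75%, gray at 20%, and orange at 5%, a Top P of 0.5 leaves only "blue" on the menu; raising it to 0.9 lets "gray" back in, and "orange" only survives at Top P near 1.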
Token Limits & Context Windows
I. Tokenization
1 token ≈ 0.75 words (roughly 4 characters of English text)
Case-sensitive
Spaces count
Code tokenizes differently
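Those rules of thumb are enough for a rough budget check before pasting text into a prompt. This sketch averages the word-based and character-based heuristics; real tokenizers are case- and whitespace-sensitive, so treat the result as a ballpark figure only.

```typescript
// Rough token estimation, assuming the common heuristics that one
// token is about 0.75 words or about 4 characters of English text.
// Real BPE tokenizers differ (and code tokenizes differently), so
// this is only a ballpark estimate.
function estimateTokens(text: string): number {
  const byWords = text.trim().split(/\s+/).length / 0.75;
  const byChars = text.length / 4;
  return Math.round((byWords + byChars) / 2);
}
```

For example, `estimateTokens("What color is the sky?")` lands around 6 tokens — close enough for budgeting, not for billing.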
II. Memory and the Context Window
LLMs have no true memory
They rely on the context window
Key Points:
Entire conversation is resent every turn
Older messages drop silently
Early instructions can be forgotten
Long contexts increase hallucination risk
Codebases:
Large pasted codebases overwhelm context
Prefer minimal relevant files
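Because the entire conversation is resent every turn, clients have to trim it to fit the window — and the oldest messages are what gets dropped. Here is a minimal sketch of that behavior; the token counts and budget are hypothetical.

```typescript
// Context-window sketch: the whole conversation is resent every turn,
// so when it exceeds the budget, the oldest messages drop first —
// silently. Token counts here are hypothetical.
interface Message { text: string; tokens: number; }

function fitToWindow(history: Message[], budget: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk backwards from the newest message; stop once the budget is full.
  for (let i = history.length - 1; i >= 0; i--) {
    if (used + history[i].tokens > budget) break;
    kept.unshift(history[i]);
    used += history[i].tokens;
  }
  return kept; // anything older was dropped without warning
}
```

This is why early instructions "disappear" in long chats: they are literally no longer in the input the model sees.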
What is the System Message?
The system message is invisible instructions that define:
Who the AI is
How it behaves
What it must avoid
What is it Used For?
Consistency
Safety
Customization
Provider Control
Relation to the Context Window
System message always exists
Takes context space
Never dropped
User messages drop first
Notebook Analogy:
Pages 1–10: System message
Pages 11–100: Conversation
Old user pages erased first
The system message:
Defines AI behavior
Is invisible but persistent
Takes priority over user prompts
Is set by the provider, and adjustable via APIs
Can be jailbroken
Should never contain secrets
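The "never dropped" rule can be seen in code. This sketch uses the role-based message shape common in chat APIs (`system` / `user` / `assistant`); the trimming logic, budget, and per-message cost of one token per ~4 characters are simplified assumptions, not any provider's actual behavior.

```typescript
// Sketch: when trimming a conversation to fit the context window, the
// system message is pinned — oldest user/assistant messages drop
// first, the system message never does. Costs and budget are
// simplified assumptions.
interface ChatMessage { role: "system" | "user" | "assistant"; content: string; }

function trimKeepingSystem(messages: ChatMessage[], budget: number): ChatMessage[] {
  const cost = (m: ChatMessage) => Math.ceil(m.content.length / 4);
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((s, m) => s + cost(m), 0); // system always counted
  const kept: ChatMessage[] = [];
  for (let i = rest.length - 1; i >= 0; i--) {
    if (used + cost(rest[i]) > budget) break;
    kept.unshift(rest[i]);
    used += cost(rest[i]);
  }
  return [...system, ...kept];
}
```

Note the trade-off the notebook analogy describes: the system message always occupies part of the budget, so a long system message leaves fewer "pages" for the conversation.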
The Standard Prompt
Simplest form of prompting
Direct question or instruction
Better structure = better output
Standard Prompt Examples
"What color is the sky?"
Demonstrates nondeterminism
Typical answer: blue (due to light scattering)
Other tokens: orange, pink, gray, etc.
"Why is thunder so scary?"
Reasons: surprise, loudness, instincts, danger
Follow-up question encouraged
Outro
No more back-and-forth. No more "close but not quite" components that need heavy refactoring. The difference between frustrating AI interactions and productive ones comes down to how you structure your prompts. Give it the right context, constraints, and examples—and you'll get code that actually ships. These techniques have saved me countless hours of debugging AI-generated code. But here's the thing—this is just Part 1.
In Part 2, I'm diving into advanced prompt engineering techniques: chain-of-thought prompting, few-shot learning for complex components, and how to build reusable prompt templates that scale across your entire codebase.
What's your biggest challenge when prompting AI for code? Drop it in the comments—I might cover it in Part 2.
If this helped you, share it with a fellow developer who's still fighting with their prompts.

