The Geometry Behind The Dot Product: Unit Vectors, Projections, And Intuition

This article is the first of three parts. Each part stands on its own, so you don’t need to read the others to understand it.

The dot product is one of the most important operations in machine learning – but it’s hard to understand without the right geometric foundations. In this first part, we build those foundations:

· Unit vectors

· Scalar projection

· Vector projection

Whether you are learning linear algebra for the first time or want to refresh these concepts, this article is for you.

Along the way, we also introduce and explain the dot product itself.

The vector projection section is included as an optional bonus: helpful, but not necessary for understanding the dot product.

The next part explores the dot product in greater depth: its geometric meaning, its relationship to cosine similarity, and why the difference matters.

The final part connects these ideas to two major applications: recommendation systems and NLP.

A vector $\large \mathbf{\vec{v}}$ is called a unit vector if its magnitude is 1:

$\LARGE |\mathbf{\vec{v}}| = 1$

To remove the magnitude of a non-zero vector while keeping its direction, we can normalize it. Normalization scales the vector by the factor:

$\LARGE \frac{1}{|\mathbf{\vec{v}}|}$

The normalized vector $\large \mathbf{\hat{v}}$ is the unit vector in the direction of $\large \mathbf{\vec{v}}$:

$\LARGE \begin{array}{|c|} \hline \mathbf{\hat{v}} = \frac{\mathbf{\vec{v}}}{|\mathbf{\vec{v}}|} \\ \hline \end{array}$

Notation 1. From now on, whenever we normalize a vector $\large \mathbf{\vec{v}}$, or write $\large \mathbf{\hat{v}}$, we assume that $\large \mathbf{\vec{v}} \neq 0$. This notation, along with the ones that follow, also applies to the following articles.

This operation naturally separates a vector into its magnitude and its direction:

$\LARGE \begin{array}{|c|} \hline \rule{0pt}{2.5em} \mathbf{\vec{v}} = \underbrace{|\mathbf{\vec{v}}|}_{\text{magnitude}} \cdot \underbrace{\mathbf{\hat{v}}}_{\text{direction}} \\[4.5em] \hline \end{array}$

Figure 1 illustrates this idea: $\large \mathbf{\vec{v}}$ and $\large \mathbf{\hat{v}}$ point in the same direction, but have different magnitudes.

Figure 1: Separating “How Much” from “Which Way”. Any vector can be written as the product of its magnitude and its unit vector, which preserves direction but has length 1. Image by Author (created using Claude).
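As a rough illustration of this split, here is a minimal Python sketch using only the standard library. The helper names `magnitude` and `normalize` are chosen for this example and are not taken from any particular library.

```python
import math

def magnitude(v):
    """Euclidean length of a vector given as a sequence of numbers."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Return the unit vector pointing in the same direction as v."""
    length = magnitude(v)
    if length == 0:
        # Matches Notation 1: the zero vector has no direction to keep.
        raise ValueError("cannot normalize the zero vector")
    return [x / length for x in v]

v = [3.0, 4.0]
length = magnitude(v)   # "how much"
v_hat = normalize(v)    # "which way"

print(length)  # 5.0
print(v_hat)   # [0.6, 0.8]

# Magnitude times direction gives back the original vector
# (up to floating-point rounding).
rebuilt = [length * x for x in v_hat]
print(rebuilt)
```

Comparing floats with `math.isclose` rather than `==` is the safer habit here, since normalization involves a square root and a division.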

Similarity of unit vectors

In two dimensions, all unit vectors lie on the unit circle (radius 1, centered at the origin). A unit vector that forms an angle θ with the x-axis has coordinates (cos θ, sin θ).

This means the angle between two unit vectors encodes a natural similarity score. As we will show shortly, this score is exactly cos θ: equal to 1 when they point the same way, 0 when perpendicular, and −1 when opposite.

Notation 2. Throughout this article, θ denotes the smallest angle between the two vectors, so $0^\circ \leq \theta \leq 180^\circ$.

In practice, we don’t know θ directly – we know the vectors’ coordinates.

We can show why the dot product of two unit vectors, $\large \hat{a}$ and $\large \hat{b}$, equals cos θ using a geometric argument in three steps:

1. Rotate the coordinate system until $\large \hat{b}$ lies along the x-axis. Rotation doesn’t change angles or magnitudes.

2. Read off the new coordinates. After rotation, $\large \hat{b}$ has coordinates (1, 0). Since $\large \hat{a}$ is a unit vector at angle θ from the x-axis, the unit circle definition gives its coordinates as (cos θ, sin θ).

3. Multiply corresponding components and sum:

$\Large \begin{aligned} \hat{a} \cdot \hat{b} = a_x \cdot b_x + a_y \cdot b_y = \\ \cos\theta \cdot 1 + \sin\theta \cdot 0 = \cos\theta \end{aligned}$

This sum of component-wise products is called the dot product:

$\Large \boxed{ \begin{aligned} \vec{a} \cdot \vec{b} = a_1 \cdot b_1 + a_2 \cdot b_2 \\ + \cdots + a_n \cdot b_n \end{aligned} }$

See the illustration of these three steps in Figure 2 below:

Figure 2: By rotating our perspective to align with the x-axis, the coordinate math simplifies beautifully to reveal why the two unit vectors’ dot product is equal to cos(θ). Image by Author (created using Claude).
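The claim that the dot product of two unit vectors equals cos θ can be checked numerically. The sketch below is only an illustration; `dot` and `unit_at` are hypothetical helper names, and the angles 75° and 15° are arbitrary choices.

```python
import math

def dot(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def unit_at(angle_deg):
    """Unit vector on the unit circle at the given angle from the x-axis."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))

a_hat = unit_at(75)
b_hat = unit_at(15)
theta = math.radians(75 - 15)  # the angle between them is 60 degrees

# The two printed values agree up to floating-point rounding.
print(dot(a_hat, b_hat))
print(math.cos(theta))

# The three landmark cases: same direction, perpendicular, opposite.
same = dot(unit_at(30), unit_at(30))
perpendicular = dot(unit_at(0), unit_at(90))
opposite = dot(unit_at(0), unit_at(180))
```

Note that neither vector lies on the x-axis here, yet the result depends only on the angle between them.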

Everything above was shown in 2D, but the same result holds in any number of dimensions. Any two vectors, no matter how many dimensions they live in, always lie in a single flat plane through the origin. We can rotate that plane to align with the xy-plane, and from there the 2D proof applies exactly.

Notation 3. In the diagrams that follow, we often draw one of the vectors (typically $\large \vec{b}$) along the horizontal axis. When $\large \vec{b}$ is not already aligned with the x-axis, we can always rotate our coordinate system as we did above (the “rotation trick”). Since rotation preserves all lengths, angles, and dot products, every formula derived in this orientation holds for any direction of $\large \vec{b}$.
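The rotation trick can be tried out in a few lines of Python. This is a sketch under the assumption that we work in 2D and rotate with the standard rotation formulas; the `rotate` helper and the sample vectors are made up for the example.

```python
import math

def dot(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def rotate(v, angle_deg):
    """Rotate a 2D vector counter-clockwise by the given angle."""
    t = math.radians(angle_deg)
    x, y = v
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

a = (2.0, 3.0)
b = (4.0, 1.0)

# Rotate both vectors by the same amount so that b lands on the positive x-axis.
angle_of_b = math.degrees(math.atan2(b[1], b[0]))
a_rot = rotate(a, -angle_of_b)
b_rot = rotate(b, -angle_of_b)

print(b_rot)               # second coordinate is (numerically) zero
print(dot(a, b))           # 11.0
print(dot(a_rot, b_rot))   # the same value, up to rounding
```

After the rotation, `b_rot` is approximately (|b|, 0), which is exactly the orientation used in the diagrams.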

A vector can contribute in many directions at once, but often we care about only one direction.

Scalar projection answers the question: How much of $\large \boldsymbol{\vec{a}}$ lies along the direction of $\large \boldsymbol{\vec{b}}$?

This value is negative if the projection points in the opposite direction of $\large \vec{b}$.

The Shadow Analogy

The most intuitive way to think about scalar projection is as the length of a shadow. Imagine you hold a stick (vector $\large \vec{a}$) at an angle above the ground (the direction of $\large \vec{b}$), and a light source shines straight down from above.

The length of the shadow that the stick casts on the ground is the scalar projection.

The animated figure below illustrates this idea:

Figure 3: **Scalar projection as a shadow.**
The scalar projection measures how much of vector a lies in the direction of b.
It equals the length of the shadow that a casts onto b (Woo, 2023). The GIF was created by Claude.

Calculation

Imagine a light source shining straight down onto the line PS (the direction of $\large \vec{b}$). The “shadow” that $\large \vec{a}$ (the arrow from P to Q) casts onto that line is exactly the segment PR. You can see this in Figure 4.

Figure 4: **Measuring Directional Alignment.** The scalar projection (segment PR) visually answers the core question: “How much of vector a lies in the exact direction of vector b.” Image by Author (created using Claude).

Deriving the formula

Now look at the triangle $\large PQR$: the perpendicular drop from $\large Q$ creates a right triangle, and its sides are:

$\large PQ = |\vec{a}|$ (the hypotenuse).
$\large PR$ (the adjacent side – the shadow).
$\large QR$ (the opposite side – the perpendicular component).

From this triangle:

The angle between $\large \vec{a}$ and $\large \vec{b}$ is θ.
$\large \cos(\theta) = \frac{PR}{|\vec{a}|}$ (the most basic definition of cosine).
Multiply both sides by $\large |\vec{a}|$:

$\LARGE \begin{array}{|c|} \hline PR = |\vec{a}| \cos(\theta) \\ \hline \end{array}$

The segment $\boldsymbol{PR}$ is the shadow length – the scalar projection of $\large \boldsymbol{\vec{a}}$ on $\large \boldsymbol{\vec{b}}$.

When θ > 90°, cos θ is negative, so the scalar projection becomes negative too. Think of the shadow as flipping to the opposite side.

How is the unit vector related?

The shadow’s length (PR) doesn’t depend on how long $\large \vec{b}$ is. It depends only on $\large |\vec{a}|$ and on θ.

When you compute $\large \vec{a} \cdot \hat{b}$, you are asking: how much of $\large \vec{a}$ lies along the direction of $\large \vec{b}$? This is the shadow length.

The unit vector acts like a direction filter: taking the dot product of $\large \vec{a}$ with it extracts the component of $\large \vec{a}$ along that direction.

Let’s see it using the rotation trick. We place b̂ along the x-axis:

$\Large \vec{a} = (|\vec{a}|\cos\theta, |\vec{a}|\sin\theta)$

and:

$\Large \hat{b} = (1, 0)$

Then:

$\Large \begin{aligned} \vec{a} \cdot \hat{b} = |\vec{a}|\cos\theta \cdot 1 \\ + |\vec{a}|\sin\theta \cdot 0 = |\vec{a}|\cos\theta \end{aligned}$

The scalar projection of $\large \boldsymbol{\vec{a}}$ in the direction of $\large \boldsymbol{\vec{b}}$ is:

$\LARGE \renewcommand{\arraystretch}{2} \begin{array}{|c|} \hline \begin{aligned} |\vec{a}|\cos\theta &= \vec{a} \cdot \hat{b} \\ &= \frac{\vec{a} \cdot \vec{b}}{|\vec{b}|} \end{aligned} \\ \hline \end{array}$
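To make the boxed formula concrete, here is a small Python sketch. The function name `scalar_projection` and the sample vectors are assumptions made for this example; the computation itself is just the dot product divided by the length of b.

```python
import math

def dot(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def scalar_projection(a, b):
    """Signed shadow length of a along the direction of b (b must be non-zero)."""
    return dot(a, b) / magnitude(b)

a = (3.0, 4.0)

# The length of b does not matter, only its direction does.
print(scalar_projection(a, (2.0, 0.0)))    # 3.0
print(scalar_projection(a, (10.0, 0.0)))   # 3.0

# Flipping b flips the sign: the shadow falls on the opposite side.
print(scalar_projection(a, (-2.0, 0.0)))   # -3.0

# Cross-check against |a| cos(theta), with theta measured from the x-axis.
theta = math.atan2(a[1], a[0])
via_angle = magnitude(a) * math.cos(theta)
```

The value `via_angle` agrees with the first two results up to floating-point rounding, which is the equality stated in the box.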

We apply the same rotation trick one more time, now with two general vectors: $\large \vec{a}$ and $\large \vec{b}$.

After rotation:

$\Large \vec{a} = (|\vec{a}|\cos\theta, |\vec{a}|\sin\theta)$,

$\Large \vec{b} = (|\vec{b}|, 0)$

so:

$\Large \begin{aligned} \vec{a} \cdot \vec{b} = |\vec{a}|\cos\theta \cdot |\vec{b}| \\ + |\vec{a}|\sin\theta \cdot 0 = |\vec{a}||\vec{b}|\cos\theta \end{aligned}$

The dot product of $\large \boldsymbol{\vec{a}}$ and $\large \boldsymbol{\vec{b}}$ is:

$\Large \renewcommand{\arraystretch}{2} \begin{array}{|l|} \hline \vec{a} \cdot \vec{b} = a_1 b_1 + \dots + a_n b_n \\ = \sum_{i=1}^{n} a_i b_i = |\vec{a}||\vec{b}|\cos\theta \\ \hline \end{array}$
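Because the two expressions for the dot product are equal, we can solve for θ when we only know coordinates. The following Python sketch shows one way to do that; `angle_between` is a name invented for this example, and the clamping step is a defensive choice against rounding, not part of the formula.

```python
import math

def dot(a, b):
    """Sum of component-wise products, for vectors of any dimension."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def angle_between(a, b):
    """Smallest angle between two non-zero vectors, in degrees (0 to 180)."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    # Rounding can push the ratio slightly outside [-1, 1]; clamp before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

perpendicular = angle_between((1.0, 0.0), (0.0, 2.0))        # about 90 degrees
parallel = angle_between((1.0, 2.0, 2.0), (2.0, 4.0, 4.0))   # about 0 degrees
opposite = angle_between((1.0, 0.0), (-3.0, 0.0))            # about 180 degrees

print(perpendicular, parallel, opposite)
```

The second call uses 3D vectors, which works unchanged because the component formula does not care about the number of dimensions.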

Vector projection extracts the portion of vector $\large \boldsymbol{\vec{a}}$ that points along the direction of vector $\large \boldsymbol{\vec{b}}$.

The Trail Analogy

Imagine two trails starting from the same point (the origin):

Trail A leads to a whale-watching spot.
Trail B leads along the coast in a different direction.

Here’s the question projection answers:

You’re only allowed to walk along Trail B. How far should you walk so that you end up as close as possible to the endpoint of Trail A?

You walk along B, and at some point, you stop. From where you stopped, you look toward the end of Trail A, and the line connecting you to it forms a perfect 90° angle with Trail B. That’s the key geometric fact – the closest point is always where you’d make a right-angle turn.

The spot where you stop on Trail B is the projection of A onto B. It represents “the part of A that goes in B’s direction.”

The remaining gap – from your stopping point to the actual end of Trail A – is everything about A that has nothing to do with B’s direction. This example is illustrated in Figure 5 below: the vector that starts at the origin, points along Trail B, and ends at the closest point is the vector projection of $\large \vec{a}$ onto $\large \vec{b}$.

Figure 5: **Vector projection as the closest point to a direction.**
Walking along trail B, the closest point to the endpoint of A occurs where the connecting segment forms a right angle with B. This point is the projection of A onto B. Image by Author (created using Claude).

Scalar projection answers: “How far did you walk?”

That’s just a distance, a single number.

Vector projection answers: “Where exactly are you?”

More precisely: “What is the actual movement along Trail B that gets you to that closest point?”

Now “1.5 kilometers” isn’t enough; you need to say “1.5 kilometers east along the coast.” That’s a distance plus a direction: an arrow, not just a number. The arrow starts at the origin, points along Trail B, and ends at the closest point.

The distance you walked is the scalar projection value. The magnitude of the vector projection equals the absolute value of the scalar projection.

The unit vector answers: “Which direction does Trail B go?”

It is exactly what $\large \hat{b}$ represents. It’s Trail B stripped of any length information – just the pure direction of the coast.

$\begin{aligned} &\text{vector projection} = \\ &\underbrace{(\text{how far you walk})}_{\text{scalar projection}} \times \underbrace{(\text{B direction})}_{\hat{b}} \end{aligned}$

I know the whale analogy is very specific; it was inspired by this good explanation (Michael.P, 2014).

Figure 6 below shows the same shadow diagram as in Figure 4, with PR drawn as an arrow, because the vector projection is a vector (with both length and direction), not just a number.

Figure 6: **Vector projection as a directional shadow.**
Unlike scalar projection (a length), the vector projection is an arrow along vector b. Image by Author (created using Claude).

Since the projection must lie along $\large \vec{b}$, we need two things for $\large \vec{PR}$:

Its signed length is the scalar projection: $\large |\vec{a}|\cos\theta$
Its direction is: $\large \hat{b}$ (the direction of $\large \vec{b}$)

Any vector equals its magnitude times its direction (as we saw in the Unit Vector section), so:

$\large \begin{array}{|c|} \hline \hspace{10pt} \vec{PR} = \underbrace{|\vec{a}| \cos\theta}_{\text{scalar projection}} \cdot \underbrace{\hat{b}}_{\text{direction of } \vec{b}} \hspace{20pt} \\ \hline \end{array}$

This is already the vector projection formula. We can rewrite it by substituting $\large \hat{b} = \frac{\vec{b}}{|\vec{b}|}$, and recognizing that $\large |\vec{a}||\vec{b}|\cos\theta = \vec{a} \cdot \vec{b}$.

The vector projection of $\large \boldsymbol{\vec{a}}$ in the direction of $\large \boldsymbol{\vec{b}}$ is:

$\Large \renewcommand{\arraystretch}{1.5} \begin{array}{|c|} \hline \begin{aligned} \text{proj}_{\vec{b}}(\vec{a}) &= (|\vec{a}|\cos\theta)\hat{b} \\ &= \left(\frac{\vec{a} \cdot \vec{b}}{|\vec{b}|^2}\right)\vec{b} \\ &= (\vec{a} \cdot \hat{b})\hat{b} \end{aligned} \\ \hline \end{array}$
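The middle form of the boxed formula translates directly into code. The sketch below is one possible way to write it in plain Python; `vector_projection` and the sample vectors are assumptions for illustration. It also checks the right-angle fact from the trail analogy: the leftover gap is perpendicular to b.

```python
import math

def dot(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def vector_projection(a, b):
    """Projection of a onto the direction of b (b must be non-zero)."""
    scale = dot(a, b) / dot(b, b)   # (a . b) / |b|^2
    return [scale * x for x in b]

a = [2.0, 3.0]   # end of "Trail A"
b = [4.0, 0.0]   # direction of "Trail B"

p = vector_projection(a, b)            # where you stop on Trail B
gap = [x - y for x, y in zip(a, p)]    # from the stopping point to the end of A

print(p)            # [2.0, 0.0]
print(gap)          # [0.0, 3.0]
print(dot(gap, b))  # 0.0, so the gap is perpendicular to b

# Scalar projection for comparison: its absolute value is the length of p.
scalar = dot(a, b) / magnitude(b)
print(scalar, magnitude(p))
```

Using `dot(b, b)` for the squared length avoids taking a square root only to square it again.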

A unit vector isolates a vector’s direction by stripping away its magnitude.

$\LARGE \begin{array}{|c|} \hline \mathbf{\hat{v}} = \frac{\mathbf{\vec{v}}}{|\mathbf{\vec{v}}|} \\ \hline \end{array}$

The dot product multiplies corresponding components and sums them. It is also equal to the product of the magnitudes of the two vectors multiplied by the cosine of the angle between them.

$\renewcommand{\arraystretch}{2} \begin{array}{|l|} \hline \vec{a} \cdot \vec{b} = a_1 b_1 + \dots + a_n b_n \\ = \sum_{i=1}^{n} a_i b_i = |\vec{a}||\vec{b}|\cos\theta \\ \hline \end{array}$

Scalar projection uses the dot product to measure how far one vector reaches along another’s direction – a single number, like the length of a shadow.

$\Large \begin{array}{|c|} \hline |\vec{a}|\cos\theta = \vec{a} \cdot \hat{b} = \frac{\vec{a} \cdot \vec{b}}{|\vec{b}|} \\ \hline \end{array}$

Vector projection goes one step further, returning an actual arrow along that direction: the scalar projection times the unit vector.

$\Large \renewcommand{\arraystretch}{2} \begin{array}{|l|} \hline (|\vec{a}|\cos\theta)\hat{b} = (\vec{a} \cdot \hat{b})\hat{b} \\ \hline \end{array}$

In the next part, we will use the tools we learned in this article to truly understand the dot product.
