The Geometry Behind the Dot Product: Unit Vectors, Projections, and Intuition



This article is the first of three parts. Each part stands on its own, so you don’t need to read the others to understand it.

The dot product is one of the most important operations in machine learning – but it’s hard to understand without the right geometric foundations. In this first part, we build those foundations:

· Unit vectors

· Scalar projection

· Vector projection

Whether you are learning linear algebra for the first time or want to refresh these concepts, this article is written for you.

We also introduce and define the dot product itself in this article.

The vector projection section is included as an optional bonus: helpful, but not necessary for understanding the dot product.

The next part explores the dot product in greater depth: its geometric meaning, its relationship to cosine similarity, and why the difference matters.

The final part connects these ideas to two major applications: recommendation systems and NLP.


Unit Vectors

A vector $\vec{v}$ is called a unit vector if its magnitude is 1:

$$|\vec{v}| = 1$$

To remove the magnitude of a non-zero vector while keeping its direction, we can normalize it. Normalization scales the vector by the factor:

$$\frac{1}{|\vec{v}|}$$

The normalized vector $\hat{v}$ is the unit vector in the direction of $\vec{v}$:

$$\boxed{\hat{v} = \frac{\vec{v}}{|\vec{v}|}}$$

Notation 1. From now on, whenever we normalize a vector $\vec{v}$, or write $\hat{v}$, we assume that $\vec{v} \neq 0$. This notation, along with the ones that follow, also applies to the following articles.

This operation naturally separates a vector into its magnitude and its direction:

$$\boxed{\vec{v} = \underbrace{|\vec{v}|}_{\text{magnitude}} \cdot \underbrace{\hat{v}}_{\text{direction}}}$$

Figure 1 illustrates this idea: $\vec{v}$ and $\hat{v}$ point in the same direction, but have different magnitudes.

Figure 1: Separating “How Much” from “Which Way”. Any vector can be written as the product of its magnitude and its unit vector, which preserves direction but has length 1. Image by Author (created using Claude).
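As a rough illustration of the decomposition above, here is a minimal Python sketch using only the standard library. The helper names `magnitude` and `normalize` and the example vector `[3.0, 4.0]` are assumptions made for this illustration, not part of the article.

```python
import math


def magnitude(v):
    """Euclidean length of a vector given as a list of numbers."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Return the unit vector that points in the same direction as v."""
    length = magnitude(v)
    if length == 0:
        # Matches Notation 1: the zero vector has no direction.
        raise ValueError("cannot normalize the zero vector")
    return [x / length for x in v]


v = [3.0, 4.0]                          # example vector (illustrative choice)
length = magnitude(v)                   # "how much"
v_hat = normalize(v)                    # "which way"
rebuilt = [length * x for x in v_hat]   # magnitude times direction

print(length, v_hat, rebuilt)
```

Up to floating-point rounding, `rebuilt` matches `v`, and `v_hat` has length 1.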

Similarity of unit vectors

In two dimensions, all unit vectors lie on the unit circle (radius 1, centered at the origin). A unit vector that forms an angle θ with the x-axis has coordinates (cos θ, sin θ).

This means the angle between two unit vectors encodes a natural similarity score. As we will show shortly, this score is exactly cos θ: it equals 1 when the vectors point the same way, 0 when they are perpendicular, and −1 when they point in opposite directions.

Notation 2. Throughout this article, θ denotes the smallest angle between the two vectors, so $0^\circ \leq \theta \leq 180^\circ$.

In practice, we don’t know θ directly – we know the vectors’ coordinates.

We can show why the dot product of two unit vectors, $\hat{a}$ and $\hat{b}$, equals cos θ using a geometric argument in three steps:

1. Rotate the coordinate system until $\hat{b}$ lies along the x-axis. Rotation doesn’t change angles or magnitudes.

2. Read off the new coordinates. After rotation, $\hat{b}$ has coordinates (1, 0). Since $\hat{a}$ is a unit vector at angle θ from the x-axis, the unit circle definition gives its coordinates as (cos θ, sin θ).

3. Multiply corresponding components and sum:

$$\hat{a} \cdot \hat{b} = a_x \cdot b_x + a_y \cdot b_y = \cos\theta \cdot 1 + \sin\theta \cdot 0 = \cos\theta$$

This sum of component-wise products is called the dot product:

$$\boxed{\vec{a} \cdot \vec{b} = a_1 \cdot b_1 + a_2 \cdot b_2 + \cdots + a_n \cdot b_n}$$

See the illustration of these three steps in Figure 2 below:

Figure 2: By rotating our perspective to align with the x-axis, the coordinate math simplifies beautifully to reveal why the two unit vectors’ dot product is equal to cos(θ). Image by Author (created using Claude).
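The claim that the dot product of two unit vectors equals cos θ can be checked numerically. The sketch below is only an illustration: the helpers `unit_vector` and `dot` and the chosen angles are assumptions, not something the article defines.

```python
import math


def unit_vector(angle_deg):
    """Unit vector on the unit circle at the given angle from the x-axis."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


a_hat = unit_vector(75.0)
b_hat = unit_vector(30.0)
theta = math.radians(75.0 - 30.0)    # angle between the two vectors

similarity = dot(a_hat, b_hat)       # computed from coordinates only
expected = math.cos(theta)           # computed from the angle only

# The three landmark cases mentioned in the text
same_way = dot(unit_vector(20.0), unit_vector(20.0))
perpendicular = dot(unit_vector(20.0), unit_vector(110.0))
opposite = dot(unit_vector(20.0), unit_vector(200.0))

print(similarity, expected, same_way, perpendicular, opposite)
```

The two numbers `similarity` and `expected` agree up to floating-point rounding, even though one uses only coordinates and the other only the angle.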

Everything above was shown in 2D, but the same result holds in any number of dimensions. Any two vectors, no matter how many dimensions they live in, always lie in a single flat plane through the origin. We can rotate that plane to align with the xy-plane, and from there the 2D proof applies exactly.

Notation 3. In the diagrams that follow, we often draw one of the vectors (typically $\vec{b}$) along the horizontal axis. When $\vec{b}$ is not already aligned with the x-axis, we can always rotate our coordinate system as we did above (the “rotation trick”). Since rotation preserves all lengths, angles, and dot products, every formula derived in this orientation holds for any direction of $\vec{b}$.
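To make the rotation trick concrete, here is a small 2D sketch. It rotates the vectors themselves rather than the coordinate system, which is equivalent for this purpose. The `rotate` helper and the example vectors are assumptions chosen for illustration.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def rotate(v, angle_deg):
    """Rotate a 2D vector counterclockwise by the given angle."""
    t = math.radians(angle_deg)
    x, y = v
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


a = (2.0, 1.0)
b = (1.0, 3.0)

# Rotate both vectors by minus b's angle, so that b lands on the x-axis.
angle_of_b = math.degrees(math.atan2(b[1], b[0]))
a_rot = rotate(a, -angle_of_b)
b_rot = rotate(b, -angle_of_b)

before = dot(a, b)
after = dot(a_rot, b_rot)

print(b_rot, before, after)
```

After the rotation, `b_rot` is approximately (|b|, 0), and the dot product is unchanged up to rounding.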


Scalar Projection

A vector can contribute in many directions at once, but often we care about only one direction.

Scalar projection answers the question: How much of $\vec{a}$ lies along the direction of $\vec{b}$?

This value is negative if the projection points in the direction opposite to $\vec{b}$.

The Shadow Analogy

The most intuitive way to think about scalar projection is as the length of a shadow. Imagine you hold a stick (vector $\vec{a}$) at an angle above the ground (the direction of $\vec{b}$), and a light source shines straight down from above.

The shadow that the stick casts on the ground is the scalar projection.

The animated figure below illustrates this idea:

Figure 3: Scalar projection as a shadow. The scalar projection measures how much of vector a lies in the direction of b. It equals the length of the shadow that a casts onto b (Woo, 2023). The GIF was created by Claude.

Calculation

Imagine a light source shining straight down onto the line PS (the direction of $\vec{b}$). The “shadow” that $\vec{a}$ (the arrow from P to Q) casts onto that line is exactly the segment PR. You can see this in Figure 4.

Figure 4: Measuring Directional Alignment. The scalar projection (segment PR) visually answers the core question: “How much of vector a lies in the exact direction of vector b?” Image by Author (created using Claude).

Deriving the formula

Now look at the triangle $PQR$: the perpendicular dropped from $Q$ creates a right triangle, and its sides are:

  •  $PQ = |\vec{a}|$ (the hypotenuse).
  •  $PR$ (the adjacent side – the shadow).
  •  $QR$ (the opposite side – the perpendicular component).

From this triangle:

  1. The angle between $\vec{a}$ and $\vec{b}$ is θ.
  2. $\cos(\theta) = \frac{PR}{|\vec{a}|}$ (the most basic definition of cosine).
  3. Multiply both sides by $|\vec{a}|$:

$$\boxed{PR = |\vec{a}| \cos(\theta)}$$

The segment $PR$ is the shadow length – the scalar projection of $\vec{a}$ on $\vec{b}$.

When θ > 90°, cos θ is negative, so the scalar projection becomes negative too. Think of the shadow as flipping to the opposite side.
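A quick way to get a feel for the formula PR = |a| cos(θ) is to evaluate it at a few angles. The function name `shadow_length` and the stick length of 2 are assumptions made for this sketch.

```python
import math


def shadow_length(a_magnitude, theta_deg):
    """Scalar projection |a| * cos(theta) for a given length and angle."""
    return a_magnitude * math.cos(math.radians(theta_deg))


# A stick of length 2 held at several angles to the ground direction
for theta in (0, 60, 90, 120, 180):
    print(theta, round(shadow_length(2.0, theta), 6))
```

The shadow shrinks from the full length at 0° to zero at 90° (up to floating-point rounding), and becomes negative beyond 90°, reaching minus the full length at 180°.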

How is the unit vector related?

The shadow’s length (PR) doesn’t depend on how long $\vec{b}$ is. It depends only on $|\vec{a}|$ and on θ.

When you compute $\vec{a} \cdot \hat{b}$, you are asking: how much of $\vec{a}$ lies along the direction of $\vec{b}$? This is the shadow length.

The unit vector acts like a direction filter: taking the dot product of $\vec{a}$ with it extracts the component of $\vec{a}$ along that direction.

Let’s see it using the rotation trick. We place $\hat{b}$ along the x-axis:

$$\vec{a} = (|\vec{a}|\cos\theta,\ |\vec{a}|\sin\theta)$$

and:

$$\hat{b} = (1, 0)$$

Then:

$$\vec{a} \cdot \hat{b} = |\vec{a}|\cos\theta \cdot 1 + |\vec{a}|\sin\theta \cdot 0 = |\vec{a}|\cos\theta$$

The scalar projection of $\vec{a}$ in the direction of $\vec{b}$ is:

$$\boxed{|\vec{a}|\cos\theta = \vec{a} \cdot \hat{b} = \frac{\vec{a} \cdot \vec{b}}{|\vec{b}|}}$$
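The boxed formula can be sketched in a few lines of Python. The function name `scalar_projection` and the example vectors are assumptions for illustration. The example also shows that the result does not change with the length of b, only with its direction.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def scalar_projection(a, b):
    """Signed shadow length of a on the direction of b: (a . b) / |b|."""
    b_len = magnitude(b)
    if b_len == 0:
        raise ValueError("cannot project onto the zero vector")
    return dot(a, b) / b_len


a = [3.0, 4.0]
short_b = [1.0, 0.0]       # same direction, different lengths
long_b = [10.0, 0.0]
opposite_b = [-2.0, 0.0]   # opposite direction

print(scalar_projection(a, short_b),
      scalar_projection(a, long_b),
      scalar_projection(a, opposite_b))
```

Both `short_b` and `long_b` give the same shadow length, while `opposite_b` gives the same value with a negative sign.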


The Dot Product

We apply the same rotation trick one more time, now with two general vectors: $\vec{a}$ and $\vec{b}$.

After rotation:

$$\vec{a} = (|\vec{a}|\cos\theta,\ |\vec{a}|\sin\theta), \qquad \vec{b} = (|\vec{b}|,\ 0)$$

so:

$$\vec{a} \cdot \vec{b} = |\vec{a}|\cos\theta \cdot |\vec{b}| + |\vec{a}|\sin\theta \cdot 0 = |\vec{a}||\vec{b}|\cos\theta$$

The dot product of $\vec{a}$ and $\vec{b}$ is:

$$\boxed{\vec{a} \cdot \vec{b} = a_1 b_1 + \dots + a_n b_n = \sum_{i=1}^{n} a_i b_i = |\vec{a}||\vec{b}|\cos\theta}$$
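Because the component form and the geometric form are equal, we can recover θ from coordinates alone by solving for cos θ. The sketch below assumes a hypothetical helper `angle_between` and clamps the cosine to [−1, 1] to guard against floating-point rounding; neither detail comes from the article.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def angle_between(a, b):
    """Angle in degrees between two non-zero vectors of equal dimension."""
    denom = magnitude(a) * magnitude(b)
    if denom == 0:
        raise ValueError("the angle is undefined for a zero vector")
    cos_theta = dot(a, b) / denom
    # Rounding can push the value slightly outside [-1, 1]; clamp it.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Works in any number of dimensions; these examples are 3D.
print(angle_between([1.0, 0.0, 0.0], [1.0, 1.0, 0.0]))
print(angle_between([1.0, 0.0, 0.0], [0.0, 0.0, 5.0]))
```

The returned angle always falls between 0° and 180°, in line with Notation 2.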


Vector Projection

Vector projection extracts the portion of vector $\vec{a}$ that points along the direction of vector $\vec{b}$.

The Trail Analogy

Imagine two trails starting from the same point (the origin):

  • Trail A leads to a whale-watching spot.
  • Trail B leads along the coast in a different direction.

Here’s the question projection answers:

You’re only allowed to walk along Trail B. How far should you walk so that you end up as close as possible to the endpoint of Trail A?

You walk along B, and at some point, you stop. From where you stopped, you look toward the end of Trail A, and the line connecting you to it forms a perfect 90° angle with Trail B. That’s the key geometric fact – the closest point is always where you’d make a right-angle turn.

The spot where you stop on Trail B is the projection of A onto B. It represents “the part of A that goes in B’s direction.”

The remaining gap, from your stopping point to the actual end of Trail A, is everything about A that has nothing to do with B’s direction. Figure 5 below illustrates this example: the vector that starts at the origin, points along Trail B, and ends at the closest point is the vector projection of $\vec{a}$ onto $\vec{b}$.

Figure 5: Vector projection as the closest point to a direction. Walking along trail B, the closest point to the endpoint of A occurs where the connecting segment forms a right angle with B. This point is the projection of A onto B. Image by Author (created using Claude).
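The closest-point idea can be checked by brute force. In the sketch below, the trail coordinates, the 0.1 km step, and the helper `stop_at` are all made-up assumptions for illustration. It requires Python 3.8 or later for `math.dist`.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


end_of_a = (5.0, 0.0)   # endpoint of Trail A (made-up coordinates, in km)
b_hat = (0.6, 0.8)      # unit direction of Trail B (made-up, length 1)


def stop_at(t):
    """Point reached after walking t km from the origin along Trail B."""
    return (t * b_hat[0], t * b_hat[1])


# Brute force: try stopping every 0.1 km and keep the closest stop.
candidates = [i / 10 for i in range(0, 61)]
best_t = min(candidates, key=lambda t: math.dist(stop_at(t), end_of_a))

# Dot product with the unit direction gives the walking distance directly.
walk = dot(end_of_a, b_hat)

# The remaining gap from the stopping point to the end of Trail A.
stop = stop_at(walk)
gap = (end_of_a[0] - stop[0], end_of_a[1] - stop[1])

print(best_t, walk, dot(gap, b_hat))
```

The brute-force search and the dot product agree on how far to walk, and the gap is perpendicular to Trail B (its dot product with `b_hat` is zero up to rounding).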

Scalar projection answers: “How far did you walk?”

That’s just a distance, a single number.

Vector projection answers: “Where exactly are you?”

More precisely: “What is the actual movement along Trail B that gets you to that closest point?”

Now “1.5 kilometers” isn’t enough; you need to say “1.5 kilometers east along the coast.” That’s a distance plus a direction: an arrow, not just a number. The arrow starts at the origin, points along Trail B, and ends at the closest point.

The distance you walked is the scalar projection value. The magnitude of the vector projection equals the absolute value of the scalar projection.

The unit vector answers: “Which direction does Trail B go?”

This is exactly what $\hat{b}$ represents. It’s Trail B stripped of any length information: just the pure direction of the coast.

$$\text{vector projection} = \underbrace{(\text{how far you walk})}_{\text{scalar projection}} \times \underbrace{(\text{B direction})}_{\hat{b}}$$

I know the whale analogy is very specific; it was inspired by this good explanation (Michael.P, 2014).

Figure 6 below shows the same shadow diagram as in Figure 4, with PR drawn as an arrow, because the vector projection is a vector (with both length and direction), not just a number.

Figure 6: Vector projection as a directional shadow. Unlike scalar projection (a length), the vector projection is an arrow along vector b. Image by Author (created using Claude).

Since the projection must lie along $\vec{b}$, we need two things for $\vec{PR}$:

  1. Its magnitude is the scalar projection: $|\vec{a}|\cos\theta$
  2. Its direction is: $\hat{b}$ (the direction of $\vec{b}$)

Any vector equals its magnitude times its direction (as we saw in the Unit Vector section), so:

$$\boxed{\vec{PR} = \underbrace{|\vec{a}|\cos\theta}_{\text{scalar projection}} \cdot \underbrace{\hat{b}}_{\text{direction of } \vec{b}}}$$

This is already the vector projection formula. We can rewrite it by substituting $\hat{b} = \frac{\vec{b}}{|\vec{b}|}$ and recognizing that $|\vec{a}||\vec{b}|\cos\theta = \vec{a} \cdot \vec{b}$.

The vector projection of $\vec{a}$ in the direction of $\vec{b}$ is:

$$\boxed{\text{proj}_{\vec{b}}(\vec{a}) = (|\vec{a}|\cos\theta)\hat{b} = \left(\frac{\vec{a} \cdot \vec{b}}{|\vec{b}|^2}\right)\vec{b} = (\vec{a} \cdot \hat{b})\hat{b}}$$
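The equivalent forms of the projection formula can be compared numerically. This is a minimal sketch, with `vector_projection` and the example vectors chosen as assumptions for illustration.

```python
import math


def dot(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def vector_projection(a, b):
    """proj_b(a) = ((a . b) / |b|^2) * b, for non-zero b."""
    b_squared = dot(b, b)   # |b|^2
    if b_squared == 0:
        raise ValueError("cannot project onto the zero vector")
    scale = dot(a, b) / b_squared
    return [scale * x for x in b]


a = [5.0, 0.0]
b = [3.0, 4.0]

proj = vector_projection(a, b)

# Same result through the unit vector: (a . b_hat) * b_hat
b_len = math.sqrt(dot(b, b))
b_hat = [x / b_len for x in b]
proj_via_unit = [dot(a, b_hat) * x for x in b_hat]

# What is left of a after removing its part along b
residual = [x - p for x, p in zip(a, proj)]

print(proj, proj_via_unit, dot(residual, b))
```

Both forms give the same arrow up to rounding, and the leftover part of a is perpendicular to b, which is the right angle from the trail analogy.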


Summary

  • A unit vector isolates a vector’s direction by stripping away its magnitude.

$$\boxed{\hat{v} = \frac{\vec{v}}{|\vec{v}|}}$$

  • The dot product multiplies corresponding components and sums them. It is also equal to the product of the magnitudes of the two vectors multiplied by the cosine of the angle between them.

$$\boxed{\vec{a} \cdot \vec{b} = a_1 b_1 + \dots + a_n b_n = \sum_{i=1}^{n} a_i b_i = |\vec{a}||\vec{b}|\cos\theta}$$

  • Scalar projection uses the dot product to measure how far one vector reaches along another’s direction: a single number, like the length of a shadow.

$$\boxed{|\vec{a}|\cos\theta = \vec{a} \cdot \hat{b} = \frac{\vec{a} \cdot \vec{b}}{|\vec{b}|}}$$

  • Vector projection goes one step further, returning an actual arrow along that direction: the scalar projection times the unit vector.

$$\boxed{(|\vec{a}|\cos\theta)\hat{b} = (\vec{a} \cdot \hat{b})\hat{b}}$$

In the next part, we will use the tools we learned in this article to truly understand the dot product.
