From: 3blue1brown

The fundamental question of “What are vectors?” serves as a starting point for understanding their true nature in linear algebra [00:00:22]. Is a two-dimensional vector fundamentally an arrow on a flat plane that we happen to describe with coordinates, or is it fundamentally a pair of real numbers that we merely visualize as an arrow [00:00:24]?

Defining vectors as lists of numbers offers a clear and unambiguous approach, making concepts like four-dimensional vectors or 100-dimensional vectors seem concrete [00:00:42]. However, those working with linear algebra often feel they are dealing with a space independent of the coordinates given to it [00:01:05]. Coordinates are seen as somewhat arbitrary, depending on the chosen basis vectors [00:01:16]. Core linear algebra topics like determinants and eigenvectors are indifferent to the choice of coordinate systems [00:01:24].

Functions as Vectors

The concept of a “vector” extends beyond arrows or lists of numbers to include functions, which possess “vector-ish qualities” [00:02:06]. Functions can be added together (e.g., f + g) [00:02:19], where the output of the sum function (f + g)(x) is the sum of the individual function outputs f(x) + g(x) [00:02:45]. This resembles adding vectors coordinate by coordinate, albeit with infinitely many coordinates [00:03:00]. Similarly, functions can be scaled by a real number by scaling all their outputs [00:03:11].
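To make these pointwise definitions concrete, here is a minimal Python sketch (not from the video; the helper names `add` and `scale` are purely illustrative) in which adding and scaling act output by output:

```python
# Minimal sketch (not from the video): function addition and scaling
# defined pointwise, i.e. output by output.

def add(f, g):
    """(f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(c, f):
    """(c * f)(x) = c * f(x)."""
    return lambda x: c * f(x)

f = lambda x: x**2
g = lambda x: 3*x + 5

print(add(f, g)(2))    # 4 + 11 = 15
print(scale(2, f)(3))  # 2 * 9 = 18
```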

Given that vectors primarily support addition and scaling, the tools and techniques of linear algebra can be applied to functions [00:03:28]. This includes the notion of a linear transformation for functions, which takes one function and turns it into another [00:03:46].

The Derivative as a Linear Transformation

A familiar example of a linear transformation (often called an operator in this context) is the derivative from calculus [00:03:59]. The derivative transforms one function into another [00:04:03].

A transformation is considered linear if it satisfies two properties, additivity and scaling [00:04:39]:

  • Additivity: Applying a transformation to the sum of two vectors v and w yields the same result as adding the transformed versions of v and w (i.e., T(v + w) = T(v) + T(w)) [00:04:46].
  • Scaling: Scaling a vector v by a number, then applying the transformation, is equivalent to scaling the transformed v by the same amount (i.e., T(c * v) = c * T(v)) [00:05:04].

These properties mean linear transformations preserve the operations of vector addition and scalar multiplication [00:05:21]. A key consequence is that a linear transformation is completely described by where it takes the basis vectors [00:05:44].
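As a quick numerical sketch of that consequence (the particular images of the basis vectors below are made up for illustration), knowing where a 2D transformation sends î and ĵ pins down where it sends every other vector:

```python
import numpy as np

# Made-up images of the basis vectors under some linear transformation T.
T_i_hat = np.array([1.0, 2.0])   # where T sends i-hat = (1, 0)
T_j_hat = np.array([3.0, 0.0])   # where T sends j-hat = (0, 1)

# The matrix of T has those images as its columns.
T = np.column_stack([T_i_hat, T_j_hat])

v = np.array([4.0, 5.0])         # v = 4*i-hat + 5*j-hat

# Linearity forces T(v) = 4*T(i-hat) + 5*T(j-hat), which is exactly T @ v.
assert np.allclose(T @ v, 4 * T_i_hat + 5 * T_j_hat)
print(T @ v)                     # [19.  8.]
```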

Calculus students implicitly use the additivity and scaling properties of the derivative (see the quick symbolic check after this list):

  • The derivative of a sum of functions is the sum of their derivatives [00:06:28].
  • The derivative of a scaled function is the scaled derivative of the function [00:06:40].
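
A minimal symbolic sketch of these two facts using sympy (the example functions x^3 and sin(x) and the scalar 5 are chosen arbitrarily):

```python
import sympy as sp

x = sp.symbols('x')
f = x**3        # example functions, chosen only for illustration
g = sp.sin(x)
c = 5

# Additivity: d/dx (f + g) equals df/dx + dg/dx
assert sp.simplify(sp.diff(f + g, x) - (sp.diff(f, x) + sp.diff(g, x))) == 0

# Scaling: d/dx (c * f) equals c * df/dx
assert sp.simplify(sp.diff(c * f, x) - c * sp.diff(f, x)) == 0
```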

Representing the Derivative with a Matrix

Even for functions, the derivative can be described with a matrix, though function spaces tend to be infinite-dimensional [00:06:56]. Considering only polynomials, a basis can be chosen using powers of x (e.g., 1, x, x^2, x^3, and so on) [00:07:28]. Because this set of basis functions is infinite, each polynomial is described by infinitely many coordinates, though only finitely many of them are nonzero [00:08:05]. For example, x^2 + 3x + 5 would have coordinates (5, 3, 1, 0, 0, ...) [00:08:15].

In this coordinate system, the derivative is represented by an infinite matrix that is mostly zeros, with the positive integers 1, 2, 3, ... running down the diagonal just above the main diagonal [00:09:06]. This matrix linearly transforms the coordinates of a polynomial into the coordinates of its derivative [00:09:24].
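A small truncated sketch of that matrix (capping the degree at 4 just to keep it finite), applied to the coordinates (5, 3, 1, 0, 0) of x^2 + 3x + 5:

```python
import numpy as np

# Truncated version of the (in principle infinite) derivative matrix, acting
# on polynomials of degree < 5 written in the basis 1, x, x^2, x^3, x^4.
n = 5
D = np.zeros((n, n))
for k in range(1, n):
    D[k - 1, k] = k            # d/dx x^k = k * x^(k-1)

p = np.array([5, 3, 1, 0, 0])  # coordinates of 5 + 3x + x^2

print(D @ p)                   # [3. 2. 0. 0. 0.]  ->  3 + 2x, the derivative
```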

This demonstrates that matrix-vector multiplication and taking a derivative are fundamentally related as members of the same mathematical family [00:11:03]. Many concepts from linear algebra, such as the dot product or eigenvectors, have direct analogs in the world of functions (e.g., inner product or eigenfunction) [00:11:14].
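For example (a standard illustration, not spelled out in this summary): e^(2x) is an eigenfunction of the derivative operator, since differentiating it merely scales it by 2. A one-line sympy check:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.exp(2 * x)   # candidate eigenfunction of d/dx

# d/dx e^(2x) = 2 * e^(2x): the derivative returns the same function scaled
# by 2, so e^(2x) is an eigenfunction with eigenvalue 2.
assert sp.simplify(sp.diff(f, x) - 2 * f) == 0
```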

What is a Vector Space?

There are numerous “vector-ish things” in mathematics [00:11:31]. As long as a set of objects allows for reasonable notions of scaling and adding, the tools of linear algebra (regarding vectors, linear transformations, etc.) can be applied [00:11:35]. These sets of “vector-ish things” – like arrows, lists of numbers, or functions – are formally called vector spaces [00:12:13].

Axioms of Vector Spaces

To ensure broad applicability, mathematicians establish a list of rules, called axioms, that vector addition and scaling must follow [00:12:29]. In modern linear algebra, there are eight such axioms that any vector space must satisfy for the theory to apply [00:12:36]. These axioms essentially provide a checklist to ensure that the defined operations behave as expected [00:12:51].
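For reference, the eight axioms (phrased here in standard textbook form for vectors u, v, w and scalars a, b, rather than quoted from the video) are:

  • Associativity of addition: u + (v + w) = (u + v) + w
  • Commutativity of addition: v + w = w + v
  • Additive identity: there is a zero vector 0 with 0 + v = v for all v
  • Additive inverses: for every v there is a −v with v + (−v) = 0
  • Compatibility of scaling: a(bv) = (ab)v
  • Scalar identity: 1v = v
  • Distributivity over vector addition: a(v + w) = av + aw
  • Distributivity over scalar addition: (a + b)v = av + bv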

The axioms serve as an interface between the mathematician developing the theory and others who might apply those results to new types of vector spaces [00:12:58]. By proving results in terms of these axioms, mathematicians don’t need to consider every conceivable vector space; instead, anyone whose definitions satisfy the axioms can confidently apply the results [00:13:34]. This leads to abstract phrasing in textbooks, defining linear transformations in terms of additivity and scaling rather than more intuitive, but specific, geometric interpretations [00:14:01].

Ultimately, the modern mathematical view “ignores the question” of what vectors fundamentally are [00:14:22]. Their specific form (arrows, lists of numbers, functions) doesn’t matter, as long as addition and scaling obey the axioms [00:14:27]. Like the number 3 being an abstraction for all possible triplets of things, vectors are an abstraction for various embodiments, unified by the single, intangible notion of a vector space [00:14:41]. While starting with concrete, visualizable settings like 2D space is helpful for intuition, understanding the general applicability of linear algebra tools requires grasping this abstract definition [00:15:08].