From: 3blue1brown

Taylor series are considered one of the most powerful mathematical tools for approximating functions [00:00:25]. They frequently appear across various fields, including mathematics, physics, and engineering [00:00:20]. The primary purpose of Taylor series is to find polynomials that approximate non-polynomial functions near a specific input [00:01:41]. Polynomials are generally easier to work with than other functions, being simpler to compute, differentiate, and integrate [00:01:48].

Motivation through a Physics Problem

The utility of function approximation becomes clear in problems where complex functions make calculations unwieldy [00:00:57]. For instance, when studying the potential energy of a pendulum, the height of the pendulum’s weight above its lowest point is proportional to 1 - cos(θ), where θ is the angle between the pendulum and the vertical [00:00:35]. The cos(θ) term can complicate the problem and obscure relationships with other oscillating phenomena [00:01:02].

By approximating cos(θ) as 1 - θ²/2, the problem significantly simplifies [00:01:07]. Graphing cos(θ) alongside 1 - θ²/2 shows they are very close for small angles near zero [00:01:23]. The question then becomes how to systematically find such a polynomial approximation [00:01:33].
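
As a rough numerical check of the claim above, the following sketch compares cos(θ) with 1 - θ²/2 at a few sample angles. The function name `small_angle_cos` and the sample angles are illustrative choices, not part of the original material.

```python
import math

def small_angle_cos(theta):
    """Quadratic approximation 1 - theta^2/2 discussed in the text (hypothetical helper name)."""
    return 1 - theta**2 / 2

# Sample angles in radians, chosen only for illustration.
for theta in (0.05, 0.1, 0.5, 1.0):
    exact = math.cos(theta)
    approx = small_angle_cos(theta)
    print(f"theta={theta:<5} cos={exact:.6f} approx={approx:.6f} error={abs(exact - approx):.2e}")
```

The error should be tiny for the smallest angles and grow noticeably as θ approaches 1 radian, which is the sense in which the approximation holds "near zero".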

Constructing a Quadratic Approximation for cos(x) near x=0

To construct a polynomial approximation, such as c₀ + c₁x + c₂x², that resembles cos(x) near x=0, we match the function’s value and its derivatives at that point [00:02:10].

  1. Matching the Value (c₀):

    • At x=0, cos(x) is 1 [00:02:33].
    • For the polynomial, plugging in x=0 yields c₀ [00:02:45].
    • Therefore, c₀ must be 1 to ensure the approximation equals 1 at x=0 [00:02:48].
    • c₀ is responsible for matching the output of the approximation with cos(x) at x=0 [00:05:49].
  2. Matching the First Derivative (c₁):

    • The derivative of cos(x) is -sin(x), which is 0 at x=0, indicating a flat tangent line [00:03:18].
    • The derivative of the quadratic c₀ + c₁x + c₂x² is c₁ + 2c₂x [00:03:26].
    • At x=0, this derivative is c₁ [00:03:35].
    • Setting c₁ to 0 ensures the approximation has the same flat tangent line at x=0 [00:03:47].
    • c₁ is in charge of making sure the derivatives match at x=0 [00:05:59].
  3. Matching the Second Derivative (c₂):

    • cos(x) curves downward around x=0, indicating a negative second derivative [00:04:04].
    • The second derivative of cos(x) (-cos(x)) is -1 at x=0 [00:04:21].
    • The second derivative of the polynomial c₀ + c₁x + c₂x² is 2c₂ [00:04:54].
    • Setting 2c₂ = -1 means c₂ should be -1/2 [00:05:04].
    • This ensures the polynomial’s slope changes at the same rate as cos(x)’s [00:04:41].
    • c₂ is responsible for making sure the second derivatives match up [00:06:05].

This process yields the quadratic approximation 1 + 0x - ½x², or 1 - ½x² [00:05:16]. For example, cos(0.1) is estimated as 0.995, which is very close to the true value of about 0.995004 [00:05:27]. The constants c₀, c₁, and c₂ control the approximation’s value, slope, and curvature, respectively [00:05:42].
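
The three matching steps can be sketched in a few lines of code. This is only an illustration under the assumption that the derivatives of cos(x) at 0 are supplied in closed form; the variable names `f0`, `f1`, `f2` and the function `quadratic` are hypothetical.

```python
import math

# Value, first derivative and second derivative of cos(x) at x = 0.
f0 = math.cos(0.0)    # cos(0)  = 1
f1 = -math.sin(0.0)   # -sin(0) = 0
f2 = -math.cos(0.0)   # -cos(0) = -1

# For P(x) = c0 + c1*x + c2*x^2 we have P(0) = c0, P'(0) = c1, P''(0) = 2*c2,
# so matching value, slope and second derivative gives:
c0 = f0
c1 = f1
c2 = f2 / 2

def quadratic(x):
    """Quadratic approximation of cos(x) near x = 0 built from the matched coefficients."""
    return c0 + c1 * x + c2 * x**2

print(quadratic(0.1), math.cos(0.1))
```

The printed pair should agree to several decimal places, consistent with the cos(0.1) estimate quoted above.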

Extending to Higher Order Terms

To improve the approximation, one can add more terms to the polynomial and match higher-order derivatives [00:06:24].

  • Third-order term (c₃x³): The third derivative of cos(x) is sin(x), which is 0 at x=0 [00:06:56]. The third derivative of c₃x³ is 1 * 2 * 3 * c₃ [00:06:45]. Thus, c₃ must be 0 [00:07:03].
  • Fourth-order term (c₄x⁴): The fourth derivative of cos(x) is cos(x) itself, which is 1 at x=0 [00:07:27]. The fourth derivative of c₄x⁴ is 1 * 2 * 3 * 4 * c₄, or 24c₄ [00:07:45]. So, c₄ must be 1/24 [00:07:51].

The resulting fourth-order polynomial 1 - ½x² + (1/24)x⁴ approximates cos(x) very closely around x=0 [00:07:59].
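
To see what the extra term buys, here is a small sketch comparing the quadratic and fourth-order polynomials against cos(x). The function names and sample inputs are illustrative assumptions.

```python
import math

def cos_quadratic(x):
    """Second-order approximation 1 - x^2/2."""
    return 1 - x**2 / 2

def cos_quartic(x):
    """Fourth-order approximation 1 - x^2/2 + x^4/24."""
    return 1 - x**2 / 2 + x**4 / 24

# Sample inputs in radians, chosen only for illustration.
for x in (0.1, 0.5, 1.0, 2.0):
    exact = math.cos(x)
    err2 = abs(cos_quadratic(x) - exact)
    err4 = abs(cos_quartic(x) - exact)
    print(f"x={x:<4} quadratic error={err2:.2e} quartic error={err4:.2e}")
```

At each sample point the fourth-order error should be smaller than the quadratic one, though both degrade as x moves away from 0.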

Key Observations:

  • Factorials: When taking n successive derivatives of xⁿ, the result is n! (n factorial) [00:08:30]. Therefore, the coefficient of each xⁿ term is the nth derivative of the function divided by n! to cancel out this effect [00:08:49].
  • Independence of Coefficients: Adding new higher-order terms does not change the values of previously determined lower-order coefficients [00:09:12]. This is because, when a lower-order derivative of the polynomial is evaluated at x=0, every higher-order term still carries a factor of x and so will “wash away” [00:09:30].
  • Approximation Around a Point ‘a’: If approximating near an input a other than 0, the polynomial should be written in terms of powers of (x-a) [00:09:52]. All derivatives of the function would then be evaluated at a [00:13:07].
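
The first two observations can be checked mechanically. The sketch below represents a polynomial as a list of coefficients (index k holds the coefficient of x^k); this representation and the helper names `differentiate` and `nth_derivative_at_zero` are assumptions made for illustration.

```python
import math

def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def nth_derivative_at_zero(coeffs, n):
    """Evaluate the n-th derivative of the polynomial at x = 0."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    # At x = 0 only the constant term survives.
    return coeffs[0] if coeffs else 0

# Factorials: the n-th derivative of x^n is n!.
for n in range(6):
    monomial = [0] * n + [1]
    assert nth_derivative_at_zero(monomial, n) == math.factorial(n)

# Independence: adding the x^4 term leaves the lower derivatives at 0 unchanged.
quadratic = [1, 0, -1 / 2]
quartic = [1, 0, -1 / 2, 0, 1 / 24]
for k in range(3):
    assert nth_derivative_at_zero(quadratic, k) == nth_derivative_at_zero(quartic, k)
```

If both loops finish without an assertion error, the factorial pattern and the independence of the lower-order coefficients hold for these examples.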

General Form of Taylor Polynomials

Taylor polynomials translate derivative information at a single point into approximation information around that point [00:10:27].

For a function f(x) approximated near x=0 (the resulting polynomial is also known as a Maclaurin polynomial), the coefficient of each xⁿ term is the value of the nth derivative of the function evaluated at 0, divided by n! [00:12:09]. This ensures:

  • The constant term matches the function’s value [00:12:34].
  • The x term matches the function’s slope [00:12:39].
  • The x² term matches how the slope changes [00:12:43].
  • And so on for higher terms [00:12:48].

The more terms chosen, the closer the approximation, but the polynomial becomes more complicated [00:12:54].

In full generality, for an approximation near an input a, Taylor polynomials are written in terms of powers of (x-a), and all derivatives of f are evaluated at a [00:13:02]. Changing a shifts where the approximation “hugs” the original function [00:13:24].
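
Written out as a formula, the description above corresponds to the standard degree-n Taylor polynomial. The symbol P_n is introduced here only for convenience and does not appear in the original material.

```latex
P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x-a)^k
       = f(a) + f'(a)\,(x-a) + \frac{f''(a)}{2!}\,(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}\,(x-a)^n
```

Setting a = 0 and f = cos with n = 4 recovers the earlier polynomial 1 - ½x² + (1/24)x⁴, since the first and third derivatives of cos vanish at 0.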

Example: Taylor Polynomials for e^x near x=0

The function e^x provides a simple example [00:13:35].

  • All derivatives of e^x are e^x [00:13:42].
  • At x=0, all derivatives evaluate to 1 [00:13:54].
  • Therefore, the Taylor polynomial approximation for e^x near x=0 looks like: 1 + 1x + (1/2!)x² + (1/3!)x³ + ... [00:14:05].
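
A minimal sketch of evaluating these polynomials numerically follows. The function name `exp_taylor` and the choice to build each term from the previous one are illustrative assumptions, not something prescribed by the source.

```python
import math

def exp_taylor(x, n_terms):
    """Sum the first n_terms terms of 1 + x + x^2/2! + x^3/3! + ..."""
    total = 0.0
    term = 1.0  # x^0 / 0!
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)  # turn x^k/k! into x^(k+1)/(k+1)!
    return total

# Partial sums at x = 1 should approach e as more terms are included.
for n_terms in (1, 2, 3, 5, 10, 15):
    print(n_terms, exp_taylor(1.0, n_terms), math.e)
```

Updating the term incrementally avoids recomputing powers and factorials from scratch, which is one reasonable design choice among several.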

Geometric Understanding of Taylor Polynomials (Second Order Term)

A geometric interpretation of the second-order term can be derived from the Fundamental Theorem of Calculus [00:14:41]. Consider a function f(x) that represents the area under some graph from a fixed left point to a variable right point x [00:14:47]. The graph itself represents the derivative of this area function [00:15:10].

To approximate the change in this area function f(x) from a to x:

  • The first-order term corresponds to the area of a rectangle with height f'(a) (the value of the graph at a) and width (x-a) [00:16:49]. This matches f'(a)(x-a).
  • The second-order term approximates the “triangular” portion of the area above this rectangle [00:15:39]. The base of this “triangle” is (x-a), and its height is the change in the graph’s value over that interval, approximately f''(a)(x-a) (slope of the graph times the base) [00:15:58].
  • The area of this approximate triangle is ½ * base * height = ½ * (x-a) * f''(a)(x-a) = ½ * f''(a) * (x-a)² [00:16:18]. This exactly matches the second-order term in a Taylor polynomial [00:16:30].
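
The rectangle-plus-triangle picture can be tried on a concrete case. The sketch below assumes a made-up graph g(t) = t² + 1, whose area function from 0 is x³/3 + x; the graph, the helper names, and the sample points a = 1 and x = 1.2 are all hypothetical choices for illustration.

```python
def graph(t):
    """The curve whose area is being measured; plays the role of f'."""
    return t**2 + 1

def graph_slope(t):
    """Slope of the curve; plays the role of f''."""
    return 2 * t

def area(x):
    """Exact area under graph from 0 to x; plays the role of f."""
    return x**3 / 3 + x

def area_change_estimates(a, x):
    """Return (exact change in area, rectangle term, triangle term)."""
    width = x - a
    rectangle = graph(a) * width                       # f'(a)(x-a)
    triangle = 0.5 * width * (graph_slope(a) * width)  # (1/2) f''(a)(x-a)^2
    exact = area(x) - area(a)
    return exact, rectangle, triangle

exact, rectangle, triangle = area_change_estimates(1.0, 1.2)
print(f"exact={exact:.6f} rectangle={rectangle:.6f} rectangle+triangle={rectangle + triangle:.6f}")
```

In this example the rectangle alone underestimates the change in area, and adding the triangle brings the estimate noticeably closer.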

This geometric interpretation clearly shows how each term in a Taylor polynomial accounts for different aspects of the function’s behavior around the point of approximation [00:17:02].

From Taylor Polynomials to Taylor Series

While Taylor polynomials use a finite number of terms, an infinite sum of terms is called a Taylor series [00:17:31]. An infinite sum does not literally mean adding infinitely many things; it means asking whether the sum of more and more terms approaches a specific value [00:17:48].

Series Convergence and Divergence

  • Convergence: If adding more terms gets increasingly close to a specific value, the series is said to converge to that value [00:17:57]. In such cases, the infinite series is considered equal to the value it converges to [00:18:10].
    • For example, plugging x=1 into the Taylor polynomial for e^x and adding more terms causes the sum to converge towards e [00:18:27]. Similarly, for any x, the Taylor series for e^x converges to e^x [00:18:47]. This is also true for sin(x) and cos(x) [00:19:28].
  • Divergence: Sometimes, the series only converges within a specific range around the input where the derivative information was gathered [00:19:32]. Outside this range, the series might fail to approach anything, with the sum bouncing wildly as more terms are added [00:20:06]. In this case, the series diverges [00:20:36].
    • For instance, the Taylor series for ln(x) around x=1 converges for x values between 0 and 2 (and also at x=2 itself), but diverges outside this range [00:19:41].
  • Radius of Convergence: The maximum distance from the approximation point at which the series still converges is called the radius of convergence [00:20:44].
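
The contrast between convergence and divergence can be observed numerically with the ln(x) example. The sketch below uses the standard series ln(x) = (x-1) - (x-1)²/2 + (x-1)³/3 - ... around x=1; the function name `ln_taylor` and the sample inputs 1.5 and 3.0 are illustrative assumptions.

```python
import math

def ln_taylor(x, n_terms):
    """Partial sum of the Taylor series of ln(x) around x = 1."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n_terms + 1))

# Inside the radius of convergence (|x - 1| < 1) the partial sums settle down.
for n_terms in (5, 10, 50):
    print("x=1.5", n_terms, ln_taylor(1.5, n_terms), math.log(1.5))

# Outside it, the partial sums swing with growing magnitude instead.
for n_terms in (5, 10, 50):
    print("x=3.0", n_terms, ln_taylor(3.0, n_terms), math.log(3.0))
```

For x = 1.5 the partial sums should approach ln(1.5), while for x = 3.0 they grow in magnitude and alternate in sign rather than approaching ln(3).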

The fundamental intuition behind Taylor series is that they translate derivative information at a single point into approximation information around that point [00:21:28].