What Are Infinitesimals – Simple Version
Introduction
When I learned calculus, the intuitive idea of an infinitesimal was used. Infinitesimals were treated as real numbers so small (say, 1/trillion to the power of a trillion) that, for all practical purposes, they can be thrown away as negligible. That way, when defining the derivative, for example, you do not run into 0/0, but when required, you can still throw infinitesimals away.
This is fine for applied mathematicians, physicists, actuaries, etc., who want calculus as a tool to use in their work. But mathematicians, while conceding it is OK to start that way, eventually need to replace hand-wavy arguments with logically sound ones. The usual way of doing that is with limits.
Instead, I will justify the idea of infinitesimals as legitimate. Not with full rigour; I leave that to specialist texts, but enough to satisfy those interested in the fundamental ideas. About 1960, mathematicians (notably Abraham Robinson) did something nifty. They created hyperreal numbers, which have real numbers plus actual infinitesimals. I have also written an advanced version that goes into more detail, including an introduction to real analysis. That would be best read after studying a calculus text. This can be read as preparation for an infinitesimal-based calculus text.
Infinitesimals are numbers x with a very strange property: if X is any positive real number, then -X < x < X, i.e. |x| < X. Normally zero is the only number with that property, but in the hyperreals there are actual numbers not equal to zero whose absolute value is less than any positive real number. Because |x| < X for every positive real X, we can legitimately neglect x. This also aligns with how many people are likely to do calculus in practice. Even though I know calculus with limits, I hardly ever use them; instead, I use infinitesimals. After reading this, you can continue doing the same, knowing it is logically sound.
I will be writing another insights article using calculus as a supplement to a US Algebra 2 and Trigonometry course. Somewhat ironic: calculus to prepare for doing calculus. Those who have followed this sequence will be well prepared to study an infinitesimal-based calculus textbook. Many are available cheaply on Amazon, but here I suggest a free one (the paper version is cheaply available on Amazon) that uses an intuitive approach to infinitesimals, Full Frontal Calculus:
https://www.bravernewmath.com/
Another good one is Calculus Made Even Easier, which is cheaply available from Amazon.
The Hyperrationals
The hyperrationals are all the sequences of rational numbers. Two hyperrationals, A and B, are equal if An = Bn except for a finite number of terms. However, a hyperrational, unless specifically referred to as a sequence, is considered a single object, what is called a Urelement. Urelements are part of formal set theory, which the reader can investigate if desired; there is a Wikipedia article on them. When two sequences are equal, they are considered the same object. Often this is expressed by saying they belong to the same equivalence class, and the equivalence class is considered a single object. But, this being a beginner's article, I do not want to delve further into set theory, so I will just use the idea of a Urelement, which is easy to grasp. A < B is defined as An < Bn except for a finite number of terms, and similarly for A > B. Note there are pathological sequences, such as 1 0 1 0 1 0 …, that are neither equal to, greater than, nor less than 1. We will require that every sequence be either =, >, or < each rational. A sequence that is not will be set equal to zero.
If F(X) is a rational function defined on the rationals, then it can easily be extended to the hyperrationals termwise: F(X) is the sequence F(Xn). In the same way, A + B is the sequence An + Bn, and A*B is the sequence An*Bn. Division will not be defined because of the divide-by-zero issue; instead, 1/X is defined as the extension 1/Xn, throwing away any terms that are 1/0. If that doesn't work, then 1/X is undefined. If X is a rational number, then the sequence Xn = X X X …, in which every term is X, is the hyperrational of the rational number X. Any hyperrational B that equals it under the definition of equality above is that same rational.
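The termwise definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a hyperrational is modelled as a function from the index n = 1, 2, 3, … to its n-th term, and the helper names (constant, add, mul, extend, first_terms) are hypothetical, not taken from any library. A program can only ever inspect finitely many terms, so the "except for a finite number of terms" part of equality is not captured here.

```python
from fractions import Fraction

# Assumption for illustration: a hyperrational is a function n -> n-th term,
# with n = 1, 2, 3, ... All helper names below are hypothetical.

def constant(q):
    """Embed the rational q as the sequence q, q, q, ..."""
    q = Fraction(q)
    return lambda n: q

def add(a, b):
    """Termwise sum: (A + B)_n = A_n + B_n."""
    return lambda n: a(n) + b(n)

def mul(a, b):
    """Termwise product: (A * B)_n = A_n * B_n."""
    return lambda n: a(n) * b(n)

def extend(f, x):
    """Extend a function f on the rationals termwise: F(X)_n = f(X_n)."""
    return lambda n: f(x(n))

def first_terms(x, count):
    """List the first `count` terms of the sequence x."""
    return [x(n) for n in range(1, count + 1)]

B = lambda n: Fraction(1, n)              # the sequence 1, 1/2, 1/3, ...
two = constant(2)

total = add(two, B)                       # terms 2 + 1/n
product = mul(B, B)                       # terms 1/n^2
squared = extend(lambda q: q * q, total)  # terms (2 + 1/n)^2

print(first_terms(total, 3))
print(first_terms(product, 3))
```

Using exact fractions rather than floating point keeps every term a genuine rational number, which matches the definition of a hyperrational as a sequence of rationals.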
We will show that the hyperrationals contain actual infinitesimals. Let X be any positive rational number. Let B be the hyperrational Bn = 1/n. Then regardless of what value X is, an N can be found such that 1/n < X for any n > N. Hence, by the definition of < in the hyperrationals, |B| < X for any positive rational number, hence B is an actual infinitesimal.
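The index N in this argument can be computed explicitly. The sketch below assumes X is given as an exact positive fraction; the function name threshold_index is hypothetical. It relies on the fact that 1/n < X is equivalent to n > 1/X, so N = floor(1/X) works.

```python
from fractions import Fraction
from math import floor

def threshold_index(x):
    """Return an N such that 1/n < x for every integer n > N (x a positive rational)."""
    x = Fraction(x)
    if x <= 0:
        raise ValueError("x must be positive")
    # 1/n < x is equivalent to n > 1/x, so N = floor(1/x) works.
    return floor(1 / x)

x = Fraction(1, 1000)
N = threshold_index(x)

# Spot-check a finite window of terms past N. This is a sanity check, not a proof.
window_ok = all(Fraction(1, n) < x for n in range(N + 1, N + 500))
print(N, window_ok)
```

However small a positive rational X is chosen, only the finitely many terms up to index N fail the inequality, which is exactly what the definition of < in the hyperrationals allows.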
Also, we have infinitesimals smaller than other infinitesimals, e.g. 1/n^2 < 1/n, except when n = 1.
Note that if a and b are infinitesimal, so are a + b and a*b. To see this, let X be any positive rational. Then |a| < X/2 and |b| < X/2, so |a + b| ≤ |a| + |b| < X. Similarly, since |b| < 1, |a*b| < |a*1| = |a| < X.
Hyperrationals also contain infinite numbers larger than any rational number. Let A be the sequence An = n. If X is any rational number, there is an N such that An > X for all n > N. Again, we have infinitely large numbers greater than other infinitely large numbers, because n^2 > n except when n = 1. Even 1 + n > n for all n.
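If a hyperrational is neither infinitesimal nor infinitely large, it is called finite or bounded. Formally, these hyperrationals X satisfy |X| < Q for some rational Q. Infinitesimals satisfy this bound as well, so they are counted as finite when finite hyperrationals are used to define the reals below.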
Also note that if a is a positive infinitesimal, then a/a = a*(1/a) = 1. So 1/a can't be infinitesimal, because then a/a would be a product of two infinitesimals and hence infinitesimal. Similarly, it can't be finite, because there would be a rational N with |1/a| < N, so |a/a| < |a|*N, and a/a would again be infinitesimal. Hence 1/a is infinitely large.
Real Numbers
Since this is a beginner's article, the reader likely has not seen precise definitions of the integers, rationals, and real numbers. One presentation is:
http://www.math.uni-konstanz.de/~krapp/research/Presentation_Contruction_of_the_real_numbers_1
The above is more advanced than the audience I had in mind for this article, and it uses technical terms a beginner probably would not know. However, I was not able to locate one at the appropriate level. A beginner would probably still be able to read it and get the general gist. I can see I will need to write an insights article at a more appropriate level.
As can be seen there are a number of ways of defining real numbers. The construction methods of finite hyperrationals and Dedekind Cuts will be used here.
A Dedekind Cut is a partition of the rational numbers into two nonempty sets A and B, such that all elements of A are less than all elements of B, and A contains no greatest element. Every real number R is defined by a Dedekind Cut. In fact, since B is all the rationals not in A, a Dedekind Cut is defined by A alone: a set A of rationals that has no largest element, and for which every rational not in A is greater than every element of A, defines a Dedekind Cut and hence a real number R.
Let X be any finite hyperrational, and let A be the set of rationals < X (if A has a greatest element, remove it). A is a Dedekind Cut. Hence X defines a real number R. If Y is infinitesimally close to X, then the set of rationals < Y, treated the same way, is also A, and hence defines the same real, R. Only if Y differs from X by a finite amount that is not infinitesimal does it define a different real, S. That is because the difference is then a finite hyperrational that is not infinitesimal, so it defines a nonzero real number Z, and R ≠ S.
This leads to a new definition of the reals. Two finite hyperrationals are regarded as equal if they are infinitesimally close. The finite hyperrationals infinitesimally close to each other are denoted by the same object. These objects are the reals.
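As a concrete illustration, here is a small Python sketch of a finite hyperrational and the cut it defines. The choice of example is mine, not the article's: the sequence comes from Newton's iteration for the square root of 2, whose rational terms decrease towards sqrt(2). The function names are hypothetical. The exact cut is "q < 0 or q^2 < 2"; the termwise test only looks at one late term, so it is a finite stand-in for "q < Xn except for a finite number of terms" and could misclassify a rational lying extremely close to sqrt(2).

```python
from fractions import Fraction

def newton_sqrt2(n):
    """n-th term (n >= 1) of a rational sequence decreasing towards sqrt(2)."""
    x = Fraction(2)
    for _ in range(n - 1):
        x = (x + 2 / x) / 2   # Newton step for x^2 = 2, stays rational
    return x

def in_cut_exact(q):
    """Membership in the Dedekind cut of sqrt(2): q < 0 or q^2 < 2."""
    q = Fraction(q)
    return q < 0 or q * q < 2

def in_cut_by_terms(q, n=8):
    """Finite stand-in for 'q < X_n for all but finitely many n': test one late term."""
    return Fraction(q) < newton_sqrt2(n)

samples = [Fraction(1), Fraction(7, 5), Fraction(141, 100),
           Fraction(142, 100), Fraction(3, 2), Fraction(2)]
for q in samples:
    print(q, in_cut_exact(q), in_cut_by_terms(q))
```

No rational number squares to 2, so this cut has no greatest element and the real number it defines is not itself rational.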
The Hyperreals
Now that we know what the reals are, we can extend the hyperrationals to the hyperreals, i.e. all the sequences of reals. The hyperrationals are a proper subset of the hyperreals. As before, the real number A is the sequence An = A A A A … Similar to the hyperrationals, if F(X) is a function defined on the reals, then it can easily be extended to the hyperreals by F(X) = F(Xn). A + B = An + Bn. A*B = An*Bn. Two hyperreals, A and B, are equal if An = Bn except for a finite number of terms. As usual, equal sequences are treated as a single object. We define A < B and A > B similarly, i.e. the inequality holds for all but a finite number of terms. We have infinitesimals and infinitely large hyperreal numbers. Again, pathological sequences are set to zero.
We want to show that if X is a finite hyperreal, then X has a real infinitesimally close to it, called the standard part of X and denoted by st(X). Let A be the set of all rationals < X (if A has a greatest element, remove it). A is a Dedekind Cut that defines a real, R, and R = st(X). Hence any finite hyperreal X can be written as R + r, where R is the real number st(X) and r is an infinitesimal. Being infinitesimal, r can legitimately be thrown away when required.
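A small worked example may help. The sequence chosen here is only an illustration: take the hyperreal X with terms (2n+1)/n. It splits into a real part and an infinitesimal part r with terms 1/n, so its standard part is 2.

```latex
X_n = \frac{2n+1}{n} = 2 + \frac{1}{n},
\qquad
X = 2 + r \quad\text{with}\quad r_n = \frac{1}{n},
\qquad
\operatorname{st}(X) = 2 .
```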
This is just an overview of a rich subject, meant to give a simplified account of infinitesimals for those interested in seeing how they are justified. I have also written an insights article at a more advanced level, which goes deeper and also gives an introduction to real analysis. This article is best read after the Calculus and Algebra 2 article. The more advanced version is best read after an infinitesimal-based calculus text like Full Frontal Calculus or Calculus Made Even Easier.
How It Is Applied
This part is taken from the more advanced article. It is given here to show how hyperreals are used in practice and how some of the arguments in infinitesimal calculus texts can be justified. It is instructive and fun to go through the infinitesimal arguments in a calculus text and see how hyperreals make the intuitive arguments sound. This can be done while studying the text, and it would certainly be a good idea to do it after reading the text.
For example, d(x^2) = (x+dx)^2 - x^2 = 2xdx + dx^2 = dx*(2x + dx). But since dx is smaller in absolute value than any positive real number, it can be neglected in (2x + dx) to give simply 2x. So d(x^2) = 2xdx, or d(x^2)/dx = 2x.
The definition of the derivative is easy: dy/dx = st((y(x+dx) - y(x))/dx), where dx is a nonzero infinitesimal.
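To see the definition at work termwise, the following Python sketch takes the particular infinitesimal dx with terms 1/n (an assumption made for illustration; any nonzero infinitesimal would do) and computes the terms of the difference quotient for y = x^2 at x = 3. The function name difference_quotient_terms is hypothetical.

```python
from fractions import Fraction

def difference_quotient_terms(y, x, count):
    """Terms of (y(x + dx) - y(x)) / dx for the infinitesimal with terms dx_n = 1/n."""
    x = Fraction(x)
    terms = []
    for n in range(1, count + 1):
        dx = Fraction(1, n)
        terms.append((y(x + dx) - y(x)) / dx)
    return terms

def square(t):
    return t * t

terms = difference_quotient_terms(square, 3, 5)
# For y = x^2 the n-th term is exactly 2x + 1/n, that is, 2x plus the infinitesimal dx.
for n, term in enumerate(terms, start=1):
    print(n, term, term - 6)
```

Each term is 6 plus a term of the infinitesimal 1/n, so the difference quotient is the hyperreal 6 + dx and its standard part is 6 = 2x, in agreement with d(x^2)/dx = 2x.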
An antiderivative of a function f(x) is simply a function F(x) such that dF/dx = f(x). The indefinite integral, ∫f(x)*dx, is defined as F(x) + C, where F(x) is an antiderivative of f(x). All antiderivatives have the form F(x) + C, where C is any constant. The indefinite integral is therefore not a single function but a family of functions, any two of which differ by a constant. If F(x) is a member of the family, so is F(x) + C for any constant C, and all members of this family are antiderivatives of f(x). This notation allows the easy derivation of the important change of variables formula: ∫f*dy = ∫f*(dy/dx)*dx. It is used often in actually calculating integrals or, to be more exact, antiderivatives.
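As a worked instance of the change of variables formula (the integrand is chosen here purely for illustration), put y = x^2, so that dy/dx = 2x, and read the formula from right to left:

```latex
\int 2x\cos(x^2)\,dx
  = \int \cos(y)\,\frac{dy}{dx}\,dx
  = \int \cos(y)\,dy
  = \sin(y) + C
  = \sin(x^2) + C,
\qquad y = x^2,\quad \frac{dy}{dx} = 2x .
```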
Application to Area
Without having any idea of what area is, from the definition of the indefinite integral, ∫1*dA = ∫dA = A + C, where A is this thing called area. Doing a change of variable, ∫dA = ∫(dA/dx)*dx. Let f(x) = dA/dx. Then ∫f(x)*dx = A(x) + C. We do not have a definition of A from this because of the arbitrary constant C. But note something interesting: (A(b) + C) - (A(a) + C) = A(b) - A(a). Now the arbitrary constant C has gone. This leads to the following unique definition of the area between a and b: if A(x) is an antiderivative of a function f(x), the area between a and b is A(b) - A(a). It is given a special name, the definite integral, denoted by ∫(a to b)f(x)dx = A(b) - A(a), where A(x) is an antiderivative of f(x).
We know that, to good approximation, if Δx is small, the area under f(x) from x to x+Δx is ΔA = f(x)*Δx. The approximation gets better as Δx gets smaller. It would be exact when Δx = 0, except for one problem: then ΔA = 0. To circumvent this, we extend ΔA to the hyperreals, so that dA = f(x)dx, and f(x)dx can be thought of as an infinitesimal area. But dx can be neglected. So we can have our cake and eat it too: dx is effectively zero, so the approximation is exact, but it isn't zero, so dA is not zero.
In this way, other things like the volume of rotation can be defined. If Δx is small, the volume of rotation of f(x) about the x-axis, ΔV, is π*f(x)^2*Δx to good approximation, with the approximation getting better as Δx gets smaller. In order to be exact, Δx would need to be zero, but then ΔV is zero. As with area, what we want is for Δx to be effectively zero, but not zero. Extending the formula to the hyperreals gives dV = π*f(x)^2*dx, so ∫dV = ∫π*f(x)^2*dx, and the volume can be calculated. The same goes for surface area.
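The claim that the strip approximation improves as Δx shrinks can be checked numerically. The sketch below is an illustration under my own choice of example: f(x) = x^2 on the interval from 0 to 1, with antiderivative A(x) = x^3/3. The function name left_riemann_sum is hypothetical. It adds up the strip areas f(x)*Δx for finite Δx and compares the total with A(b) - A(a).

```python
from fractions import Fraction

def left_riemann_sum(f, a, b, strips):
    """Sum of f(x) * width over `strips` equal strips of [a, b], using left endpoints."""
    a, b = Fraction(a), Fraction(b)
    width = (b - a) / strips
    return sum(f(a + i * width) * width for i in range(strips))

def f(x):
    return x * x

def A(x):
    """An antiderivative of f (any other choice differs by a constant C)."""
    return Fraction(x) ** 3 / 3

exact = A(1) - A(0)   # the arbitrary constant C would cancel here

for strips in (10, 100, 1000):
    approx = left_riemann_sum(f, 0, 1, strips)
    print(strips, float(exact - approx))
```

The gap between the finite sum and A(b) - A(a) shrinks as the strips get narrower, which is the finite shadow of the statement that the approximation becomes exact for infinitesimal dx.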
One would expect that the emphasis on semantics over syntax favored the classical model with infinitesimals instead of the rather syntactic epsilontic. Infinitesimals were common, and epsilontic remains an obstacle to this day. Even I have to make sure each time I use it that the order of quantifiers is correct: ##\forall\;\exists\;\forall## is not very intuitive.
Sorry for the confusion.
This all grew out of studies in logical model theory (see the section on Ultraproducts):
https://en.wikipedia.org/wiki/Model_theory
Thanks
Bill
I think this is a serious misunderstanding. My link to the YouTube techno song titled "Hyper, hyper" was the joke, not your article. The article is fine. Maybe I should stop making fun of everything.
I would be interested in the transition process. Infinitesimals were so common that all physicists and mathematicians used them as we use ordinary numbers today. However, modern textbooks switched to epsilontic. Did it come before, with, or because of Bourbaki? Or was it parallel to the rise of topology? What triggered the transition?
Hi Fresh
I can see why it looks like a joke. The idea is to use the concept of infinitesimals; the reader can make it less of a joke.
I am a bit concerned about having two articles, a simple version and an advanced version. Also, the advanced version has a link to how natural numbers, integers, rationals and reals are constructed. It is a bit advanced for the audience I had in mind, so I am working on an article at a more basic level. You may be interested in that.
It also goes into a bit of the history of why these more formal approaches were devised, and ZFC set theory (or the ZFCA version I use with Urelements). As you would know the axiom of infinity is basically Von Neumann's construction of the naturals.
You may find it interesting. For me, it may allow the combining of the more advanced article and simplified version by separating out the more advanced material.
Thanks
Bill
Here is Noether's article from 1918:
https://de.wikisource.org/wiki/Invariante_Variationsprobleme (German transcription)
https://arxiv.org/pdf/physics/0503066.pdf (English transcription)
and I'm sorry that I didn't find the facsimile on the server of the University of Göttingen right now. Anyway, it shows – and the facsimile shows it even more – that the entire Lie theory was developed along the concept of infinitesimals.
Btw., I didn't know that there were also hyperrationals (TIL).