An introduction to Galois theory
This is a short introduction to Galois theory. The level of this article is necessarily quite high compared to some NRICH articles, because Galois theory is a very difficult topic usually only introduced in the final year of an undergraduate mathematics degree. This article only skims the surface of Galois theory and should probably be accessible to a 17 or 18 year old school student with a strong interest in mathematics. There is a short and very vague overview of two important applications of Galois theory in the introduction below. If you want to know more about Galois theory, the rest of the article is more in depth, but also harder.
1 Introduction
1.1 Motivation
1.2 History
1.3 Overview
1.4 Notation
Throughout this article, I'll use the following notation. The set of integers will be written $Z$, so writing $n\in Z$ means that $n$ is in $Z$, the set of integers, i.e. $n$ is an integer. The set of rational numbers is $Q$, the set of real numbers is $R$ and the set of complex numbers is $C$.
1.5 Advice on reading this article
2 Groups and Fields
2.1 Groups
Definition (Group): A group $G$ is a collection of objects with an operation $\cdot$ satisfying the following rules (axioms): (1) For any two elements $x$ and $y$ in the group $G$ we also have $x\cdot y$ in the group $G$.
(2) There is an element (usually written $1$ or $e$, but sometimes $0$) called the identity in $G$ such that for any $x$ in the group $G$ we have $1\cdot x=x=x\cdot 1$.
(3) For any elements $x$, $y$, $z$ in $G$ we have $(x\cdot y)\cdot z=x\cdot(y\cdot z)$ (so it doesn't matter which pair we combine first). This property is called associativity; it means we can write $x\cdot y\cdot z$ unambiguously (otherwise it would not be clear what we meant by $x\cdot y\cdot z$: would it be $x\cdot(y\cdot z)$ or $(x\cdot y)\cdot z$?).
(4) Every element $x$ in $G$ has a unique inverse $y$ (sometimes written $-x$ or $x^{-1}$) so that $x\cdot y=y\cdot x=1$.
For example, the integers $Z$ are a group with the operation of addition (we write this group $(Z,+)$ or sometimes, lazily, just $Z$). We can check the four axioms: (1) If $n$, $m$ are integers then $n+m$ is an integer, so we're OK here. (2) $n+0=n=0+n$ so $0$ is the identity for the integers. (3) $(n+m)+p=n+m+p=n+(m+p)$ so $+$ is associative. (4) $n+(-n)=0=(-n)+n$ so we have inverses.
However, the integers are not a group with multiplication, because the identity on the integers with multiplication is $1$, and there is no integer $n$ with $2n=1$.
Definition (Cyclic Group): Important finite groups are things like $C_p$, which is the cyclic group of order $p$. This is the set of elements $1$, $x$, $x^2$, $\ldots$, $x^{p-1}$ with the operation $x^n\cdot x^m=x^{n+m}$ and also the relation that $x^p=1$. So, for example, in $C_5$ we have that $x^2\cdot x^4=x^6=x^5\cdot x=x$. You can tell this is a group because the inverse of $x^n$ is $x^{p-n}$.
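To make the rule $x^p=1$ concrete, here is a minimal sketch that stores the element $x^n$ of $C_p$ as the exponent $n$ reduced modulo $p$. The function names `multiply` and `inverse` are my own choices for illustration, not standard notation.

```python
# Model C_p by exponents: the element x^n is stored as the integer n mod p.

def multiply(n, m, p):
    """Exponent of x^n . x^m in C_p, using the relation x^p = 1."""
    return (n + m) % p


def inverse(n, p):
    """Exponent of the inverse of x^n, namely x^(p-n)."""
    return (p - n) % p


p = 5
print(multiply(2, 4, p))  # x^2 . x^4 = x^6 = x, so this prints 1

# Every element combined with its inverse gives the identity x^0 = 1.
print(all(multiply(n, inverse(n, p), p) == 0 for n in range(p)))  # True
```

Working with exponents turns the group operation into addition modulo $p$, which is why $C_p$ is often described as "clock arithmetic".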
Definition (Symmetric Group): Another important example of a finite group is $S_n$, the symmetric group on $n$ elements. Suppose we rearrange the numbers $1$, $2$, $\ldots$, $n$. For example, we could rearrange $1$, $2$, $3$ to $2$, $3$, $1$. In other words, we take $1$ to $2$, $2$ to $3$ and $3$ to $1$. The collection of all of these rearrangements forms a group. The operation is: do the second one, then the first. So, if we write $\sigma$ for the rearrangement $1$, $2$, $3$ goes to $2$, $3$, $1$ and $\tau$ for the rearrangement $1$, $2$, $3$ goes to $3$, $2$, $1$ then the rearrangement $\sigma\cdot\tau$ does the following: it rearranges $1$, $2$, $3$ to $3$, $2$, $1$ (that's $\tau$) then it rearranges this to $1$, $3$, $2$ (because $\sigma$ takes $3$ to $1$, $2$ to $3$ and $1$ to $2$). So the group $S_n$ is the collection of rearrangements of $1$, $2$, $3$, $\ldots$, $n$. Another way of thinking about it, for those who are happy with the ideas of sets and functions, is to define the symmetric group on a set $X$ to be $S_X=\{f:X\rightarrow X | f \textrm{ is invertible}\}$ with the operation that for the functions $f$, $g\in S_X$ we have the function $f\cdot g$ defined to be $(f\cdot g)(x)=f(g(x))$. The symmetric group above, $S_n=S_{\{1,2,\ldots,n\}}$, is the symmetric group on a set with $n$ elements.
At this point, you may want to check you've followed so far. See if you can prove that $S_n$ is a group and that it has $n!$ elements. If you're happy with the idea of sets and functions then you can prove that $S_X$ is a group even if $X$ is an infinite set.
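If you would like to experiment before attempting a proof, the following sketch reproduces the $\sigma\cdot\tau$ example and counts the elements of $S_n$ for small $n$. Storing a rearrangement as a tuple and the helper names `compose` and `symmetric_group` are assumptions made for this illustration; a computer check for a few values of $n$ is of course not a proof.

```python
from itertools import permutations
from math import factorial


def compose(f, g):
    """Return f.g, meaning 'do g first, then f'.

    A rearrangement is stored as a tuple p, where p[i - 1] is the number
    that i is sent to.
    """
    return tuple(f[g[i] - 1] for i in range(len(g)))


def symmetric_group(n):
    """All rearrangements of 1, 2, ..., n."""
    return list(permutations(range(1, n + 1)))


sigma = (2, 3, 1)  # 1 -> 2, 2 -> 3, 3 -> 1
tau = (3, 2, 1)    # 1 -> 3, 2 -> 2, 3 -> 1
print(compose(sigma, tau))  # (1, 3, 2)

# Compare the number of rearrangements with n! for small n.
for n in range(1, 6):
    print(n, len(symmetric_group(n)), factorial(n))

# Closure: combining any two elements of S_3 gives another element of S_3.
s3 = symmetric_group(3)
print(all(compose(a, b) in s3 for a in s3 for b in s3))  # True
```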
2.2 Fields
Definition (Field): A field $F$ is a bit like a group, but we have two operations, usually written $\cdot$ and $+$. $F$ is a field if $F$ has elements $0$ and $1$ such that $F$ with the operation $+$ is a group (i.e. $(F,+)$ is a group), the set $F$ without the element $0$ is a group with the operation $\cdot$ (i.e. $(F\setminus\{0\},\cdot)$ is a group) and we have relations like $(x+y)\cdot z=x\cdot z+y\cdot z$ (we say that $\cdot$ is distributive over $+$), $0\cdot x=0=x\cdot 0$, $x\cdot y=y\cdot x$ and $x+y=y+x$ (which isn't always true for a group) and so on. The definition of a field above is quite abstract. All it means is that a field is a set in which you can add, subtract and multiply any elements, and you can divide by any element other than $0$.
A good example of a field is the real numbers or the rational numbers. (Check the axioms.)
A less obvious example of a field (the important example for Galois theory) is $Q[\sqrt{2}]$. This is the set of all numbers which can be written $a+b\sqrt{2}$ for $a$ and $b$ rational numbers. It is not immediately obvious that this is a field, because we do not know, for example, if $1/(a+b\sqrt{2})$ can be written as $c+d\sqrt{2}$. However, you can always do this. If $x=1/(a+b\sqrt{2})$ then (multiplying the top and bottom by $a-b\sqrt{2}$): $$ x=\frac{a-b\sqrt{2}}{(a+b\sqrt{2})(a-b\sqrt{2})}$$ And $(a+b\sqrt{2})(a-b\sqrt{2})=a^2-2b^2=p$ say, where $p\neq 0$ unless $a=b=0$ because $\sqrt{2}$ is irrational. So we have that $x=a/p-(b/p)\sqrt{2}$. So $Q[\sqrt{2}]$ really is a field (the other axioms are clearly true, check them if you like).
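The inverse formula can be tried out with exact rational arithmetic. The sketch below is only an illustration: the class name `QSqrt2` and its layout are my own, and it stores $a+b\sqrt{2}$ as the pair of rationals $(a,b)$.

```python
from fractions import Fraction


class QSqrt2:
    """The number a + b*sqrt(2) with a and b rational."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # p = a^2 - 2b^2 is non-zero unless a = b = 0.
        p = self.a ** 2 - 2 * self.b ** 2
        return QSqrt2(self.a / p, -self.b / p)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"


x = QSqrt2(1, 1)                     # 1 + sqrt(2)
print(x.inverse())                   # -1 + 1*sqrt(2)
print(x * x.inverse() == QSqrt2(1))  # True
```

Using `Fraction` rather than floating-point numbers keeps every coefficient exactly rational, which matches the definition of $Q[\sqrt{2}]$.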
Definition (Algebraic Number): More generally, if $\alpha$ is a real number with the property that $p(\alpha)=0$ for some non-zero polynomial $p(x)$ with rational coefficients, then we say that $\alpha$ is an algebraic number.
If $\alpha$ is an algebraic number then $Q[\alpha]$ is a field. We can think of $Q[\alpha]$ in two ways. Firstly, as the set of elements $a_0+a_1\alpha+\ldots+a_{n-1}\alpha^{n-1}$ where each $a_i$ is a rational number and $n$ is the smallest integer such that there is a non-zero polynomial $p(x)$ of degree $n$, with rational coefficients, with $p(\alpha)=0$. The second way is that $Q[\alpha]$ is the smallest field extension of $Q$ containing $\alpha$; this is explained in the next section. You can try to prove that $Q[\alpha]$ is a field if you like, but you need to know a theorem called the Remainder Theorem.
This gives us lots of examples of fields. For example, $Q[\sqrt[3]{2}]=\{a+b\sqrt[3]{2}+c\sqrt[3]{2}^2: a, b, c\in Q\}$ is a field.
You can extend this idea to define, for $\alpha$, $\beta$ both algebraic, $Q[\alpha, \beta]$ to be the set of all expressions like $2\alpha\beta$, $\alpha+\alpha^2\beta$, and so on.
To test yourself, you might like to see if you can show that $Q[\alpha,\beta]=Q[\alpha][\beta]$ (the right hand side makes sense because $Q[\alpha][\beta]=K[\beta]$ where $K=Q[\alpha]$ which is a field). This shows that $Q[\alpha,\beta]$ is a field.
This gives us even more examples of fields, for example $Q[\sqrt{2},\sqrt{3}]=\{a+b\sqrt{2}+c\sqrt{3}+d\sqrt{6}:a,b,c,d\in Q\}$.
2.3 Field extensions
Definition (Field Extension): A field extension of a field $F$ is a field $K$ containing $F$ (we write a field extension as $F\subseteq K$ or $K/F$). For example, the real numbers are a field extension of the rational numbers, because the reals are a field and every rational is also a real number.
The example above, $Q[\sqrt{2}]$, is a field extension of $Q$ since if $a\in Q$ then $a+0\sqrt{2}\in Q[\sqrt{2}]$, so $Q\subseteq Q[\sqrt{2}]$. More generally we have that $Q[\alpha]$ is a field extension of $Q$ for $\alpha$ an algebraic number.
2.4 Splitting Fields
Definition (Splitting Field): Given a polynomial $p(x)$ we have what is called the splitting field of $p(x)$ which is the smallest field extension of $Q$ that contains all the roots of $p(x)$. So, if $p(x)=x^2-2$ then the splitting field of $p(x)$ is $Q[\sqrt{2}]$ (it contains all the roots of $p(x)$ and if it had fewer elements it either wouldn't contain all the roots or wouldn't be a field).
Another example is that the splitting field of $p(x)=x^4-5x^2+6$ is $Q[\sqrt{2},\sqrt{3}]$. Can you see why?
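One way to see it, given here as a sketch rather than a full proof, is to factorise the polynomial: $$x^4-5x^2+6=(x^2-2)(x^2-3)=(x-\sqrt{2})(x+\sqrt{2})(x-\sqrt{3})(x+\sqrt{3})$$ So the roots are $\pm\sqrt{2}$ and $\pm\sqrt{3}$, all of which lie in $Q[\sqrt{2},\sqrt{3}]$. Conversely, any field containing all four roots contains $\sqrt{2}$ and $\sqrt{3}$, and so contains every number of the form $a+b\sqrt{2}+c\sqrt{3}+d\sqrt{6}$ with $a,b,c,d\in Q$.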
3 Automorphisms and Galois Groups
3.1 Automorphisms
Definition (Field Automorphism): A field automorphism $f$ has to be an invertible function (which the $f$ above clearly is) such that $f(x+y)=f(x)+f(y)$, $f(a x)=f(a)f(x)$ and $f(1/x)=1/f(x)$.
You can check that the function $f$ above really does satisfy all the conditions.
The idea of a field automorphism is that it is just a way of relabelling the elements of the field without changing the structure at all. In other words, we can replace the symbol $\sqrt{2}$ with the symbol $-\sqrt{2}$, do all our calculations and then change the symbol $-\sqrt{2}$ back to $\sqrt{2}$ and we get the right answer. Field automorphisms are the right way of expressing this idea, because conditions such as $f(x+y)=f(x)+f(y)$ say precisely that $f$ preserves addition, multiplication and so on.
Definition (F-Automorphism): More specifically, if we have a field extension $K$ of a field $F$, then an $F$-automorphism of $K$ is an automorphism $f$ of $K$ with the additional property that $f(x)=x$ for all $x$ in $F$.
This is the precise way of defining the symmetry of the roots that I talked about above, because the $F$-automorphism leaves all elements of $F$ unchanged and only relabels the new elements we added to form $K$. It turns out that for $Q[\sqrt{2}]$ the function $f$ I defined above is the only $Q$-automorphism other than the obvious $g(x)=x$.
If $p(x)$ is any polynomial (with rational coefficients, as always), $K/Q$ is a field extension, and $f$ is a $Q$-automorphism of $K$ then $f(p(x))=p(f(x))$ for every $x$ in $K$. See if you can prove this.
The reason this is useful is that it shows that a $Q$-automorphism of a splitting field $K$ of a polynomial $p(x)$ rearranges the roots of $p(x)$. If $p(\alpha)=0$ then $p(f(\alpha))=f(p(\alpha))=f(0)=0$, so $f(\alpha)$ is then a root of $p(x)$.
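As a sanity check (not a proof), the identity $f(p(x))=p(f(x))$ can be tested in $Q[\sqrt{2}]$ with the automorphism $f(a+b\sqrt{2})=a-b\sqrt{2}$. In this sketch an element $a+b\sqrt{2}$ is stored as the pair $(a,b)$; the helper names and the sample polynomial `q` are arbitrary choices of mine.

```python
from fractions import Fraction


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def f(x):
    """The Q-automorphism a + b*sqrt(2) -> a - b*sqrt(2)."""
    return (x[0], -x[1])


def evaluate(coeffs, x):
    """Evaluate a polynomial with rational coefficients at x (Horner's rule).

    coeffs lists the coefficients from the highest power down.
    """
    result = (Fraction(0), Fraction(0))
    for c in coeffs:
        result = add(mul(result, x), (Fraction(c), Fraction(0)))
    return result


zero = (Fraction(0), Fraction(0))
p = [1, 0, -2]                   # p(x) = x^2 - 2
q = [3, Fraction(1, 2), 0, -7]   # an arbitrary sample polynomial
x = (Fraction(2, 3), Fraction(-5, 4))

# f(q(x)) and q(f(x)) agree for this sample element.
print(f(evaluate(q, x)) == evaluate(q, f(x)))  # True

# f sends the root sqrt(2) of p to -sqrt(2), which is again a root of p.
root = (Fraction(0), Fraction(1))
print(evaluate(p, root) == zero, evaluate(p, f(root)) == zero)  # True True
```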
In fact, we can go further than this and show that knowing how a $Q$-automorphism of a splitting field rearranges the roots of $p(x)$ is enough to tell us precisely what that $Q$-automorphism does to every element of the splitting field. However, not every rearrangement of the roots of $p(x)$ comes from a $Q$-automorphism. For example, if $p(x)=x^4-5x^2+6$ (which, as noted above, has splitting field $K=Q[\sqrt{2},\sqrt{3}]$), which has roots $\pm \sqrt{2}$ and $\pm \sqrt{3}$, then there is no $Q$-automorphism $f$ of $K$ with $f(\sqrt{2})=\sqrt{3}$. Suppose there were. Then $f(\sqrt{2})^2 =f(\sqrt{2}^2)=f(2)=2$ because $f$ preserves multiplicative structure and $f(x)=x$ for rational $x$. But if $f(\sqrt{2})=\sqrt{3}$ then $f(\sqrt{2})^2=\sqrt{3}^2=3$, i.e. $2=3$, which is clearly nonsense.
So now we can see why a $Q$-automorphism of a splitting field gives us exactly the right idea of a symmetry of the roots which doesn't matter (i.e. doesn't change the structure at all).
So for the polynomial $p(x)=x^2-2$ we have the following:
(a) The splitting field of $p(x)$ is $Q[\sqrt{2}]$.
(b) The $Q$-automorphisms of the splitting field of $p(x)$, which we can think of as the symmetries of the roots, are $f(a+b\sqrt{2})=a-b\sqrt{2}$ and $g(x)=x$.
At this point, you may want to see if you can find the splitting field and the $Q$-automorphisms of $p(x)=x^2-5$ (two $Q$-automorphisms), and if you know about complex numbers, you could try $x^4-1$ (also two $Q$-automorphisms).
3.2 The Galois Group
Definition (Galois Group): Now, if we have a field $F$ which is a field extension of $Q$ then we have a collection $G$ of $Q$-automorphisms of $F$. This collection $G$ is a group (with the operation defined by: if $f$ and $g$ are in $G$, i.e. they are $Q$-automorphisms of $F$, then $f\cdot g$ is a $Q$-automorphism defined by $(f\cdot g)(x)=f(g(x))$ - check that this really is a group). It is called the Galois group of the field extension $F$ over $Q$, usually written $\mathrm{Gal}(F/Q)$. If $F$ is the splitting field of a polynomial $p(x)$ then $G$ is called the Galois group of the polynomial $p(x)$, usually written $\mathrm{Gal}(p)$.
So, taking the polynomial $p(x)=x^2-2$, we have $G=\mathrm{Gal}(p)=\{f,g\}$ where $f(a+b\sqrt{2})=a-b\sqrt{2}$ and $g(x)=x$. Here, $g$ is the identity element of the group, and we have that $f\cdot f=g$, because $(f\cdot f)(a+b\sqrt{2})=f(f(a+b\sqrt{2}))=f(a-b \sqrt{2})=a+b\sqrt{2}=g(a+b\sqrt{2})$. So, the group $G$ is the same as $C_2$, the cyclic group of order 2, or $S_2$, the symmetric group of order 2, because we have a single non-identity element $f$ with $f^2=f\cdot f=1$, the identity of the group.
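The whole multiplication table of this two-element group can be written out by machine. The sketch below stores $a+b\sqrt{2}$ as the pair $(a,b)$ and compares functions on a small grid of sample elements, which is enough to tell $f$ and $g$ apart here; the helper names `compose`, `same` and `table` are my own.

```python
from fractions import Fraction
from itertools import product


def f(x):
    """a + b*sqrt(2) -> a - b*sqrt(2)."""
    a, b = x
    return (a, -b)


def g(x):
    """The identity automorphism."""
    return x


def compose(s, t):
    """Return s.t, meaning 'apply t first, then s'."""
    return lambda x: s(t(x))


# Sample elements a + b*sqrt(2) with small integer a and b.
samples = [(Fraction(a), Fraction(b))
           for a, b in product(range(-2, 3), repeat=2)]


def same(s, t):
    """True if s and t agree on every sample element."""
    return all(s(x) == t(x) for x in samples)


group = {"f": f, "g": g}
table = {}
for (name_s, s), (name_t, t) in product(group.items(), repeat=2):
    table[(name_s, name_t)] = next(
        name for name, u in group.items() if same(compose(s, t), u))

for (name_s, name_t), result in table.items():
    print(f"{name_s}.{name_t} = {result}")
# f.f = g
# f.g = f
# g.f = f
# g.g = g
```

The table has the same shape as that of $C_2$, with $g$ playing the role of $1$ and $f$ the role of $x$.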
As an exercise, you might like to find the Galois group of $p(x)=a x^2+bx+c$. [Hint: there are two cases to consider, $b^2-4a c =r^2$ for some rational $r$ or $b^2-4a c\neq r^2$ for any rational $r$.]
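To decide which of the hint's two cases a particular quadratic $ax^2+bx+c$ falls into, you can test whether $b^2-4ac$ is the square of a rational number. This sketch (the function names are hypothetical) uses the fact that a fraction in lowest terms is a rational square exactly when its numerator and denominator are both perfect squares; it deliberately leaves the Galois group itself for you to work out.

```python
from fractions import Fraction
from math import isqrt


def discriminant(a, b, c):
    """b^2 - 4ac as an exact rational number."""
    return Fraction(b) ** 2 - 4 * Fraction(a) * Fraction(c)


def is_rational_square(q):
    """True if the rational number q equals r^2 for some rational r."""
    q = Fraction(q)  # Fraction always stores q in lowest terms
    if q < 0:
        return False
    n, d = q.numerator, q.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d


print(is_rational_square(discriminant(1, -3, 2)))  # x^2 - 3x + 2: True
print(is_rational_square(discriminant(1, 0, -2)))  # x^2 - 2: False
```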
If you know a bit about complex numbers (specifically, roots of unity) and you're quite adventurous, you might like to try and show that for $p(x)=x^q-1$ with $q$ a prime number, $\mathrm{Gal}(p)=C_{q-1}$, the cyclic group of order $q-1$.
If you know about subgroups, you can use the fact that the $Q$-automorphisms of a splitting field rearrange the roots (and that the rearrangement of the roots alone tells us what the $Q$-automorphism is) to show that $\mathrm{Gal}(p)\leq S_n$ where $n$ is the degree of $p(x)$. In particular, all polynomials have finite Galois group.
4 Solubility by Radicals
Definition (Cyclotomic Field Extension): First, you define a cyclotomic field extension to be a field extension of $F$ where you take an element $x$ in $F$ and add an $n^{\textrm{th}}$ root of it. So, $Q[\sqrt{2}]$ is a cyclotomic field extension of $Q$.
Definition (Radical Field Extension): Second, you define a radical field extension $K$ of a field $F$ to be a field extension which you can get to only using cyclotomic field extensions. So, $Q[\sqrt{1+\sqrt{2}}]$ is a radical field extension because you can start with $Q$, add $\sqrt{2}$ to form $Q[\sqrt{2}]$. Now, $1+\sqrt{2}$ is in $Q[\sqrt{2}]$, so taking the square root of this you get $Q[\sqrt{1+\sqrt{2}}]$. If the polynomial $p(x)$ is soluble by radicals, then the splitting field $F$ of $p(x)$ is a radical field extension of $Q$ (can you see why?).
Third, you prove that the Galois group of any radical field extension is soluble. This is the hardest part by a long, long way. In fact, I'm not even going to attempt to explain what a soluble group is here, because it would take too long.
Fourth, you prove that the group $S_5$ (the symmetric group on $5$ elements) is not soluble. If you know a bit of group theory, this isn't very difficult.
Fifth, you find a polynomial $p(x)$ whose Galois group is $S_5$. The splitting field of this polynomial cannot be a radical field extension (because all radical field extensions have soluble Galois groups), so the roots of $p(x)$ cannot be built up from $+$, $-$, $\times$, $/$ and $n^{\textrm{th}}$ roots.
5 Trisecting Angles
Definition (Constructible Numbers and Constructible Field Extensions): The basic idea is to define a constructible number to be a real number that can be found using geometric constructions with an unmarked ruler and a compass. You can show that any constructible number must lie in a field extension $Q[\sqrt{\alpha_1}, \sqrt{\alpha_2}, \ldots, \sqrt{\alpha_n}]$ with each $\alpha_i\in Q[\sqrt{\alpha_1}, \ldots, \sqrt{\alpha_{i-1}}]$. We'll call a field extension that looks like this a constructible field extension. So, for example, $Q[\sqrt{2}]$ is a constructible field extension, and so is $Q[\sqrt{1+\sqrt{2}}]$, because you can write $Q[\sqrt{1+\sqrt{2}}]=Q[\sqrt{2},\sqrt{1+\sqrt{2}}]$.
It's not obvious that any constructible number must lie in a field extension of this form, but we can sort of see why because given line segments of length $x$, $y$, it is possible to construct other line segments of length $x+y$, $x y$ and $1/x$ using geometric constructions. Moreover, you can construct a line segment of length $\sqrt{x}$ using only geometric constructions. In fact, you can also show that these are the only things you can do with geometric constructions. (If you want to try, the way to prove this is to use the fact that all you can do with unmarked rulers and compasses is to find the intersection between two lines, which only gives you arithmetical operations, find the intersection between a line and a circle, which gives you square roots, and find the intersection between two circles, which again gives you square roots.) Can you see why this means that a number in a constructible field extension (as defined above) can be constructed using only an unmarked ruler and compass, and that only numbers in constructible field extensions can be made in this way?
Next, you show that if you have a cubic polynomial $p(x)=a x^3+b x^2+c x +d$ with rational coefficients and no rational roots, then none of its roots is constructible. This isn't very difficult to prove but requires some knowledge beyond what I'm assuming for this article.
Here's the clever part. Suppose you could construct a $20^{\circ}$ angle, then the number $\cos(20^{\circ})$ would be constructible (you can just drop a perpendicular from a point on a line at $20^{\circ}$ to the horizontal, distance $1$ from the origin). However, you can show that $\alpha=\cos(20^{\circ})$ is a root of the equation $8x^3-6x-1=0$ (by expanding $\cos(60^{\circ})$ in terms of $\cos(20^{\circ})$ using the addition formula). It is easy to show that this has no rational roots, and so the roots are not constructible. This means that we couldn't have constructed a $20^{\circ}$ angle, because then we would be able to construct $\cos(20^{\circ})$ which is impossible. So a $60^{\circ}$ angle cannot be trisected.
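Both claims about $8x^3-6x-1$ can be checked quickly. The addition formula gives $\cos(3\theta)=4\cos^3(\theta)-3\cos(\theta)$, and putting $\theta=20^{\circ}$ with $\cos(60^{\circ})=1/2$ gives the cubic. For the rational roots, the sketch below leans on the rational root theorem, a standard result not covered in this article: a rational root $n/d$ in lowest terms must have $n$ dividing the constant term $-1$ and $d$ dividing the leading coefficient $8$, so there are only eight candidates to try.

```python
from fractions import Fraction
from math import cos, isclose, radians


def p(x):
    return 8 * x ** 3 - 6 * x - 1


# Numerical check that cos(20 degrees) is a root (up to rounding error).
alpha = cos(radians(20))
print(isclose(p(alpha), 0, abs_tol=1e-12))  # True

# Rational root theorem: the only possible rational roots are
# +-1, +-1/2, +-1/4 and +-1/8. Exact arithmetic shows none of them works.
candidates = [Fraction(sign, d) for sign in (1, -1) for d in (1, 2, 4, 8)]
print(any(p(r) == 0 for r in candidates))  # False
```

The first check uses floating-point numbers, so it is only evidence; the second uses exact fractions, so it does settle the question of rational roots.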
You can use methods like this to prove other results about what shapes can or can't be constructed and so forth.
6 Further Reading
- http://mathworld.wolfram.com/CubicEquation.html (lots about solving polynomials of degree 3, quite hard)
- http://mathworld.wolfram.com/QuarticEquation.html (lots about solving polynomials of degree 4, quite hard)
- http://mathworld.wolfram.com/Group.html (information about group theory, quite hard but lots of links to interesting things about group theory)
- http://members.tripod.com/~dogschool/ (long introduction to group theory, seems quite good and not too difficult)
- https://mathshistory.st-andrews.ac.uk/HistTopics/Development_group_theory/ (history of work on group theory, quite a lot about Galois theory)
- https://mathshistory.st-andrews.ac.uk/HistTopics/Abstract_groups/ (history of the development of the concept of a group)
- https://mathshistory.st-andrews.ac.uk/Biographies/Galois/ (biography of Galois, whose life story is very dramatic - involving duels and political riots)
- https://mathshistory.st-andrews.ac.uk/Biographies/Abel/ (biography of Abel, another important person in the development of Galois theory)
- https://mathshistory.st-andrews.ac.uk/Biographies/Ruffini/ (biography of Ruffini, who is the first person to have come up with a proof that there are quintic equations which are not soluble by radicals, although his work was little recognised at the time)
- http://mathworld.wolfram.com/Trisection.html (trisecting angles, no proofs)
- http://mathworld.wolfram.com/ConstructiblePolygon.html (constructible polygons, no proofs)
- http://www.cut-the-knot.org/arithmetic/rational.shtml (constructible numbers, with proofs)
- http://www.cut-the-knot.com/arithmetic/cubic.shtml (trisecting angles, with proofs)