Calculus of variations: a detailed
introduction
By Brad Rodgers (P1930) on Saturday,
October 28, 2000 - 01:34 am :
How would I differentiate the expression:
$A = yL/(1+(dy/dx)^2)^{1/2}$
where L is a constant, and y is a function of x?
Any help is much appreciated
Thanks,
Brad
By Michael Doré (Md285) on Saturday, October
28, 2000 - 01:34 pm :
So you want to differentiate A with
respect to x?
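Let $u = yL$, $v = \sqrt{1 + (dy/dx)^2}$, so $A = u/v$.
Now, by the quotient rule:
$dA/dx = (v\,du/dx - u\,dv/dx)/v^2$, where $du/dx = L\,dy/dx$ and $dv/dx = (dy/dx)(d^2y/dx^2)/v$.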
Whether that is going to simplify nicely I'm not sure. By the
way, what is the context of this problem? Perhaps there's an
easier way to do it...
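As a quick sanity check of the quotient-rule step, here is a minimal Python sketch. It assumes, purely for illustration, y = sin x and L = 2 (neither choice comes from the thread), and compares the quotient-rule expression (v du/dx - u dv/dx)/v^2 with a central finite difference of A.

```python
import math

L = 2.0  # the constant L; the value 2 is an arbitrary illustrative choice


def y(x):
    return math.sin(x)       # assumed sample curve, not from the thread


def dy(x):
    return math.cos(x)       # y'


def d2y(x):
    return -math.sin(x)      # y''


def A(x):
    """A = yL / sqrt(1 + y'^2)."""
    return y(x) * L / math.sqrt(1.0 + dy(x) ** 2)


def dA_quotient_rule(x):
    """dA/dx from the quotient rule with u = yL and v = sqrt(1 + y'^2)."""
    u, du = y(x) * L, dy(x) * L
    v = math.sqrt(1.0 + dy(x) ** 2)
    dv = dy(x) * d2y(x) / v
    return (v * du - u * dv) / v ** 2


def dA_numeric(x, h=1e-5):
    """Central finite-difference estimate of dA/dx."""
    return (A(x + h) - A(x - h)) / (2.0 * h)


for x in (0.3, 1.0, 2.5):
    print(x, dA_quotient_rule(x), dA_numeric(x))
```

The two columns should agree to several decimal places; swapping in another smooth y only needs its first and second derivatives.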
By Brad Rodgers (P1930) on Saturday,
October 28, 2000 - 03:51 pm :
Yes, you are right that I wanted to differentiate that with
respect to x.
I was actually trying to prove the theorem for the maximum area
for a given length of a closed curve. But looking back at this,
I've realized that there is no maximum given only the constraints
of:
$dA = y\,dx$
$dL = (1+(dy/dx)^2)^{1/2}\,dx$
I could perhaps change the problem to
$L = y_a + y_b + \int_a^b (1+(dy/dx)^2)^{1/2}\,dx$
where $y_a$, $y_b$ signify y at a and b
respectively.
But I don't know that this would be any easier to work
with.
Thanks,
Brad
By Dan Goodman (Dfmg2) on Saturday,
October 28, 2000 - 04:07 pm :
Brad, what you're trying to do (I think)
is called the Calculus of Variations, which is covered here at
Cambridge in the second year. However, it's not really THAT
difficult, I can give you a brief introduction if you'd like?
Basically, what happens is that you end up with a differential
equation (called the Euler-Lagrange equation), the solution of
which is a curve which maximises the area. You can generalise the
problem to maximise "functionals" of a curve (of which area is
one example) subject to functional constraints (of which length
is an example). For your example, the solution is (IIRC) an arc
of a circle going through $(a, y_a)$ and $(b, y_b)$
with the length specified. Of course, there's no fun in just
being told, I'll post the (reasonably long) introduction to C of
V (which I wrote for someone else who wanted to solve a very
similar problem) if you'd like?
By Dan Goodman (Dfmg2) on Saturday,
October 28, 2000 - 04:17 pm :
Here it is. Have a go at working your
way through it if you like, although it is quite hard.
Partial Derivatives
A partial derivative is basically the same as a normal
derivative, but for functions of more than one variable. For
instance, if $f(x,y) = x^2 + y^2$
then
$\partial f/\partial x = 2x$
(treat y as constant).
Maximizing Functions
Okay, you know how to maximize f(x): you solve df/dx=0, find
which of the values of x satisfying this equation corresponds to the
maximum f, and also check that f doesn't increase
above this value as x tends to infinity. For functions of many
variables, e.g. f(x,y), we have to solve the set of simultaneous
equations
$\partial f/\partial x = 0$, $\partial f/\partial y = 0$
and that's it.
Lagrange Multipliers
We'll need the concept of Lagrange Multipliers to solve this
thorny problem. Consider the problem: maximize f(x) subject to
g(x)=0 (x could be a vector here). Later our problem will reduce
to one of this form and we need to know how to solve it.
Mathematicians don't know a direct way of solving this problem so
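they consider an alternative problem. Let $h(x,w) = f(x) + w\,g(x)$.
Maximize h(x,w) subject to no constraints (strictly speaking, find
its stationary points). You do this by solving
the set of simultaneous eqns:
$\partial h/\partial x = 0$, $\partial h/\partial w = 0$.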
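The same method also works for minimizing. For instance, to
minimize $f(x,y) = x^2 + y^2$ subject to
$g(x,y) = y - x^2 + 1 = 0$, we set
$h(x,y,w) = x^2 + y^2 + w(y - x^2 + 1)$ and need to solve
$\partial h/\partial x = \partial h/\partial y = \partial h/\partial w = 0$.
That is, solve:
(1) $2x(1-w) = 0$
(2) $2y + w = 0$
(3) $y - x^2 + 1 = 0$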
From (2), $w = -2y$; putting this into (1) eliminates w and
gives $2x(1+2y) = 0$. The solutions are $x = 0$ or $y = -1/2$.
From (3) these give $y = -1$ or $x = \pm\sqrt{1/2}$ respectively. Plug each of
the possibilities into f and take the one which gives the
minimum: $f(0,-1) = 1$ and $f(\pm\sqrt{1/2}, -1/2) = 3/4$, so the minimum is 3/4.
Why does this method work? Well, we need
$\partial h/\partial w = 0$, that is g(x)=0, the constraint!
So, solving for the unconstrained maximum of this function
automatically solves for the constrained maximum of the first
function! Neat. OK, onwards.
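The worked example above can be checked with a few lines of Python. This is only a sketch: the helper name grad_h is made up for illustration, and the candidate points are the ones read off from equations (1) to (3), with w recovered from (2) as w = -2y.

```python
import math


def f(x, y):
    return x ** 2 + y ** 2


def g(x, y):
    return y - x ** 2 + 1


def grad_h(x, y, w):
    """Partial derivatives of h = f + w*g with respect to x, y and w."""
    return (2 * x * (1 - w), 2 * y + w, g(x, y))


# Candidate points from equations (1)-(3).
r = math.sqrt(0.5)
candidates = [(0.0, -1.0), (r, -0.5), (-r, -0.5)]

for x, y in candidates:
    w = -2 * y  # from equation (2)
    print((x, y, w), "grad h =", grad_h(x, y, w), "f =", f(x, y))

# The constrained minimum is the candidate with the smallest f.
best = min(candidates, key=lambda p: f(*p))
print("constrained minimum of f:", f(*best), "at", best)
```

All three partial derivatives vanish (up to rounding) at each candidate, and comparing f at the candidates picks out the points with y = -1/2.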
Variational Calculus
The first idea you need here is that of a functional. A
functional is basically a function which takes one function as
its argument and produces a numerical result. For instance,
J[y]=y(0)+y(1). Then if $y = x^2$, then $J[y] = 0^2 + 1^2 = 1$.
The functional we will be considering is
$J[y] = \int_a^b F(x, y, y')\,dx$.
If F(x,y,y')=y then J[y] is the area under the curve y between
points a and b. If $F(x,y,y') = \sqrt{1+y'^2}$ then J[y] is the
length of the curve y between points a and b (do you know this
result? Basically, if L(x) is the length of the curve y from a
to x, then for very small lengths, $dL^2 = dx^2 + dy^2$;
dividing through by $dx^2$ we get
$(dL/dx)^2 = 1 + (dy/dx)^2$
and integrating we get $L(x) = \int_a^x \sqrt{1+y'^2}\,dx$.)
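To see the length functional in action, here is a small numerical sketch. The function name arc_length and the midpoint-rule integration are choices made for this illustration, not part of the original notes. It integrates sqrt(1 + y'^2) for y = cosh x on [0, 1], where the exact length is known to be sinh(1).

```python
import math


def arc_length(dy, a, b, n=1000):
    """Approximate the integral of sqrt(1 + y'(x)^2) from a to b (midpoint rule).

    dy is a function returning the derivative y'(x).
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x_mid = a + (i + 0.5) * h
        total += math.sqrt(1.0 + dy(x_mid) ** 2)
    return total * h


# y = cosh(x) has y' = sinh(x), and sqrt(1 + sinh^2) = cosh, so the length is sinh(1).
print(arc_length(math.sinh, 0.0, 1.0), math.sinh(1.0))

# A straight line y = x has y' = 1, so its length over [0, 1] is sqrt(2).
print(arc_length(lambda x: 1.0, 0.0, 1.0), math.sqrt(2.0))
```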
At this point, we will assume that our function F is twice
differentiable wrt x, this won't be a problem for our case as I
have remarked in an earlier email I think. We will also assume
that $y(a) = y_a$ and $y(b) = y_b$, i.e. fixed end
points.
To work out what the maximal function y is, we introduce a new
function z with the properties that z is also twice
differentiable with z(a)=z(b)=0. Now, we introduce a new
parameter ε (epsilon), where ε > 0 and ε is
very small (it will tend to 0 soon). Now, we let $w(x) = y(x) + \varepsilon z(x)$.
In other words, w is a path near to y with the same endpoints.
Here is a picture of the sort of thing I mean:

The black line is y, the blue line is $w = y + \varepsilon z$. Now let
$dJ = J[w] - J[y]$; this is the increase in the functional when you
change y to w. If F=y then dJ is the increase in the area due to
changing from y to w. At this point the mathematics becomes
rather hairy, so prepare yourself. Now, for small ε, we can do a
Taylor series approximation about y wrt ε, and assume that
$\varepsilon^2$ is negligible. (In case you don't know, a Taylor
series is an approximation to a function using a polynomial. You
can show that if $f(x) = a_0 + a_1(x-b) + a_2(x-b)^2/2! + a_3(x-b)^3/3! + \dots$
then $a_0 = f(b)$, $a_1 = f'(b)$, $a_2 = f''(b)$, and $a_n$ is the
nth derivative of f at b.) From this, we get
$dJ = \varepsilon \int_a^b \left( z\,\partial F/\partial y + z'\,\partial F/\partial y' \right) dx$
approximately. When J is at a maximum, then dJ=0 (this is a
generalisation of the idea of maximizing a function given above).
The next steps I don't really want to justify here as it would
mean going into rather a lot of detail about the differences
between partial derivatives and total derivatives. I can
photocopy my relevant notes and post them to you if you like.
Basically, we work out dJ precisely as an integral and set it
equal to 0, simplify a bit and we get the Euler-Lagrange
equation:
$\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0$
Any maximal function must satisfy this equation. There are some
special cases. If there is no dependence on y' in F, e.g.
F(x,y,y')=xy, then you only need
$\partial F/\partial y = 0$. If there is no y dependence,
e.g. F(x,y,y')=xy', then
$\partial F/\partial y' = \text{constant}$. If x is absent, e.g.
F(x,y,y')=yy', then
$F - y'\,\partial F/\partial y' = \text{constant}$.
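For readers who want the step that was skipped, here is a sketch of the usual textbook argument, written in the notation of this post (a perturbation εz with z(a)=z(b)=0). It is an outline under the smoothness assumptions stated earlier, not a rigorous proof.

```latex
\begin{align*}
dJ &= J[y+\varepsilon z] - J[y]
    = \int_a^b \Bigl( F(x,\, y+\varepsilon z,\, y'+\varepsilon z') - F(x, y, y') \Bigr)\,dx \\
   &\approx \varepsilon \int_a^b \Bigl( z\,\frac{\partial F}{\partial y}
        + z'\,\frac{\partial F}{\partial y'} \Bigr)\,dx
     \qquad \text{(dropping terms in } \varepsilon^2\text{)} \\
   &= \varepsilon \Bigl[ z\,\frac{\partial F}{\partial y'} \Bigr]_a^b
     + \varepsilon \int_a^b z \Bigl( \frac{\partial F}{\partial y}
        - \frac{d}{dx}\frac{\partial F}{\partial y'} \Bigr)\,dx
     \qquad \text{(integrating the } z' \text{ term by parts)} \\
   &= \varepsilon \int_a^b z \Bigl( \frac{\partial F}{\partial y}
        - \frac{d}{dx}\frac{\partial F}{\partial y'} \Bigr)\,dx
     \qquad \text{(since } z(a) = z(b) = 0\text{)}.
\end{align*}
```

If dJ is to vanish for every admissible z, the bracket inside the last integral must be zero everywhere, which is the Euler-Lagrange equation $\frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) - \frac{\partial F}{\partial y} = 0$.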
Here is a simple example of the ideas. If we want to find the
curve which connects two points for minimum length (this is
obviously a line) then we let $F(x,y,y') = \sqrt{1+y'^2}$
so J[y] is the length of the curve. This is a special case
(no y dependence) corresponding to
$\partial F/\partial y' = y'/\sqrt{1+y'^2} = k$, a constant.
Therefore $y'^2 = k^2(1+y'^2)$.
Simplifying, we get $y' = \pm k/\sqrt{1-k^2} = K$ say. Then
integrating we get y=Kx+C for some constant C. This is the
equation for a straight line. So we have proved that the straight
line is the curve of minimum length connecting two points.
Nice.
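A numerical illustration of the same result, as a sketch only: take the straight line y = Kx on [0, 1] and perturb it by εz with z(x) = sin(πx), which vanishes at both endpoints. The slope K = 2, the choice of z and the helper names are all assumptions made for this example.

```python
import math

K = 2.0  # slope of the straight line; arbitrary illustrative value


def path_length(slope, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of the integral of sqrt(1 + y'^2) from a to b."""
    h = (b - a) / n
    return sum(math.sqrt(1.0 + slope(a + (i + 0.5) * h) ** 2) for i in range(n)) * h


def perturbed_slope(eps):
    """Derivative of w(x) = K*x + eps*sin(pi*x), a path with the same endpoints."""
    return lambda x: K + eps * math.pi * math.cos(math.pi * x)


for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(eps, path_length(perturbed_slope(eps)))
```

The printed length is smallest at ε = 0, where it equals sqrt(1 + K^2), and grows as the perturbation is switched on in either direction.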
Integral Constraints
This is the last part of theory we need. Now we want to solve:
maximize J[y] subject to K[y]=constant. If J[y] is the area under
the graph and K[y] is the length of the curve, this is our
problem. Introduce a Lagrange multiplier w, and say
I[y]=J[y]+wK[y]. If the function F is in the functional J and the
function G is in the functional K, then the Euler-Lagrange
equation implies:
$\frac{d}{dx}\left(\frac{\partial (F+wG)}{\partial y'}\right) - \frac{\partial (F+wG)}{\partial y} = 0$
(i.e. the same as the above equation with F replaced with F+wG).
This has the same special cases as before. Now, this can be
solved in exactly the same way as before, but you have two
variables, you also need to use the constraint K[y]=constant to
specify and eliminate w. When you do this you have the solution
to the constrained problem.
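As a hedged check of this recipe for the area and length functionals, the sketch below takes F = y and G = sqrt(1 + y'^2) and evaluates the left-hand side of the Euler-Lagrange equation for F + wG along the upper half of a circle of radius R centred at the origin. The radius R = 1.5 and the function names are illustrative assumptions. For this choice, d/dx of ∂(F+wG)/∂y' works out to w y''/(1+y'^2)^(3/2), and ∂(F+wG)/∂y = 1.

```python
import math

R = 1.5  # circle radius; arbitrary illustrative value


def y(x):
    return math.sqrt(R * R - x * x)   # upper half of the circle x^2 + y^2 = R^2


def dy(x):
    return -x / y(x)                  # y'


def d2y(x):
    return -R * R / y(x) ** 3         # y''


def euler_lagrange_residual(x, w):
    """d/dx( d(F+wG)/dy' ) - d(F+wG)/dy for F = y, G = sqrt(1 + y'^2)."""
    return w * d2y(x) / (1.0 + dy(x) ** 2) ** 1.5 - 1.0


w = -R  # the multiplier that makes the residual vanish on this arc
for x in (-1.0, 0.0, 0.7):
    print(x, euler_lagrange_residual(x, w))
```

The residual is zero (up to rounding) at every x, with w = -R. So a circular arc satisfies the constrained equation, and the multiplier here comes out negative.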
You can try and use this to solve the problem if you like, or I
can send you the solution to that as well. The question you
should try to answer is this:
When you have solved this question (which was on one of our
question sheets by the way) then you have solved the circle
problem (just set $k = \pi(b-a)/2$). Don't worry if you can't do it,
it's not easy, just reply to this mail and I'll give you the
answer.
I hope you can make some sense of that!
By Brad Rodgers (P1930) on Saturday,
October 28, 2000 - 09:23 pm :
I think I understand all of it except the formulation of the
Euler-Lagrange eqn. and the very last part (Integral Constraints).
But, I do believe that I was able to follow you up to these
points.
Thanks,
Brad
By Dan Goodman (Dfmg2) on Saturday,
October 28, 2000 - 09:38 pm :
Looking over it again, I see that I
didn't actually properly derive the Euler-Lagrange equation, but
I'm a little busy to go into the whole details of this right now,
so I'm going to post the lecture notes for this
course.
[Unfortunately these are no longer
available - The Editor]
By Brad Rodgers (P1930) on Tuesday,
October 31, 2000 - 07:52 pm :
I'm having a bit of trouble accessing your file. If I can't
download it for some reason, don't worry too much, I should be
able to find out about the formulation elsewhere.
Thanks,
Brad
By Brad Rodgers (P1930) on Saturday,
November 4, 2000 - 01:03 am :
OK, I can view it. But I am finding it a little difficult to
understand - I get lost with the notation (this is my best attempt
to represent it; it's found near the bottom of page 25)
$[\delta y\,\partial F/\partial y']_{x_1}^{x_2}\} = 0$
What does this mean? I can follow it up to this point
though.
Thanks,
Brad
By Dan Goodman (Dfmg2) on Saturday,
November 4, 2000 - 03:41 am :
OK, the square bracket notation means evaluate the expression at the
two limits and subtract, i.e. $[F(x)]_{x_1}^{x_2} = F(x_2) - F(x_1)$. It usually crops up in
integrals, e.g. $\int_a^b f'(x)\,dx = [f(x)]_a^b = f(b) - f(a)$. The }=0 bit
means (in the context of the bottom of page 25) that this evaluates to 0. I'm
assuming that was your problem rather than the symbols δ and ∂,
which pop up before there. The reason the above evaluates to zero is that it
is equal to δy(x2)×something - δy(x1)×something
and from the above section in the notes you've assumed δy(x1) = δy(x2) = 0.
By Brad Rodgers (P1930) on Friday,
November 10, 2000 - 02:39 am :
Could you post the proof (or outline it)? I've been at this
problem for about a week now, with not much luck. I'm not sure
how to get the two equations in the needed form to evaluate.
Perhaps a worked example would help (the two examples in the text
just seem to give the answers and not the method).
Thanks,
Brad
By Brad Rodgers (P1930) on Tuesday,
November 14, 2000 - 04:08 am :
Also, what does the equation describing geodesics on a sphere
represent? Is it a polar equation?
Thanks,
Brad
By Dan Goodman (Dfmg2) on Tuesday,
November 14, 2000 - 07:44 pm :
Sorry Brad, I somehow seemed to have missed your message of
10th November amongst all the other NRICH messages. What exactly did you want
me to prove? The circle result? If yes, can you tell me if you're happy about
the basic ideas in the notes I sent you, i.e. can I assume this stuff in my
answer?
For your second question: The equation for $ds^2$ on a sphere is given in spherical
polar coordinates. This is where you specify a point in 3D by three numbers,
r, θ, φ. Conventions differ somewhat on what these mean, but
r is always the distance from the point to 0. θ is usually an angle
between 0 and 2π; it is the angle of the line you get when you project the
line from 0 to the point onto the xy-plane, measured relative to
the x-axis. φ is between 0 and π and is the angle between the line
from 0 to the point and the z-axis. (The conventions differ on which angle is
θ or φ and which axes you use, but the difference is usually not
important.) Using this and a bit of geometry, you should be able to work out
the cartesian coordinates of a point from its spherical polar coordinates.
In the notes, they're assuming the sphere is of radius 1, so you can ignore
the r term completely.
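The "bit of geometry" can be written out as a short Python sketch. It uses the convention described in this post (θ measured in the xy-plane from the x-axis, φ measured down from the z-axis). Other texts swap the two angles, and the function name is just an illustrative choice.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """Convert (r, theta, phi) to (x, y, z).

    theta: angle in the xy-plane from the x-axis, between 0 and 2*pi.
    phi:   angle from the z-axis, between 0 and pi.
    """
    x = r * math.sin(phi) * math.cos(theta)
    y = r * math.sin(phi) * math.sin(theta)
    z = r * math.cos(phi)
    return (x, y, z)


# A point on the unit sphere (r = 1), as assumed in the notes.
print(spherical_to_cartesian(1.0, math.pi / 4, math.pi / 3))
```

With r = 1, the output always satisfies x^2 + y^2 + z^2 = 1, which is why the r term can be ignored on the unit sphere.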
By Brad Rodgers (P1930) on Tuesday,
November 14, 2000 - 11:23 pm :
The equation I can come up with is
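$\frac{d}{dx}\left(\frac{\partial}{\partial y'}\left(\int_0^x y\,dx + \lambda\left(\int_0^x (1+(y')^2)^{1/2}\,dx - L\right)\right)\right) - \frac{\partial}{\partial y}\left(\int_0^x y\,dx + \lambda\left(\int_0^x (1+(y')^2)^{1/2}\,dx - L\right)\right) = 0$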
I doubt that this is right. I'm not quite sure how to apply this
concept to integrals, or how to simplify it once given an
equation like the one above.
Thanks,
Brad
By Brad Rodgers (P1930) on Tuesday,
November 14, 2000 - 11:31 pm :
And, I think I understand everything well but the application.
I'm not sure I really understand where everything comes from in
the euclidean geodesic example.
Thanks,
Brad
By Dan Goodman (Dfmg2) on Wednesday,
November 15, 2000 - 02:26 am :
Hi Brad, I'll get back to you tomorrow
on this one, I need to get to bed now :)
By Dan Goodman (Dfmg2) on Sunday,
November 19, 2000 - 03:18 am :
Yikes! Looks like I forgot to get back
to you, oops. Sorry about that. Unfortunately I have a bit of a
work problem at the moment, if anyone else can help here please
chime in now! If nobody else replies, I'll try and get back to
you as soon as possible (hopefully within the next week or
so).
By Dan Goodman (Dfmg2) on Friday,
December 1, 2000 - 08:58 pm :
Hello again Brad, I've finally finished
lectures and (most of) the work I need to do this term, so I've
written up the solution to the Euclidean Geodesic problem. It's
quite notation intensive, so I've written it up as a postscript
file, which you can download here:
Euclidean
Geodesic Example
I'm not entirely sure if this is what you wanted (I've assumed
that the derivation of the Euler-Lagrange equations you were
happy with, and I've just done the particular example of the
Euclidean Geodesic). If you'd like me to comment further on the
theory or the Euclidean example, or if you'd like me to do
another example for you (for instance, the geodesics on a
sphere), post again.
By Brad Rodgers (P1930) on Monday,
December 4, 2000 - 08:11 pm :
I understand the euclidean geodesic example, but I am having
trouble using the Lagrange multipliers in the final eqn.
Here is the equation I have been able to get. I am not sure what
to do with it or if it's right:
$y + w\left((1+y'^2)^{1/2} - y'^2/(1+y'^2)^{1/2}\right) = C$
I'm not sure what to do with this (or even solve an eqn. of this
sort for that matter).
Thanks,
Brad
By Kerwin Hui (Kwkh2) on Monday,
December 4, 2000 - 09:23 pm :
Try manipulating the equation as
follows:
$y + w(1+y'^2)^{-1/2}(1+y'^2-y'^2) = C$
which gives
$w^2/(C-y)^2 - 1 = y'^2$
in which you can separate the variables and integrate.
Kerwin
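For completeness, here is a sketch of how the separation of variables suggested above could go, starting from $w^2/(C-y)^2 - 1 = y'^2$. The constant of integration D is a name introduced here for illustration.

```latex
\begin{align*}
y'^2 = \frac{w^2}{(C-y)^2} - 1
  \;&\Longrightarrow\;
  \frac{dy}{dx} = \pm\frac{\sqrt{w^2 - (C-y)^2}}{C-y} \\
  \;&\Longrightarrow\;
  \int \frac{(C-y)\,dy}{\sqrt{w^2 - (C-y)^2}} = \pm\int dx \\
  \;&\Longrightarrow\;
  \sqrt{w^2 - (C-y)^2} = \pm(x - D) \\
  \;&\Longrightarrow\;
  (x-D)^2 + (y-C)^2 = w^2 .
\end{align*}
```

This is a circle of radius |w| centred at (D, C); the constants w, C and D would then be fixed by the two endpoint conditions and the length constraint.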
By Dan Goodman (Dfmg2) on Tuesday,
December 5, 2000 - 04:23 am :
Nice try Brad, a small error in the
algebra I think, here is my worked solution (again, using
postscript for clarity):
Worked
solution to circle problem
Hope that's OK.
By Brad Rodgers (P1930) on Thursday,
December 7, 2000 - 08:00 pm :
Ah, I see. I was putting in (extremize)=F+wG rather than
(extremize)=F-wG. The latter makes much more sense for
extremizing, so I'm not sure why I wanted to use the one I did.
Anyways, I believe I understand the logic behind it -it's a very
elegant proof once you are able to see how the Euler-Lagrange
eqn. works.
Thanks,
Brad
By Dan Goodman (Dfmg2) on Thursday,
December 7, 2000 - 09:35 pm :
I don't think that there should be a
difference in extremizing F+wG instead of F-wG because an
extremum of F-wG with $w = w_0$ would be an extremum of
F+wG with $w = -w_0$ (negative values aren't ruled out for
w). However, intuitively it does make more sense to extremize
F-wG which is why I did it that way round :). I'm glad I helped,
it's a very difficult thing to explain without a year or so of
university maths to call on!