Calculus of variations: a detailed introduction


By Brad Rodgers (P1930) on Saturday, October 28, 2000 - 01:34 am :

How would I differentiate the expression:

A=yL/(1+(dy/dx)^2)^{1/2}

where L is a constant, and y is a function of x?

Any help is much appreciated

Thanks,

Brad


By Michael Doré (Md285) on Saturday, October 28, 2000 - 01:34 pm :

So you want to differentiate A with respect to x?

Let u = yL, v = sqrt(1 + (dy/dx)^2) so A = u/v.

Now:

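\begin{eqnarray*}
\frac{du}{dx} &=& L \frac{dy}{dx} \\
\frac{dv}{dx} &=& \frac{\frac{dy}{dx}\frac{d^2y}{dx^2}}{\sqrt{1+(\frac{dy}{dx})^2}} \\
\frac{dA}{dx} &=& \frac{L \frac{dy}{dx}\sqrt{1+(\frac{dy}{dx})^2} - yL\frac{\frac{dy}{dx}\frac{d^2y}{dx^2}}{\sqrt{1+(\frac{dy}{dx})^2}}}{1+(\frac{dy}{dx})^2}
\end{eqnarray*}
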
Whether that is going to simplify nicely I'm not sure. By the way, what is the context of this problem? Perhaps there's an easier way to do it...
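
As a sanity check on the quotient-rule result, here is a minimal Python sketch that compares dA/dx = (v du/dx - u dv/dx)/v^2 with a central finite difference. The test curve y = sin x, the value L = 2 and all function names are assumptions chosen purely for illustration; none of them come from the thread.

```python
import math

L = 2.0  # illustrative constant (assumption)

def y(x):
    return math.sin(x)      # illustrative curve (assumption)

def dy(x):
    return math.cos(x)      # y'

def d2y(x):
    return -math.sin(x)     # y''

def A(x):
    """A = yL / sqrt(1 + (dy/dx)^2)."""
    return y(x) * L / math.sqrt(1.0 + dy(x) ** 2)

def dA_quotient_rule(x):
    """dA/dx from u = yL, v = sqrt(1 + y'^2) and (u/v)' = (v u' - u v') / v^2."""
    u = y(x) * L
    v = math.sqrt(1.0 + dy(x) ** 2)
    du = L * dy(x)
    dv = dy(x) * d2y(x) / v
    return (v * du - u * dv) / v ** 2

def dA_numeric(x, h=1e-5):
    """Central finite-difference estimate of dA/dx."""
    return (A(x + h) - A(x - h)) / (2.0 * h)

x0 = 0.7
print(dA_quotient_rule(x0), dA_numeric(x0))  # the two values should agree closely
```

Agreement at a handful of points is not a proof, but it is a quick way to catch a sign or chain-rule slip in the hand calculation.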
By Brad Rodgers (P1930) on Saturday, October 28, 2000 - 03:51 pm :

Yes, you are right that I wanted to differentiate that with respect to x.

I was actually trying to prove the theorem for the maximum area for a given length of a closed curve. But looking back at this, I've realized that there is no maximum given only the constraints of:

dA=ydx

dL=(1+(dy/dx)^2)^{1/2} dx

I could perhaps change the problem to


$ L = y_a + y_b + \int_a^b (1+(dy/dx)^2)^{1/2}\,dx $

where y_a, y_b signify y at a and b respectively.

But I don't know that this would be any easier to work with.

Thanks,

Brad
By Dan Goodman (Dfmg2) on Saturday, October 28, 2000 - 04:07 pm :

Brad, what you're trying to do (I think) is called the Calculus of Variations, which is covered here at Cambridge in the second year. However, it's not really THAT difficult; I can give you a brief introduction if you'd like. Basically, what happens is that you end up with a differential equation (called the Euler-Lagrange equation), the solution of which is a curve which maximises the area. You can generalise the problem to maximise "functionals" of a curve (of which area is one example) subject to functional constraints (of which length is an example). For your example, the solution is (IIRC) an arc of a circle going through (a, y_a) and (b, y_b) with the length specified. Of course, there's no fun in just being told, so I'll post the (reasonably long) introduction to C of V (which I wrote for someone else who wanted to solve a very similar problem) if you'd like.


By Dan Goodman (Dfmg2) on Saturday, October 28, 2000 - 04:17 pm :

Here it is. Have a go at working your way through it if you like, although it is quite hard.

Partial Derivatives

A partial derivative is basically the same as a normal derivative, but for functions of more than one variable. For instance, if f(x,y)=x^2+y^2 then

\begin{displaymath}\frac{\partial f}{\partial x} = 2x \end{displaymath}

(treat y as constant).

Maximizing Functions

Okay, you know how to maximize f(x): you solve df/dx=0, find which of the values of x satisfying this equation corresponds to the maximum f, and you also need to check that f doesn't increase above this value as x tends to infinity. For functions of many variables, e.g. f(x,y), we have to solve the set of simultaneous equations
\begin{displaymath}\frac{\partial f}{\partial x} = 0, \quad \frac{\partial f}{\partial y} = 0 \end{displaymath}


and that's it.

Lagrange Multipliers

We'll need the concept of Lagrange Multipliers to solve this thorny problem. Consider the problem: maximize f(x) subject to g(x)=0 (x could be a vector here). Later our problem will reduce to one of this form and we need to know how to solve it. Mathematicians don't know a direct way of solving this problem so they consider an alternative problem. Let h(x,w)=f(x)+wg(x). Maximize h(x,w) subject to no constraints. You do this by solving the set of simultaneous eqns:
\begin{displaymath}\frac{\partial h}{\partial x_1} = \frac{\partial h}{\partial x_2} = \cdots = \frac{\partial h}{\partial x_n} = \frac{\partial h}{\partial w} = 0 \end{displaymath}


The same method also works for minimizing. For instance, to minimize f(x,y)=x^2+y^2 subject to g(x,y)=y-x^2+1=0, we set h(x,y,w)=x^2+y^2+w(y-x^2+1) and need to solve
\begin{displaymath}\frac{\partial h}{\partial x} = \frac{\partial h}{\partial y} = \frac{\partial h}{\partial w} = 0 \end{displaymath}


That is solve:

(1) 2x(1-w)=0
(2) 2y+w=0
(3) y-x^2+1=0

From (2) we can eliminate w: w=-2y, and putting this into (1) we get 2x(1+2y)=0. The solutions are x=0 or y=-1/2. From (3) these give y=-1 or x=±sqrt(1/2) respectively. Plug each of the possibilities into f and take the one which gives the minimum.

Why does this method work? Well, we need $ \frac{\partial h}{\partial w} = 0 $, that is g(x)=0, the constraint! So, solving for the unconstrained maximum of this function automatically solves for the constrained maximum of the first function! Neat. OK, onwards.
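
The worked example can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the list of candidates are written out by hand from x=0 (so y=-1) and y=-1/2 (so x=±sqrt(1/2)), with w=-2y in each case.

```python
import math

def f(x, y):
    return x ** 2 + y ** 2

def grad_h(x, y, w):
    """Partial derivatives of h = x^2 + y^2 + w (y - x^2 + 1) wrt x, y and w."""
    return (2 * x * (1 - w), 2 * y + w, y - x ** 2 + 1)

r = math.sqrt(0.5)
candidates = [(0.0, -1.0), (r, -0.5), (-r, -0.5)]

for (x, y) in candidates:
    w = -2 * y                       # from equation (2)
    residuals = grad_h(x, y, w)      # all three should vanish (up to rounding)
    print((x, y, w), residuals, f(x, y))

# The example is a minimisation, so take the candidate with the smallest f.
best = min(candidates, key=lambda p: f(*p))
print(best, f(*best))
```

Both points with y=-1/2 give the same value of f, which is smaller than the value at (0,-1), so the constrained minimum is attained at x=±sqrt(1/2), y=-1/2.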

Variational Calculus

The first idea you need here is that of a functional. A functional is basically a function which takes one function as its argument and produces a numerical result. For instance, J[y]=y(0)+y(1). Then if y=x^2, J[y]=0^2+1^2=1. The functional we will be considering is
\begin{displaymath}J[y] = \int_a^b F(x,y,y')\,dx \end{displaymath}


If F(x,y,y')=y then J[y] is the area under the curve y between points a and b. If F(x,y,y')=sqrt(1+y'^2) then J[y] is the length of the curve y between points a and b (do you know this result? Basically, if L(x) is the length of the curve y from a to x, then for very small lengths, dL^2=dx^2+dy^2; dividing through by dx^2 we get
\begin{displaymath}\Bigl(\frac{dL}{dx}\Bigr)^2 = 1 + \Bigl(\frac{dy}{dx}\Bigr)^2 \end{displaymath}


and taking the square root and integrating we get $ L(x) = \int_a^x \sqrt{1+y'^2}\,dx $.)
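
To make the idea of a functional concrete, here is a small Python sketch that approximates J[y] for the two choices of F mentioned above (area and length). The midpoint rule, the test curve y=x^2 on [0,1] and the function names are assumptions made for illustration only.

```python
import math

def functional(F, y, dy, a, b, n=10000):
    """Approximate J[y] = integral_a^b F(x, y, y') dx with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += F(x, y(x), dy(x))
    return total * h

def area_F(x, y, yp):
    return y                          # F = y gives the area under the curve

def length_F(x, y, yp):
    return math.sqrt(1.0 + yp ** 2)   # F = sqrt(1 + y'^2) gives the length

def y(x):
    return x ** 2                     # illustrative curve (assumption)

def dy(x):
    return 2.0 * x

area = functional(area_F, y, dy, 0.0, 1.0)
length = functional(length_F, y, dy, 0.0, 1.0)
print(area, length)
```

For this curve the exact values are 1/3 for the area and sqrt(5)/2 + asinh(2)/4 for the length, so the printed numbers can be compared against those.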

At this point, we will assume that our function F is twice differentiable wrt x; this won't be a problem for our case, as I have remarked in an earlier email I think. We will also assume that y(a)=y_a and y(b)=y_b, i.e. fixed end points.

To work out what the maximal function y is, we introduce a new function z with the properties that z is also twice differentiable with z(a)=z(b)=0. Now, we introduce a new parameter e (this is actually epsilon), where e>0 and e is very small (it will tend to 0 soon). Now, we let w(x)=y(x)+ez(x). In other words, w is a path near to y with the same endpoints. Here is a picture of the sort of thing I mean:

[Figure: Local paths]

The black line is y, the blue line is w=y+ez. Now let dJ=J[w]-J[y]; this is the increase in the functional when you change y to w. If F=y then dJ is the increase in the area due to changing from y to w. At this point the mathematics becomes rather hairy, so prepare yourself. Now, for small e, we can do a Taylor series approximation about y wrt e, and assume that e^2 is negligible. (In case you don't know, a Taylor series is an approximation to a function using a polynomial. You can show that if f(x)=a_0 + a_1(x-b) + a_2(x-b)^2/2! + a_3(x-b)^3/3! + ... then a_0=f(b), a_1=f'(b), a_2=f''(b), and a_n is the nth derivative of f at b.) From this, we get
\begin{displaymath}dJ = e \frac{\partial }{\partial e}(J[w]) \end{displaymath}

approximately, where the derivative is evaluated at e=0. When J is at a maximum, then dJ=0 (this is a generalisation of the idea of maximizing a function given above). The next steps I don't really want to justify here as it would mean going into rather a lot of detail about the differences between partial derivatives and total derivatives. I can photocopy my relevant notes and post them to you if you like. Basically, we work out dJ precisely as an integral and set it equal to 0, simplify a bit and we get the Euler-Lagrange equation:
\begin{displaymath}\frac{d }{dx}\Bigl(\frac{\partial F}{\partial y'}\Bigr) - \frac{\partial F}{\partial y} = 0 \end{displaymath}

Any maximal function must satisfy this equation. There are some special cases: if there is no dependence on y' in F, e.g. F(x,y,y')=xy, then you only need $ \frac{\partial F}{\partial y} = 0 $. If there is no y dependence, e.g. F(x,y,y')=xy', then $ \frac{\partial F}{\partial y'} = \mbox{constant} $. If x is absent, e.g. F(x,y,y')=yy', then $ F - y'\frac{\partial F}{\partial y'} = \mbox{constant} $.
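
One way to get a feel for the equation is to evaluate the Euler-Lagrange expression d/dx(∂F/∂y') - ∂F/∂y numerically along a trial curve and see whether it vanishes. The following Python sketch does this with central finite differences; the helper name el_residual, the step size h and the two trial curves are assumptions made for illustration, not part of the original notes.

```python
import math

def el_residual(F, y, x, h=1e-3):
    """Estimate d/dx(dF/dy') - dF/dy at x along the curve y (finite differences)."""
    def yp(t):
        return (y(t + h) - y(t - h)) / (2 * h)          # y'(t)

    def F_yp(t):
        p = yp(t)                                        # partial F / partial y'
        return (F(t, y(t), p + h) - F(t, y(t), p - h)) / (2 * h)

    def F_y(t):
        p = yp(t)                                        # partial F / partial y
        return (F(t, y(t) + h, p) - F(t, y(t) - h, p)) / (2 * h)

    return (F_yp(x + h) - F_yp(x - h)) / (2 * h) - F_y(x)

def length_F(x, y, yp):
    return math.sqrt(1.0 + yp ** 2)       # integrand of the length functional

def line(x):
    return 2.0 * x + 1.0                  # a straight line (trial curve)

def parabola(x):
    return x ** 2                         # a curved trial curve

res_line = el_residual(length_F, line, 0.5)
res_parabola = el_residual(length_F, parabola, 0.5)
print(res_line, res_parabola)
```

For the length integrand the residual is essentially zero along the straight line and clearly non-zero along the parabola, which is consistent with the straight line being the extremal curve.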

Here is a simple example of the ideas. If we want to find the curve which connects two points for minimum length (this is obviously a line) then we let F(x,y,y')=sqrt(1+y'^2) so J[y] is the length of the curve. This is a special case corresponding to $ \frac{\partial F}{\partial y'} = k \mbox{ (constant)} $.
\begin{displaymath}\frac{\partial F}{\partial y'} = \frac{y'}{\sqrt{1+y'^2}} = k \end{displaymath}


Therefore y'^2=k^2(1+y'^2). Simplifying, we get y'=±k/sqrt(1-k^2)=K say. Then integrating we get y=Kx+C for some constant C. This is the equation for a straight line. So we have proved that the straight line is the curve of minimum length connecting two points. Nice.
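
The statement that dJ vanishes to first order at the extremal curve can be seen numerically. In this Python sketch the straight line y=x on [0,1] is perturbed to w=y+ez with z=sin(pi x), so that z(0)=z(1)=0; the choice of z, the interval and the function names are assumptions made for illustration.

```python
import math

def length(dw, a=0.0, b=1.0, n=2000):
    """Midpoint-rule estimate of J[w] = integral_a^b sqrt(1 + w'^2) dx."""
    h = (b - a) / n
    return h * sum(math.sqrt(1.0 + dw(a + (i + 0.5) * h) ** 2) for i in range(n))

def dJ(e):
    """J[y + e z] - J[y] for y = x and z = sin(pi x)."""
    def dy(x):
        return 1.0                                            # y' for y = x

    def dw(x):
        return 1.0 + e * math.pi * math.cos(math.pi * x)      # (y + e z)'

    return length(dw) - length(dy)

for e in (0.02, 0.01, -0.01):
    print(e, dJ(e))
```

The change in length is positive for either sign of e, and halving e roughly quarters dJ. That is the behaviour expected when the term linear in e is zero and the leading change is of order e^2, i.e. when y is a minimum of the length functional.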

Integral Constraints

This is the last part of theory we need. Now we want to solve: maximize J[y] subject to K[y]=constant. If J[y] is the area under the graph and K[y] is the length of the curve, this is our problem. Introduce a Lagrange multiplier w, and say I[y]=J[y]+wK[y]. If the function F is in the functional J and the function G is in the functional K, then the Euler-Lagrange equation implies:
\begin{displaymath}\frac{d }{dx}\Bigl(\frac{\partial }{\partial y'}(F+wG)\Bigr) - \frac{\partial }{\partial y}(F+wG) = 0 \end{displaymath}


(i.e. the same as the above equation with F replaced with F+wG). This has the same special cases as before. Now, this can be solved in exactly the same way as before, but you now have two unknowns, so you also need to use the constraint K[y]=constant to determine and eliminate w. When you do this you have the solution to the constrained problem.

You can try and use this to solve the problem if you like, or I can send you the solution to that as well. The question you should try to answer is this:
\begin{displaymath}\mbox{Maximise } \int_a^b y\,dx \end{displaymath}

\begin{displaymath}\mbox{subject to } \int_a^b \sqrt{1+y'^2}\,dx = k \mbox{ (constant)} \end{displaymath}

When you have solved this question (which was on one of our question sheets by the way) then you have solved the circle problem (just set k=pi(b-a)/2). Don't worry if you can't do it, it's not easy, just reply to this mail and I'll give you the answer.
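
Before attempting the proof it can help to see the claim numerically. The Python sketch below takes a=0, b=2 with the curve meeting the x-axis at both ends (an assumption, as are the function names), so k=pi(b-a)/2=pi, and compares the area under a semicircle with the area under an isosceles triangle of the same total length.

```python
import math

a, b = 0.0, 2.0
k = math.pi * (b - a) / 2          # prescribed curve length
half = (b - a) / 2                 # half the base
mid = (a + b) / 2

def area(y, n=2000):
    """Midpoint-rule estimate of integral_a^b y dx."""
    h = (b - a) / n
    return h * sum(y(a + (i + 0.5) * h) for i in range(n))

def semicircle(x):
    # Radius (b - a)/2 centred on the midpoint of [a, b]; its arc length is k.
    return math.sqrt(max(half ** 2 - (x - mid) ** 2, 0.0))

# Isosceles triangle on the same base whose two sloping sides have length k/2 each.
height = math.sqrt((k / 2) ** 2 - half ** 2)

def triangle(x):
    return height * (1 - abs(x - mid) / half)

print(area(semicircle), area(triangle))
```

The semicircle encloses the larger area, as the theory predicts. A single comparison curve proves nothing, of course; it is only a plausibility check on the answer the Euler-Lagrange equation should produce.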

I hope you can make some sense of that!
By Brad Rodgers (P1930) on Saturday, October 28, 2000 - 09:23 pm :

I think I understand all of it except the formulation of the Euler-Lagrange eqn. and the very last part (Integral Constraints). But I do believe that I was able to follow you up to these points.

Thanks,

Brad


By Dan Goodman (Dfmg2) on Saturday, October 28, 2000 - 09:38 pm :

Looking over it again, I see that I didn't actually properly derive the Euler-Lagrange equation, but I'm a little busy to go into the whole details of this right now, so I'm going to post the lecture notes for this course.
[Unfortunately these are no longer available - The Editor]


By Brad Rodgers (P1930) on Tuesday, October 31, 2000 - 07:52 pm :

I'm having a bit of trouble accessing your file. If I can't download it for some reason, don't worry too much, I should be able to find out about the formulation elsewhere.

Thanks,

Brad


By Brad Rodgers (P1930) on Saturday, November 4, 2000 - 01:03 am :

Ok, I can view it. But I am finding a little difficulty understanding it - I get lost with the notation (this is my best attempt to represent it; it's found near the bottom of page 25)


[δy ∂F/∂y']_x1^x2 }=0
What does this mean? I can follow it up to this point though.

Thanks,

Brad
By Dan Goodman (Dfmg2) on Saturday, November 4, 2000 - 03:41 am :
OK, the square bracket notation means evaluate the expression at the upper limit and subtract its value at the lower limit, i.e. [F(x)]_x1^x2 = F(x2)-F(x1). It usually crops up in integrals, e.g. $ \int_a^b f'(x)\,dx = [f(x)]_a^b = f(b)-f(a) $. The }=0 bit means (in the context of the bottom of page 25) that this evaluates to 0. I'm assuming that was your problem rather than the symbols δ and ∂, which pop up before there. The reason the above evaluates to zero is that it is equal to δy(x1)×something - δy(x2)×something, and from the above section in the notes you've assumed δy(x1) = δy(x2) = 0.
By Brad Rodgers (P1930) on Friday, November 10, 2000 - 02:39 am :

Could you post the proof (or outline it)? I've been at this problem for about a week now, with not much luck. I'm not sure how to get the two equations in the needed form to evaluate. Perhaps a worked example would help (the two examples in the text just seem to give the answers and not the method).

Thanks,

Brad


By Brad Rodgers (P1930) on Tuesday, November 14, 2000 - 04:08 am :

Also, what does the equation describing geodesics on a sphere represent? Is it a polar equation?

Thanks,

Brad


By Dan Goodman (Dfmg2) on Tuesday, November 14, 2000 - 07:44 pm :
Sorry Brad, I somehow seemed to have missed your message of 10th November amongst all the other NRICH messages. What exactly did you want me to prove? The circle result? If yes, can you tell me if you're happy about the basic ideas in the notes I sent you, i.e. can I assume this stuff in my answer?

For your second question: The equation for ds^2 on a sphere is given in spherical polar coordinates. This is where you specify a point in 3D by three numbers, r, θ, φ. Conventions differ somewhat on what these mean, but r is always the distance from the point to 0. θ is usually an angle between 0 and 2π: it is the angle of the line you get when you project the line from 0 to the point onto the xy-plane, measured relative to the x-axis. φ is between 0 and π and is the angle between the line from 0 to the point and the z-axis. (The conventions differ on which angle is θ or φ and which axes you use, but the difference is usually not important.) Using this and a bit of geometry, you should be able to work out the cartesian coordinates of a point from its spherical polar coordinates. In the notes, they're assuming the sphere is of radius 1, so you can ignore the r term completely.
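
For reference, the conversion to cartesian coordinates can be sketched in Python as follows, assuming the convention described here (first angle measured in the xy-plane from the x-axis, second angle measured from the z-axis). The function and argument names are illustrative; other texts swap the roles of the two angles.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """theta: angle in the xy-plane from the x-axis; phi: angle from the z-axis."""
    x = r * math.sin(phi) * math.cos(theta)
    y = r * math.sin(phi) * math.sin(theta)
    z = r * math.cos(phi)
    return x, y, z

# A point on the unit sphere, a quarter turn round from the x-axis and
# 60 degrees down from the z-axis.
print(spherical_to_cartesian(1.0, math.pi / 2, math.pi / 3))
```

With r fixed at 1 every point returned lies on the unit sphere, which is why the r term can be dropped when working with geodesics on it.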


By Brad Rodgers (P1930) on Tuesday, November 14, 2000 - 11:23 pm :

The equation I can come up with is


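\begin{displaymath}\frac{d}{dx}\Bigl(\frac{\partial}{\partial y'}\Bigl(\int_0^x y\,dx + \lambda\Bigl(\int_0^x (1+(y')^2)^{1/2}\,dx - L\Bigr)\Bigr)\Bigr) - \frac{\partial}{\partial y}\Bigl(\int_0^x y\,dx + \lambda\Bigl(\int_0^x (1+(y')^2)^{1/2}\,dx - L\Bigr)\Bigr) = 0 \end{displaymath}
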
I doubt that this is right. I'm not quite sure how to apply this concept to integrals, or how to simplify it once given an equation like the one above.

Thanks,

Brad
By Brad Rodgers (P1930) on Tuesday, November 14, 2000 - 11:31 pm :

And, I think I understand everything well but the application. I'm not sure I really understand where everything comes from in the euclidean geodesic example.

Thanks,

Brad


By Dan Goodman (Dfmg2) on Wednesday, November 15, 2000 - 02:26 am :

Hi Brad, I'll get back to you tomorrow on this one, I need to get to bed now :)


By Dan Goodman (Dfmg2) on Sunday, November 19, 2000 - 03:18 am :

Yikes! Looks like I forgot to get back to you, oops. Sorry about that. Unfortunately I have a bit of a work problem at the moment, if anyone else can help here please chime in now! If nobody else replies, I'll try and get back to you as soon as possible (hopefully within the next week or so).


By Dan Goodman (Dfmg2) on Friday, December 1, 2000 - 08:58 pm :

Hello again Brad, I've finally finished lectures and (most of) the work I need to do this term, so I've written up the solution to the Euclidean Geodesic problem. It's quite notation intensive, so I've written it up as a postscript file, which you can download here:
Euclidean Geodesic Example



I'm not entirely sure if this is what you wanted (I've assumed that the derivation of the Euler-Lagrange equations you were happy with, and I've just done the particular example of the Euclidean Geodesic). If you'd like me to comment further on the theory or the Euclidean example, or if you'd like me to do another example for you (for instance, the geodesics on a sphere), post again.

By Brad Rodgers (P1930) on Monday, December 4, 2000 - 08:11 pm :

I understand the Euclidean geodesic example, but I am having trouble using the Lagrange multipliers in the final eqn.

Here is the equation I have been able to get. I am not sure what to do with it or if it's right:

y+w((1+y'^2)^{1/2} - y'^2/(1+y'^2)^{1/2})=C

I'm not sure what to do with this (or even how to solve an eqn. of this sort, for that matter).

Thanks,

Brad


By Kerwin Hui (Kwkh2) on Monday, December 4, 2000 - 09:23 pm :

Try manipulating the equation as follows:

y+w(1+y'^2)^{-1/2}(1+y'^2-y'^2)=C

Which gives

w^2/(C-y)^2 - 1 = y'^2

in which you can separate the variables and integrate.

Kerwin
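
Separating the variables and integrating leads to a circle (x-D)^2+(y-C)^2=w^2, where D is the constant of integration. As a rough numerical check, the Python sketch below takes the upper arc of such a circle and confirms that it satisfies both of the equations above. The radius R, the centre (D, C) and the function names are illustrative assumptions; with the F+wG convention the upper arc corresponds to the negative multiplier w=-R.

```python
import math

R, C, D = 2.0, 0.5, 1.0   # illustrative radius and centre (D, C) (assumptions)
w = -R                    # multiplier value matching the upper arc under F + wG

def y(x):
    return C + math.sqrt(R ** 2 - (x - D) ** 2)        # upper arc of the circle

def dy(x):
    return -(x - D) / math.sqrt(R ** 2 - (x - D) ** 2)  # y'

def first_integral(x):
    """y + w / sqrt(1 + y'^2); should equal C everywhere on the arc."""
    return y(x) + w / math.sqrt(1.0 + dy(x) ** 2)

def separated_form(x):
    """w^2 / (C - y)^2 - 1 - y'^2; should vanish everywhere on the arc."""
    return w ** 2 / (C - y(x)) ** 2 - 1.0 - dy(x) ** 2

for x in (-0.5, 0.3, 1.0, 2.2):
    print(x, first_integral(x), separated_form(x))
```

The first column of results stays at C and the second at zero (up to rounding) for every sample point, so a circular arc of radius |w| is indeed a solution.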


By Dan Goodman (Dfmg2) on Tuesday, December 5, 2000 - 04:23 am :

Nice try Brad, a small error in the algebra I think, here is my worked solution (again, using postscript for clarity):
Worked solution to circle problem



Hope that's OK.

By Brad Rodgers (P1930) on Thursday, December 7, 2000 - 08:00 pm :

Ah, I see. I was putting in (extremize)=F+wG rather than (extremize)=F-wG. The latter makes much more sense for extremizing, so I'm not sure why I wanted to use the one I did. Anyway, I believe I understand the logic behind it - it's a very elegant proof once you are able to see how the Euler-Lagrange eqn. works.

Thanks,

Brad


By Dan Goodman (Dfmg2) on Thursday, December 7, 2000 - 09:35 pm :

I don't think that there should be a difference in extremizing F+wG instead of F-wG because an extremum of F-wG with w=w0 would be an extremum of F+wG with w=-w0 (negative values aren't ruled out for w). However, intuitively it does make more sense to extremize F-wG which is why I did it that way round :). I'm glad I helped, it's a very difficult thing to explain without a year or so of university maths to call on!