Minimising i=1 1381 ai 2 = a1 2 + a2 2 +...+ a1381 2


By Abbas Mehrabian on Monday, May 13, 2002 - 08:03 am:

Hi,
Here is my problem:


Suppose a1 , a2 , ..., a1380 are non-negative real numbers, such that:

i=1 1381 ai = a1 + a2 +...+ a1381 =2

i=1 1380 ai ai+1 = a1 a2 + a2 a3 +...+ a1380 a1381 =1

Let S= i=1 1381 ai 2 = a1 2 + a2 2 +...+ a1381 2

Find the maximum and minimum of S.

By squaring i=1 1381 ai , I've proved that the maximum of S is 2, and it happens when a1 = a2 =1, a3 = a4 =...= a1381 =0.

Using Cauchy inequality, I've also proved that the minimum of S, is more than 1. I guess it's 1.5.
Can anybody help me solving this problem?
Abbas.


By Andre Rzym on Saturday, May 25, 2002 - 07:01 pm:

I don't have an answer (yet), but I wonder if the way to tackle it is to focus on the fact that if the sum of the terms is 2, the sum of products of adjacent terms cannot exceed 1 (I think). By imposing 1 as an actual constraint appears to be limiting severly the range of values that the a's can take.

If we can prove that (for example) the only values that the a's could take were:

i) Two adjacent 1's and zeros everywhere else

or

ii) 0.5,1,0.5 as adjacent terms and zeros everywhere else

then the result follows (i.e. that the minimum value of the sum of squares is 1.5, which appears to be the case).

Andre


By Ian Short on Sunday, May 26, 2002 - 12:48 pm:

Well that sounds like a fair idea, but note that if n=3 rather than n=1381 then we have:

a+b+c=2
ab+bc=1

This is the same as:

b=1
a+c=1

So there are lots of solutions (although 1.5 is minimum). Why do you think minimum is 1.5, I have only checked n=3 case?

Ian


By Abbas Mehrabian on Monday, May 27, 2002 - 11:36 am:
If we have only 4 positive terms: a,b,c,d, a or d must be 0:

a+b+c+d = 2 => d = 2-a-b-c
ab+bc+cd = 1 => ab+bc+c(2-a-b-c) = 1 => ab-ac = c2 -2c+1
So,
a(b-c) = (c-1)2 > = 0
If a > 0 then b > = c.
Using a similiar work, we'll have:
If d > 0 then c > = b.
Hence if a,d > 0, then b = c = 1 which means a = d = 0 ! So a or d must be 0.
Maybe one can find a way to prove this using induction:
If such a1 to a1381 exist, only 3 adjacent terms can be positive.
Abbas.


By Andre Rzym on Tuesday, May 28, 2002 - 08:05 am:

We start by posing the question that I suggested above:


If g({ ai })= 1 1381 ai -2=0

what is the maximum value of

F({ ai })= 1 1380 ai ai+1

given the additional constraint that the ai are non-negative?

We first observe that F is bounded - it can't exceed 1380*4 for example.

We use lagrange multipliers to find the min/max/stationary points of F by constructing

H=F-λg

Then all partial derivatives with respect to ai are zero, so we must have:

(1) g=0

(2) dH/ da1 = a2 -λ=0

(3) dH/ da2 =( a1 + a3 )-λ=0

(4) dH/ da3 =( a2 + a4 )-λ=0

...

(1382) dH/ da1381 = a1380 -λ=0

(2) and (4) tell us that a4 =0, therefore the solution that minimises F has at least 1 zero.

These zeros may be to the left, the right but not in between the nonzero terms. For if the latter were the case, we could construct a larger F by removing the intervening zeros and putting them at one end.

Therefore, the ai 's that maximise F are those that maximise F but with only 1380 terms, plus a zero on one or the other end.

We can repeat the argument for the case with 1380 terms and so on. The argument fails when either we have 2 or 3 ai 's.

With 3 terms, we have

g({ ai })=( a1 + a2 + a3 )-2=0

F({ ai })= a1 a2 + a2 a3

H=F-λg

so again using lagrange multipliers:

(1) g=0

(2) dH/ da1 = a2 -λ=0

(3) dH/ da2 =( a1 + a3 )-λ=0

(4) dH/ da3 = a2 -λ=0

So

a1 =0.5+δ

a2 =1

a3 =0.5-δ

for some δ.

With 2 terms, we have

g({ ai })=( a1 + a2 )-2=0

F({ ai })= a1 a2

H=F-λg

so again using lagrange multipliers:

(1) g=0

(2) dH/ da1 = a2 -λ=0

(3) dH/ da3 = a2 -λ=0

So a1 =1

a2 =1

We have found the solutions that maximise F. The maximum is indeed 1. Now Abbas' problem specified that F=1. Therefore we know what the possible values of ai are.

Now with 3 nonzero terms, the minimum sum of squares is 1.5, the maximum is 2 (when δ=0).

With 2 nonzero terms, the sum of squares is 2.
Thus the minimum sum of squares is 1.5 and the maximum is 2.

Andre


By Abbas Mehrabian on Tuesday, May 28, 2002 - 05:26 pm:

Thanks Andre!
Please tell me about Lagrange multiplier, and partial derivative.
Abbas.


By Andre Rzym on Wednesday, May 29, 2002 - 09:08 pm:

An unfortunate property of this proof is that it has to rely on these fairly advanced concepts. I feel sure a more elementary solution must exist - I just can't find it.

Would anyone care to opine on whether the proof is correct? Ian?

As for partial derivatives, are you familiar with differentiation? The idea there is (given a function of x, F(x)), we compute dF/dx - the ratio of the change in F to a small change in x (a terse description but hopefully you've seen it before).

What, though, if F were a function of two variables. E.g. if F(x,y)=x2 y2 ? If we compute the ratio of the change in F to a small change in x keeping y constant then we are said to be taking the "partial derivative of F with respect to x". Similarly, if we compute the ratio of the change in F to a small change in y keeping x constant then we are said to be taking the "partial derivative of F with respect to y". In our example, the first partial derivative is 2xy2 and the second partial derivative is 2x2 y.


In the context of your problem, why are we interested in partial derivatives? Suppose that we move x by a small amount δx and y by a small amount δy. Then the corresponding change in F is given by (both derivatives being partial):

(dF/dx)δx+(dF/dy)δy

This is analagous to the first term of a Taylor expansion for functions of a single variable.

[as a complete aside, and unrelated to this post, I cannot resist pointing out that even for small δx, it is not correct to ignore higher derivatives if x is stochastic (random). This issue occurs all the time in the world of financial mathematics]

Now we come to Lagrange multipliers.

If we have a function of a single variable, Q(x), we can locate max/min/stationary points by solving dQ/dx=0. If Q was a function of 2 variables, Q(x,y), we can do the same by simultaneously solving dQ/dx=0; dQ/dy=0 (both partial derivatives).

What, though, if we had to maximise Q(x,y) subject to a constraint on (x,y), like g(x,y)=0?

If you think about Q as being represented graphically as a series of contours (each representing different values of Q), and the constraint g=0 being another contour, then we are looking for the point at which both contours touch but do not cross.

How do we represent this intuition mathematically?

Well if

g(x,y)=0

then for a small move in x, y for which g=0 still holds, we must have

(dg/dx)δx+(dg/dy)δy=0

We can interpret this equation as being the scalar product of 2 vectors: (dg/dx,dg/dy) and (δx,δy). The fact that the result is zero means that they are perpendicular. Now (δx,δy) runs along the contour, so (dg/dx,dg/dy) must be perpendicular to the contour.

So much for the constraint. What about Q? Well we agreed that at our max/min/stationary point, g=0 touches but does not cross a contour of Q. So at this point, (dQ/dx)δx+(dQ/dy)δy=0 as well. The conclusion is that (dQ/dx,dQ/dy) must be a vector parallel to (dg/dx,dg/dy), in other words, there exists a λ (the lagrange multiplier) such that

d(Q+λg)/dx=d(Q+λg)/dy=0

But this condition is just the stationary point of Q+λg with no constraints!

The recipe for this method is therefore as follows:

i) Write your constraint as g(x,y)=0

ii) Write the function to maximise as F(x,y)

iii) Construct G(x,y)=F(x,y)-λg(x,y)

iv) simultaneously solve:

g=0

dG/dx=0

dG/dy=0

And thus you can solve for x, y and λ
The method extends to as many independent variables as you wish.

Let me know if you want some examples.

Andre


By Abbas Mehrabian on Thursday, May 30, 2002 - 08:16 pm:

Thanks Andre,
I understand your solution now.
But isn't there any more straight way?
Abbas.


By Ian Short on Thursday, May 30, 2002 - 09:56 pm:

My friend proved the first part without Lagrange Multipliers. i.e. if

a1 + .. + an =2 (n > =3)

then

a1 a2 + .. + an-1 an < =1 (#)

with equality only if all but 2 or 3 of the ai s are non-zero.

To do this he drew a box and grid inside with a2 , a2 +a4 , a2 +a4 +a6 ,... (evens) along the bottom and odds up the side. He then shaded in areas corresponding to the sum (#) above- equality iff whole box shaded.

That's a very rough explanation. Needs a picture, I'll do this tomorrow when I have more time. Completely based on Andre's answer.

Ian


By Ian Short on Sunday, June 02, 2002 - 01:34 pm:

This idea was courtesy of my friend Ed Crane.

Andre solved the problem well and Ed proved the first part as I mentioned 2 messages ago. Let,

x=a1 +a3 +... (odds)
y=a2 +a4 +... (evens)

Then x+y=2 and a1 a2 +..+an-1 an < = xy < =1. Now, xy is the area of the rectangle drawn below and the shaded regions represent a1 a2 +..+an-1 an (for n=6).

rectangle with area xy

Equality can hold in above inequalities iff we do not get a staircase pattern appearing, i.e. only 3 consecutive non-zero terms.

That's the gist of the idea. Sorry for delay.

Ian


By Abbas Mehrabian on Sunday, June 02, 2002 - 05:45 pm:

Thanks a lot for your diagram Ian and for telling your friend's idea. It's a very good idea to simplifying Andre's solution. In fact, there is no need to a diagram!

Let
x = a1 + a3 + ... + a1381
y = a2 + a4 + ... + a1380
Then x + y = 2, so P = a1 a2 + a2 a3 + ... + a1380 a1381 < = xy < = 1.
To have maximum a1 a2 + a2 a3 + ... + a1380 a1381 , we must have:

P = xy = P
+ a1 (a4 + a6 + ... + a1380 )
+ a3 (a6 + a8 + ... + a1380 )
+ ...
+ a1381 (a2 + ... + a1378 )

a1 (a4 + a6 + ... + a1380 ) is zero so either a1 or (a4 + a6 + ... + a1380 ) is zero. If a1 is zero, we eliminate it and continue with the others. Else, a1381 a2 is zero. If a1381 is zero, we can continue with others, else a2 = 0. If we swap a2 and a1 , P won't decrease so we can let a1 = 0 and continue with others. Using this method several times, we'll understand there can be at most 3 positives in a s. Rest is simple.
Abbas.


By Ian Short on Sunday, June 02, 2002 - 07:44 pm:

Good, well explained. If maths monthly publish a different answer then please tell me!

Ian


By Andre Rzym on Tuesday, June 04, 2002 - 12:35 pm:

I like this proof, Abbas. It seems much more appropriate to the problem than my calculus solution.

Andre