Sunday 28 April 2019

Average Distance Between Two Points in a Square

Figure 1: a unit square with vertices as shown
The problem of finding the average distance between two points in a unit square was treated in a YouTube video that I'll link to later in this post. My first response to the problem was to find an experimental answer by generating pairs of random points, finding the distance between each pair and then averaging these distances. Naturally I turned to SageMath and in particular its implementation on the Internet at SageMathCell.

Setting up an appropriate algorithm is rather straightforward given that the only formulae needed are the distance between two points and the mean. Here is the algorithm that I created and a permalink:

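# (a, b) and (c, d) are random points in the unit square
set_random_seed()
total, count = 0, 100000   # "total" avoids shadowing the built-in sum()
for i in [1..count]:
    a = RR.random_element(0, 1)
    b = RR.random_element(0, 1)
    c = RR.random_element(0, 1)
    d = RR.random_element(0, 1)
    # distance between (a, b) and (c, d)
    total += sqrt((a-c)^2 + (b-d)^2)
# mean distance; the print() call form is needed on Python 3 based Sage
print(n(total/count))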

The first three results for this experiment, each involving 100000 pairs of points, were:
  • 0.520990683987433
  • 0.521024671679294
  • 0.520961814850565
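To get a feel for how much these runs should be expected to vary, here is a minimal sketch of the same experiment in plain Python (not SageMath) that also reports a standard error. The function name `mean_distance` and the seed value are my own choices for illustration, not part of the original experiment.

```python
import math
import random


def mean_distance(pairs, seed=None):
    """Estimate the mean distance between two uniform random points in the
    unit square. Returns (estimate, standard_error)."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(pairs):
        a, b, c, d = rng.random(), rng.random(), rng.random(), rng.random()
        dist = math.hypot(a - c, b - d)  # distance between (a, b) and (c, d)
        total += dist
        total_sq += dist * dist
    mean = total / pairs
    variance = total_sq / pairs - mean * mean  # variance of a single distance
    return mean, math.sqrt(variance / pairs)


# The seed is fixed only so that the run is repeatable.
estimate, std_err = mean_distance(100000, seed=2019)
print(estimate, std_err)
```

With 100000 pairs the standard error should come out a little under 0.001, which would be consistent with the three runs above agreeing to about three decimal places.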
Clearly the exact value is close to 0.521, but what the YouTube video dealt with was a theoretical computation leading to an exact result, not an approximate, experimental one. Here is the video and below I'll reproduce the solution (more for my own benefit than for anyone else's) as it was explained there:



STEP 1: express the distance in terms of variables \(x_1\), \(x_2\), \(y_1 \) and \(y_2\)

Figure 2: application of the distance formula to two points

STEP 2: integrate the distance over the entire area

$$ \int_0^1 \int_0^1 \int_0^1 \int_0^1 \sqrt{(x_1-x_2)^2+(y_1-y_2)^2} \, dx_1 \, dx_2 \, dy_1 \, dy_2$$

STEP 3: simplify into \(x\) and \(y\) distances and adjust for the change in probability density functions

$$ 4 \int_0^1 \int_0^1 \sqrt{x^2+y^2} \, (1-x)(1-y) \, dx \,dy$$

The differences \(x_1-x_2\) and \(y_1-y_2\) collapse to \(x=|x_1-x_2|\) and \(y=|y_1-y_2|\) respectively. The original variables \(x_1\), \(x_2\), \(y_1\) and \(y_2\) all had probability density functions of 1 on \([0,1]\), e.g. \( \int_0^1 1 \, dx_1=[x_1]_0^1=1\). However, the probability density function of \(x\) is \(2(1-x)\) on \([0,1]\), and likewise \(2(1-y)\) for \(y\); the two factors of 2 account for the 4 in front of the integral. Each density still integrates to 1 because \( \int_0^1 2(1-x) \, dx =2 \left[x-x^2/2\right]_0^1=1 \). It's called a triangular probability density function. At this point in time I don't understand the underlying theory of why it takes this form, so I'm just accepting it for the moment. Later I'll try to investigate further.
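For what it's worth, a short geometric argument does seem to account for the triangular density. The pair \((x_1, x_2)\) is uniform on the unit square, and the region where \(|x_1-x_2| > t\) consists of two right triangles with legs \(1-t\), of total area \((1-t)^2\). So the probability that \(|x_1-x_2| \leq t\) is \(1-(1-t)^2 = 2t-t^2\), and differentiating with respect to \(t\) gives \(2(1-t)\). The sketch below, in plain Python, compares that predicted cumulative probability with simulated values; the function names, the sample size and the seed are my own choices for illustration.

```python
import random


def empirical_cdf(t, samples, rng):
    """Fraction of simulated gaps |x1 - x2| that are at most t."""
    hits = 0
    for _ in range(samples):
        if abs(rng.random() - rng.random()) <= t:
            hits += 1
    return hits / samples


def triangular_cdf(t):
    """Integral of the density 2(1 - x) from 0 to t, i.e. 2t - t^2."""
    return 2 * t - t * t


rng = random.Random(1)  # fixed seed only for repeatability
cdf_rows = [(t, empirical_cdf(t, 100000, rng), triangular_cdf(t))
            for t in (0.1, 0.25, 0.5, 0.75, 0.9)]
for t, observed, predicted in cdf_rows:
    print(t, observed, predicted)
```

If the density really is \(2(1-x)\), the observed and predicted columns should agree to within ordinary sampling noise.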

STEP 4: change to polar coordinates

Figure 3: converting to polar coordinates

We make the substitutions:

\(x=r\cos \theta \) and \(y=r \sin \theta \),
with \(0 \leq \theta \leq \pi/4 \) and \( 0 \leq r \leq 1/ \cos \theta \).

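Remember that the Jacobian for this change of coordinates is \(r\), so the integrand must be multiplied by \(r\). With \(\theta\) running only from 0 to \(\pi/4\), the integration covers just the triangular half of the square below the diagonal \(y=x\). Since the integrand is symmetric in \(x\) and \(y\), the other half contributes the same amount, so the integral must also be multiplied by 2 (making the factor 8 overall).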
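The change of variables can be checked numerically before doing any algebra. The sketch below, in plain Python, applies a simple midpoint rule to the Cartesian integral from STEP 3 and to its polar counterpart (with the Jacobian \(r\) and the overall factor of 8); the function names and the grid size of 400 are my own choices for illustration.

```python
import math


def cartesian_integral(n):
    """Midpoint rule for 4 * int int sqrt(x^2 + y^2) (1-x)(1-y) dx dy on [0,1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += math.hypot(x, y) * (1 - x) * (1 - y)
    return 4 * total * h * h


def polar_integral(n):
    """Midpoint rule for the polar form over the triangle below y = x, times 8."""
    d_theta = (math.pi / 4) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        d_r = (1 / cos_t) / n  # r runs from 0 to 1/cos(theta)
        for j in range(n):
            r = (j + 0.5) * d_r
            # distance r, the two density factors, then the Jacobian r
            total += r * (1 - r * cos_t) * (1 - r * sin_t) * r * d_r
    return 8 * total * d_theta


cartesian_value = cartesian_integral(400)
polar_value = polar_integral(400)
print(cartesian_value, polar_value)
```

Both values should land close to the experimental figure of roughly 0.521, which suggests that the limits, the Jacobian and the factor of 8 fit together as described.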

STEP 5: substitute the polar coordinates into the earlier integral

The new integral becomes:

$$8 \int_0^{\pi/4} \int_0^{1/\cos \theta} \sqrt{r^2\cos^2\theta + r^2\sin^2 \theta} \, (1-r \cos \theta) \, (1-r \sin \theta) \, r \,dr \, d \theta$$

which simplifies to:

$$8 \int_0^{\pi/4} \int_0^{1/\cos \theta} r \, (1-r \cos \theta) \, (1-r \sin \theta) \, r \,dr \, d \theta$$

which then becomes:

$$8 \int_0^{\pi/4} \int_0^{1/\cos \theta} \big( r^2-r^3 \cos \theta -r^3 \sin \theta+r^4 \sin \theta \cos \theta \big) \,dr \, d \theta$$

Integrate with respect to \(r\) and substitute the limits of integration into the result, using \(1/\cos \theta = \sec \theta\):

$$8 \int_0^{\pi/4} \bigg ( \frac{\sec^3 \theta}{3}-\frac{\sec^3 \theta}{4}-\frac{\sec^3 \theta \, \tan \theta}{4}+\frac{\sec^3 \theta \, \tan \theta}{5}\bigg ) \, d\theta $$

STEP 6: solve the simplified integral

Collecting like terms, the integral simplifies to the following form:

$$8 \int_0^{\pi/4} \bigg ( \frac{\sec^3 \theta}{12}-\frac{\sec^3 \theta \, \tan \theta}{20} \bigg ) \, d\theta$$

According to the video, the result is:

$$8 \bigg [\frac{ \sec \theta \, \tan \theta + \log |\sec \theta + \tan \theta \,|}{24}-\frac{\sec^3 \theta}{60} \bigg ]_0^{\pi/4}=\frac{2+\sqrt 2+5 \log (\sqrt 2 +1)}{15}$$

Of course, off the cuff I wouldn't be able to carry out those integrations, so I'm just accepting them for the moment and should really try to work them out for myself. An approximation for the previous expression is 0.521405433164721, which agrees fairly closely with what I got earlier by experimentation.
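The two antiderivatives can at least be sanity-checked without deriving them in full. The second term follows from the substitution \(u=\sec \theta\), which gives \(\int \sec^3 \theta \, \tan \theta \, d\theta = \sec^3 \theta / 3\); the first is the standard integral of \(\sec^3 \theta\), usually obtained by integration by parts. The sketch below, in plain Python, evaluates the quoted antiderivative at the limits, integrates the simplified integrand numerically with Simpson's rule, and compares both with the closed form. The function names and the number of subintervals are my own choices for illustration.

```python
import math


def theta_integrand(theta):
    """8 * (sec^3/12 - sec^3 tan/20), the integrand left after integrating in r."""
    sec = 1 / math.cos(theta)
    return 8 * (sec ** 3 / 12 - sec ** 3 * math.tan(theta) / 20)


def antiderivative(theta):
    """The antiderivative quoted from the video, including the factor of 8."""
    sec, tan = 1 / math.cos(theta), math.tan(theta)
    return 8 * ((sec * tan + math.log(abs(sec + tan))) / 24 - sec ** 3 / 60)


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


closed_form = (2 + math.sqrt(2) + 5 * math.log(math.sqrt(2) + 1)) / 15
from_limits = antiderivative(math.pi / 4) - antiderivative(0)
by_quadrature = simpson(theta_integrand, 0, math.pi / 4)
print(closed_form, from_limits, by_quadrature)
```

All three numbers should agree with the 0.5214054... quoted above, which is some reassurance that the video's integrations are right even before working them out by hand.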
