Generating uniformly distributed numbers on a sphere

So, we want to generate uniformly distributed random numbers on a unit sphere. This came up today in writing a code for molecular simulations. Spherical coordinates give us a nice way to ensure that a point is on the sphere for any :

image

In spherical coordinates, is the radius, is the azimuthal angle, and is the polar angle.

A tempting way to generate uniformly distributed numbers in a sphere is to generate a uniform distribution of and , then apply the above transformation to yield points in Cartesian space , as with the following C++ code.

#include<random>
#include<cmath>
#include<chrono>

int main(int argc, char *argv[]) {
    // Set up random number generators
    unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
    std::mt19937 generator (seed);
    std::uniform_real_distribution<double> uniform01(0.0, 1.0);

    // generate N random numbers
    int N = 1000;

    // the incorrect way
    FILE * incorrect;
    incorrect = fopen("incorrect.csv", "w");
    fprintf(incorrect, "Theta,Phi,x,y,z\n");
    for (int i = 0; i < N; i++) {
        // incorrect way
        double theta = 2 * M_PI * uniform01(generator);
        double phi = M_PI * uniform01(generator);
        double x = sin(phi) * cos(theta);
        double y = sin(phi) * sin(theta);
        double z = cos(phi);
        fprintf(incorrect, "%f,%f,%f,%f,%f\n", theta, phi, x, y, z);
    }
    fclose(incorrect);
}

The distribution in the - plane in this strategy is uniform:

image

After mapping these points in the - plane to the sphere using the relationship between spherical and Cartesian coordinates above, this incorrect strategy yields the following distribution of points on the sphere. We see that the points are clustered around the poles ( and ) and sparse around the equator ().

image

The reason for this is that the area of a differential surface element in spherical coordinates is . This formula for the area of a differential surface element comes from treating it as a square of dimension by . These dimensions of the differential surface element come from simple trigonometry.

image

So, close to the poles of the sphere ( and ), the differential surface area element determined by and gets smaller since . Thus, we should include less points near and and more points near to achieve a uniform distribution on the sphere.

Our goal is to find and then draw samples from the probability distribution that maps from the - plane to a uniform distribution on the sphere.

Let be a point on the unit sphere . We want the probability density to be constant for a uniform distribution. Thus since and . We want to represent points using the parameterization in and and find the corresponding probability density function that maps to a uniform distribution on the sphere. We can obtain a uniform distribution by enforcing:

since is the probability of finding a point in an area about on the sphere. Because , it follows that .

Marginalizing the joint distribution to get the p.d.f of and separately:

We see that is a uniformly distributed variable and scales with ; we want more points around the equator, , which is where takes its maximum.

Now, how can we sample numbers that follow the distribution ? We’d like to use the readily available uniform random number generator in as before. Inverse Transform Sampling is a method that allows us to sample a general probability distribution using a uniform random number. For this, we need the cumulative distribution function of :

Keep in mind that is a monotonically increasing function from since it is a cumulative distribution function. Thus, it has an inverse function .

Let be the uniform random number in that we do know how to generate. To see how inverse transform sampling works, note that

This is a property of the uniform random variable , since for any number , . As is invertible and monotone, we can preserve this inequality by writing:

Aha! This shows that is the cumulative distribution function for the random variable ! Thus, follows the same distribution as . The algorithm for sampling the distribution using inverse transform sampling is then:

  • Generate a uniform random number from the distribution .

  • Compute such that , i.e. .

  • Take this as a random number drawn from the distribution .

In our case, .

The algorithm below in C++ shows how to generate uniformly distributed numbers on the sphere using this method:

#include<random>
#include<cmath>
#include<chrono>

int main(int argc, char *argv[]) {
    // Set up random number generators
    unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
    std::mt19937 generator (seed);
    std::uniform_real_distribution<double> uniform01(0.0, 1.0);

    // generate N random numbers
    int N = 1000;

    // the correct way
    FILE * correct;
    correct = fopen("correct.csv", "w");
    fprintf(correct, "Theta,Phi,x,y,z\n");
    for (int i = 0; i < N; i++) {
        // incorrect way
        double theta = 2 * M_PI * uniform01(generator);
        double phi = acos(1 - 2 * uniform01(generator));
        double x = sin(phi) * cos(theta);
        double y = sin(phi) * sin(theta);
        double z = cos(phi);
        fprintf(correct, "%f,%f,%f,%f,%f\n", theta, phi, x, y, z);
    }
    fclose(correct);
}

We then get the following distribution of points in the plane. There are more points around (the equator) than around the poles (), as we had hoped for.

image

And, finally, a uniform distribution of points on the sphere.

image

Alternative method 1

An alternative method to generate uniformly disributed points on a unit sphere is to generate three standard normally distributed numbers , , and to form a vector . Intuitively, this vector will have a uniformly random orientation in space, but will not lie on the sphere. If we normalize the vector , it will then lie on the sphere.

The following Julia code implements this. We have to be careful in the case that the vector has a norm close to zero, in which we must worry about floating point precision by dividing by a very small number. This is the reason for the while loop.

n = 100

f_normal = open("normal.csv", "w")
write(f_normal, "x,y,z\n")

for i = 1:n
    v = [0, 0, 0]  # initialize so we go into the while loop

    while norm(v) < .0001
        x = randn()  # random standard normal
        y = randn()
        z = randn()
        v = [x, y, z]
    end
    
    v = v / norm(v)  # normalize to unit norm

    @printf(f_normal, "%f,%f,%f\n", v[1], v[2], v[3])
end

To prove this, note that the standard normal distribution is:

As , , and each follow the standard normal distribution and are generated independently:

With some algebra:

This shows that the probability distribution of only depends on its magnitude and not any direction and . The vectors are thus indeed pointing in uniformly random directions. By finding where the ray determined by this vector intersects the sphere, we have a sample from a uniform distribution on the sphere.

Alternative method 2

Credit to FX Coudert for pointing this out in a comment, another method is to generate uniformly distributed numbers in the cube and ignore any points that are further than a unit distance from the origin. This will ensure a uniform distribution in the region . Next, normalize each random vector to have unit norm so that the vector retains its direction but is extended to the sphere of unit radius. As each vector within the region has a random direction, these points will be uniformly distributed on a sphere of radius 1.

Alternative method 1 can be used to efficiently generate uniformly distributed numbers on a hypersphere– i.e. in higher dimensions. On the other hand, as the number of dimensions grows, the ratio of the volume of the edges of a cube to the volume of the ball inside of it grows (see Wikipedia article). Hence, a larger and larger fraction of the uniformly generated numbers in the cube will be rejected using Alternative method 2, and so this algorithm will be inefficient.

comments powered by Disqus