# Building Up Perlin Noise

For my inaugural post, I’d like to discuss Perlin noise. There are a
number of explanations of Ken Perlin’s classic noise function, and most
tend to lean heavily on viewing it as a weighted average of gradients, but
I don’t think this is really the best way to get a constructive intuition
for how it works. Perlin once mentioned that, “In fact, the *noise*
function can be viewed as the surflet decomposition of a random density
function sampled at a rectangular grid.” Though cryptic, I believe that
this is the key to a far more useful understanding of Perlin noise. This
post will be partly an explorable explanation of that idea and will show
how to build up the noise function step-by-step.

## Theory

Let’s start with the interpolant. The original Perlin noise algorithm used
a cubic Hermite spline of the form
*s*(*t*) = 3*t*^{2} − 2*t*^{3}.
This particular function is also sometimes known as smoothstep. It
describes an s-shape, ramping smoothly up from 0 to 1 over the range of 0
to 1. It’s also symmetrical around the center of this square; that is,
*s*(*t*) = 1 − *s*(1 − *t*).

So let’s flip it around,
1 − *s*(*t*),
so that it’s 1 at 0 and falls off to 0 at 1. Then use the absolute value,
1 − *s*(|*t*|),
so that it’s symmetric and falls off smoothly to 0 at -1 too. Call this new
variation
*f*(*t*):

f(t) = 1 − (3 − 2|t|)t^{2} |

Next we’ll extend that to two dimensions by applying it separately to *x*
and *y* and then taking the product. (I focus exclusively on 2D noise in
this post, but it’s trivial to extend all this to higher dimensions.) The
figure below shows how this looks. In this and all the next figures, I’ve
used white for values of 1 and above, black for -1 and below, and grey for
0:

× | = | |||

f(x) |
f(y) |
f(x) f(y) |

There are a couple of things to note here. First, it obviously evaluates to 1 at the origin. But this falls off fairly smoothly to exactly 0 at the boundaries of the -1 to 1 square. The central lobe isn’t circular by any means, but it’s a reasonable approximation here.

Next, let’s multiply that falloff kernel by a gradient. This is the
gradient that will make this proper gradient noise and not just some other
kind of noise function. The gradient here is computed by taking the dot
product of an arbitrary vector, * G*, with each
(

*x*,

*y*) coordinate as a vector offset from the origin. (If you’re not familiar with the vector dot product, you multiply the

*x*-values from the two vectors, then the

*y*-values, etc. and then sum.) The figure below shows the falloff, the gradient and their product. Try clicking or dragging to change vector

*and see how it affects the result.*

**G**× | = | |||

f(x) f(y) |
⋅ (Gx,y) |
f(x) f(y) ( ⋅ (Gx,y)) |

If you played around with that, you may have noticed some things. First,
when looking at the gradient, there’s always a line for which it remains 0
(i.e., grey in these figures) and it always passes through the origin
perpendicular to the vector * G*, regardless of the magnitude of vector.
Rotating

*around changes the direction of the gradient, while changing the length controls the sharpness of the gradient (longer vectors make sharper gradients, shorter vectors make softer gradients). Nonetheless, the 2×2 square here is always divided into equal halves, one positive and one negative.*

**G**
When you factor in the falloff, there’s still a line of zeroes through the
origin, perpendicular to * G* and dividing the square into positive and
negative halves. But now it meets the borders of the square and those are
also zeroes. The equal-but-opposite minima and maxima lie within the
interior of each half of the square. Effectively, it’s a dipole. Note
that the direction of

*now exclusively determines the shape of the two regions (and the positions of the minima and maxima). The length of*

**G***makes no difference to the shape; it only determines the intensity.*

**G**
So given this, we can just focus on the direction of * G* and always use
unit length vectors. If we clamp the product of the falloff kernel and the
gradient to 0 at all points beyond the 2×2 square, this gives us the
surflet mentioned in that cryptic sentence.

Now that we’ve got that basic element, we can center one at each point on the rectangular integer grid (such that they’ll overlap like shingles) and then simply sum them all up. If we do that, we have our Perlin noise! The figure below shows a random patch of noise with the integer grid and the gradient vectors for the surflets overlayed. You can click or drag on it to change the vectors and see how it affects the noise.

Incidentally, this is why I’m not so fond of the reduced table of vectors
in Perlin’s Improving Noise paper. If one naively takes a 2D slice through
the 3D noise, along *z* = 0 for example, each gradient is only in one of
the 8 basic directions and it’s biased towards the cardinal directions.
This can produce ugly runs in the noise that go at 45° angles:

Instead of dealing with the complexity of gradients and surflets like this, one alternative is to just place random values between -1 and 1 on a grid and then upsample it using bilinear or bicubic interpolation. Some tutorials mistakenly call that Perlin noise, though it really isn’t; instead it’s considered “value noise”, as opposed to the “gradient noise” category that Perlin’s belongs to.

So why bother with gradient noise? Both value noise and gradient noise vary smoothly. The upper limit for how quickly they can change really depends on the density of the grid. As a result, they both have similar upper limits on the main frequencies they contain.

But value noise may give you runs where the values at the grid points are similar and the noise hardly changes (giving a “splotchy” appearance). In fact, a really unlucky case where all the values at the grid points are the same would contain only the zero frequency. Also, if you integrate a patch of value noise over some area (e.g., for downsampling it for mipmaps), it will approach 0 statistically but it’s highly unlikely to be exactly 0.

By contrast, for any given surflet, the line perpendicular to the gradient vector slices the surflet into equal but opposite halves that that are next to each other. No matter how the gradients lie, there will always be at least some wiggle to a gradient noise function, and again it will be dependent on the grid density. As a result, gradient noise has a lower limit on the main frequencies it contains; it is approximately bandlimited whereas value noise is not. Also, since the halves of each surflet are equal and opposite integrating over the whole surflet yields exactly 0. A periodic patch of gradient noise therefore also integrates to exactly 0.

## Code

That’s the theory. It’s time to turn that into working code. For this implementation, I’ll stick with very basic C++ in order to make this as easy as possible to port to other languages.

First, we’ll define three tables and a function to initialize them:

static int const size = 256; static int const mask = size - 1; int perm[ size ]; float grads_x[ size ], grads_y[ size ]; void init() { for ( int index = 0; index < size; ++index ) { int other = rand() % ( index + 1 ); if ( index > other ) perm[ index ] = perm[ other ]; perm[ other ] = index; grads_x[ index ] = cosf( 2.0f * M_PI * index / size ); grads_y[ index ] = sinf( 2.0f * M_PI * index / size ); } }

To give consistent results across each invocation of the noise function, the direction of the gradient vector at each point needs to remain constant. The traditional way to do this is to use a random permutation vector to help hash the point and then use the hash to look up a vector in a table. In this implementation, the table sizes needs to be a power of two so that we can use bitmasking to wrap indexes into the table. We could use the modulo operator instead, but then we’d have to add extra handling for negative coordinates.

The `perm`

table here is just the numbers 0, 1, 2, 3… in randomly
shuffled order, and `grad_x`

together with `grad_y`

give unit vectors
evenly distributed around the circle. Perlin’s original implementation
instead used random vectors from the 2×2 square around the origin and
normalized them, but this is both clumpy and biased towards diagonals.

Now for the noise evaluation itself:

float f( float t ) { t = fabsf( t ); return t >= 1.0f ? 0.0f : 1.0f - ( 3.0f - 2.0f * t ) * t * t; } float surflet( float x, float y, float grad_x, float grad_y ) { return f( x ) * f( y ) * ( grad_x * x + grad_y * y ); } float noise( float x, float y ) { float result = 0.0f; int cell_x = floorf( x ); int cell_y = floorf( y ); for ( int grid_y = cell_y; grid_y <= cell_y + 1; ++grid_y ) for ( int grid_x = cell_x; grid_x <= cell_x + 1; ++grid_x ) { int hash = perm[ ( perm[ grid_x & mask ] + grid_y ) & mask ]; result += surflet( x - grid_x, y - grid_y, grads_x[ hash ], grads_y[ hash ] ); } return result; }

This is pretty straightforward, given the approach outlined above. First
we define a falloff function, `f()`

, that evaluates to 1 at 0, and falls
off to 0 at -1 and 1. Note that it’s also clamped to return 0 beyond that.

Next, a `surflet()`

function that takes a point (relative to the center of
the surflet) and a gradient vector and returns the value of the surflet
there.

Finally, there’s the `noise()`

function itself which takes the point to
evaluate, loops over the nearest four grid points, hashes them to look up
their gradient vectors, and then sums the values of the surflets that are
centered on those grid points. Why the nearest four? Because those are
the only surflets that can influence the results; the others are too far
away. If you’ve played around with the noise patch in the previous figure,
this should be fairly easy to see.

So there it is. So long as the permutation and gradient tables match (and
floating point arithmetic notwithstanding) this should give exactly the
same results as the original Perlin noise function, even though it computes
it slightly differently. The code here is certainly not as optimized;
Perlin’s original code eliminates some redundant table lookups for hashing
and exploits the symmetry of
*s*(*t*) = 1 − *s*(1 − *t*)
to compute the falloffs more efficiently. It’s possible to turn the above
code step-by-step into something very like the original noise code.
However, the less optimized form here makes many modifications quite easy.

## Modifications

Since this approach to implementing Perlin noise tends to factor things out more, it’s quite easy to make small changes to the different building blocks. Here are some possible modifications.

*Stronger hash* — If you look carefully at how the points are hashed via
the permutation table to get a gradient, you’ll see that each column on the
grid repeats the same sequence of gradient vectors in order. They are
simply offset up and down from each other. It tends to be less noticeable
when the table is large enough, but it’s still possible to do better.

One easy solution is to have separate permutation tables, one for each dimension, and exclusive-or the lookups from each. This still wraps at the ends of the table, but at least it’s a 2D- rather than 1D-wrapping:

int hash = perm_x[ grid_x & mask ] ^ perm_y[ grid_y & mask ];

Another possibility is to dispense with the permutation table entirely. Zafar et al. suggest using the Tiny Encryption Algorithm to compute the hash algorithmically. This approach is nice because it doesn’t wrap at all in any reasonable period.

*Quintic interpolant* — Remember that we based `f()`

off of a cubic
polynomial,
*s*(*t*) = 3*t*^{2} − 2*t*^{3},
that we modified to be symmetrical around 0 and flipped upside down.
This has non-zero second derivatives at the edges that can cause visual
creases along grid lines. Perlin suggested an alternative quintic
polynomial,
*s*(*t*) = 6*t*^{5} − 15*t*^{4} + 10*t*^{3},
to fix this. Substituting this in is easy:

( ( 6.0f * t - 15.0f ) * t + 10.0f ) * t * t * t;

*Wide interpolant* — In an old tech report, I once suggested an alternative
quintic interpolant,
*s*(*t*) = 4(1 - *t*^{2}/4)^{5} - 3(1 - *t*^{2}/4)^{4},
designed to be used over twice the range (that is, for 4×4 sized
surflets). Here’s what the new body of `f()`

should like to implement this:

```
t = 1.0f - 0.25f * t * t;
return t < 0.0f ? 0.0f :
( 4.0f * t - 3.0f ) * t * t * t * t;
```

Note that because the surflets are now twice as wide, you’ll need to extend
the loop for `grid_y`

by 1 in each direction (i.e., from `cell_y - 1`

to
`cell_y + 2`

) and similarly for `grid_x`

. Otherwise, the surflets will be
clipped.

Here is a plot showing how the three versions of `f()`

compare:

Cubic Quintic Wide |

*Radial surflets* — We computed the falloff kernel by applying `f()`

to each dimension and multiplying the result. This makes it a a separable
filter. However, we can also apply it as a radial filter instead:

```
return f( hypotf( x, y ) ) * ( grad_x * x + grad_y * y );
```

When we do this, the surflets now fall off to 0 at the unit circle instead
of the 2×2 square. The nice thing about this is that the surflets now
rotate perfectly to align with the direction of their gradient, * G*.

*Improved vectors* — As I mentioned above, I’m not a fan of the reduced
set of 12 gradient vectors in Improving Noise and prefer the vectors from
the classic version. However, if you really want to use them, you can you
can set the tables explicitly rather than in `init()`

:

float grads_x[] = { -1.0f, -1.0f, -1.0f, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f, 1.0f, 1.0f, 1.0f, 1.0f }; float grads_y[] = { -1.0f, 0.0f, 0.0f, 1.0f, -1.0f, -1.0f, 1.0f, 1.0f, -1.0f, 0.0f, 0.0f, 1.0f };

and then do the lookups from those tables in `noise()`

modulo 12, as long
as you don’t mind a slight bias.

*Value noise* — If you modify the `surflet()`

function to simply use a
single value like `grad_x`

rather than the dot product that produces the
gradient, you’ll get value noise:

```
return f( x ) * f( y ) * grad_x;
```

Note, however, that `grad_x`

is biased away from 0; if you try this, you
may also want to change `init()`

to set it to uniform random numbers
between -1 and 1.

*Jittered grid* — There’s no reason that the surflet centers have to be
confined to the points on the grid. We can add an offset to randomize
their positions within the grid cells:

result += surflet( x - ( grid_x + jit_x[ hash ] ), y - ( grid_y + jit_y[ hash ] ), grads_x[ hash ], grads_y[ hash ] );

Here, `jit_x`

and `jit_y`

are new tables initialized with random numbers
between 0 and 1. Note that the start of the loop over `grid_y`

will have
to be extended to include `cell_y - 1`

and similarly for `grid_x`

. One
nice thing about this is that it breaks up the regular pattern of the grid
points always being 0.

*Tilable noise* — At this point it should be fairly clear how to make
tilable noise: tile over an integer number of cells and just make sure the
choice of gradient directions wraps across the tile boundaries. If we want
to repeat every `repeat_x`

and `repeat_y`

cells in *x* and *y*,
respectively, (and where both are `size`

or smaller) then it will look
something like this, where the extra modulos help to handle negative grid
coordinates correctly:

int wrap_x = grid_x % repeat_x + repeat_x; int wrap_y = grid_y % repeat_y + repeat_y; int hash = perm[ ( perm[ wrap_x % repeat_x ] + wrap_y ) % repeat_y ];

*Simplicial grid* — Instead of a grid of unit squares, simplex noise uses
a grid of equilateral triangles (or tetrahedrons in 3D) with unit-length
sides. It’s a bit more involved to implement since you need to map
coordinates onto the simplex grid to find the neighborhood of nearest
surflets and then map their positions back to evaluate the them. Still,
if you don’t mind evaluating more surflets than strictly necessary it’s
not too hard:

int cell_x = roundf( x * 2.0f / sqrtf( 3.0f ) ); int cell_y = roundf( y + x / sqrtf( 3.0f ) ); for ( int grid_y = cell_y - 1; grid_y <= cell_y + 1; ++grid_y ) for ( int grid_x = cell_x - 1; grid_x <= cell_x + 1; ++grid_x ) { int hash = perm[ ( perm[ grid_x & mask ] + grid_y ) & mask ]; result += surflet( x - ( grid_x * sqrtf( 3.0f ) * 0.5f ), y - ( grid_y - grid_x * 0.5f ), grads_x[ hash ], grads_y[ hash ] ); }

## Design Your Own

With a little care many of these possible modifications can be combined to derive new variations on noise with different characteristics. Here’s a small sampler to explore some possibilities. You can click on the noise to generate new sets of random gradient vectors.

If you’re interested in reading more, one particularly interesting choice of elements is the Gabor Noise of Lagae et al. Their approach uses a Gabor kernel for the surflet function, multiple jittered surflets within each grid cell, and a selectable distribution for the gradients vectors that can shape them towards certain directions. Iñigo Quilez also uses jitter in a neat way with his voronoise.

*Edited Dec 19, 2015* - Added the plot of the three versions of `f()`

.