#### An Algorithm for Filling Valleys to Generate Rivers

Here I present a surprisingly easy way to fill valleys in a rasterized elevation map, for the purpose of creating realistic rivers quickly.

In my world-building excursions, I need to generate semi-realistic rivers over a landscape (such as the one below).

Although there are other approaches, this has involved modeling the flow of water, based on local topography: we start in the mountains and gradually collect water into larger and larger streams and rivers, until we eventually reach the coast. This sounds very straightfotward -- and indeed the flow calculations are. But a major hurdle is the presence of valleys, that is, local minima: flowing water will get trapped and have nowhere to go, cutting off river formation. Valleys have to somehow be filled. I tried a few approaches, which I'll discuss below.

(This piece follows the gradual development of my attempts to solve the problem: if you would like to skip to details of the working model, [you can].)

## The Simulated Water Drop Approach

One solution is to model units of water on each tile so that they move and accumulate naturally in the divets and dells, eventually filling them and causing additional water to flow out in a new path. This is computationally intensive, as you might imagine: you must track all the water, and recalculate flow directions every time-step based on its position. If you attempt to do this discretely, without differential equations, you are also liable to get oscillations and checker-board patterns as flowing water fails to catch up with itself evenly.

I implemented this type of system initially, but wanted to find an alternative, both because of speed and another unforseen drawback: as water moves off the mountains and into low-lying areas near the coast, it will tend to disperse into swamps. This is somewhat realistic, but the effect is that rivers often don't reach the coast. A solution would be modeling flow speed and water momentum, but that is a major increase in complexity. Instead I wanted to tackle the second major approach -- which is also simple, but required much greater cleverness to design than the first.

## The Pre-Fill Approach

The river model can be much faster if flows are only calculated a single time, in a master flow index, so that the movement of water in each time step always flows the same way. What's missing is a way for water to move past valleys. You can see that such a naive model does well on steep portions of the map with little topography, but fails on low-lying or rugged areas:

Without recalculating flow, we need to deal with valleys explicitly, prior to the rest of the model execution. In short, we must detect the valleys and then fill them to the correct height. In this way valleys becomes lakes, with a flat surface, which rivers can then flow over. This method takes more up-front processing, but is then very fast. It is more of a top-down (phenomenological) approach, interested in results more than simulating process.

Detecting tiny micro-valleys is very easy: find any cells with (orthagonal) neightbors that are all higher than it, then pick the lowest neighbor's value and fill to that height. This method doesn't work for anything larger, but a roughly equivalent process can be imagined: it would need to find bigger valleys and also find the height they should be filled to.

## Failed method: Isolating Every Valley

My initial attempt was too top-down and had to be abandoned. I thought I would identify and store the extents of every valley, and then fill them -- whereas the alternative was to simply apply rules to tiles, and not formally map valleys at all.

I started by determining flow locally, and used this to find a "final destination" for every tile, where its water would ultimately arrive. Any destination that wasn't ocean I treated as a valley bottom -- and with the tiles that drained into it, I had something like a watershed. This seemed very clever, but in fact I only had slices of watersheds, often part of the same valley.

These slices would need to be merged. After that, they would need to be filled...but to what height? Presumably the lowest neighboring tile not part of a valley. But valleys can easily abutt one another; they can even be completely surrounded by or contained within another valley. Whether two valleys should be merged into one really depends on the amount of water in them: if it's low enough, they each form a lake; high enough, and they spill into each other, merging into one lake. There is therefore no simple rule for when merging valleys.

The only way forward seemed to be focusing on the heights of water, not on valleys per se -- and on where that water would go.

## Using a 2-Dimensional Metaphor

A more bottom-up tack began by considering how I would solve the problem by hand in the much simpler case of a two-dimensional island -- where we are thinking first and foremost in terms of elevation profiles, and not the aerieal view. At least in a naive first pass, one could draw a series of lines across the landscape at different heights. These start at the ocean, and stop wherever they would intersect terrain -- leaving gaps where valleys are. This is basically generating the inverse of valleys.

Just how many lines (or 3d layers) should one draw? Since my landscapes are continuous in elevation, and not limited to integer values, an infinite number would be required. Clearly, the thought experiment must be refined.

The key realization is that one does not need to draw lines at arbitrary heights: the terrain itself supplies critical thresholds that incoming lines can or cannot cross. Therefore, we can have just one line that adjusts its height to match the terrain, as it is drawn in from the oceans. In particular, we will allow it to adjust its heigh up, but not down:

The exception is when one line meets another line, entering from the opposite side: then, at a given point, the lower of the two is preferred. This allows areas to drain at the lowest possible point -- but if there is no drain lower than the land, it fills up intead.

This is a good method that we could use on paper. And it can be extended into three dimensions fairly easily. But we have to phrase algorithms in terms of rules that govern individual cells -- not what we would draw.

## The Successful Fill Algorithm

The rules of cell-based algorithm are fairly simple, but I'll describe the process in English first, and why each step matters.

• We will track which cells are known to ultimately drain into the ocean. Initially, only ocean tiles themsleves will.
• We will have every cell find the lowest of its neighbors' values, and fill up to that new value. (Equivalent to the drawn line's height.) We repeat this process many times.
• But cells only consider neighbors that drain into the ocean. Any cell that gets adjusted is garaunteed to drain too, because its new height must match a draining cell (and we adjust its status accordingly).
• This gaurantee means that we will start at the coasts and gradually work inland (like our drawn line). Water in the cells we're working on will always have a path to the ocean: and so we will adjust fill heights only when it is necessary to establish that path.
• We keep examining all cells with draining neighbors, not just newly-draining ones, because a shorter path to ocean may yet be "discovered" (by working in from another direction), which would allow the cells to adjust its fill height down to a lower level. Thus we can avoid over-filled valleys.
• We can iterate this process until there are no more changes to be made, or for some number of steps. The time required depends on how convoluted the landscape is, but for most landscapes, it is complete within the linear length of the map (or the greater of its sides, if a rectangle).

Here is the output:

### Pseudo-code

The quick version for coders:

``````elev = a 2-d numeric array
fill = elev
drains = boolean array, elev == 0

timestep loop
for every cell
find neighboring cells and store fill values
set neighbor values to NULL where drain = 0
get max neighbor value
if max is higher than self fill
set fill to max
set drains to 1
output fill``````

## Interesting Features of the Output

We can consider how the model works on the example island, which as a reminder is here with no water:

### The Fill / Lakes

You can see here the difference between the filled map and the original. There is a spray of single cells filled in to a small degree, which smooths the effective surface. There are also four large lakes, which correspond clearly with depressions; one of them is actually quite deep, because the surrounding terrain was also fairly high, and so a high fill was needed to find a pathway to the ocean.

### Water Accumulation

I calculate the final river tiles based on the total amount of water that has passed through each tile; or, put another way, on the number of tiles upstream in the flow map. An arbitrary threshold in the display allows the user to exclude small rivers, but here we can see all the water at once. Particularly interesting are ridge-lines, that are almost black which hints at the different watersheds.

### Watersheds

Here we see the different zones that water drains into the ocean from. There are many smaller ones at the coasts, but we limit this arbitrarily to see the larger inland zones. Here, ridgelines are effectively outlined as well. I also like it because it calls to mind the suggestion that ecologically relevant municipalities would have borders matching their watershed, rather than arbitrary lines or major obstacles.

### Path Calculation

This graph shows the distance water must travel to flow most efficiently to the ocean. If we consider that water is doing pathfinding, this is shows the cost of moving into a neighboring tile: a river will prefer the darkest-colored tile available to it; where discontinuities in color show barriers to flow.

In fact, I do not exclusively use this map to determine river flow direction: first, local topography (steepness) is considered, with this pathfinding beind used as a tie-breaker (after that, arbitrary cell index is used).

## Conclusions

There has been a lot of academic writing about valley-detecting algorithms (as well as ridge-detecting), as it has many uses in image processing. I know geogrophers have also developed many tools for calculating flows, and that small-scale divit filling is part of it. I developed this solution solely out of a practical need, in terms of making rivers quickly, and a curiosity about how it might be done. The algorithm may exist already, and I'm happy to have gone through the process of indenedently generating it. It also might not, in which case I hope it is useful to other people.

One thing that may make it unique is that it depends on arbitrary start-points, i.e. the ocean. In many circumstances, probably generic image processing, that may not be available, or make any sense. So I could foresee that this algorithm is uniquely tailored to DEM modeling. It scales fairly well (I will do benchmarks another day) and produces nice looking features. It cannot be used very well for erosion modeling, the way the droplet approach can, but is essentially as good for simply creating rivers and lakes on artificially generated terrain.