Broadcasting

Davis Vaughan

2019-07-19

library(rray)

Introduction

The idea for rray sprung from frustration with the following example:

x <- matrix(1:6, nrow = 3)
x
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6

x + 1
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
#> [3,]    4    7

x + matrix(1)
#> Error in x + matrix(1): non-conformable arrays

To someone who has worked with matrices in R before, this error message is probably nothing new. It’s stating that because x has dimensions (3, 2) and is being added to an object with dimensions (1, 1), which is a matrix does not have exactly the same dimensions, the operation cannot be completed. My frustration lies with the fact that I had this “feeling” that I knew the answer to this operation. It shouldn’t be an error, it should have the same answer as x + 1. Why does this work, when the other doesn’t?

In one sentence, the answer is that R recycles, but I want it to broadcast.

This vignette’s goal is to introduce the concept of broadcasting. Broadcasting is the idea of repeating dimensions of one object to match the dimensions of another, so that an operation can be applied between the two inputs. Later you will learn two rules which formalize this idea.

What did I expect?

For this first example, I expected this result:

x
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6

x + 1
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
#> [3,]    4    7

y <- rray(1, c(1, 1))
y
#> <rray<dbl>[,1][1]>
#>      [,1]
#> [1,]    1

x + y
#> <rray<dbl>[,2][3]>
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
#> [3,]    4    7

You’ll come to learn that rray implements broadcasting throughout the package, which is what allows this to work. It solves this motivating example, but is generally useful in variety of ways.

So how does this happen? The mental model to understand broadcasting in this case is to alter the dimensions of y to match the dimensions of x by duplicating the rows. This is known as broadcasting y. After this alteration is done, then the addition is performed.

y_broadcast <- rray_broadcast(y, c(3, 2))

y_broadcast
#> <rray<dbl>[,2][3]>
#>      [,1] [,2]
#> [1,]    1    1
#> [2,]    1    1
#> [3,]    1    1

x + y_broadcast
#> <rray<dbl>[,2][3]>
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
#> [3,]    4    7

But how did we know (programmatically) how to alter y to match the dimensions of x here? That is where the broadcasting rules come in.

Rules

Broadcasting works by finding a set of common dimensions from the inputs, and then manipulating each of the inputs to that common shape. There are two rules for finding the common dimensions:

  1. The dimensionality of the inputs are matched by appending 1’s to the dimensions.
  2. The dimensions of the inputs are matched by recycling each dimension as needed.

In the next section you will see an example of these rules in action, but an explanation of “recycling” might also be useful. In this case, I am talking about the tidyverse recycling rules. When comparing two objects together, the recycling rules are:

  1. If the lengths of both are equivalent, do nothing.
  2. If the length of one object is 1, recycle that to the length of the other object.
  3. Otherwise, error.

There are slightly different from base R, where partial recycling is also occasionally used, which means that, for example, a length 2 vector could be recycled up to length 4 without issue.

Example - x + y

Let’s revisit the original example, but use the broadcasting rules to understand how to get the result.

x
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6

y
#> <rray<dbl>[,1][1]>
#>      [,1]
#> [1,]    1

When adding these two matrices together, it’s useful to write out the dimensions explicitly, using the general notation of:

(rows, cols) | object
(3, 2) | x
(1, 1) | y
-------|-------
(3, 2) | result

To understand how we get from x + y -> result, compare dimensions vertically (one at a time), and apply the broadcasting rules.

The common dimensions have now been determined, (3, 2). At this point, the actual “broadcasting” step is applied, which is the transformation of x and y to have the shape (3, 2). Then the addition is performed. This can be done manually with rray_broadcast(), and the common dimensions can be determined with rray_dim_common().

dim <- rray_dim_common(x, y)
dim
#> [1] 3 2

x_broadcast <- rray_broadcast(x, dim)
x_broadcast
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6

y_broadcast <- rray_broadcast(y, dim)
y_broadcast
#> <rray<dbl>[,2][3]>
#>      [,1] [,2]
#> [1,]    1    1
#> [2,]    1    1
#> [3,]    1    1

x_broadcast + y_broadcast
#> <rray<dbl>[,2][3]>
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
#> [3,]    4    7

rray applies these rules automatically when either object is an rray, as y is.

To get this behavior with base R objects, the underlying engine that powers + has been exposed: rray_add(). If you are designing a package and don’t know what your inputs are going to be, this might be a good way to ensure you are manipulating arrays and rrays consistently.

# Both are base R objects, but are added with broadcasting!
rray_add(x, as_matrix(y))
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    3    6
#> [3,]    4    7

Example - Implicit Dimensions

If y was a vector, not a matrix, what would have changed?

z <- rray(c(1L, 2L, 3L))
z
#> <rray<int>[3]>
#> [1] 1 2 3

x + z
#> <rray<int>[,2][3]>
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    4    7
#> [3,]    6    9

This operation is still valid, and represents an important broadcasting idea because it allows lower dimensional objects (1D) to be combined with higher dimensional objects (2D) in a formalized way. To start, write out the dimensions:

(3, 2) | x
(3)    | z

To find the common dimensions, Rule 1 is first applied because there are “implicit” dimensions. z has the 1st dimension (the rows), but is missing the 2nd dimension (the columns). When this occurs, add 1s to the right hand side until the dimensionalities match.

(3, 2) | x
(3, 1) | z

Now, apply the second rule regarding recycling:

The common dimensions are then (3, 2), and again we can manually manipulate the objects to visualize what is happening. For z, you might find it helpful to visualize the increase in dimensionality and the recycling of dimensions separately.

z
#> <rray<int>[3]>
#> [1] 1 2 3

rray_broadcast(z, c(3, 1))
#> <rray<int>[,1][3]>
#>      [,1]
#> [1,]    1
#> [2,]    2
#> [3,]    3

rray_broadcast(z, c(3, 2))
#> <rray<int>[,2][3]>
#>      [,1] [,2]
#> [1,]    1    1
#> [2,]    2    2
#> [3,]    3    3
dim <- rray_dim_common(x, z)
dim
#> [1] 3 2

x_broadcast <- rray_broadcast(x, dim)
z_broadcast <- rray_broadcast(z, dim)

x_broadcast + z_broadcast
#> <rray<int>[,2][3]>
#>      [,1] [,2]
#> [1,]    2    5
#> [2,]    4    7
#> [3,]    6    9

Example - Higher Dimensions

Broadcasting becomes more interesting when higher dimensional objects are used. Additionally, at this point only 1 dimension has been changing (the columns), and only 1 object at a time needed altering. This example will require both inputs to be changed.

a <- rray(1:3, c(1, 3))
a
#> <rray<int>[,3][1]>
#>      [,1] [,2] [,3]
#> [1,]    1    2    3

b <- rray(1:8, c(4, 1, 2))
b
#> <rray<int>[,1,2][4]>
#> , , 1
#> 
#>      [,1]
#> [1,]    1
#> [2,]    2
#> [3,]    3
#> [4,]    4
#> 
#> , , 2
#> 
#>      [,1]
#> [1,]    5
#> [2,]    6
#> [3,]    7
#> [4,]    8

Perhaps surprisingly, these inputs can be added together using broadcasting.

a + b
#> <rray<int>[,3,2][4]>
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    2    3    4
#> [2,]    3    4    5
#> [3,]    4    5    6
#> [4,]    5    6    7
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    6    7    8
#> [2,]    7    8    9
#> [3,]    8    9   10
#> [4,]    9   10   11

To understand this, write out dimensions and follow the rules, as usual.

(1, 3)    | a
(4, 1, 2) | b

Implicit dimensions are present for a, so make them explicit with 1s using the first rule.

(1, 3, 1) | a
(4, 1, 2) | b

And then apply the rest of the recycling rules:

So the common dimensions are (4, 3, 2). Generally, once you internalize the rules enough, it’s easier to write them compactly as:

(1, 3,  ) | a
(4, 1, 2) | b
--------- | ------
(4, 3, 2) | result

Here are the incremental changes made to a to go from (1, 3) to (4, 3, 2).

a
#> <rray<int>[,3][1]>
#>      [,1] [,2] [,3]
#> [1,]    1    2    3

rray_broadcast(a, c(1, 3, 1))
#> <rray<int>[,3,1][1]>
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    1    2    3

rray_broadcast(a, c(4, 3, 1))
#> <rray<int>[,3,1][4]>
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    1    2    3
#> [3,]    1    2    3
#> [4,]    1    2    3

rray_broadcast(a, c(4, 3, 2))
#> <rray<int>[,3,2][4]>
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    1    2    3
#> [3,]    1    2    3
#> [4,]    1    2    3
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    1    2    3
#> [3,]    1    2    3
#> [4,]    1    2    3

And here are the changes made to b to go from (4, 1, 2) to (4, 3, 2).

b
#> <rray<int>[,1,2][4]>
#> , , 1
#> 
#>      [,1]
#> [1,]    1
#> [2,]    2
#> [3,]    3
#> [4,]    4
#> 
#> , , 2
#> 
#>      [,1]
#> [1,]    5
#> [2,]    6
#> [3,]    7
#> [4,]    8

rray_broadcast(b, c(4, 3, 2))
#> <rray<int>[,3,2][4]>
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    1    1    1
#> [2,]    2    2    2
#> [3,]    3    3    3
#> [4,]    4    4    4
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    5    5    5
#> [2,]    6    6    6
#> [3,]    7    7    7
#> [4,]    8    8    8

Altogether now!

dim <- rray_dim_common(a, b)

a_broadcast <- rray_broadcast(a, dim)
b_broadcast <- rray_broadcast(b, dim)

a_broadcast + b_broadcast
#> <rray<int>[,3,2][4]>
#> , , 1
#> 
#>      [,1] [,2] [,3]
#> [1,]    2    3    4
#> [2,]    3    4    5
#> [3,]    4    5    6
#> [4,]    5    6    7
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3]
#> [1,]    6    7    8
#> [2,]    7    8    9
#> [3,]    8    9   10
#> [4,]    9   10   11

Addendum

Base R Recycling

The only exception to the strict behavior of base R is when an array is combined with a 1D vector. You saw this in the very first example with x + 1. “Scalar” operations work as one might expect:

But when a vector and an array are combined, the vector is recycled to fit the dimensions of the array. This is very different from broadcasting.

This example seems intuitive and harmless, but even partial recycling is allowed, which can be quite confusing and dangerous.

This is confusingly identical to the following, where vec_2 is repeatedly used to construct something that can be added to x.

rray never performs partial recycling when it is going through the steps of broadcasting, and will error if this example is attempted.

The most important thing to remember here is that base R’s recycling is not broadcasting even if it looks like it at first glance with simple cases!

Differences from Python

In the Python world, NumPy is used for broadcasting operations. The underlying C++ library used for rray is heavily based on the NumPy principles. However, one interesting thing to note is that when applying broadcasting rules for NumPy, implicit dimensions are prepended to the left hand side as 1s. This is used consistently in the NumPy world, even down to how they are printed. However, R often uses the concept of appended implicit dimensions the right hand side, and prints with this concept in mind.

For a simple example of R doing this, try coercing a vector to a matrix:

The dimensions went from (3) to (3, 1). NumPy actually does the opposite, and would convert (3) to (1, 3). This is fine, because they use it consistently there, but can be confusing when comparing broadcasting of rray to NumPy, and is just something to keep in mind.