By far the most common use of assertions is for checking input to functions. Consider this function for calculating the geometric mean:

```
geomean <- function(x, na.rm = FALSE)
{
exp(mean(log(x), na.rm = na.rm))
}
```

In a statically-typed language, we could enforce `x`

being a numeric vector. R’s dynamic typing (while mostly helping us be more productive) gives us some rope to hang ourselves with: `x`

and `na.rm`

can be absolutely anything. We need to handle the cases when `x`

is not numeric, or when x contains negative values, or when `na.rm`

is not a single logical value.

The built-in functions `exp`

, `mean`

and `log`

have some of their own logic for handling bad inputs, and it is possible to simply rely on that logic rather than writing your own. For demonstration purposes, let’s see how they behave, and then see if we can improve upon it.

If we pass a non-numeric `x`

, we see:

`geomean("a")`

`## Error in log(x): non-numeric argument to mathematical function`

The error message is OK, but it since it is appearing to come from `log(x)`

, it isn’t totally clear to the user where the problem originates. The *assertive* fix isto include `assert_is_numeric(x)`

in the function.

Where should this line go? In accordance with the first clause of the programming principle “fail early, fail often”, the assertion belongs at the start of the function.

```
geomean2 <- function(x, na.rm = FALSE)
{
assert_is_numeric(x)
exp(mean(log(x), na.rm = na.rm))
}
```

`geomean2("a")`

`## Error in geomean2("a"): is_numeric : x is not of type 'numeric'; it has class 'character'.`

The geometric mean doesn’t make any mathematical sense for (real) negative numbers, and will return `NaN`

if the input contains any.

`geomean(rnorm(20))`

`## Warning in log(x): NaNs produced`

`## [1] NaN`

The warning here is not so informative (why were the NaNs produced?), and appears to come from `log(x)`

. We could be strict and throw an error if there are negative values by adding a call to `assert_all_are_non_negative`

. To replicate the base-R behaviour we can define custom actions based upon the result of `is_non_negative`

:

```
geomean3 <- function(x, na.rm = FALSE)
{
assert_is_numeric(x)
if(!all(is_non_negative(x), na.rm = TRUE)) # Don't worry about NAs here
{
warning("x contains negative values, so the geometric mean makes no sense.")
return(NaN)
}
exp(mean(log(x), na.rm = na.rm))
}
```

`geomean3(rnorm(20))`

```
## Warning in geomean3(rnorm(20)): x contains negative values, so the
## geometric mean makes no sense.
```

`## [1] NaN`

For `na.rm`

, the `mean`

function coerces its input to be a logical value, warning if the value’s length is more than one.

```
x <- rlnorm(20)
x[sample(20, 5)] <- NA
geomean(x, c(1.5, 0))
```

```
## Warning in if (na.rm) x <- x[!is.na(x)]: the condition has length > 1 and
## only the first element will be used
```

`## [1] 1.62527`

The warning about the length is OK, but again its source is not totally clear for users. The coercion to logical happens silently, which isn’t ideal.

Again, we could be strict and throw an error when `na.rm`

isn’t a scalar logical value using `assert_is_a_bool`

. (This is a compound assertion checking both the type and the length of the object). In this case, to replicate the base-R behaviour, we will use some utility functions provided by *assertive*.

```
geomean4 <- function(x, na.rm = FALSE)
{
assert_is_numeric(x)
if(!all(is_non_negative(x), na.rm = TRUE)) # Don't worry about NAs here
{
warning("x contains negative values, so the geometric mean makes no sense.")
return(NaN)
}
na.rm <- coerce_to(use_first(na.rm), "logical")
exp(mean(log(x), na.rm = na.rm))
}
```

`use_first`

takes the first element of an object, warning if it has length more than one. `coerce_to`

checks and object’s `class`

, then converts it to the requested type, with a warning, using an appropriate `as.*`

function if it exists, or `as`

if it doesn’t.

`geomean4(x, c(1.5, 0))`

`## Warning: Only the first value of na.rm will be used.`

`## Warning: Coercing use_first(na.rm) to class 'logical'.`

`## [1] 1.62527`

Whether to throw an error on bad input or fix it is personal preference and depends upon context. For end-user functions should should usually try to be flexible and fix things unless they’ve done something very silly. For lower-level functions, you can often afford to be stricter.

- Check function inputs at the start of the function to “fail early”.
- Don’t be afriad of having lots of assertions to “fail often”.
- Use
`assert_`

functions to throw errors for inputs that you can’t salvage. - Use
`is_`

functions in an`if`

condition to provide custom behaviour. - Fix inputs with
`use_first`

and`coerce_to`

where possible.

- The
`mad`

function in the`stats`

package calculates the median absolute deviation. Type`mad`

to see its contents, and run`example(mad)`

to get a feel for how it works.

Update the function (either copy and paste, or use`fixInNamespace`

if you are feeling fancy) to include some assertions checking the inputs. Hint: Some of the inputs should be numeric; others should be logical. Some should be only allow a single value. [15 mins].