Chapter 18 Error trapping and other headaches

If you are using an advanced estimator, or relying on an external package, it is quite possible that every so often your estimator will trigger an error, return an NA, or do something similarly bad. More innocuously, you might have estimators that generate warnings; if you run 10,000 trials, that can add up to a lot of warnings, which can be overwhelming to sort through. In some cases a warning might be coupled with the estimator returning a wildly off result, and it is then unclear whether we should include that result in our overall performance measures for that estimator. After all, it tried to warn us!

In this section, we talk about some ways to make your simulations safe and robust, and also discuss some ways to track warnings and include them in performance measures.

18.1 Safe code

Sometimes, if you write a function that does a lot of complex things with uncertain objects (e.g., a complex estimator built on an unstable package not of your own creation), you can run into trouble with intermittent errors.

This can be really annoying: consider the case where your simulation crashes with a 1-in-500 chance per trial. If you run 1,000 simulation trials, your program will likely not make it to the end, wasting all of your time. To protect yourself, you can write code that, instead of stopping when it reaches an error, traps the error and moves on to the next simulation trial.

To illustrate, consider the following broken function that sometimes gives us what we want, sometimes gives us a NaN (from taking the square root of a negative number), and sometimes crashes completely (because broken_code() does not exist):

my_complex_function = function( param ) {
    
    vals = rnorm( param, mean = 0.5 )
    if ( sum( vals ) > 5 ) {
        # broken_code() does not exist, so this line is an error
        broken_code( 4 )
    } else {
        # if the first value is negative, sqrt() gets a negative
        # number and returns NaN with a warning
        sqrt( sum( vals ) * sign( vals )[[1]] )
    }
}

We run it like so:

my_complex_function( 2 )
## [1] 0.9437257
my_complex_function( 2 )
## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced
## [1] NaN
my_complex_function( 7 )
## [1] 1.419031

So far so good, other than a NaN warning. Now let’s run it a bunch of times:

resu = map( 1:20, ~ my_complex_function( 7 ) )
## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced
## Error in `map()`:
## ℹ In index: 2.
## Caused by error in `broken_code()`:
## ! could not find function "broken_code"

Oh no! Our function sometimes crashes. To trap the errors, we use the purrr package's safely() to make a “safe” function, like so:

my_safe_function = safely( my_complex_function,
                           otherwise = NA )
my_safe_function( 7 )
## $result
## [1] NA
## 
## $error
## <simpleError in broken_code(4): could not find function "broken_code">

Our safe version of the function gives us back a list of things: the result (or NULL if there was an error), and the error message (or NULL if there was no error). safely() is an example of a function wrapper, which takes a function and gives you back a new function that does something slightly different.

We include otherwise = NA so we always get a result, even if it is a NA result. Otherwise we would get a NULL when there is an error, which might be harder to track.
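To demystify safely() a bit, here is a rough, hand-rolled version of what it is building for us, using base R's tryCatch() (a sketch; the real safely() handles more cases):

# Run the function and convert any error into data, rather than
# letting it crash the session.
my_handmade_safe <- function( param ) {
  tryCatch(
    list( result = my_complex_function( param ), error = NULL ),
    error = function( e ) list( result = NA, error = e )
  )
}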

For example, we can call my_safe_function() repeatedly, and then use a nice trick to get a list of error messages separate from our list of results:

resu = map( 1:20, ~ my_safe_function( 7 ) )
## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced

## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced

## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced

## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced
resu <- transpose( resu )
unlist( resu$result )
##  [1]       NaN        NA       NaN 0.4354079
##  [5] 1.0734109        NA 1.5196734        NA
##  [9]       NaN 0.6382972 1.0836506        NA
## [13]        NA 1.6244187 1.1915743       NaN
## [17] 2.1165066 2.0638450 1.6465782        NA

The transpose() method takes a list of lists and reorganizes it, giving you a list of all the first elements, a list of all the second elements, and so forth. This is very powerful for wrangling results, because we can then make a tibble with list columns, like so:

tb <- tibble( result = unlist( resu$result ),
              error = resu$error )
print( tb, n = 4 )
## # A tibble: 20 × 2
##    result error     
##     <dbl> <list>    
## 1 NaN     <NULL>    
## 2  NA     <smplErrr>
## 3 NaN     <NULL>    
## 4   0.435 <NULL>    
## # ℹ 16 more rows
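As an aside, here is a toy illustration of what transpose() is doing (the small list below is made up for the example):

# Two mini "trials," each a list with an `a` and a `b` element.
small = list( list( a = 1, b = "x" ),
              list( a = 2, b = "y" ) )
tsmall = transpose( small )
tsmall$a    # list of all the `a` elements: 1 and 2
tsmall$b    # list of all the `b` elements: "x" and "y"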

There are other function wrappers in this “safe computing” family, such as possibly(), which will try to run something and give you a default value if it fails:

my_possible_function = possibly( my_complex_function, 
                                 otherwise = NA )
my_possible_function( 7 )
## [1] 1.960578
rs <- map_dbl( 1:10, ~ my_possible_function(7) )
## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced

## Warning in sqrt(sum(vals) * sign(vals)[[1]]):
## NaNs produced
rs
##  [1] 1.197171       NA 1.533298      NaN       NA
##  [6]       NA       NA       NA      NaN 2.149232

There is also quietly(), which bundles up warnings and messages rather than printing them to the console:

my_quiet_function = quietly( my_complex_function )

my_quiet_function( 1 )
## $result
## [1] 0.3686244
## 
## $output
## [1] ""
## 
## $warnings
## character(0)
## 
## $messages
## character(0)

This can be especially valuable for controlling massive amounts of printout in a simulation. If you have lots of extraneous printout, it can slow down the execution of your code far more than you might expect.

Note that quietly() does not trap errors, just warnings and messages:

rs <- map( 1:20, ~ my_quiet_function(7) )
## Error in `map()`:
## ℹ In index: 3.
## Caused by error in `broken_code()`:
## ! could not find function "broken_code"

You can “double wrap” your function if you want, but what you get back is a bit of a mess:

my_safe_quiet_function = quietly( safely( my_complex_function, otherwise = NA ) )
my_safe_quiet_function(7)
## $result
## $result$result
## [1] NA
## 
## $result$error
## <simpleError in broken_code(4): could not find function "broken_code">
## 
## 
## $output
## [1] ""
## 
## $warnings
## character(0)
## 
## $messages
## character(0)

18.1.1 Making a safe function call

With some upfront massaging, we can capture both errors and warnings and get the results into a nice, tidy tibble. For example, say our my_complex_function() is a method designed to analyze our data. We might then write a wrapper that cleans up the output of our safe and quiet version:

unpack_mess <- function( mess ) {
  # Start with a one-row tibble of default values.
  rs = tibble( result = NA, error = NA,
               warnings = "" )
  rs$result = mess$result$result
  # Store the error (or NULL) in a list column.
  rs$error = list( mess$result$error )
  if ( length( mess$warnings ) > 0 ) {
    # Collapse any warnings into a single string.
    rs$warnings = paste( mess$warnings, collapse="; " )
  }
  rs
}
unpack_mess( my_safe_quiet_function( 7 ) )
## # A tibble: 1 × 3
##   result error  warnings
##    <dbl> <list> <chr>   
## 1   1.56 <NULL> ""
my_safe_quiet_function <- compose( unpack_mess, my_safe_quiet_function )
my_safe_quiet_function(7)
## # A tibble: 1 × 3
##   result error  warnings
##    <dbl> <list> <chr>   
## 1   2.05 <NULL> ""

The compose() method makes a new function that calls unpack_mess() on the output automatically. We have now wrapped our original function, in effect, three times. The result is a new function that takes the same parameters as our original my_complex_function() method and returns its results, including information about errors and warnings, in a nicely organized way.
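If compose() feels too magical, here is a hand-written equivalent (a sketch) of the triple-wrapped function:

# Wrap, call, and then tidy: the same as
# compose( unpack_mess, quietly( safely( ... ) ) ).
my_safe_quiet_function_v2 <- function( ... ) {
  raw = quietly( safely( my_complex_function, otherwise = NA ) )( ... )
  unpack_mess( raw )
}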

We could use such a modified function in a loop, and stack our results:

rs <- map_df( 1:20, ~ my_safe_quiet_function(6) )
rs
## # A tibble: 20 × 3
##    result error      warnings       
##     <dbl> <list>     <chr>          
##  1   1.76 <NULL>     ""             
##  2   2.11 <NULL>     ""             
##  3   1.68 <NULL>     ""             
##  4 NaN    <NULL>     "NaNs produced"
##  5  NA    <smplErrr> ""             
##  6   1.66 <NULL>     ""             
##  7  NA    <smplErrr> ""             
##  8   1.34 <NULL>     ""             
##  9   1.35 <NULL>     ""             
## 10   1.94 <NULL>     ""             
## 11   2.15 <NULL>     ""             
## 12   1.79 <NULL>     ""             
## 13  NA    <smplErrr> ""             
## 14 NaN    <NULL>     "NaNs produced"
## 15   1.91 <NULL>     ""             
## 16   1.67 <NULL>     ""             
## 17 NaN    <NULL>     "NaNs produced"
## 18   2.13 <NULL>     ""             
## 19   2.10 <NULL>     ""             
## 20 NaN    <NULL>     "NaNs produced"

These tools allow us to take existing analytic code and make it safe and quiet, so we can keep things organized in our simulation.

18.1.2 What to do with warnings in simulations

Sometimes our analytic strategy might give some sort of warning (or fail altogether). For example, from the cluster randomized experiment case study we have:

set.seed(101012)  # (I picked this to show a warning.)
dat = gen_dat_model( J = 50, n_bar = 100, sigma2_u = 0 )
mod <- lmer( Yobs ~ 1 + Z + (1|sid), data=dat )
## boundary (singular) fit: see help('isSingular')

We have to make a deliberate decision as to what to do about this:

  • Keep these “weird” trials?
  • Drop them?

If you decide to drop them, you should drop the entire simulation iteration, including the results of the other estimators, even if they worked fine! If there is something particularly unusual about that dataset, then dropping it for one estimator while keeping it for the others (which perhaps struggled to estimate the estimand without actually issuing a warning) would be unfair: in the final performance measures, the estimators that did not give a warning would be held to a higher standard, biasing the comparisons between estimators.
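In code, dropping whole iterations might look like the following sketch (assuming your stacked results have hypothetical runID and warning columns, with one row per estimator per iteration):

# Hypothetical: keep an iteration (runID) only if no estimator in
# that iteration flagged a warning.
res_clean <- res %>%
  group_by( runID ) %>%
  filter( all( warning == 0 ) ) %>%
  ungroup()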

Either way, if your estimators generate warnings, you should calculate the rate of errors or warning messages as a performance measure in its own right. Especially if you drop some trials, it is important to see how often things are acting peculiarly.
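For example, with results shaped like the rs tibble from the previous section, these rates are just means of indicator variables:

# Warning and error rates across trials, using the stacked rs
# tibble built earlier with our wrapped function.
rs %>%
  summarise( warning_rate = mean( warnings != "" ),
             error_rate = mean( !map_lgl( error, is.null ) ) )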

The main tool for doing this is the quietly() function:

quiet_lmer = quietly( lmer )
qmod <- quiet_lmer( Yobs ~ 1 + Z + (1|sid), data=dat )
qmod
## $result
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: ..1
##    Data: ..2
## REML criterion at convergence: 6485.293
## Random effects:
##  Groups   Name        Std.Dev.
##  sid      (Intercept) 0.0000  
##  Residual             0.9879  
## Number of obs: 2302, groups:  sid, 50
## Fixed Effects:
## (Intercept)            Z  
##    -0.03143      0.03433  
## optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings 
## 
## $output
## [1] ""
## 
## $warnings
## character(0)
## 
## $messages
## [1] "boundary (singular) fit: see help('isSingular')\n"

You might then have, in your analysis code, something like:

analyze_data <- function( dat ) {
    
    M1 <- quiet_lmer( Yobs ~ 1 + Z + (1|sid), data=dat )
    message1 = ifelse( length( M1$messages ) > 0, 1, 0 )
    warning1 = ifelse( length( M1$warnings ) > 0, 1, 0 )

    # Compile our results (the fitted model is in M1$result;
    # se.fixef() is from the arm package)
    tibble( ATE_hat = fixef( M1$result )["Z"],
            SE_hat = arm::se.fixef( M1$result )["Z"],
            message = message1,
            warning = warning1 )
}

Now you have your primary estimates, and also flags for whether there was a convergence issue. In the analysis section you can then evaluate what proportion of the time there was a warning or message, and then do subset analyses to those simulation trials where there was no such warning.
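For example (a sketch, where res is a hypothetical tibble stacking many analyze_data() results across trials, and ATE is the known true effect):

# Hypothetical: rates of flags, then bias restricted to the clean
# trials with no messages or warnings.
res %>%
  summarise( message_rate = mean( message ),
             warning_rate = mean( warning ) )

res %>%
  filter( message == 0, warning == 0 ) %>%
  summarise( bias = mean( ATE_hat - ATE ) )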

18.2 Protecting your functions with “stop”

When writing functions, especially those that take a lot of parameters, it is often wise to include stopifnot() statements at the top to verify the function is getting what it expects. These are sometimes called “assert statements,” and they are a tool for making errors surface as early as possible. For example, look at this (fake) example of generating data with different means and standard deviations:

make_groups <- function( means, sds ) {
  Y = rnorm( length(means), mean=means, sd = sds )
  round( Y )
}

If we call it but provide different lengths for our means and standard deviations, nothing happens, because R simply recycles the standard deviation parameter:

make_groups( c(100,200,300,400), c(1,100,10000) )
## [1]  100  133 1675  402

If this function were buried in our data-generating code, we would get no warning at all and might never realize something was wrong. We can instead protect our function by putting an assert statement in it like this:

make_groups <- function( means, sds ) {
  stopifnot( length(means) == length(sds) )
  Y = rnorm( length(means), mean=means, sd = sds )
  round( Y )
}

This ensures your code is being called as you intended. What is nasty about this possible error is that, without the check, nothing tells you something is wrong! You could build an entire simulation on this, not realizing that your fourth group has the standard deviation of your first, and get results that make no sense to you. You could even publish something based on a finding that depends on this error, which would eventually be quite embarrassing.

These statements can also serve as a sort of documentation as to what you expect. Consider, for example:

make_xy <- function( N, mu_x, mu_y, rho ) {
  stopifnot( -1 <= rho && rho <= 1 )
  X = mu_x + rnorm( N )
  # Center X before scaling by rho so Y has mean mu_y.
  Y = mu_y + rho * (X - mu_x) + sqrt(1-rho^2)*rnorm(N)
  tibble(X = X, Y=Y)
}

Here we see quite clearly that rho should be between -1 and 1, a good reminder of what the parameter means.

This also protects you from inadvertently misremembering the order of your parameters when you call the function (although it is good practice to name your parameters as you pass them). Consider:

a <- make_xy( 10, 2, 3, 0.75 )
b <- make_xy( 10, 0.75, 2, 3 )
## Error in make_xy(10, 0.75, 2, 3): -1 <= rho && rho <= 1 is not TRUE
c <- make_xy( 10, rho = 0.75, mu_x = 2, mu_y = 3 )
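As a final touch, stopifnot() also accepts named conditions (since R 3.5), where the name becomes the error message; this makes the check even more self-documenting:

# Same function, but with a human-readable message if the check
# on rho fails.
make_xy <- function( N, mu_x, mu_y, rho ) {
  stopifnot( "rho must be between -1 and 1" = -1 <= rho && rho <= 1 )
  X = mu_x + rnorm( N )
  Y = mu_y + rho * (X - mu_x) + sqrt(1-rho^2)*rnorm(N)
  tibble(X = X, Y=Y)
}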