Testing function calls is probably one of the most common things you’ll do when writing SCTs. With testwhat, you can check whether students correctly called particular functions, whether the expected arguments were specified, and whether the values passed to these arguments are correct. In addition to argument checking, you can also check the result of calling a function and compare it with with this function call should’ve returned; another very robust way of checking a student’s work. This article is arguably one of the most important ones in this documentation, so read it carefully and completely.

Example 1: Basic

Suppose you want the student to call the round() function on pi (a value that’s available in R by default), as follows:

The following SCT tests whether the round() function is used correctly:

To find out which arguments you have to specify in check_arg(), you can use args(round). This will show that the names of the arguments of the round() function are x and digits.

When a student submits his code and SCT is executed, check_function() tests whether the student has called the function round(). Next, check_arg() checks whether each argument was specified. Finally, check_equal() checks whether the values of the argument are the same as in the solution. So in this case, it tests whether round() is used with the x argument equal to pi and the second argument equal to 3. The above SCT would accept all of the following submissions:


Inside check_equal(), you can use eval to control how parameters are checked:

  • If eval = FALSE, the expressions of the parameters are compared as strings, not the value that results from evaluating the expression.
  • If eval = TRUE, which is the default, the expressions that are used to set the arguments are evaluated and their results compared.

If you change the SCT as follows (which in this example is not a good idea!):

not all of the student submissions listed above would be accepted, because now pi is checked literally, as the string "pi", not as the number 3.1415.... Using eval = FALSE is interesting when you’re working with huge datasets and don’t want to compare their values exhaustively, or for objects for which the equality operator is not standard, such as SQL connections.

Example 2: Checking the result of a function call instead of the arguments

Checking whether the student function call, when executed, corresponds to the result of the solution function call is a very robust way of checking whether the function was called correctly. You can do this by piping check_function() into check_result(), check_output(), or check_error(), depending on whether you want to check the result of the function call, the output the function call generates, or the error the call generates.

As an example, suppose you want the student to call the sum function on a vector containing the values 1 to 5.

One way to solve this is with check_output_expr(), but this does not explicitly require the student to use the sum() function. To make the usage of sum() explicit, you can use check_function() in combination with check_result():

Here, check_function() looks for the call of sum() in both student and solution code, and then check_result() runs both (in their respective environments). Finally, with check_equal() you verify that the results of the function calls are equal. The above SCT would accept all of the following submissions:

When to use check_result() in combination with check_function()? Whenever there are is a multitude of ways to call a function, typically when the argument it takes is simply .... Functions from tidyverse packages such as tidyr and dplyr are typical use cases. Suppose you want to check whether a student correctly calculated a new column and selected two columsn from mtcars:

With the following SCT:

Both of the following submissions would pass. A check_arg()-based SCT chain would have a very hard time allowing for all of these solutions.

  • mtcars %>% mutate(lsp100km = 235.214 / mpg) %>% select(mpg, lsp100km)
  • mtcars %>% select(mpg, lsp100km = 235.214 / mpg)

Note: testwhat can handle the %>%: it will convert a %>% f(b) to f(a, b) again, so you don’t have to worry about anything.

Example 3: Multiple function calls

index, which is 1 by default, becomes important when there are several calls of the same function. Suppose that your exercise requires the student to call the round() function twice: once on pi and once on exp(1), Euler’s number. A possible solution could be the following:

To test both these function calls, you’ll need the following SCT:

The first check_function() chain, where index = 1, checks the solution code for the first function call of round(), finds it - round(pi, 3) - and then goes to look through the student code to find a function call of round(). It is possible that there are 5 function calls of round() in the student’s submission, and that only the fourth call matches the requirements for the first check_function() chain, but testwhat matches function calls in the student code by the index of the call in the solution code.

This means that all of the following student submissions would be accepted:

Of course, you can also specify all other arguments to customize your test, such as eval, args, not_called_msg and incorrect_msg.

Example 4: Multiple function calls (2)

When you’re checking a function call, check_function() will look for all ‘possible candidates’ to match the function call in the solution. This can have some nasty side effects, though. Similar to the previous example, suppose that you want the student to use round() twice, but you only want to test the second round() call:

An possible SCT would look like this then:

Suppose now that the student submits the following answer

If you try this out, you’ll see that pi will be highlighted, in the first call. This should be the second call, right? Here’s why it happens, step by step:

  • check_function("round") looks for all calls of round() in the student’s code. It finds two: on line 2 and on line 4.
  • Next, check_arg("x") checks whether the x argument is specified. It sees that both calls of round() do so, so both calls are still candidate to match the solution’s function call.
  • Next, check-equal() checks if the actual value of x carresponds to the value of that argument in the solution. More specifically, it looks for a x argument that equals exp(1). It checks the x argument of the first round() call and sees it’s no good. Then it checks the x argument of the second round() call and sees it’s no good either. At this point the SCT fails, and simply the first function of the candidates, the one at line 2, is highlighted as the incorrect one.

You can solve this by ‘ruling out’ the first round() call as a candidate, by simply using a check_function() chain beforehand:

This way, the first round() call will not be considered as a candidate anymore by the time you’re checking the second round() call. If you submit the same student code for this SCT, you will see that highlighting happens correctly this time.

Example 5: ... argument

Behind the scenes check_function() uses the match.call() function, to match the arguments to the function parameters. As an example, all of these different but equivalent ways to call the grepl() function:

grepl(pattern = "a{2}", x = "aabb")
grepl(pat = "a{2}", x = "aabb")
grepl("a{2}", x = "aabb")
grepl("a{2}", "aabb")
grepl(x = "aabb", pattern = "a{2}")
grepl(x = "aabb", "a{2}")

Are converted into a standardized form by match.call:

grepl(pattern = "a{2}", x = "aabb")

That way, it’s easy to tell how each argument was specified.

However, in R, there’s the so-called ellipsis argument to pass arguments in a very free form, without having to list them in the function signature explicitly. They are typically used in an S3 context, where the arguments are dispatched to ‘lower level S3 functions’. match.call() cannot handle the ellipsis argument in a straightforward way.

In terms of argument matching in general, 4 things can happen:

  1. match.call() perfectly works, and every argument you pass is matched to an argument name, without any ellipsis problems.

  2. check_function() figures out that you wanted to test an S3 function (predict() is a great example, it has a bunch of class-specific implementation). Based on the class of the first argument, it figures out the ‘more detailed’, class-specific implementation and tries to match the arguments to the signature of this more detailed function. NOTE: it is still possible that this signature features the ... arguments, so the points below apply as well for this case.

  3. The function call that match.call() has to standardize contains arguments that are matched to the ellipsis, but these arguments are named explicitly, for example the the main argument in plot(mtcars$wt, mtcars$hp, main = 'mtcars plot'). In this case, match.call() will match mtcars$wt to x, mtcars$hp to y and 'mtcars plot' to main, just like for regular arguments.

  4. The function call that match.call() has to standardize contains arguments that are matched to the ellipsis, but these arguments are not explicitly named. In this case, all ‘unnamed arguments’ are merged together in a list under the argument name ....

Things become tricky if case 4 occurs: if a student specifies multiple arguments without naming them, and you want to check these arguments, you’ll need to use "..." inside your args function. However, be aware that this "..." part can represent a number of arguments that the student specified.

As an example, suppose you want to the student the calculate the sum of 1, 2, 3, 4, and NA. sum() supports the following approach:

sum(1, 2, 3, 4, NA, na.rm = TRUE)

To test this:

All of the following submissions will pass:

However, none of the following submissions will pass (as you see, order is important):

If the arguments that are matched to ... are not correct, check_equal() will automatically generate the message:

Check your call of `sum()`. Did you correctly specify the arguments that are matched to `...`?

In this example of using the sum() function, using check_function() is a bad idea to start with. You also want to allow the student to use:

sum(c(1, 2, 3, 4, NA), na.rm = TRUE)

which is perfectly valid as well. That’s why the following SCT would be more appropriate:

Example 6: Allowing for different solutions

A typical example of this is an exercise with the plot() function:

df <- data.frame(time = seq(0, 2*pi, 0.01))
df$res <- sin(df$time)

# create a plot of res vs time
plot(df$time, df$res)

All of the following submissions should pass:

However, if you simply use the following SCT

The third and fourth plot won’t work, because the solution code does not contain the calls to compare to. You can use the more granular check_*() functions in combination with check_or() and override_solution(). As the name suggests, this temporarily overrides the solution so that you can use check_function() for different possibilities.

Example 7: Comparing formulas

testwhat has built-in functionality to deal with formulas rather robustly. Suppose you want to check whether a student called the lm function correctly:

lm(mpg ~ wt + hp, data = mtcars)

If you use the following SCT:

testwhat will recognize that the formula argument you specified is indeed an R formula object. It will then parse the formula and make it robust to different orderings. This means that the following submissions pass:

lm(mpg ~ wt + hp, data = mtcars)
lm(mpg ~ hp + wt, data = mtcars)

And the following fail:

lm(mpg ~ wt + hp + drat, data = mtcars)

So there is some ‘normalization’ going on, but handle with care.

Appendix: Advanced Automatic Feedback

The automatic messages that are generated when the student makes mistakes of different kinds is continuously improved to be as insightful as possible. For example, when the solution expects:


but the student submits:


The following SCT:

will generate the message: “Have you correctly specified the argument x inside print()? You specified a character string, while it should be a number.”. When comparing character strings between student and solution, this comparison goes further, hinting about potential incorrect capitalization, incorrect spacing, incorrect punctuation or possible typos.