R allows you to create your own functions. In this tutorial you will learn **how to write a function in R**: its syntax, its arguments and output, how the `return` function works, and how to make correct use of optional, additional and default arguments.

- 1 How to write a function in R language? Defining R functions
  - 1.1 Creating a function in R
- 2 Input arguments in R functions
- 3 Default arguments for functions in R
- 4 Additional arguments in R
- 5 The R return function
- 6 Local and global variables in R
- 7 Writing a function in R. Examples
  - 7.1 Example function 1: Letter of Spanish DNI
  - 7.2 Example function 2: Throwing a die

## How to write a function in R language? Defining R functions

Base R functions don't always cover all our needs. To write a function in R you first **need to know the syntax** of the `function` keyword. The basic R function syntax is as follows:

`function_name <- function(arg1, arg2, ...) {
    # Code
}`

In the previous code block we have the following parts:

- `arg1, arg2, ...` are the **input arguments**.
- `# Code` represents the **code to be executed** within the function to calculate the desired output.

The **output of the function can be** a number, a list, a data.frame, a plot, a message or **any object you want**. You can also assign a class to the output, but we will cover this in another post about S3 classes. This is especially useful when writing functions for R packages.

### Creating a function in R

To introduce R functions we will create a function to work with geometric progressions. A geometric progression is a sequence of numbers $a_1, a_2, a_3, \dots$ such that each term (except the first) is equal to the previous one multiplied by a constant *r*, called the common ratio. You can verify that

$$a_2 = a_1 \cdot r; \qquad a_3 = a_2 \cdot r = a_1 \cdot r^2; \qquad \dots$$

Hence, generalizing this process you can obtain the general term

$$a_n = a_1 \cdot r^{n-1}.$$

You can also verify that, for *r* ≠ 1, the sum of the first *n* terms of the progression is

$$S_n = a_1 + \dots + a_n = \frac{a_1(r^n - 1)}{r-1}.$$
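The closed form for the sum can be checked with a short derivation, sketched here under the assumption that *r* ≠ 1 (when *r* = 1 every term equals the first one, so the sum is simply *n* times the first term). Multiplying the sum by *r* and subtracting the original sum cancels all the intermediate terms:

```latex
\begin{aligned}
S_n &= a_1 + a_1 r + \dots + a_1 r^{n-1}, \\
r \, S_n &= a_1 r + a_1 r^2 + \dots + a_1 r^{n}, \\
r \, S_n - S_n &= a_1 r^{n} - a_1, \\
S_n &= \frac{a_1 (r^n - 1)}{r - 1}.
\end{aligned}
```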

With this in mind you can create the following function,

`an <- function(a1, r, n) {
  a1 * r^(n - 1)
}`

which calculates the general term $a_n$ of a geometric progression given the first term $a_1$, the ratio *r* and the index *n*. The following block shows some examples, with their output as comments.

`an(a1 = 1, r = 2, n = 5)  # 16
an(a1 = 4, r = -2, n = 6) # -128`

With the previous function you can obtain several values of the progression by passing a vector of values to the argument *n*.

`an(a1 = 1, r = 2, n = 1:5)   # a_1, ..., a_5
an(a1 = 1, r = 2, n = 10:15) # a_10, ..., a_15`

You can also calculate the sum of the first *n* terms of the progression with the `sn` function, defined below.

`sn <- function(a1, r, n) {
  a1 * (r^n - 1) / (r - 1)
}`

`sn(a1 = 1, r = 2, n = 5) # 31
values <- an(a1 = 1, r = 2, n = 1:5) # Equivalent: add up the individual terms
values      # 1 2 4 8 16
sum(values) # 31`
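As a quick sanity check, the closed-form sum can be compared against a running total of the individual terms. The sketch below redefines `an` and `sn` so that it runs on its own; the `sn_safe` helper is a hypothetical addition (not part of the functions above) that handles the special case *r* = 1, where the closed form divides by zero.

```r
# General term and closed-form sum, as defined in the text
an <- function(a1, r, n) a1 * r^(n - 1)
sn <- function(a1, r, n) a1 * (r^n - 1) / (r - 1)

# The running total of the terms should match the closed form for each n
terms <- an(a1 = 1, r = 2, n = 1:5)
all.equal(cumsum(terms), sn(a1 = 1, r = 2, n = 1:5)) # TRUE

# With r = 1 the closed form evaluates 0 / 0
sn(a1 = 3, r = 1, n = 4) # NaN

# Hypothetical variant: fall back to n * a1 when r is 1 (r must be a single value)
sn_safe <- function(a1, r, n) {
  if (r == 1) n * a1 else sn(a1, r, n)
}
sn_safe(a1 = 3, r = 1, n = 4) # 12
```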

## Input arguments in R functions

**Arguments are input values of functions**. As an example, the function we created before has three input arguments named `a1`, `r` and `n`. There are several considerations when dealing with this type of arguments:

- If you **maintain the input order**, you **don't need to write the argument names**. As an example, the following calls are equivalent.

`an(1, 2, 5)              # Returns 16
an(a1 = 1, r = 2, n = 5) # Returns 16`

- If you **name the arguments**, you can use **any order**.

`an(r = 2, n = 5, a1 = 1) # Returns 16
an(n = 5, r = 2, a1 = 1) # Returns 16`

- You can use the `args` function to **see the input arguments of any function** you would like to use.

`args(an)`

- If you **type the function name** without parentheses, the console will print the **code of the function**.

**Note that** sometimes you won’t be able to see the source code of a function if it is not written in R.
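The points above can be tried out in a short session. This is a minimal sketch that redefines `an` so that it is self-contained; `formals` is a base R function that returns the arguments as a named list, which is convenient when you want to inspect them programmatically.

```r
an <- function(a1, r, n) a1 * r^(n - 1)

# Positional and named calls give the same result
identical(an(1, 2, 5), an(n = 5, a1 = 1, r = 2)) # TRUE

# Positional and named arguments can be mixed: named ones are matched
# first, and the remaining ones are filled in by position
an(1, n = 5, r = 2) # 16

# Inspect the arguments
args(an)           # prints the argument list
names(formals(an)) # "a1" "r" "n"

# Typing the name without parentheses prints the source code
an
```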

## Default arguments for functions in R

Sometimes it is useful to give function arguments default values, so that the **default values are used unless others are supplied** when calling the function. When writing a function with the general form

`function_name <- function(arg1, arg2, arg3) {
    # Code
}`

if you want `arg2` and `arg3` to be `a` and `b` by default, you can assign them in the arguments of your R function.

`function_name <- function(arg1, arg2 = a, arg3 = b) {
    # Code
}`

We will illustrate this with a very simple example. Consider, for instance, a function that plots the cosine.

`cosine <- function(w = 1, min = -2 * pi, max = 2 * pi) {
  x <- seq(min, max, length.out = 200) # grid from min to max
  plot(x, cos(w * x), type = "l")
}`

Note that this is not the best way to use a function to make a plot. See S3 classes for that purpose.

If you execute `cosine()`, `cos(x)` will be plotted by default in the interval [-2π, 2π]. However, if you want to plot `cos(2x)` in the same interval you need to execute `cosine(w = 2)`. Let’s see some examples:

`par(mfcol = c(1, 3)) # One row, three columns
cosine()
cosine(w = 2)
cosine(w = 3, min = -3 * pi)`
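Default arguments work the same way for functions that return a value instead of drawing a plot. As an illustration, the sketch below defines a hypothetical `an_default` function (the name and the default values are assumptions made for this example) in which only *n* is required.

```r
# Hypothetical variant of an(): a1 and r have default values
an_default <- function(n, a1 = 1, r = 2) {
  a1 * r^(n - 1)
}

an_default(5)              # 16, uses a1 = 1 and r = 2
an_default(5, r = 3)       # 81, overrides only r
an_default(n = 3, a1 = 10) # 40, overrides only a1

# Calling it without n fails, because n has no default value
res <- tryCatch(an_default(), error = function(e) conditionMessage(e))
res # the error message reporting that n is missing
```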

## Additional arguments in R

The argument `...` (dot-dot-dot) lets you pass extra arguments that are forwarded to a function called inside the main function. As an example, in the function

`cosine <- function(w = 1, min = -2 * pi, max = 2 * pi, ...) {
  x <- seq(min, max, length.out = 200)
  plot(x, cos(w * x), ...)
}`

the arguments passed through `...` will be used by the `plot` function. Let’s see a complete example:

`par(mfcol = c(1, 2))
cosine(w = 2, col = "red", type = "l", lwd = 2)
cosine(w = 2, ylab = "")`
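The same mechanism works with any function, not only `plot`. In the following sketch the `mean_of` helper is hypothetical: it forwards whatever arrives in `...` to `mean`, and uses `list(...)` to report which extra arguments were received.

```r
# Hypothetical helper: extra arguments are forwarded to mean()
mean_of <- function(x, ...) {
  extras <- list(...) # capture the extra arguments as a named list
  if (length(extras) > 0) {
    message("Forwarding: ", paste(names(extras), collapse = ", "))
  }
  mean(x, ...)
}

mean_of(c(1, 2, NA))               # NA
mean_of(c(1, 2, NA), na.rm = TRUE) # 1.5
```

Here `na.rm` is never mentioned in the definition of `mean_of`; it simply travels through `...` to `mean`.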

## The R return function

By default, an R function returns the last expression evaluated inside it. You can also make use of the `return` function, which is especially important when you want to return one object or another depending on certain conditions, or when the object you want to return is not the last expression evaluated. It is worth mentioning that you can return any type of R object, but only one. For that reason it is very common to return a list of objects, as follows:

`asn <- function(a1 = 1, r = 2, n = 5) {
  A <- an(a1, r, n)
  S <- sn(a1, r, n)
  ii <- 1:n
  AA <- an(a1, r, ii)
  SS <- sn(a1, r, ii)
  return(list(an = A, sn = S, output = data.frame(values = AA, sum = SS)))
}`

When you run the function, you will get the following output. Remember to have the `sn` and `an` functions loaded in the workspace.

`asn()`

`$an
[1] 16
$sn
[1] 31
$output
  values sum
1      1   1
2      2   3
3      4   7
4      8  15
5     16  31`
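Because the result is a list, each component can be extracted by name with `$` or `[[ ]]`. A short sketch, with the three functions redefined so that it runs on its own:

```r
an <- function(a1, r, n) a1 * r^(n - 1)
sn <- function(a1, r, n) a1 * (r^n - 1) / (r - 1)
asn <- function(a1 = 1, r = 2, n = 5) {
  ii <- 1:n
  return(list(an = an(a1, r, n), sn = sn(a1, r, n),
              output = data.frame(values = an(a1, r, ii), sum = sn(a1, r, ii))))
}

res <- asn()
res$an            # 16
res[["sn"]]       # 31
res$output$values # 1 2 4 8 16
names(res)        # "an" "sn" "output"
```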

You may have noticed that in the previous case using the `return` function or omitting it is equivalent. However, consider the following example, where we want to check whether the values passed to the arguments are numbers. If any of them is not a number we return a string; if they are all numbers the code continues executing.

`asn <- function(a1 = 1, r = 2, n = 5) {
  if (!is.numeric(c(a1, r, n))) return("The parameters must be numbers")
  A <- an(a1, r, n)
  S <- sn(a1, r, n)
  ii <- 1:n
  AA <- an(a1, r, ii)
  SS <- sn(a1, r, ii)
  return(list(an = A, sn = S, output = data.frame(values = AA, sum = SS)))
}`

`asn("3")`

`[1] "The parameters must be numbers"`

If we had used the `print` function instead of `return`, then when some parameter is not numeric the text would be printed but an error would also be raised, since the rest of the code is still executed.

`asn <- function(a1 = 1, r = 2, n = 5) {
  if (!is.numeric(c(a1, r, n))) print("The parameters must be numbers")
  A <- an(a1, r, n)
  S <- sn(a1, r, n)
  ii <- 1:n
  AA <- an(a1, r, ii)
  SS <- sn(a1, r, ii)
  return(list(an = A, sn = S, output = data.frame(values = AA, sum = SS)))
}`

`asn("3")`

`[1] "The parameters must be numbers"
Error in a1 * r^(n - 1) : non-numeric argument to binary operator`
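In practice, invalid input is often signalled with an error rather than by returning a string. The following sketch uses base R's `stop` for that purpose; it is an alternative to the approach shown in the text, and the name `asn_strict` is hypothetical. The caller can then handle the failure with `tryCatch`.

```r
an <- function(a1, r, n) a1 * r^(n - 1)
sn <- function(a1, r, n) a1 * (r^n - 1) / (r - 1)

# Hypothetical variant of asn() that raises an error on invalid input
asn_strict <- function(a1 = 1, r = 2, n = 5) {
  if (!is.numeric(c(a1, r, n))) {
    stop("The parameters must be numbers") # execution ends here
  }
  list(an = an(a1, r, n), sn = sn(a1, r, n))
}

asn_strict()$sn # 31

# Catch the error and keep its message
msg <- tryCatch(asn_strict("3"), error = function(e) conditionMessage(e))
msg # "The parameters must be numbers"
```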

## Local and global variables in R

In R it is not necessary to declare the variables used within a function. A rule called “lexical scoping” is used to decide whether an object is local to a function or global. Consider, for instance, the following example:

`fun <- function() {
  print(x)
}
x <- 1
fun() # 1`

The variable `x` is not defined within `fun`, so R will search for `x` within the “surrounding” scope and print its value. If `x` is assigned inside the function, a local variable is created and the value of `x` in the global environment (outside the function) does not change.

`x <- 1
fun2 <- function() {
  x <- 2
  print(x)
}
fun2() # 2
x      # 1`

To change the global value of a variable inside a function you can use the double assignment operator (`<<-`).

`x <- 1
y <- 3
fun3 <- function() {
  x <- 2
  y <<- 5
  print(paste(x, y))
}
fun3() # 2 5
x      # 1 (the value hasn't changed)
y      # 5 (the value has changed)`
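Note that `<<-` does not always write to the global environment: it looks for the variable in the enclosing environments and modifies the first one it finds, creating it in the global environment only if none exists. The sketch below illustrates this with a function that returns another function; `make_counter` is a hypothetical name chosen for the example.

```r
make_counter <- function() {
  count <- 0            # local to this call of make_counter()
  function() {
    count <<- count + 1 # updates count in the enclosing function
    count
  }
}

counter <- make_counter()
counter() # 1
counter() # 2

# Check whether a global variable called count exists
# (FALSE in a fresh session, since <<- found count inside make_counter)
exists("count", envir = globalenv(), inherits = FALSE)
```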

## Writing a function in R. Examples

In this section different examples of R functions are shown in order to illustrate the creation and use of R functions.

### Example function 1: Letter of Spanish DNI

Let’s calculate the letter of a Spanish DNI (national identity document) from its number. The method used to obtain the letter (L) consists of dividing the number by 23 and, according to the remainder (R) obtained, assigning the letter given in the following table.

| R | L | R | L | R | L | R | L |
|---|---|---|---|---|---|---|---|
| 0 | T | 7 | F | 14 | Z | 21 | K |
| 1 | R | 8 | P | 15 | S | 22 | E |
| 2 | W | 9 | D | 16 | Q |  |  |
| 3 | A | 10 | X | 17 | V |  |  |
| 4 | G | 11 | B | 18 | H |  |  |
| 5 | M | 12 | N | 19 | L |  |  |
| 6 | Y | 13 | J | 20 | C |  |  |

The function could be written as follows. The local vector is called `dni_letters` to avoid masking R's built-in `letters` constant.

`DNI <- function(number) {
  dni_letters <- c("T", "R", "W", "A", "G", "M", "Y", "F", "P", "D", "X", "B",
                   "N", "J", "Z", "S", "Q", "V", "H", "L", "C", "K", "E")
  return(dni_letters[number %% 23 + 1])
}`

`DNI(50247828) # G`
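Because `%%` and vector indexing are both vectorized, the function also accepts a vector of numbers. The sketch below shows this, together with a hypothetical `check_dni` helper (not part of the original function) that verifies whether a full identifier, assumed to be written as eight digits followed by an uppercase letter, carries the right letter.

```r
DNI <- function(number) {
  dni_letters <- c("T", "R", "W", "A", "G", "M", "Y", "F", "P", "D", "X", "B",
                   "N", "J", "Z", "S", "Q", "V", "H", "L", "C", "K", "E")
  dni_letters[number %% 23 + 1]
}

DNI(c(50247828, 12345678)) # "G" "Z"

# Hypothetical validator for a single string such as "12345678Z"
check_dni <- function(id) {
  if (!grepl("^[0-9]{8}[A-Z]$", id)) return(FALSE)
  number <- as.numeric(substr(id, 1, 8))
  DNI(number) == substr(id, 9, 9)
}

check_dni("12345678Z") # TRUE
check_dni("12345678A") # FALSE
check_dni("1234")      # FALSE
```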

### Example function 2: Throwing a die

The next function simulates *n* (by default *n = 100*) throws of a die. The function returns the table of relative frequencies and draws the corresponding plot.

`dice <- function(n = 100) {
  throws <- sample(1:6, n, replace = TRUE)
  frequency <- table(throws) / n
  barplot(frequency, main = "")
  abline(h = 1/6, col = "red", lwd = 2)
  return(frequency)
}`

Now you can see the simulation results by executing the function.

`par(mfcol = c(1, 3))
dice(100)
dice(500)
dice(100000)`

For *n* = 100:

`   1    2    3    4    5    6
0.17 0.11 0.20 0.16 0.25 0.11`

For *n* = 500:

`    1     2     3     4     5     6
0.144 0.158 0.148 0.178 0.164 0.208`

For *n* = 100000:

`      1       2       3       4       5       6
0.16612 0.16630 0.16569 0.16791 0.16697 0.16701`

As you can see, as *n* increases the relative frequencies get closer to the theoretical value 1/6 ≈ 0.1667.
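Since the throws are random, your numbers will differ from the ones above on every run. As a sketch of how to make a simulation reproducible and to quantify how close the frequencies are to 1/6, the hypothetical `dice_freq` function below leaves out the plot, and `set.seed` fixes the state of the random number generator (the seed value 123 is arbitrary).

```r
# Hypothetical plot-free version of dice()
dice_freq <- function(n = 100) {
  throws <- sample(1:6, n, replace = TRUE)
  # factor() keeps all six faces in the table even if one never appears
  table(factor(throws, levels = 1:6)) / n
}

set.seed(123)
f1 <- dice_freq(1000)
set.seed(123)
f2 <- dice_freq(1000)
identical(f1, f2) # TRUE: same seed, same throws

# Largest deviation from the theoretical value 1/6 for increasing n
sapply(c(100, 10000, 1000000), function(n) max(abs(dice_freq(n) - 1/6)))
```

The deviations printed by the last line are expected to shrink as *n* grows, although the exact values depend on the random draws.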