Roman Pahl

Some hopefully useful (or even awesome) stuff related to R programming.

container - deque, set and dict for R

Written on July 22, 2018

I recently managed to put my new package container on CRAN (and finally have a compelling reason to start an R blog …). The package provides the common container data structures deque, set and dict (resembling Python's dict type), with typical member functions to insert, delete and access container elements.

If you work with (especially bigger) R scripts, a specialized container may save you some time and errors, for example by preventing you from accidentally overwriting existing list elements. Also, being based on R6, all container objects provide reference semantics.

Example: dict vs list

Here are a (very) few examples comparing the standard list with the dict container. For more examples see the vignette.

Init and print

library(container)
## 
## Attaching package: 'container'
## The following object is masked from 'package:base':
## 
##     remove
l <- list(A1=1:3, L=letters[1:3])
print(l)
## $A1
## [1] 1 2 3
## 
## $L
## [1] "a" "b" "c"

There are many ways to initialize a dict - one of them is passing a standard list. The print method provides compact output similar to base::str.

d <- Dict$new(l)
print(d) 
## <Dict> of 2 elements: List of 2
##  $ A1: int [1:3] 1 2 3
##  $ L : chr [1:3] "a" "b" "c"

Access elements

Accessing non-existing list elements often gives unexpected results and can lead to nasty and hard-to-spot errors: a missing name silently yields NULL, and arithmetic with NULL returns an empty vector.

sum <- l[["A1"]] + l[["B1"]]
sum
## integer(0)
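
To make the silent failure above more explicit, here is a minimal base-R sketch of what happens step by step, together with a strict accessor for plain lists. The helper name get_strict is hypothetical and only illustrates the idea; it is not part of the container package.

```r
l <- list(A1 = 1:3, L = letters[1:3])

# `[[` with a missing name returns NULL instead of raising an error
missing_value <- l[["B1"]]
is.null(missing_value)
## [1] TRUE

# Arithmetic with NULL yields a zero-length vector, so the mistake goes unnoticed
result <- l[["A1"]] + missing_value
length(result)
## [1] 0

# Hypothetical strict accessor for plain lists: stop if the key is missing
get_strict <- function(x, key) {
  if (!key %in% names(x)) stop("key '", key, "' not in list")
  x[[key]]
}

# Capture the error message instead of aborting the script
strict_error <- tryCatch(get_strict(l, "B1"),
                         error = function(e) conditionMessage(e))
```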

The dict provides the intended behaviour (in this case, it stops with an error).

sum <- d$get("A1") + d$get("B1")
## Error in d$get("B1"): key 'B1' not in Dict

Catching such cases manually is rather cumbersome.

# Use scalar if/else here: ifelse() would keep only the first element of l[["B1"]]
robust_sum <- l[["A1"]] + (if ("B1" %in% names(l)) l[["B1"]] else 0)
robust_sum
## [1] 1 2 3
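
One detail worth watching when writing such manual checks: base R's vectorised ifelse() returns a result shaped like its test, so with a length-one test it keeps only the first element of a longer value, whereas a plain if/else returns the whole vector. The following sketch wraps the check in a helper for plain lists; peek_list is a hypothetical name chosen for illustration only.

```r
l <- list(A1 = 1:3, L = letters[1:3])

# Hypothetical helper: the value stored under `key` if present, else `default`
peek_list <- function(x, key, default = NULL) {
  if (key %in% names(x)) x[[key]] else default
}

# ifelse() with a length-one test returns a length-one result
first_only <- ifelse("A1" %in% names(l), l[["A1"]], 0)
length(first_only)
## [1] 1

# The if/else based helper returns the complete vector
whole_vector <- peek_list(l, "A1", default = 0)
length(whole_vector)
## [1] 3

# Missing key: the default is used instead of NULL
robust_sum <- l[["A1"]] + peek_list(l, "B1", default = 0)
```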

The peek method returns the value if the key exists and the given default otherwise. The resulting code is not only shorter but also easier to read, because the intended behaviour is expressed more clearly.

robust_sum <- d$get("A1") + d$peek("B1", default=0)
robust_sum
## [1] 1 2 3

Set elements

A similar problem occurs when overwriting existing elements.

l[["L"]] <- 0  # letters are gone
l
## $A1
## [1] 1 2 3
## 
## $L
## [1] 0
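
With a plain list, guarding against this requires an explicit check before every assignment. A minimal sketch of such a guard, assuming a hypothetical helper add_strict that is not part of base R or the container package:

```r
# Hypothetical guard for plain lists: refuse to overwrite an existing key
add_strict <- function(x, key, value) {
  if (key %in% names(x)) stop("key '", key, "' already in list")
  x[[key]] <- value
  x  # lists are copied on modification, so the updated list must be returned
}

l <- list(A1 = 1:3, L = letters[1:3])

# Adding a new key works, but the result has to be reassigned
l <- add_strict(l, "M", 1)

# Adding an existing key stops; capture the message instead of aborting
overwrite_error <- tryCatch(add_strict(l, "L", 0),
                            error = function(e) conditionMessage(e))
```

Note that the caller has to remember both to use the helper and to reassign its result, which is exactly the kind of bookkeeping that is easy to forget in larger scripts.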

The add method prevents any accidental overwrite.

d$add("L", 0)
## Error in d$add("L", 0): key 'L' already in Dict
# If overwrite is intended, use 'set'
d$set("L", 0)
d
## <Dict> of 2 elements: List of 2
##  $ A1: int [1:3] 1 2 3
##  $ L : num 0
# Setting non-existing elements also raises an error, unless adding is intended
d$set("M", 1)
## Error in d$set("M", 1): key 'M' not in Dict
d$set("M", 1, add=TRUE)  # alternatively: d$add("M", 1)

Removing existing/non-existing elements can be controlled in a similar way. Again, see the package vignette for more examples.

Reference semantics

Since dict objects are R6 objects, they are passed by reference: a function that modifies a dict changes the original object, and an independent copy has to be created explicitly with clone().

d$size()
## [1] 3
remove_from_dict_at <- function(d, x) d$remove(x) 
remove_from_dict_at(d, "L")
remove_from_dict_at(d, "M")
d$size()
## [1] 1
backup <- d$clone()
remove_from_dict_at(d, "A1")
d
## <Dict> of 0 elements:  Named list()
backup
## <Dict> of 1 elements: List of 1
##  $ A1: int [1:3] 1 2 3
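
For contrast, the difference between copy and reference semantics can be sketched in base R alone. R6 objects are built on environments, so an environment serves here as a stand-in for a container object; the helper names drop_from_list and drop_from_env are hypothetical.

```r
# Plain list: the function only modifies its local copy
drop_from_list <- function(x, key) {
  x[[key]] <- NULL
  invisible(x)
}
l <- list(A1 = 1:3, L = letters[1:3])
drop_from_list(l, "L")
length(l)  # the original list is unchanged
## [1] 2

# Environment: the function modifies the caller's object in place
drop_from_env <- function(e, key) {
  rm(list = key, envir = e)
  invisible(e)
}
e <- list2env(list(A1 = 1:3, L = letters[1:3]))
drop_from_env(e, "L")
ls(e)
## [1] "A1"
```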