Nested dependencies with %:%
I’ve done a lot of checking of nested loops elsewhere, but here I’m specifically interested in dependencies between iterations when nesting foreach loops with %:%.
Packages and setup
I’ll use the {future} package, along with {doFuture} and {foreach}.
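A minimal sketch of what that setup might look like. The choice of plan and the number of workers are assumptions for illustration, since the post does not show which backend or plan was actually used.

```r
# Assumed setup; the original post does not show this chunk.
library(foreach)
library(future)
library(doFuture)

# Register {doFuture} as the %dopar% backend for {foreach}.
registerDoFuture()

# Hypothetical plan: two background R sessions. Any other plan would also do.
plan(multisession, workers = 2)
```

With this in place, %dopar% hands each iteration to whatever {future} plan is active.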
Built-in nesting with dependencies
I’m getting strange errors when using built-in nesting where the iterations in the inner loop depend on the outer loop. I think those dependencies aren’t resolving the way I assumed.
These iterations could be wholly independent of each other, e.g. i = 1:10, j = seq(from = 0.1, to = 1, by = 0.1). But they could be dependent, e.g. in that simple case, we could naively say j = i/10 because that would make an equivalent vector, but it is evaluated separately for each i, so it only gives one value per i (see below). That’s expected, but not necessarily obvious at first glance. And it could be more complex still (which is the situation I have), with j indexing into something chosen by i. I’ll go through each in turn, returning objects that allow me to assess what’s up.
Wholly independent
indep_nested <- foreach(i = 1:10, .combine = rbind) %:%
  foreach(j = seq(from = 0.1, to = 1, by = 0.1), .combine = rbind) %dopar% {
    thisloop <- tibble::tibble(outer_it = i, inner_it = j)
  }
# indep_nested
To check, we can see whether there is a factorial mapping, i.e. every outer value paired with every inner value exactly once.
         inner_it
outer_it 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9   1
       1   1   1   1   1   1   1   1   1   1   1
       2   1   1   1   1   1   1   1   1   1   1
       3   1   1   1   1   1   1   1   1   1   1
       4   1   1   1   1   1   1   1   1   1   1
       5   1   1   1   1   1   1   1   1   1   1
       6   1   1   1   1   1   1   1   1   1   1
       7   1   1   1   1   1   1   1   1   1   1
       8   1   1   1   1   1   1   1   1   1   1
       9   1   1   1   1   1   1   1   1   1   1
      10   1   1   1   1   1   1   1   1   1   1
# indep_nested |> group_by(outer_it) |> summarise(n_outer = n())
# indep_nested |> group_by(inner_it) |> summarise(n_inner = n())
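The factorial check can also be done programmatically rather than by eye. This is a sketch under two assumptions that differ from the loop in the post: it uses %do% (sequential) so no parallel backend is needed, and a plain data.frame instead of a tibble. The names indep_check, crosstab and is_factorial are made up for illustration.

```r
library(foreach)

# %do% runs sequentially, so no backend has to be registered for this check.
indep_check <- foreach(i = 1:10, .combine = rbind) %:%
  foreach(j = seq(from = 0.1, to = 1, by = 0.1), .combine = rbind) %do% {
    data.frame(outer_it = i, inner_it = j)
  }

# Cross-tabulate outer values against inner values.
crosstab <- table(indep_check$outer_it, indep_check$inner_it)

# A full factorial mapping has every combination exactly once.
is_factorial <- all(crosstab == 1)
is_factorial
```

If is_factorial is TRUE, every i was paired with every j, which is what the table of ones shows.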
Simple dependency
Now we can make j dependent on i, but very simply. And we see that the factorial combination is lost: each j maps only to its own i. This is how the loop should work, though it may not be obvious at first glance. The vectors i/10 and seq(from = 0.1, to = 1, by = 0.1) are the same when i is the whole vector 1:10, but inside the nested loop i is a single value, so the first yields only one value per i, while the second yields the whole vector for every i.
simple_dep <- foreach(i = 1:10, .combine = rbind) %:%
  foreach(j = i/10, .combine = rbind) %dopar% {
    thisloop <- tibble::tibble(outer_it = i, inner_it = j)
  }
# simple_dep
         inner_it
outer_it 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9   1
       1   1   0   0   0   0   0   0   0   0   0
       2   0   1   0   0   0   0   0   0   0   0
       3   0   0   1   0   0   0   0   0   0   0
       4   0   0   0   1   0   0   0   0   0   0
       5   0   0   0   0   1   0   0   0   0   0
       6   0   0   0   0   0   1   0   0   0   0
       7   0   0   0   0   0   0   1   0   0   0
       8   0   0   0   0   0   0   0   1   0   0
       9   0   0   0   0   0   0   0   0   1   0
      10   0   0   0   0   0   0   0   0   0   1
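The diagonal pattern follows from how many values the inner iterator sees on each pass of the outer loop. A small sketch that counts them, again using %do% so that it runs without a backend (inner_lengths and full_length are names made up for this example):

```r
library(foreach)

# For each outer value i, the inner iterator i/10 is a length-one vector.
inner_lengths <- foreach(i = 1:10, .combine = c) %do% {
  length(i / 10)
}
inner_lengths

# The independent inner iterator is the full sequence on every pass.
full_length <- length(seq(from = 0.1, to = 1, by = 0.1))
full_length

# Total inner iterations for the dependent loop.
sum(inner_lengths)
```

So the dependent loop runs one inner iteration per i, whereas the independent loop runs full_length of them per i.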
Balanced indexing
Now we’re on to the bit that is tripping up some of my code. I have a list, and want to index through its names and values, though I think the same thing would apply to any indexing.
ballist <- list(a = 1:10, b = seq(from = 0.1, to = 1, by = 0.1), d = 11:20)

list_dep <- foreach(i = names(ballist), .combine = rbind) %:%
  foreach(j = ballist[[i]], .combine = rbind) %dopar% {
    thisloop <- tibble::tibble(outer_it = i, inner_it = j)
  }
# list_dep
         inner_it
outer_it 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9   1   2   3   4   5   6   7   8   9  10  11  12  13  14
       a   0   0   0   0   0   0   0   0   0   1   1   1   1   1   1   1   1   1   1   0   0   0   0
       b   1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0   0   0   0   0   0
       d   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1   1   1
         inner_it
outer_it  15  16  17  18  19  20
       a   0   0   0   0   0   0
       b   0   0   0   0   0   0
       d   1   1   1   1   1   1
That looks right: there are only records for the inner values when they’re present in the corresponding outer list item. We can see that more clearly in the df itself.
Unbalanced indexing
The above should work the same if the list items are different lengths, but let’s check.
unballist <- list(a = 1:5, b = seq(from = 0.1, to = 1, by = 0.1), d = 11:13)

list_dep_unbal <- foreach(i = names(unballist), .combine = rbind) %:%
  foreach(j = unballist[[i]], .combine = rbind) %dopar% {
    thisloop <- tibble::tibble(outer_it = i, inner_it = j)
  }
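One way to check the unbalanced case without reading the output by eye is to rebuild each list item from the result and compare it with the source. This is only a sketch: it swaps in %do% and a plain data.frame so it runs without a backend or {tibble}, and list_check, recovered and matches are names made up for the example.

```r
library(foreach)

unballist <- list(a = 1:5, b = seq(from = 0.1, to = 1, by = 0.1), d = 11:13)

# Same nesting as in the post, but sequential and returning a data.frame.
list_check <- foreach(i = names(unballist), .combine = rbind) %:%
  foreach(j = unballist[[i]], .combine = rbind) %do% {
    data.frame(outer_it = i, inner_it = j)
  }

# Rebuild the inner values for each outer name.
recovered <- split(list_check$inner_it, list_check$outer_it)

# Compare each rebuilt vector with the original list item, value and order.
matches <- vapply(
  names(unballist),
  function(nm) isTRUE(all.equal(as.numeric(unballist[[nm]]), recovered[[nm]])),
  logical(1)
)
matches
```

Any FALSE in matches would mean the inner values for that name were missing, duplicated, or out of order.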
That seems to work how I expect. It’s not clear, then, why I’m getting a shuffling issue in the code that prompted this, but it seems I need to look elsewhere.