Purrr or not

What’s an hour in the life of a graduate student?


I initially resisted functions in R. I started out using for loops. Looking back, that feels like the craziest thing I could have done, but for some reason I couldn’t get my head around functions. Now I write a function for just about anything I use more than twice. I used the apply family for a long time, but then I started using map from purrr a lot. After reading some of the posts at jennybc.github.io, I’ve gotten more comfortable using purrr for a lot of things.

If I have more than one file to load I use purrr:

 dat <- map_df(list.of.files[grep(scenario, list.of.files)], read_tsv) 

So I can search through a list of files and load just the ones I want.
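As a rough, self-contained sketch of that pattern, the snippet below writes a few small TSV files and reads back only the ones that match a scenario. The file names and the temporary folder are invented for illustration, and mtcars stands in for real data.

```r
library(purrr)
library(readr)

# Assumption: a fresh temporary folder with made-up file names.
tmp <- tempfile("purrr_demo")
dir.create(tmp)
write_tsv(head(mtcars, 3), file.path(tmp, "baseline_run1.tsv"))
write_tsv(head(mtcars, 2), file.path(tmp, "baseline_run2.tsv"))
write_tsv(head(mtcars, 4), file.path(tmp, "decline_run1.tsv"))

list.of.files <- list.files(tmp, full.names = TRUE)
scenario <- "baseline"

# Keep only the paths that match the scenario, read each file,
# and row-bind the results into one data frame.
dat <- map_df(list.of.files[grep(scenario, list.of.files)], read_tsv)
nrow(dat)
```

Because grep matches against the full path here, a pattern that also appears in a folder name would match every file; matching against basename(list.of.files) is one way around that.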

I’ve also used it a lot to run bootstraps with custom functions:

scen_prod <- 
  pred_mass %>% 
  mutate(Month = month(md), Day = day(md)) %>% 
  filter(Month != 6 & # Age %in% c("A", "J", "U") &
           ((Month == 7 & Day < 25) |
              (Month == 8 & Day > 5))) %>% 
  rerun(.n = nboot, sample_i = mod.dat.sample(dat = ., catch_n = dailycatch)) %>% 
  map_df(~sumModDat(.x$sample_i, scen = TRUE))

Here I’m resampling data generated from a model to match the structure of capture data from the field.
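The sampling and summary functions in that snippet are project-specific, so here is a minimal sketch of the same resample-then-summarise idea on the built-in mtcars data. resample_rows and summarise_sample are hypothetical stand-ins, not the real functions, and the sketch loops with map_df over seq_len(nboot) instead of rerun, which purrr 1.0.0 deprecated.

```r
library(purrr)
library(dplyr)

set.seed(42)
nboot <- 200

# Hypothetical stand-in for a custom sampler:
# resample the rows of a data frame with replacement.
resample_rows <- function(dat) {
  dat[sample(nrow(dat), replace = TRUE), , drop = FALSE]
}

# Hypothetical stand-in for a custom summary function.
summarise_sample <- function(dat) {
  tibble(mean_mpg = mean(dat$mpg))
}

# Filter once, then draw and summarise nboot bootstrap samples.
filtered <- mtcars %>% filter(cyl != 6)
boot <- map_df(seq_len(nboot), ~ summarise_sample(resample_rows(filtered)))

# Percentile interval for the bootstrapped mean.
quantile(boot$mean_mpg, c(0.025, 0.975))
```

Each iteration returns a one-row tibble, so map_df stacks them into a data frame with one row per bootstrap replicate.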

Most recently I’ve been moving toward what feels like the most programmer-like way of doing things, and the furthest out of my comfort zone: using map on nested data frames. Nested data frames, like functions, are something I had a hard time getting comfortable with. I’ll admit I don’t always see the benefit of using them, but I’m getting in the habit of trying them out when it seems to make sense.

modelRuns <- tibble(
  dat = map_df(list(ressurvey, falcon_arrival, falc_inc, decline_f, decline_f, decline, stable, inc), nest)$data,
  mod_dat = map_df(list(falc_arrival, falc_arrival, falc_pop, wesa_mass, wesa_food, wesa_pop, wesa_pop, wesa_pop), nest)$data,
  bringyourowndat = c(TRUE, rep(FALSE, 7))
) %>% 
  mutate(propPlots = pmap(., predictandplot, returnType = "plot", sepPlots = TRUE, relAbundace = FALSE))

Here I’m creating a nested data frame of different model scenarios and real-life data, and then generating a list-column of plots. Total overkill, I know, but hopefully I learned something here.
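For a version of the nested approach that anyone can run, here is a small sketch on mtcars. The grouping variable, the model formula, and the label function are choices made for illustration only and have nothing to do with the model scenarios above.

```r
library(dplyr)
library(tidyr)
library(purrr)

# One row per cylinder count, with the matching rows tucked into `data`.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    n_cars = map_int(data, nrow),                      # rows in each nested frame
    fit    = map(data, ~ lm(mpg ~ wt, data = .x)),     # one model per group
    slope  = map_dbl(fit, ~ coef(.x)[["wt"]])          # pull one coefficient out
  )

# pmap walks the rows and passes the selected columns as named arguments.
labels <- pmap_chr(
  by_cyl[c("cyl", "n_cars")],
  function(cyl, n_cars) paste(cyl, "cylinders:", n_cars, "cars")
)
```

Swapping the lm call for a plotting function would give a list-column of plots in the same way.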

I know the snippets from my own projects aren’t reproducible. I’ll try to post examples built on the R datasets so that these posts are more than just me reflecting on code that makes sense to no one but me.