Quick Start Guide

When performing UQ, it can hardly be overstated how useful it can be to run the model you are analysing for yourself. Obviously, if you can, as with Metawards, this comes with its own problems and dangers, and you might run the model in such a way as not to get anything useful out. Thanks to the work of our team (TJ McKinley and Chris Fenton in particular), and with help from Metawards developer Christopher Woods, we have a model structure (see here) and a (relatively) stable version of the code for running it for UQ and extracting the data for yourselves!

So why am I writing a “for dummies” guide to using it and not Chris/TJ? Well, because I qualify as a member of the target audience! Today, I went through TJ’s instructions, failed in several places and figured out everything I needed to do to run it, so I figured I would pass that knowledge on.

Instructions for Linux/Mac

As Mac uses Unix and I am on a Mac, the instructions are identical. There will need to be tweaks to the below for Windows users, but what is here should be enough to get you started.

Install Metawards

The easiest thing to do is to open a terminal and type

pip3 install metawards

I used pip3, you may have Python 3 as default and can use pip. The absolute best thing to do is go here and follow the instructions. They are really clear. For this part it doesn’t matter what directory you are in.

Now, you need to head to a directory where you want the Metawards data to live. If not familar with terminal navigation, I used

cd Dropbox/BayesExeter

and you can check if you are happy with what’s in the directory with

ls

Next install the metawards data files with

git clone https://github.com/metawards/MetaWardsData

Get the uq4covid version

Now, we get our version of the model, which consists of a number of files needed to do the running, designing and extracting.

These can be downloaded as a .zip file here.

Alternatively, these are also on the uq4covid.io repository here, which you should clone onto your machine or pull if you have it already.

Once you have it, go to directory vignettes/data/MetaWards and there is everything you need. To ensure we keep the repo version clean, it’s best to copy this folder out to somewhere else. This guide is going to use RStudio to do most things, so head to this folder in Rstudio and set it as your working directory.

Generating a new design

The file convertDesign.R creates new designs to your specification. To run a custom design, you need to open it and change a few things. I’ll highlight the blocks you need to look at now.

First the parameter ranges look like this

parRanges <- data.frame(
    parameter = c("r_zero", "incubation_time", "infectious_time", "hospital_time",
                  "critical_time", "lock_1_restrict", "lock_2_release",
                  "pEA", "pIH", "pIRprime", "pHC", "pHRprime", "pCR", 
                  "GP_A", "GP_H", "GP_C"),
    lower = c(2.5, 4, 2, 4, 4, rep(0, 11)),
    upper = c(4, 6, 4, 12, 12, rep(1, 11)),
    stringsAsFactors = FALSE
)

You can alter lower and upper for any of the parameters (or even fix some if you like) here. See the model description to see what they all do.

Next, generate your own design. The default is a test 5 member LHC and is in the code like this

design <- randomLHS(5, nrow(parRanges))
colnames(design) <- parRanges$parameter
design <- as_tibble(design)

Obviously, change the design to something of your own choosing (remembering at all times how bad random Latin Hypercubes can be), replacing the first of these 3 lines with your own.

The next 2 lines are super important

design$output <- ensembleIDGen(ensembleID = "Ens0", nrow(design))
design$repeats <- 2

The second gives the number of repeats per run so is fairly obvious. Our early experiments showed that at least 20 might be needed for some good techniques to work. The output argument is very important. This is a unique identifier for the ensemble. "Ens0" is the first test ensemble we’ve run. "Ens1" will be the next. Soon there will be "Ens9", "Ensa", "Ensb" and so on. We use “Ens” as Metawards does funny things if you use numbers or some single letters. Ideally, you would let us know you are generating a big ensemble with the next ID for putting on the git page. This identifier will ultimately be responsible for identifying your ensemble from others, and for matching inputs to outputs during extraction. Change it now.

One you have made those changes, run the whole script and you are ready to run metawards!

Running Metawards

There are 2 simple steps to this. First, you need to edit the file runscript.sh. Simply open it with xcode, and alter the line

export METAWARDSDATA=$HOME/Dropbox/BayesExeter/MetaWardsData

so that the directory is the place you cloned the Metawards data above. Now it’s run time! The default is to run for 100 days, but you will probably want longer to look at interesting things. The final part of the long line calling Metawards handles this

--nsteps 100

and you can alter it in obvious ways.

Open up a terminal and navigate via cd to the folder where your runscript.sh is. Once there type

./runscript.sh

and let the fun begin. Even this small test ensemble took about 10 mins on my machine, and it runs on all the cores, so don’t expect to be able to do a tonne whilst it’s running. TJ had a bigger job run all night.

Extracting data

Eventually the fans will cool down and the simulations will be complete. Now you will want to get at some of the outputs. The extractOutput.R file lets you do this. Open this in RStudio.

If you have all the required libraries, you can just run the whole thing and, by default, you will get the total number of infections in hospitals per ward. There is a lot you can do to customise, and details can be found in the Model Description. To get you started, here are some things you can change.

The chunk

## Hinc contains the new cases, so sum these
## over each ward for days 1--100
hosp_db <- filter(compact, day <= 100) %>%
    select(ward, Hinc) %>%
    group_by(ward) %>%
    summarise(Hcum = sum(Hinc))
## collect outcome of query
hosp <- collect(hosp_db)

is taking the database con created from the raw outputs, is selecting the first 100 days from the Hinc stage (which counts the new hospital cases at each day), and summing these up within each ward. Some reference to our model setup is required to see what to do, so maybe head there to see what the stages and demographics refer to in the diagram. Briefly, we have 4 demographics: genpop , hospital , critical, and asymp. The diagram in TJ’s instruction shows where they are, and also contains a description of the outputs. Note that columns with inc appended to their names count the new entries into the class on that day. So Hinc is the number of new hospital cases on a given day, and H is the number of individuals in class \(H\) (including Hinc). For brevity, extraneous stages are removed (for example, E and I only exist for the genpop demographic, so they are not returned for any other demographics since all entries are zero in any case). Entries are only returned for a ward once infection is present, so not all wards will have outputs on day 1 say. See the Model Description for full details.

Having extracted data like this in tibble form, it is ready to be processed for building whatever emulators you like.

Sundries

All of the files for running Metawards are in the different folders in our repo (that you will have cloned onto your machine), so you are free to change things for your own investigations. In the future, when we decide on different experiments we want to do, we will alter these files, but still be able to retain the same designing, running and processing experience.

Happy UQing!