Animal Crossing: New Horizons kept me sane throughout the first Melbourne COVID lockdown. Now, in lockdown 4, it seems right that I should look back at this cheerful, relaxing game and do some data stuff. I’m going to take the Animal Crossing villagers in the Tidy Tuesday Animal Crossing dataset and combine them with survey data from the Animal Crossing Portal, giving each villager a measure of popularity. I’ll use the Google Cloud Vision API to annotate each of the villager…
I went down a strange path recently, trying to compile binaries of R packages for Linux. I’m not sure why — this area is pretty much covered by the RStudio Package Manager. I’ll leave my Dockerfiles here in case they’re of any use to a future wayward R programmer.
I have a simple machine learning workflow that I recreate whenever I’m testing something new. I take some interesting data and a target, throw in some pre-processing, tune hyperparameters with cross-validation, and train a random forest. It’s all the basic ingredients for a machine learning model. Since I like Julia so much, I’ll recreate my simple machine learning workflow with Julia’s
I have a machine learning model that takes some time to train. Data pre-processing and model fitting can take 15–20 minutes. That’s not so bad, but I also want to tune my model to make sure I’m using the best hyperparameters. With 16 different combinations of hyperparameters and 5-fold cross-validation, that’s 80 model fits, so my 20 minutes can become a day or more.
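As a back-of-the-envelope check on that figure, here is a minimal sketch of the arithmetic. It assumes every fit runs one after another with no parallelism and that each fit takes the full 15 or 20 minutes; the function name `grid_search_minutes` is made up for illustration.

```python
def grid_search_minutes(n_combinations, n_folds, minutes_per_fit):
    """Total sequential time for a grid search: one fit per combination per fold."""
    n_fits = n_combinations * n_folds
    return n_fits * minutes_per_fit


# Figures from the text: 16 combinations, 5 folds, 15-20 minutes per fit.
# Sequential execution is an assumption.
low_hours = grid_search_minutes(16, 5, 15) / 60
high_hours = grid_search_minutes(16, 5, 20) / 60

print(f"80 fits take between {low_hours:.1f} and {high_hours:.1f} hours")
```

Under those assumptions the tuning run lands at roughly 20 to 27 hours, which is where "a day or more" comes from.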
AWS has announced support for container images for their serverless computing platform Lambda. AWS doesn’t provide an R runtime for Lambda, and this was the excuse I needed to finally try to make one.