
drake is a package for orchestrating R workflows. Suppose I have some data in S3 that I want to pull into R through a drake plan. In this post I’ll use the S3 object’s ETag to make drake only re-download the data if it’s changed.
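The idea can be sketched as follows. This is a hedged sketch, assuming the aws.s3 and drake packages; the bucket and object names are hypothetical, and the exact attribute holding the ETag in aws.s3's response is an assumption.

```r
# Sketch: use an S3 object's ETag as a drake trigger so the data
# only re-downloads when the object changes. Assumes aws.s3 + drake.
library(drake)
library(aws.s3)

plan <- drake_plan(
  # head_object() fetches metadata without downloading the body.
  # Re-check the ETag on every make(); cheap compared to a download.
  etag = target(
    attr(head_object("data.csv", bucket = "my-bucket"), "etag"),
    trigger = trigger(condition = TRUE)
  ),
  raw_data = {
    etag  # depend on the ETag so a changed object invalidates this target
    s3read_using(read.csv, object = "data.csv", bucket = "my-bucket")
  }
)

make(plan)
```

Because `raw_data` mentions `etag`, drake treats it as a dependency: the download target is outdated only when the ETag value changes.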
After I posted my efforts to use MLflow to serve a model with R, I was worried that people may think I don’t like MLflow. I want to declare this: MLflow is awesome. I’ll showcase its model tracking features, and how to integrate them into a tidymodels model.
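The tracking side can be sketched like this. A minimal sketch, assuming the mlflow R package and a default local tracking store; the parameter and metric names are illustrative, not from the post.

```r
# Sketch: log a model run to MLflow from R. Assumes the mlflow package.
library(mlflow)

with(mlflow_start_run(), {
  # Record hyperparameters for this run (names are hypothetical).
  mlflow_log_param("penalty", 0.01)

  # ... fit a tidymodels workflow here ...

  # Record evaluation metrics so runs can be compared in the MLflow UI.
  mlflow_log_metric("rmse", 0.42)
})
```

Each call is scoped to the active run, so everything logged inside the `with()` block is grouped together in the MLflow UI.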
There’s always a need for more tidymodels examples on the Internet. Here’s a simple machine learning model using the recent coffee Tidy Tuesday data set. The plot above gives the approach: I’ll define some preprocessing and a model, optimise some hyperparameters, and fit and evaluate the result. And I’ll piece all of the components together using targets, an experimental alternative to the drake package that I love so much.
I’m obsessed with how to structure a data science project. The time I spend worrying about project structure would be better spent on actually writing code. Here’s my preferred R workflow, and a few notes on Python as well.
Suppose I want a function that runs some setup code before it runs the first time. Maybe I’m using dplyr but I haven’t properly declared all of my dplyr calls in my function, so I want to run library(dplyr) before the actual function is run. Or maybe I want to install a package if it isn’t already installed, or restore an renv lockfile, or any other setup process. I only want this special code to run the first time my function is called. After that, the function that runs should be…
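One way to get this behaviour is a closure that remembers whether setup has already happened. A minimal base-R sketch, not the post's actual solution; `with_setup` and its arguments are names I've invented for illustration.

```r
# Sketch: wrap a function so that `setup` runs exactly once,
# on the first call. Base R only.
with_setup <- function(f, setup) {
  done <- FALSE
  function(...) {
    if (!done) {
      setup()
      done <<- TRUE  # flip the flag in the enclosing environment
    }
    f(...)
  }
}

greet <- with_setup(
  function(name) paste("Hello,", name),
  function() message("Running one-time setup")
)

greet("world")  # setup message prints, then the greeting is returned
greet("again")  # setup is skipped on every subsequent call
```

The `<<-` assignment updates `done` in the closure's environment rather than creating a local variable, which is what makes the flag persist between calls.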