David Neuzerling

David Neuzerling

Data, Maths, R

Locking down R package dependencies and versions is a solved problem, thanks to the easy-to-use renv package. System dependencies — those Linux packages that need to be installed to make certain R packages work — are a bit harder to manage.

There’s always a need for more tidymodels examples on the Internet. Here’s a simple machine learning model using the recent coffee Tidy Tuesday data set. The plot above gives the approach: I’ll define some preprocessing and a model, optimise some hyperparameters, and fit and evaluate the result. And I’ll piece all of the components together using targets, an experimental alternative to the drake package that I love so much.

I’m obsessed with how to structure a data science project. The time I spend worrying about project structure would be better spent on actually writing code. Here’s my preferred R workflow, and a few notes on Python as well.

Recent posts


Powered by Hugo and hugodown.

This content of this blog is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License except where stated otherwise.