Data Science

Blogs & Slides Only

Buy me a coffeeBuy me a coffee


I separate this section from the statistics one because when I needed to learn how to run statistics in R, I would feel frustrated when the documents I found were not directly telling me about specific tests I need to learn about. It may not be generalizable but it makes sense to me now. Sorry!

Using the tidyverse with Databases

By Vebash Naidoo

What is this?
Excerpt from site: You know R, especially the dplyr 📦. Even though the dplyr 📦 is so well written to mimic the SQL syntax - select(), group_by(), left_join() etc. there is still a cognitive load when you switch between using R syntax, and SQL syntax (ask me, who has often written == in SQL syntax on Athena only to wonder why I am getting an error 🤐).

You only have so much memory in your local environment, and may want your RDBMS to do the heavy lifting (most of the computation), and only pull data into R when you need to (e.g. pull in aggregated data to create plots for a report).

In this tutorial you will learn how to use dbplyr, which is a database back-end of dplyr, to execute queries directly in your RDBMS all the while writing R tidyverse syntax 😮 ⭐.

  1. Blog Part 1 here: https://sciencificity-blog.netlify.app/posts/2020-12-12-using-the-tidyverse-with-databases/
  2. Blog Part 2 here: https://sciencificity-blog.netlify.app/posts/2020-12-20-using-the-tidyverse-with-dbs-partii/