Welcome to `hoboR`: An R package for the analysis of HOBO weather data

hoboR is a package for the analysis of weather station data. Analyzing weather station data can quickly become a big data project, making the task of processing and analyzing data a significant challenge. To facilitate the analysis and manipulation of weather station data, we developed hoboR. hoboR aims to simplify this process, making it less time-consuming for researchers and users of weather data. hoboR offers a series of tools to load multiple CSV files, remove duplicates, and process and summarize the data.

hoboR is a specialized R package designed to streamline the processing of large datasets from HOBO weather stations and data loggers. With hoboR, handling .csv files generated by various HOBO models or other weather station data becomes effortless. This package analyzes any microclimate measurements such as temperature, relative humidity (RH), dew point, and radiation. Additionally, hoboR is versatile in accepting data in multiple date formats, including Day/Month/Year (DD/MM/YYYY), Month/Day/Year (MM/DD/YYYY), and two digits Year/Month/Day (YY/MM/DD), ensuring flexibility across global time formats.

HoboR Components and Other Examples

HOBOR components

HOBOR Callibration

An example to implement weather data

Manuscript–> HOBOR: An R package for the analysis of HOBO weather data

Install hoboR

hoboR installation via devtools, and in the process of submit it to CRAN.

First, install devtools and dependency libraries

install.packages("devtools")
library("devtools")
 devtools::install_github("LeBoldus-Lab/hoboR", force = TRUE)
library(hoboR)

Required dependencies

library(lubridate)
library(tidyr)
library(dplyr)
library(reshape2)
library(ggplot2)
library(scales)

How to use hoboR

For readers

hoboR is an R package that processes CSV files from HOBO weather stations and data loggers. The best way to start your project with hoboR is to organize the downloaded CSV files from HOBO in a single directory. Using the hobobinder() function, all the CSV files will be bound in a large data frame. For example, if you have 10 locations, you should have 10 folders, each containing all the CSV. This group of weather records normally includes duplicate entries that hobocleaner() can deal with and sort duplicates out. Most duplicates are generated by replacing batteries or retrieving data, etc. This clean data frame can be summarized by time and the sensors of the model device, like temperature, relative humidity and precipitation, mean and standard deviation, or mins and max records by implementing meanhobo().

The HOBO sensors might fail due to environmental issues and other issues. To identify such mail functionality, a series of functions to identify impossiblevalues() and sensorfailures() are available. Additional functionality for subsetting by time intervals using hoborange(), and snapshots of time intervals using timestamp(). A calibration guide for HOBO data loggers for microclimate experiments can be accessed and evaluated with calibrator() and correction() functions to adjust the HOBO devices at a baseline for all the HOBO loggers, crucial for microclimate studies. A couple of customized functions are also available, an example of the analysis of a buck bait experiment to investigate the incidence of sudden oak death (SOD) disease and environmental changes in Southern Oregon.

For coders

# load the library
library(hoboR)

Example:

The data was collected in China Creek in Southern Oregon using a weather station. The measurements were recorded every minute over a period of 5 months, from August to December 2022. The weather variables collected were humidity (Wetness), temperature (Temp), relative humidity (RH), and rain (Rain).

# Add the PATH to your sites for weather data (from HOBO)
path = paste0("~/Desktop/Adam/site_12_date_adj2/")
# make sure the path to your CSV files exists
file.exists(path)     # this will return a logical value TRUE

Confirm that the path exist, then proceed with binding all the csv files, and clean the data.

After merging, all records are present, including duplicate entries. The hobocleaner() function cleans the duplicate entries, and renames the columns. The format argument should match the HOBO format type: “ymd” for YYYY/MM/DD, “myd” for MM/YYYY/DD”, and “yymd” corresponds to two digits year YY/MM/DD. Be mindful with your format selection, otherwise proceed with caution.

# loading all hobo files
hobofiles <- hobinder(path, skip = 1)
hobocleaned <- hobocleaner(hobofiles, format = "yymd")
head(hobocleaned)

Let’s summarize the data every 30 min, and get the means for 1 day or get the mean every 24 hours

Note: that the original data was recorded every minute.

# getting hobo mean summary by time
hobot5 <- hobotime(hobocleaned, summariseby = "30 mins", na.rm = T)
hobomeans5 <- meanhobo(hobot5, summariseby = "1 day",  na.rm = T)
head(hobomeans5)

# getting hobo means by date
hobomeans <- meanhobo(hobocleaned, summariseby = "24 h",  na.rm = T)
head(hobomeans)

The clean data can be aggregated by time interval, e.g., "5 mins", "12 h", "1 day", etc., the hobotime() function or obtaining the mean, the minimum and maximum, and other summary statistics by implementing meanhobo().

Check the difference between both methods, summarizing and getting the mean, or mean only.

plot(1:nrow(hobomeans), hobomeans$x.Temp, type = "line")
lines(1:nrow(hobomeans5), hobomeans5$x.Temp, type = "line", col = "red")

If you want to select a time range, you can specify the date intervals using hoborange. Just specify the starting and end dates.

# Specify a window range 
timerange <- hoborange(hobocleaned, start="2022-08-08", end="2022-12-12")
head(timerange)

Check the variability every 12 hours, at midnight and noon for 100 days.

# Snapshot of a time interval 
a <- timestamp(hobocleaned, stamp = "2022-08-05 00:01", by = "24 hours",
          days = 100, na.rm = FALSE, plot = F, var = "Temp")
a$Group <- rep("night", nrow(a))
b <- timestamp(hobocleaned, stamp = "2022-08-05 12:01", by = "24 hours",
               days = 100, na.rm = FALSE, plot = F, var = "Temp")
b$Group <- rep("day", nrow(b))

daynight <- rbind(a, b)

Plot with ggplot2

ggplot(daynight, aes(x = Date, y = Temp, group = Group, color = Group)) +
  geom_line() +
  scale_x_datetime() +
  scale_y_continuous(limits = c(0, 30)) +
  scale_color_manual(values = c("orange", "black")) +
  labs(color = "Source") +
  scale_y_continuous(name = "Temperature °C")+
  scale_x_datetime(name = "Months")+
  theme_minimal()

weather 1var

Fig. 1) Visualization of the summary results calculated with hoboR of the weather recorded between October 2021 and December 2021, in Brookings, Southern Oregon.

# two vars
ggplot(hobocleaned, aes(x=as.POSIXct(Date))) +
  geom_line( aes(y=Temp, col = "red"), alpha = 0.5) + 
  geom_line( aes(y= Wetness, col = "blue"), alpha = 0.5) + 
  scale_y_continuous(
    # Features of the first axis
    name = "Temperature °C",
    # Add a second axis and specify its features
    sec.axis = sec_axis(~., name="Humidity")
  ) +
  labs(title = "Temperature: August - December, 2021", color = "Legend") +
  scale_color_manual(labels = c("Humidity", "Temp"), values = c("blue", "red")) +
  scale_x_datetime(name= "Date", labels = date_format("%Y-%m-%d"))+
  theme_bw()

weather 2var

Fig. 2) Visualization of the summary statistics of two weather variables (temperature and humidity) in Southern Oregon from October to December 2021.

Thank you for exploring the hoboR package. This tool is designed to facilitate statistical analyses and visualizations for ecology research. We will continue improving hoboR, incorporating new findings and feedback from the community to enhance its utility and functionality.

Acknowledgements: Work funded by the National Science Foundation (NSF) and the United States Department of Agriculture (USDA).

Further Support For further support or to contribute to the project, please visit our GitHub repository.

Welcome to hoboR: An R package for the analysis of HOBO weather data