Historical Data Bulk Downloads

This thread has a link to a generalized version of the script I use to pull the last two weeks of hourly data for the sensors we have deployed.

Also, here is a snippet that loops over request periods (two weeks of hourly data per request) for a single sensor, instead of looping over sensor indexes as the script I linked does:

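#httr provides GET(), add_headers() and content(); lubridate provides days()
library(httr)
library(lubridate)

#API keys assigned by PurpleAir support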
read_key <- "your key here"
write_key <- "your key here"

#request range: start date is the beginning of last year, end date is today
#(both are hard-coded here, so set them to your own dates)
start_date <- as.POSIXct("2022-01-01")
end_date <- as.POSIXct("2023-02-14")

while(start_date <= end_date) {
  #define end of 2 week period, max request period for 1hr data
  end_period <- start_date + days(14)
  
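  #format start and end times as UNIX timestamps
  #sprintf avoids scientific notation (e.g. 1.7e+09) when pasted into the URL
  start_timestamp <- sprintf("%.0f", as.numeric(start_date))
  end_timestamp <- sprintf("%.0f", as.numeric(end_period))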
  
  #substitute start_timestamp and end_timestamp into request URL
  #replace "sensorindex" with your actual sensor index
  url <- paste0("https://api.purpleair.com/v1/sensors/sensorindex/history/csv?start_timestamp=", start_timestamp, "&end_timestamp=", end_timestamp, "&average=60&fields=pm2.5_cf_1_a%2Cpm2.5_cf_1_b%2Cpm2.5_atm_a%2Cpm2.5_atm_b%2Cpm2.5_alt_a%2Cpm2.5_alt_b%2Chumidity%2Ctemperature%2Cpressure%2Cuptime%2Crssi%2Cpa_latency%2Cmemory")
  
  #send API request and write to text file
  data <- GET(url,add_headers('X-API-Key'=read_key))
  data <- content(data, "raw")
  #paste0 avoids the stray "_" that sep="_" would put before ".txt"
  writeBin(data, paste0("sensorindex_", start_timestamp, "_", end_timestamp, ".txt"))

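  #start the next period one second after this one ends
  start_date <- end_period + 1
  flush.console() #flushes buffered console output before the pause
  Sys.sleep(60) #wait 60 seconds between requests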
}
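
A side note on the loop above: the request windows can also be worked out up front, which makes it easy to check how many requests will be sent before any are made. This is a minimal sketch in base R, not part of the script I linked. make_windows and build_url are hypothetical helper names, the field list is shortened for illustration, and the windows here are contiguous and clamped to the end date rather than offset by one second.

```r
# Hypothetical helper: split [start, end] into consecutive windows of at most
# days_per_request days, expressed as UNIX timestamps (seconds).
make_windows <- function(start, end, days_per_request = 14) {
  stopifnot(as.numeric(start) < as.numeric(end))  # assumes a non-empty range
  step <- days_per_request * 24 * 60 * 60         # window length in seconds
  starts <- seq(as.numeric(start), as.numeric(end) - 1, by = step)
  ends <- pmin(starts + step, as.numeric(end))    # clamp the last window
  data.frame(start_timestamp = starts, end_timestamp = ends)
}

# Hypothetical helper: build the history request URL for each window.
# "sensorindex" stays a placeholder, as in the loop above.
build_url <- function(sensor_index, start_timestamp, end_timestamp, fields) {
  paste0(
    "https://api.purpleair.com/v1/sensors/", sensor_index, "/history/csv",
    "?start_timestamp=", sprintf("%.0f", start_timestamp),
    "&end_timestamp=", sprintf("%.0f", end_timestamp),
    "&average=60",
    "&fields=", paste(fields, collapse = "%2C")  # comma-separated, URL-encoded
  )
}

windows <- make_windows(as.POSIXct("2022-01-01", tz = "UTC"),
                        as.POSIXct("2023-02-14", tz = "UTC"))
urls <- build_url("sensorindex", windows$start_timestamp, windows$end_timestamp,
                  fields = c("pm2.5_atm_a", "pm2.5_atm_b", "humidity"))
nrow(windows)  # number of requests the loop would send
```

Each element of urls can then be passed to GET() in turn, with the same Sys.sleep(60) pause between requests.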