Python script to retrieve historical data

Hi everyone! I hope you’re doing well. I’m still new to this group so I am sorry in advance if this has been answered before. Does anyone know of an existing Python script that uses the new API to build a data file of all historical data for a given list of sensors? If not, does anyone have an archive of historical data for many sensors that they would be willing to share?

I am working on a project that will require long-term historical data for several sensors. Any feedback is greatly appreciated. Thank you so much.

–Ben Weitz


import datetime
import json
from json import JSONDecodeError

import requests

key_read = '1111'  # your API key
sensor = '13327'  # sensor number


def get_purple_history_to_json(start_date, end_date, json_file):
    """Fetch the sensor's history between two Unix timestamps and save it as JSON."""
    root_url = f'https://api.purpleair.com/v1/sensors/{sensor}/history'

    my_header = {'X-API-Key': key_read, 'Content-Type': 'application/json'}
    # Timestamps are sent as whole seconds; fields are comma-separated without spaces
    my_params = {'start_timestamp': int(start_date), 'end_timestamp': int(end_date),
                 'fields': 'humidity,temperature,pm2.5_atm,pm2.5_alt'}
    response = requests.get(
        root_url,
        headers=my_header,
        params=my_params,
    )
    try:
        response_json = response.json()
    except JSONDecodeError:
        # Nothing to write if the body is not valid JSON
        print('Response could not be decoded as JSON')
        return

    with open(json_file, 'w') as fout:
        json.dump(response_json, fout)


if __name__ == '__main__':
    start = input('Enter start date as YYYY M D: ')
    start_string = '-'.join(start.split())
    start = [int(num) for num in start.split()]
    # start_epoch = (datetime.datetime(*start) - datetime.timedelta(minutes=20)).timestamp()
    start_epoch = datetime.datetime(*start).timestamp()

    end = input('Enter end date as YYYY M D: ')
    end_string = '-'.join(end.split())
    end = [int(num) for num in end.split()]
    # Add one day so that the end date itself is included
    end_epoch = (datetime.datetime(*end) + datetime.timedelta(days=1)).timestamp()

    json_file = f"purple_air_{start_string}_{end_string}.json"
    get_purple_history_to_json(start_epoch, end_epoch, json_file)
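
If a flat data file is easier to work with than raw JSON, the saved response can be converted to CSV. This is only a sketch: it assumes the history response carries a "fields" list of column names and a "data" list of rows in the same order, and that "time_stamp" is among the fields. The function name history_to_csv and the sample values are made up for illustration.

```python
import csv
import io


def history_to_csv(history, out):
    """Write a history payload to `out` as CSV, oldest reading first."""
    fields = history["fields"]
    rows = history["data"]
    # Assumption: rows may arrive unordered, so sort by time_stamp when present
    if "time_stamp" in fields:
        idx = fields.index("time_stamp")
        rows = sorted(rows, key=lambda row: row[idx])
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(rows)
    return len(rows)


# Made-up sample in the assumed response shape, not real sensor data
sample = {
    "fields": ["time_stamp", "humidity", "temperature"],
    "data": [[1672534800, 40, 68], [1672531200, 41, 67]],
}
buffer = io.StringIO()
count = history_to_csv(sample, buffer)
print(buffer.getvalue())
```

To use it on a saved file, load the file with json.load and pass an open file object (opened with newline='') in place of the StringIO buffer.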

This is kind of a rough-and-ready script.

As I recall, I could only retrieve 3 days’ worth of data at a time. In fact, I think it was three days less one reading.
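
One way to work around a per-request limit like that is to split the full date range into short windows and make one request per window. The sketch below only does the splitting. The 3-day default comes from my recollection above, not from the API documentation, and split_into_windows and max_days are hypothetical names.

```python
import datetime


def split_into_windows(start, end, max_days=3):
    """Yield (window_start, window_end) pairs that together cover start..end."""
    step = datetime.timedelta(days=max_days)
    window_start = start
    while window_start < end:
        # The last window is cut short so it never runs past `end`
        window_end = min(window_start + step, end)
        yield window_start, window_end
        window_start = window_end


windows = list(
    split_into_windows(datetime.datetime(2023, 1, 1), datetime.datetime(2023, 1, 8))
)
for window_start, window_end in windows:
    # Each pair could be passed to get_purple_history_to_json as epoch seconds
    print(int(window_start.timestamp()), int(window_end.timestamp()))
```

Each window would need its own output file name, and if the limit really is three days less one reading, a slightly smaller window may be needed.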

Good luck with your project!

This may be more than you are looking for, but RAMADDA (https://ramadda.org/) is a comprehensive data management system. It has support for collecting and managing Purple Air data. I would put more links in this reply, but I am restricted to 2.
For example - Purple-Catalina

Or a collection of sensors -

It is server software, but it runs pretty much anywhere. You just need Java. While it is running, it will collect the real-time data, and it also has an interface for accessing the historical data.

However, the limits on the Purple Air API side on how much data can be downloaded still apply, e.g., you can only get a day or 2 of real-time data.

You will need to get an API key from Purple Air to collect the real-time data, and you also need to get permission from them to access the historical data.


Check out this project here: GitHub - carlkid1499/purpleair_data_logger: A logger that will query Purple Air sensor(s) for data. That data can then be stored in a PostgreSQL database, CSV files, or a SQLite3 database. Find this package on PyPI: https://pypi.org/project/purpleair-data-logger/#history You can grab your own sensor here: https://www2.purpleair.com/

It is able to query live data from one or many sensors into CSV, SQLite3 DB or PSQL DB.
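
To give a rough idea of what storing readings in SQLite3 involves, here is a minimal sketch using only Python's standard library. It is not the package's API or its schema: the table layout, the store_readings function, and the sample values are all assumptions made for illustration.

```python
import sqlite3


def store_readings(conn, readings):
    """Insert (sensor_index, time_stamp, pm25) tuples, skipping duplicates."""
    # Hypothetical schema: one row per sensor per timestamp
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "sensor_index INTEGER, time_stamp INTEGER, pm25 REAL, "
        "PRIMARY KEY (sensor_index, time_stamp))"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO readings VALUES (?, ?, ?)", readings
        )


conn = sqlite3.connect(":memory:")  # use a file path to keep the data
# Made-up readings, not real sensor data
store_readings(conn, [(13327, 1672531200, 5.1), (13327, 1672534800, 4.8)])
# Re-inserting an existing reading is ignored thanks to the primary key
store_readings(conn, [(13327, 1672534800, 4.8)])
row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(row_count)
```

The primary key on sensor and timestamp means a polling loop can re-fetch overlapping data without creating duplicate rows.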

The historic API calls have been implemented, but at the moment they are blocked by Purple Air. You can reach out to them via email to request large amounts of data. Here is their contact information.

contact@purpleair.com