R Script for performing monitor checks

As mentioned in my other post, I’ve written a script in RStudio to compare the A and B channels of one or more sensors and check for agreement. This is very helpful if, like me, you have to babysit a bunch of sensors that are running all the time.

The script uses the “secondary” PA data (specifically the counts of particles >0.3 µm), which I’ve read is most closely related to the actual scattering signal from the Plantower sensors. Unlike the mass concentration values included in the “primary” data, these values actually go all the way down to zero, so they work much better for low-concentration time periods.

The script will generate plots of A vs B channel values on a log-log scale, along with a table that includes the slope and y-intercept of the fit (after removing points where the A and B channels disagreed significantly). The table also includes a max value and the fraction of data points where the channels disagreed or registered zero. Lastly, there is a yes/no column that indicates whether the monitor is functioning properly.
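For anyone who wants to see the idea before opening the script, here is a rough sketch of the kind of summary described above. It is not the posted script: the function name `summarise_ab`, the column names, and the percent-difference definition (difference relative to the mean of the two channels) are all assumptions made for illustration, and the disagreement and zero fractions are reported separately here.

```r
# Hypothetical helper (not the posted script): summarise A/B agreement
# for one sensor. `a` and `b` are paired numeric vectors of readings.
summarise_ab <- function(a, b, max_pct_diff = 70) {
  # Drop pairs where either channel is missing or non-finite.
  ok <- is.finite(a) & is.finite(b)
  a <- a[ok]
  b <- b[ok]

  is_zero <- a == 0 | b == 0

  # Assumption: percent difference is taken relative to the channel mean.
  mean_ab <- (a + b) / 2
  pct_diff <- ifelse(mean_ab > 0, 100 * abs(a - b) / mean_ab, 0)
  disagree <- pct_diff > max_pct_diff

  # Fit on the log-log scale using only agreeing, non-zero points
  # (log10 is undefined at zero).
  keep <- !is_zero & !disagree
  x <- log10(a[keep])
  y <- log10(b[keep])
  fit <- lm(y ~ x)

  data.frame(
    slope = unname(coef(fit)["x"]),
    intercept = unname(coef(fit)["(Intercept)"]),
    max_value = max(a, b),
    frac_disagree = mean(disagree),
    frac_zero = mean(is_zero)
  )
}

# Made-up readings: one zero pair and one pair in strong disagreement.
a <- c(0, 10, 100, 1000, 50, 200)
b <- c(0, 11, 110, 1100, 500, 220)
res <- summarise_ab(a, b)
print(res)
```

Because the fit is done on log10 values, a slope near 1 with an intercept near 0 means the two channels track each other across the whole concentration range.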

The metrics used here are somewhat arbitrary. The disagreement criterion, a 70% difference between the channel A and B values, is taken from USEPA methodology and can be altered as needed. The y-intercept, slope, disagreement, and percent-zero thresholds used for the yes/no column are just the ones that I use for my study and can also be changed to suit your purposes.
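To make the "change it to suit your purposes" part concrete, here is a minimal sketch of how a yes/no decision could be driven by a list of adjustable limits. The function name `monitor_ok` and every number in `default_limits` are placeholders invented for this example; they are not the values used in the script or in my study.

```r
# Placeholder limits for illustration only; replace with your own.
default_limits <- list(
  slope_range = c(0.8, 1.2),   # acceptable slope of the log-log fit
  max_abs_intercept = 0.2,     # acceptable |y-intercept| of the fit
  max_frac_disagree = 0.10,    # acceptable fraction of disagreeing points
  max_frac_zero = 0.10         # acceptable fraction of zero readings
)

# Hypothetical helper: returns "yes" only if every check passes.
monitor_ok <- function(slope, intercept, frac_disagree, frac_zero,
                       limits = default_limits) {
  checks <- c(
    slope = slope >= limits$slope_range[1] && slope <= limits$slope_range[2],
    intercept = abs(intercept) <= limits$max_abs_intercept,
    disagreement = frac_disagree <= limits$max_frac_disagree,
    zeros = frac_zero <= limits$max_frac_zero
  )
  # isTRUE() treats a missing metric (NA) as a failed check.
  if (isTRUE(all(checks))) "yes" else "no"
}

good <- monitor_ok(1.02, 0.04, 0.03, 0.01)
bad_slope <- monitor_ok(1.50, 0.04, 0.03, 0.01)

# Tighten one limit without retyping the others.
strict <- modifyList(default_limits, list(max_frac_zero = 0.005))
good_strict <- monitor_ok(1.02, 0.04, 0.03, 0.01, limits = strict)
```

Keeping the limits in one list means a different study can swap in its own thresholds without touching the decision logic.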

Hopefully this will be helpful for some folks. Feel free to alter it for your own purposes. Sorry for the lack of documentation in the script. If you have any questions, feel free to ask.

-Aaron

Nicely done, thanks for this.

Do you have a source for the 70% difference threshold you attribute to EPA? I’m interested in any other QC guidance that document might contain.

Hey Troy,

We got that number when speaking to some EPA folks a while back, or when listening to one of their presentations. I did just look it up though, and it appears they settled on 61% OR 5 µg/m³. That being said, they also state that they did not attempt to optimize their cleaning procedure, so it does appear to be a bit arbitrary. I know from my own experience that using a much lower threshold like 30% will remove a significant number of data points. I’ve found that the 70% threshold, along with averaging the two channels, works fairly well without overcleaning.
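If you want to try different thresholds on your own data, something like the sketch below might help. It is only an illustration: `clean_and_average` is a made-up name, the percent difference is assumed to be relative to the mean of the two channels, and since I have not spelled out how a percent limit and an absolute limit should combine, that choice is left as the `require_both` argument. Check the EPA documentation before relying on either setting.

```r
# Hypothetical helper: average the two channels, returning NA where
# they disagree by more than the chosen limits.
clean_and_average <- function(a, b, max_pct_diff = 70,
                              max_abs_diff = NULL, require_both = FALSE) {
  abs_diff <- abs(a - b)
  mean_ab <- (a + b) / 2
  # Assumption: percent difference is relative to the channel mean.
  pct_diff <- ifelse(mean_ab > 0, 100 * abs_diff / mean_ab, 0)

  bad <- pct_diff > max_pct_diff
  if (!is.null(max_abs_diff)) {
    bad_abs <- abs_diff > max_abs_diff
    # require_both = TRUE drops a point only if it fails both limits.
    bad <- if (require_both) bad & bad_abs else bad | bad_abs
  }
  ifelse(bad, NA_real_, mean_ab)
}

# Made-up channel readings in µg/m³.
a <- c(1, 10, 20, 40, 8, 100)
b <- c(3, 14, 21, 90, 8, 110)

removed_70 <- sum(is.na(clean_and_average(a, b, max_pct_diff = 70)))
removed_30 <- sum(is.na(clean_and_average(a, b, max_pct_diff = 30)))
removed_either <- sum(is.na(clean_and_average(a, b, 61, max_abs_diff = 5)))
removed_both <- sum(is.na(clean_and_average(a, b, 61, max_abs_diff = 5,
                                            require_both = TRUE)))
```

Comparing the `removed_*` counts on a real dataset is a quick way to see how aggressive each cleaning rule is before committing to one.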
