How the Data is Collected
The most important part of our work is to base our reasoning and conclusions on measurements of observables. By that we mean examining quantities such as flow rate, level and temperature using measurements made with calibrated instruments. The SPORE group does not have the financial resources to deploy instrumentation, and hence we rely on data measured by other parties, mostly government entities. This section provides some background on the data that we use and report on this website.
There are four data types that we track and graph: Lake Specific (height, power plant status), Streams (height, flow rate), Weather (pressure, temperature, relative humidity, dew point temperature, wind speed, wind direction) and Other. Charts and graphs are produced automatically every morning and uploaded to the website between 5 am and 5:30 am.
We obtain the data about the level of the lake and the status of the generator (ON/OFF) from the deepcreekhydro website. We have written a Ruby script that reads the deepcreekhydro home page every 10 minutes and extracts the data that is displayed in the "Lake Status" box. The 10-minute interval was initially chosen rather arbitrarily to provide the level of detail that we thought we needed. We have subsequently found out that this information is updated every 5 minutes, but we continue to process it every 10 minutes.
Our record taking started on Dec 22, 2011. The data is taken by two separate computers, located a couple of miles apart. Power outages and other malfunctions or interruptions have caused some gaps in the data, but they are relatively minor in the overall scheme of things.
The raw lake level data contains glitches in about 2 percent of the points (reason unknown). Generally it is only one bad point at a time, but sometimes there are several in a row. The bad points have been replaced by linearly interpolating between the adjacent good values. The resulting file contains the following fields: sequence #; elapsed time (hrs); lake level (ft asl); generator status (Off=0, On=1); date of gage reading (day/month/year); time of gage reading (hours:minutes).
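The interpolation step can be illustrated with a minimal Ruby sketch. This is not the script used for this website. It assumes the bad points have already been flagged (here as nil, since how they are detected is not described above), and the method name interpolate_bad_points and the sample values are made up for illustration.

```ruby
# Replace flagged bad readings (nil) by linear interpolation between the
# nearest good values on either side. Runs of several bad points in a row
# are handled because each one is interpolated against the same good pair.
# times  : elapsed time in hours for each reading
# levels : lake level in ft asl, with nil marking a bad reading
def interpolate_bad_points(times, levels)
  result = levels.dup
  good = (0...levels.length).reject { |i| levels[i].nil? }

  levels.each_index do |i|
    next unless levels[i].nil?

    left  = good.select { |g| g < i }.last  # nearest good point before i
    right = good.find { |g| g > i }         # nearest good point after i
    next if left.nil? || right.nil?         # no neighbour on one side: leave the gap

    frac = (times[i] - times[left]).to_f / (times[right] - times[left])
    result[i] = levels[left] + frac * (levels[right] - levels[left])
  end

  result
end

# Illustrative values only, not real measurements.
p interpolate_bad_points([0.0, 0.5, 1.0], [2461.0, nil, 2462.0])
# prints [2461.0, 2461.5, 2462.0]
```

Interpolating against elapsed time rather than the sequence number keeps the result sensible when a gap in the record makes the spacing between readings uneven.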
The USGS deploys a number of gages in nearby streams that monitor the flow-rate of the stream, its level, the temperature of the water and precipitation. Not all locations measure all four variables. We extract the data daily from their respective sources on the web. The USGS measurements are typically made every 15 minutes.
Weather data is obtained from several sources. The primary source is Garrett County Airport. Its gages are checked and calibrated quarterly by NOAA, and are therefore very reliable. The only important piece of information missing is precipitation! Garrett College also has a weather station that provides reasonable data. Its precipitation data from before August 2012 is somewhat suspect because of the intermittent operation of the gage and its shielded location. This has now been rectified and the station should provide reasonable data.
When we find other relevant data we add it in this section of the website. For example, we found "evaporation" data for Savage Lake, which is very relevant to Deep Creek Lake even though it dates back many years. This data will be shown here.
All automatic data collection and processing is done on two Apple computers, located a few miles apart. The primary one, and the one that does all of the post-processing, is an Apple Mac Pro running OS X 10.7.4. The data collection and data processing are done with Shell scripts, Ruby scripts and R scripts. Scheduling of the operations is done using Apple's iCal utility, which can execute AppleScripts, which in turn can command functions in Apple's Automator workflow application. The Shell scripts activate the Ruby and R scripts, which download the data, process it to correct glitches, convert it into appropriate units and data structures, and generate the necessary graphics and web pages. Automator interacts with Fetch, an ftp application, which uploads the web pages with the updated data onto the web server. All of the operations for the 23 charts together take less than a minute. The various processes are slightly spaced apart so that buffers can be completely flushed before the next step is activated.
Some of the charts are produced using the JavaScript methods developed by HighCharts. It takes a little learning, but the system produces high quality charts, as can be seen on their website and on this one.
Other charts are produced by R scripts specifically developed for this website. Check out the "bathymetry" work that we have done. It uses R [version 2.15.2 (2012-10-26)] exclusively to produce the bathymetric maps.
The source of data for the Lake Specific charts is the deepcreekhydro website.
The source of data for the Stream Gages charts is USGS, for sites: Hoyes Run => 03076100, Poland Run => 03075800, Cherry Creek => 03075905, Bear Creek => 03076600, Oakland => 03075500, Friendsville => 03076500.
The source of data for the Weather charts is Garrett County Airport (K2G4).
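For readers who want to look up the stream gage data themselves, the USGS site numbers listed above can be kept in a small lookup table. The Ruby sketch below is illustrative only and is not the script used for this website. The names USGS_SITES and usgs_iv_url are hypothetical. The URL assumes the public USGS instantaneous values web service, with parameter codes 00060 (discharge) and 00065 (gage height), and it may not be the source our scripts actually read.

```ruby
require "uri"

# USGS site numbers as listed above. They are stored as strings so that
# the leading zero is preserved.
USGS_SITES = {
  "Hoyes Run"    => "03076100",
  "Poland Run"   => "03075800",
  "Cherry Creek" => "03075905",
  "Bear Creek"   => "03076600",
  "Oakland"      => "03075500",
  "Friendsville" => "03076500"
}.freeze

# Build a request URL for one site (assumed endpoint, see the note above).
# Raises KeyError if the site name is not in the table.
def usgs_iv_url(site_name, parameter_codes = %w[00060 00065])
  query = URI.encode_www_form(
    sites: USGS_SITES.fetch(site_name),
    parameterCd: parameter_codes.join(","),
    format: "rdb"
  )
  "https://waterservices.usgs.gov/nwis/iv/?#{query}"
end

puts usgs_iv_url("Hoyes Run")
```

Not every site measures every variable, so a request for a parameter that a gage does not record may simply return no data for it.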