We also evaluate the impact of using different amounts of training data and calibration durations.

Measuring and predicting temperature accurately is challenging due to variation across farm micro-climates, where the local temperature can deviate from that of the surrounding area, which is typically measured at mesoscale. Measuring temperature for a large number of micro-climates on a farm can be prohibitively expensive with extant weather stations and sensors. In this chapter, we explore the use of sensor synthesis to estimate outdoor temperature on farms using the processor temperature of simple, inexpensive single-board computers or micro-controllers, such as those in the Arduino family (Arduino). Our approach estimates outdoor temperature from the on-board processor temperature sensor that these devices support and which is available via their respective hardware/software interfaces. Such devices cost around $5, are battery or solar-powered, and can be packaged in small, inexpensive, weatherproof enclosures, making them practical for use in moderate and large scale geographic deployments. To investigate how well the processor temperature of these devices can be used to predict outdoor temperature, we have developed an on-farm IoT system in which we place single-board computers in-situ throughout the farm. The devices transmit measurements of CPU temperature wirelessly to wall-powered, indoor, edge cloud systems (Elias et al.). We first calibrate the device CPU temperature against a co-located, high-quality temperature sensor using linear regression.

We then remove the temperature sensor at each remote location. The edge cloud computes a prediction of outdoor temperature for each device/location for each CPU measurement that it receives from the device. It does so by applying the regression coefficients from the calibration period to the CPU temperature measurement. To account for autocorrelation in the time series, we investigate the use of Singular Spectrum Analysis (SSA) (Golyandina & Zhigljavsky) to extract a smooth “signal” from the data prior to performing linear regression, and compare this approach to performing regression without smoothing. Finally, we integrate different outdoor temperature sources, including device-attached sensors, high-end, on-farm weather stations, and remote Weather Underground stations. We first consider two different configurations. The first is a “limit study” in which we continuously update the regression coefficients using a co-located temperature sensor to compute a one-step-ahead prediction. This configuration represents an upper bound on the efficacy of predicting outdoor temperature from processor temperature. Using a second configuration, we consider a practical application of our approach in which the edge cloud estimates the outdoor temperature using information from the initial calibration period and the CPU temperature measurements reported by the device every 5 minutes. Next, because sensor synthesis is based on computed estimates rather than actual measurement, it introduces the possibility of additional error beyond measurement error. To address this, we examine how a larger ensemble of measurements improves the accuracy of “synthetic” temperature measurement while, at the same time, not requiring the use of powerful computational resources.
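To make the calibrate-then-predict step concrete, the following is a minimal sketch of an ordinary least-squares fit of outdoor temperature against CPU temperature. The readings, helper names, and values below are hypothetical and illustrative, not taken from our deployment:

```python
import numpy as np

def calibrate(cpu_temps, outdoor_temps):
    """Fit outdoor = a * cpu + b by ordinary least squares."""
    a, b = np.polyfit(cpu_temps, outdoor_temps, 1)
    return a, b

def predict(cpu_temp, a, b):
    """Apply the calibration coefficients to a new CPU reading."""
    return a * cpu_temp + b

# Hypothetical calibration-period readings (degrees Fahrenheit)
cpu = np.array([95.0, 98.0, 101.0, 104.0, 107.0])
outdoor = np.array([58.0, 60.0, 62.0, 64.0, 66.0])

a, b = calibrate(cpu, outdoor)
estimate = predict(100.0, a, b)  # outdoor estimate for a new CPU reading
```

After the co-located sensor is removed, only the prediction step runs for each 5-minute CPU report, reusing the coefficients from the calibration period.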

Reducing the prediction error is not only academically interesting; precision has a direct impact on the cost and efficiency of what has become known as precision agriculture or precision farming. In precision agriculture, farmers use technology to increase the efficiency of farming techniques, increasing crop yields and reducing costs. Having more precise temperature data reduces the cost of frost prevention and prevents excessive resource use without negatively impacting crop production. Consequently, we believe that our approach can contribute to improved farming outcomes, enable water and energy savings, and help reduce carbon emissions, by providing high-quality data to data-driven, IoT-based agricultural applications. We then extend our approach to use a combination of processor temperatures from multiple devices and outdoor temperature from high-quality, remote weather stations to train a multiple linear regression model. We use this model to estimate the future outdoor temperature at a particular device location that is not part of the model. We also investigate the efficacy of computationally simple smoothing techniques to reduce noise, and how well our approach performs when the processors on the devices experience load. Load may affect processor temperature and thus negatively impact the accuracy of our outdoor temperature estimates. To address this, we develop techniques that successfully deal with the perturbations caused by load variability, an important requirement for making our sensor synthesis practical in the field. In this chapter, we investigate the relationship between the temperature of processors embedded in single-board computers and the atmospheric temperature that surrounds them. Our goal is to place these computers in-situ in agricultural settings for use as thermometers.
By doing so, we can leverage their measurements to actuate and control a wide range of IoT-based farm operations, while driving down the cost of implementing such solutions at scale.

Examples of such farm operations include irrigation scheduling and frost damage mitigation strategies. For automatic irrigation scheduling, real-time temperature measurements are used to compute localized estimates of evapotranspiration, which indicates the amount of water that has been lost and that must be replaced via irrigation. Both under- and over-watering can decrease productivity, destroy crops, and degrade soil health. Irrigation scheduling is the most common form of IoT and data-driven decision support system on farms and is especially important for managing farms in drought-stricken regions. The terms “frost” or “freeze” are used by the public to describe a meteorological event that causes freezing injury to crops and other plants, when the air temperature falls below the tolerance level of the specific plant (Levitt et al.). The ability to predict the onset of frost, its duration, and the specific locations where frost will occur is of tremendous value to the agricultural industry. In the USA, frost damage causes more economic losses than any other weather-related phenomenon (White & Haas). Active frost protection strategies include application of water, use of engine-driven wind machines and heaters, and/or some combination of these methods, all of which are extremely labor-intensive and costly for growers. If the onset or duration of frost is mis-predicted, the cost of any mitigation strategies applied is lost. Alternatively, incorrectly predicting that a freeze will not occur to save these costs can devastate a crop. For this reason, current practice is conservative, passing any unnecessary mitigation costs on to the consumer in exchange for a low risk of crop loss. Both operations require accurately measuring and predicting the temperature in real time.
However, temperature is not uniform and can vary widely across a farm, requiring that operations account for very localized differences to obtain measurable outcomes. Micro-climates can occur in large numbers due to topographic differences, surrounding structures, ground cover, plant maturity, and nearby bodies of water. Measuring temperature across vast numbers of micro-climates is costly and labor-intensive given the price of high-quality sensors and the complexity of sensor management. Many IoT vendors provide managed services to reduce this complexity for growers, but these services are expensive, require that data be transmitted off-farm to cloud-based applications via cellular, and impose a recurring subscription fee on farmers in order to view their data. As a result, IoT advances have not achieved widespread uptake in agriculture, despite their potential. As part of the UCSB SmartFarm effort (Krintz et al.), we have investigated ways of reducing the cost and complexity of temperature-based IoT solutions, while maintaining accuracy and robustness. SmartFarm implements a low-cost, on-farm edge cloud comprised of multiple Intel Next Unit of Computing (NUC) machines (Intel). Using the open-source cloud software Eucalyptus (Nurmi et al.), we design the edge clouds to be self-managing and to perform a wide range of data analytics on farm data, thereby precluding the need to transmit data off-farm and keeping cost, complexity, and latency low (Krintz et al., Elias et al.). We use SmartFarm and single-board computers to provide accurate, real-time estimates of micro-climate temperature across a farm.

To do so, we place battery or solar-powered devices in-situ in various settings and configurations within inexpensive enclosures. The devices transmit their CPU temperature wirelessly to an on-farm edge cloud every 5 minutes. As ground truth, we consider co-located DHT digital sensors, high-end, on-farm weather stations, and the Weather Underground (WU) remote weather service, which farmers commonly use to estimate temperature. Figure 4.1 shows a two-week time series trace of CPU temperature from a Raspberry Pi Zero, the outdoor temperature from an attached digital DHT22 temperature sensor, and the outdoor temperature from a nearby Weather Underground station. WU measures outdoor temperature at a height of 10 meters, while the Pi Zero is at a height of 1 meter. The Pi Zero is in a plastic enclosure with a small, covered hole from which the DHT wires exit; the DHT sensor is outdoors and hanging freely. The device is located outdoors under constant shade in Goleta, CA. We refer to this device as Pi1 in later sections of this chapter. The average CPU temperature on the Pi Zero during this period is 99.71F with a standard deviation of 4.69. The mean temperatures for the DHT sensor and WU station are 61.93 and 60.20, respectively. The DHT and WU temperatures are similar, but WU exhibits data dropout, more variance, and more extreme temperatures. From this graph, there appears to be a correlation between CPU temperature and both outdoor temperature measures for this location. The CPU values exhibit small oscillations, or noise. A sub-portion of the CPU data alone is shown in Figure 4.2 using a different scale. We note that there are some discrepancies in the shape of the different curves. We observe similar relationships using other types of devices, locations, and sources for ground-truth temperature measurements.
We next investigate how accurately we can predict outdoor temperature using the CPU temperature of these devices. The data in Figure 4.1 is typical of the outdoor SmartFarm installations we have deployed, suggesting that linear regression would be an effective way to predict outdoor temperature from CPU temperature. Because each single-board computer is running a multi-user operating system, however, the CPU temperature exhibits fluctuations that we do not observe in the outdoor temperature. Further, because these fluctuations are caused by programs that are running on the computer, they are autocorrelated in time. To account for this autocorrelated “noise” in the CPU temperature series, we apply Singular Spectrum Analysis (SSA) (Golyandina & Zhigljavsky) to the CPU series before performing regression. SSA decomposes an autocorrelated time series into “basis time series”, which are analogous to principal components (Abdi & Williams, Wold et al.). By summing the most significant basis series, SSA can extract a smooth “signal” from a noisy time series. To do so, SSA requires the number of lags over which autocorrelation is significant to be supplied as a parameter. To investigate the accuracy with which it is possible to predict outdoor temperature, our system runs multiple smoothing passes, each with a successively larger number of lags, up to 12. During daylight and nighttime hours, outdoor temperature can be autocorrelated for several hours, but during the early morning or early evening the duration of significant autocorrelation is much shorter. For each number of lags, we compute the coefficient of determination for a regression covering a previous window of time, and choose the number of lags that generates the highest R² value. We refer to this window as the training window. Typically the best R² value is for 6 lags, indicating that the significant autocorrelation in the CPU temperature series covers about 30 minutes.
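The SSA smoothing and lag-selection loop described above can be sketched as follows. This is a generic SSA implementation under our own simplifying assumptions (the series is mean-centered, the two leading components are summed, and the synthetic data is purely illustrative), not the exact code in our system:

```python
import numpy as np

def ssa_smooth(series, lags, n_components=2):
    """Smooth a series with Singular Spectrum Analysis using window `lags`."""
    series = np.asarray(series, dtype=float)
    mu = series.mean()
    x = series - mu
    n = len(x)
    k = n - lags + 1
    # Trajectory (Hankel) matrix: each column is a length-`lags` window
    X = np.column_stack([x[i:i + lags] for i in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep only the leading "basis series" (analogous to principal components)
    Xr = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
    # Anti-diagonal averaging maps the reconstructed matrix back to a 1-D series
    smooth = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        smooth[j:j + lags] += Xr[:, j]
        counts[j:j + lags] += 1
    return smooth / counts + mu

def best_lags(cpu, outdoor, max_lags=12):
    """Pick the lag count whose smoothed CPU series best explains outdoor temps."""
    best, best_r2 = 2, -np.inf
    for lags in range(2, max_lags + 1):
        sm = ssa_smooth(cpu, lags)
        a, b = np.polyfit(sm, outdoor, 1)       # regress outdoor on smoothed CPU
        resid = outdoor - (a * sm + b)
        r2 = 1.0 - resid.var() / np.var(outdoor)  # coefficient of determination
        if r2 > best_r2:
            best, best_r2 = lags, r2
    return best, best_r2

# Illustrative synthetic data: a diurnal-like signal plus CPU "noise"
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 200)
cpu = 100 + 5 * np.sin(t) + rng.normal(0, 0.5, t.size)
outdoor = 60 + 3 * np.sin(t)
lags, r2 = best_lags(cpu, outdoor)
```

The training-window bookkeeping in our deployment differs; the sketch only illustrates the decompose-reconstruct-select loop.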
The method recomputes both the smoothed series and the regression coefficients every time a new outdoor measurement is generated. Thus the approach is a “piecewise” linear regression approach, in which the data is re-smoothed using the “best” number of lags before each regression. When a new CPU value arrives, we use the regression coefficients to compute a prediction of outdoor temperature. Prior to applying the regression coefficients, we append the new CPU value to the training window and smooth the window; we compute the prediction using the smoothed CPU value. We then compare this value to the actual outdoor measurement and compute the absolute difference and squared difference as the error. We refer to this configuration as a “limit study” because we believe that it provides us with an upper bound on the efficacy of our approach.
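The limit-study loop can be sketched as below. For brevity the sketch omits SSA smoothing and uses a hypothetical window size and exactly-linear synthetic data; our actual system re-smooths with the best lag count before each refit:

```python
import numpy as np

def one_step_ahead(cpu, outdoor, window=48):
    """Refit on the trailing window at each step, predict the next reading,
    and accumulate the absolute and squared errors."""
    abs_errs, sq_errs = [], []
    for i in range(window, len(cpu)):
        # Regression coefficients from the most recent `window` pairs
        a, b = np.polyfit(cpu[i - window:i], outdoor[i - window:i], 1)
        pred = a * cpu[i] + b          # one-step-ahead prediction
        err = outdoor[i] - pred
        abs_errs.append(abs(err))
        sq_errs.append(err ** 2)
    return np.mean(abs_errs), np.mean(sq_errs)

# Sanity check: when outdoor temperature really is linear in CPU
# temperature, the one-step-ahead error collapses to ~0
cpu = 100 + 5 * np.sin(np.linspace(0, 8 * np.pi, 400))
outdoor = 0.6 * cpu + 2.0
mae, mse = one_step_ahead(cpu, outdoor)
```

Because the coefficients are refit at every step using ground truth, this loop bounds from above what the practical configuration, which freezes the coefficients after calibration, can achieve.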