The sugar levels of all tested grapes were either equal to or considerably higher than those of the other fruits tested.

The percentage of eggs that developed to adults decreased with increasing egg density per gram of fruit, probably due to intra-specific competition, and this was further confirmed by manipulating the egg density while using the same ‘Bing’ cherry cultivar as the tested host. Females preferred larger fruit for oviposition, which is consistent with density-dependent survival, as large fruit support higher numbers of fly larvae per fruit. It is well known that many fruit flies employ a variety of fruit characters to assess host quality and tend to be more attracted to larger fruits. Female D. suzukii appear to be able to assess host quality based on fruit size, and this behavior would likely increase foraging efficiency per unit time. Though we recovered very low numbers of D. suzukii from damaged citrus fruits, our laboratory study showed the fly can oviposit into and develop from freshly damaged or rotting navel oranges. Kaçar et al. showed that D. suzukii overwinter in citrus, surviving 3–4 months when fresh oranges were provided as adult food or ovipositional medium, and field-emerged adults from soil-buried pupae could produce and oviposit viable eggs on halved mandarin fruit.

Thus, citrus fruit likely play an important role as reservoirs in sustaining the fly populations during San Joaquin Valley winter seasons, and in the spring, those populations may migrate into early season crops, such as cherries. We did not observe grape infestation in our field collections, and our laboratory trials showed a low survival rate of D. suzukii offspring on grapes when compared to other fruits. Oviposition susceptibility and offspring survival could vary among varieties or cultivars due to variation in skin hardness and chemical properties. For example, Ioriatti et al. demonstrated that oviposition increased consistently as the skin hardness of the grape decreased. Chemical properties, such as sugar content and acidity levels, may also play a role in host susceptibility. In the current study, we found that although table grapes had a tougher skin than the raisin or wine grape cultivars tested, females were able to lay eggs into all three types of grapes, often through the fruit surface or near the petiole. We also found that tartaric acid concentration negatively affected the fly’s developmental performance. Still, about 20% of eggs successfully developed to adults in the diet mixed with the highest tartaric acid concentration, whereas only 4.5% of eggs developed from the wine grape cultivar tested.

It is thus possible that other unknown chemical traits might also affect larval performance. Overall, our results are consistent with other reported studies showing that grapes are not good reproductive hosts for D. suzukii.

California’s San Joaquin Valley is one of the world’s most important fruit production regions, with a diverse agricultural landscape that can consist of a mosaic of cultivated and unmanaged host fruit crops. Such diverse landscapes result in the inevitable presence of D. suzukii populations that represent a difficult challenge for the management of this polyphagous pest. We showed that only early-season fruits, such as cherries, seem to be at greatest risk from D. suzukii. Many later-season fruits are not as vulnerable to this pest, because either their intact skin reduces oviposition, they ripen during a period of low D. suzukii abundance, or their flesh has chemical attributes that reduce survival. However, some of these alternative hosts (such as citrus and damaged, unharvested stone fruit) may act as shelters for overwintering populations and provide sources for early populations moving into the more susceptible crops. Consequently, area-wide management strategies may need to consider fruit sanitation to lessen overwintering populations, suppression of fall and winter populations by releasing natural enemies, and reduction of pest pressure in susceptible crops through ‘border-sprays’ and/or ‘mass trapping’ to kill adults before they move into the vulnerable crop.

Alternative and sustainable area-wide management strategies such as biological control are highly desirable to naturally regulate the fly population, especially in uncultivated habitats. An understanding of the temporal and spatial dynamics of the fly populations would help optimize the timing of future releases of biological control agents to reduce the source populations in the agricultural landscape.

Many development programs are initiated as pilots with the intent of scaling if the trial shows success. High-quality implementation at the pilot stage is necessary to ensure a program remains faithful to its intended design. As a result, pilots are commonly run by organizations with strong institutional capacity enabled by a history of local expertise and involvement. However, the intensive community engagement and oversight brought to bear at the pilot stage often cannot be replicated as a program expands in scope. There is growing concern that demonstrations of large program impacts at small scale may have limited external validity as expansion brings in new levels of management and administration. In this paper, we illustrate how the same features that constrain implementation quality at scale may also bias pilot evaluation, threatening internal validity by overstating true program impacts.

Our study takes place in the context of an unsuccessful agricultural intervention to promote smallholder cultivation of pulses in Bihar, India. The policy was piloted in a two-year randomized evaluation by four non-governmental organizations selected for their extensive history of local rural development work in the study area. We uncover evidence that farmers involved in program evaluation take costly actions that make the evaluation appear more favorable to the implementing organizations. The primary data for this study come from an incentive-compatible elicitation of demand for seeds of the target pulse crops at endline.
Actual seed purchases were based on elicited responses, ensuring the decision had real stakes. This exercise was intended to measure farmers’ sustained intention to produce pulses after program activities concluded, an explicit goal of the program at the outset. Implementers defined success as an increase in treated farmers’ preference for pulse cultivation, resulting in greater seed demand. To evaluate how implementers’ desires affect participant behavior, we experimentally varied the salience of program evaluation during demand elicitation. Specifically, enumerators introduced the elicitation either as an explicit evaluation of the implementer’s efforts or more generally as a study of regional attitudes toward pulse cultivation. After this manipulation, elicitation proceeded identically for all participants. Importantly, we ensure the introductory language does not communicate information about product quality by offering a consistent product explicitly sourced and delivered by the local implementer. We interpret differences in participants’ willingness-to-pay for pulse seeds by evaluation salience as a reflection of participants’ implicit preferences over the outcome of the evaluation itself. Increasing the salience of evaluation skews the estimated treatment effect in favor of the implementers.
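
The design described above crosses program treatment with evaluation salience, so the comparison of interest can be read off four cell means. The following is a minimal sketch under stated assumptions, not the estimator used in the study: the record fields (`treated`, `high_salience`, `demand`), the function names, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def mean_demand(records, treated, high_salience):
    """Average seed demand in one cell of the 2x2 design (hypothetical fields)."""
    cell = [
        r["demand"]
        for r in records
        if r["treated"] == treated and r["high_salience"] == high_salience
    ]
    if not cell:
        raise ValueError("no observations in this cell")
    return mean(cell)


def effects_by_salience(records):
    """Treated-minus-control difference in mean demand within each salience arm."""
    low = mean_demand(records, True, False) - mean_demand(records, False, False)
    high = mean_demand(records, True, True) - mean_demand(records, False, True)
    # The gap between the two arms is the part of the estimated treatment
    # effect that moves with evaluation salience.
    return {
        "low_salience_effect": low,
        "high_salience_effect": high,
        "salience_shift": high - low,
    }


# Made-up records for illustration only; these are not the study's data.
records = [
    {"treated": False, "high_salience": False, "demand": 4.0},
    {"treated": False, "high_salience": False, "demand": 6.0},
    {"treated": True, "high_salience": False, "demand": 2.0},
    {"treated": True, "high_salience": False, "demand": 3.0},
    {"treated": False, "high_salience": True, "demand": 5.0},
    {"treated": False, "high_salience": True, "demand": 5.0},
    {"treated": True, "high_salience": True, "demand": 4.0},
    {"treated": True, "high_salience": True, "demand": 6.0},
]

result = effects_by_salience(records)
print(result)
```

With these made-up numbers, the treated-minus-control gap is negative in the low-salience arm and zero in the high-salience arm, which is the qualitative pattern the text describes. A regression with a treatment-by-salience interaction term would recover the same three quantities and would allow for covariates and clustered standard errors.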

Overall, the two-year intervention actually discouraged pulse cultivation among treated farmers by confirming their belief that growing pulses was not worth the opportunity cost of displacing more lucrative alternatives. This belief manifested as 25% lower demand for pulse seeds on average in the incentive-compatible elicitation. However, the negative treatment effect was only observed in elicitations with low evaluation salience, where treated farmers purchased less than half the quantity of their untreated counterparts. By contrast, there was no distinguishable difference in elicitations with high evaluation salience. Making the evaluation salient during data collection obscured evidence of a negative treatment effect. This shift in seed demand represents costly action taken by study participants. Treated farmers spent an average of Rs. 70 more on pulse seeds in high-salience elicitations. More importantly, seed purchases reflected real cultivation choices over the following crop season. We find a strong, positive correlation between seeds purchased and area planted, with no systematic deviation by salience status. On average, farmers who were reminded of their participation in a program evaluation subsequently altered cultivation on 2% of their cropland in the following season. This reallocation of real on-farm resources, while modest, indicates this bias extends beyond simple survey misreporting or other forms of cheap talk.

Responsiveness to implementer desire can be thought of as a form of Hawthorne effect. Past work has established that subjects in an experiment may alter their behavior when they know they are being monitored or evaluated. Most relevant to our study, de Quidt et al. investigate experimenter desirability, whereby participants act in response to researcher objectives, as a possible source of bias. The authors intentionally manipulate beliefs about the experimenter’s desires and find the resulting distortions to be modest.
We extend this type of work to introduce the possibility that the relevant pressure in program evaluation comes not from the experimenter conducting the evaluation, but rather from the implementer being evaluated. Implementer desirability may influence any participant survey response, but it is especially concerning for complex elicitation methods introduced exclusively to generate evaluation data. In particular, Becker-DeGroot-Marschak (BDM) elicitation is common in field experiments, including our own study, because it reveals respondents’ full demand curve in an incentive-compatible manner. This mechanism has been criticized due to concerns about misunderstanding of the dominant strategy, weak incentives for accuracy, decision fatigue, and price anchoring. Nevertheless, it empirically performs well relative to other measures of willingness-to-pay. We show that even if an elicitation accurately reveals demand, the demand itself may be influenced by participants’ preferences over the outcome of evaluation. This bias is more likely to arise in unnatural exercises that call attention to the research in progress, such as BDM elicitation, than in recall questions about participants’ regular endeavors, such as self-reported market purchases.

Our findings more generally contribute to the large literature on reproducibility and policy scaling. Across many sectors, programs implemented by NGOs systematically generate greater impact than those run by governments. In a study closely related to ours, Usmani et al. highlight the particular importance of prior community engagement in NGO effectiveness. This feature is difficult to replicate at scale, and may therefore threaten the external validity of evaluation results generated from an NGO-implemented pilot.
Our research establishes how these factors can also undermine the internal validity of evaluation independent of their role in implementation effectiveness, presenting a new reason why pilot results may not be recreated at scale.

Implementer desirability bias can create issues of endogenous participant effort similar to those proposed by Chassang et al. Many development programs rely on complementary investments from program beneficiaries, and beliefs about implementation quality can alter participants’ incentives to invest. For instance, in two field evaluations, Bulte et al. and Bulte et al. show how the returns to improved seeds in Tanzania depend crucially on farm labor choices, which are in turn a function of participants’ perception of the quality of the seeds being evaluated. We find that participant behavior responds not only to beliefs about implementation quality, but also to preferences over evaluation outcomes. Heterogeneity in responsiveness to implementer desire would also exacerbate external validity concerns regarding favorable selection of sites or indicators if NGOs strategically seek out communities where evaluation is likely to show positive impact. This could contribute to the relationship between prior NGO activity and current program effectiveness reported by Usmani et al., though we find no evidence of such selectivity in our setting.

The state of Bihar is among the poorest in India, with over a third of households below the national poverty line. The population is also predominantly agrarian, with just 12 percent residing in urban centers. As a result, Bihar has been a region of focus for rural development programs by both the Government of India and the NGO sector, frequently funded by external donors. There are currently 4,255 registered NGOs and volunteer organizations in the state, the majority of which operate at small scale and rely on heavy engagement with beneficiary communities.
We investigate how to translate experience from this type of localized development work into guidance for policy design. Our study is tied to an agricultural development venture initiated by the Government of India and managed by an international NGO. In 2016, the managing organization enlisted four local Bihari NGOs to implement a pilot intervention aimed at increasing the production of pulses by farm households. Many households in this region grow small amounts of pulses—primarily pigeon pea—on crop borders or other marginal land for home consumption. The partnership designed and administered a two-year package of input subsidies, agricultural extension services, and marketing support to modernize cropping practices and boost output. This package was piloted in five districts to test whether intensive short-term investment could shift the long-term crop portfolio of participating households. Implementing partners for this project were selected because of their track record with local development.