Sobol sensitivity analysis utilizing polynomial regression likewise determined that FBS, MgSO4, and L-Phenylalanine were the most explanatory components when taking component-component interactions into account. Focusing on optimizing only those components might bring further improvements, which is now feasible because fewer experiments were needed to arrive at this conclusion. Another issue was that the HND algorithm often did not change experimental conditions enough, leading to heavy clustering around early high-performing local optima. Myopia should be encoded into the DYCORS arm of the HND to allow for more exploration of the design space, while balancing the need for exploitation of regions of the design space that show promise. It is also possible that initializing the optimization with a more dispersed design would yield a more successful optimization. However, results from indicate that the initialization strategy used may not have a large effect. In reality, the impact of initialization is likely to be a strong function of the design surface and how close initial points are to the true optimum, neither of which is known a priori.
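The sensitivity analysis mentioned at the start of this paragraph can be illustrated with a minimal sketch. It assumes a quadratic polynomial surrogate with pairwise interaction terms and the standard pick-freeze Monte Carlo estimators for first-order and total-order Sobol indices. The toy response, the four unnamed factors, and all function names are hypothetical and are not taken from the study.

```python
import numpy as np
from itertools import combinations

def quad_features(X):
    """Design matrix: intercept, linear, squared, and pairwise interaction terms."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(d)]
    cols += [X[:, i] ** 2 for i in range(d)]
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(d), 2)]
    return np.column_stack(cols)

def fit_surrogate(X, y):
    """Least-squares fit of the quadratic surrogate; returns a predictor."""
    beta, *_ = np.linalg.lstsq(quad_features(X), y, rcond=None)
    return lambda X_new: quad_features(X_new) @ beta

def sobol_indices(model, d, n=20000, seed=0):
    """Pick-freeze estimates of first-order (S1) and total-order (ST) indices
    for inputs assumed independent and uniform on the unit cube."""
    rng = np.random.default_rng(seed)
    A, B = rng.random((n, d)), rng.random((n, d))
    f_A, f_B = model(A), model(B)
    total_var = np.var(np.concatenate([f_A, f_B]))
    S1, ST = np.empty(d), np.empty(d)
    for i in range(d):
        AB_i = A.copy()
        AB_i[:, i] = B[:, i]          # resample only factor i
        f_ABi = model(AB_i)
        S1[i] = np.mean(f_B * (f_ABi - f_A)) / total_var
        ST[i] = 0.5 * np.mean((f_A - f_ABi) ** 2) / total_var
    return S1, ST

# Hypothetical toy data: factor 0 dominates, factors 1 and 2 interact,
# factor 3 is inert. Concentrations are scaled to [0, 1].
rng = np.random.default_rng(1)
X = rng.random((70, 4))
y = 3.0 * X[:, 0] + X[:, 1] * X[:, 2] + rng.normal(0.0, 0.05, 70)

model = fit_surrogate(X, y)
S1, ST = sobol_indices(model, d=4)
print("first-order:", np.round(S1, 3))
print("total-order:", np.round(ST, 3))
```

A total-order index noticeably larger than the corresponding first-order index indicates that a factor acts mainly through interactions, which is why both are reported here.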
Using α as a metric, HND performs similarly to DOE, and both perform better than GM. This holds over multiple days after cell seeding and also when using cell number to calculate α, seemingly validating the use of %AB at 48 hr post-seeding in approximating proliferation more generally. However, when measuring cell number at multiple passages, both designed media perform worse than GM. This is because the objective function α relied on measurements without multiple passages, and so does not account for the dynamics of long-term cellular growth. This was a major shortcoming of the objective function picked, not of the HND or DOE itself. Future work in media design should incorporate more relevant metrics for optimization, such as a multi-passage objective function. Additionally, the %AB metric was not a perfect measure of cell number. Figure 3.5 and Figure 3.2 appear to indicate that HND and DOE media outperform GM, but when cell number is measured both optimal media have 8–9% fewer cells. Because AlamarBlue is a metabolic indicator, using it in the objective function for both methods may have biased the process towards higher metabolic activity rather than more proliferation. Despite these shortcomings, the HND has been demonstrated to be able to optimize high-dimensional experimental systems.
In our previous work in media optimization, fewer variables required more experiments to complete. In this work, we demonstrate optimization of 30 components with 70 experiments with no dimensionality reduction or screening designs; to our knowledge, this is a unique accomplishment in experimental optimization efficiency. It therefore represents a valuable proof of concept in the field of experimental optimization. While the HND cannot fully replace the first-principles understanding of systems often based on the DOE approach, we show that it could aid in the optimization of the hardest design problems, including those found in the bio-processing and larger cultivated meat industry, reducing the cost of experimentation and time-to-market for a new product.
Culture media used in industrial bio-processing and the emerging field of cellular agriculture is difficult to optimize due to the lack of rigorous mathematical models of cell growth and culture conditions, as well as the complexity of the design space. Rapid growth assays are inaccurate yet convenient, while robust measures of cell number can be time-consuming to the point of limiting experimentation. In this study, we optimized a cell culture medium with 14 components using a multi-information source Bayesian optimization algorithm that locates optimal media conditions based on an iterative refinement of an uncertainty-weighted desirability function. As a model system, we utilized murine C2C12 cells, using AlamarBlue, LIVE stain, and trypan blue exclusion cell counting assays to determine cell number.
Using this experimental optimization algorithm, we were able to design media with 181% more cells than a common commercial variant with a similar economic cost, while doing so in 38% fewer experiments than an efficient design-of-experiments method. The optimal medium generalized well to long-term growth up to four passages of C2C12 cells, indicating the multi-information source assay improved measurement robustness relative to rapid growth assays alone.
Every bio-process in which cells are the final product or used in the production process requires suitable culture conditions for cell growth and product quality. In the rapidly growing cellular agriculture/cultivated meat industry, where cells are grown for consumption to replace carbon-intensive and often unethical animal agriculture, cost-effective media has been identified as the most critical aspect in scale-up and commercialization. Optimizing these conditions is difficult due to a large number of media components with nonlinear and interacting effects between cells, medium, matrix material, and reactor environment. Typically, culture media used for processes in cellular agriculture consist of a basal medium of glucose, amino acids, vitamins, and salts supplemented with fetal bovine serum (FBS) for improved cell survival. FBS is an undefined, animal-derived serum consisting of proteins, hormones, and other large molecular weight components, and contributes substantially to the cost of media. Even when enriched with additional growth factors or FBS, media is often far from optimal for all cell types and requires adaptation and/or optimization, which is difficult for media mixtures with >30 components, as is common in cell culture. To manage this complexity, design-of-experiments methods are often employed in which factors are set to a user-specified value and outputs are measured.
These DOE designs are arranged in such a way that statistically meaningful correlations can be found in fewer experiments than techniques like intuition, “one-factor-at-a-time” sequences, or random designs. A more advanced form of this is to use sequential, model-based DOEs, in which a surrogate model such as a radial basis function or Gaussian process is combined with an optimizer/sampling policy to automatically select sequences of optimal designs. These approaches are often more efficient than traditional DOE at optimizing systems using fewer experiments and allow for more natural incorporation of process priors, measurement noise, probabilistic output constraints and constraint learning, multi-objective, multi-point, and multi-information source designs. Even with these methods available, limitations still exist. In previous work, we applied a machine learning approach to optimize complex media design spaces but had limited success due to the difficulty in measuring cell number for multi-passage growth.
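To make the idea of a sequential, model-based design concrete, the following is a minimal sketch of one such loop: a Gaussian process surrogate with a squared-exponential kernel, paired with an upper-confidence-bound sampling policy. It is an illustration under stated assumptions and not the algorithm used in this work. The kernel hyperparameters are fixed rather than fitted, the two-factor toy response `run_experiment` stands in for a laboratory assay, and all names are hypothetical.

```python
import numpy as np

# Assumed (not fitted) kernel hyperparameters for inputs scaled to [0, 1].
LENGTH_SCALE, SIGNAL_VAR, NOISE_VAR = 0.2, 1.0, 1e-2

def rbf_kernel(A, B):
    """Squared-exponential covariance between the rows of A and B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return SIGNAL_VAR * np.exp(-0.5 * sq_dist / LENGTH_SCALE**2)

def gp_posterior(X, y, X_new):
    """Posterior mean and standard deviation of a constant-mean GP at X_new."""
    K = rbf_kernel(X, X) + NOISE_VAR * np.eye(len(X))
    L = np.linalg.cholesky(K)
    K_star = rbf_kernel(X, X_new)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - y.mean()))
    mean = y.mean() + K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    var = np.clip(SIGNAL_VAR - (v**2).sum(axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def run_experiment(x, rng):
    """Stand-in for a lab assay: noisy toy response peaking at (0.7, 0.7)."""
    return np.exp(-((x - 0.7) ** 2).sum() / 0.1) + rng.normal(0.0, 0.05)

rng = np.random.default_rng(0)
X = rng.random((5, 2))                          # small initial design
y = np.array([run_experiment(x, rng) for x in X])

for _ in range(20):                             # 20 sequential experiments
    candidates = rng.random((500, 2))
    mean, sd = gp_posterior(X, y, candidates)
    ucb = mean + 2.0 * sd                       # reward predicted value and uncertainty
    x_next = candidates[np.argmax(ucb)]
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next, rng))

best_x, best_y = X[np.argmax(y)], y.max()
print("best observed design:", np.round(best_x, 3), "response:", round(best_y, 3))
```

The multiplier on `sd` sets the balance between exploring uncertain regions and exploiting regions already predicted to perform well; the value 2.0 is an arbitrary choice for the sketch.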
Therefore, in this study, we utilized a multi-information source Bayesian model to fuse “cheap” measures of cell biomass with more “expensive” but higher quality measurements to predict long-term medium performance. We refer to the simpler and cheaper assays as “low-fidelity” information sources (IS), and to the more complex and expensive assays as “high-fidelity” IS. While not always predictive of long-term growth, these lower fidelity assays are at least correlated with cell health and can help in identifying interesting regions of the design space for further study with the high-fidelity IS. We used this model, with Bayesian optimization (BO) tools, to optimize a cell culture medium with 14 components while minimizing the number of experiments, optimally allocating laboratory resources, and building process knowledge to improve our optimization scheme and model. In Section 4.2 we discuss the computational and experimental components of this BO method. In Section 4.3 we present the results of the BO method in comparison to a traditional DOE method, followed by Section 4.4, where we demonstrate the importance of using multiple sources of information to obtain relevant process knowledge and/or optimization results.
The system under consideration was the proliferation of C2C12 cells. These cells are immortalized muscle cells with metabolism and growth characteristics similar to those of other adherent cell lines useful in the cellular agriculture industry. Cells were stored in 70% DMEM, 20% FBS, 10% dimethylsulfoxide freeze medium at −196 °C until thawed. Vials were thawed to 25 °C and the freezing medium was removed by centrifugation at 1500 × g for 5 min. The centrifuged cell pellet was resuspended in 17 mL of DMEM with 10% FBS and placed on 15 cm sterile plastic tissue culture dishes. Cells were incubated at 37 °C in a 5% CO2 environment. After 24 h the medium was removed, the culture dish was washed with Phosphate Buffer Solution (PBS), and fresh DMEM with 10% FBS was introduced.
After an additional 24 h, cells were harvested using TrypLE solution, diluted in PBS, and counted using a Countess II with trypan blue exclusion and disposable slides. The process of removing cells from a plate, counting, and re-plating them with fresh medium is called sub-culturing or passaging. How well the C2C12 cells survive and grow after passaging is indicative of their long-term potential in a large cellular agriculture process. The design space comprised the components and minimum/maximum concentrations listed in Table 4.1. These components were chosen because they are often used to supplement standard DMEM to improve cell growth; this represents a reasonable test case for the industrial application of these multi-IS BO methods to the cellular agricultural industry. The composition of standard DMEM is shown in Table 4.3 and should not be confused with the base DMEM “supplement”, which contains only amino acids, trace metals, salts, and vitamins and none of the other 14 components. pH and osmolarity were not controlled in this study, and so act as latent variables.
Production-scale cellular agricultural processes will require >10 passages of cell growth, so optimizing growth based on single-passage information is not adequate. However, multi-passage growth assays are difficult and expensive to measure, and even more difficult to optimize when given many components. We managed this complexity by coupling long-term cell number measurements with simpler but less valuable rapid growth chemical assays in murine C2C12 cultures as a model system for cellular agricultural applications, capturing a more holistic model of the process. We combined this with an optimization algorithm that efficiently allocates laboratory resources toward solving arg max D for the desirability function D, a function that incorporates both cell growth and medium cost.
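As an illustration of what maximizing a desirability function over candidate media involves, the sketch below combines a growth term and a cost term into a single score using a weighted geometric mean, a common construction for desirability functions. The functional form, the weight, the reference values, and the three candidate media are all assumptions made for this example; the definition of D used in this study is not reproduced here.

```python
import numpy as np

def desirability(growth, cost, growth_ref, cost_max, weight=0.5):
    """Hypothetical desirability: weighted geometric mean of a growth term
    (cell number relative to a reference medium) and a cost term
    (1 at zero cost, falling linearly to 0 at cost_max)."""
    d_growth = np.clip(np.asarray(growth, dtype=float) / growth_ref, 0.0, None)
    d_cost = np.clip(1.0 - np.asarray(cost, dtype=float) / cost_max, 0.0, 1.0)
    return d_growth**weight * d_cost ** (1.0 - weight)

# Three made-up candidate media: relative cell number and cost (arbitrary units).
growth = [1.0, 2.0, 2.2]
cost = [10.0, 12.0, 40.0]

D = desirability(growth, cost, growth_ref=1.0, cost_max=50.0)
best = int(np.argmax(D))          # index of the medium solving arg max D
print("D =", np.round(D, 3), "-> best candidate:", best)
```

In this toy case the second candidate (index 1) is selected: the third grows slightly more cells, but its much higher cost lowers its overall desirability.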
This resulted in a 38% reduction in experimental effort, relative to a comparable DOE method, to find a medium 227% more proliferative than the DMEM control at nearly the same cost. As the longer-term passaging study suggests, our Passage 2 objective function and IS were well calibrated to mimic the complex industrial process of growing large batches of cells over many passages, with Passage 4 cell numbers well predicted by this objective function. The reasons for the success of the BO are myriad. The BO method iteratively refines a single process model to improve certainty in D-optimal regions, whereas the DOE relies on a series of BB designs in which the older data sets are ignored because they were outside of the optimal factor space. The BO also used a variety of IS, whereas the DOE only used a single low-fidelity AlamarBlue metric. In Figure 4.8c, the AlamarBlue and LIVE measurements tended to cluster around the point y = 1, making it difficult to distinguish between high-quality and low-quality media. The BO method also refined its multi-IS model over the entire feasible design space, allowing it to take advantage of optimal combinations and concentrations of all 14 components over the entire domain, whereas the DOE needed to reduce the design and factor spaces to reduce the number of experiments needed, and may have identified the wrong optimal boundary locations, resulting in sub-optimal experimental designs. The BO method was also able to leverage information about process uncertainty to improve the model in poorly understood regions of the design space, whereas the steepest ascent method used by the DOE chased after improved D with little regard for overall noise or experimental errors. This was worsened by the sensitivity of the polynomial model to random inter-batch fluctuations in %AB, which may have driven the DOE to sub-optimal media.
Note that the success of our BO method should not be taken as generic superiority over all potential instantiations of DOE or commercial media used for C2C12 growth. While the BO method worked well at solving the experimental optimization problem, the accuracy of the multi-IS GP was limited to highly sampled regions of the design space, which in turn limited the efficacy of sensitivity analysis. This was a conscious decision made to trade off post-facto analysis for sampling media with high desirability D.