Our goal is to examine the complex coupling of oxidative ageing chemistry with environment- and history-dependent viscosity, and to generate a comprehensive model that can be applied to a range of aerosol transformation chemistries. Our model design strategy is to develop key components of the model separately and merge them after validation, an approach uniquely enabled by stochastic simulations. The main components required are a description of the uptake of gas reactants, a description of the free radical chain over many product generations, and a description of dynamic perturbations by environmental factors (such as temperature, pollutant plumes, and humidity) and internal factors (such as immiscibility of products).

In the first year of this LDRD we have made significant progress on the first two components: modeling uptake and the free radical chain. We have focused on a system for which there are extensive experimental data in the literature, the reaction of squalane aerosols with OH over a broad range of pressures. Using only kinetics from the literature, we have shown that the current understanding of uptake is at best incomplete: it is not simply the sticking probability of reactants to particle surfaces followed by reaction. Rather, the uptake coefficient is inseparable from the intrinsic internal transport rate of the aerosol itself relative to the gaseous reactant-aerosol collision rate, and is therefore an emergent property of the system. This is a fundamentally new insight that promises to change thinking on the transformations of aerosols and liquid films.
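As a minimal numerical illustration of this competition (a sketch, not the project's actual stochastic model), the toy calculation below treats the measured uptake coefficient as the steady-state outcome of surface depletion by reactive collisions versus replenishment by internal transport; the parameters gamma0, k_mix, and j_coll are hypothetical placeholders.

    # Toy two-compartment picture of reactive uptake: a particle surface layer
    # is depleted by reaction with incoming OH and replenished from the bulk by
    # internal mixing. All rate constants here are illustrative placeholders.

    def effective_uptake(gamma0, k_mix, j_coll):
        """Steady-state effective uptake coefficient.

        gamma0 : probability that an OH collision with an *unreacted* surface
                 site leads to reaction
        k_mix  : first-order rate (1/s) at which internal transport refreshes
                 the surface layer with unreacted material
        j_coll : OH-surface collision frequency per surface site (1/s)

        At steady state the fraction of unreacted surface sites is
        theta = k_mix / (k_mix + gamma0 * j_coll), so the apparent uptake
        coefficient gamma0 * theta emerges from the ratio of internal
        transport to the collision rate.
        """
        theta = k_mix / (k_mix + gamma0 * j_coll)
        return gamma0 * theta

    # A well-mixed (low-viscosity) particle recovers gamma0; a glassy particle
    # (slow mixing) shows a much smaller apparent uptake coefficient.
    for k_mix in (1e4, 1e2, 1e0):
        print(f"k_mix = {k_mix:8.0f} 1/s -> gamma_eff = "
              f"{effective_uptake(gamma0=0.3, k_mix=k_mix, j_coll=1e3):.4f}")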
The free radical chain reaction that ensues after the initial reaction between OH and squalane is complex, involving many generations of addition of oxygen-containing functional groups, forming hundreds of distinct products, as well as fragmentation reactions that transform the composition of the aerosol and cause it to shrink. In work led by the postdoctoral researcher on this project, Aaron Wiegel, we have developed a compact description of this chemistry that fully reproduces the experimental observations. The free radical reaction model, which uses only literature kinetics, describes molecules as collections of functional groups and is therefore inherently general; it is the core of a universal free radical oxidation scheme that can be applied to a broad range of molecular systems.
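The functional-group bookkeeping underlying such a scheme can be sketched as follows; the group counts, branching ratios, and reaction rules below are invented placeholders rather than the model's actual literature kinetics.

    # Minimal sketch of a functional-group representation for oxidation
    # chemistry. A molecule is tracked only as counts of functional groups,
    # so one rule set covers many parent molecules. Branching ratios are
    # made-up placeholders, not literature kinetics.
    import random
    from collections import Counter

    random.seed(1)

    # Squalane-like parent, counted as CH3/CH2/CH groups only.
    molecule = Counter({"CH3": 8, "CH2": 10, "CH": 6})

    def oxidation_step(mol):
        """One OH-initiated step: abstract an H, then either functionalize
        (add a carbonyl or hydroxyl) or fragment (C-C scission loses carbon)."""
        if random.random() < 0.7:                  # functionalization branch
            site = "CH2" if mol["CH2"] > 0 else "CH"
            mol[site] -= 1
            mol[random.choice(["C=O", "C-OH"])] += 1
        elif mol["CH2"] > 1:                       # fragmentation branch
            mol["CH2"] -= 2                        # scission ejects a volatile fragment
            mol["C=O"] += 1

    for generation in range(5):
        oxidation_step(molecule)
        print(f"gen {generation + 1}: {dict(molecule)}")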
We aim to massively accelerate multi-modal data analysis to enable real-time data exploration, speeding the discovery and hypothesis-generation process in neuroscience. We plan to precisely quantify changes in brain network structure, especially those that lead to neurodegenerative disease. Our collaborative research project promises to significantly improve the fidelity and scope of neuroimaging analysis by combining high-performance computing with state-of-the-art graph analysis, image processing, and visualization techniques. The research developed for these techniques will be applicable to a variety of evolving big-data domains of interest to the DOE.

Overall, our work addresses the 3V components (volume, velocity, variety) of big-data neuroscience problems. Performance improvements in the segmentation of structural MRI and high-speed computation of adjacency matrices allow the end user to process more data in less time, with the expectation of viewing and analyzing data in real time to speed discovery and hypothesis generation. By enabling researchers to integrate and interrogate data from multiple modalities at once, we address the issue of variety. The volume problem is addressed by significantly accelerating comparative analysis of multiple measurements from the same or multiple patients.

We developed a data-driven method for functional parcellation of brain regions, as well as an adaptive hierarchical community detection method to fine-tune parcels. This is a very challenging problem, with more than 100,000 dimensions and a comparatively small sample size. The data are also very noisy, because fMRI measures indirect blood-oxygen levels rather than direct neural activity. Consequently, using global correlations is known to give unsatisfactory results, because many nearby voxels have very similar time series. What is needed is the partial correlation between voxels, i.e., the correlation of a pair of voxels after removing the effects of all other voxels. CONCORD is the first sparse inverse covariance estimation method with provable global convergence properties; its accelerated version, CONCORD-ISTA, uses block sparse linear algebra instead of coordinate-wise updates. We apply CONCORD-ISTA to our problem because it provides the best theoretical guarantees for estimating partial correlations between voxels.
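Partial correlations follow directly from the inverse covariance (precision) matrix Theta via rho_ij = -Theta_ij / sqrt(Theta_ii * Theta_jj). The sketch below illustrates that computation using scikit-learn's GraphicalLasso as a generic sparse precision estimator standing in for CONCORD-ISTA, with random data in place of real fMRI time series.

    # Illustration of voxel partial correlations from a sparse precision
    # matrix. GraphicalLasso stands in here for CONCORD-ISTA; the conversion
    # from the precision matrix Theta to partial correlations is the standard
    # identity rho_ij = -Theta_ij / sqrt(Theta_ii * Theta_jj).
    import numpy as np
    from sklearn.covariance import GraphicalLasso

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))   # 200 time points x 50 "voxels" (toy data)

    model = GraphicalLasso(alpha=0.1).fit(X)
    theta = model.precision_

    d = np.sqrt(np.diag(theta))
    partial_corr = -theta / np.outer(d, d)
    np.fill_diagonal(partial_corr, 1.0)

    print("nonzero off-diagonal partial correlations:",
          np.count_nonzero(np.triu(partial_corr, k=1)))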
We also developed a method for interactive visual exploration of functional magnetic resonance imaging data, to analyze the correlation between activity in different human brain regions at rest or during mental tasks. Our visualization tool improves visual data exploration by generating multiple coordinated views, supporting the brushing-and-linking concept, and integrating community detection. It provides neuroscientists with a powerful means to comprehend such complex data more effectively and efficiently.

The mesoscale is DOE's next frontier in its effort to control the chemical and physical processes that lead to new or more efficient renewable energy resources and to approaches that reduce the carbon footprint. Modeling emergent mesoscale phenomena with computational chemistry methodologies requires exploring the essential collective variables and order parameters in systems of sufficient size and disorder, and with sufficient statistical sampling, using accurate and scalable computational chemistry methods. We will improve the performance of key ab initio methodologies using tools developed by the SciDAC Institutes, and by developing new and advanced algorithms for two-electron integrals and plane-wave FFTs on the Intel Xeon Phi. We will develop computational chemistry tools that integrate a kinetic Monte Carlo methodology with the scalable, high-accuracy ab initio and Car-Parrinello methodologies available in NWChem. To enable major scientific discoveries of mesoscale phenomena, computational models need to be integrated with a broad range of complex spectroscopic imaging experiments; to that end, we will develop the semantics and tools needed to analyze and enable knowledge discovery in scientific data generated from mesoscale experiments and simulations.

Our most significant accomplishment is the development of GAGA, the Global Arrays/GASNet interface, built in collaboration with members of the DEGAS project. The current implementation has been demonstrated to work efficiently on InfiniBand clusters and Cray platforms, with significantly faster performance and better CPU utilization for the coupled-cluster singles and doubles capability in NWChem. Key two-electron algorithms were analyzed; in collaboration with the SUPER institute we obtained a 25% increase in performance and improved the load-balancing algorithm on conventional CPUs. A similar performance improvement was found for the Intel Xeon Phi, although its absolute performance still lags that of conventional CPUs. We optimized the Fock build capability and the two-electron integrals in NWChem to increase their efficiency on Intel's next-generation Xeon Phi processors. In addition, we implemented a distance-based screening algorithm, which reduces redundant computation of expensive multi-center two-electron integrals in large systems and enables O(N) scaling.
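The principle behind such screening is the Cauchy-Schwarz bound |(ij|kl)| <= sqrt((ij|ij)) * sqrt((kl|kl)), together with the rapid decay of shell-pair magnitudes with inter-center distance. The sketch below is a generic illustration with an invented decay model and threshold, not NWChem's implementation.

    # Generic sketch of Schwarz + distance screening for two-electron
    # integrals. Shell-pair data and thresholds are illustrative; real codes
    # screen shell quartets (ij|kl) against sqrt(Q_ij * Q_kl) >= tau before
    # computing them, so the number of surviving integrals grows roughly
    # linearly for large, sparse systems.
    import numpy as np

    rng = np.random.default_rng(2)
    n_shells = 40
    centers = rng.uniform(0.0, 20.0, size=(n_shells, 3))   # toy shell centers

    # Schwarz factors Q_ij ~ sqrt((ij|ij)); model their decay with distance.
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    Q = np.exp(-0.5 * dist)                                # hypothetical decay model

    tau = 1e-8                                             # screening threshold
    kept = total = 0
    for i in range(n_shells):
        for j in range(i + 1):
            for k in range(n_shells):
                for l in range(k + 1):
                    total += 1
                    if Q[i, j] * Q[k, l] >= tau:           # Schwarz bound survives
                        kept += 1                          # -> compute (ij|kl)

    print(f"computed {kept} of {total} shell quartets "
          f"({100.0 * kept / total:.1f}%)")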
A first version of CML in NWChem has been made available to collaborators in industry. The work was presented at IUPAC in 2014, and we are in the process of writing a proposal to the organization for support to develop a consistent dictionary and ontology for the chemical sciences. We are partnering with Kitware, using their MongoChem infrastructure, to build a demonstration case for heavy element chemistry, linking computational and experimental data in a semantically rich framework.

Our goal is to create a computational framework that allows the in silico design of metal-organic framework (MOF) materials through a strategy based on structure optimization with respect to a target property. Efficient global-optimum search algorithms will be used to navigate the space of possible structures. By performing this search over a broad space of MOFs, we remove the limitations on the search space encountered by currently used enumeration-based strategies. We plan to explore two approaches. In the first, we abstract molecular models of MOF building blocks as geometrical (or alchemical) building blocks, defined by a number of continuous parameters, which are optimized using gradient-based techniques. In the second, we perform the search in a discrete space of real molecular building blocks and employ genetic-algorithm-based search techniques. Although the proposed approaches are general and can be used to design a MOF material with almost any desired property, our work focuses on properties critical for gas separation and storage, with the aim of enabling the design of materials for carbon capture and natural gas storage. Properties of interest therefore include high surface area, large pore diameters, and the adsorption properties of the gas of interest. Our framework for MOF design comprises three components: a MOF assembly module, a rapid property estimation module, and a structure optimization control module. The key aspect of each component is modularity: components can be substituted or extended to include other building-block representation schemes, property estimation modules, scoring functions, and optimization/search algorithms.
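As an illustration of the second (discrete) approach, the sketch below evolves a toy population of (node, linker, topology) triples with a genetic algorithm; the building-block libraries and the surrogate fitness function are hypothetical placeholders for the assembly and rapid property-estimation modules.

    # Toy genetic algorithm over discrete MOF building blocks. A candidate is
    # a (metal node, organic linker, topology) triple; the fitness function is
    # a made-up surrogate standing in for rapid property estimation.
    import random

    random.seed(3)

    NODES = ["Zn4O", "Cu2", "Zr6O4"]
    LINKERS = ["BDC", "BTC", "NDC", "BPDC"]
    TOPOLOGIES = ["pcu", "fcu", "tbo"]

    def fitness(candidate):
        """Hypothetical surrogate score (e.g., a proxy for gas capacity)."""
        node, linker, topo = candidate
        score = {"Zn4O": 1.0, "Cu2": 0.8, "Zr6O4": 1.2}[node]
        score += 0.3 * LINKERS.index(linker)      # longer linker -> bigger pores
        score += {"pcu": 0.2, "fcu": 0.5, "tbo": 0.1}[topo]
        return score

    def mutate(candidate):
        """Swap one randomly chosen building block for another from its library."""
        node, linker, topo = candidate
        slot = random.randrange(3)
        if slot == 0:
            node = random.choice(NODES)
        elif slot == 1:
            linker = random.choice(LINKERS)
        else:
            topo = random.choice(TOPOLOGIES)
        return (node, linker, topo)

    # Evolve a small population by truncation selection plus mutation.
    population = [(random.choice(NODES), random.choice(LINKERS),
                   random.choice(TOPOLOGIES)) for _ in range(12)]
    for generation in range(10):
        population.sort(key=fitness, reverse=True)
        survivors = population[:4]
        population = survivors + [mutate(random.choice(survivors)) for _ in range(8)]

    best = max(population, key=fitness)
    print("best candidate:", best, "score:", round(fitness(best), 2))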
Technological advances in computers and sequencing technology have enabled bioinformatics to develop at an unprecedented rate, especially in terms of the volume of data that require analysis. However, biologists face significant challenges in effectively studying these data sets due to the complexity of optimizing these classes of computations on modern computational systems. The goal of this project is to deliver unprecedented computational capability to large-scale analytics for key bioinformatics applications, via the development and integration of flexible, high-performance software packages. Recent work targets the analyses of high-throughput "next-generation" genome sequencing technologies, which are producing a flood of inexpensive genetic information that is invaluable to genomics research. Sequences of millions of genetic markers are being produced, giving genomics researchers the opportunity to construct high-resolution genetic maps for many complicated genomes. However, the current generation of genetic mapping tools was designed for the small-data setting, and is now limited by the prohibitively slow clustering algorithms employed in the genetic marker-clustering stage.

Our most significant accomplishment addresses this first step of genetic mapping, which involves clustering markers into linkage groups. This is traditionally performed by standard clustering algorithms applied to a similarity graph of the markers, which creates a significant bottleneck for large numbers of markers. We developed a fast clustering algorithm that circumvents the computation of all pairwise similarities by exploiting prior knowledge about the specific structure of the marker data: linkage groups have an intrinsically linear substructure that remains reflected in the similarity measure. After sorting, the algorithm creates a sketch that respects both the geometry and the quality of the data. Using synthetic and real-world data, including the grand-challenge wheat genome, we demonstrated that our approach can quickly process orders of magnitude more genetic markers than existing tools while retaining, and in some cases even improving, the quality of the genetic marker clusters.
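A minimal way to see how linear substructure avoids the all-pairs similarity computation: if a sorting stage places the members of each linkage group consecutively, only the n-1 neighbor similarities are needed to cut the sequence into groups. The toy data, similarity measure, and threshold below are illustrative simplifications, not the actual algorithm.

    # Sketch: clustering markers into linkage groups without the full O(n^2)
    # similarity matrix. We assume an ordering in which members of a linkage
    # group appear consecutively, so only n-1 neighbor similarities are used.
    # The toy data, similarity, and threshold are illustrative placeholders.
    import numpy as np

    rng = np.random.default_rng(4)

    # Toy markers: 3 linkage groups of binary genotype vectors, 64 individuals.
    groups = [rng.integers(0, 2, 64) for _ in range(3)]
    markers = [np.where(rng.random(64) < 0.05, 1 - g, g)   # 5% noise per marker
               for g in groups for _ in range(100)]

    def similarity(a, b):
        """Fraction of individuals with matching genotype calls."""
        return np.mean(a == b)

    # The sorting stage is assumed to have placed linkage groups contiguously
    # (true here by construction); cut wherever neighbor similarity drops.
    n_groups = 1
    for prev, cur in zip(markers, markers[1:]):
        if similarity(prev, cur) < 0.75:   # hypothetical cut threshold
            n_groups += 1

    print("linkage groups found:", n_groups)   # expect 3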
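To make the implicit-geometry idea concrete, the sketch below represents an interface as the zero level set of a signed distance function on a grid and expands it at constant normal speed with a simple explicit update; it is a generic textbook illustration, not the project's discontinuous Galerkin framework.

    # Generic level-set illustration: an interface is the zero contour of a
    # signed distance function phi; grid cells are classified inside/outside
    # by the sign of phi, with no explicit surface mesh. Simple explicit
    # scheme, not the project's high-order discontinuous Galerkin framework.
    import numpy as np

    n, L = 128, 1.0
    x = np.linspace(0.0, L, n)
    X, Y = np.meshgrid(x, x, indexing="ij")

    # Signed distance to a circle of radius 0.25 centered in the domain.
    phi = np.hypot(X - 0.5, Y - 0.5) - 0.25

    # Expand the interface at unit normal speed: phi_t + speed*|grad phi| = 0.
    h, dt, speed = L / (n - 1), 0.5 * L / (n - 1), 1.0
    for _ in range(20):
        gx, gy = np.gradient(phi, h)
        phi -= dt * speed * np.hypot(gx, gy)

    inside = phi < 0.0
    print(f"area fraction inside interface: {inside.mean():.3f}")
    # Expected ~pi*r^2 with r = 0.25 + 20*dt*speed, since the circle expands.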