LLNL/SNLS Release Tags
Version 0.4.1

This release fixes a few small bugs:

- SNLS_view_types.h's contains_data() function had a bug where it would always return 1 for empty RAJA views. We were using the wrong function from the RAJA layouts, as size() always returns a value >= 1 even when the view is empty; we needed size_noproj(), which returns the actual size (see the sketch after this list).
- Our benchmarks had a bug where the SNLS_NewtonBB use cases were still using the old API for the template arguments rather than the final one we settled on for the v0.4.0 release.
- Added a small helper function to the SNLS::Device class so that we can wait on all work associated with a resource set rather than waiting only on the event plus resource set.
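For context, here is a minimal sketch of the kind of fix involved; the accessor names are assumptions based on the description above, not the verbatim SNLS_view_types.h code.

```cpp
// Hedged sketch of the contains_data() fix, not the exact SNLS code.
template <typename ViewT>
bool contains_data(const ViewT &view)
{
   // RAJA layouts clamp projected dimensions, so size() reports a value
   // >= 1 even for an empty view. size_noproj() returns the actual
   // (possibly zero) size, which is what an emptiness check needs.
   return view.get_layout().size_noproj() > 0;
}
```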
Version 0.4.0

This release introduced a number of new features related to the solvers and the abstraction layers.

The minimum required C++ standard is now C++17. The move to a newer standard lets us use features such as class template argument deduction (CTAD), and the expanded constexpr support makes life much simpler. Overall, these changes simplify both writing new models and our own internal development. We also raised the minimum CMake version to 3.20 to be more in line with modern build systems.

One of the big changes is that all of our non-batch solvers now accept lambda functions for the computeRJ/computeFJ calculations, rather than requiring a full-blown class. As part of this change, all the ND solvers take an additional template argument that sets the size of the nonlinear system. The class version already requires the nDimSys variable to be defined, so the template argument defaults to that value, and existing code largely stays the same.

Next, on the NewtonBB solver we removed the unbounded template argument and made it a constructor argument. This lets us auto-deduce the CFJ template argument, which simplifies the user experience. The CFJ is also now taken by reference rather than by pointer in the constructor, in line with all the other solvers.

For the batch solver, things are largely unchanged from previous versions, aside from fixing some GPU issues where we were implicitly capturing *this in our lambda captures. The one notable breaking change is to the setupSolver function: it no longer takes a snls::TrDeltaControl* but instead a TrDeltaInput, which is used to construct the TrDeltaControl internally. We needed this to properly transfer the TrDeltaControl object between the host and GPU memory spaces when running on the GPU.

On the forall-abstraction side, we made a number of changes. First, we updated the forall abstractions for a better debugging experience. Users can still use the SNLS_FORALL macros, but the recommended interface is now the snls::forall functions, since they let the debugger attribute issues to the correct line. Next, we expanded the forall abstractions so users can operate on different streams, make async calls, and specify the execution space at runtime in the function call. The forall calls also return a type that can be used to query the status of the forall or to have a stream wait on it to finish. To support these features, we added a few new functions to the Device class that provide the appropriate types and make it simple to wait on things.

We also moved the View types into their own file in this release, and added our own view type called SubView, which takes a window/slice of a View and makes indexing behave as if you're operating in that smaller space. The object is generic enough to work with any type that has a few RAJA-View-like operators, so a SubView of a SubView just works.

Hedged usage sketches for several of these changes follow.
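First, the lambda support for the non-batch solvers. This is an illustrative sketch only: the solver name matches the library's SNLSTrDlDenseG, but the exact constructor and computeRJ conventions shown here are assumptions.

```cpp
// Hypothetical sketch: solve a 2x2 system with a lambda instead of a
// computeRJ class. The system size (2) is passed explicitly as the extra
// template argument since it cannot be deduced from a lambda.
auto computeRJ = [](double* const r, double* const J, const double* const x) -> bool {
   r[0] = x[0] * x[0] + x[1] - 11.0;
   r[1] = x[0] + x[1] * x[1] - 7.0;
   if (J != nullptr) { // Jacobian requested; row-major storage assumed
      J[0] = 2.0 * x[0]; J[1] = 1.0;
      J[2] = 1.0;        J[3] = 2.0 * x[1];
   }
   return true; // evaluation succeeded
};
snls::SNLSTrDlDenseG<decltype(computeRJ), 2> solver(computeRJ);
```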
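Next, the NewtonBB changes. Again a hedged sketch: the class name and argument order are assumptions; the point is that the boundedness flag is now a constructor argument, CFJ is passed by reference, and its type is deduced via CTAD.

```cpp
// Hypothetical sketch of the revised NewtonBB construction.
auto computeFJ = [](double &f, double &J, const double x) -> bool {
   f = x * x - 2.0; // root at sqrt(2)
   J = 2.0 * x;
   return true;
};
// Old style (roughly): NewtonBB<decltype(computeFJ), true> solver(&computeFJ);
// New style: the flag moves into the constructor and CFJ is deduced.
snls::NewtonBB solver(computeFJ, /* unbounded = */ true);
```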
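For the forall changes, a minimal sketch under assumed signatures; the real overload set for streams, async launches, and runtime execution spaces, as well as the returned handle's API, may differ.

```cpp
#include "SNLS_device_forall.h" // assumed header for snls::forall

void scale(double* v, const double a, const int n)
{
   // Hypothetical: launch over [0, n) and keep the returned handle, which
   // per the notes above can be queried or waited on.
   auto status = snls::forall(0, n, [=] __snls_hdev__ (int i) {
      v[i] *= a;
   });
   status.wait(); // assumed blocking wait until the forall completes
}
```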
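Finally, a hedged illustration of SubView windowing; the rview3d alias and the SubView constructor shown here are assumptions based on the description above.

```cpp
#include "SNLS_view_types.h" // the View types now live in their own header

void window_example(snls::rview3d &vals, const int i, const int j)
{
   // Hypothetical: fix the leading index so that indexing happens in the
   // smaller space of the window.
   snls::SubView batch(i, vals); // batch(j, k) behaves like vals(i, j, k)
   // SubView only needs a few RAJA-View-like operators from the wrapped
   // type, so a SubView of a SubView works the same way:
   snls::SubView row(j, batch);  // row(k) behaves like vals(i, j, k)
   double v = row(0);
   (void) v;
}
```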
Finally, we added the option to build with just RAJA support, which gives users access to the forall and view-type abstractions. We did this to make it simple for users to pick up these features without having to build the whole RAJA Portability Suite. A hypothetical host-config sketch of such a build follows.
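The option name below is a placeholder, as the actual CMake variable in SNLS may be named differently; this is only meant to show the shape of an LLNL-style host-config cache file for this kind of build.

```cmake
# Hypothetical host-config snippet; USE_RAJA_ONLY is a placeholder name.
set(USE_RAJA_ONLY ON CACHE BOOL "Build SNLS with only RAJA support")
set(RAJA_DIR "/path/to/raja/lib/cmake/raja" CACHE PATH "RAJA install location")
```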
Version 0.3.1

Fixed an issue with CUDA builds that arose from an unguarded compiler option which, if repeated twice, would cause the build to die.

Version 0.3.0 (pulled)

This version adds a number of features. First, a batch version of the SNLSTrDlDenseG solver class was added. It also adds an optional memory manager, forall abstractions, and a Device class to control the memory-manager and forall abstraction layers. These new optional features are only available when SNLS is compiled with the RAJA Portability Suite, as they make use of RAJA, CHAI, and Umpire under the hood. Next, the library has moved back from being a header-only library to being a compiled library. Additionally, a proper CMake config file is now produced when the library is installed, so users may import the library with CMake's find functionality (a sketch follows). Last, the trust-region delta class has been extracted into its own header so it can be shared among the various solver classes.
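For example, a consumer project might do something like the following; the exported package and target names are assumptions, since they depend on what SNLS's config file actually exports.

```cmake
# Hypothetical consumer CMakeLists.txt; package/target names may differ.
find_package(SNLS REQUIRED)
add_executable(my_app main.cxx)
target_link_libraries(my_app PRIVATE snls)
```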