Description
I believe our treatment of strides in Dense vectors could be improved. Sometimes we ignore strides and copy padding data directly; sometimes we overwrite existing stride values, which can cause reallocations. And in some places it looks like we could save on storage. I see two important areas where this pops up:
Matrix conversions
EnablePolymorphicAssignment implements convert_to via the assignment operator. This means that if you copy from a Dense matrix to another Dense matrix, the operation overwrites the target's stride and potentially reallocates the target, copying all padding data over. I believe we could improve our memory footprint if we reused the existing storage and left the target stride unchanged. At the very least, we shouldn't be touching padding data at all, neither reading nor writing it.
The same is also true for value type conversions and (potentially) other places I can't think of right now.
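To make the proposed behavior concrete, here is a minimal sketch of a stride-preserving copy. It does not use the actual Dense class: StridedMatrix and copy_keep_stride are hypothetical names, and the row-major layout with stride >= cols is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a row-major dense matrix with row padding.
// This is NOT the library's Dense class, only an illustration.
struct StridedMatrix {
    std::size_t rows;
    std::size_t cols;
    std::size_t stride;          // elements between consecutive row starts
    std::vector<double> values;  // rows * stride elements, including padding

    StridedMatrix(std::size_t r, std::size_t c, std::size_t s,
                  double fill = 0.0)
        : rows(r), cols(c), stride(s), values(r * s, fill)
    {
        assert(s >= c);
    }

    double& at(std::size_t i, std::size_t j) { return values[i * stride + j]; }
    double at(std::size_t i, std::size_t j) const
    {
        return values[i * stride + j];
    }
};

// Copies only the logical rows x cols entries. If the dimensions already
// match, the target keeps its own stride and storage, and padding is
// neither read from the source nor written in the target.
void copy_keep_stride(const StridedMatrix& src, StridedMatrix& dst)
{
    if (dst.rows != src.rows || dst.cols != src.cols) {
        // Dimension mismatch: reallocate, here with a compact stride.
        dst = StridedMatrix(src.rows, src.cols, src.cols);
    }
    for (std::size_t i = 0; i < src.rows; ++i) {
        for (std::size_t j = 0; j < src.cols; ++j) {
            dst.at(i, j) = src.at(i, j);
        }
    }
}
```

With this approach, a 2x3 target with stride 4 stays at stride 4 even when the source has stride 8, and no allocation happens. The same loop structure would apply to a value type conversion if the two matrices used different element types.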
Solvers
In solvers, we use Dense::create_with_config_of to create intermediate vectors of the same size and stride. This means that if our input vector has a huge stride, we will use this value for all internal vectors, potentially creating a large amount of unnecessary padding. I guess the idea behind this is that we only need to pass a single stride value to the kernels, instead of having a distinct stride for each vector? Do you think we can find a more memory-efficient solution for the worst case?
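A rough way to quantify the worst case is to compare the storage for intermediate vectors that inherit the input stride against vectors allocated with a compact stride (stride equal to the number of columns). The sketch below is only arithmetic; WorkspaceEstimate and estimate_workspace are hypothetical names, and the number of intermediate vectors is an assumed parameter that depends on the solver.

```cpp
#include <cassert>
#include <cstddef>

// Element counts for a solver's intermediate vectors (hypothetical helper).
struct WorkspaceEstimate {
    std::size_t inherited;  // every vector reuses the input stride
    std::size_t compact;    // every vector uses stride == cols
};

// Each row-major vector block with `rows` rows occupies rows * stride
// elements, so padding grows linearly with (stride - cols).
inline WorkspaceEstimate estimate_workspace(std::size_t rows,
                                            std::size_t cols,
                                            std::size_t input_stride,
                                            std::size_t num_vectors)
{
    assert(input_stride >= cols);
    return {num_vectors * rows * input_stride, num_vectors * rows * cols};
}
```

For example, with 1000 rows, a single right-hand side, an input stride of 64, and 5 intermediate vectors, the inherited layout needs 320000 elements while the compact layout needs 5000. The trade-off is the one mentioned above: compact intermediates would no longer share the input's stride, so the kernels would have to accept more than one stride value.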