rocSOLVER

rocSOLVER is a work-in-progress implementation of a subset of LAPACK functionality on the ROCm platform.

Documentation

For a detailed description of the rocSOLVER library, its implemented routines, the installation process and user guide, see the rocSOLVER documentation.

Quick start

To download the rocSOLVER source code, clone this repository with the command:

git clone https://github.com/ROCmSoftwarePlatform/rocSOLVER.git

rocSOLVER requires rocBLAS as a companion GPU BLAS implementation. For more information about rocBLAS and how to install it, see the rocBLAS documentation.

After a standard installation of rocBLAS, the following commands build rocSOLVER and install it at the standard location /opt/rocm/rocsolver:

cd rocSOLVER
./install.sh -i

Once installed, rocSOLVER can be used like any other library with a C API. The rocsolver.h header file must be included in the user code, and both the rocBLAS and rocSOLVER shared libraries become link-time and run-time dependencies of the user application.

rocSOLVER usage example

The following code snippet uses rocSOLVER to compute the QR factorization of a general m-by-n real matrix in double precision. For a description of the rocsolver_dgeqrf function, see the API documentation.

///////////////////////////
// example.c source code //
///////////////////////////

#include <algorithm>             // std::min
#include <iostream>
#include <stdlib.h>
#include <vector>
#include <hip/hip_runtime_api.h> // hipMalloc, hipMemcpy, hipFree
#include <rocsolver.h>           // this includes all the rocsolver C interfaces and type declarations

using namespace std;

int main() {
    rocsolver_int M;
    rocsolver_int N;
    rocsolver_int lda;

    // initialize M, N and lda with desired values
    // here===>>

    rocsolver_handle handle;
    rocsolver_create_handle(&handle); // this creates the rocsolver handle

    size_t size_A = size_t(lda) * N;     // this is the size of the array that will hold the matrix
    size_t size_piv = size_t(min(M, N)); // this is the size of the array that will hold the Householder scalars

    vector<double> hA(size_A);        // creates array for matrix in CPU
    vector<double> hIpiv(size_piv);   // creates array for householder scalars in CPU

    double *dA, *dIpiv;
    hipMalloc(&dA,sizeof(double)*size_A);       // allocates memory for matrix in GPU
    hipMalloc(&dIpiv,sizeof(double)*size_piv);  // allocates memory for scalars in GPU
  
    // initialize matrix A (array hA) with input data
    // here===>>
    // ( matrices must be stored in column major format, i.e. entry (i,j)
    //  should be accessed by hA[i + j*lda] )


    hipMemcpy(dA,hA.data(),sizeof(double)*size_A,hipMemcpyHostToDevice); // copy data to GPU
    rocsolver_dgeqrf(handle, M, N, dA, lda, dIpiv);                      // compute the QR factorization on the GPU   
    hipMemcpy(hA.data(),dA,sizeof(double)*size_A,hipMemcpyDeviceToHost); // copy the results back to CPU
    hipMemcpy(hIpiv.data(),dIpiv,sizeof(double)*size_piv,hipMemcpyDeviceToHost);

    // do something with the results in hA and hIpiv
    // here===>>

    hipFree(dA);                        // de-allocate GPU memory 
    hipFree(dIpiv);
    rocsolver_destroy_handle(handle);   // destroy handle
  
    return 0;
}
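
The example expects hA in column-major order with leading dimension lda. As a minimal sketch of that layout, the helper below copies a row-major M-by-N matrix into such an array. The name to_column_major is hypothetical and is not part of rocSOLVER; the sketch also assumes lda >= M, which the indexing hA[i + j*lda] requires.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of rocSOLVER): copy a row-major M-by-N matrix
// into a column-major array with leading dimension lda.
std::vector<double> to_column_major(const std::vector<double>& row_major,
                                    int M, int N, int lda) {
    assert(lda >= M);                                // each column needs room for M rows
    assert(row_major.size() == std::size_t(M) * N);  // input must hold exactly M*N entries

    // Same size as size_A in the example; padding rows (i >= M) stay zero.
    std::vector<double> hA(std::size_t(lda) * N, 0.0);

    for (int j = 0; j < N; ++j)
        for (int i = 0; i < M; ++i)
            hA[i + std::size_t(j) * lda] = row_major[std::size_t(i) * N + j]; // entry (i,j)
    return hA;
}
```

With lda larger than M, the extra rows of each column are padding that the factorization does not touch; the simplest choice is lda = M.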

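The compile command may vary depending on the system and session environment. Note that the example is C++ source despite its .c extension; if the compiler treats .c files as C, rename the file to example.cpp. Here is an example of a common use case, with the libraries listed after the source file so that the linker resolves them in order:

>> hipcc -I/opt/rocm/include example.c -o example.exe -L/opt/rocm/lib -lrocsolver -lrocblas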
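
Once the example has run, hA and hIpiv hold the factorization in packed form. The sketch below assumes rocsolver_dgeqrf follows the LAPACK GEQRF output convention: R sits on and above the diagonal of A, the Householder vectors sit below it, and their scalars are in hIpiv. Under that assumption, a CPU-side helper can pull out R. The name extract_R is hypothetical and is not part of rocSOLVER.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of rocSOLVER): extract the K-by-N upper
// triangular (or trapezoidal) factor R, K = min(M, N), from the packed
// column-major output hA. Assumes the LAPACK GEQRF layout described above.
std::vector<double> extract_R(const std::vector<double>& hA,
                              int M, int N, int lda) {
    assert(lda >= M);
    assert(hA.size() >= std::size_t(lda) * N);

    const int K = std::min(M, N);
    std::vector<double> R(std::size_t(K) * N, 0.0);  // column-major, leading dimension K

    for (int j = 0; j < N; ++j)
        for (int i = 0; i <= std::min(j, K - 1); ++i)            // on and above the diagonal
            R[i + std::size_t(j) * K] = hA[i + std::size_t(j) * lda];
    return R;
}
```

Entries below the diagonal of hA are left out on purpose: under the same assumption they encode the Householder vectors that define Q, not part of R.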
