simon2/H-div

Synopsis

H-div is a parallel implementation of hierarchical matrix (H-matrix) construction using Cilk Plus and Tascell. It is based on the sequential Fortran implementation in the HACApK library, written by Prof. Akihiro Ida et al.

Publications

Requirements

  • One or more Intel multi-core CPUs

  • To run Cilk Plus versions, Intel C++ Compiler version >= 17

    (Note: Cilk Plus is no longer supported by Intel and will eventually be removed from the Intel compiler.)

  • To run Tascell versions on shared memory systems

    • Tascell compiler, a version later than Jan 21, 2019.
    • GCC version >= 4.8.5 (or ICC compatible with a GCC version higher than 4.8.5)
  • To run Tascell versions on distributed memory systems

    • Tascell compiler from the mpi-bcst branch, a version later than May 15, 2022.
    • Intel C++ Compiler version >= 17
    • Intel MPI version >= 17

Download

git clone https://github.com/simon2/H-div.git

Compile & Execution

make hmat_div
./hmat_div

File Explanation

1. Matrix partitioning only

  1. Sequential
    • hmat_div.c: The first sequential C implementation, translated directly from the Fortran implementation in HACApK.
    • hmat_div_direct.c: Sequential C implementation that exchanges data elements directly instead of exchanging their indices. This is the sequential baseline in the paper.
    • hmat_div.cpp: Sequential C++ implementation.
    • hmat_div.sc: Sequential implementation using an S-expression-based syntax. This is the base of the Tascell versions.
    • hmat_div_array.c: The CT is stored in a pre-allocated array instead of a linked tree. (To be described in the next paper.)
  2. Cilk Plus
    • hmat_div_cilk.c: Final implementation using Cilk Plus.
    • hmat_div_BCT_cilk_list_reducer.cpp: Uses CILK_LIST_REDUCER. Not included in the paper due to poor performance.
    • hmat_div_CT_cilk_parByLevel.c: Switches from parallel to sequential code based on the tree level. Not included in the paper due to poor performance.
    • hmat_div_BCT_cilk_malloc.c: Creates a private BCT array for each worker using malloc. Not included in the paper due to poor performance.
  3. Tascell
    • hmat_div.tcell: The final implementation using Tascell.
    • hmat_div_locality.tcell: Executes the upper tree levels sequentially and the lower levels in parallel for better data locality. However, the resulting speedup is not significant.
    • hmat_dist.tcell: Baseline version parallelized for distributed memory systems.
    • hmat_dist_bcst.tcell: Adds broadcast to hmat_dist.tcell.
    • hmat_dist_cas.tcell: Uses CAS to store CT nodes.
    • hmat_dist_casc.tcell: Uses CAS to store CT nodes, but in chunks.
  4. OpenMP
    • hmat_div_omp.c: The OpenMP implementation mentioned in the paper.
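
The difference between index exchange (hmat_div.c) and direct exchange (hmat_div_direct.c) can be illustrated with a minimal C sketch. This is not code from the repository: the point_t type, both function names, and the idea of splitting around a pivot coordinate are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical point record; the repository's data layout may differ. */
typedef struct { double x[3]; } point_t;

/* Index exchange: the points stay in place and only idx[] is permuted.
   Returns how many entries have a coordinate on `axis` below `pivot`. */
size_t partition_by_index(const point_t *pts, size_t *idx, size_t n,
                          int axis, double pivot)
{
    assert(axis >= 0 && axis < 3);
    size_t lo = 0;
    for (size_t i = 0; i < n; i++) {
        if (pts[idx[i]].x[axis] < pivot) {
            size_t t = idx[i]; idx[i] = idx[lo]; idx[lo] = t;
            lo++;
        }
    }
    return lo;
}

/* Direct exchange: the point records themselves are swapped,
   so no index array is needed to read them afterwards. */
size_t partition_direct(point_t *pts, size_t n, int axis, double pivot)
{
    assert(axis >= 0 && axis < 3);
    size_t lo = 0;
    for (size_t i = 0; i < n; i++) {
        if (pts[i].x[axis] < pivot) {
            point_t t = pts[i]; pts[i] = pts[lo]; pts[lo] = t;
            lo++;
        }
    }
    return lo;
}
```

After either call, the first k entries (k being the return value) lie below the pivot and could form one child cluster. Direct exchange keeps each cluster's points contiguous in memory, at the cost of moving whole records rather than single indices.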

2. Matrix partitioning + filling

  1. Sequential
    • hmat_filling.c: Sequential C version that does the filling after all leaf nodes are created.
    • hmat_array_filling_wBCT.c: Sequential C version that does the filling when a leaf node is created.
  2. MPI + OpenMP
    • hmat_array_filling_MPI.c: hmat_filling.c parallelized using MPI.
    • hmat_array_filling_dynamic.c: hmat_filling.c parallelized using MPI and OpenMP with dynamic scheduling.
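
One way to picture filling with dynamic scheduling is the sketch below. It assumes that "dynamic scheduling" refers to OpenMP's schedule(dynamic) clause, and it leaves out the MPI layer entirely. The leaf_t type, kernel_entry, and fill_leaves are hypothetical names, not the repository's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical leaf block: a dense nrows x ncols sub-matrix stored
   row-major in a[], located at (row0, col0) in the full matrix. */
typedef struct {
    size_t row0, col0, nrows, ncols;
    double *a;
} leaf_t;

/* Placeholder kernel, assumed for illustration: entry (i, j) of the matrix. */
static double kernel_entry(size_t i, size_t j)
{
    size_t d = (i > j) ? i - j : j - i;
    return 1.0 / (1.0 + (double)d);
}

/* Fill every leaf. Leaves differ in size, so with OpenMP enabled each
   thread takes the next unfilled leaf instead of a fixed share. */
void fill_leaves(leaf_t *leaves, size_t nleaves)
{
    assert(leaves != NULL || nleaves == 0);
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic, 1)
#endif
    for (long k = 0; k < (long)nleaves; k++) {
        leaf_t *lf = &leaves[k];
        for (size_t i = 0; i < lf->nrows; i++)
            for (size_t j = 0; j < lf->ncols; j++)
                lf->a[i * lf->ncols + j] =
                    kernel_entry(lf->row0 + i, lf->col0 + j);
    }
}
```

Built with an OpenMP flag such as -fopenmp on GCC, threads pick up leaves one at a time as they become idle; without the flag the same loop runs sequentially and produces the same values.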

Contributors
