[RFC] src/aiori-CEPHFS: New libcephfs backend #217
Merged
This is a new aiori backend using libcephfs, loosely based on the existing POSIX and RADOS backends. It also borrows the "prefix" concept from the DFS backend: the prefix names an existing POSIX mount point, which ior/mdtest need in order to function properly even when a library is used for direct filesystem access. A slight change to libcephfs.h is needed for IOR to compile properly (this does not appear to be necessary for C++ clients using libcephfs, however).
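To illustrate the "prefix" idea, here is a minimal sketch of how a path under the POSIX mount point could be mapped to the path a library client would use. The helper name `strip_prefix` and the example mount point `/mnt/cephfs` are hypothetical and are not taken from this PR; the actual backend may handle the prefix differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the PR): given a path under a POSIX
 * mount point (the "prefix"), return the path relative to the
 * filesystem root. Returns NULL if path is not under prefix. */
const char *strip_prefix(const char *path, const char *prefix)
{
    assert(path != NULL && prefix != NULL);
    size_t n = strlen(prefix);

    /* Ignore trailing slashes so "/mnt/cephfs/" equals "/mnt/cephfs". */
    while (n > 0 && prefix[n - 1] == '/')
        n--;

    if (strncmp(path, prefix, n) != 0)
        return NULL;        /* path does not start with prefix */
    if (path[n] == '\0')
        return "/";         /* path is the mount point itself */
    if (path[n] != '/')
        return NULL;        /* e.g. "/mnt/cephfs2" vs "/mnt/cephfs" */
    return path + n;        /* keeps the leading '/' */
}
```

With this sketch, `strip_prefix("/mnt/cephfs/ior/file", "/mnt/cephfs")` yields `/ior/file`, which is the form a direct library call would expect under the stated assumptions.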
io500 tests on a 10-node in-house test cluster with 2x replication and co-located clients appeared to function properly. Scores were similar to those from the POSIX backend on kernel-based CephFS mount points, and much better for sequential reads. In the following results, the mdtest easy directories were round-robin pinned prior to the test, though in the near future Ceph will do ephemeral pinning across MDSes automatically with a single top-level xattr.
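For readers unfamiliar with manual pinning, the following is a rough sketch of round-robin pinning on a kernel CephFS mount using the `ceph.dir.pin` extended attribute. The function names, the directory list, and the MDS count are all hypothetical; this is not the script used for the results here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* MDS rank for directory number `index` when spreading directories
 * evenly over `num_mds` active ranks. */
int round_robin_rank(int index, int num_mds)
{
    assert(index >= 0 && num_mds > 0);
    return index % num_mds;
}

/* Pin one directory to an MDS rank by setting ceph.dir.pin to the
 * rank written as a decimal string. Returns 0 on success, -1 on error. */
int pin_dir(const char *dir, int rank)
{
    char value[16];
    snprintf(value, sizeof(value), "%d", rank);
    return setxattr(dir, "ceph.dir.pin", value, strlen(value), 0);
}

/* Pin dirs[0..count-1] round-robin; stops at the first failure. */
int pin_round_robin(const char *const *dirs, int count, int num_mds)
{
    for (int i = 0; i < count; i++) {
        if (pin_dir(dirs[i], round_robin_rank(i, num_mds)) != 0)
            return -1;
    }
    return 0;
}
```

The same effect can be had from a shell with `setfattr -n ceph.dir.pin -v <rank> <dir>` on each directory.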
2020-03-06-RedHatLibCephFS-10-30.zip
Generally, two things held us back in the ior and mdtest hard test cases: lower scores for unaligned reads/writes, and the build-up time for dynamic subtree partitioning (we actually see higher scores with longer run times!). Given how scores are calculated, these will be prime targets for future optimization.
Signed-off-by: Mark Nelson <mnelson@redhat.com>