RPP Normalize U8 - 3D kernel Optimization by Dineshbabu-Ravichandran · Pull Request #360 · r-abishek/rpp

RPP Normalize U8 - 3D kernel Optimization #360

Open
wants to merge 51 commits into
base: develop

Dineshbabu-Ravichandran
  • Optimized Normalize 3D for U8 bit depth using SIMD instructions.
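The optimized kernel itself is not shown in this conversation. As a point of comparison, the following is a minimal scalar sketch of what a U8 normalize computes per element, assuming the usual formula dst = (src - mean) * (scale / stdDev) + shift, rounded and saturated to [0, 255]. The function name `normalize_u8_reference` and its parameters are hypothetical and are not part of the RPP API; a SIMD version would be expected to produce the same values while processing several elements per iteration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not RPP's API).
// Assumed formula: dst = saturate_u8(round((src - mean) * (scale / stdDev) + shift)).
// Assumes stdDev is non-zero.
std::vector<std::uint8_t> normalize_u8_reference(const std::vector<std::uint8_t> &src,
                                                 float mean, float stdDev,
                                                 float scale, float shift)
{
    // Fold the division into a single multiplier computed once, outside the loop.
    const float multiplier = scale / stdDev;

    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); i++)
    {
        // Widen to float before subtracting so values below the mean do not wrap around.
        float value = (static_cast<float>(src[i]) - mean) * multiplier + shift;

        // Round to nearest, then saturate to the U8 range before narrowing.
        value = std::min(std::max(std::round(value), 0.0f), 255.0f);
        dst[i] = static_cast<std::uint8_t>(value);
    }
    return dst;
}
```

For example, with mean 128, stdDev 64, scale 64 and shift 128 the multiplier is 1 and the input is returned unchanged, while results above 255 or below 0 saturate instead of wrapping.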

@HazarathKumarM HazarathKumarM left a comment

@Dineshbabu-Ravichandran Please address the review comments

@@ -178,7 +200,11 @@ int main(int argc, char **argv)
fill_mean_stddev_values(nDim, maxSize, meanTensor, stdDevTensor, qaMode, axisMask, scriptPath);

startWallTime = omp_get_wtime();
rppt_normalize_host(inputF32, srcDescriptorPtrND, outputF32, dstDescriptorPtrND, axisMask, meanTensor, stdDevTensor, computeMeanStddev, scale, shift, roiTensor, handle);

Why do we have the function for U8 in the misc tests? You have already added the same function in the image unit tests.

I removed the U8 call from the misc test suite.


if(inputBitDepth == 0)
{
if(srcDescPtr->c == 3)

Can't we handle this with a single condition, without the else-if? Please check this.

meanPtr[0] *= normFactor;
}


Please remove the extra space here.

srcPtrTemp += stride[2];
}
rpp_rsqrt_avx(stdDevPtr, (Rpp32s)(1), 0, normFactor, scale);

Please remove the space here as well.

@@ -277,6 +323,359 @@ void normalize_3D_tensor_nontoggle(Rpp32f *srcPtr, RpptGenericDescPtr srcGeneric
}
}

// Computes normalize for 3D non toggle variants
void normalize_3D_tensor_nontoggle(Rpp8u *srcPtr, RpptGenericDescPtr srcGenericDescPtr, Rpp8u *dstPtr, RpptGenericDescPtr dstGenericDescPtr,

Please check the alignment here.

HazarathKumarM and others added 17 commits November 26, 2024 12:56
* Make initial changes for raw CPP version of warp perspective

* Fix calls to compute_warp_perspective_src_loc function

* Update changes to go through nearest neighbours case

* AVX HOST codes for warp perspective initial

* Fixes for accuracy in warp perspective

* More fixes for accuracy in warp perspective

* Update the code for AVX version of Planar to Planar

* Add bilinear u8 host code for warp perspective

* Make updates to include functions for F32 data type

* Make updates to use cast instead of set and fix issues with raw C implementation

* Add i8 host codes

* Add updates for F16 Bilinear Code

* Update the initial HIP code for warp perspective

* Update fixes for HIP code

* Add Warp Perspective Nearest Neighbors F16 code for PKD3_to_PLN3 and PLN3_to_PLN3

* Add updates for PLN to PLN configuration

* Add updates for PKD3 to PKD3 case

* Rename variables

* Update changes to log images separately for Bilinear and Nearest Neighbors

* fixed bug in raw c code of PKD-PLN variant

* minor bug fix for F16 PLN variants

* minor fixes in HOST test suite

* Update the HIP code for review comments and refactoring of device functions

* Update the comments alignment

* Rename functions and add cases in HOST and HIP runTests.py

* Update indentations for computations and rename vectors

* Update documentations and add more reference variables

* Make more formatting changes

* Make further updates by including test cases

* Make updates to use reinterpret cast

* Update reinterpret casts for PLN to PLN configuration u8 and i8 codes

* Make updates to enclose code inside AVX2 flag

* Make further changes to update type casting

* Update the version

* Make updates to add warp perspective image

* Modify comments, update CHANGELOG and update flags

* Update further comments in warp perspective

* Add more comments for warp perspective

* Update based on further review comments

* Update the case number for warp_perspective in common.py

* Address review comments

* Fix conflicts with warp perspective

* Update version details

* Merge branch 'ar/opt_warp_perspective' of https://github.com/r-abishek/rpp into opt_warp_perspective_rebased

* Update version to 1.9.10 including warp perspective

* Updates to convert to XYWH from LTRB instead of opposite

* Update CHANGELOG.md

Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>

* Revert changes and convert to ltrb instead of xywh

---------

Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Rajy Rawther <Rajy.MeeyakhanRawther@amd.com>
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
* Package - remove clang from test

* CMakeLists - remove BUILD_WITH_AMD_ADVANCE

* Package - Add OMP dependency

* Find Packages - Updates

* Test Package - Deps

* Test - backend Info

* Tests - Fix HIP Test Add

* Test - Add HIP Path

* Test - Find HIP Updates

* Tests - Fix HIP compilation
* Jenkins - Fix Test

* Test - Create separate test folder

* FileSystem - Find and process

* Find Filesystem - Updates

* Test - Compiler Updates
…ocs/sphinx (ROCm#491)

Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
4 participants