RPP Tensor Support - Snow on HOST and HIP #347
base: develop
- Adds tensor support for Snow Augmentation optimized using AVX2 on HOST backend
- Adds tensor support for Snow Augmentation on HIP backend
- Adds unit and performance tests support for the Snow Augmentation in test suite
I think this output should use the unit tests' default 150x150 image. Please check, @Dineshbabu-Ravichandran.
* \details The Snow augmentation does a modification of brightness on a batch of RGB(3 channel) / greyscale(1 channel) images with an NHWC/NCHW tensor layout.<br>
* - srcPtr depth ranges - Rpp8u (0 to 255), Rpp16f (0 to 1), Rpp32f (0 to 1), Rpp8s (-128 to 127).
* - dstPtr depth ranges - Will be same depth as srcPtr.
* \image html img640x480.png Sample Input
Same doubt here
We used this dashcam image for the fog PR (https://github.com/r-abishek/rpp/pull/332/files#diff-006ea80a28a2d71eeb553a7a9c8b32912f4d35f871150a5094fbe85e3503575f), so I used the same one here.
* \param [in] brightnessCoefficient brightness modification parameter for snow calculation (1D tensor in HOST memory, of size batchSize with 1 < brightnessCoefficient[i] <= 4 for each image in batch)
* \param [in] snowThreshold threshold parameter for snow calculation (1D tensor in HOST memory, of size batchSize with 0 < snowThresholdTensor[i] <= 1 for each image in batch)
* \param [in] darkMode darkMode values to set dark mode on/off (1D tensor in HOST memory, of size batchSize, with darkModeTensor[i] = 0/1)
* \param [in] roiTensorSrc ROI data in HOST memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y))
Change to roiTensorPtrSrc
* \param [in] brightnessCoefficient brightness modification parameter for snow calculation (1D tensor in pinned/HIP memory, of size batchSize with 1 < brightnessCoefficient[i] <= 4 for each image in batch)
* \param [in] snowThreshold threshold parameter for snow calculation (1D tensor in pinned/HIP memory, of size batchSize with 0 < snowThreshold[i] <= 1 for each image in batch)
* \param [in] darkMode darkMode values to set dark mode on/off (1D tensor in pinned/HIP memory, of size batchSize, with darkModeTensor[i] = 0/1)
* \param [in] roiTensorSrc ROI data in HIP memory, for each image in source tensor (2D tensor of size batchSize * 4, in either format - XYWH(xy.x, xy.y, roiWidth, roiHeight) or LTRB(lt.x, lt.y, rb.x, rb.y))
Change to roiTensorPtrSrc
@@ -3135,6 +3138,217 @@ inline void compute_color_temperature_24_host(__m256 *p, __m256 pAdj)
p[2] = _mm256_sub_ps(p[2], pAdj); // color_temperature adjustment Bs
}

inline void compute_snow_host(RpptFloatRGB *pixel, Rpp32f brightnessCoefficient, Rpp32f snowCoefficient, Rpp32s darkMode)
Can't these functions be part of the snow host code itself (snow.hpp)? @Dineshbabu-Ravichandran @sampath1117 please share your thoughts.
Because, as I remember, in the warp perspective case the helper functions are part of the same host file.
src/include/cpu/rpp_cpu_common.hpp
Outdated
pH = avx_p0; // hue = 0.0f;
pS = avx_p0; // sat = 0.0f;
pAdd = avx_p0; // add = 0.0f;
pL = _mm256_mul_ps(_mm256_add_ps(pCmax, pCmin), _mm256_set1_ps(0.5f)); // l = delta * 0.5
I think the comment on this line is wrong: the code computes (cmax + cmin) * 0.5, not delta * 0.5.
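For reference, a scalar sketch of what the vector line appears to compute, assuming pCmax and pCmin hold the per-pixel channel maximum and minimum as their names suggest. The function name hslLightness is hypothetical and not part of RPP; the input range check follows the documented Rpp32f range of 0 to 1.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical scalar helper (not an RPP function): HSL lightness of one pixel.
// Assumes normalized channels in [0, 1], matching the documented Rpp32f range.
inline float hslLightness(float r, float g, float b)
{
    assert(r >= 0.0f && r <= 1.0f && g >= 0.0f && g <= 1.0f && b >= 0.0f && b <= 1.0f);
    float cmax = std::max({r, g, b});   // largest channel
    float cmin = std::min({r, g, b});   // smallest channel
    return (cmax + cmin) * 0.5f;        // l = (cmax + cmin) * 0.5
}
```

With this reading, the trailing comment on the AVX line would be `// l = (cmax + cmin) * 0.5`.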
src/include/cpu/rpp_cpu_simd.hpp
Outdated
@@ -1555,6 +1574,13 @@ inline void rpp_load24_f32pln3_to_f32pln3_avx(Rpp32f *srcPtrR, Rpp32f *srcPtrG,
p[2] = _mm256_loadu_ps(srcPtrB);
}

inline void rpp_load24_f16pln3_to_f32pln3_avx(Rpp16f *srcPtrR, Rpp16f *srcPtrG, Rpp16f *srcPtrB, __m256 *p)
Please move this function before rpp_load24_f16pln3_to_f32pln3_avx, as in the threshold implementation.
src/include/cpu/rpp_cpu_simd.hpp
Outdated
@@ -1647,6 +1673,11 @@ inline void rpp_load8_f32_to_f32_avx(Rpp32f *srcPtr, __m256 *p)
p[0] = _mm256_loadu_ps(srcPtr);
}

inline void rpp_load8_f16_to_f32_avx(Rpp16f *srcPtr, __m256 *p)
Please move this function before rpp_load8_f32_to_f64_avx.
src/modules/hip/kernel/snow.hpp
Outdated
int globalThreads_x = (dstDescPtr->strides.hStride + 7) >> 3;
int globalThreads_y = dstDescPtr->h;
int globalThreads_z = handle.GetBatchSize();
Could int globalThreads_z = dstDescPtr->n; be used here instead?
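As a side note on the x dimension in the snippet above: `(hStride + 7) >> 3` is a ceiling division by 8, which fits a kernel where each thread handles 8 elements of a row (the kernel snippets in this PR use 8-wide loads and stores). A minimal sketch, with threadsForRow as a hypothetical name that is not part of RPP:

```cpp
#include <cassert>

// Hypothetical helper (not an RPP function): number of threads needed along x
// when each thread processes 8 consecutive elements of a row.
inline int threadsForRow(int rowStride)
{
    assert(rowStride >= 0);
    // Adding 7 before shifting right by 3 rounds the division by 8 upward,
    // so a partially filled final group of 8 still gets its own thread.
    return (rowStride + 7) >> 3;
}
```

For example, a row stride of 640 needs 80 threads, while 641 needs 81.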
}
else if ((srcDescPtr->dataType == RpptDataType::F16) && (dstDescPtr->dataType == RpptDataType::F16))
{
snow_f16_f16_host_tensor((Rpp16f*) (static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes),
Please use reinterpret_cast<Rpp16f*> here.
}
else if ((srcDescPtr->dataType == RpptDataType::F32) && (dstDescPtr->dataType == RpptDataType::F32))
{
snow_f32_f32_host_tensor((Rpp32f*) (static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes),
Please use reinterpret_cast<Rpp32f*> here.
}
else if ((srcDescPtr->dataType == RpptDataType::F16) && (dstDescPtr->dataType == RpptDataType::F16))
{
hip_exec_snow_tensor((half*) (static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes),
Use reinterpret_cast<half*> here.
}
else if ((srcDescPtr->dataType == RpptDataType::F32) && (dstDescPtr->dataType == RpptDataType::F32))
{
hip_exec_snow_tensor((Rpp32f*) (static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes),
Use reinterpret_cast.
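The pattern requested in these four comments could look roughly like the sketch below: step through the type-erased buffer in bytes, then reinterpret the result as the element type with a named cast instead of a C-style cast. The helper name offsetAs is hypothetical and not part of RPP; the PR may simply inline the cast at each call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an RPP function): advance a type-erased buffer by a
// byte offset, then view the result as a pointer to T.
template <typename T>
inline T *offsetAs(void *basePtr, std::size_t offsetInBytes)
{
    assert(basePtr != nullptr);
    // static_cast is sufficient for void* -> byte pointer; the byte pointer -> T*
    // step is a reinterpretation, so it is spelled out with reinterpret_cast.
    return reinterpret_cast<T *>(static_cast<std::uint8_t *>(basePtr) + offsetInBytes);
}
```

A call such as `offsetAs<float>(srcPtr, offsetInBytes)` would then stand in for `(Rpp32f*) (static_cast<Rpp8u*>(srcPtr) + srcDescPtr->offsetInBytes)`, assuming Rpp32f is a float.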
@Dineshbabu-Ravichandran please resolve the comments
src/modules/hip/kernel/snow.hpp
Outdated
}
else if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NHWC))
{
globalThreads_x = (srcDescPtr->strides.hStride + 7) >> 3;
I think this line is repeated; the same code is at L347.
I removed line L406.
src/modules/hip/kernel/snow.hpp
Outdated
if (roiType == RpptRoiType::LTRB)
hip_exec_roi_converison_ltrb_to_xywh(roiTensorPtrSrc, handle);
remove the empty line here
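For readers unfamiliar with the two ROI formats named in the parameter docs, here is a minimal sketch of an LTRB to XYWH conversion for a single ROI. The structs are hypothetical stand-ins rather than RPP's own types, and the sketch assumes the right-bottom corner is inclusive (hence the + 1); the actual conversion in hip_exec_roi_converison_ltrb_to_xywh is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two ROI layouts (not RPP's actual types).
struct RoiLtrb { int ltX, ltY, rbX, rbY; };          // left-top and right-bottom corners
struct RoiXywh { int x, y, roiWidth, roiHeight; };   // top-left corner plus size

// Assumption: rb is an inclusive corner, so width = rb.x - lt.x + 1.
inline RoiXywh ltrbToXywh(const RoiLtrb &roi)
{
    assert(roi.rbX >= roi.ltX && roi.rbY >= roi.ltY);
    return { roi.ltX, roi.ltY, roi.rbX - roi.ltX + 1, roi.rbY - roi.ltY + 1 };
}
```

Under that assumption, the LTRB ROI (10, 20, 109, 69) maps to x = 10, y = 20, roiWidth = 100, roiHeight = 50.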
src/modules/hip/kernel/snow.hpp
Outdated
rpp_hip_load8_and_unpack_to_float8(srcPtr + srcIdx, &pix_f8);
snow_hip_compute(srcPtr, &pix_f8, brightnessCoefficient, snowThreshold, darkMode);
rpp_hip_pack_float8_and_store8(dstPtr + dstIdx, &pix_f8);
remove the empty line
src/modules/hip/kernel/snow.hpp
Outdated
__device__ __forceinline__ void snow_8RGB_hip_compute(d_float24 *pix_f24, float *brightnessCoefficient, float *snowThreshold, int *darkMode)
{
snow_1RGB_hip_compute(&(pix_f24->f1[ 0]), &(pix_f24->f1[ 8]), &(pix_f24->f1[16]), brightnessCoefficient, snowThreshold, darkMode);
remove the extra space inside [] brackets
src/modules/hip/kernel/snow.hpp
Outdated
__device__ __forceinline__ void snow_1GRAY_hip_compute(float *pixel, float *brightnessCoefficient, float *snowThreshold, int *darkMode)
{
float l = *pixel;
please use some meaningful variable name instead of 'l'
I changed l to lightness.
src/modules/hip/kernel/snow.hpp
Outdated
l = l * fmaf((brightnessFactor - 1.0f), (1.0f - (l - lower_threshold) / (upper_threshold - lower_threshold)), 1.0f);
}
// Modify L
if(l <= *snowThreshold)
Remove the {} braces for this if statement.
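To make the scaling expression in the outdated snippet easier to read, here is a host-side sketch of that one line in isolation, using camel-case names as requested elsewhere in this review. The function name scaleLightness is hypothetical, and the conditions that guard this line in the kernel are not visible in the thread, so they are left out. The multiplier interpolates linearly from brightnessFactor at the lower threshold down to 1 at the upper threshold.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical host-side sketch (not an RPP function) of the scaling expression only.
inline float scaleLightness(float lightness, float brightnessFactor,
                            float lowerThreshold, float upperThreshold)
{
    assert(upperThreshold > lowerThreshold);
    // Relative position of lightness: 0 at lowerThreshold, 1 at upperThreshold.
    float position = (lightness - lowerThreshold) / (upperThreshold - lowerThreshold);
    // std::fma(a, b, c) computes a * b + c, like fmaf in the kernel snippet:
    // the scale is brightnessFactor at position 0 and 1 at position 1.
    float scale = std::fma(brightnessFactor - 1.0f, 1.0f - position, 1.0f);
    return lightness * scale;
}
```

For instance, with brightnessFactor = 2 and thresholds 0 and 0.5, a lightness of 0.25 is scaled by 1.5 to 0.375, while a lightness of 0.5 is left unchanged.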
src/include/cpu/rpp_cpu_common.hpp
Outdated
pixel->R = hueCoefficient[0];
pixel->G = hueCoefficient[1];
pixel->B = hueCoefficient[2];
Please remove the empty line here.
src/modules/hip/kernel/snow.hpp
Outdated
__device__ __forceinline__ void snow_1GRAY_hip_compute(float *pixel, float *brightnessCoefficient, float *snowThreshold, int *darkMode)
{
float l = *pixel;
float lower_threshold = 0.0f;
Don't use snake case for variable names, use only camel case
src/modules/hip/kernel/snow.hpp
Outdated
// Modify L
if(l <= *snowThreshold && !((hue >= 0.514f && hue <= 0.63f) && (sat >= 0.196f) && (l >= 0.196f)))
{
Please remove the {} braces when there is only one line inside a conditional statement or loop.
Please check all such instances in this PR and remove the braces.
*pixelG = rgb_f4.y;
*pixelB = rgb_f4.z;
}
please remove the empty line here
* Make initial changes for raw CPP version of warp perspective
* Fix calls to compute_warp_perspective_src_loc function
* Update changes to go through nearest neighbours case
* AVX HOST codes for warp perspective initial
* Fixes for accuracy in warp perspective
* More fixes for accuracy in warp perspective
* Update the code for AVX version of Planar to Planar
* Add bilinear u8 host code for warp perspective
* Make updates to include functions for F32 data type
* Make updates to use cast instead of set and fix issues with raw C implementation
* Add i8 host codes
* Add updates for F16 Bilinear Code
* Update the initial HIP code for warp perspective
* Update fixes for HIP code
* Add Warp Perspective Nearest Neighbors F16 code for PKD3_to_PLN3 and PLN3_to_PLN3
* Add updates for PLN to PLN configuration
* Add updates for PKD3 to PKD3 case
* Rename variables
* Update changes to log images separately for Bilinear and Nearest Neighbors
* fixed bug in raw c code of PKD-PLN variant
* minor bug fix for F16 PLN variants
* minor fixes in HOST test suite
* Update the HIP code for review comments and refactoring of device functions
* Update the comments alignment
* Rename functions and add cases in HOST and HIP runTests.py
* Update indentations for computations and rename vectors
* Update documentations and add more reference variables
* Make more formatting changes
* Make further updates by including test cases
* Make updates to use reinterpret cast
* Update reinterpret casts for PLN to PLN configuration u8 and i8 codes
* Make updates to enclose code inside AVX2 flag
* Make further changes to update type casting
* Update the version
* Make updates to add warp perspective image
* Modify comments, update CHANGELOG and update flags
* Update further comments in warp perspective
* Add more comments for warp perspective
* Update based on further review comments
* Update the case number for warp_perspective in common.py
* Address review comments
* Fix conflicts with warp perspective
* Update version details
* Merge branch 'ar/opt_warp_perspective' of https://github.com/r-abishek/rpp into opt_warp_perspective_rebased
* Update version to 1.9.10 including warp perspective
* Updates to convert to XYWH from LTRB instead of opposite
* Update CHANGELOG.md
  Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
* Revert changes and convert to ltrb instead of xywh
---------
Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Rajy Rawther <Rajy.MeeyakhanRawther@amd.com>
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>