8000 Error reporting improvements by dspinellis · Pull Request #92 · uutils/sed · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Error reporting improvements #92

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

dspinellis
Copy link
Collaborator
@dspinellis dspinellis commented Jun 16, 2025

This PR improves the quality of runtime error reporting to include the associated sed command location and also the input file location when needed.

Specify the argument number rather than its initial string, as the
latter can be ambiguous.
Following row numbering and most common convention.
These can be used to improve runtime error reporting.

The implementation required moving ScriptValue to script_line_provider
to avoid the circular use chain.
This is needed for improved error reporting.
This will be used to also host the runtime error function.
In contrast to the whole mutable and large Command struct, this is immutable
and small, and can therefore can be easily copied around for reporting
runtime errors.
Only provide location to the Fancy RE engine, which is the only one
requiring it. This should improve the following 2-10% fall in benchmark
results that appeared with the improved error reporting.

This is the original fall in performance after adding error location
information..
           no-op-short previous is  1.05 times faster than error-loc
      access-log-no-op previous is  1.05 times faster than error-loc
   access-log-no-subst previous is  1.06 times faster than error-loc
      access-log-subst previous is  1.05 times faster than error-loc
     access-log-no-del previous is  1.07 times faster than error-loc
    access-log-all-del previous is  1.05 times faster than error-loc
   access-log-translit previous is  1.06 times faster than error-loc
access-log-complex-sub previous is  1.01 times faster than error-loc
             remove-cr previous is  1.07 times faster than error-loc
          genome-subst previous is  1.10 times faster than error-loc
            number-fix previous is  1.11 times faster than error-loc
           long-script previous is  1.03 times faster than error-loc
                 hanoi error-loc     is similarly fast as   previous
             factorial previous is  1.02 times faster than error-loc

The change of this commit improves performance in most cases as follows.
           no-op-short error-loc is  1.05 times faster than collaped-re
      access-log-no-op collaped-re is  1.04 times faster than error-loc
   access-log-no-subst collaped-re is  1.05 times faster than error-loc
      access-log-subst collaped-re is  1.04 times faster than error-loc
     access-log-no-del collaped-re is  1.05 times faster than error-loc
    access-log-all-del collaped-re is  1.05 times faster than error-loc
   access-log-translit error-loc     is similarly fast as   collaped-re
access-log-complex-sub collaped-re is  1.02 times faster than error-loc
             remove-cr collaped-re is  1.05 times faster than error-loc
          genome-subst collaped-re is  1.06 times faster than error-loc
            number-fix collaped-re is  1.05 times faster than error-loc
           long-script collaped-re is  1.01 times faster than error-loc
                 hanoi error-loc is  1.01 times faster than collaped-re
             factorial collaped-re is  1.03 times faster than error-loc

This results in a smaller pessimization over the original version before
the error location information was added.
           no-op-short previous is  1.10 times faster than collaped-re
      access-log-no-op previous is  1.02 times faster than collaped-re
   access-log-no-subst previous is  1.01 times faster than collaped-re
      access-log-subst collaped-re     is similarly fast as   previous
     access-log-no-del previous is  1.02 times faster than collaped-re
    access-log-all-del collaped-re     is similarly fast as   previous
   access-log-translit previous is  1.06 times faster than collaped-re
access-log-complex-sub collaped-re is  1.01 times faster than previous
             remove-cr previous is  1.02 times faster than collaped-re
          genome-subst previous is  1.04 times faster than collaped-re
            number-fix previous is  1.06 times faster than collaped-re
           long-script previous is  1.02 times faster than collaped-re
                 hanoi previous is  1.01 times faster than collaped-re
             factorial collaped-re     is similarly fast as   previous
Copy link
codecov bot commented Jun 16, 2025

Codecov Report

Attention: Patch coverage is 0% with 175 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (1d3f719) to head (9986b8d).

Files with missing lines Patch % Lines
src/uu/sed/src/error_handling.rs 0.00% 66 Missing ⚠️
src/uu/sed/src/processor.rs 0.00% 64 Missing ⚠️
src/uu/sed/src/named_writer.rs 0.00% 20 Missing ⚠️
src/uu/sed/src/command.rs 0.00% 11 Missing ⚠️
src/uu/sed/src/compiler.rs 0.00% 11 Missing ⚠️
src/uu/sed/src/fast_io.rs 0.00% 1 Missing ⚠️
src/uu/sed/src/script_line_provider.rs 0.00% 1 Missing ⚠️
src/uu/sed/src/sed.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff          @@
##            main     #92   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         12      13    +1     
  Lines       2632    2714   +82     
  Branches     225     223    -2     
=====================================
- Misses      2632    2714   +82     
Flag Coverage Δ
macos_latest 0.00% <0.00%> (ø)
ubuntu_latest 0.00% <0.00%> (ø)
windows_latest 0.00% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Instead, detect and handle runtime errors at the point where regex
methods are called.

This is always faster or the same as the preceding method.

           no-op-short current is  1.03 times faster than preceding
      access-log-no-op current     is similarly fast as   preceding
   access-log-no-subst current is  1.05 times faster than preceding
      access-log-subst current     is similarly fast as   preceding
     access-log-no-del current is  1.05 times faster than preceding
    access-log-all-del current is  1.04 times faster than preceding
   access-log-translit current     is similarly fast as   preceding
access-log-complex-sub current     is similarly fast as   preceding
             remove-cr current is  1.03 times faster than preceding
          genome-subst current     is similarly fast as   preceding
            number-fix current     is similarly fast as   preceding
           long-script current is  1.01 times faster than preceding
                 hanoi current is  1.02 times faster than preceding
             factorial current     is similarly fast as   preceding

Also, this rectified most performance pessimization introduced by adding
error locations in fast_regex.

           no-op-short previous  is  1.07 times faster than error-loc
      access-log-no-op previous     is similarly fast as    error-loc
   access-log-no-subst error-loc is  1.04 times faster than previous
      access-log-subst previous     is similarly fast as    error-loc
     access-log-no-del error-loc is  1.03 times faster than previous
    access-log-all-del error-loc is  1.04 times faster than previous
   access-log-translit previous  is  1.06 times faster than error-loc
access-log-complex-sub error-loc is  1.02 times faster than previous
             remove-cr previous     is similarly fast as    error-loc
          genome-subst previous  is  1.04 times faster than error-loc
            number-fix previous  is  1.06 times faster than error-loc
           long-script previous     is similarly fast as    error-loc
                 hanoi previous     is similarly fast as    error-loc
             factorial previous     is similarly fast as    error-loc

TODO: In fast_regex distinguish between UTF-8 conversion and regex
errors.  The former should be reported as I/O errors with input file
location info, which the latter should be reported as script errors
with script file location info.
@dspinellis
Copy link
Collaborator Author

@sylvestre This PR is ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant
0