Don't insert a duplicate import when applying package autocorrects #8765

neilparikh · 2025-04-12T00:52:10Z

When applying autocorrects in core/ErrorFlusherStdout.cc, we will now ignore the autocorrects that are set on the missing import errors. Instead, we store all the missing imports for a given package in PackageInfo, and then aggregate those to compute the autocorrect. This means that 2 missing import errors for the same missing import will only lead to the import being inserted once

Something similar is done in code_action.cc. There, we have a list of the all the edits for autocorrects for the current file. While constructing that list, we will skip missing import errors. Then, we aggregate the autocorrects from the PackageInfo, but only those that apply for the current file.

Easier to review commit-by-commit.

Motivation

Don't import the same package more than once.

Test plan

See included automated tests.

froydnj

I think this is the right general idea, but it is the wrong layer for the implementation to live at -- ErrorFlusherStdout should not have any knowledge of the packager errors, for instance. (Note also that we'd have to write separate support for the LSP implementation, which is not desirable.)

I think it's possible to rewrite this such that the worker threads in the VisibilityChecker maintain their own lists of any import errors that have happened, and they can send those accumulated errors off to a queue which gets drained and deduplicates errors before sending them off to the actual error queue in the GlobalState. (I believe that we do this kind of pattern in at least one other place?) I do not think we want to be storing this kind of information in the package DB -- we can maintain this same mapping in whatever structure we have in the VisibilityChecker instead. Happy to explain/explore more over Slack.

froydnj · 2025-04-14T15:23:37Z

packager/VisibilityChecker.cc

            size_t idx;
            for (auto result = taskq->try_pop(idx); !result.done(); result = taskq->try_pop(idx)) {
                ast::ParsedFile &f = files[idx];
                if (!f.file.data(gs).isPackage(gs)) {
                    auto pkgName = gs.packageDB().getPackageNameForFile(f.file);
                    if (pkgName.exists()) {
                        core::Context ctx{gs, core::Symbols::root(), f.file};
-                        VisibilityCheckerPass pass{ctx, gs.packageDB().getPackageInfo(pkgName)};
+                        VisibilityCheckerPass pass{ctx, *nonConstGs.packageDB().getPackageInfoNonConst(pkgName)};


It is possible for this kind of pattern -- having non-constant shared data used in worker threads -- to be safe, but it is not safe in this instance. We could have multiple files that share the same package both have import errors, and then separate threads would be modifying the same package without any locking between them.

froydnj

Sorry about the review delay here, I think this is better than the previous version. Still a few things to fix.

IIUC, this is going to collide with @jez's PackageRef work, so you may want to coordinate -- if you haven't already -- on what the right thing to do there is.

froydnj · 2025-05-09T15:55:25Z

packager/VisibilityChecker.cc

+                resultq->push({f.file, std::move(pass.missingImports)}, 1);
            }

            barrier.DecrementCount();
        });

        barrier.Wait();

+        std::pair<core::FileRef, UnorderedMap<std::pair<core::packages::MangledName, core::packages::ImportType>,


I think this will have to be rebased across #8852.

Even after that's done, we should make the previous loop here use multiplexJob so that the aggregation loop below can proceed in parallel with the processing of the visibility checking and not wait for all the visibility checks to be done.

froydnj · 2025-05-09T16:01:57Z

packager/packager.cc

+        for (auto &[key, value] : trackedMissingImports_) {
+            auto importName = key.first;
+            auto importType = key.second;
+            auto autocorrect = value.second;


Suggested change

auto autocorrect = value.second;

auto &autocorrect = value.second;

froydnj · 2025-05-09T16:02:21Z

packager/packager.cc

+                // test_import.
+                continue;
+            }
+            allEdits.insert(allEdits.end(), autocorrect.edits.begin(), autocorrect.edits.end());


These autocorrect.edits should probably be move iterators?

froydnj · 2025-05-09T16:02:37Z

packager/packager.cc

+        std::vector<core::AutocorrectSuggestion::Edit> allEdits;
+        for (auto &[_, value] : trackedMissingImports_) {
+            auto files = value.first;
+            auto autocorrect = value.second;


Suggested change

auto autocorrect = value.second;

auto &autocorrect = value.second;

froydnj · 2025-05-09T16:02:54Z

packager/packager.cc

+            auto files = value.first;
+            auto autocorrect = value.second;
+            if (files.contains(file)) {
+                allEdits.insert(allEdits.end(), autocorrect.edits.begin(), autocorrect.edits.end());


Same comment on move iterators here.

froydnj · 2025-05-09T16:05:24Z

packager/VisibilityChecker.cc

+                // TODO(neil): find out why this is sometimes not true and enable this ENFORCE
+                /* ENFORCE(autocorrect.has_value()); */


Are we doing this TODO before committing?

froydnj · 2025-05-09T16:06:56Z

core/AutocorrectSuggestion.h

-        : title(title), edits(edits), isDidYouMean(isDidYouMean) {}
+    bool skipWhenAggregated;
+
+    AutocorrectSuggestion() : title(""), edits({}), isDidYouMean(false), skipWhenAggregated(false) {}


Minor nit: I would be inclined to declare the (bool-only) fields as bool isDidYouMean = false; in the class definition and then this constructor could just be AutocorrectSuggestion() = default;.

froydnj · 2025-05-09T16:09:07Z

packager/packager.cc

+            if (files.contains(file)) {
+                files.erase(file);
+            }


Nit: better to do something like:

auto it = files.find(file); if (it == files.end()) { continue; } files.erase(it);

so you don't do unnecessary hash lookups.

neilparikh changed the title ~~Neil dedup import cli~~ Don't insert a duplicate import when applying package autocorrects in -a mode Apr 12, 2025

neilparikh marked this pull request as ready for review April 12, 2025 00:54

neilparikh requested a review from a team as a code owner April 12, 2025 00:54

neilparikh requested review from froydnj and removed request for a team April 12, 2025 00:54

froydnj requested changes Apr 14, 2025

View reviewed changes

neilparikh force-pushed the neil-dedup-import-cli branch 2 times, most recently from 6de52db to 93edb89 Compare April 21, 2025 01:10

neilparikh added 3 commits April 24, 2025 16:34

simplify some code

04f6bf4

add a CLI test

c333dbc

add a LSP test

6511595

neilparikh force-pushed the neil-dedup-import-cli branch 2 times, most recently from e0f5469 to b011b82 Compare April 25, 2025 00:24

neilparikh changed the title ~~Don't insert a duplicate import when applying package autocorrects in -a mode~~ Don't insert a duplicate import when applying package autocorrects Apr 25, 2025

neilparikh requested a review from froydnj April 25, 2025 00:26

neilparikh added 4 commits April 25, 2025 11:55

track missing imports

c80abe4

aggregate imports

f21d80e

use the tracked missing imports to compute the final edit

55b329f

use the tracked missing imports to compute apply all autocorrect in lsp

767cd2b

neilparikh force-pushed the neil-dedup-import-cli branch from b011b82 to 767cd2b Compare April 25, 2025 18:55

froydnj requested changes May 9, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Don't insert a duplicate import when applying package autocorrects #8765

Don't insert a duplicate import when applying package autocorrects #8765

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

	auto autocorrect = value.second;
	auto &autocorrect = value.second;

		// TODO(neil): find out why this is sometimes not true and enable this ENFORCE
		/* ENFORCE(autocorrect.has_value()); */

Don't insert a duplicate import when applying package autocorrects #8765

Are you sure you want to change the base?

Don't insert a duplicate import when applying package autocorrects #8765

Uh oh!

Conversation

Uh oh!

Motivation

Test plan

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!