10000 Change MaxBy/MinBy to return all extrema elements (#328) by atifaziz · Pull Request #369 · morelinq/MoreLINQ · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Change MaxBy/MinBy to return all extrema elements (#328) #369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 10, 2018

Conversation

atifaziz
Copy link
Member

This PR addresses #328.

@atifaziz atifaziz changed the title Change MaxBy/MinBy to return all matching elements (#328) Change MaxBy/MinBy to return all extrema elements (#328) Jan 25, 2018
@atifaziz atifaziz requested a review from fsateler February 10, 2018 14:57
@fsateler
Copy link
Member

Overall looks good to me. I have just one doubt: Now that this metod returns multiple elements, it could be deferred? Or in other words, why make execution immediate?

@atifaziz
Copy link
Member Author

@fsateler That's a fair point. I'll revise the methods to use deferred execution semantics.

@atifaziz
Copy link
Member Author

@fsateler Let me know what you make of 0b14a71. I ended up using a lazy list so that any use of downstream operators optimized for lists will be efficient.

Copy link
Member
@fsateler fsateler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up using a lazy list so that any use of downstream operators optimized for lists will be efficient.

Normally I wouldn't like this much, as this would force entire evaluation for doing operations such as First() or Any(). However, in this case we still need to evaluate the entire sequence to return the first element. So this removes the objection, and thus seems fine to me (although implementing IReadOnlyList<T> too would be nice).

However, if the purpose of returning this lazy list is to allow downstream operators to optimize, why not return IList<T> directly?

(I'm posting this as Comment rather than Request changes because I don't think getting the answers to my questions wrong is bad enough to block merging).

var comparison = comparer(key, extremaKey);
if (comparison > 0)
{
extrema = new List<TSource> { item };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for not using extrema.Clear() + extrema.Add(item) ? Using this could save some allocations and GC pressure, at the cost of potentially keeping more memory allocated than strictly necessary (ie, if there are many equal keys that appear before the true maximal key).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at the cost of potentially keeping more memory allocated than strictly necessary

I think you answered the question yourself.

The cost is going to be different for everyone depending on what's going to be their most common case. If you've collected the wrong list of extrema and going to restart it then Clear is going to spend time zero-ing out the internal array to release the elements. You then also risk, under memory pressure conditions, promoting the list and the internal array to the next generation. If you use TrimExcess then you still end up with an allocation and so you're only saving the trivial cost of a List<> allocation. The simplest thing to do therefore is to let the 8000 system do the balancing act and start afresh with a new list each time it needs to be restarted. The clever thing to do would be to build an adaptive system, but I don't think we want to get into that level of complexity (YAGNI?). Even OrderBy is hard-wired to one sorting algorithm.

@atifaziz
Copy link
Member Author

However, if the purpose of returning this lazy list is to allow downstream operators to optimize, why not return IList<T> directly?

How do you propose to do that lazily? Perhaps I'm not seeing something you are. An IList<> is what was being returned before it was made lazy. Are you suggesting to change the return type? The optimisation usually rely on the run-time type so they'll kick-in nonetheless.

@leandromoh
Copy link
Collaborator
leandromoh commented Mar 10, 2018

Why not simple follow the same approach as ScanRight, TakeLast and CountBy: store the resulting elements in a collection and so yield each element of the collection to do the operator lazy?

@atifaziz
Copy link
Member Author

@leandromoh Good suggestion and addressed in f6586bc.

@leandromoh
Copy link
Collaborator

LGTM. When do you plan to merge?

@atifaziz atifaziz merged commit 82e6538 into morelinq:master Mar 10, 2018
@atifaziz atifaziz deleted the extrema-by branch March 10, 2018 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0