-
Notifications
You must be signed in to change notification settings - Fork 420
Change MaxBy/MinBy to return all extrema elements (#328) #369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Overall looks good to me. I have just one doubt: Now that this metod returns multiple elements, it could be deferred? Or in other words, why make execution immediate? |
@fsateler That's a fair point. I'll revise the methods to use deferred execution semantics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up using a lazy list so that any use of downstream operators optimized for lists will be efficient.
Normally I wouldn't like this much, as this would force entire evaluation for doing operations such as First()
or Any()
. However, in this case we still need to evaluate the entire sequence to return the first element. So this removes the objection, and thus seems fine to me (although implementing IReadOnlyList<T>
too would be nice).
However, if the purpose of returning this lazy list is to allow downstream operators to optimize, why not return IList<T>
directly?
(I'm posting this as Comment rather than Request changes because I don't think getting the answers to my questions wrong is bad enough to block merging).
var comparison = comparer(key, extremaKey); | ||
if (comparison > 0) | ||
{ | ||
extrema = new List<TSource> { item }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for not using extrema.Clear()
+ extrema.Add(item)
? Using this could save some allocations and GC pressure, at the cost of potentially keeping more memory allocated than strictly necessary (ie, if there are many equal keys that appear before the true maximal key).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at the cost of potentially keeping more memory allocated than strictly necessary
I think you answered the question yourself.
The cost is going to be different for everyone depending on what's going to be their most common case. If you've collected the wrong list of extrema and going to restart it then Clear
is going to spend time zero-ing out the internal array to release the elements. You then also risk, under memory pressure conditions, promoting the list and the internal array to the next generation. If you use TrimExcess
then you still end up with an allocation and so you're only saving the trivial cost of a List<>
allocation. The simplest thing to do therefore is to let the
8000
system do the balancing act and start afresh with a new list each time it needs to be restarted. The clever thing to do would be to build an adaptive system, but I don't think we want to get into that level of complexity (YAGNI?). Even OrderBy
is hard-wired to one sorting algorithm.
How do you propose to do that lazily? Perhaps I'm not seeing something you are. An |
@leandromoh Good suggestion and addressed in f6586bc. |
LGTM. When do you plan to merge? |
This PR addresses #328.