8000 Support sorting based on values from a dictionary. by Subv · Pull Request #5140 · realm/realm-core · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Support sorting based on values from a dictionary. #5140

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

Subv
Copy link
Contributor
@Subv Subv commented Jan 7, 2022

What, How & Why?

This is a draft and a request for comments about the feature and its implementation, I figured the best way to ask for help was to open a PR.

Note that it is currently possible to filter by a value stored under a specific dictionary key (using the dictionary['key'] syntax) but is impossible to sort by that same key, this PR aims to change that.

I added some tests and FIXME comments about what I'm unsure about. This is my first time actually touching Realm's core code :)

Sorting is implemented by storing the parent dictionary's ColKey and the specific dictionary key we want to sort by.

This is a work-in-progress, the performance characteristics of searching the dictionary for the desired key on every iteration of the sort predicate are unknown.

The dictionary ColKey and the desired sort key are currently stored inside the ColKey structure to avoid changing too much of the SortDescriptor interface.

Sorting by dictionary keys using keyPath is currently not implemented (but desired).

In the future this could be extended to support sorting on specific indexes of a collection, ie. sort('array[0]').

Pain points

  • Changing the ColKey structure to store the extra fields needed feels like a hack, and makes it way bigger than a single 64 bit integer, this probably affects the cost of copying it all throughout the codebase. Maybe it would be better to store this data directly inside the SortDescriptor? This data is now stored in DictionaryKey and SortableColumnKey.
  • The sort currently gets the parent dictionary and queries it for the sort key on every iteration of the Sorter predicate, I'm not sure how fast that is, but the answer is probably "not at all".
  • Currently only dictionaries of integers/strings are supported, this could probably be extended to arbitrary-length paths. Following links through the dictionary keys is now supported, it was nice low-hanging fruit after the SortableColumnKey implementation.
  • There is very little error-checking in this patch, I wanted to get it out there and see what reaction I get about the feature.

Example scenario

When you have data in the Entity-Attribute-Value pattern with dynamic attributes and want to allow the users of your application to sort by any of the fields, it's more convenient to have a dictionary of attributes keyed by attribute id in your table and be able to filter and sort by that directly.
Currently the way to solve that is to extract your Attributes into another table, filter that table by AttributeId, and then sort it based on the AttributeValue, after that you can grab the parent objects by looping through the attribute objects and grabbing a backlink to the parent Entity. But this seems convoluted when all you want is a simple sort.

Update 1

Following feedback I reworked this patch so that the ColKey structure is unchanged. A new DictionaryKey structure was added to hold the required information to reference the specific dictionary keys, and the SortDescriptor and Sorter now deal with a derived type of std::variant<ColKey, DictionaryKey> called SortableColumnKey that deals with dispatching the proper key-specific behavior depending on the key type.

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)

@cla-bot
Copy link
cla-bot bot commented Jan 7, 2022

Realm welcomes all contributions! The only requirement we have is that, like many other projects, we need to have a Contributor License Agreement (CLA) in place before we can accept any external code. Our own CLA is a modified version of the Apache Software Foundation’s CLA. Our records show that CLA has not been signed by @Subv. Please submit your CLA electronically using our Google form so we can accept your submissions. After signing the CLA you can recheck this PR with a @cla-bot check comment. The GitHub usernames you file there will need to match that of your Pull Requests. If you have any questions or cannot file the CLA electronically, make a comment here and we will be happy to help you out.

@Subv
Copy link
Contributor Author
Subv commented Jan 7, 2022

@cla-bot check

@cla-bot cla-bot bot added the cla: yes label Jan 7, 2022
@cla-bot
Copy link
cla-bot bot commented Jan 7, 2022

The cla-bot has been summoned, and re-checked this pull request!

@jedelbo
Copy link
Contributor
jedelbo commented Jan 7, 2022

@Subv Thank you for your work. I absolutely get the motivation for doing this, but as you have found out, the current design of the sort mechanism is not exactly geared towards this. I will definitely avoid adding to the ColKey struct! A more viable solution could be to have ColumnsDescriptor::m_column_keys be a vector of ColKey/Mixed union type of thing. Whether this can be done in an elegant way, I am not sure.

@Subv Subv force-pushed the subv/sort-on-dictionary-keys branch 3 times, most recently from 4e3bbc7 to e2a55e7 Compare January 8, 2022 03:29
@Subv
Copy link
Contributor Author
Subv commented Jan 8, 2022

@jedelbo Thank you for the feedback!

I reworked the patch a little to use a custom type derived from std::variant called SortableColumnKey and modified the sort descriptor code so that it can work with this new type. The variant takes care of abstracting the key-type-specific behaviors using std::visit.

@tgoyne
Copy link
Member
tgoyne commented Jan 10, 2022

This seems like a reasonable approach to me. child_key probably needs to be a std::string as SortDescriptors can outlive the scope in which they're created and are used on different threads. We can't use std::variant because it requires a higher iOS deployment target than we currently have, but mpark::variant is a drop-in replacement for it.

@Subv
Copy link
Contributor A 8000 uthor
Subv commented Jan 11, 2022

@tgoyne Thanks! I included the change for std::string child keys and the std::variant -> mpark::variant change.

@Subv Subv marked this pull request as ready for review January 14, 2022 04:00
@Subv Subv changed the title [WIP] Support sorting based on values from a dictionary. Support sorting based on values from a dictionary. Jan 14, 2022
Subv added 4 commits February 19, 2022 09:31
Sorting is implemented by storing the parent dictionary's ColKey and the specific dictionary key we want to sort by.  This information is stored inside a new type of key called DictionaryKey.

A new variant-derived structure called SortableColumnKey was added to handle dispatching the key-specific behavior needed to support both ColKeys and DictionaryKeys in the Sorter. There is currently a workaround to a GCC bug, see https://stackoverflow.com/questions/51309467/using-stdvisit-on-a-class-inheriting-from-stdvariant-libstdc-vs-libc and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90943 for more information.

Sorting by direct dictionary keys and following links through a dictionary during sorting both work.

This is a work-in-progress, the performance characteristics of searching the dictionary for the desired key on every iteration of the sort predicate are unknown.

Sorting by dictionary keys using keyPath is currently not implemented (but desired).

In the future this could be extended to support sorting on specific indexes of a collection, ie. sort('array[0]'), by adding a new variant type to the SortableColumnKey structure and implementing the necessary visit lambdas.
…ctionaryKey.

The DictionaryKeys may live on their own without being attached to their creator's lifetime.
@Subv Subv force-pushed the subv/sort-on-dictionary-keys branch from 218da95 to 2711893 Compare February 19, 2022 14:43
@Subv
Copy link
Contributor Author
Subv commented Feb 19, 2022

Rebased the PR to fix the conflicts and use the Table::check_column function instead of Table::report_invalid_link

@jedelbo
Copy link
Contributor
jedelbo commented Mar 7, 2022

@Subv Your recent updates is more in line with what I would expect. However, I think that using the mpark::variant framework here is a bit of an overkill in this case. Instead of asking you tome make another attempt, I have created a competing PR (#5311). It is building on your work, but simplifies it a bit. This new PR is still in draft as I have to make some performance measurements in order to check how much the changes matter. Also some work on the query parser is missing.

@jedelbo jedelbo removed their request for review March 8, 2022 08:07
@bmunkholm bmunkholm requested a review from jedelbo March 29, 2023 11:01
@jedelbo
Copy link
Contributor
jedelbo commented Apr 3, 2023

Closing as this is superseded by #5311

@jedelbo jedelbo closed this Apr 3, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0