Self-contain `sleap.nn` code · Issue #2159 · talmolab/sleap · GitHub

Self-contain sleap.nn code #2159


Open
roomrys opened this issue Apr 9, 2025 · 1 comment

Comments

@roomrys
Collaborator
roomrys commented Apr 9, 2025

Goal

The goal of the PRs that solve this issue is to self-contain the sleap.nn code and essentially create a "sleap-nn" placeholder. Unlike sleap-nn, which has turned into a long-running development project, these PRs will just rearrange existing code so that all of the sleap.nn code is self-contained and can be moved to its own package. This will also support 🤞🏼 seamless replacement with the actual sleap-nn package.

Caveat

However, there are a few places where sleap.gui imports from the sleap.nn code. Here we keep track of those places and plan resolutions (which will also affect the main sleap repo). Ideally, sleap.nn code is ONLY used (and only needs sleap-nn installation) IF the user wants to run training through the GUI.
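One way to get the "only needed IF the user trains through the GUI" behavior is to defer the import until training is actually requested. Below is a minimal sketch of that pattern, not existing SLEAP code: `require_backend` is a hypothetical helper name, and the error message is only illustrative.

```python
import importlib
from types import ModuleType


def require_backend(module_name: str = "sleap.nn") -> ModuleType:
    """Import the training backend on demand (hypothetical helper).

    Nothing is imported at GUI start-up; the backend is only loaded when
    a caller asks for it, and a missing install gives a readable error.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"Training from the GUI requires the optional '{module_name}' package."
        ) from exc
```

A GUI entry point such as a training runner could call `require_backend()` at the top of the function body instead of importing at module level, so the rest of the GUI keeps working without the backend installed.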

Suggestion

We often access the model and training job config classes through the GUI, which is expected since we want to be able to write training configurations and load model/training configurations from the GUI. Perhaps the best place for these classes is in the middle-man sleap-io library?

Places where sleap.nn code is used (outside of sleap.nn)

sleap/__init__.py

sleap/sleap/__init__.py

Lines 14 to 21 in ad7c563

import sleap.nn
from sleap.nn.data import pipelines
from sleap.nn import inference
from sleap.nn.inference import load_model, export_model
from sleap.nn.system import use_cpu_only, disable_preallocation
from sleap.nn.system import summary as system_summary
from sleap.nn.config import TrainingJobConfig, load_config
from sleap.nn.evals import load_metrics


sleap/instance.py


sleap/gui/learning/config.py

  • ConfigFileInfo.from_config_file uses sleap.nn.config.TrainingJobConfig
  • TrainingJobConfigsGetter.try_loading_path uses sleap.nn.config.TrainingJobConfig

sleap/gui/learning/datagen.py

  • show_datagen_preview uses sleap.nn.data.providers.LabelsReader
  • make_datagen_results uses methods from sleap.nn.data

sleap/gui/learning/receptivefield.py

  • receptive_field_info_from_model_cfg uses sleap.nn.model.ModelConfig and sleap.nn.model.Model
  • ReceptiveFieldWidget.setModelConfig uses sleap.nn.model.ModelConfig

sleap/gui/learning/runners.py

  • write_pipeline_files uses sleap.nn.training.setup_new_run_folder
  • run_gui_training uses sleap.nn.training.setup_new_run_folder
  • train_subprocess uses sleap.nn.config.TrainingJobConfig

sleap/gui/learning/scopedkeydict.py

  • make_training_config_from_key_val_dict uses sleap.nn.config.TrainingJobConfig
  • make_model_config_from_key_val_dict uses sleap.nn.config.ModelConfig

sleap/gui/learning/base.py


sleap/gui/widgets/monitor.py


sleap/info/trackcleaner.py


sleap/io/dataset.py


sleap/io/video.py


@talmo
Collaborator
talmo commented Apr 10, 2025

I think the tough part of the refactor here will be the config.

The config (TrainingJobConfig) serves multiple purposes:

  1. Specify the hyperparameters for training
  2. Document metadata about the model

(1) is specific to the backend implementation. The sleap-nn config will be a bit different and won't have the exact same fields. The problem is that the GUI right now is pretty tightly coupled to this. We could remap the fields, but the logic in the config getter and builder is a nightmare.

The training config editor is overdue for a refresh and we have to gut it no matter what. I propose we have two UIs: a simple one with the most common presets and parameters, and another that is basically a field-value table that has the full list of config items. The first requires explicit mappings, but this would be more manageable with fewer fields. The second could be auto-populated from a schema, so no change is required for the different backends -- and it exposes all the settings for advanced users automatically.
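To make the "auto-populated field-value table" idea concrete, here is a rough sketch that walks a nested config and yields one row per leaf field. Everything in it is an assumption for illustration: `OptimizerConfig`, `ExampleJobConfig`, and `iter_rows` are made-up names, and plain dataclasses stand in for whatever schema the real backend config exposes.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Iterator, Tuple


@dataclass
class OptimizerConfig:  # hypothetical stand-in for a backend config section
    learning_rate: float = 1e-4
    batch_size: int = 4


@dataclass
class ExampleJobConfig:  # hypothetical stand-in for a full training config
    run_name: str = "demo"
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)


def iter_rows(cfg: Any, prefix: str = "") -> Iterator[Tuple[str, str, Any]]:
    """Yield (dotted_key, type_name, value) for every leaf field of a config."""
    for f in fields(cfg):
        value = getattr(cfg, f.name)
        key = f"{prefix}{f.name}"
        if is_dataclass(value):
            # Recurse into nested sections so the table stays flat.
            yield from iter_rows(value, prefix=f"{key}.")
        else:
            yield key, type(value).__name__, value


rows = list(iter_rows(ExampleJobConfig()))
```

Because the rows come from introspection, the table widget would not need per-field mappings; swapping the backend only changes which config object gets passed in.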

(2) could be made to be a bit more abstracted from the backend and is probably the right pattern, but it'll take a bit of work to implement. We need to list out what properties of the model are necessary in the GUI (e.g., model type, number of labeled frames, etc.) in a dataclass. Then we have something that pulls out or infers these values from the config.
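A sketch of what that could look like, under heavy assumptions: `ModelMetadata`, `metadata_from_config`, and the `"heads"` / `"n_labeled_frames"` keys are hypothetical and do not reflect the actual config layout of sleap.nn or sleap-nn. The point is only that the GUI depends on a small backend-agnostic dataclass, with one adapter per backend that fills it in.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ModelMetadata:
    """Backend-agnostic properties the GUI needs (hypothetical field list)."""

    model_type: str
    n_labeled_frames: Optional[int] = None


def metadata_from_config(cfg: Mapping[str, Any]) -> ModelMetadata:
    """Pull or infer GUI metadata from a plain config mapping.

    Assumes (for illustration only) that the config has a "heads" mapping
    where exactly the configured head is non-None.
    """
    heads = cfg.get("heads", {})
    model_type = next(
        (name for name, spec in heads.items() if spec is not None), "unknown"
    )
    return ModelMetadata(
        model_type=model_type,
        n_labeled_frames=cfg.get("n_labeled_frames"),
    )


example = metadata_from_config(
    {"heads": {"single_instance": None, "centroid": {"sigma": 2.5}}, "n_labeled_frames": 120}
)
```

With this split, the GUI would never import a backend config class directly; it would only see `ModelMetadata`.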

The tricky part of the decoupling will be dealing with config fields that are computed on the fly downstream. For example, some config fields are inferred from the data, which requires iterating over the dataset, so we defer those to model building in the backend. Some of this logic is in the network factories themselves (e.g., the UNet class), so it's hard to not duplicate that logic if we can't import sleap.nn -- unless we define the high-level set of model metadata.
