enh: Device and dtype management #93

stephane-rivaud · 2025-04-20T15:00:09Z

Implement device and dtype handling in 8000 GrowingContainer, GrowingModule, and TensorStatistic classes. Before these modifications, device and dtype was handled with native pytorch function. The following pull request give a better implementation.

…Module, and TensorStatistic classes.

…, GrowingModule, and TensorStatistic classes

…ner, GrowingModule, and TensorStatistic classes; refactor growing layers initialization to use ModuleList in GrowingMLPBlock, GrowingMixerLayer, and GrowingResidualMLP.

codecov · 2025-04-20T15:07:51Z

Codecov Report

Attention: Patch coverage is 20.00000% with 76 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/gromo/containers/growing_mlp_mixer.py	22.85%	27 Missing ⚠️
src/gromo/modules/growing_module.py	3.70%	26 Missing ⚠️
src/gromo/containers/growing_residual_mlp.py	23.52%	13 Missing ⚠️
src/gromo/containers/growing_mlp.py	28.57%	5 Missing ⚠️
src/gromo/utils/tensor_statistic.py	16.66%	5 Missing ⚠️

Flag	Coverage Δ
unittests	`85.36% <20.00%> (-2.69%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
src/gromo/containers/growing_container.py	`87.30% <100.00%> (+0.41%)`	⬆️
src/gromo/modules/linear_growing_module.py	`73.10% <ø> (ø)`
src/gromo/containers/growing_mlp.py	`88.46% <28.57%> (-5.99%)`	⬇️
src/gromo/utils/tensor_statistic.py	`88.63% <16.66%> (-11.37%)`	⬇️
src/gromo/containers/growing_residual_mlp.py	`73.33% <23.52%> (-7.50%)`	⬇️
src/gromo/modules/growing_module.py	`84.30% <3.70%> (-4.83%)`	⬇️
src/gromo/containers/growing_mlp_mixer.py	`85.11% <22.85%> (-12.17%)`	⬇️

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull Request Overview

This PR refactors device and dtype management by introducing consistent “to” methods and updating growing layer collections in several modules. Key changes include adding a “to” method to TensorStatistic and GrowingModule, converting plain Python lists to torch.nn.ModuleList in container classes, and adjusting the device selection when creating additional tensors.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

File	Description
src/gromo/utils/tensor_statistic.py	Added a new “to” method to update tensor and device attributes.
src/gromo/modules/linear_growing_module.py	Updated inner tensor creation to use the device of the input tensor.
src/gromo/modules/growing_module.py	Added a “to” method to propagate device/dtype updates across module attributes.
src/gromo/containers/growing_residual_mlp.py, growing_mlp_mixer.py, growing_mlp.py, growing_container.py	Replaced plain list constructs with torch.nn.ModuleList to improve parameter registration and maintain consistency across growing layers.

Files not reviewed (1)

docs/source/whats_new.rst: Language not supported

Comments suppressed due to low confidence (1)

src/gromo/modules/linear_growing_module.py:388

[nitpick] Ensure that switching from using self.device to self.input.device for creating the ones-tensor is intentional and consistent with the overall device management strategy.

torch.ones(*self.input.shape[:-1], 1, device=self.input.device)

Copilot · 2025-04-20T17:51:34Z

src/gromo/containers/growing_mlp.py

@@ -84,7 +84,7 @@ def __init__(
        self.set_growing_layers()

    def set_growing_layers(self) -> None:
-        self._growing_layers = list(self.layers[1:])
+        self._growing_layers = self.layers[1:]


Consider wrapping the sliced growing layers in torch.nn.ModuleList rather than using a plain list to ensure proper parameter registration.

Suggested change

self._growing_layers = self.layers[1:]

self._growing_layers = nn.ModuleList(self.layers[1:])

src/gromo/containers/growing_container.py

Copilot

Pull Request Overview

This PR implements improved device and dtype management across GrowingContainer, GrowingModule, and tensor statistics classes. Key changes include adding to() methods for moving modules and tensors to a specified device/dtype, and converting growing layers to torch.nn.ModuleList in several containers for consistent behavior.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
src/gromo/utils/tensor_statistic.py	Added a to() method to update device and dtype for tensors
src/gromo/modules/linear_growing_module.py	Updated bias creation to use self.input.device instead of self.device
src/gromo/modules/growing_module.py	Added a to() method for the module and its attributes
src/gromo/containers/growing_residual_mlp.py	Converted _growing_layers to torch.nn.ModuleList for consistency
src/gromo/containers/growing_mlp_mixer.py	Converted _growing_layers and aggregated layers to torch.nn.ModuleList
src/gromo/containers/growing_mlp.py	Removed list() wrapping for _growing_layers
src/gromo/containers/growing_container.py	Added a to() method that iterates through layers to set device/dtype

Files not reviewed (1)

docs/source/whats_new.rst: Language not supported

Comments suppressed due to low confidence (1)

src/gromo/containers/growing_mlp.py:87

[nitpick] For consistency with other growing containers that use torch.nn.ModuleList for _growing_layers, consider converting self.layers[1:] to a ModuleList, e.g. torch.nn.ModuleList(self.layers[1:]).

self._growing_layers = self.layers[1:]

Copilot · 2025-04-20T18:56:48Z

src/gromo/modules/linear_growing_module.py

-                (self.input, torch.ones(*self.input.shape[:-1], 1, device=self.device)),
+                (
+                    self.input,
+                    torch.ones(*self.input.shape[:-1], 1, device=self.input.device),


The change from using self.device to self.input.device could lead to inconsistent device handling if self.input.device is not kept in sync with the module's device. Consider using self.device for bias creation to maintain a unified device management strategy.

Suggested change

torch.ones(*self.input.shape[:-1], 1, device=self.input.device),

torch.ones(*self.input.shape[:-1], 1, device=self.device),

…uleList for improved layer management

…MLP, GrowingResidualMLP, and various GrowingContainer subclasses; remove device handling from GrowingContainer.

Copilot

Pull Request Overview

This pull request improves device and dtype management by implementing new to() methods across various modules and containers. Key changes include updating tensor statistic handling, converting growing layers to torch.nn.ModuleList, and adding device/dtype transfer methods.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
src/gromo/utils/tensor_statistic.py	Added a to() method for transferring tensor to a new device/dtype
src/gromo/modules/linear_growing_module.py	Reformatted torch.cat call with multi-line tuple syntax
src/gromo/modules/growing_module.py	Implemented a to() method to move module sub-components to a new device/dtype
src/gromo/containers/growing_residual_mlp.py	Updated growing layers to a ModuleList and added to() methods for device management
src/gromo/containers/growing_mlp_mixer.py	Updated growing layers to a ModuleList and added to() methods for device management
src/gromo/containers/growing_mlp.py	Converted growing layers to nn.ModuleList and added to() method for device/dtype handling
src/gromo/containers/growing_container.py	Converted growing layers to ModuleList and introduced a placeholder to() method

Files not reviewed (1)

docs/source/whats_new.rst: Language not supported

src/gromo/containers/growing_container.py

src/gromo/utils/tensor_statistic.py

…arameters and implementation requirements for child classes.

… for consistent behavior

Copilot

Pull Request Overview

This PR improves device and dtype management by replacing native PyTorch calls with a unified .to() method across various modules and containers within the gromo library. The changes update the GrowingContainer, GrowingModule, and TensorStatistic classes, and also refactor layer storage to use torch.nn.ModuleList.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
src/gromo/utils/tensor_statistic.py	Added a .to() method to update device and dtype of tensor statistics.
src/gromo/modules/linear_growing_module.py	Reformatted tensor concatenation in input_extended for clarity.
src/gromo/modules/growing_module.py	Introduced a comprehensive .to() method to move modules and attributes.
src/gromo/containers/growing_residual_mlp.py	Replaced list with ModuleList and added .to() methods for layers.
src/gromo/containers/growing_mlp_mixer.py	Replaced list with ModuleList and introduced multiple .to() methods.
src/gromo/containers/growing_mlp.py	Updated layer storage using nn.ModuleList and added a .to() method.
src/gromo/containers/growing_container.py	Added an abstract .to() method raising NotImplementedError.

Files not reviewed (1)

docs/source/whats_new.rst: Language not supported

Comments suppressed due to low confidence (1)

src/gromo/containers/growing_mlp_mixer.py:121

Multiple definitions of the 'to' method are present in this file, which may lead to unexpected behavior. Consider consolidating these implementations into a single method to ensure consistent device and dtype management.

def to(

…okenMixer, GrowingChannelMixer, and GrowingMixerLayer for improved device management

…GrowingResidualMLP for improved device management

Copilot

Pull Request Overview

This pull request enhances device and dtype handling by introducing custom .to methods across several module, container, and utility classes while also standardizing how growing layers are managed. Key changes include:

Adding new .to methods to GrowingModule, GrowingMLP, GrowingMLPMixer, and TensorStatistic classes for consistent device/dtype conversion.
Replacing plain lists with torch.nn.ModuleList for managing growing layers.
Updating function signatures and docstrings to reflect the new device and dtype handling.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
src/gromo/utils/tensor_statistic.py	Added a .to method that updates the tensor and device attributes.
src/gromo/modules/linear_growing_module.py	Reformatted torch.cat arguments for clarity.
src/gromo/modules/growing_module.py	Introduced a comprehensive .to method that moves module components and attributes.
src/gromo/containers/growing_residual_mlp.py	Replaced list with torch.nn.ModuleList and added relevant .to functionality.
src/gromo/containers/growing_mlp_mixer.py	Added several .to methods; some formatting changes to ensure device/dtype propagation.
src/gromo/containers/growing_mlp.py	Updated growing layer initialization and added a .to method iterating over layers.
src/gromo/containers/growing_container.py	Changed growing layer initialization and introduced an abstract .to method.

Files not reviewed (1)

docs/source/whats_new.rst: Language not supported

Comments suppressed due to low confidence (2)

src/gromo/modules/growing_module.py:581

Ensure that the new device and dtype handling in the .to method is covered by tests, particularly for the movement of non-standard attributes like optimal_delta_layer and tensor statistics.

def to(

src/gromo/containers/growing_mlp_mixer.py:349

Multiple definitions of the 'to' method are present in this file; please verify that each one belongs to a distinct class to prevent unintended overrides.

def to(

src/gromo/modules/growing_module.py

Using keyword arguments (i.e. device=device, dtype=dtype) when calling .to on PyTorch modules for consistency. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull Request Overview

This PR improves device and dtype management across the growing network components by adding uniform “to” methods and updating layer collections. Key changes include:

Adding a “to” method in multiple classes (e.g. GrowingModule, GrowingContainer, TensorStatistic) to support moving modules and tensors to a specified device/dtype.
Replacing plain Python lists with torch.nn.ModuleList for growing layers to ensure proper module registration.
Updating device handling in various container and module classes for a cleaner and more scalable implementation.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
src/gromo/utils/tensor_statistic.py	Added a “to” method to update tensor device/dtype.
src/gromo/modules/linear_growing_module.py	Reformatted the input extension and maintained device handling in the “to” method.
src/gromo/modules/growing_module.py	Introduced a “to” method to propagate device/dtype changes to submodules.
src/gromo/containers/growing_residual_mlp.py	Updated growing layers to torch.nn.ModuleList and added a “to” method.
src/gromo/containers/growing_mlp_mixer.py	Added several “to” methods across components for device and dtype propagation.
src/gromo/containers/growing_mlp.py	Converted layer collections to nn.ModuleList and added a “to” method.
src/gromo/containers/growing_container.py	Updated growing layers and defined an abstract “to” method for child classes.

Files not reviewed (1)

docs/source/whats_new.rst: Language not supported

stephane-rivaud · 2025-04-21T14:01:39Z

The Code Coverage is too low for merging but I do not know if we can actually implement testing functions for device management. Does Github have access to GPU or MPS (macos) device so I can implement proper testing ?
So far, I have tested it locally by switching between Apple GPU and CPU. Maybe I can write a test that we will exclude from the ones that are executed on GitHub.

stephane-rivaud added 2 commits April 20, 2025 16:55

enh: Implement device and dtype handling in GrowingContainer, Growing…

8ed1252

…Module, and TensorStatistic classes.

refactor: Improve code formatting and readability in GrowingContainer…

d8e6dde

…, GrowingModule, and TensorStatistic classes

stephane-rivaud changed the title ~~Fix: Device and dtype management~~ enh: Device and dtype management Apr 20, 2025

enh: Add to method for device and dtype management in GrowingContai…

8925b88

…ner, GrowingModule, and TensorStatistic classes; refactor growing layers initialization to use ModuleList in GrowingMLPBlock, GrowingMixerLayer, and GrowingResidualMLP.

stephane-rivaud requested a review from Copilot April 20, 2025 17:50

Copilot AI reviewed Apr 20, 2025

View reviewed changes

refactor: Remove debug print statement from GrowingContainer class

57ab401

stephane-rivaud requested a review from Copilot April 20, 2025 18:56

Copilot AI reviewed Apr 20, 2025

View reviewed changes

stephane-rivaud added 2 commits April 20, 2025 20:57

fix: Update growing layers initialization in GrowingMLP to use nn.Mod…

3506b83

…uleList for improved layer management

enh: Implement to method for device and dtype management in Growing…

6bb3f2f

…MLP, GrowingResidualMLP, and various GrowingContainer subclasses; remove device handling from GrowingContainer.

stephane-rivaud requested a review from Copilot April 20, 2025 23:12

Copilot AI reviewed Apr 20, 2025

View reviewed changes

src/gromo/containers/growing_container.py Show resolved Hide resolved

src/gromo/utils/tensor_statistic.py Show resolved Hide resolved

stephane-rivaud added 2 commits April 21, 2025 15:22

doc: Update docstring in to method of GrowingContainer to clarify p…

d6da438

…arameters and implementation requirements for child classes.

fix: Ensure to method in TensorStatistic returns the updated tensor…

1ac28d7

… for consistent behavior

stephane-rivaud requested a review from Copilot April 21, 2025 13:31

Copilot AI reviewed Apr 21, 2025

View reviewed changes

stephane-rivaud added 3 commits April 21, 2025 15:35

enh: Add device handling in to methods of GrowingMLPBlock, GrowingT…

5fd2b95

…okenMixer, GrowingChannelMixer, and GrowingMixerLayer for improved device management

enh: Add device handling in to methods of GrowingResidualBlock and …

d1cabd3

…GrowingResidualMLP for improved device management

enh: Updatewhats_new.rst file.

77e933c

stephane-rivaud requested a review from Copilot April 21, 2025 13:50

Copilot AI reviewed Apr 21, 2025

View reviewed changes

src/gromo/modules/growing_module.py Outdated Show resolved Hide resolved

Update src/gromo/modules/growing_module.py

53c0a64

Using keyword arguments (i.e. device=device, dtype=dtype) when calling .to on PyTorch modules for consistency. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

stephane-rivaud requested a review from Copilot April 21, 2025 13:54

Copilot AI reviewed Apr 21, 2025

View reviewed changes

Merge branch 'main' into fix-device-management

834ef0a

sylvchev approved these changes May 8, 2025

View reviewed changes

Merge branch 'main' into fix-device-management

ff5808d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

enh: Device and dtype management #93

enh: Device and dtype management #93

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

	self._growing_layers = self.layers[1:]
	self._growing_layers = nn.ModuleList(self.layers[1:])

	torch.ones(*self.input.shape[:-1], 1, device=self.input.device),
	torch.ones(*self.input.shape[:-1], 1, device=self.device),

enh: Device and dtype management #93

Are you sure you want to change the base?

enh: Device and dtype management #93

Uh oh!

Conversation

Uh oh!

Uh oh!

Codecov Report

Uh oh!

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Uh oh!