Releases: LLNL/merlin

Version 1.13.0b2

01 Jul 14:51
b79c567
Pre-release

[1.13.0b2]

Added

  • Ability to turn off the auto-restart functionality of the monitor with --no-restart
  • Tests for the monitor files

Changed

  • Refactored the main.py module so that it's broken into smaller, more-manageable pieces

@bgunnar5 @lucpeterson

Version 1.13.0b1

13 Jun 00:47
5d4d014
Pre-release

[1.13.0b1]

Added

  • API documentation for Merlin's core codebase
  • New merlin database command to interact with new database functionality
    • When running locally, SQLite will be used as the database. Otherwise your current results backend will be used
    • merlin database info: prints some basic information about the database
    • merlin database get: allows you to retrieve and print entries in the database
    • merlin database delete: allows you to delete entries in the database
  • Added db_scripts/ folder containing several new files all pertaining to database interaction
    • data_models: a module that houses dataclasses that define the format of the data that's stored in Merlin's database.
    • db_commands: an interface for user commands of merlin database to be processed
    • merlin_db: houses the MerlinDatabase class, used as the main point of contact for interactions with the database
    • entities/: A folder containing modules that define a structured interface for interacting with persisted data.
    • entity_managers/: A folder containing classes responsible for managing high-level database operations across all entities.
  • Added backends/ folder containing a new OOP way to interact with results backend databases
    • results_backend: houses an abstract class ResultsBackend that defines what every supported backend must implement in Merlin
    • redis/: A folder containing the RedisBackend class that defines specific interactions with the Redis database
    • sqlite/: A folder containing the SQLiteBackend class that defines specific interactions with the SQLite database
    • backend_factory: houses a factory class MerlinBackendFactory that initializes an appropriate ResultsBackend instance
  • Added monitors/ folder containing a refactored, OOP approach to handling the merlin monitor command
    • celery_monitor: houses the CeleryMonitor class, a concrete subclass of TaskServerMonitor for monitoring Celery task servers
    • monitor_factory: houses a factory class MonitorFactory that initializes an appropriate TaskServerMonitor instance
    • monitor: houses the Monitor class, used as the top-level point of interaction for the monitor command
    • task_server_monitor: houses the TaskServerMonitor ABC class, which serves as a common interface for monitoring task servers
  • A new celery task called mark_run_as_complete that is automatically added to the task queue associated with the final step in a workflow
  • Added support for Python 3.12 and 3.13
  • Added additional tests for the merlin run and merlin purge commands
  • Aliased types to represent different types of pytest fixtures
  • New test condition StepFinishedFilesCount to help search for MERLIN_FINISHED files in output workspaces
  • Added "Unit-tests" GitHub action to run the unit test suite
  • Added CeleryTaskManager context manager to the test suite to ensure tasks are safely purged from queues if tests fail
  • Added command-tests, workflow-tests, and integration-tests to the Makefile
  • Added tests and docs for the new merlin config options
  • Python 3.8 now requires orderly-set==5.3.0 to avoid a bug with the deepdiff library
  • New step 'Reinstall pip to avoid vendored package corruption' to CI workflow jobs that use pip
  • New GitHub actions to reduce common code in CI
  • COPYRIGHT file for ownership details
  • New check for copyright headers in the Makefile
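
The backends/ layout described above (an abstract ResultsBackend, concrete Redis and SQLite classes, and a factory that selects one) follows a common abstract-class-plus-factory pattern. The following is a minimal sketch of that pattern only. All class and method names here (ResultsBackendSketch, InMemoryBackendSketch, BackendFactorySketch, save, retrieve, get_backend) are hypothetical, and an in-memory dictionary stands in for a real database; Merlin's actual classes and signatures may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class ResultsBackendSketch(ABC):
    """Hypothetical stand-in for an abstract results backend."""

    @abstractmethod
    def save(self, key: str, value: str) -> None:
        """Persist a value under a key."""

    @abstractmethod
    def retrieve(self, key: str) -> str:
        """Return the value stored under a key."""


class InMemoryBackendSketch(ResultsBackendSketch):
    """Toy concrete backend; a real one would talk to Redis or SQLite."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._store[key] = value

    def retrieve(self, key: str) -> str:
        return self._store[key]


class BackendFactorySketch:
    """Maps a backend name to a concrete class (names are illustrative)."""

    def __init__(self) -> None:
        self._registry: Dict[str, Type[ResultsBackendSketch]] = {
            "memory": InMemoryBackendSketch,
        }

    def get_backend(self, name: str) -> ResultsBackendSketch:
        try:
            return self._registry[name]()
        except KeyError:
            raise ValueError(f"Unsupported backend: {name}") from None


# Callers depend only on the abstract interface, not on the concrete class.
backend = BackendFactorySketch().get_backend("memory")
backend.save("run-1", "complete")
```

Keeping the name-to-class mapping in one place means a new backend can be supported by registering one more class, without touching the calling code.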

Changed

  • Updated the merlin monitor command
    • it will now attempt to restart workflows automatically if a workflow is hanging
    • it now uses an object-oriented approach in the backend
  • Celery's default settings have been updated to add:
    • interval_max: 300 -> tasks will retry for up to 5 minutes instead of the previous 1 minute
    • new broker_transport_options:
      • socket_timeout: 300 -> increases the socket timeout to 5 minutes instead of the default 2 minutes
      • retry_policy: {timeout: 600} -> sets the maximum amount of time that Celery will keep trying to connect to the broker to 10 minutes
    • broker_connection_timeout: 60 -> establishing a connection to the broker now times out after a full minute instead of the previous 4 seconds
    • new generic backend settings:
      • result_backend_always_retry: True -> backend will now auto-retry in the event of recoverable exceptions
      • result_backend_max_retries: 20 -> maximum number of retries in the event of recoverable exceptions
    • new Redis specific settings:
      • redis_retry_on_timeout: True -> retries read/write operations on TimeoutError to the Redis server
      • redis_socket_connect_timeout: 300 -> 5 minute socket timeout for connections to Redis
      • redis_socket_timeout: 300 -> 5 minute socket timeout for read/write operations to Redis
      • redis_socket_keepalive: True -> socket TCP keepalive to keep connections healthy to the Redis server
  • The merlin config command:
    • Now defaults to the LaunchIT setup
    • The configuration file is no longer required to be named app.yaml
    • New subcommands:
      • create: Creates a new configuration file
      • update-broker: Updates the broker section of the configuration file
      • update-backend: Updates the results_backend section of the configuration file
      • use: Point your active configuration to a new configuration file
  • The merlin server command no longer modifies the ~/.merlin/app.yaml file by default. Instead, it modifies the ./merlin_server/app.yaml file.
  • Dropped support for Python 3.7
  • Ported all distributed tests of the integration test suite to pytest
    • There is now a commands/ directory and a workflows/ directory under the integration suite to house these tests
    • Removed the "Distributed-tests" GitHub action as these tests will now be run under "Integration-tests"
  • Removed e2e-distributed* definitions from the Makefile
  • Modified GitHub CI to use shared testing servers hosted by LaunchIT rather than the jackalope server
  • Updated CI to use the new GitHub actions
  • Copyright headers in all files
    • These now point to the LICENSE and COPYRIGHT files
    • LICENSE: Legal permissions (e.g., MIT terms)
    • COPYRIGHT: Ownership, institutional metadata
    • Make commands that change version/copyright year have been modified
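
The Celery defaults listed under Changed can be pictured as a plain settings dictionary. This is only a sketch: the keys and values are copied from the list above, but the dictionary name celery_overrides is hypothetical, and placing interval_max at the top level simply mirrors the changelog's indentation rather than Merlin's actual configuration layout.

```python
# Values are copied from the changelog entry; the structure is an assumption.
celery_overrides = {
    "interval_max": 300,
    "broker_transport_options": {
        "socket_timeout": 300,
        "retry_policy": {"timeout": 600},
    },
    "broker_connection_timeout": 60,
    # Generic results backend settings
    "result_backend_always_retry": True,
    "result_backend_max_retries": 20,
    # Redis-specific settings
    "redis_retry_on_timeout": True,
    "redis_socket_connect_timeout": 300,
    "redis_socket_timeout": 300,
    "redis_socket_keepalive": True,
}

# Print the Redis timeouts in minutes to make the units explicit.
for key in ("redis_socket_connect_timeout", "redis_socket_timeout"):
    print(f"{key}: {celery_overrides[key] / 60:.0f} min")
```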

Fixed

  • Running Merlin locally no longer requires an app.yaml configuration file
  • Removed dead lgtm link
  • Potential security vulnerabilities related to logging

Deprecated

  • The --steps argument of the merlin monitor command is now deprecated and will be removed in Version 1.14.0.

@lucpeterson @ryannova @doutriaux1 @woutdenolf

Version 1.12.2

28 Oct 21:33
edbfe9b

[1.12.2]

Added

  • Conflict handler option to the dict_deep_merge function in utils.py
  • Ability to add module-specific pytest fixtures
  • Added fixtures specifically for testing status functionality
  • Added tests for reading and writing status files, and status conflict handling
  • Added tests for the dict_deep_merge function
  • Pytest-mock as a dependency for the test suite (necessary for using mocks and fixtures in the same test)
  • New github action test to make sure target branch has been merged into the source first, so we know histories are ok
  • Check in the status commands to make sure we're not pulling statuses from nested workspaces
  • Added setuptools as a requirement for python 3.12 to recognize the pkg_resources library
  • Patch to celery results backend to stop ChordErrors being raised and breaking workflows when a single task fails
  • New step return code $(MERLIN_RAISE_ERROR) to force an error to be raised by a task (mainly for testing)
    • Added description of this to docs
  • New test to ensure a single failed task won't break a workflow
  • Several new unit tests for the following subdirectories:
    • merlin/common/
    • merlin/config/
    • merlin/examples/
    • merlin/server/
  • Context managers for the conftest.py file to ensure safe spin up and shutdown of fixtures
    • RedisServerManager: context to help with starting/stopping a redis server for tests
    • CeleryWorkersManager: context to help with starting/stopping workers for tests
  • Ability to copy and print the Config object from merlin/config/__init__.py
  • Equality method to the ContainerFormatConfig and ContainerConfig objects from merlin/server/server_util.py
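
The conflict handler option for dict_deep_merge mentioned at the top of this list can be illustrated with a small recursive merge. This is a sketch under stated assumptions: the function name deep_merge_sketch, its signature, the handler's argument order, and the "keep the existing value when no handler is given" default are all hypothetical and may not match Merlin's utils.py.

```python
from typing import Any, Callable, Optional


def deep_merge_sketch(
    dict_a: dict,
    dict_b: dict,
    conflict_handler: Optional[Callable[[Any, Any, str], Any]] = None,
) -> dict:
    """Recursively merge dict_b into dict_a in place and return dict_a.

    On a leaf-level conflict, conflict_handler(a_value, b_value, key)
    decides the result; without a handler the value from dict_a is kept.
    """
    for key, b_value in dict_b.items():
        if key not in dict_a:
            # No conflict: simply adopt the new entry.
            dict_a[key] = b_value
        elif isinstance(dict_a[key], dict) and isinstance(b_value, dict):
            # Both sides are dicts: merge one level deeper.
            deep_merge_sketch(dict_a[key], b_value, conflict_handler)
        elif dict_a[key] != b_value and conflict_handler is not None:
            # Differing leaf values: let the caller decide.
            dict_a[key] = conflict_handler(dict_a[key], b_value, key)
    return dict_a


# Example handler: keep the larger value (an illustrative policy only).
merged = deep_merge_sketch(
    {"step": {"restarts": 1, "status": "RUNNING"}},
    {"step": {"restarts": 3, "elapsed": "0:01"}},
    conflict_handler=lambda a, b, key: max(a, b),
)
```

Passing the policy in as a callable keeps the merge logic generic while letting each caller choose how conflicting values are reconciled.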

Changed

  • merlin info is cleaner and gives python package info
  • merlin version now prints with every banner message
  • Applying filters for merlin detailed-status will now log debug statements instead of warnings
  • Modified the unit tests for the merlin status command to use pytest rather than unittest
  • Added fixtures for merlin status tests that copy the workspace to a temporary directory so you can see exactly what's run in a test
  • Batch block and workers now allow for variables to be used in node settings
  • Task id is now the path to the directory
  • Split the start_server and config_server functions of merlin/server/server_commands.py into multiple functions to make testing easier
  • Split the create_server_config function of merlin/server/server_config.py into two functions to make testing easier
  • Combined set_snapshot_seconds and set_snapshot_changes methods of RedisConfig into one method set_snapshot

Fixed

  • Bugfix for output of merlin example openfoam_wf_singularity
  • A bug with the CHANGELOG detection test when the target branch isn't in the ci runner history
  • Link to Merlin banner in readme
  • Issue with escape sequences in ascii art (caught by python 3.12)
  • Bug where Flux wasn't identifying total number of nodes on an allocation
    • Not supporting Flux versions below 0.17.0
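
The escape-sequence fix above relates to a Python 3.12 change: an unrecognized escape such as \_ in an ordinary string literal now produces a SyntaxWarning (earlier versions issued a DeprecationWarning). A raw string is one common way to keep backslash-heavy ASCII art intact. The sketch below uses made-up art rather than Merlin's banner and does not claim to be the fix Merlin applied.

```python
# A raw string keeps every backslash literal, so no escape is interpreted.
art = r"""
  /\_/\
 ( o.o )
  \___/
"""

print(art)
```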

@bgunnar5 @lucpeterson

Version 1.12.2b1

12 Jun 20:46
9e27798
Pre-release

[1.12.2b1]

Added

  • Conflict handler option to the dict_deep_merge function in utils.py
  • Ability to add module-specific pytest fixtures
  • Added fixtures specifically for testing status functionality
  • Added tests for reading and writing status files, and status conflict handling
  • Added tests for the dict_deep_merge function
  • Pytest-mock as a dependency for the test suite (necessary for using mocks and fixtures in the same test)
  • New github action test to make sure target branch has been merged into the source first, so we know histories are ok
  • Check in the status commands to make sure we're not pulling statuses from nested workspaces
  • Added setuptools as a requirement for python 3.12 to recognize the pkg_resources library
  • Patch to celery results backend to stop ChordErrors being raised and breaking workflows when a single task fails
  • New step return code $(MERLIN_RAISE_ERROR) to force an error to be raised by a task (mainly for testing)
    • Added description of this to docs
  • New test to ensure a single failed task won't break a workflow

Changed

  • merlin info is cleaner and gives python package info
  • merlin version now prints with every banner message
  • Applying filters for merlin detailed-status will now log debug statements instead of warnings
  • Modified the unit tests for the merlin status command to use pytest rather than unittest
  • Added fixtures for merlin status tests that copy the workspace to a temporary directory so you can see exactly what's run in a test
  • Batch block and workers now allow for variables to be used in node settings
  • Task id is now the path to the directory

Fixed

  • Bugfix for output of merlin example openfoam_wf_singularity
  • A bug with the CHANGELOG detection test when the target branch isn't in the ci runner history
  • Link to Merlin banner in readme
  • Issue with escape sequences in ascii art (caught by python 3.12)
  • Bug where Flux wasn't identifying total number of nodes on an allocation
    • Not supporting Flux versions below 0.17.0

@lucpeterson @bgunnar5

Version 1.12.1

02 May 21:50
b4321d0

[1.12.1]

Added

  • New Priority.RETRY value for the Celery task priorities. This will be the new highest priority.
  • Support for the status command to handle multiple workers on the same step
  • Documentation on how to run cross-node workflows with a containerized server (merlin server)

Changed

  • Modified some tests in test_status.py and test_detailed_status.py to accommodate bugfixes for the status commands

Fixed

  • Bugfixes for the status commands:
    • Fixed "DRY RUN" naming convention so that it outputs in the progress bar properly
    • Fixed issue where a step that was run with one sample would delete the status file upon condensing
    • Fixed issue where multiple workers processing the same step would break the status file and cause the workflow to crash
    • Added a catch for the JSONDecodeError that would potentially crash a run
    • Added a FileLock to the status write in _update_status_file() of MerlinStepRecord to avoid potential race conditions (potentially related to JSONDecodeError above)
    • Added in export MANPAGER="less -r" call behind the scenes for detailed-status to fix ASCII error
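
The JSONDecodeError catch mentioned above guards against reading a status file that is only partially written. A minimal sketch of the read side, assuming a hypothetical helper name read_status_sketch and an "return an empty dict on failure" fallback that may differ from what Merlin actually does; the FileLock around the write is not shown here.

```python
import json
import os
import tempfile


def read_status_sketch(path: str) -> dict:
    """Return the parsed status file, or an empty dict if it is malformed."""
    try:
        with open(path, "r") as status_file:
            return json.load(status_file)
    except json.JSONDecodeError:
        # A partially written file (e.g. from a concurrent writer) lands here.
        return {}


with tempfile.TemporaryDirectory() as tmp_dir:
    status_path = os.path.join(tmp_dir, "status.json")
    with open(status_path, "w") as status_file:
        status_file.write('{"step": {"status": ')  # truncated on purpose
    result = read_status_sketch(status_path)
```

Catching the error keeps a single malformed read from crashing the run, while a lock on the write side reduces the chance of producing such a file in the first place.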

@bgunnar5 @lucpeterson @koning @ryannova

Version 1.12.0

15 Feb 22:46
60e46d7

[1.12.0]

Added

  • A new command merlin queue-info that will print the status of your celery queues
    • By default this will only pull information from active queues
    • There are options to look for specific queues (--specific-queues), queues defined in certain spec files (--spec; this is the same functionality as the merlin status command prior to this update), and queues attached to certain steps (--steps)
    • Queue info can be dumped to outfiles with --dump
  • A new command merlin detailed-status that displays task-by-task status information about your study
    • This has options to filter by return code, task queues, task statuses, and workers
    • You can set a limit on the number of tasks to display
    • There are 3 options to modify the output display
  • Docs for all of the monitoring commands
  • New file merlin/study/status.py dedicated to work relating to the status command
    • Contains the Status and DetailedStatus classes
  • New file merlin/study/status_renderers.py dedicated to formatting the output for the detailed-status command
  • New file merlin/common/dumper.py containing a Dumper object to help dump output to outfiles
  • Study name and parameter info now stored in the DAG and MerlinStep objects
  • Added functions to merlin/display.py that help display status information:
    • display_task_by_task_status handles the display for the merlin detailed-status command
    • display_status_summary handles the display for the merlin status command
    • display_progress_bar generates and displays a progress bar
  • Added new methods to the MerlinSpec class:
    • get_worker_step_map()
    • get_queue_step_relationship()
    • get_tasks_per_step()
    • get_step_param_map()
  • Added methods to the MerlinStepRecord class to mark status changes for tasks as they run (follows Maestro's StepRecord format mostly)
  • Added methods to the Step class:
    • establish_params()
    • name_no_params()
  • Added a property parameter_labels to the MerlinStudy class
  • Added two new utility functions:
    • dict_deep_merge() that deep merges two dicts into one
    • ws_time_to_dt() that converts a workspace timestring (YYYYMMDD-HHMMSS) to a datetime object
  • A new celery task condense_status_files to be called when sets of samples finish
  • Added a celery config setting worker_cancel_long_running_tasks_on_connection_loss since this functionality is about to change in the next version of celery
  • Tests for the Status and DetailedStatus classes
    • this required adding a decent amount of test files to help with the tests; these can be found under the tests/unit/study/status_test_files directory
  • Pytest fixtures in the conftest.py file of the integration test suite
    • NOTE: an export command export LC_ALL='C' had to be added to fix a bug in the WEAVE CI. This can be removed when we resolve this issue for the merlin server command
  • Tests for the celeryadapter.py module
  • New CeleryTestWorkersManager context to help with starting/stopping workers for tests
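
The ws_time_to_dt() utility listed above is described as converting a workspace timestring (YYYYMMDD-HHMMSS) to a datetime object. A minimal sketch of such a conversion with the standard library follows; the name ws_time_to_dt_sketch is hypothetical, and Merlin's implementation may parse the string differently.

```python
from datetime import datetime


def ws_time_to_dt_sketch(ws_time: str) -> datetime:
    """Parse a 'YYYYMMDD-HHMMSS' timestring into a datetime object."""
    return datetime.strptime(ws_time, "%Y%m%d-%H%M%S")


parsed = ws_time_to_dt_sketch("20240215-224600")
print(parsed.isoformat())  # 2024-02-15T22:46:00
```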

Changed

  • Reformatted the entire merlin status command
    • Now accepts both spec files and workspace directories as arguments
    • Removed the --steps flag
    • Replaced the --csv flag with the --dump flag
    • New functionality:
      • Shows step-by-step progress bar for tasks
      • Displays a summary of task statuses below the progress bar
  • Split the add_chains_to_chord function in merlin/common/tasks.py into two functions:
    • get_1d_chain which converts a 2D list of chains into a 1D list
    • launch_chain which launches the 1D chain
  • Pulled the needs_merlin_expansion() method out of the Step class and made it a function instead
  • Removed tabulate_info function; replaced with tabulate from the tabulate library
  • Moved verify_filepath and verify_dirpath from merlin/main.py to merlin/utils.py
  • The entire documentation has been ported to MkDocs and re-organized
    • Dark Mode
    • New "Getting Started" example for a simple setup tutorial
    • More detail on configuration instructions
    • There's now a full page on installation instructions
    • More detail on explaining the spec file
    • More detail with the CLI page
    • New "Running Studies" page to explain different ways to run studies, restart them, and accomplish command line substitution
    • New "Interpreting Output" page to help users understand how the output workspace is generated in more detail
    • New "Examples" page has been added
    • Updated "FAQ" page to include more links to helpful locations throughout the documentation
    • Set up a place to store API docs
    • New "Contact" page with info on reaching Merlin devs
  • The Merlin tutorial defaults to using Singularity rather than Docker for the OpenFoam example. Minor tutorial fixes have also been applied.

Fixed

  • The merlin status command so that it's consistent in its output whether using redis or rabbitmq as the broker
  • The merlin monitor command will now keep an allocation up if the queues are empty and workers are still processing tasks
  • Add the restart keyword to the specification docs
  • Cyclical imports and config imports that could easily cause ci issues

@bgunnar5 @koning @lucpeterson @xorJane

Version 1.11.1

23 Oct 18:42
e731420

[1.11.1]

Fixed

  • Typo in batch.py that caused lsf launches to fail (ALL_SGPUS changed to ALL_GPUS)

@bgunnar5

Version 1.11.0

09 Oct 20:58
093c867

[1.11.0]

Added

  • New reserved variable:
    • VLAUNCHER: The same functionality as the LAUNCHER variable, but will substitute shell variables MERLIN_NODES, MERLIN_PROCS, MERLIN_CORES, and MERLIN_GPUS for nodes, procs, cores per task, and gpus

Changed

  • Hardcoded Sphinx v5.3.0 requirement is now removed so we can use latest Sphinx

Fixed

  • A bug where the filenames in iterative workflows kept appending .out, .partial, or .expanded to the filenames stored in the merlin_info/ subdirectory
  • A bug where a skewed sample hierarchy was created when a restart was necessary in the add_merlin_expanded_chain_to_chord task

@koning @bgunnar5

Version 1.10.3

18 Aug 23:55
faf71ed

[1.10.3]

Added

  • The *.conf regex for the recursive-include of the merlin server directory so that pip will add it to the wheel
  • A note to the docs for how to fix an issue where the merlin server start command hangs

Changed

  • Bump certifi from 2022.12.7 to 2023.7.22 in /docs
  • Bump pygments from 2.13.0 to 2.15.0 in /docs
  • Bump requests from 2.28.1 to 2.31.0 in /docs

Version 1.10.2

07 Aug 16:55
261e035

[1.10.2]

Fixed

  • A bug where the .orig, .partial, and .expanded file names were using the study name rather than the original file name
  • A bug where the openfoam_wf_singularity example was not being found
  • Some build warnings in the docs (unknown targets, duplicate targets, title underlines too short, etc.)
  • A bug where when the output path contained a variable that was overridden, the overridden variable was not changed in the output_path
  • A bug where permission denied errors happened when checking for system scheduler

Added

  • Tests for ensuring $(MERLIN_SPEC_ORIGINAL_TEMPLATE), $(MERLIN_SPEC_ARCHIVED_COPY), and $(MERLIN_SPEC_EXECUTED_RUN) are stored correctly
  • A pdf download format for the docs
  • Tests for cli substitutions

Changed

  • The ProvenanceYAMLFileHasRegex condition for integration tests now saves the study name and spec file name as attributes instead of just the study name
    • This led to minor changes in 3 tests ("local override feature demo", "local pgen feature demo", and "remote feature demo") with what we pass to this specific condition
  • Updated scikit-learn requirement for the openfoam_wf_singularity example
  • Uncommented Latex support in the docs configuration to get pdf builds working