8000 Add hooks to `RunWorkflow`. by samussiah · Pull Request #1961 · Gilead-BioStats/gsm · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Add hooks to RunWorkflow. #1961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 27, 2024
Merged

Add hooks to RunWorkflow. #1961

merged 5 commits into from
Nov 27, 2024

Conversation

samussiah
Copy link
Contributor
@samussiah samussiah commented Nov 22, 2024

Overview

Adds a LoadData hook at the beginning of RunWorkflow and a SaveData hook at the end.

Test Notes/Sample Code

Simplest use case that hits clindata and writes to .csv, also captured in inst/examples/WorkflowIO.R:

load_all()

LoadData <- function(lWorkflow, lConfig) {
    purrr::imap(
        lWorkflow$spec,
        ~ {
            input <- lConfig$Domains[[ .y ]]

            if (is.data.frame(input)) {
                data <- input
            } else if (is.function(input)) {
                data <- input()
            } else if (is.character(input)) {
                data <- read.csv(input)
            } else {
                cli::cli_abort("Invalid data source: {input}.")
            }

            return(ApplySpec(data, .x))
        }
    )
}

SaveData <- function(lWorkflow, lConfig) {
    domain <- paste0(lWorkflow$meta$Type, '_', lWorkflow$meta$ID)
    cli::cli_alert_info(domain)

    if (exists(domain, lConfig$Domains)) {
        output <- lConfig$Domains[[ domain ]]
        cli::cli_alert_info(output)

        cli::cli_alert_info(
            'Saving output of `lWorkflow` to `{output}`.'
        )

        write.csv(
            lWorkflow$lResult,
            output
        )
    } else {
        cli::cli_alert_info(
            '{domain} not f
8000
ound.'
        )
    }
}

lConfig <- list(
    LoadData = LoadData,
    SaveData = SaveData,
    Domains = list(
        Raw_PD = function() { clindata::ctms_protdev },
        Raw_SUBJ = function() { clindata::rawplus_dm },
        Raw_SITE = function() { clindata::ctms_site },

        Mapped_PD = paste0(tempdir(), '/mapped-pd.csv'),
        Mapped_SUBJ = paste0(tempdir(), '/mapped-subj.csv'),
        Mapped_SITE = paste0(tempdir(), '/mapped-site.csv')
    )
)

lMappedData <- RunWorkflows(
    MakeWorkflowList(c('SUBJ', 'PD', 'SITE')),
    lConfig = lConfig
)

Connected Issues

@samussiah samussiah marked this pull request as ready for review November 25, 2024 22:20
@samussiah
Copy link
Contributor Author

Still working on unit tests @lauramaxwell but the functionality is there.

@lauramaxwell
Copy link
Contributor
lauramaxwell commented Nov 25, 2024

Still working on unit tests @lauramaxwell but the functionality is there.

Nice, @samussiah - took a quick look and everything in the example worked as i would expect. I'll dig in a bit more in the morning, but so far so good! let me know if you want me to look into the tests, etc.

@@ -8,6 +8,9 @@
#'
#' @param lWorkflow `list` A named list of metadata defining how the workflow should be run.
#' @param lData `list` A named list of domain-level data frames.
#' @param lConfig `list` A configuration object with two methods:
#' - `LoadData`: A function that loads data specified in `lWorkflow$spec`.
#' - `SaveData`: A function that saves data returned by the last step in `lWorkflow$steps`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be helpful to add the 4_WorkflowIO.R example to the examples in this roxygen header, or just make a reference to the example in this param explanation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in the latest commit.

Copy link
Contributor
@lauramaxwell lauramaxwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! added a comment about adding an example to the roxygen header, but feel free to do with that as you wish.

@samussiah
Copy link
Contributor Author

looks good! added a comment about adding an example to the roxygen header, but feel free to do with that as you wish.

Fixed the in-memory example and added a configuration-based example. Unit tests are in as well.

is.function(lConfig$SaveData) &&
all(c('lWorkflow', 'lConfig') %in% names(formals(lConfig$SaveData)))
) {
cli::cli_h3('Saving data with `lConfig$SaveData`.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zdz2101 FYI- I'm going to merge this in before the logging PR gets merged, so a few new cli:: calls will need to be changed over in that PR once this gets merged to dev

@lauramaxwell
Copy link
Contributor

looks good! added a comment about adding an example to the roxygen header, but feel free to do with that as you wish.

Fixed the in-memory example and added a configuration-based example. Unit tests are in as well.

Great! the new examples are super helpful. i've noted a few new cli calls that will need to be adjusted in #1939, but that's not a blocker. Merging this now!

@lauramaxwell lauramaxwell merged commit 833f3db into dev Nov 27, 2024
6 checks passed
@lauramaxwell lauramaxwell deleted the fix-1818 branch November 27, 2024 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature: Add hooks to RunWorkflow framework
2 participants
0