8000 Probable Incorrect Error Handling in collective_create() · Issue #761 · LLNL/UnifyFS · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Probable Incorrect Error Handling in collective_create() #761
Open
@rgmiller

Description

@rgmiller

System information

Type Version/Name
Operating System All of them, I suspect
OS Version
Architecture
UnifyFS Version

Describe the problem you're observing

In collective_create() (around about line 303) we attempt to acquire a child request handle. If we can't get a handle for some reason, we do NOT clean up and exit (as we do with most other error conditions). Instead, we log an error and set the pointer to NULL, but otherwise continue processing. If no other errors occur, then the function will return a pointer to a coll_request struct, but that struct will contain an unexpected NULL pointer.

Describe how to reproduce the problem

The bug is in the error handling for this function, so you'd have to trigger the specific error. I'm not sure how you'd do that (and I rather suspect it's never happened on a production system).

Include any warning or errors or releveant debugging data

This was discovered during a code inspection as part of the work on #758. For reference, Michael Brim's comment about it is reproduced below:

Regarding the TODO, the reason we don't return immediately in that error case is because that would leak the allocated coll_request structure and its member array allocations. However, it looks like we're missing the logic to notice that we failed to get one of the child handles and need to cleanup the structure before returning NULL. Unfortunately, it appears it's not as easy as just calling collective_cleanup(), since that would prematurely free the original group rpc request handle, which we use in the callers to collective_create() to send back an error. This is complicated enough that we probably should open an issue and address in a separate PR.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    To Consider

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0