create `explain_info` context variable in `stream_async` to keep trac… by andompesta · Pull Request #1193 · NVIDIA/NeMo-Guardrails · GitHub

create explain_info context variable in stream_async to keep trac… #1193


Conversation

andompesta
Contributor

Description

@Pouyanpi this PR fixes a minor bug in tracking the LLM calls made by NeMo Guardrails when used in streaming async mode.

Related Issue(s)

issue-1192

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

NVIDIA#1181)

Fixes NVIDIA#1124

Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
@Pouyanpi Pouyanpi self-requested a review May 13, 2025 17:45
Collaborator
@Pouyanpi Pouyanpi left a comment


Thank you Sandro, looks good 👍🏻

Two suggestions:

It'd be nice to make it a method and use it everywhere the same logic is used:

    def _ensure_explain_info(self) -> ExplainInfo:
        explain_info = explain_info_var.get()
        if explain_info is None:
            explain_info = ExplainInfo()
            explain_info_var.set(explain_info)
        return explain_info

and

self.explain_info = self._ensure_explain_info()

So we should remove `_update_explain_info` in `_run_output_rails_in_streaming` and use `_ensure_explain_info` instead.

If you can make this change and add one test we are good to merge.

Also, don't forget to rebase onto the develop branch before making the changes.
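For readers unfamiliar with the pattern, here is a minimal self-contained sketch of how such a lazily initialized context variable behaves. The `ExplainInfo` dataclass and `explain_info_var` below are hypothetical stand-ins, not the project's actual definitions:

```python
import contextvars
from dataclasses import dataclass, field

# Hypothetical stand-ins for nemoguardrails' ExplainInfo / explain_info_var.
@dataclass
class ExplainInfo:
    llm_calls: list = field(default_factory=list)

explain_info_var: contextvars.ContextVar = contextvars.ContextVar(
    "explain_info", default=None
)

def ensure_explain_info() -> ExplainInfo:
    # Reuse the instance bound to the current context, or create one.
    info = explain_info_var.get()
    if info is None:
        info = ExplainInfo()
        explain_info_var.set(info)
    return info

first = ensure_explain_info()
second = ensure_explain_info()
assert first is second  # idempotent within the same context
```

The point of the ensure-pattern is exactly this idempotency: every call site gets the same per-context instance, so LLM calls recorded anywhere in the request end up in one place.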

@Pouyanpi Pouyanpi added this to the v0.14.0 milestone May 13, 2025
@Pouyanpi Pouyanpi added the bug Something isn't working label May 13, 2025
* feat: add RailException support and improve error handling

- Add TypedDict for structured return values
- Implement RailException for injection detection (a must-have for checks)
- Improve error handling for malformed YARA rules

* Improve test coverage
@Pouyanpi
Collaborator

@mikemckiernan this should resolve the issue you were facing prior to opening #1146.

@andompesta
Contributor Author
andompesta commented May 13, 2025

@Pouyanpi changes made:

  • `_ensure_explain_info` is a `@staticmethod`, so it has no dependency on a specific instance of the class
  • tests added covering the count of LLM calls for async and streaming async execution
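As an illustration of what such a count test verifies, the following sketch (hypothetical names, pure stdlib, not the project's actual test) shows that calls recorded during streaming land in the same context-bound object the caller holds:

```python
import asyncio
import contextvars

# Hypothetical stand-ins for the project's context variable and helper.
explain_info_var = contextvars.ContextVar("explain_info", default=None)

def ensure_explain_info():
    info = explain_info_var.get()
    if info is None:
        info = {"llm_calls": []}
        explain_info_var.set(info)
    return info

async def fake_stream_async(chunks):
    # Bind the context variable up front so every recorded call is tracked.
    info = ensure_explain_info()
    for chunk in chunks:
        info["llm_calls"].append(chunk)  # stand-in for recording an LLM call
        yield chunk

async def main():
    info = ensure_explain_info()
    out = [c async for c in fake_stream_async(["a", "b", "c"])]
    return out, len(info["llm_calls"])

out, n_calls = asyncio.run(main())
assert n_calls == len(out) == 3
```

Because the async generator runs in the caller's task context, both `ensure_explain_info()` calls return the same object, which is what the bug fix restores for `stream_async`.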

@Pouyanpi
Collaborator

Thank you @andompesta , this is GREAT 🥇

@Pouyanpi Pouyanpi self-requested a review May 14, 2025 05:17
Pouyanpi and others added 10 commits May 14, 2025 07:50
…ler.py

- modified the `StreamingConsumer` class to store its asyncio task
- added a `cancel()` method and handled `asyncio.CancelledError`
- updated the test functions (`test_single_chunk`,
  `test_sequence_of_chunks`) and the helper function
  (`_test_pattern_case`) to call `consumer.cancel()` in a
  `finally` block.

These changes prevent `RuntimeError: Event loop is closed` and
`Task was destroyed but it is pending!` warnings by ensuring
background tasks are correctly cancelled and awaited upon test
completion.
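The cleanup this commit describes can be sketched as follows. This is an illustrative stand-in, not the project's actual `StreamingConsumer`; the class and method names mirror the commit message only:

```python
import asyncio

class StreamingConsumer:
    """Illustrative sketch: store the background task so it can be
    cancelled and awaited cleanly when a test finishes."""

    def __init__(self):
        self.queue = asyncio.Queue()
        self.task = None

    def start(self):
        self.task = asyncio.ensure_future(self._consume())

    async def _consume(self):
        try:
            while True:
                await self.queue.get()
        except asyncio.CancelledError:
            pass  # swallow cancellation so shutdown is silent

    async def cancel(self):
        if self.task is not None:
            self.task.cancel()
            try:
                await self.task
            except asyncio.CancelledError:
                pass

async def demo():
    consumer = StreamingConsumer()
    consumer.start()
    try:
        await consumer.queue.put("chunk")
        await asyncio.sleep(0)  # let the consumer run once
    finally:
        # Cancelling and awaiting here is what prevents
        # "Task was destroyed but it is pending!" warnings.
        await consumer.cancel()
    return consumer.task.done()

assert asyncio.run(demo())
```

The key detail is awaiting the task after cancelling it: cancellation alone leaves the task pending, which is exactly what triggers the warnings the commit fixes.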
coverage

- Added tests for various functionalities of StreamingHandler, including:
  - Piping streams between handlers
  - Buffering enable/disable behavior
  - Handling multiple stop tokens
  - Metadata inclusion and processing
  - Suffix and prefix pattern handling
  - Edge cases for __anext__ method
  - First token handling and generation info
- Improved test coverage for async methods and error scenarios.
- Addressed potential issues with streaming termination signals.
…ling

- Replaced inconsistent use of `None` and `""` for stream termination
  in `StreamingHandler` with a dedicated `END_OF_STREAM` sentinel object.
- Modified `push_chunk` to convert `None` to `END_OF_STREAM`.
- Updated `__anext__` to raise `StopAsyncIteration` only for `END_OF_STREAM`
  and to return empty strings or dicts with empty/None text as data.
- Adjusted `_process` to correctly handle `END_OF_STREAM` for buffering
  and queueing logic.
- Updated `on_llm_end` to use `END_OF_STREAM`.
- Revised tests in `test_streaming_handler.py` to reflect these changes,
  including how empty first tokens are handled and how `__anext__` behaves
  with various inputs.
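The sentinel approach can be illustrated with a minimal queue consumer (assumed names, not the project's actual streaming code). A unique object removes the ambiguity between "the stream is over" and a legitimately empty or `None` chunk:

```python
import asyncio

# A unique sentinel object; identity comparison can never collide
# with real data such as "" or None.
END_OF_STREAM = object()

async def consume(queue):
    items = []
    while True:
        element = await queue.get()
        if element is END_OF_STREAM:
            break  # only the sentinel terminates; "" and None are data
        items.append(element)
    return items

async def demo():
    q = asyncio.Queue()
    for chunk in ["hello", "", None, "world", END_OF_STREAM]:
        q.put_nowait(chunk)
    return await consume(q)

result = asyncio.run(demo())
assert result == ["hello", "", None, "world"]
```

With `None` or `""` as the terminator, the empty and `None` chunks above would have ended the stream early; the sentinel lets them pass through as data, which is the behavior change the commit describes.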
@andompesta andompesta force-pushed the fix/missing-llm-calls-in-explain_info branch from d7f8a4d to fc1b922 on May 14, 2025 07:51
@Pouyanpi Pouyanpi requested a review from Copilot May 14, 2025 07:53
Copy link
@Copilot Copilot AI left a comment


Pull Request Overview

This PR addresses two areas: fixing a minor tracking bug for LLM call explanations via the new explain_info context variable, and standardizing streaming end-of-stream signaling using a sentinel object. In addition, it updates injection detection functions to return structured results and improves documentation on configuration and streaming.

Reviewed Changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 2 comments.

Summary per file:

  • tests/test_llm_rails_context_variables.py: Added new tests to validate streaming async mode and explain_info tracking.
  • tests/test_injection_detection.py: Added and updated tests for injection detection guardrails and error handling.
  • nemoguardrails/streaming.py: Updated streaming logic to use the END_OF_STREAM sentinel consistently instead of None/empty strings.
  • nemoguardrails/rails/llm/llmrails.py: Refactored explain_info handling by introducing _ensure_explain_info for consistency.
  • nemoguardrails/library/injection_detection/actions.py: Updated injection detection functions to return structured tuples and adjusted their docstrings.
  • docs/user-guides/configuration-guide.md and docs/user-guides/advanced/streaming.md: Revised documentation to reflect updated configuration and streaming usage details.
Files not reviewed (2)
  • nemoguardrails/library/injection_detection/flows.co: Language not supported
  • nemoguardrails/library/injection_detection/flows.v1.co: Language not supported
Comments suppressed due to low confidence (2)

nemoguardrails/library/injection_detection/actions.py:212

  • The _omit_injection function now returns a tuple (bool, str) instead of just a string. Ensure that all downstream consumers are updated to handle this new tuple structure.
def _omit_injection(text: str, matches: list["yara.Match"]) -> Tuple[bool, str]:

nemoguardrails/library/injection_detection/actions.py:292

  • The _reject_injection function now returns a tuple (bool, List[str]) for detected rules instead of a string. Verify that all call sites correctly handle the updated return type.
def _reject_injection(text: str, rules: "yara.Rules") -> Tuple[bool, List[str]]:

Comment on lines +147 to +151:

    if element is END_OF_STREAM:
        break

    if isinstance(element, dict):
        if element is not None and (
            element.get("text") is None or element.get("text") == ""
        ):

Suggested change:

    if element is not None and (element.get("text") is END_OF_STREAM):
Copilot AI May 14, 2025


[nitpick] Repeated checks against the END_OF_STREAM sentinel appear throughout the streaming logic. Consider extracting this check into a helper function to improve clarity and reduce duplication.
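A helper along those lines might look like this; the name `is_end_of_stream` is hypothetical, and the dict branch mirrors the suggested change above:

```python
# Unique sentinel marking stream termination (stand-in definition).
END_OF_STREAM = object()

def is_end_of_stream(element) -> bool:
    """Centralize the sentinel check: matches the bare sentinel or a
    chunk dict whose "text" field carries it."""
    if element is END_OF_STREAM:
        return True
    if isinstance(element, dict):
        return element.get("text") is END_OF_STREAM
    return False

assert is_end_of_stream(END_OF_STREAM)
assert is_end_of_stream({"text": END_OF_STREAM})
assert not is_end_of_stream("")
assert not is_end_of_stream({"text": ""})
```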


Comment on the explain_info assignment:

    # We also keep a general reference to this object
    self.explain_info = explain_info

Suggested change:

    self.explain_info = self._ensure_explain_info()
Copilot AI May 14, 2025


[nitpick] The explain_info context handling has been refactored to use _ensure_explain_info. Consider updating the corresponding docstring to clarify its purpose and behavior.


Labels
bug Something isn't working