create `explain_info` context variable in `stream_async` to keep trac… #1193

Conversation
NVIDIA#1181) Fixes NVIDIA#1124 Signed-off-by: Mike McKiernan <mmckiernan@nvidia.com>
Thank you Sandro, looks good 👍🏻
Two suggestions:
It'd be nice to make it a method and use it everywhere that same logic is used:
```python
def _ensure_explain_info(self) -> ExplainInfo:
    explain_info = explain_info_var.get()
    if explain_info is None:
        explain_info = ExplainInfo()
        explain_info_var.set(explain_info)
    return explain_info
```
and

```python
self.explain_info = self._ensure_explain_info()
```

so we should remove `_update_explain_info` in `_run_output_rails_in_streaming` and use `_ensure_explain_info` instead.
If you can make this change and add one test, we are good to merge.
Also, don't forget to rebase onto the develop branch before making the changes.
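For context, the get-or-create pattern above relies on a module-level context variable. A minimal sketch of that declaration, assuming only what the snippet above names (`explain_info_var`, `ExplainInfo`); the real declaration in the codebase may differ:

```python
import contextvars
from typing import Optional

# Each async task sees its own value of this variable, so concurrent
# stream_async() calls do not overwrite each other's ExplainInfo.
explain_info_var: contextvars.ContextVar[Optional["ExplainInfo"]] = (
    contextvars.ContextVar("explain_info", default=None)
)
```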
* feat: add RailException support and improve error handling
  - Add TypedDict for structured return values
  - Implement RailException for injection detection (a must-have for checks)
  - Improve error handling for malformed YARA rules
* improve test coverage
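As a rough illustration of what that commit describes — the names and fields below are assumptions from the commit message, not the project's actual definitions:

```python
from typing import List, TypedDict


class InjectionDetectionResult(TypedDict):
    """Hypothetical structured return value for the detection action."""

    is_injection: bool
    detected_rules: List[str]


class RailException(Exception):
    """Hypothetical exception raised when an injection check must hard-fail."""
```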
@mikemckiernan this should resolve the issue you were facing prior to opening #1146.
@Pouyanpi changes made:
Thank you @andompesta, this is GREAT 🥇
…ler.py
- Modified the `StreamingConsumer` class to store its asyncio task.
- Added a `cancel()` method and handled `asyncio.CancelledError`.
- Updated the test functions (`test_single_chunk`, `test_sequence_of_chunks`) and the helper function (`_test_pattern_case`) to call `consumer.cancel()` in a `finally` block.

These changes prevent `RuntimeError: Event loop is closed` and `Task was destroyed but it is pending!` warnings by ensuring background tasks are correctly cancelled and awaited upon test completion.
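The cancellation pattern that commit describes looks roughly like this (a simplified sketch; the real `StreamingConsumer` in the tests carries additional streaming state):

```python
import asyncio
from typing import Optional


class StreamingConsumer:
    """Sketch of a consumer that keeps a handle on its background task."""

    def __init__(self) -> None:
        self._task: Optional[asyncio.Task] = None

    def start(self, coro) -> None:
        # Schedule the consuming coroutine and remember the task handle.
        self._task = asyncio.ensure_future(coro)

    async def cancel(self) -> None:
        # Cancel the background task and await it, swallowing the expected
        # CancelledError so teardown leaves no pending tasks behind.
        if self._task is not None and not self._task.done():
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
```

and, in a test, the `finally` block would call it:

```python
consumer = StreamingConsumer()
consumer.start(stream_chunks())  # `stream_chunks` is a placeholder coroutine
try:
    ...  # assertions on the consumed output
finally:
    await consumer.cancel()
```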
coverage
- Added tests for various functionalities of StreamingHandler, including:
  - Piping streams between handlers
  - Buffering enable/disable behavior
  - Handling multiple stop tokens
  - Metadata inclusion and processing
  - Suffix and prefix pattern handling
  - Edge cases for the `__anext__` method
  - First-token handling and generation info
- Improved test coverage for async methods and error scenarios.
- Addressed potential issues with streaming termination signals.
…ling
- Replaced the inconsistent use of `None` and `""` for stream termination in `StreamingHandler` with a dedicated `END_OF_STREAM` sentinel object.
- Modified `push_chunk` to convert `None` to `END_OF_STREAM`.
- Updated `__anext__` to raise `StopAsyncIteration` only for `END_OF_STREAM` and to return empty strings, or dicts with empty/None text, as data.
- Adjusted `_process` to correctly handle `END_OF_STREAM` in the buffering and queueing logic.
- Updated `on_llm_end` to use `END_OF_STREAM`.
- Revised the tests in `test_streaming_handler.py` to reflect these changes, including how empty first tokens are handled and how `__anext__` behaves with various inputs.
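The sentinel pattern that commit describes reduces to something like the following (a minimal sketch; the actual `StreamingHandler` also handles buffering, stop tokens, and metadata):

```python
import asyncio

# A unique module-level sentinel; identity checks with `is` cannot collide
# with legitimate chunk values such as None, "", or {}.
END_OF_STREAM = object()


class StreamingHandler:
    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()

    async def push_chunk(self, chunk) -> None:
        # Normalize the legacy termination signal (None) to the sentinel.
        if chunk is None:
            chunk = END_OF_STREAM
        await self.queue.put(chunk)

    def __aiter__(self):
        return self

    async def __anext__(self):
        element = await self.queue.get()
        # Only the sentinel ends iteration; empty strings come through as data.
        if element is END_OF_STREAM:
            raise StopAsyncIteration
        return element
```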
…nd llm calls are properly tracked
Force-pushed from d7f8a4d to fc1b922.
Pull Request Overview
This PR addresses two areas: fixing a minor tracking bug for LLM call explanations via the new explain_info context variable, and standardizing streaming end-of-stream signaling using a sentinel object. In addition, it updates injection detection functions to return structured results and improves documentation on configuration and streaming.
Reviewed Changes
Copilot reviewed 8 out of 10 changed files in this pull request and generated 2 comments.
File | Description
---|---
tests/test_llm_rails_context_variables.py | Added new tests to validate streaming async mode and explain_info tracking.
tests/test_injection_detection.py | Added and updated tests for injection detection guardrails and error handling.
nemoguardrails/streaming.py | Updated streaming logic to use the END_OF_STREAM sentinel consistently instead of None/empty strings.
nemoguardrails/rails/llm/llmrails.py | Refactored explain_info handling by introducing _ensure_explain_info for consistency.
nemoguardrails/library/injection_detection/actions.py | Updated injection detection functions to return structured tuples and adjusted their docstrings.
docs/user-guides/configuration-guide.md & docs/user-guides/advanced/streaming.md | Revised documentation to reflect updated configuration and streaming usage details.
Files not reviewed (2)
- nemoguardrails/library/injection_detection/flows.co: Language not supported
- nemoguardrails/library/injection_detection/flows.v1.co: Language not supported
Comments suppressed due to low confidence (2)
nemoguardrails/library/injection_detection/actions.py:212
- The `_omit_injection` function now returns a tuple `(bool, str)` instead of just a string. Ensure that all downstream consumers are updated to handle this new tuple structure.

```python
def _omit_injection(text: str, matches: list["yara.Match"]) -> Tuple[bool, str]:
```

nemoguardrails/library/injection_detection/actions.py:292
- The `_reject_injection` function now returns a tuple `(bool, List[str])` for detected rules instead of a string. Verify that all call sites correctly handle the updated return type.

```python
def _reject_injection(text: str, rules: "yara.Rules") -> Tuple[bool, List[str]]:
```
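As a hypothetical illustration of updating a call site for the new return type (`check_user_input` and the stub body are invented for the example; only `_reject_injection`'s signature comes from the PR):

```python
from typing import List, Tuple


def _reject_injection(text: str, rules) -> Tuple[bool, List[str]]:
    # Stub stand-in for the real function in actions.py: report whether any
    # YARA rule matched and which rule names fired.
    matches = rules.match(data=text) if rules else []
    return bool(matches), [match.rule for match in matches]


def check_user_input(text: str, rules) -> str:
    # Caller updated to unpack the tuple instead of expecting a bare string.
    is_injection, detected_rules = _reject_injection(text, rules)
    if is_injection:
        return "Blocked by rules: " + ", ".join(detected_rules)
    return text
```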
```diff
 if element is END_OF_STREAM:
     break

 if isinstance(element, dict):
-    if element is not None and (
-        element.get("text") is None or element.get("text") == ""
-    ):
+    if element is not None and (element.get("text") is END_OF_STREAM):
```
[nitpick] Repeated checks against the END_OF_STREAM sentinel appear throughout the streaming logic. Consider extracting this check into a helper function to improve clarity and reduce duplication.
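Such a helper might look like this (a sketch only; the maintainers may scope or name it differently):

```python
def _is_end_of_stream(element) -> bool:
    # True for the sentinel itself, or for a dict chunk whose "text" field
    # carries the sentinel.
    if element is END_OF_STREAM:
        return True
    return isinstance(element, dict) and element.get("text") is END_OF_STREAM
```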
```diff
 # We also keep a general reference to this object
-self.explain_info = explain_info
+self.explain_info = self._ensure_explain_info()
```
[nitpick] The explain_info context handling has been refactored to use _ensure_explain_info. Consider updating the corresponding docstring to clarify its purpose and behavior.
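A docstring along these lines would address the nitpick (the wording is a suggestion; the body repeats the snippet proposed earlier in the thread):

```python
def _ensure_explain_info(self) -> ExplainInfo:
    """Return the ExplainInfo for the current async context.

    Reads the explain_info_var context variable; when it is unset, lazily
    creates a new ExplainInfo, stores it back into the context variable,
    and returns it, so streaming requests always have an object on which
    to record LLM calls.
    """
    explain_info = explain_info_var.get()
    if explain_info is None:
        explain_info = ExplainInfo()
        explain_info_var.set(explain_info)
    return explain_info
```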
Description
@Pouyanpi this PR is meant to solve a minor bug in tracking the LLM calls made by NeMo Guardrails when used in streaming async mode.
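The scenario being fixed looks roughly like this (a sketch assuming the public `LLMRails` streaming API and an illustrative config path):

```python
import asyncio

from nemoguardrails import LLMRails, RailsConfig


async def main() -> None:
    config = RailsConfig.from_path("./config")  # illustrative path
    rails = LLMRails(config)

    # Before this fix, LLM calls made inside stream_async were not recorded
    # in the per-request explain info.
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": "Hello!"}]
    ):
        print(chunk, end="")

    # explain() should now report the LLM call(s) made during streaming.
    info = rails.explain()
    print(f"\nTracked {len(info.llm_calls)} LLM call(s).")


asyncio.run(main())
```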
Related Issue(s)
issue-1192
Checklist