-
Notifications
You must be signed in to change notification settings - Fork 3
Iris
: Add FAQ consistency check
#61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… repositories Repositories: Pyris Repositories without this branch: Atlas A 8000 thena
IRIS
: Add FAQ consistency tabIris
: Add FAQ consistency tab
Iris
: Add FAQ consistency tabIris
: Add FAQ consistency check
There hasn't been any activity on this pull request recently. Therefore, this pull request has been automatically marked as stale and will be closed if no further activity occurs within seven days. Thank you for your contributions. |
Reopen, since i am back working on it |
…sistency' into iris/feature/faq/add-rewrite-consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (6)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (6)
21-21
: Clarify the reference to "four entries".The phrase "The following four entries" is ambiguous. It would be clearer to explicitly name these entries upfront.
-The following four entries are optional and should only be set if inconsistencies are detected. +The following four fields ("faqs", "message", "suggestion", "improved version") are optional and should only be set if inconsistencies are detected.
27-27
: Fix unclear phrasing about inconsistency assumption.The sentence structure is unclear and could confuse the AI model about when to consider content inconsistent.
-Assume that existing FAQs are correct, so the new final_result is inconsistent. +Assume that existing FAQs are correct, so if there's a contradiction, the final_result should be considered inconsistent.
33-33
: Fix grammatical error.--Make sure to always insert two new lines after the last character of this sentences. +-Make sure to always insert two new lines after the last character of this sentence.
38-38
: Fix literal newline characters.The literal
\n
characters should likely be actual newlines for better formatting.--"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result.\n +-"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result.
42-42
: Fix grammatical error.-- Each suggestions highlights what is the inconsistency and how it can be improved. +- Each suggestion highlights what is the inconsistency and how it can be improved.
44-44
: Clarify ambiguous reference.The word "Both" doesn't have a clear antecedent, making the instruction confusing.
-Both should have the same amount of entries. +The suggestions and inconsistencies should have the same amount of entries.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Build and Push to GitHub Container Registry / Build Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Lint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (4)
19-22
: 🛠️ Refactor suggestionComplete the JSON structure specification.
The output specification is still incomplete, showing only the "type" field while the prompt later describes additional fields ("faqs", "message", "suggestion", "improved version"). This incomplete specification could confuse the AI model about the expected output structure.
Apply this diff to provide the complete JSON structure:
-### Output: -Generate the following response dictionary: -"type": "consistent" or "inconsistent" +### Output: +Generate a JSON response dictionary with the following structure: +{ + "type": "consistent" or "inconsistent", + // Additional fields below are required only if type is "inconsistent" + "faqs": [...], + "message": "...", + "suggestion": [...], + "improved version": "..." +}
4-5
: 🛠️ Refactor suggestionFix inconsistent terminology throughout the prompt.
The prompt instructs not to use "final result" (line 46) but continues to use this term extensively throughout the template (lines 4, 5, 17, 34, 43, 51). This inconsistency could confuse the AI model.
Apply this diff to maintain consistent terminology:
-You have been provided with a list of FAQs and a final result. Your task is to determine whether the -final result is consistent with the given FAQs. Please compare each FAQ with the final result separately. +You have been provided with a list of FAQs and a provided text. Your task is to determine whether the +provided text is consistent with the given FAQs. Please compare each FAQ with the provided text separately.Similar changes should be applied to all other instances of "final result" throughout the prompt.
Also applies to: 17-17, 33-34, 43-43, 46-46, 51-51
37-41
: 🛠️ Refactor suggestionRemove confusing placeholder references.
The instructions mention inserting FAQ details into placeholders (line 41), but the message template (line 37) contains no visible placeholders. This creates confusion about message formatting expectations.
Apply this diff to clarify the instructions:
-"message": "The provided text was rephrased, however it contains inconsistent information with existing FAQs." --Make sure to always insert two new lines after the last character of this sentences. -The affected FAQs can only contain the faq_id, faq_question_title, and faq_question_answer of inconsistent FAQs. -Make sure to not include any additional FAQs, that are consistent with the final_result. -Insert the faq_id, faq_question_title, and faq_question_answer of the inconsistent FAQ in the placeholder. +"message": "The provided text was rephrased, however it contains inconsistent information with existing FAQs." +-Make sure to always insert two new lines after the last character of this sentence. +The "faqs" field should contain only the inconsistent FAQs with their faq_id, faq_question_title, and faq_question_answer. +Make sure to not include any additional FAQs that are consistent with the provided text.
1-55
: 🛠️ Refactor suggestionOverall assessment: Past review issues remain unaddressed.
While this prompt template serves its intended purpose of FAQ consistency validation, the previously identified clarity and formatting issues have not been resolved. The prompt would benefit from a comprehensive revision to address terminology consistency, complete JSON specification, and formatting uniformity to ensure reliable AI model comprehension.
🧹 Nitpick comments (2)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (2)
10-10
: Fix formatting issue with escaped newline character.There's an unnecessary escaped newline character (
\n
) in the middle of the sentence that breaks the flow and formatting consistency.Apply this diff to fix the formatting:
-expressions within the course context. Do not assume equivalence between terms unless explicitly stated. +expressions within the course context. Do not assume equivalence between terms unless explicitly stated.
47-47
: Fix inconsistent formatting with newline character.Line 47 ends with an escaped newline character (
\n
) which is inconsistent with the formatting style used elsewhere in the prompt.Apply this diff to maintain consistent formatting:
-- Please ensure that at no time, you have a different amount of suggestions than inconsistencies.\n +- Please ensure that at no time, you have a different amount of suggestions than inconsistencies.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Build and Push to GitHub Container Registry / Build Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Lint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (4)
21-22
: 🛠️ Refactor suggestionComplete the JSON structure specification.
The prompt only shows the "type" field but doesn't specify the complete expected JSON output structure, which could confuse the AI model about the full response format.
Apply this diff to provide the complete JSON structure:
-Generate the following response dictionary: -"type": "consistent" or "inconsistent" +Generate a JSON response dictionary with the following structure: +{ + "type": "consistent" or "inconsistent", + "faqs": [...], // Required only if type is "inconsistent" + "message": "...", // Required only if type is "inconsistent" + "suggestion": [...], // Required only if type is "inconsistent" + "improved version": "..." // Required only if type is "inconsistent" +}
34-34
: 🛠️ Refactor suggestionClarify the confusing assumption statement.
The statement "Assume that existing FAQs are correct, so the new final_result is inconsistent" is grammatically unclear and could confuse the AI model about the logic.
Apply this diff to improve clarity:
-Assume that existing FAQs are correct, so the new final_result is inconsistent. +Assume that existing FAQs are correct. If there are contradictions, the provided text should be considered inconsistent with the FAQs.
40-42
: 🛠️ Refactor suggestionRemove confusing placeholder references.
The instructions mention inserting FAQ details into placeholders, but the message template contains no visible placeholders, which could confuse the AI model.
Apply this diff to remove the confusing placeholder references:
-The affected FAQs can only contain the faq_id, faq_question_title, and faq_question_answer of inconsistent FAQs. -Make sure to not include any additional FAQs, that are consistent with the final_result. -Insert the faq_id, faq_question_title, and faq_question_answer of the inconsistent FAQ in the placeholder. +The "faqs" field should contain only the inconsistent FAQs with their faq_id, faq_question_title, and faq_question_answer. +Make sure to not include any additional FAQs that are consistent with the final_result.
47-47
: 🛠️ Refactor suggestionFix inconsistent terminology usage.
The prompt instructs not to use "final result" but uses this term throughout the template, creating confusion for the AI model.
Apply this diff to maintain consistent terminology:
-- Do not mention the term final result, call it provided text +- Do not mention the term "final result", call it "provided text"Additionally, update all instances of "final result" throughout the prompt to "provided text" for consistency.
🧹 Nitpick comments (4)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (4)
10-10
: Fix formatting issue with escaped newline.The line contains a
\n
character that appears to be an escaped newline, which could affect prompt formatting.Apply this diff to fix the formatting:
-expressions within the course context. Do not assume equivalence between terms unless explicitly stated. +expressions within the course context. Do not assume equivalence between terms unless explicitly stated.
39-39
: Fix grammatical error.The word "sentences" should be "sentence" since it refers to a single sentence.
Apply this diff:
--Make sure to always insert two new lines after the last character of this sentences. +-Make sure to always insert two new lines after the last character of this sentence.
45-45
: Fix grammatical error."Each suggestions" should be "Each suggestion" for proper grammar.
Apply this diff:
-- Each suggestions highlights what is the inconsistency and how it can be improved. +- Each suggestion highlights what is the inconsistency and how it can be improved.
44-51
: Remove unnecessary escaped newlines and clarify unclear reference.The section contains escaped newlines (
\n
) that may affect formatting, and line 50 has an unclear reference to "Both" without clear context.Apply this diff to clean up formatting and clarify the reference:
--"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result.\n +- "suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result. - Each suggestion should focus on a different inconsistency. - Each suggestion highlights what is the inconsistency and how it can be improved. - Do not mention the term "final result", call it "provided text" -- Please ensure that at no time, you have a different amount of suggestions than inconsistencies.\n +- Please ensure that at no time, you have a different amount of suggestions than inconsistencies. - Highlight how you can improve the rewritten text to be consistent with the existing FAQs. -Both should have the same amount of entries. +The "faqs" and "suggestion" fields should have the same number of entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (1)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (1)
37-37
: Clarify confusing assumption statement.This statement is still unclear and was flagged in previous reviews. The logic needs to be more explicit.
-Assume that existing FAQs are correct, so the new final_result is inconsistent. +Assume that existing FAQs are correct. If there are contradictions, consider the final_result inconsistent with the FAQs.
🧹 Nitpick comments (2)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (2)
25-25
: Fix typo in comment.There's a typo "addition al" that should be "additional".
- // The addition al fields are defined below + // The additional fields are defined below
54-54
: Fix formatting inconsistency.The dash formatting should be consistent with other list items.
--"improved version": This entry should be a string that represents the improved version of the final result. +- "improved version": This entry should be a string that represents the improved version of the final result.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
(1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
[convention] 1-1: Missing module docstring
(C0114)
[convention] 1-1: Constant name "faq_consistency_prompt" doesn't conform to UPPER_CASE naming style
(C0103)
🪛 GitHub Actions: Iris - Test
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
[error] 1-1: Pre-commit hook 'trailing-whitespace' modified this file to trim trailing whitespace.
[error] 43-43: flake8: line too long (122 > 120 characters) (E501)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (3)
1-1
:⚠️ Potential issueAdd module docstring and fix constant naming convention.
The module is still missing a docstring and the constant name doesn't follow Python naming conventions, as flagged in previous reviews and static analysis.
Apply this diff to address the persistent static analysis issues:
+""" +Prompt template for FAQ consistency validation in the rewriting pipeline. + +This module contains the prompt template used to check if rewritten content +is consistent with existing FAQ entries fo 10973 r a course. +""" + -faq_consistency_prompt = """ +FAQ_CONSISTENCY_PROMPT = """🧰 Tools
🪛 Pylint (3.3.7)
[convention] 1-1: Missing module docstring
(C0114)
[convention] 1-1: Constant name "faq_consistency_prompt" doesn't conform to UPPER_CASE naming style
(C0103)
🪛 GitHub Actions: Iris - Test
[error] 1-1: Pre-commit hook 'trailing-whitespace' modified this file to trim trailing whitespace.
34-34
: 🛠️ Refactor suggestionClarify the confusing assumption statement.
The statement "Assume that existing FAQs are correct, so the new final_result is inconsistent" is confusing and was flagged in previous reviews but remains unclear. The logic doesn't flow properly.
Apply this diff to improve clarity:
-Assume that existing FAQs are correct, so the new final_result is inconsistent. +Assume that existing FAQs are correct. When contradictions exist, consider the final_result as inconsistent with the FAQs.
43-49
: 🛠️ Refactor suggestionFix formatting inconsistencies and grammar error.
The formatting uses inconsistent dash and quote combinations, and there's a grammar error on line 45.
Apply this diff to standardize formatting and fix grammar:
--"suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result. +- "suggestion": This entry is a list of strings, each string represents a suggestion to improve the final result. - Each suggestion should focus on a different inconsistency. -- Each suggestions highlights what is the inconsistency and how it can be improved. +- Each suggestion highlights what the inconsistency is and how it can be improved. - Do not mention the term final result, call it provided text -- Please ensure that at no time, you have a different amount of suggestions than inconsistencies. +- Please ensure that at no time, you have a different amount of suggestions than inconsistencies. - Highlight how you can improve the rewritten text to be consistent with the existing FAQs. Both should have the same amount of entries.
🧹 Nitpick comments (2)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py (2)
39-39
: Fix grammar error in instruction."sentences" should be "sentence" for grammatical correctness.
Apply this diff:
--Make sure to always insert two new lines after the last character of this sentences. +-Make sure to always insert two new lines after the last character of this sentence.
51-51
: Fix formatting consistency for improved version field.The formatting should match the other field descriptions for consistency.
Apply this diff:
--"improved version": This entry should be a string that represents the improved version of the final result. +- "improved version": This entry should be a string that represents the improved version of the final result.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
(1 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
[convention] 1-1: Missing module docstring
(C0114)
[convention] 1-1: Constant name "faq_consistency_prompt" doesn't conform to UPPER_CASE naming style
(C0103)
🪛 GitHub Actions: Iris - Test
iris/src/iris/pipeline/prompts/faq_consistency_prompt.py
[error] 1-1: Pre-commit hook 'trailing-whitespace' modified this file to trim trailing whitespace.
Motivation
To ensure the quality and consistency of FAQ entries, a dedicated consistency-checking mechanism has been introduced. This allows automated detection of structural or semantic inconsistencies in FAQ data, improving content maintainability and supporting editors during the revision process.
Description
Summary by CodeRabbit
New Features
Improvements
Bug Fixes