From 0f1fcc3685d27082c060297bf8113eaa53775a4e Mon Sep 17 00:00:00 2001
From: Yesudeep Mangalapilly <yesudeep@google.com>
Date: Wed, 14 May 2025 02:42:07 -0700
Subject: [PATCH] feat(py/dotpromptz): test harness that creates suites and
 test case methods dynamically
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The previous test harness ran every suite and test case from a single
test method, iterating over them inside nested `subTest` blocks. That
made it hard to identify and scope individual failures.

This change generates one test suite class per spec suite and one test
method per test case dynamically, adding the classes to the module
namespace before the `unittest.main` entry point takes over.
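
For reviewers unfamiliar with the approach, here is a minimal sketch of
the idea, not the code in this patch. It assumes an in-memory `SUITES`
dict standing in for the parsed YAML files; `SUITES`, `SpecTestBase`,
`run_spec`, `make_test_method` and `generate_suites` are hypothetical
names chosen for illustration.

```python
import unittest

# Hypothetical stand-in for suites parsed from YAML spec files:
# suite name -> list of test case descriptions.
SUITES = {
    'basic': ['uses a default variable', 'uses a provided variable'],
    'type_safety': ['treats different types as not equal'],
}


class SpecTestBase(unittest.TestCase):
    """Template base class shared by every generated suite class."""

    def run_spec(self, suite_name: str, desc: str) -> None:
        # Placeholder assertion; a real harness would render the template
        # and compare the result with the expected output.
        self.assertTrue(suite_name)
        self.assertTrue(desc)


def make_test_method(suite_name: str, desc: str):
    """Returns a test method bound to one suite/test-case pair."""

    # Using a factory gives each method its own suite_name and desc,
    # which avoids the late-binding pitfall of closures created in a loop.
    def test_method(self: SpecTestBase) -> None:
        self.run_spec(suite_name, desc)

    return test_method


def generate_suites(namespace: dict) -> dict:
    """Creates one TestCase subclass per suite and registers it in namespace."""
    generated = {}
    for suite_name, descs in SUITES.items():
        class_name = f'Test_{suite_name}_Suite'
        klass = type(class_name, (SpecTestBase,), {})
        for index, desc in enumerate(descs, start=1):
            # The numeric suffix keeps method names unique within the class.
            setattr(klass, f'test_{suite_name}_{index}', make_test_method(suite_name, desc))
        # unittest discovery looks up TestCase subclasses by module attribute.
        namespace[class_name] = klass
        generated[class_name] = klass
    return generated


# Must run at import time so the classes exist before test discovery.
generated = generate_suites(globals())
```

In a test module, calling `unittest.main()` after `generate_suites`
would discover the generated classes like handwritten ones, so each
test case is reported, and can fail, independently.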

CHANGELOG:
- [ ] Update `spec_test.py` with the newer implementation.

USAGE:

```
zsh❯ uv run spec_test.py
...2025-05-14 02:29:57 [info     ] [TEST] history > basic_history > inserts conversation history with proper metadata and roles
.2025-05-14 02:29:57 [info     ] [TEST] history > empty_history > handles empty history by only rendering the template content
.2025-05-14 02:29:57 [info     ] [TEST] ifEquals > basic > renders false branch when values are not equal
.2025-05-14 02:29:57 [info     ] [TEST] ifEquals > basic > renders true branch when values are equal
.2025-05-14 02:29:57 [info     ] [TEST] ifEquals > type_safety > treats different types as not equal
.2025-05-14 02:29:57 [info     ] [TEST] json > basic > renders json in place
.2025-05-14 02:29:57 [info     ] [TEST] json > indented > renders json in place
.2025-05-14 02:29:57 [info     ] [TEST] media > basic > renders media part
.2025-05-14 02:29:57 [info     ] [TEST] metadata > ext > extension fields are parsed and added to 'ext'
.2025-05-14 02:29:57 [info     ] [TEST] metadata > metadata_state > accesses state object from metadata
.2025-05-14 02:29:58 [info     ] [TEST] metadata > metadata_state > handles missing state values
.2025-05-14 02:29:58 [info     ] [TEST] metadata > metadata_state > handles nested state objects
.2025-05-14 02:29:58 [info     ] [TEST] metadata > raw > raw frontmatter is provided on top of parsed frontmatter
.2025-05-14 02:29:58 [info     ] [TEST] partials > basic_partial > renders a basic partial
2025-05-14 02:29:58 [debug    ] {'event': 'partial_registered', 'name': 'greeting'}
.2025-05-14 02:29:58 [info     ] [TEST] partials > nested_resolved_partial > renders a resolver partial inside a resolver partial.
.2025-05-14 02:29:58 [info     ] [TEST] partials > partial_with_context > renders a partial with context
2025-05-14 02:29:58 [debug    ] {'event': 'partial_registered', 'name': 'userGreeting'}
.2025-05-14 02:29:58 [info     ] [TEST] partials > resolved_partial > renders a partial provided by a resolver.
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > any_field > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > array_of_scalars > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > enum_field > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > inferred_json_schema_from_properties > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > input_and_output > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > line_endings_crlf > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > named_schema_override_description > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > nested_named_schema > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > nested_object_in_array_and_out > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > required_field > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_json_schema_type > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_object > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_scalar_description > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_scalar_description_extra_whitespace > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_scalar_description_no_whitespace > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_scalar_description_with_commas > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > simple_scalar_no_description > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > wildcard_fields_with_other_fields > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] picoschema > wildcard_fields_without_other_fields > returns as expected
.2025-05-14 02:29:58 [info     ] [TEST] role > all_roles > allows system, user, and model roles
.2025-05-14 02:29:58 [info     ] [TEST] role > system_only_prompt > inserts history after system prompt
.2025-05-14 02:29:58 [info     ] [TEST] role > system_role > renders variables in system and user role
.2025-05-14 02:29:58 [info     ] [TEST] section > basic_section > renders sequential sections with proper metadata and content boundaries
.2025-05-14 02:29:58 [info     ] [TEST] section > nested_sections > handles nested and reopened sections with proper metadata boundaries
.2025-05-14 02:29:58 [info     ] [TEST] unlessEquals > basic > renders false branch when values are equal
.2025-05-14 02:29:58 [info     ] [TEST] unlessEquals > basic > renders true branch when values are different
.2025-05-14 02:29:58 [info     ] [TEST] unlessEquals > type_safety > treats different types as not equal
.2025-05-14 02:29:58 [info     ] [TEST] variables > basic > does not escape HTML
.2025-05-14 02:29:58 [info     ] [TEST] variables > basic > overrides a default variable with a provided variable
.2025-05-14 02:29:58 [info     ] [TEST] variables > basic > uses a default variable
.2025-05-14 02:29:58 [info     ] [TEST] variables > basic > uses a provided variable
.
----------------------------------------------------------------------
Ran 51 tests in 0.054s
```
---
 README.md                                     |  13 +-
 .../dotpromptz/tests/dotpromptz/spec_test.py  | 378 ++++++++++++------
 2 files changed, 260 insertions(+), 131 deletions(-)

diff --git a/README.md b/README.md
index 90c9c0d12..a22cb1b56 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,7 @@
 [![Go checks](https://github.com/google/dotprompt/actions/workflows/go.yml/badge.svg)](https://github.com/google/dotprompt/actions/workflows/go.yml)
 [![Python checks](https://github.com/google/dotprompt/actions/workflows/python.yml/badge.svg)](https://github.com/google/dotprompt/actions/workflows/python.yml)
 [![JS checks](https://github.com/google/dotprompt/actions/workflows/test.yml/badge.svg)](https://github.com/google/dotprompt/actions/workflows/test.yml)
+[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/google/dotprompt)
 
 # Dotprompt: Executable GenAI Prompt Templates
 
@@ -16,26 +17,26 @@ An executable prompt template is a file that contains not only the text of a
 prompt but also metadata and instructions for how to use that prompt with a
 generative AI model. Here's what makes Dotprompt files executable:
 
-- **Metadata Inclusion**: Dotprompt files include metadata about model
+* **Metadata Inclusion**: Dotprompt files include metadata about model
   configuration, input requirements, and expected output format. This
   information is typically stored in a YAML frontmatter section at the beginning
   of the file.
 
-- **Self-Contained Entity**: Because a Dotprompt file contains all the necessary
+* **Self-Contained Entity**: Because a Dotprompt file contains all the necessary
   information to execute a prompt, it can be treated as a self-contained entity.
   This means you can "run" a Dotprompt file directly, without needing additional
   configuration or setup in your code.
 
-- **Model Configuration**: The file specifies which model to use and how to
+* **Model Configuration**: The file specifies which model to use and how to
   configure it (e.g., temperature, max tokens).
 
-- **Input Schema**: It defines the structure of the input data expected by the
+* **Input Schema**: It defines the structure of the input data expected by the
   prompt, allowing for validation and type-checking.
 
-- **Output Format**: The file can specify the expected format of the model's
+* **Output Format**: The file can specify the expected format of the model's
   output, which can be used for parsing and validation.
 
-- **Templating**: The prompt text itself uses Handlebars syntax, allowing for
+* **Templating**: The prompt text itself uses Handlebars syntax, allowing for
   dynamic content insertion based on input variables.
 
 This combination of features makes it possible to treat a Dotprompt file as an
diff --git a/python/dotpromptz/tests/dotpromptz/spec_test.py b/python/dotpromptz/tests/dotpromptz/spec_test.py
index f2e06c6e7..8be0c7b78 100644
--- a/python/dotpromptz/tests/dotpromptz/spec_test.py
+++ b/python/dotpromptz/tests/dotpromptz/spec_test.py
@@ -99,19 +99,46 @@
 
 from __future__ import annotations
 
+import re
 import unittest
+from collections.abc import Callable, Coroutine
 from pathlib import Path
-from typing import Any, TypedDict, cast
+from typing import Any, TypedDict
 
 import structlog
 import yaml
 
 from dotpromptz.dotprompt import Dotprompt
-from dotpromptz.typing import DataArgument, JsonSchema, PromptMetadata, ToolDefinition
+from dotpromptz.typing import DataArgument, JsonSchema, ToolDefinition
 
 logger = structlog.get_logger(__name__)
 
 
+CURRENT_FILE = Path(__file__)
+ROOT_DIR = CURRENT_FILE.parent.parent.parent.parent.parent
+SPECS_DIR = ROOT_DIR / 'spec'
+
+# List of files that are allowed to be used as spec files.
+# Useful for debugging and testing.
+ALLOWLISTED_FILES = [
+    'spec/helpers/history.yaml',
+    'spec/helpers/ifEquals.yaml',
+    'spec/helpers/json.yaml',
+    'spec/helpers/media.yaml',
+    'spec/helpers/role.yaml',
+    'spec/helpers/section.yaml',
+    'spec/helpers/unlessEquals.yaml',
+    'spec/metadata.yaml',
+    'spec/partials.yaml',
+    'spec/picoschema.yaml',
+    'spec/variables.yaml',
+]
+
+# Counters for test class and test method names.
+suite_counter = 0
+test_case_counter = 0
+
+
 class Expect(TypedDict, total=False):
     """An expectation for the spec."""
 
@@ -145,25 +172,6 @@ class SpecSuite(TypedDict, total=False):
     tests: list[SpecTest]
 
 
-CURRENT_FILE = Path(__file__)
-ROOT_DIR = CURRENT_FILE.parent.parent.parent.parent.parent
-SPECS_DIR = ROOT_DIR / 'spec'
-
-ALLOWLISTED_FILES = [
-    'spec/helpers/history.yaml',
-    'spec/helpers/ifEquals.yaml',
-    'spec/helpers/json.yaml',
-    'spec/helpers/media.yaml',
-    'spec/helpers/role.yaml',
-    'spec/helpers/section.yaml',
-    'spec/helpers/unlessEquals.yaml',
-    'spec/metadata.yaml',
-    'spec/partials.yaml',
-    'spec/picoschema.yaml',
-    'spec/variables.yaml',
-]
-
-
 def is_allowed_spec_file(file: Path) -> bool:
     """Check if a spec file is allowed.
 
@@ -180,8 +188,84 @@ def is_allowed_spec_file(file: Path) -> bool:
     return False
 
 
+def sanitize_name_component(name: str | None) -> str:
+    """Sanitizes a name component for use in a Python identifier.
+
+    Args:
+        name: The name to sanitize.
+
+    Returns:
+        A sanitized name.
+    """
+    name_str = str(name) if name is not None else 'None'
+    name_str = re.sub(r'[^a-zA-Z0-9_]', '_', name_str)
+    if name_str and name_str[0].isdigit():
+        name_str = '_' + name_str
+    return name_str or 'unnamed_component'
+
+
+def make_test_method_name(yaml_file_name: str, suite_name: str | None, test_desc: str | None) -> str:
+    """Creates a sanitized test method name.
+
+    Args:
+        yaml_file_name: The name of the YAML file.
+        suite_name: The name of the suite.
+        test_desc: The description of the test.
+
+    Returns:
+        A sanitized test method name.
+    """
+    file_part = sanitize_name_component(yaml_file_name.replace('.yaml', ''))
+    suite_part = sanitize_name_component(suite_name)
+    desc_part = sanitize_name_component(test_desc)
+    return f'test_{file_part}_{suite_part}_{desc_part}_'
+
+
+def make_test_class_name(yaml_file_name: str, suite_name: str | None) -> str:
+    """Creates a sanitized test class name for a suite.
+
+    Args:
+        yaml_file_name: The name of the YAML file.
+        suite_name: The name of the suite.
+
+    Returns:
+        A sanitized test class name.
+    """
+    file_part = sanitize_name_component(yaml_file_name.replace('.yaml', ''))
+    suite_part = sanitize_name_component(suite_name)
+    return f'Test_{file_part}_{suite_part}Suite'
+
+
+def make_dotprompt_for_suite(suite: SpecSuite) -> Dotprompt:
+    """Constructs and sets up a Dotprompt instance for the given suite.
+
+    Args:
+        suite: The suite to construct a Dotprompt for.
+
+    Returns:
+        A Dotprompt instance.
+    """
+    resolver_partials_from_suite: dict[str, str] = suite.get('resolver_partials', {})
+
+    def partial_resolver_fn(name: str) -> str | None:
+        return resolver_partials_from_suite.get(name)
+
+    dotprompt = Dotprompt(
+        schemas=suite.get('schemas'),
+        tools=suite.get('tools'),
+        partial_resolver=partial_resolver_fn if resolver_partials_from_suite else None,
+    )
+
+    # Register partials directly defined in the suite
+    defined_partials: dict[str, str] = suite.get('partials', {})
+    for name, template_content in defined_partials.items():
+        dotprompt.define_partial(name, template_content)
+
+    return dotprompt
+
+
 class TestSpecFiles(unittest.IsolatedAsyncioTestCase):
-    """Runs specification tests defined in YAML files."""
+    """Runs essential checks to ensure the spec directory is valid."""
 
     def test_spec_path(self) -> None:
         """Test that the spec directory exists."""
@@ -199,126 +283,170 @@ def test_spec_files_are_valid(self) -> None:
                 data = yaml.safe_load(f)
                 self.assertIsNotNone(data)
 
-    async def test_specs(self) -> None:
-        """Discovers and runs all YAML specification tests."""
-        for yaml_file in SPECS_DIR.glob('**/*.yaml'):
-            if not is_allowed_spec_file(yaml_file):
-                logger.warn(
-                    'Skipping spec file',
-                    file=yaml_file,
-                )
-                continue
-            with self.subTest(file=yaml_file):
-                with open(yaml_file) as f:
-                    suites_data = yaml.safe_load(f)
 
-                for suite_data_raw in suites_data:
-                    suite: SpecSuite = cast(SpecSuite, suite_data_raw)
-                    suite_name: str = suite.get('name', f'UnnamedSuite_in_{yaml_file.name}')
+class YamlSpecTestBase(unittest.IsolatedAsyncioTestCase):
+    """A base class that is used as a template for all YAML spec test suites."""
 
-                    suite['name'] = suite_name
-
-                    with self.subTest(suite=suite_name):
-                        for tc_raw in suite.get('tests', []):
-                            tc: SpecTest = tc_raw
-                            tc_name = tc.get('desc', f'UnnamedTest_in_{suite_name}')
-                            tc['desc'] = tc_name
-
-                            with self.subTest(test=tc_name):
-                                # TODO: Doing this per test case is safer for
-                                # test sandboxing but we could perhaps do this
-                                # per suite as well.
-                                dotprompt = self.make_dotprompt(suite)
-                                await self.run_yaml_test(yaml_file, dotprompt, suite, tc)
-
-    def make_dotprompt(self, suite: SpecSuite) -> Dotprompt:
-        """Constructs and sets up a Dotprompt instance for the given suite.
+    async def run_yaml_test(self, yaml_file: Path, suite: SpecSuite, test_case: SpecTest) -> None:
+        """Runs a YAML test.
 
         Args:
-            suite: The suite to set up the Dotprompt for.
+            yaml_file: The path to the YAML file.
+            suite: The suite to run the test on.
+            test_case: The test case to run.
 
         Returns:
-            A Dotprompt instance configured for the given suite.
+            None.
         """
-        resolver_partials: dict[str, str] = suite.get('resolver_partials', {})
+        suite_name = suite.get('name', 'UnnamedSuite')
+        test_desc = test_case.get('desc', 'UnnamedTest')
+        logger.info(f'[TEST] {yaml_file.stem} > {suite_name} > {test_desc}')
 
-        def partial_resolver_fn(name: str) -> str | None:
-            """Resolves a partial name to a template string.
+        # Create test-specific dotprompt instance.
+        dotprompt = make_dotprompt_for_suite(suite)
+        self.assertIsNotNone(dotprompt)
 
-            Args:
-                name: The name of the partial to resolve.
+        # TODO: Add test logic here.
 
-            Returns:
-                The template string for the partial, or None if the partial is not found.
-            """
-            return resolver_partials.get(name)
 
-        dotprompt = Dotprompt(
-            schemas=suite.get('schemas'),
-            tools=suite.get('tools'),
-            partial_resolver=partial_resolver_fn if resolver_partials else None,
-        )
+def make_suite_class_name(yaml_file: Path, suite_name: str | None) -> str:
+    """Creates a class name for a suite.
 
-        # Define partials if they exist.
-        partials: dict[str, str] = suite.get('partials', {})
-        for name, template in partials.items():
-            dotprompt.define_partial(name, template)
+    Args:
+        yaml_file: The path to the YAML file.
+        suite_name: The name of the suite.
 
-        return dotprompt
+    Returns:
+        A class name for the suite.
+    """
+    global suite_counter
+    suite_counter += 1
+    file_part = sanitize_name_component(yaml_file.stem)
+    suite_part = sanitize_name_component(suite_name)
+    return f'Test_{file_part}_{suite_part}Suite_{suite_counter}'
 
-    async def run_yaml_test(
-        self,
-        yaml_file: Path,
-        dotprompt: Dotprompt,
-        suite: SpecSuite,
-        test_case: SpecTest,
-    ) -> None:
-        """Runs a single specification test.
 
-        Args:
-            yaml_file: The YAML file containing the specification.
-            dotprompt: The Dotprompt instance to use.
-            suite: The suite to run the test on.
-            test_case: The test case to run.
+def make_test_case_name(yaml_file: Path, suite_name: str, test_desc: str) -> str:
+    """Creates a test case name.
 
-        Returns:
-            None
-        """
-        suite_name = suite.get('name')
-        test_name = test_case.get('desc')
-        logger.info(
-            f'[TEST] \033[1m{yaml_file.name}\033[0m: {suite_name} > {test_name}',
-            yaml_file=yaml_file.name,
-            suite_name=suite_name,
-            test_name=test_name,
-            # suite=suite,
-            # test=test_case,
-        )
-
-        # TODO: Render the template.
-        # data = {**suite.get('data', {}), **test_case.get('data', {})}
-        # result = await dotprompt.render(
-        #    suite.get('template'),
-        #    DataArgument(**data),
-        #    PromptMetadata(**test_case.get('options', {})),
-        # )
-
-        # TODO: Prune the result and compare to the expected output.
-        # TODO: Compare pruned result to the expected output.
-        # TODO: Only compare raw if the spec demands it.
-        # TODO: Render the metadata.
-        # TODO: Compare pruned metadata to the expected output.
-
-        # logger.info(
-        #    f'[TEST] \033[1m{yaml_file.name}\033[0m: {suite_name} > {test_name} finished',
-        #    yaml_file=yaml_file.name,
-        #    suite_name=suite_name,
-        #    test_name=test_name,
-        #    # suite=suite,
-        #    # test=test_case,
-        #    # result=result,
-        # )
+    Args:
+        yaml_file: The path to the YAML file.
+        suite_name: The name of the suite.
+        test_desc: The description of the test.
 
+    Returns:
+        A test case name.
+    """
+    global test_case_counter
+    test_case_counter += 1
+    file_part = sanitize_name_component(yaml_file.stem)
+    suite_part = sanitize_name_component(suite_name)
+    test_method_part = sanitize_name_component(test_desc)
+    return f'test_{file_part}_{suite_part}_{test_method_part}_{test_case_counter}'
+
+
+def make_async_test_case_method(
+    yaml_file: Path,
+    suite: SpecSuite,
+    test_case: SpecTest,
+) -> Callable[[YamlSpecTestBase], Coroutine[Any, Any, None]]:
+    """Creates an async test method for a test case.
+
+    Args:
+        yaml_file: The path to the YAML file.
+        suite: The suite to create the test method for.
+        test_case: The test case to create the test method for.
+
+    Returns:
+        An async test method.
+    """
+
+    async def test_method(self_dynamic: YamlSpecTestBase) -> None:
+        """An async test method."""
+        await self_dynamic.run_yaml_test(yaml_file, suite, test_case)
+
+    return test_method
+
+
+def make_async_skip_test_method(
+    yaml_file: Path, suite_name: str
+) -> Callable[[YamlSpecTestBase], Coroutine[Any, Any, None]]:
+    """Creates a skip test for a suite.
+
+    Args:
+        yaml_file: The path to the YAML file.
+        suite_name: The name of the suite.
+
+    Returns:
+        A skip test.
+    """
+
+    async def skip_method(self_dynamic: YamlSpecTestBase) -> None:
+        self_dynamic.skipTest(f"Suite '{suite_name}' in {yaml_file.stem} has no tests.")
+
+    return skip_method
+
+
+def generate_test_suites(files: list[Path]) -> None:
+    """Dynamically generates test suite classes and methods from YAML spec files.
+
+    Args:
+        files: A list of YAML spec files to generate test suites from.
+
+    Returns:
+        None.
+    """
+    module_globals = globals()
+
+    for yaml_file in files:
+        if not is_allowed_spec_file(yaml_file):
+            logger.warning('Skipping non-allowlisted spec file for class generation', file=str(yaml_file))
+            continue
+
+        # Load the YAML file and ensure it's valid.
+        try:
+            with open(yaml_file, encoding='utf-8') as f:
+                suites_data = yaml.safe_load(f)
+            if not suites_data:
+                logger.warning('Skipping spec file with no data', file=str(yaml_file))
+                continue
+        except yaml.YAMLError as e:
+            logger.error('Error loading spec file', file=str(yaml_file), error=e)
+            raise
+
+        # Iterate over the suites in the YAML file and ensure each has a name.
+        for suite_data in suites_data:
+            # Normalize the suite data to ensure it has a name.
+            suite: SpecSuite = suite_data
+            suite_name = suite.get('name', f'UnnamedSuite_{yaml_file.stem}')
+            suite['name'] = suite_name
+
+            # Create the dynamic test class for the suite.
+            class_name = make_suite_class_name(yaml_file, suite_name)
+            klass = type(class_name, (YamlSpecTestBase,), {})
+
+            # If the suite has no tests, add a placeholder test that reports a skip.
+            test_cases = suite.get('tests', [])
+            if not test_cases:
+                klass.test_empty_suite = make_async_skip_test_method(yaml_file, suite_name)  # type: ignore[attr-defined]
+
+            # Iterate over the tests in the suite and add them to the class.
+            for tc_raw in test_cases:
+                # Normalize the test case data to ensure it has a name.
+                tc: SpecTest = tc_raw
+                tc_name = tc.get('desc', 'UnnamedTest')
+                tc['desc'] = tc_name
+
+                # Create the test case method and add it to the class.
+                test_case_name = make_test_case_name(yaml_file, suite_name, tc_name)
+                test_method = make_async_test_case_method(yaml_file, suite, tc_raw)
+                setattr(klass, test_case_name, test_method)
+
+            # Add the test suite class to the module globals.
+            module_globals[class_name] = klass
+
+
+generate_test_suites(list(SPECS_DIR.glob('**/*.yaml')))
 
 if __name__ == '__main__':
     unittest.main()