bump analyzer 0.1.7 by sundarshankar89 · Pull Request #1732 · databrickslabs/lakebridge · GitHub

bump analyzer 0.1.7 #1732

Merged: gueniai merged 2 commits into main from bump/analyzer-0.1.7 on Jun 19, 2025

sundarshankar89
Collaborator

Bump Analyzer 0.1.7

github-actions bot commented Jun 19, 2025

✅ 14/14 passed, 1 skipped, 1m0s total

Running from acceptance #1364

@gueniai gueniai merged commit de92d2f into main Jun 19, 2025
12 of 16 checks passed
@gueniai gueniai deleted the bump/analyzer-0.1.7 branch June 19, 2025 15:02
sundarshankar89 added a commit that referenced this pull request Jun 19, 2025
* Added Analyzer debug ability ([#1727](#1727)). The Analyzer class's `analyze` method now accepts an `is_debug` parameter, which is automatically set based on the current logging level, enabling debug mode when the logging level is set to `DEBUG`. This enhancement allows the analyzer to provide more detailed output or diagnostic information during the analysis process when debug mode is activated. As a result, the analyzer can be run in debug mode by configuring the logging level to `DEBUG`, providing more comprehensive insights into the analysis process.
* Added screenshots and gif showing Lakebridge not remorph ([#1726](#1726)). The screenshots and GIF have been replaced so that they show Lakebridge instead of Remorph, resolving an associated issue where Lakebridge was not being displayed properly. The change swaps the existing binary image content for new data, and the updated images were verified by manual testing.
* Configure Reconcile Patch ([#1690](#1690)). The installation process has been updated to correctly identify the lakebridge wheel path for deployment by modifying the install method to search for the string `lakebridge` instead of `remorph` in the wheel paths. This change affects the deployment of tables, dashboards, and jobs based on the provided reconcile configuration and wheel paths, with the rest of the installation process remaining unchanged. Additionally, the configure reconcile patch has been updated to reference the "lakebridge-x.y.z-py3-none-any.whl" wheel file, triggering the deployment of a reconciliation job, table, and dashboard, as verified by assertions that the corresponding deployer methods are called. A TODO comment has been added to investigate the motivation behind this change, indicating that further examination may be necessary to fully understand its implications.
* Cosmetic logging updates ([#1704](#1704)). The logging functionality has been enhanced for clarity and reduced noise, with changes aimed at maintaining consistency and minimizing unnecessary log output. Log messages now capitalize `Lakebridge` when it appears at the start, and the logging level for certain messages, such as transpilation completion status and the number of statuses encountered, has been downgraded from INFO to DEBUG to reserve the INFO level for more critical information. Additionally, some log messages have been removed or updated to reduce noise, while a new DEBUG-level log message has been introduced to indicate transpilation completion with a given status. These updates improve the overall logging experience by decluttering the standard log output and making it easier to focus on essential information.
* Create an issue template for creating docs issues ([#1721](#1721)). The issue template configuration has been updated to provide a more interactive and relevant experience for users submitting documentation-related issues. A new template for documentation updates has been introduced, offering a structured format with fields such as checkboxes to check for existing issues and text areas for problem statements and additional context. This template aims to encourage informative issue submissions by guiding users to provide clear and concise descriptions of the issue, along with optional additional context such as references or screenshots. The template also includes labels and a title prefix to facilitate issue categorization and triage, ultimately improving the issue creation process and making it easier to track and address documentation updates.
* Doc Update ([#1681](#1681)). This change updates the build badge, the documentation, and the uninstall process. The build badge now references the lakebridge repository, while the workflow link remains unchanged. The Profiler guide now states that the Profiler is coming soon: the previous instructions for running and configuring it have been replaced with an info admonition asking users to stay tuned for updates. The demos section likewise uses an info admonition to announce the upcoming availability of demos. Lastly, the uninstall process reflects the change in product name from `remorph` to `lakebridge`, with the `WorkspaceClient` constructor and `run` function modified to target the correct product and to keep the version consistent using the `__version__` variable.
* Documentation Updates ([#1701](#1701)). The Lakebridge installation process has been streamlined with a unified installation command, and subsequent installation of specific components like Transpile, with updated documentation including flowcharts and detailed verification steps. The Reconcile module's setup has been rebranded as a configuration step, and its execution process has been modified to utilize a Python command in a notebook cell, leveraging the `recon` function from the `databricks.labs.lakebridge.reconcile.execute` module. To enhance the user experience, new and updated visual aids have been added, including animated GIFs illustrating the installation and verification processes, as well as updated images providing guidance on the reconciliation and transpile setup. Additionally, the logger setup has been modified to reflect the rebranding of components from Remorph to Lakebridge, with new log messages directing users to the Lakebridge documentation for further information, now available at https://databrickslabs.github.io/lakebridge/.
* Documentation markdown improvements ([#1688](#1688)). The documentation for the library has been significantly updated to improve readability, accuracy, and consistency. Changes include updates to the legacy metadata export documentation, which now provides clearer instructions on how to export metadata from various platforms, including PowerCenter, DataStage, SSIS, Talend, and others, using command-line utilities, GUI methods, and links to external documentation. Additionally, the documentation for the Lakebridge Analyzer has been improved, with corrections to formatting issues, typos, and broken image links, making it easier for users to understand the tool's capabilities, such as job complexity assessment, comprehensive job inventory, and cross-system interdependency mapping. Furthermore, the overview documentation and transpile process documentation have been updated to correct typos, improve clarity, and ensure consistency, with example commands now using the correct `lakebridge` references instead of `remorph`, providing users with more accurate and helpful guidance on using the library's features.
* Exit normally instead of indicating failure during `install-transpile` if the user doesn't want to replace the configuration ([#1691](#1691)). The installation process for transpile configurations has been improved to handle cases where a user chooses not to override an existing installation. Instead of exiting with an error, the process now returns the existing configuration, allowing the installation to complete successfully without replacing the existing settings. This change provides a more user-friendly experience by logging a completion message instead of an error when the user decides not to proceed with overwriting the existing installation. Additionally, the corresponding test case has been updated to verify that the installation process returns the expected configuration object with properties such as transpiler configuration path, source dialect, input source, output folder, catalog name, schema name, and SDK configuration, ensuring a successful installation without attempting to override existing configurations.
* Fixed issue templates, so issues can be created ([#1687](#1687)). The issue templates for bug reports and feature requests have been updated to resolve a schema issue that was preventing issues from being created. Specifically, the `type` field has been modified to use a single string value, either `bug` or `feature`, which aligns with GitHub's top-level syntax for issue forms. This change enables the correct loading of the issue templates, allowing users to create new issues using either template, which include fields such as description, title, labels, and checkboxes, and in the case of feature requests, a title prefix and specific labels like `needs-triage`.
* Fixed java version check if java is unavailable ([#1730](#1730)). The Java version detection functionality has been enhanced to handle cases where Java is not installed. The method responsible for retrieving the Java version now utilizes a platform-independent approach to locate the Java executable and returns `None` if it is not found. If the executable is found, it runs the Java executable with the `-version` flag and captures the output, while still checking the return code of the process and raising an error if it fails. Additionally, the testing framework has been updated with a new fixture to simulate the absence of Java and a corresponding test to verify that the Java version check correctly handles this scenario, ensuring that the program can provide informative error messages, such as warning the user that Java 11 or above is required.
* Make error reporting more compact ([#1693](#1693)). The error reporting functionality has been overhauled to provide more concise and informative messages, making it easier for developers to understand and diagnose issues. Errors are now grouped by path and severity, with the count of each type of error reported, rather than logging each error individually. The severity of errors is determined by a predefined enum, and messages are logged at the corresponding level. This change enables more compact error reporting, where only the count of errors and warnings is reported, along with the number of times each error or warning occurred and their individual positions. To support this new functionality, several internal functions have been updated or introduced, including a function to generate a header based on transpilation diagnostics, which can handle various scenarios such as repeated errors or warnings, and ensure accurate and concise header formatting. Additionally, test cases have been updated to reflect these changes, focusing on verifying that error messages related to the input source are properly logged, without relying on specific error messages.
* Patch Encoding ([#1719](#1719)). The library now includes support for quoted-printable encoding in ETL sources, resolving encoding-related issues in output files. To achieve this, an `EmailParser` with a default policy is utilized to process transpiled code, enabling the handling of quoted-printable encoded content. The decoding process has been enhanced to use the specified charset, defaulting to `utf-8` if none is provided, and the decoded content is written to the output file using the system's default encoding. Furthermore, the logging functionality has been improved by conditionally setting the logger level to `DEBUG` when the application is running in debug mode, allowing for more verbose logging when needed, and the `get_logger` function now also imports the `is_in_debug` function to facilitate this feature. These updates aim to improve the handling of encoding in output files and provide more detailed logging for debugging purposes.
* Refactor parsing of `java -version` during `install-transpile` ([#1731](#1731)). The Java version parsing logic has been overhauled to improve flexibility and accuracy, now treating the version as a 4-tuple comprising feature, interim, update, and patch versions. This change enables more precise comparisons, such as determining if the version is greater than or equal to 11.0.0.0. The parsing process utilizes a regex pattern to match the version string from the output of the `java -version` command, with the extracted version components being returned as a tuple of four integers or `None` if parsing is unsuccessful. Additionally, the code now includes enhanced logging for better debugging and support, including the detected Java version and any errors encountered during the execution of the `java -version` command. The updates also encompass revised test cases to reflect the new parsing and comparison logic, ensuring the installer's requirement for at least Java 11 is correctly enforced, along with the introduction of new unit tests to verify the robustness of the Java version parsing for various version formats and edge cases.
* Removed line number comments from decorated output (again) ([#1684](#1684)). The transpile output has been modified to remove line number comments, aligning with the original intention of providing a cleaner and more readable result. Previously, line numbers were being incorrectly reintroduced, resulting in comments such as `/* 1 */` or `/* 2 */` in the output. This change corrects this issue, ensuring that the output no longer includes these comments, while still maintaining error location information in header comments, such as `[3:3]`, to indicate the start position of problematic expressions. Additionally, the code has been refactored to remove unnecessary logic and imports, including the removal of the `math` import and the `with_line_numbers` variable, resulting in a more streamlined output without line numbers.
* Type hints ([#1705](#1705)). Type hints have been added to various properties, methods, and functions to improve code readability and maintainability by providing explicit type information. This includes type hints for return types such as `Path`, `Path | None`, and `Literal["databricks"]`, as well as type hints for function parameters and return values, including `dict[str, Any]` for the `v1_migrate` and `v2_migrate` class methods. Additionally, type hints have been added to the `mock_workspace_client` and `mock_cli_for_transpile` functions, which now explicitly indicate their return types, such as `WorkspaceClient` and a generator yielding a tuple containing a `WorkspaceClient`, `TranspileConfig`, a callable, and a `MagicMock`. Type hints have also been added to numerous test functions, making the code more explicit and self-documenting, and helping to catch type-related errors earlier, ultimately improving the overall quality of the code.
* Updated `generate-lineage` unit tests ([#1707](#1707)). A new fixture has been introduced to generate an empty directory for testing purposes, providing a clean and isolated environment for more robust and reliable test results. The unit tests for the `generate-lineage` CLI have been updated to improve reliability and maintainability, with additions including type hints, removal of deprecated fixtures, and replacement of monkey-patching with parameterized directory and file setup using the new fixture and existing output folder parameter. The tests now utilize the `pathlib` module for path and file manipulation and the `re` module for escaping special characters in expected error messages, covering various scenarios such as valid input, invalid dialect, and invalid input and output directories, and verifying the `generate-lineage` CLI's behavior in each case, including exception handling for invalid inputs.
* Updated issue templates ([#1682](#1682)). The project's reporting and documentation infrastructure has undergone significant updates to reflect its rebranding from Remorph to Lakebridge. The bug report template now includes revised categories for reporting issues, such as Application crashed, Profiler bug, and Analyzer bug, and offers updated version options. Additionally, the issue template configuration and feature request template have been modified to reference Lakebridge documentation and resources, providing users with more accurate and relevant information. The feature request template now includes updated category options, such as Profiler, Analyzer, and Transpiler, to improve the clarity and accuracy of submitted requests. Furthermore, the project's documentation URL has been updated to point to a dedicated documentation website, enhancing user access to relevant resources and information. These changes aim to improve the overall user experience, streamline the reporting and feature request processes, and provide more effective support for users of the Lakebridge project.
* Use diagnostics' severity to log their message using the corresponding log level ([#1685](#1685)). The logging of diagnostics messages has been enhanced to utilize the corresponding log level based on the severity of the message, with `ERROR` mapped to `logging.ERROR`, `WARNING` to `logging.WARNING`, and `INFO` to `logging.INFO`, while other severities default to `logging.DEBUG`. This change enables more nuanced error reporting during the transpilation process, where errors are now logged according to their severity rather than all being reported as errors, allowing warnings to be distinguished from errors. As a result, error handling and logging have been improved to provide more specific and informative error messages, aligning with the updated logging mechanism and ensuring that warnings are no longer incorrectly reported as errors.
* added supported sources for analyzer ([#1709](#1709)). The analyzer documentation has been enhanced with a new section that details the supported dialects for various source platforms, providing a centralized resource for understanding the tool's capabilities. A comprehensive table outlines the supported source platforms, including ABInitio, Informatica Cloud, SAS, ADF, MS SQL Server, Snowflake, and others, offering a clear overview of the dialects currently supported by the analyzer. This addition aims to improve the transparency and usability of the analyzer by explicitly documenting its capabilities and limitations, making it easier for users to determine which source platforms are compatible with the tool and streamline their workflow accordingly.
* bump analyzer 0.1.7 ([#1732](#1732)). The project's dependency on `databricks-bb-analyzer` has been updated to version 0.1.7, picking up the features and bug fixes included in that release.
* patch folder structure ([#1683](#1683)). The processing of combined parts during transpilation has been enhanced with improved file name handling and debugging capabilities. A debug log statement has been added to track file processing, providing visibility into the execution flow. File names are now extracted to exclude directory paths, ensuring only the actual file name is used for further processing. The existing logic for handling empty or non-string filenames remains unchanged, and filename segmentation for determining output folder structure is still performed. Additionally, test cases have been updated to verify the presence of at least one JSON file in the output folder, rather than a specific `Jobs` directory, using a recursive search with the `glob` method, making the test condition more flexible and robust.
* updated sources in docs ([#1708](#1708)). The documentation has been updated to reflect changes in supported source platforms and transpilers, providing more accurate information about the current capabilities and limitations of the system. The summary table for source platforms now indicates that dbt support has shifted from a general implementation to an experimental repointing-focused one, labeled as "dbt Repointing (Experimental)". Additionally, the documentation for supported dialects has been revised to include a new table that outlines the conversion capabilities of BladeBridge and Morpheus, with added columns for SQL, ETL/Orchestration, and dbt Repointing (Experimental), covering various source platforms such as DataStage, Informatica, and major database management systems, thereby giving users a clearer overview of the supported dialects and conversion options.
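
The Java version handling described in #1730 and #1731 (a 4-tuple of feature, interim, update, and patch, or `None` when parsing fails, compared against 11.0.0.0) can be illustrated with a minimal sketch. The regex and the names `parse_java_version` and `is_supported` are assumptions made for illustration; they are not taken from the Lakebridge source.

```python
from __future__ import annotations

import re

# Hypothetical pattern: captures up to four numeric components from the quoted
# version string in `java -version` output, e.g. openjdk version "11.0.21".
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?(?:\.(\d+))?(?:\.(\d+))?[^"]*"')

# Minimum version named in the changelog entry.
MINIMUM_VERSION = (11, 0, 0, 0)


def parse_java_version(output: str) -> tuple[int, int, int, int] | None:
    """Return (feature, interim, update, patch), or None if nothing matches."""
    match = _VERSION_RE.search(output)
    if match is None:
        return None
    # Components missing from the version string are treated as 0.
    feature, interim, update, patch = (int(g) if g else 0 for g in match.groups())
    return feature, interim, update, patch


def is_supported(version: tuple[int, int, int, int] | None) -> bool:
    """Tuple comparison is lexicographic, so this checks 'at least Java 11'."""
    return version is not None and version >= MINIMUM_VERSION
```

Legacy version strings such as `1.8.0_292` parse to a tuple starting with 1 under this sketch, so they compare as older than Java 11.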
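
The severity-to-log-level mapping described in #1685 could look roughly like the following sketch. The `Severity` enum here is a hypothetical stand-in for whatever diagnostic severity type the transpiler uses, and `log_level_for` and `log_diagnostic` are illustrative names; only the mapping itself (`ERROR`, `WARNING`, `INFO`, everything else to `DEBUG`) comes from the changelog entry.

```python
import logging
from enum import Enum


class Severity(Enum):
    """Hypothetical stand-in for the diagnostic severity enum."""
    ERROR = 1
    WARNING = 2
    INFO = 3
    HINT = 4


# Mapping described in the changelog; unlisted severities fall back to DEBUG.
_LEVELS = {
    Severity.ERROR: logging.ERROR,
    Severity.WARNING: logging.WARNING,
    Severity.INFO: logging.INFO,
}


def log_level_for(severity: Severity) -> int:
    return _LEVELS.get(severity, logging.DEBUG)


def log_diagnostic(logger: logging.Logger, severity: Severity, message: str) -> None:
    # Log at the level that corresponds to the diagnostic's severity,
    # so warnings are no longer reported as errors.
    logger.log(log_level_for(severity), message)
```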
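
For the quoted-printable handling mentioned in #1719, the sketch below shows one way to decode such content with Python's standard `email` package, using the declared charset and falling back to `utf-8`. It uses `email.parser.Parser` with `policy.default` as an assumption; the changelog's `EmailParser` may refer to something else, and `decode_part` is a hypothetical helper name.

```python
from email import policy
from email.parser import Parser


def decode_part(raw: str) -> str:
    """Decode a single-part message body according to its headers."""
    message = Parser(policy=policy.default).parsestr(raw)
    # decode=True undoes the Content-Transfer-Encoding (here quoted-printable)
    # and returns bytes.
    payload = message.get_payload(decode=True)
    # Use the declared charset, defaulting to utf-8 if none is provided.
    charset = message.get_content_charset() or "utf-8"
    return payload.decode(charset)


raw_part = (
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "SELECT 'caf=C3=A9' AS name=\n"
    " FROM t\n"
)
decoded = decode_part(raw_part)
```

In this example the `=C3=A9` escape decodes to a single accented character and the trailing `=` soft line break is removed, joining the two body lines.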
@sundarshankar89 sundarshankar89 mentioned this pull request Jun 19, 2025