Add compiler for Python to procedural SQL translation by elBurg0 · Pull Request #2 · dbis-ilm/grizzly · GitHub

Add compiler for Python to procedural SQL translation #2

Open: elBurg0 wants to merge 27 commits into master
elBurg0 (Contributor) commented Jun 12, 2022

Description

Adds a compiler that translates Python statements into a procedural SQL language chosen according to the database type.
This compiler was developed with the support of @sthagedorn at the Department of Databases and Information Systems at the TU Ilmenau as part of a bachelor thesis.

Changes

  • Add ANTLR4 compiler to translate Python statements to procedural SQL.
  • Link compiler in _generateCreateFunc function in sqlgenerator module.
  • Add fallback to PL/Python and local pandas execution of UDFs when errors occur while compiling Python statements.
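The fallback behaviour listed above can be pictured as a chain of strategies that are tried in order. The following is only a minimal sketch, not the code from this PR: the names UDFCompileError, deploy_udf and the three stand-in functions are hypothetical, and the order (procedural SQL, then PL/Python, then local pandas) is an assumption based on the description.

```python
class UDFCompileError(Exception):
    """Hypothetical error type for a UDF that cannot be translated."""


def compile_to_procedural_sql(udf_source):
    # Stand-in for the real compiler: rejects one construct to simulate a failure.
    if "lambda" in udf_source:
        raise UDFCompileError("lambda expressions are not supported in this sketch")
    return "procedural SQL for: " + udf_source


def register_plpython(udf_source):
    # Stand-in for wrapping the unchanged Python source in a PL/Python function.
    return "PL/Python wrapper for: " + udf_source


def run_locally_with_pandas(udf_source):
    # Stand-in for fetching the data and applying the UDF locally.
    return "local pandas execution of: " + udf_source


# Assumed fallback order: procedural SQL, then PL/Python, then local pandas.
STRATEGIES = [
    ("procedural_sql", compile_to_procedural_sql),
    ("plpython", register_plpython),
    ("pandas", run_locally_with_pandas),
]


def deploy_udf(udf_source, fallback=False, strategies=STRATEGIES):
    """Try each strategy in order; continue past a failure only if fallback is enabled."""
    failures = []
    for name, strategy in strategies:
        try:
            return name, strategy(udf_source)
        except UDFCompileError as exc:
            if not fallback:
                raise  # fallback disabled: surface the compile error to the user
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all strategies failed: " + "; ".join(failures))
```

With fallback=False a compile error reaches the caller directly; with fallback=True the next strategy in the list is tried instead.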

Detailed modifications

  • Add SQL calls for creating procedural functions and other database specific mappings like datatypes, error types and functions to the grizzly.yml file.

  • Add exceptions that are raised when compilation fails, so that the fallback can be triggered and the user can be informed.

  • Change show function and add _fallback function in frame module to enable dataframe fallback.

  • Add fallback and language parameters to the map function, used when creating the UDF FuncCall. The defaults are language='py', which executes the UDF as PL/Python, and fallback=False, which prevents any fallback to PL/Python or local pandas execution of the UDF.

  • Add a to_df function to generator.py and relationaldbexecutor.py. It executes the SQL generated from the last Grizzly DataFrame object, without the projection of the UDF, using the pandas read_sql function, and so creates a DataFrame for local fallback execution.

  • Add an example of in-DBMS UDF execution with procedural functions to example.py.

  • Add the udfcompiler_test.py file for testing compiler with UDFs from grizzly.udfcompiler.test_udfs module.

  • Add details about the performance tests, including queries and times, in the grizzly.udfcompiler.speedtest module.

  • Add cx_Oracle and psycopg2 dependencies to the RelationalExecutor module to check the type of the given connection and automatically create the SQLGenerator object with the corresponding profile.
    If this is not desired, the user has to set the DB profile manually when creating the RelationalExecutor with the SQLGenerator: grizzly.use(RelationalExecutor(con, SQLGenerator('oracle'))).

  • Remove '_' from the names of temporary tables, because it caused trouble in PostgreSQL.
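The automatic profile selection from the connection type could look roughly like the sketch below. It is an illustration only, not the code from this PR: detect_profile and DRIVER_PROFILES are hypothetical names, the profile name 'postgresql' is an assumption (only 'oracle' appears in the description), and the check inspects the module of the connection class instead of importing the drivers, so the sketch runs without them.

```python
# Hypothetical mapping from the driver's top-level module to a profile name.
# 'oracle' is taken from the description; 'postgresql' is an assumed name.
DRIVER_PROFILES = {
    "psycopg2": "postgresql",
    "cx_Oracle": "oracle",
}


def detect_profile(con, explicit_profile=None):
    """Return the profile name for a DB connection object.

    An explicitly given profile always wins, mirroring the manual
    SQLGenerator('oracle') route mentioned in the description.
    """
    if explicit_profile is not None:
        return explicit_profile
    # The module that defines the connection class identifies the driver,
    # e.g. 'psycopg2.extensions' -> 'psycopg2'.
    driver = type(con).__module__.split(".")[0]
    if driver not in DRIVER_PROFILES:
        raise ValueError(
            f"unknown connection type from module '{driver}'; set the profile manually"
        )
    return DRIVER_PROFILES[driver]


# Stand-in connection classes so the sketch needs no database driver installed.
class FakePostgresConnection:
    __module__ = "psycopg2.extensions"


class FakeOracleConnection:
    __module__ = "cx_Oracle"


pg_profile = detect_profile(FakePostgresConnection())
ora_profile = detect_profile(FakeOracleConnection())
```

Raising an error for an unknown driver, instead of silently picking a default profile, keeps the manual route as the explicit escape hatch.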
