Allow loading `load_info` and `pipeline.last_trace` as single tables · Issue #2702 · dlt-hub/dlt
Allow loading load_info and pipeline.last_trace as single tables #2702
Open
@trymzet

Description

Feature description

I would like to be able to execute

pipeline.run([load_info], table_name="_load_info")

and

pipeline.run([pipeline.last_trace], table_name="_trace")

but create only a single table for each in the destination. According to the docs, this should be achievable with the max_table_nesting resource parameter. However, when I use it, dlt complains that it can't load JSON types, so I also changed the file format to jsonl. With that change, the trace loads but produces an incorrect table, and the load info fails with the "Pipeline object is not JSON-serializable" error.

The final method I tried looked like this:

trace = dlt.resource(pipeline.last_trace, "_dlt_trace", max_table_nesting=0)
trace.apply_hints(file_format="jsonl")
pipeline.run(trace, table_name="_trace")

This produced a messy table: the columns differed from the ones produced by the documented approach shown at the top, and, e.g., there was a value column containing values from several other columns.

I suspect I could build the correct Python dict manually as a workaround, but that seems quite cumbersome. It seems like either I'm missing some setting or dlt is missing some features for this to "just work".
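For illustration, here is a minimal sketch of what such a manual workaround might look like. It is plain Python and does not touch dlt internals: the helper name `to_single_row` and the sample record are hypothetical, and the idea is only that a row without nested dicts or lists leaves nothing to unpack into child tables. Whether dlt's own objects convert cleanly to such a dict is not confirmed here.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Mapping


def to_single_row(record: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` in which nested values are JSON strings.

    Hypothetical helper: top-level scalars are kept as-is, while dicts,
    lists and tuples are serialized so the row has only flat columns.
    """
    row: Dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, (dict, list, tuple)):
            # default=str covers values json cannot encode (e.g. datetimes)
            row[key] = json.dumps(value, default=str)
        else:
            row[key] = value
    return row


# Made-up record standing in for a load info or trace dict.
example = {
    "pipeline_name": "demo",
    "started_at": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "loads_ids": ["1700000000.1"],
    "metrics": {"rows": 3},
}
row = to_single_row(example)
```

The resulting row could then, presumably, be passed to `pipeline.run([row], table_name="_load_info")`. Building the input dict from `load_info` or `pipeline.last_trace` (for example via an `asdict()` method, if available in the installed dlt version) is an assumption that this issue does not verify.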

Are you a dlt user?

None

Use case

No response

Proposed solution

No response

Related issues

No response

Status

In Progress
