Description
Feature description
I would like to be able to execute

pipeline.run([load_info], table_name="_load_info")

and

pipeline.run([pipeline.last_trace], table_name="_trace")

but have each create only a single table in the destination. According to the docs, this should be achievable with the max_table_nesting resource parameter. However, when I use it, dlt complains that it cannot load JSON types, so I also changed the file format to jsonl. That in turn produced an incorrect table for the trace, and the load info did not load at all, failing with a "Pipeline object is not JSON-serializable" error.
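For reference, the first attempt was roughly the following (a minimal sketch; the duckdb destination and the dummy source rows are placeholders for illustration):

```python
import dlt

pipeline = dlt.pipeline(pipeline_name="example", destination="duckdb")
load_info = pipeline.run([{"value": 1}], table_name="numbers")

# wrap load_info in a resource with nesting disabled, as the docs suggest;
# this is the point where dlt complains that it cannot load JSON types
info = dlt.resource([load_info], name="_load_info", max_table_nesting=0)
pipeline.run(info)
```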
The final method I tried looked like this:
trace = dlt.resource(pipeline.last_trace, "_dlt_trace", max_table_nesting=0)
trace.apply_hints(file_format="jsonl")
pipeline.run(trace, table_name="_trace")
This produced a messy table: the columns differed from the ones produced by the documented way shown at the top, and there was, e.g., a value column containing values from several other columns.
I suspect I could build the correct Python dict manually as a workaround, but that seems quite cumbersome. It seems like either I'm missing a setting, or dlt is missing a feature for this to "just work".
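For completeness, the manual workaround I have in mind would be roughly this sketch (it assumes LoadInfo.asdict() returns the full nested structure, and uses default=str so non-JSON values such as timestamps don't break serialization):

```python
import json

# flatten load_info into a single row: nested containers are serialized
# to JSON strings, so dlt has no reason to create child tables
flat = {
    key: json.dumps(value, default=str) if isinstance(value, (dict, list)) else value
    for key, value in load_info.asdict().items()
}
pipeline.run([flat], table_name="_load_info")
```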
Are you a dlt user?
None
Use case
No response
Proposed solution
No response
Related issues
No response