An Arrow Dataset for reading record batches from Arrow Feather files. Feather is a lightweight columnar format well suited to writing Pandas DataFrames to disk.
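For readers who want a Feather file to experiment with, the following is a minimal sketch of writing one from R rather than from Pandas. It assumes the `arrow` R package is installed; the data frame, its column names, and the temporary path are hypothetical and chosen only for illustration. Which Feather format version the dataset reader expects is not stated here, so treat compatibility as something to verify.

```r
library(arrow)

# Hypothetical example data: one integer column and one double column.
df <- data.frame(id = 1:3, score = c(0.5, 1.5, 2.5))

# Temporary path used for illustration only.
path <- tempfile(fileext = ".feather")

# Write the data frame to disk in Feather format.
write_feather(df, path)

# Read the file back to confirm that it round-trips.
restored <- read_feather(path)
print(restored)
```

In this sketch the integer column is stored as a 32-bit integer and the double column as a 64-bit float, so the dtypes passed as `output_types` would need to be chosen to match.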
arrow_feather_dataset(filenames, columns, output_types, output_shapes = NULL)
Argument | Description |
---|---|
filenames | A list of paths to Arrow Feather files. |
columns | A list of column indices to be used in the Dataset. |
output_types | Tensor dtypes of the output tensors. |
output_shapes | TensorShapes of the output tensors, or `NULL` (the default). |
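The arguments above appear to be positional: the example for this function passes one dtype and one shape for each selected column. Assuming that reading is correct, a sanity check along the following lines can catch mismatched lengths before a dataset is built. `check_column_spec` is a hypothetical helper, not part of any package, and character strings stand in for `tf$int32` and `tf$float32` so that the sketch runs without TensorFlow.

```r
# Hypothetical helper (not part of any package): checks that the three
# per-column arguments have the same length before a dataset is built.
check_column_spec <- function(columns, output_types, output_shapes = NULL) {
  n <- length(columns)
  if (length(output_types) != n) {
    stop("output_types must have one entry per selected column")
  }
  # output_shapes is optional, so it is only checked when supplied.
  if (!is.null(output_shapes) && length(output_shapes) != n) {
    stop("output_shapes must have one entry per selected column")
  }
  invisible(TRUE)
}

# Placeholder strings stand in for tf$int32 and tf$float32.
columns <- list(0L, 1L)
output_types <- list("int32", "float32")
output_shapes <- list(list(), list())

ok <- check_column_spec(columns, output_types, output_shapes)
```

The helper stops with an error on a length mismatch and otherwise returns `TRUE` invisibly, so it can be placed directly before a call to `arrow_feather_dataset()`.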
# NOT RUN {
library(tensorflow)
library(tfdatasets)
library(tfio)

# Read columns 0 and 1 from two Feather files as int32 and float32 tensors.
dataset <- arrow_feather_dataset(
    list('/path/to/a.feather', '/path/to/b.feather'),
    columns = reticulate::tuple(0L, 1L),
    output_types = reticulate::tuple(tf$int32, tf$float32),
    output_shapes = reticulate::tuple(list(), list())) %>%
  dataset_repeat(1)

# Iterate in a TensorFlow 1.x session until the dataset is exhausted.
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)

until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
# }