TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support.
At the moment TensorFlow I/O supports the following data sources (a short usage sketch follows the list):
tensorflow_io.ignite
: Data source for Apache Ignite and Ignite File System (IGFS). Overview and usage guide here.

tensorflow_io.kafka
: Apache Kafka stream-processing support.

tensorflow_io.kinesis
: Amazon Kinesis data streams support.

tensorflow_io.hadoop
: Hadoop SequenceFile format support.

tensorflow_io.arrow
: Apache Arrow data format support. Usage guide here.

tensorflow_io.image
: WebP and TIFF image format support.

tensorflow_io.libsvm
: LIBSVM file format support.

tensorflow_io.video
: Video file support with FFmpeg.

tensorflow_io.parquet
: Apache Parquet data format support.

tensorflow_io.lmdb
: LMDB file format support.
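Each of these modules exposes `tf.data`-style Dataset classes. Below is a minimal sketch of the common pattern, assuming the Python `SequenceFileDataset` class mirrors the R wrapper `sequence_file_dataset` shown later in this document; the file path is a placeholder:

```python
import tensorflow_io.hadoop as hadoop

# SequenceFileDataset reads records from a Hadoop SequenceFile
# (the path below is a placeholder). The result is a regular
# tf.data.Dataset, so standard transformations chain onto it.
dataset = hadoop.SequenceFileDataset("/path/to/string.seq")
dataset = dataset.repeat(2).batch(2)
```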
The `tensorflow-io` package can be installed with pip directly:
$ pip install tensorflow-io
Related modules such as Kafka can then be imported in Python:
$ python
Python 2.7.6 (default, Nov 13 2018, 12:45:42)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> import tensorflow_io.kafka as kafka
>>>
>>> dataset = kafka.KafkaDataset(["test:0:0:4"], group="test", eof=True)
>>> iterator = dataset.make_initializable_iterator()
>>> init_op = iterator.initializer
>>> get_next = iterator.get_next()
>>>
>>> with tf.Session() as sess:
...   print(sess.run(init_op))
...   for i in range(5):
...     print(sess.run(get_next))
>>>
Note that Python has to be run outside of the repo directory itself, otherwise Python may not be able to find the correct path to the module.
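Since `KafkaDataset` is a regular `tf.data.Dataset`, the standard transformations can be chained onto it. The following is a minimal sketch using the same topic specification as above, adding batching and an explicit end-of-stream loop:

```python
import tensorflow as tf
import tensorflow_io.kafka as kafka

# Same topic specification as above, with messages batched in pairs.
dataset = kafka.KafkaDataset(["test:0:0:4"], group="test", eof=True).batch(2)

iterator = dataset.make_initializable_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    sess.run(iterator.initializer)
    # eof=True ends the stream instead of blocking, so iteration
    # stops with OutOfRangeError once the topic is drained.
    while True:
        try:
            print(sess.run(next_batch))
        except tf.errors.OutOfRangeError:
            break
```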
The TensorFlow I/O package (`tensorflow-io`) can be built from source:
$ docker run -it -v ${PWD}:/working_dir -w /working_dir tensorflow/tensorflow:custom-op
$ # In docker
$ curl -OL https://github.com/bazelbuild/bazel/releases/download/0.20.0/bazel-0.20.0-installer-linux-x86_64.sh
$ chmod +x bazel-0.20.0-installer-linux-x86_64.sh
$ ./bazel-0.20.0-installer-linux-x86_64.sh
$ ./configure.sh
$ bazel build build_pip_pkg
$ bazel-bin/build_pip_pkg artifacts
A package file `artifacts/tensorflow_io-*.whl` will be generated after a successful build.
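The generated wheel can then be installed with pip (e.g., `pip install artifacts/tensorflow_io-*.whl`). As a quick smoke test that the compiled ops load, importing any submodule is enough; a minimal sketch:

```python
import tensorflow as tf
# Importing a submodule forces the shared library with the
# compiled kernels to load, so this fails fast on a broken build.
import tensorflow_io.kafka  # noqa: F401

print("tensorflow-io loaded against TensorFlow", tf.__version__)
```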
We provide a reference Dockerfile here so that you can use the R package directly for testing. You can build it via:
docker build -t tfio-r-dev -f R-package/scripts/Dockerfile .
Inside the container, you can start your R session, instantiate a `SequenceFileDataset` from an example Hadoop SequenceFile string.seq, and then use any transformation functions provided by the tfdatasets package on the dataset, like the following:
library(tfio)

# Create a dataset from the example SequenceFile and repeat it twice.
dataset <- sequence_file_dataset("R-package/tests/testthat/testdata/string.seq") %>%
  dataset_repeat(2)

sess <- tf$Session()

# One-shot iterators need no explicit initialization.
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)

# Iterate until the repeated dataset is exhausted.
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})