kangas

launch

def launch(host=None,
           port=4000,
           debug=None,
           protocol="http",
           hide_selector=None,
           **cli_kwargs)

Launch the Kangas servers.

Note: this should never be needed as the Kangas servers are started automatically when needed.

Arguments:

host - (str) the name or IP of the machine the servers should listen on.
port - (int) the port of the Kangas frontend server. The backend server will start on port + 1.
debug - (str) the debugging output level to show while the servers run.

Example:

>>> import kangas
>>> kangas.launch()
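
The port rule above (backend on port + 1) can be sketched with the standard library alone. The helper names `kangas_ports` and `port_is_free` are hypothetical and are not part of Kangas; the sketch assumes only the documented rule, and checking the ports beforehand is an optional convenience.

```python
import socket


def kangas_ports(port=4000):
    """Return the (frontend, backend) pair; the backend uses port + 1."""
    return port, port + 1


def port_is_free(port, host="127.0.0.1"):
    """Return True if host:port can be bound (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


frontend, backend = kangas_ports(4000)
```

Calling `port_is_free(frontend)` and `port_is_free(backend)` before launching would show whether both ports are available on the local machine.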

show

def show(datagrid=None,
         filter=None,
         host=None,
         port=4000,
         debug=None,
         height="750px",
         width="100%",
         protocol="http",
         hide_selector=False,
         use_ngrok=False,
         cli_kwargs=None,
         **kwargs)

Start the Kangas servers and show the DataGrid UI in an IFrame or browser.

Arguments:

datagrid - (str) the DataGrid's location, relative to the current directory
filter - (str) a filter to set on the DataGrid
host - (str) the name or IP of the machine the servers should listen on.
port - (int) the port of the Kangas frontend server. The backend server will start on port + 1.
debug - (str) the debugging output level to show while the servers run.
height - (str) the height (in "px" pixels) of the iframe shown in the Jupyter notebook.
width - (str) the width (in "px" pixels or "%" percentages) of the iframe shown in the Jupyter notebook.
use_ngrok - (optional, bool) force using ngrok as a proxy
cli_kwargs - (dict) a dictionary whose keys are the names of kangas server flags and whose values are the corresponding settings (such as: {"backend-port": 8000})
kwargs - additional URL parameters to pass to the server

Example:

>>> import kangas
>>> kangas.show("./example.datagrid")
>>> kangas.show("./example.datagrid", "{'Column Name'} < 0.5")
>>> kangas.show("./example.datagrid", "{'Column Name'} < 0.5",
...     group="Another Column Name")
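
To make the cli_kwargs and kwargs arguments more concrete, here is a minimal sketch of how such values might be turned into server flags and URL parameters. Both helpers are hypothetical: the `--flag value` layout and the query-string encoding are assumptions for illustration, not a description of what Kangas does internally.

```python
from urllib.parse import urlencode


def cli_kwargs_to_args(cli_kwargs=None):
    """Turn {"backend-port": 8000} into ["--backend-port", "8000"] (assumed layout)."""
    args = []
    for flag, value in (cli_kwargs or {}).items():
        args.extend([f"--{flag}", str(value)])
    return args


def kwargs_to_query(**kwargs):
    """Encode extra keyword arguments as a URL query string (assumed encoding)."""
    return urlencode(kwargs)


server_args = cli_kwargs_to_args({"backend-port": 8000})
query = kwargs_to_query(group="Another Column Name")
```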

read_sklearn

def read_sklearn(dataset_name)

Load a sklearn dataset by name.

Arguments:

dataset_name - (str) one of: 'boston', 'breast_cancer', 'diabetes', 'digits', 'files', 'iris', 'linnerud', 'sample_image', 'sample_images', 'svmlight_file', 'svmlight_files', 'wine'

Example:

>>> import kangas as kg
>>> dg = kg.read_sklearn("iris")

read_parquet

def read_parquet(filename, **kwargs)

Takes a parquet filename or URL and returns a DataGrid.

Note: requires pyarrow to be installed.

Example:

>>> import kangas
>>> dg = kangas.read_parquet("userdata1.parquet")

read_dataframe

def read_dataframe(dataframe, **kwargs)

Takes a columnar pandas dataframe and returns a DataGrid.

Arguments:

dataframe - (pandas.DataFrame) the DataFrame to read from. Only works on in-memory DataFrames. If your DataFrame is stored on disk, you will need to load it first.
datetime_format - (str) the Python date format used to parse dates. For example, use "%Y/%m/%d" for dates like "2022/12/01".
heuristics - (bool) whether to guess that some float values are datetime representations
name - (str) the name to use for the DataGrid
filename - (str) the filename to save the DataGrid to
converters - (dict) a dictionary of functions where each key is a column name and each value is a function that takes a value and converts it to the proper type and form.
Note - the file or URL may end with ".zip", ".tgz", ".gz", or ".tar" extension. If so, it will be downloaded and unarchived. The JSON file is assumed to be in the archive with the same name as the file/URL. If it is not, then please use the kangas.download() function to download, and then read from the downloaded file.

Examples:

>>> import kangas
>>> from pandas import DataFrame
>>> df = DataFrame(...)
>>> dg = kangas.read_dataframe(df)
>>> dg.save()
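
The converters and datetime_format arguments can be illustrated without Kangas. The sketch below uses a hypothetical `apply_converters` helper over plain dictionaries; it shows the idea of per-column conversion functions and a `strptime` format string, and makes no claim about how Kangas applies them.

```python
from datetime import datetime


def apply_converters(rows, converters):
    """Return new rows with each converter applied to its named column."""
    return [
        {
            name: converters[name](value) if name in converters else value
            for name, value in row.items()
        }
        for row in rows
    ]


rows = [{"id": "7", "created": "2022/12/01", "note": "unchanged"}]
converters = {
    "id": int,  # column name -> conversion function
    "created": lambda text: datetime.strptime(text, "%Y/%m/%d"),
}
converted = apply_converters(rows, converters)
```

Columns without an entry in the dictionary, such as "note" here, pass through untouched.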

read_datagrid

def read_datagrid(filename, **kwargs)

Reads a DataGrid from a filename or URL. Returns the DataGrid.

Arguments:

filename - the name of the file or URL to read the DataGrid from
Note - the file or URL may end with ".zip", ".tgz", ".gz", or ".tar" extension. If so, it will be downloaded and unarchived. The DataGrid file is assumed to be in the archive with the same name as the file/URL. If it is not, then please use the kangas.download() function to download, and then read from the downloaded file.

Examples:

>>> import kangas
>>> dg = kangas.read_datagrid("example.datagrid")
>>> dg = kangas.read_datagrid("http://example.com/example.datagrid")
>>> dg = kangas.read_datagrid("http://example.com/example.datagrid.zip")
>>> dg.save()
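
The naming rule in the note can be sketched as follows. This assumes that "the same name as the file/URL" means the archive's base name with the archive extension removed; `expected_member_name` is a hypothetical helper, not a Kangas function.

```python
import posixpath
from urllib.parse import urlparse

ARCHIVE_EXTENSIONS = (".zip", ".tgz", ".gz", ".tar")


def expected_member_name(filename_or_url):
    """Return the file name expected inside the archive, or None if not an archive."""
    basename = posixpath.basename(urlparse(filename_or_url).path)
    for extension in ARCHIVE_EXTENSIONS:
        if basename.endswith(extension):
            return basename[: -len(extension)]
    return None


member = expected_member_name("http://example.com/example.datagrid.zip")
```

If the archive holds the data under a different name, the note's advice applies: download first, then read the extracted file directly.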

read_json

def read_json(filename, **kwargs)

Read JSON or JSON Lines files [1]. The JSON should be a list of objects, or a file with one object on each line.

Arguments:

filename - the name of the file or URL to read the JSON from
datetime_format - (str) the Python date format used to parse dates. For example, use "%Y/%m/%d" for dates like "2022/12/01".
heuristics - (bool) whether to guess that some float values are datetime representations
name - (str) the name to use for the DataGrid
converters - (dict) a dictionary of functions where each key is a column name and each value is a function that takes a value and converts it to the proper type and form.
Note - the file or URL may end with ".zip", ".tgz", ".gz", or ".tar" extension. If so, it will be downloaded and unarchived. The JSON file is assumed to be in the archive with the same name as the file/URL. If it is not, then please use the kangas.download() function to download, and then read from the downloaded file.

[1] - https://jsonlines.org/

Example:

>>> import kangas as kg
>>> dg = kg.read_json("json_line_file.json")
>>> dg = kg.read_json("https://instances.social/instances.json")
>>> dg = kg.read_json("https://company.com/data.json.zip")
>>> dg = kg.read_json("https://company.com/data.json.gz")
>>> dg.save()
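
The two accepted layouts, a JSON list of objects and JSON Lines, can be told apart with a few lines of standard-library code. This is a minimal sketch with a hypothetical helper name; it assumes a list file starts with "[" and is not how Kangas is known to detect the format.

```python
import json


def parse_json_or_json_lines(text):
    """Return a list of objects from a JSON list or from JSON Lines text."""
    stripped = text.strip()
    if stripped.startswith("["):
        # A single JSON document holding a list of objects.
        return json.loads(stripped)
    # Otherwise treat every non-empty line as its own JSON object.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


as_list = parse_json_or_json_lines('[{"a": 1}, {"a": 2}]')
as_lines = parse_json_or_json_lines('{"a": 1}\n{"a": 2}\n')
```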

download

def download(url, ext=None)

Downloads a file, and unzips, untars, or ungzips it.

Arguments:

url - (str) the URL of the file to download
ext - (optional, str) the format of the archive: "zip", "tgz", "gz", or "tar".
Note - if the URL ends with ".zip", ".tgz", ".gz", or ".tar", the file is downloaded and unarchived accordingly. If the URL has no extension, or its extension does not match the actual archive format, set the format explicitly with the ext argument.

Example:

>>> import kangas
>>> kangas.download("https://example.com/example.images.zip")
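
The interplay between the URL's extension and the ext override can be sketched like this. `archive_format` is a hypothetical helper written for illustration; it assumes an explicit ext always wins over whatever the URL suggests.

```python
import posixpath
from urllib.parse import urlparse

KNOWN_FORMATS = ("zip", "tgz", "gz", "tar")


def archive_format(url, ext=None):
    """Return the archive format for url, preferring an explicit ext."""
    if ext is not None:
        if ext not in KNOWN_FORMATS:
            raise ValueError(f"unsupported archive format: {ext!r}")
        return ext
    # Fall back to the extension of the URL's path component.
    suffix = posixpath.splitext(urlparse(url).path)[1].lstrip(".")
    return suffix if suffix in KNOWN_FORMATS else None


from_url = archive_format("https://example.com/example.images.zip")
overridden = archive_format("https://example.com/download?id=42", ext="tgz")
```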

read_csv

def read_csv(filename,
             header=0,
             sep=",",
             quotechar='"',
             heuristics=True,
             datetime_format=None,
             converters=None)

Takes a CSV filename or URL and returns a DataGrid.

Arguments:

filename - the CSV file or URL to import
header - (optional, int) row number (zero-based) of column headings
sep - (str) the field separator used when parsing the CSV
quotechar - (str) the quote character used when parsing the CSV
heuristics - if True, guess that some numbers might be dates
datetime_format - (str) the Python date format used to parse dates. For example, use "%Y/%m/%d" for dates like "2022/12/01".
converters - (dict, optional) A dictionary of functions for converting values in certain columns. Keys are column labels.
Note - the file or URL may end with ".zip", ".tgz", ".gz", or ".tar" extension. If so, it will be downloaded and unarchived. The CSV file is assumed to be in the archive with the same name as the file/URL. If it is not, then please use the kangas.download() function to download, and then read from the downloaded file.

Examples:

>>> import kangas
>>> dg = kangas.read_csv("example.csv")
>>> dg = kangas.read_csv("http://example.com/example.csv")
>>> dg = kangas.read_csv("http://example.com/example.csv.zip")
>>> dg.save()
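
The header, sep, and quotechar arguments mirror common CSV parsing options. The sketch below uses Python's csv module to show what each one controls; `parse_csv` is a hypothetical helper, and mapping sep onto the csv module's delimiter is an assumption for illustration.

```python
import csv
import io


def parse_csv(text, header=0, sep=",", quotechar='"'):
    """Parse CSV text into a list of dicts keyed by the header row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=sep, quotechar=quotechar))
    names = rows[header]  # zero-based row holding the column headings
    return [dict(zip(names, row)) for row in rows[header + 1:]]


text = "name;joined\n'Smith; Jane';2022/12/01\n"
records = parse_csv(text, sep=";", quotechar="'")
```

The quoted field keeps its embedded separator, which is the case quotechar exists to handle.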

Kangas DataGrid is completely open source; sponsored by Comet ML
