rtjarvis/salute

The Salute Project

Automatic data preparation for machine learning

The need

In Big Data processing, most of the time is spent preparing data for use by advanced machine learning tools or by people building reports. This preparation work keeps insights locked away for too long.

The answer

Salute can process any type of file (text, image, video, audio, etc.) and generates an output file containing all the extracted features, ready to be loaded into a machine learning or reporting tool.

The technology

Salute is built on Apache Spark and can process very large files.

Running Salute

The best way to run Salute is with spark-submit, passing an input file and an output directory:

<spark_home>/bin/spark-submit target/salute-0.1-SNAPSHOT.jar <input_file> <output_dir>
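
As a rough illustration of the command above, the following is a minimal sketch of a shell wrapper around it. The function name run_salute, the SPARK_HOME and DRY_RUN variables, and the example paths are assumptions made for this sketch and are not part of Salute. It also assumes the jar has already been built at target/salute-0.1-SNAPSHOT.jar and that the script is run from the project root.

```shell
#!/bin/sh
# Hypothetical wrapper around the spark-submit command shown above.
# Assumptions (not from the Salute project): the function name run_salute,
# the SPARK_HOME variable pointing at a Spark installation, and the
# DRY_RUN switch used to print the command instead of running it.

run_salute() {
    input_file="$1"
    output_dir="$2"

    # Both positional arguments are required by the documented command.
    if [ -z "$input_file" ] || [ -z "$output_dir" ]; then
        echo "usage: run_salute <input_file> <output_dir>" >&2
        return 2
    fi

    # Jar path taken from the README; assumed to be relative to the project root.
    jar="target/salute-0.1-SNAPSHOT.jar"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command that would be executed.
        echo "${SPARK_HOME}/bin/spark-submit ${jar} ${input_file} ${output_dir}"
    else
        "${SPARK_HOME}/bin/spark-submit" "$jar" "$input_file" "$output_dir"
    fi
}
```

With SPARK_HOME=/opt/spark and DRY_RUN=1 set, calling run_salute data/in.txt out prints the spark-submit command rather than executing it, which can help check the paths before submitting a job.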
