This is a data mining toolkit that implements Occam's knife:
- These learners eagerly and aggressively dump irrelevancies, redundancies, etc. to learn tiny models.
- To read more about that, see http://coffee.unbox.org/p/big-picture.html.
(And yes, I know the more common term is Occam's Razor, but I prefer knife to razor; see below.)
This repo is also my attempt at serious CoffeeScript programming. So I've given myself a playpen where I can take baby steps in CoffeeScript.
The directories are:
- build/ : my private build environment. Really, you don't want to look in here; just use the built stuff (found in easy/ and harder/);
- data/ : example data used by the code in easy/ and harder/;
- easy/ : the playpen, for my intro to Coffee experiments and demos of the simpler data structures;
- harder/ : more advanced stuff than easy/.
Note that each file in easy/ and harder/ is a stand-alone CoffeeScript application with all required files joined together. Also in those directories is one .js file for each CoffeeScript file.
(+) I'm not trimming tiny stubble off the main body; I'm hacking away large chunks. So knife seems more apt than razor.
William of Occam was a 14th-century English logician who proposed Occam's Razor:
Entia non sunt multiplicanda praeter necessitatem.
Or, in English, "entities must not be multiplied beyond necessity". It is also a core principle of data mining, where it got tweaked a little. The best theory, according to data miners, is the one that minimizes both theory size and its number of errors: see [http://en.wikipedia.org/wiki/Minimum_message_length].
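To make that trade-off concrete, here is a minimal CoffeeScript sketch. It is not real minimum message length (which measures theory and errors in bits) and it is not code from this repo: the `theories` list, the `score` and `best` functions, and the `weight` parameter are all made up for illustration. It just adds theory size to error count and picks the smallest total.

```coffeescript
# Hypothetical candidate theories: each has a size (say, number of rules)
# and a count of the errors it makes on some training data.
theories = [
  { name: "big",    size: 40, errors: 2  }
  { name: "medium", size: 10, errors: 5  }
  { name: "tiny",   size: 2,  errors: 30 }
]

# Simplified score (lower is better). `weight` is an assumed knob that says
# how many units of theory size one error is worth.
score = (theory, weight = 1) ->
  theory.size + weight * theory.errors

# Return the candidate with the lowest score.
best = (candidates, weight = 1) ->
  winner = candidates[0]
  for t in candidates[1..]
    winner = t if score(t, weight) < score(winner, weight)
  winner

console.log best(theories).name   # prints "medium"
```

Here "big" scores 42, "medium" scores 15 and "tiny" scores 32, so the winner is neither the largest theory nor the smallest one, but the one with the best balance of size and errors.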
But that is all geek land. Here, I just say I prefer knife to razor since the latter sounds like I'm trimming tiny stubble off a large body. That's not what I do. I hack away and discard most of the data. So my data mining tool is definitely a knife.