A work-in-progress project that aims to speed up the G2 network by using similarities between files rather than complete files.
Java Data Mining Package (JDMP)
The Java Data Mining Package (JDMP) is an open-source Java library for data analysis and machine learning.

It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provid...

MARF: Modular Audio Recognition Framework
MARF is an open-source research platform and a collection of voice, sound, speech, text, and natural language processing (NLP) algorithms written in Java and arranged into a modular, extensible framework that makes it easy to add new algorithms. MARF can run distributed over a network, and it can serve as a library in applications or as a code base for learning and extension.