Tags: research data-mining data java algorithms artificial-intelligence machine-learning analysis
Weka is a collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform. The algorithms can either be applied directly to a dataset or called from your own Java code.
RapidMiner (YALE): Java Data Mining
RapidMiner (formerly YALE) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). RapidMiner provides more than 400 data mining operators, a graphical user interface (GUI), an online tutorial with hands-on data mining applications, a comprehensive PDF tutorial, many visualization schemes for data sets and data mining results, many different le...
SimMetrics is a similarity metric library covering edit distances (Levenshtein, Gotoh, Jaro, etc.) as well as other metrics (e.g. Soundex, Chapman). The work was provided by the University of Sheffield, UK, and funded by AKT, an IRC sponsored by EPSRC, grant number GR/N15764/01.
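As an illustration of the kind of metric such a library offers, here is a minimal sketch of Levenshtein edit distance in plain Java. It does not use the SimMetrics API; the class name EditDistanceSketch and both method names are hypothetical, and the normalization in similarity() is only one common convention, assumed here for illustration.

```java
// Hypothetical illustration only: this is NOT the SimMetrics API.
final class EditDistanceSketch {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two row buffers instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 for identical strings, 0.0 when every
    // position differs, scaled by the length of the longer string.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / max;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
    }
}
```

Keeping only two rows keeps memory proportional to the length of the second string, which matters when comparing many long strings.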
Java Data Mining Package (JDMP)
The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning.

It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provid...

MARF: Modular Audio Recognition Framework
MARF is an open-source research platform and a collection of voice, sound, speech, text and natural language processing (NLP) algorithms written in Java, arranged into a modular and extensible framework that makes it easy to add new algorithms. MARF can run distributed over a network, act as a library in applications, or serve as a source for learning and extension.
TestEl is a Java-based learning analyzer for HTML (and possibly other) structured documents. It can be trained to detect structures in such documents and renders hits in XML.
Feature Extraction plugin API
Tags: information-analysis artificial-intelligence analysis
Easy-to-use platform-independent plugin API for the extraction of low-level features from audio data in PCM format, as required in the context of music information retrieval software.
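To make "low-level features" concrete, the following sketch computes two common ones, RMS energy and zero-crossing rate, from 16-bit PCM samples. It is not the plugin API described above; the class name PcmFeatureSketch, the method names, and the assumption of signed 16-bit mono samples are all hypothetical choices made for illustration.

```java
// Hypothetical illustration only: this is NOT the Feature Extraction plugin API.
// Assumes signed 16-bit mono PCM samples already decoded into a short[].
final class PcmFeatureSketch {

    // Root-mean-square energy, with samples scaled to the range [-1, 1).
    static double rms(short[] samples) {
        if (samples.length == 0) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (short s : samples) {
            double x = s / 32768.0;
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares / samples.length);
    }

    // Fraction of adjacent sample pairs whose signs differ
    // (zero is treated as non-negative).
    static double zeroCrossingRate(short[] samples) {
        if (samples.length < 2) {
            return 0.0;
        }
        int crossings = 0;
        for (int i = 1; i < samples.length; i++) {
            boolean previousNonNegative = samples[i - 1] >= 0;
            boolean currentNonNegative = samples[i] >= 0;
            if (previousNonNegative != currentNonNegative) {
                crossings++;
            }
        }
        return (double) crossings / (samples.length - 1);
    }

    public static void main(String[] args) {
        short[] frame = {1000, -1000, 1000, -1000};
        System.out.println("RMS: " + rms(frame));
        System.out.println("ZCR: " + zeroCrossingRate(frame));
    }
}
```

In music information retrieval such values are typically computed per short frame of audio rather than over a whole file, so a caller would slice the signal into frames first.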
Apolo, user based music suggesting.
Tags: artificial-intelligence analysis
Apolo is a personal music suggestion system based on analysis of user behavior.
DataTime Process Framework
Tags: analysis video artificial-intelligence framework distributed-computing
The DataTime Process Framework is intended to support the processing of time-based data in a modular, concurrent, distributed and extensible manner. It is written in C++ using YARP, ACE, Qt and MUSCLE, and targets Linux, OS X, Windows and Solaris.