This is the first post in a four-part series.
Oracle Tribuo is a Java-based, open-source, general-purpose machine learning library. It provides tools for various machine learning tasks, such as classification, regression, and clustering, as well as natural language processing (NLP). In addition, Tribuo enables Java programs to use models that were trained by Python libraries, such as scikit-learn.
Tribuo is distributed under the Apache 2.0 license, which lets anyone use, modify, and redistribute it, including in derivative works. The license also includes a patent grant from contributors, which terminates for anyone who files a patent lawsuit claiming that the licensed code infringes.
As Tribuo is strongly typed, each model created is aware of the types of inputs it expects as well as the type of the output it produces. The types are preserved when models are serialized to disk and loaded again.
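To make the idea of a typed model concrete, here is a minimal sketch in plain Java. It does not use Tribuo's own classes: `Label`, `Regressor`, `Model`, `ThresholdClassifier`, `save`, and `load` are all hypothetical names invented for this illustration. It only shows how an output type can be checked by the compiler and then checked again when a serialized model is loaded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TypedModelSketch {

    /** Hypothetical classification output (not a Tribuo class). */
    static final class Label implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Label(String name) { this.name = name; }
    }

    /** Hypothetical regression output (not a Tribuo class). */
    static final class Regressor implements Serializable {
        private static final long serialVersionUID = 1L;
        final double value;
        Regressor(double value) { this.value = value; }
    }

    /** A model that reports its output type, so the type survives serialization. */
    interface Model<T extends Serializable> extends Serializable {
        Class<T> outputType();
        T predict(double[] features);
    }

    /** Toy classifier: sums the features and compares the sum to a threshold. */
    static final class ThresholdClassifier implements Model<Label> {
        private static final long serialVersionUID = 1L;
        private final double threshold;
        ThresholdClassifier(double threshold) { this.threshold = threshold; }
        @Override public Class<Label> outputType() { return Label.class; }
        @Override public Label predict(double[] features) {
            double sum = 0.0;
            for (double f : features) { sum += f; }
            return new Label(sum > threshold ? "positive" : "negative");
        }
    }

    /** Serializes a model to bytes (a stand-in for writing it to disk). */
    static byte[] save(Model<?> model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return bytes.toByteArray();
    }

    /** Loads a model and rejects it if its output type is not the expected one. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> Model<T> load(byte[] data, Class<T> expected)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Model<?> model = (Model<?>) in.readObject();
            if (!model.outputType().equals(expected)) {
                throw new ClassCastException("Model produces "
                        + model.outputType().getSimpleName()
                        + ", not " + expected.getSimpleName());
            }
            return (Model<T>) model;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] saved = save(new ThresholdClassifier(1.0));

        // Loading with the matching output type succeeds.
        Model<Label> classifier = load(saved, Label.class);
        System.out.println(classifier.predict(new double[] {0.7, 0.6}).name);

        // Loading the same bytes as a regression model is rejected.
        try {
            load(saved, Regressor.class);
        } catch (ClassCastException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because Java erases generic type parameters at runtime, this sketch stores the output type explicitly through `outputType()`. That is a design choice of the sketch, not a description of how Tribuo implements its type checks.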
Another interesting aspect of Tribuo is provenance. It is integrated into Tribuo’s classes representing models, datasets, and evaluations, and lets them record which parameters, transformations, and files were used to create them. Using provenance data, each model can be rebuilt from scratch, and evaluations can track the models and datasets used for each experiment.
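The idea of rebuilding a model from its provenance can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Tribuo's provenance API: the `Provenance` and `Model` classes, the `train` method, and the `iterations` hyperparameter are hypothetical, and the toy "training" loop does not actually read the data source it records.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public class ProvenanceSketch {

    /** Hypothetical provenance record: what is needed to repeat a training run. */
    static final class Provenance {
        final String dataSource;                   // name of the training file (recorded only)
        final long seed;                           // random seed used during training
        final Map<String, String> hyperparameters; // trainer settings

        Provenance(String dataSource, long seed, Map<String, String> hyperparameters) {
            this.dataSource = dataSource;
            this.seed = seed;
            this.hyperparameters =
                    Collections.unmodifiableMap(new LinkedHashMap<>(hyperparameters));
        }
    }

    /** Hypothetical model that carries the provenance it was trained with. */
    static final class Model {
        final double weight;
        final Provenance provenance;

        Model(double weight, Provenance provenance) {
            this.weight = weight;
            this.provenance = provenance;
        }
    }

    /** Toy training loop: deterministic given the seed and iteration count. */
    static Model train(Provenance provenance) {
        int iterations = Integer.parseInt(provenance.hyperparameters.get("iterations"));
        Random rng = new Random(provenance.seed);
        double weight = 0.0;
        for (int i = 0; i < iterations; i++) {
            weight += rng.nextGaussian();
        }
        return new Model(weight, provenance);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("iterations", "100");
        Model original = train(new Provenance("train.csv", 42L, params));

        // Rebuild the model using nothing but the provenance stored inside it.
        Model rebuilt = train(original.provenance);
        System.out.println(original.weight == rebuilt.weight);
    }
}
```

The point of the sketch is that the model itself carries everything needed to reproduce it, so no separate experiment log is required to repeat the run.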
Tribuo enables interoperability between Java and native machine learning solutions by providing interfaces to popular libraries such as XGBoost and TensorFlow. In addition, Tribuo’s support for the ONNX model exchange format enables deployment of models created using other packages and languages.