OpenRefine is a free, open source power tool for working with messy data and improving it
Updated Sep 19, 2024 - Java
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Statistical Machine Intelligence & Learning Engine
Java dataframe and visualization library
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Hopsworks - Data-Intensive AI platform with a Feature Store
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
ELKI Data Mining Toolkit
The premier open source Data Quality solution
Categorical Query Language IDE
Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Roadmap for Data Engineering
Blockchain2graph extracts blockchain data (bitcoin) and inserts it into a graph database (neo4j).
An introduction to data analysis with R and RStudio
A Java Toolbox for Scalable Probabilistic Machine Learning
🔥 One of the most comprehensive open-source data annotation platforms.
A point-and-click tool for creating and analyzing topic models produced by MALLET.
Wrangler Transform: A DMD system for transforming Big Data