Demo code on how to use tablecloth (a Clojure dataframe library) with different data sources.

clementlefevre/clojure_for_data_manipulation

Clojure for Data Manipulation

A short demo of how to use the tablecloth library for data analysis. For people used to working with R or Python, it can be difficult to find the same amount of online resources for Clojure. Here are some code snippets I wrote that cover a generic workflow for handling tabular data:

  • load a folder of Excel files, filter rows, select specific columns, and combine the results into a single dataframe.
  • query a database and store the result in a dataframe.
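
The two workflows listed above can be sketched roughly as follows. This is a minimal sketch, not the code in excel_demo.clj or db_demo.clj: the in-memory datasets stand in for Excel files, `query-rows` stands in for a database result, and the namespace, the column names, the `clean` helper and the threshold are all made up for illustration. It assumes tablecloth is on the classpath and required as `tc`.

```clojure
(ns demo.workflow-sketch
  (:require [tablecloth.api :as tc]))

;; Hypothetical stand-ins for the content of two Excel files. With real
;; files, each dataset would be loaded from disk instead of built by hand.
(def sheets
  [(tc/dataset {:region ["north" "south" "north"]
                :amount [120 35 80]
                :note   ["a" "b" "c"]})
   (tc/dataset {:region ["east" "north"]
                :amount [15 200]
                :note   ["d" "e"]})])

(defn clean
  "Keep the rows whose :amount is above `min-amount`, then keep two columns."
  [ds min-amount]
  (-> ds
      (tc/select-rows (fn [row] (> (:amount row) min-amount)))
      (tc/select-columns [:region :amount])))

;; Clean every dataset, then stack the results into a single dataset.
(def combined
  (apply tc/concat (map #(clean % 50) sheets)))

;; Hypothetical query result: a sequence of row maps, the shape that JDBC
;; wrappers commonly return. tc/dataset accepts such a sequence directly.
(def query-rows
  [{:id 1 :city "Berlin"}
   {:id 2 :city "Paris"}])

(def from-db (tc/dataset query-rows))

(println (tc/row-count combined))   ; 3 rows pass the filter
(println (tc/column-names from-db))
```

Keeping the filter and column selection in one small function means the same cleaning step can be mapped over any number of files before the final concatenation.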

Note that tablecloth only works with Clojure >1.10.

Usage

Clone the repo and, with a working Clojure/Leiningen setup, cd into the clojure_for_data_manipulation folder and run lein deps. Then open a REPL to explore one of these files:

  • excel_demo.clj
  • db_demo.clj

Holy Graal on Windows

Once you are done with the Clojure scripting, you can compile your project into a binary file.

Installation on Windows

If you are one of the lucky Windows users, you need to follow the instructions here:

1 - install GraalVM and set the environment variables GRAALVM_HOME, JAVA_HOME and PATH correctly.

2 - install the GraalVM native-image tool

3 - install the Visual Studio Build Tools and the Windows 10 SDK

Bonus: on top of that, I added two more environment variables, INCLUDE and LIB, by following this.

Create a Windows .exe from your Clojure project

Once you are happy with your code, run lein uberjar. You will get two .jar files in your target folder: a standard .jar and a standalone version.

Then open the x64 Native Tools Command Prompt for VS 2019, cd into your target folder and run the following command: native-image -jar clojure-for-data-manipulation-0.1.0-SNAPSHOT-standalone.jar

If everything runs smoothly, you should then get a new clojure-for-data-manipulation-0.1.0-SNAPSHOT-standalone.exe file in the target folder.
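
For `native-image -jar` to have something to start, the standalone jar needs a main class. In a Leiningen project this usually comes from a namespace that declares `:gen-class` and defines `-main`, with a `:main` entry in project.clj pointing at it. The sketch below shows that shape only; the namespace name `demo.core` and the `run` helper are hypothetical and are not taken from this repository.

```clojure
(ns demo.core
  ;; :gen-class asks the compiler to emit a Java class for this namespace
  ;; when it is AOT-compiled, which gives the jar a runnable main class.
  (:gen-class))

(defn run
  "Hypothetical stand-in for the real work, e.g. processing one folder."
  [folder]
  (str "processing " folder))

(defn -main
  "Entry point; receives the command-line arguments as strings."
  [& args]
  (doseq [folder args]
    (println (run folder))))
```

With this in place, arguments passed to the .jar or to the generated .exe arrive in `args`, so the same binary can be pointed at different input folders.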

Benchmark

When running the excel-demo script on 120 files, the .jar version took 50 seconds and the .exe version 40 seconds, about 20% less. Also keep in mind that the .exe is 11 MB, while the standalone .jar is 88 MB (8 times larger).

License

Eclipse Public License - v 2.0
