Custom Pentaho Plugins
You Are Here
- Java Libraries designed to Pentaho's Interface
  - Tried to develop in Scala (other JVM langs?)... Abandoned!
- Who else here has built their own plugins?
- Our Plugin Projects for:
  - Apache Jena - https://github.com/nationalarchives/kettle-jena-plugins
    - RDF Graph Creation, Merging, and Serialization
    - SHACL Validation
  - Synchronisation - https://github.com/nationalarchives/kettle-atomic-plugins
  - XML - https://github.com/nationalarchives/kettle-xml-extra-plugins
  - Debugging - https://github.com/nationalarchives/kettle-debug-plugins
Why Build Plugins?
- Our default position is not to!
  - We needed functionality not offered by Pentaho
  - or... too complex to implement as N steps
- Why not use a User Defined Java/JavaScript step?
  - Source Control
  - Reuse
    - Duplicate code -> More maintenance!
    - Can't publish as Open Source (for greater RoI)
  - Testing
    - Unit and Integration Tests... CI?
  - We do use a few very small User Defined JavaScript steps!
Starting a new Step Plugin
- Official Documentation
  - https://help.hitachivantara.com/Documentation/Pentaho/9.2/Developer_center/Create_step_plugins
  - Enough to get you started
  - Lacking for any real purposes
  - Won't teach you Eclipse SWT (UI) Toolkit or Pentaho SWT!
- Best Examples - Reading the code of Pentaho's Steps
  - https://github.com/pentaho/pentaho-kettle/tree/9.2.0.0-R/plugins
  - We learnt a great deal by studying:
Anatomy of a Step Plugin
- processRow is where your business happens! It is equivalent to User Defined Java/JavaScript Step
- StepMeta glues everything together
- StepData holds state from row-to-row during the full transformation
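The division of labour between these three parts can be pictured with a small stand-in that does not use the Kettle libraries at all. This is a sketch under stated assumptions, not Pentaho's API: the classes `Meta`, `Data` and `Step`, the upper-casing logic, and the `run` driver are all hypothetical. Only the name `processRow` and the roles of StepMeta and StepData come from the slide. A real step would extend Pentaho's `BaseStep` and read and write rows through the engine rather than through in-memory collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Queue;

/** Simplified stand-in for a step plugin; NOT the Pentaho Kettle API. */
public class StepAnatomySketch {

    /** Plays the role of StepMeta: the step's configuration (hypothetical field). */
    static final class Meta {
        final int fieldIndex; // which input field to upper-case
        Meta(int fieldIndex) { this.fieldIndex = fieldIndex; }
    }

    /** Plays the role of StepData: state kept from row to row. */
    static final class Data {
        long rowsSeen = 0;
    }

    /** Plays the role of the step class; the queue and list stand in for hops. */
    static final class Step {
        private final Queue<Object[]> input;
        private final List<Object[]> output = new ArrayList<>();

        Step(List<Object[]> rows) { this.input = new ArrayDeque<>(rows); }

        /** Handles one row; returns false once there is no more input. */
        boolean processRow(Meta meta, Data data) {
            Object[] row = input.poll();
            if (row == null) {
                return false; // no more rows: the step is finished
            }
            data.rowsSeen++; // state that survives between calls
            Object[] out = Arrays.copyOf(row, row.length + 1);
            out[meta.fieldIndex] =
                String.valueOf(row[meta.fieldIndex]).toUpperCase(Locale.ROOT);
            out[row.length] = data.rowsSeen; // append a running row number
            output.add(out);
            return true; // ask to be called again for the next row
        }

        List<Object[]> output() { return output; }
    }

    /** Drives the step as an engine might: call processRow until it returns false. */
    public static List<Object[]> run(List<Object[]> rows, int fieldIndex) {
        Meta meta = new Meta(fieldIndex);
        Data data = new Data();
        Step step = new Step(rows);
        while (step.processRow(meta, data)) {
            // each call consumes exactly one input row
        }
        return step.output();
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"a", "x"});
        rows.add(new Object[] {"b", "y"});
        for (Object[] out : run(rows, 0)) {
            System.out.println(Arrays.toString(out)); // [A, x, 1] then [B, y, 2]
        }
    }
}
```

Keeping the counter in `Data` rather than in a local variable is the point of the sketch: `processRow` is entered once per row, so anything that must survive between rows has to live outside the method.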
Apache Jena Step Plugins
- Create a Jena Model (RDF Graph) per-row
  - Maps fields in a row into RDF Properties
- Combine Jena Models per-row
  - Merges one-or-more Jena Models within the same row
- Group and Merge Jena Models per-column
  - Merges Models from consecutive rows within the same column
- Serialize Jena Model per-column per-transformation
  - Merges all Jena Models (from a column), and writes a file
- SHACL Validation
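The create, merge and serialize steps can be illustrated in plain Java without Jena. The sketch below is an assumption-laden stand-in, not the plugins' implementation: it swaps a Jena Model for a sorted set of N-Triples lines, and the class name, method names and example IRIs are all hypothetical. It shows only the shape of the idea: fields of a row become properties of one subject, models merge by set union, and the merged result is written out as text.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Plain-Java stand-in for per-row RDF creation; does NOT use Apache Jena. */
public class RowToTriplesSketch {

    /** Builds a "model" (a set of N-Triples lines) from the mapped fields of one row. */
    public static Set<String> modelForRow(String subjectIri,
                                          Map<String, String> row,
                                          Map<String, String> fieldToPropertyIri) {
        Set<String> model = new TreeSet<>();
        for (Map.Entry<String, String> mapping : fieldToPropertyIri.entrySet()) {
            String value = row.get(mapping.getKey());
            if (value != null) { // unmapped or empty fields produce no triple
                model.add("<" + subjectIri + "> <" + mapping.getValue()
                    + "> \"" + escape(value) + "\" .");
            }
        }
        return model;
    }

    /** Merges models by set union, so duplicate triples collapse into one. */
    public static Set<String> merge(Iterable<Set<String>> models) {
        Set<String> merged = new TreeSet<>();
        for (Set<String> model : models) {
            merged.addAll(model);
        }
        return merged;
    }

    /** Serializes a model as one N-Triples statement per line. */
    public static String serialize(Set<String> model) {
        return String.join("\n", model);
    }

    /** Escapes the characters that must not appear raw in a quoted literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r");
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("title", "http://purl.org/dc/terms/title"); // example mapping
        Map<String, String> row = new LinkedHashMap<>();
        row.put("title", "Domesday Book");
        Set<String> model = modelForRow("http://example.org/record/1", row, mapping);
        System.out.println(serialize(model));
    }
}
```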
Transforming Relational to RDF
- Demo...
Synchronisation Step Plugins
- Compare and Set Atomic per-row
  - Conditionally initialise or CaS an Atomic Value
- Await Atomic per-row
  - Await an Atomic Value and conditionally branch
- Allows us to perform several steps as one Atomic Operation
- Uses Java's Atomic values
- Concurrency - Can be Tricky to get right!
  - Remember - Every Step in Pentaho is a distinct Thread!
- Our Use Case - Get or Create (and Calculate) an Identifier
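The get-or-create use case can be sketched with `java.util.concurrent.atomic` alone. This is an illustration under assumptions, not the code of the Kettle plugins: the class, the three-state flag and the helper `idsSeenBy` are hypothetical. One thread wins a compare-and-set and calculates the identifier; every other thread awaits the result, so the calculation runs exactly once however many threads (steps) ask at the same moment.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of "get or create" guarded by compare-and-set. */
public class GetOrCreateIdSketch {
    private static final int UNSET = 0;
    private static final int CREATING = 1;
    private static final int READY = 2;

    private final AtomicInteger state = new AtomicInteger(UNSET);
    private final AtomicInteger calculations = new AtomicInteger();
    private volatile String id;

    /** Returns the shared identifier, calculating it only in the winning thread. */
    public String getOrCreate(Supplier<String> calculate) {
        if (state.compareAndSet(UNSET, CREATING)) {
            // Only one caller can ever move UNSET -> CREATING.
            calculations.incrementAndGet();
            id = calculate.get(); // if this throws, waiters spin forever (sketch only)
            state.set(READY);     // publish: id is written before READY is visible
        } else {
            while (state.get() != READY) {
                Thread.yield();   // await the winner's result
            }
        }
        return id;
    }

    /** How many times the identifier was actually calculated. */
    public int calculations() {
        return calculations.get();
    }

    /** Runs threadCount threads against one shared instance; returns the ids they saw. */
    public static Set<String> idsSeenBy(int threadCount, GetOrCreateIdSketch shared) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int n = i;
            threads[i] = new Thread(
                () -> seen.add(shared.getOrCreate(() -> "id-from-thread-" + n)));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        GetOrCreateIdSketch shared = new GetOrCreateIdSketch();
        Set<String> seen = idsSeenBy(8, shared);
        System.out.println(seen.size() + " distinct id, calculated "
            + shared.calculations() + " time(s)");
    }
}
```

Which thread wins is not deterministic, so the identifier's value varies between runs; what stays fixed is that all threads observe the same one.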
Synchronising Transformation Steps
- Demo...
Enhancing Pentaho Itself
- We chose Pentaho because it is Open Source
  - We have a mandate to evaluate Open Source first
- Pentaho (like all software) has Issues!
- We have contributed fixes for:
  - Correct Date Time processing (pre 14th Sept 1752) #8006
    See: https://blog.adamretter.org.uk/processing-historical-dates/
  - Correctly detecting JAVA_HOME #7023
  - Documentation about how to compile a distribution #7841
  - Correcting UI rendering on macOS
  - Fixed failing tests on Windows #8007
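Why 14th Sept 1752 matters can be shown with the JDK alone. Britain switched from the Julian to the Gregorian calendar that month, so 2 September 1752 was followed directly by 14 September. The sketch below is not the fix contributed in #8006 and makes no claim about how Pentaho handles dates; it only demonstrates, with a hypothetical helper class, that `java.util.GregorianCalendar` gives different answers for the same instant depending on where its Julian/Gregorian cutover is set.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

/** Hypothetical demonstration of the 1752 calendar cutover; not Pentaho code. */
public class BritishCutoverSketch {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");
    private static final long ONE_DAY_MILLIS = 24L * 60 * 60 * 1000;

    /** The instant at which 14 Sept 1752 (Gregorian) begins, in UTC. */
    public static Date britishCutover() {
        GregorianCalendar pure = new GregorianCalendar(UTC);
        pure.setGregorianChange(new Date(Long.MIN_VALUE)); // pure Gregorian calendar
        pure.clear();
        pure.set(1752, Calendar.SEPTEMBER, 14);
        return pure.getTime();
    }

    /** A calendar that is Julian before 14 Sept 1752 and Gregorian from then on. */
    public static GregorianCalendar britishCalendar() {
        GregorianCalendar cal = new GregorianCalendar(UTC);
        cal.setGregorianChange(britishCutover());
        return cal;
    }

    /** Day of month that the given calendar assigns to the day before the cutover. */
    public static int dayBeforeCutover(GregorianCalendar cal) {
        cal.setTimeInMillis(britishCutover().getTime() - ONE_DAY_MILLIS);
        return cal.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        // JDK default cutover is October 1582, so 1752 is already Gregorian: prints 13
        System.out.println(dayBeforeCutover(new GregorianCalendar(UTC)));
        // With the British cutover the same instant is still Julian: prints 2
        System.out.println(dayBeforeCutover(britishCalendar()));
    }
}
```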
(not) Enhancing Pentaho Itself
- Only two of our most minor fixes have been incorporated
- In reality - Pentaho is only technically Open Source
  - There is no Open Source Community
  - Contributing to Pentaho is (almost) Impossible!
    - We have sent high quality code with tests and 100% test suite pass
    - Developers are difficult to reach
    - Pull-Requests (or issues) can go unanswered "For Ever"
    - Pull-Requests can be closed without a working solution
    - Opening JIRA Tickets doesn't result in progress
- Hitachi Sales / Support
  - We would consider a contract... if we get the fixes we need!
Sharing is Caring!
- We are currently maintaining our fork of Pentaho Kettle 9.1
- Not Practical for us
  - Updating is tricky... 9.2 is out now
  - Have to maintain skilled staff, GitHub, CI, etc.
- Not Sustainable for the Future... What are our options?
  - ...Would we choose Pentaho again?
Questions?
Adam Retter
Director of Evolved Binary
(Consultant) Technical Architect for Project Omega,
The National Archives
@adamretter
Pentaho Plugins and Enhancements
By Adam Retter
Talk given for Pentaho London Users Group on behalf of Project Omega at The National Archives - 10th Feb 2022