Adam Retter
adam@evolvedbinary.com
Declarative Amsterdam
@ Amsterdam Science Park
2023-11-02
@adamretter
A Possible EXPath Pkg Version 2
About Me
-
Director and CTO of Evolved Binary
-
XML / XQuery / XSLT / RDF / SPARQL
-
Scala / Java / C++ / Rust
-
Concurrency and Scalability
-
-
Creator of FusionDB multi-model database
-
Contributor to Meta's (formerly Facebook's) RocksDB (7 yrs.)
-
Core contributor to eXist-db XML Database (18 yrs.)
-
Founder of EXQuery, and creator of RESTXQ
-
Was a W3C XQuery WG Invited expert
What is "Packaging"?
-
A loosely defined set of related concepts!
-
A "package" is a container of one or more things
-
Might conform to a standard size, shape, or construction
-
Might ease the storage of things
-
Might ease the transportation of things
-
-
The act of "packaging" is that of containing the things
-
We will focus on Packaging of:
-
Software Code
-
Data
-
Photo by Jiawei Zhao on Unsplash
Why do we need Packages?
-
Software Reuse
-
Modern software architecture is modular
-
We are dependent on Software Libraries
-
Each may consist of many files
-
In turn dependent on other libs, ad infinitum
-
-
We may wish to publish our App/Libraries
-
May depend on many libraries
-
May consist of many files
-
-
-
Data Distribution
-
A data set may consist of many files
-
We may need to consume data sets
-
We may wish to publish data sets
-
Photo by Kelly Sikkema on Unsplash
Principles of Modularity
-
"At implementation time each module and its inputs and outputs are well-defined, there is no confusion in the intended interface with other system modules."
-
"At checkout time the integrity of the module is tested independently"
-
"the system is maintained in modular fashion; system errors and deficiencies can be traced to specific system modules"
Photo by Tom Hermans on Unsplash
Designing Systems Programs, by Gauthier and Ponto (1970)
Packaging is an Ecosystem
-
The "package" itself is one small part of a larger system
-
Hopefully a standardised file (and metadata?) format and name
-
-
We also need to consider:
-
Consumption
-
Integration
-
Storage
-
Building a new Package (a.k.a. "Packaging")
-
Transportation
-
Publication
-
Photo by Vlad Tchompalov on Unsplash
The Package Itself
-
Essential Properties
-
Detailed open specification that standardises its format
-
Internal - What goes where, and how?
-
Interface - What is available from the package, and how?
-
External - Package file format(s) and naming convention(s)
-
-
Standardised metadata describing the package
-
Not implementation specific!
-
-
Content Agnostic
-
-
Desirable Properties
-
Ease of storage / transportation
-
Single file containing both data and metadata; compressible
-
-
Easily inspectable
-
Metadata can be easily accessed and understood
-
-
Verifiable
-
Photo by Oli Zubenko on Unsplash
Integration
-
How do we use a package?
-
It's the things inside that we care about!
-
Do they need to be extracted from the package?
-
-
-
What about tools?
-
Do existing tools understand what a package is?
-
Do they even need to?
-
Could they be updated to support packages?
-
Can new tools be built to bridge between packages and existing tools?
-
-
Do we need to build new tools?
-
Photo by Kelly Sikkema on Unsplash
Building a Package
-
Output Format, i.e. the "Package"
-
Should be defined elsewhere in a "Package Specification" standard
-
-
Input Format
-
Unknown... Likely tool specific!
-
Needs to be clearly defined and documented for the users
-
-
-
Existing tools might be usable
-
e.g. Compose: mkdir, cp, tar, and gzip
-
-
New tooling could simplify
-
Require certain inputs
-
Validate inputs and outputs
-
Single command
-
Photo by Josue Isai Ramos Figueroa on Unsplash
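The "single command" idea can be sketched in a few lines. This is only an illustration in Python, not a proposed tool: the `build_package` helper is hypothetical, and so is the rule that a metadata file called `package-metadata.xml` must be present. Neither comes from any specification. It requires certain inputs, composes the equivalent of tar and gzip, and validates the output by reading it back.

```python
import tarfile
from pathlib import Path


def build_package(source_dir: str, output_file: str,
                  required: tuple = ("package-metadata.xml",)) -> list:
    """Validate the inputs, then write a gzip-compressed tar archive.

    `required` lists (hypothetical) file names that must exist in source_dir.
    Returns the sorted member names read back from the finished archive.
    """
    src = Path(source_dir)

    # Require certain inputs before doing any work.
    missing = [name for name in required if not (src / name).is_file()]
    if missing:
        raise ValueError(f"missing required input(s): {missing}")

    # Collect the inputs first, so the file list is fixed before writing.
    files = sorted(p for p in src.rglob("*") if p.is_file())
    with tarfile.open(output_file, "w:gz") as archive:
        for path in files:
            archive.add(str(path), arcname=path.relative_to(src).as_posix())

    # Validate the output: re-open the archive and list what was written.
    with tarfile.open(output_file, "r:gz") as archive:
        return sorted(archive.getnames())
```

The output file should be written outside `source_dir`. A real tool would also validate the metadata itself, which this sketch does not attempt.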
Transportation and Publishing
-
One file, or a data file and accompanying metadata file(s)
-
Amenable to std. operations, e.g. cp, scp, email, upload to Dropbox, etc.
-
-
Publish to where?
-
Anywhere that accepts generic files
-
An environment adapted to the Package format
-
Registry - holds metadata and redirects to the package elsewhere
-
Repository - holds metadata and a copy of the package
-
Typically provide search facilities
-
May provide upload/download capabilities
-
Possibly accessible as a Humane Website
-
Possibly accessible by API - may also provide tools
-
-
Photo by David Trinks on Unsplash
Current Packaging for XML
Photo by Hush Naidoo Jade Photography on Unsplash
EXPath Packaging System
-
EXPath Candidate Module - May 2012
-
An unfinished draft of a potential standard
-
Describes itself as a "packaging system" for components: "XSLT, XQuery, and XProc"
-
Some tools (xrepo and Java libs) provided
-
Covers:
-
The Package:
-
External file format
-
Layout of files within the package
-
Metadata (including dependencies and exported components)
-
-
Resolution of Namespace URI to (local) Components
-
On-disk repository layout
-
Photo by Jen Theodore on Unsplash
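To make "external file format" and "metadata" concrete, the sketch below reads the package descriptor out of a package file using only Python's standard library. It assumes, as in the 2012 Candidate Module, that the descriptor is a file named `expath-pkg.xml` at the root of the Zip and that its elements are in the `http://expath.org/ns/pkg` namespace. The `read_descriptor` helper is hypothetical and only extracts a handful of fields.

```python
import zipfile
import xml.etree.ElementTree as ET

PKG_NS = "http://expath.org/ns/pkg"  # namespace assumed from the 2012 draft


def read_descriptor(package) -> dict:
    """Return basic metadata from the descriptor inside a package Zip.

    `package` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("expath-pkg.xml"))

    if root.tag != f"{{{PKG_NS}}}package":
        raise ValueError(f"unexpected root element: {root.tag}")

    return {
        "name": root.get("name"),        # URI name of the package
        "abbrev": root.get("abbrev"),    # short name
        "version": root.get("version"),
        "title": root.findtext(f"{{{PKG_NS}}}title"),
        # Only dependencies on other packages; processor dependencies are skipped.
        "dependencies": [
            d.get("package")
            for d in root.iter(f"{{{PKG_NS}}}dependency")
            if d.get("package")
        ],
    }
```

Because the package is an ordinary Zip file, no special tooling is needed for this kind of inspection.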
EXPath Packaging System
-
The Good:
-
We have something to discuss!
-
Reasonable basic Package metadata
-
Package is a single Zip file
-
Semantic Versioning 2.0.0 is used
-
People have used it "for real"...
-
Experience! i.e. We know where the pain is!
-
-
Photo by Pawtography Perth on Unsplash
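Since the versioning scheme is Semantic Versioning 2.0.0, deciding which of two package versions is newer follows that specification's precedence rules. The following is a minimal sketch in Python. The `semver_key` helper is hypothetical, and the pattern is simplified: it does not reject numeric pre-release identifiers with leading zeros.

```python
import re

_SEMVER = re.compile(
    r"^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"    # optional pre-release
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?$"  # optional build metadata
)


def semver_key(version: str):
    """Build a sort key that follows SemVer 2.0.0 precedence."""
    m = _SEMVER.match(version)
    if m is None:
        raise ValueError(f"not a SemVer 2.0.0 version: {version!r}")

    core = tuple(int(m.group(i)) for i in (1, 2, 3))
    if m.group(4) is None:
        # A normal version outranks any pre-release of the same core version.
        return core + (1, ())

    # Numeric identifiers compare as numbers and rank below alphanumeric ones;
    # alphanumeric identifiers compare as ASCII strings.
    ids = tuple(
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in m.group(4).split(".")
    )
    return core + (0, ids)  # build metadata is ignored for precedence
```

A list of version strings can then be ordered with `sorted(versions, key=semver_key)`.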
EXPath Packaging System
-
The Bad:
-
It is completely missing:
-
Consumption
-
Building
-
Transportation
-
Publication
-
-
Integration is weakly defined
-
Same URI can be reused for different components
-
No security
-
Q: Is that XQuery going to delete my collection(s)?
-
-
No checksums
-
On-disk repo package directories are named by the non-unique package abbrev
-
Photo by Priscilla Du Preez 🇨🇦 on Unsplash
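On the "No checksums" point, the sketch below shows what a consumer-side integrity check could look like. It is an assumption about how a checksum might be used, not something EXPath Packaging defines: the choice of SHA-256 and the `sha256_of` and `verify_package` helpers are hypothetical.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a package file in chunks, so large packages need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_package(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with a published one (hex, case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A checksum only detects corruption or substitution when the expected value is obtained from a trusted source, and it says nothing about what the code inside the package will do when run.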
EXPath Packaging System
-
The Ugly:
-
Each package has two names: `name` and `abbrev`
-
Metadata lacks extensibility:
-
Can't add additional user oriented information
-
Can't add implementation specific metadata
-
Metadata for a component is only a URI and filename
-
-
Components are explicit in metadata, could be introspected instead?
-
Dependencies on processors?
-
Photo by Sylwia Bartyzel on Unsplash
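The question of whether components "could be introspected instead" can be explored with a small experiment. The sketch below scans a directory for XQuery library modules and records the namespace URI each one declares. It is a rough illustration only: the `discover_xquery_modules` helper and the list of file extensions are assumptions, and a regular expression can be fooled by comments or string literals, so a real tool would use an XQuery parser.

```python
import re
from pathlib import Path

# Matches a library module declaration such as:
#   module namespace ex = "http://example.org/ex";
_MODULE_DECL = re.compile(
    r"""\bmodule\s+namespace\s+([\w.-]+)\s*=\s*(["'])(.*?)\2\s*;"""
)

_XQUERY_SUFFIXES = {".xq", ".xqm", ".xql", ".xquery"}  # assumed extensions


def discover_xquery_modules(root_dir: str) -> dict:
    """Map each declared module namespace URI to the files that declare it."""
    root = Path(root_dir)
    found = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in _XQUERY_SUFFIXES:
            continue
        match = _MODULE_DECL.search(path.read_text(encoding="utf-8"))
        if match:  # main modules have no module declaration and are skipped
            uri = match.group(3)
            found.setdefault(uri, []).append(path.relative_to(root).as_posix())
    return found
```

Any URI that maps to more than one file in the result is a case where the same namespace URI is reused by different components.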
EXPath Packaging System Implementations
-
MarkLogic - abandoned prototype.
-
BaseX - Supported.
-
Saxon - abandoned prototype.
-
eXist-db
-
Undocumented Metadata Extensions to EXPath Packaging: `repo.xml` and `exist.xml`
-
`<license>` / `<copyright>` / `<type>` / `<target>` / `<prepare>` / `<finish>`
-
-
Consumption / Publication: Public Application Repository (~100 Pkgs.)
-
Integration: `autodeploy` directory, XQuery functions, repository partly in database
-
Building: Ant, or Maven.
Photo by Karsten Winegeart on Unsplash
EXPath Packaging in eXist-db
Photo by Mohamed Nohassi on Unsplash
Where do we go from here?
-
We know that we have EXPath Packaging
-
We know what we need/want from Packaging
-
A modern ecosystem that encompasses:
-
Consumption
-
Integration
-
Storage
-
Building
-
Transportation
-
Publication
-
-
-
EXPath Packaging doesn't yet meet our requirements
-
Can we fix it?
-
or, do we need to start again?
-
Photo by David Kovalenko on Unsplash
That sounds like a lot of hard work!
Photo by Nathalie SPEHNER on Unsplash
Can we take a lesson from this bird?
Photo by Joshua J. Cotten on Unsplash
-
If our eggs (Packages) looked like someone else's eggs...
-
If we put our eggs in someone else's nest (Repository)...
-
Would they look after them for us?
Repurposing Another System
-
RPM / DEB / Homebrew
-
May be possible, but single version, and highly system oriented.
-
`libsolv` is interesting!
-
-
NPM
-
JavaScript only. Only one public repo. (Don't even) ask about their purported packaging standards.
-
-
RubyGems
-
Ruby only. Only one public repo. Packaging format is both good and bad.
-
-
Maven
-
JVM first, but extensible for any package format. Build centric approach.
-
-
Pip
-
Python only. Single version, has dependency resolution problems.
-
-
Conda
-
Designed for handling any language!
-
Photo by Steven Wright on Unsplash
Two Contenders
-
Maven
-
Consumption: From repositories
-
Integration: Major IDEs
-
Storage: The `.m2` folder
-
Building: The `mvn` tool
-
Transportation / Publication: To repositories
-
-
Conda
-
Consumption: From repositories
-
Integration: Major IDEs
-
Storage: The `.conda` folder (or virtualenv)
-
Building: Conda Forge
-
Transportation / Publication: To repositories
-
Photo by Karsten Winegeart on Unsplash
Future Work...
-
A series of distinct standards and tools
-
Can we design a revised EXPath Packaging Standard (v2?) that can be implemented through reuse of other packaging systems?
-
Can we maintain compatibility with EXPath Packaging v1?
-
-
Can we successfully implement a revised EXPath Packaging Standard (v2?):
-
Using Maven?
-
Using Conda?
-
If so, are they interoperable?
-
Photo by Roberto Nickson on Unsplash
Questions?
January 22 - 26, 2024 / London
Our new Training Course
-
Modular:
-
XML
-
XQuery 3.1
-
XSLT 2 and 3
-
XML Databases
-
-
In Person:
-
Instructor Led
-
Lots of Hands-on Exercises - Bring your laptop!
-
3 to 5 days depending on your chosen Modules
-
email: hello@evolvedbinary.com
Alexandra
Adam
Tomos
A Possible EXPath Pkg Version 2
By Adam Retter
Presentation given at the Declarative Amsterdam conference - 2 November 2023 - Amsterdam Science Park