OCI Artifact for ML model & metadata

This project is a collection of blueprints, patterns, and toolchains (in the form of a Python SDK and CLI) for leveraging OCI Artifacts and containers for ML models and metadata.

Installation

In your Python environment, use:

pip install omlmd

Why do I need a Python environment?

This SDK follows the same prerequisites as InstructLab and is intended to offer a Pythonic way to create OCI Artifacts for ML models and metadata. For general-purpose CLI tools for containers, we invite you to check out Podman and the rest of the Containers tooling.

Push

Store ML model file model.joblib and its metadata in the OCI repository at localhost:8080:

from omlmd.helpers import Helper

omlmd = Helper()
omlmd.push(
    "localhost:8080/matteo/ml-artifact:latest",
    "model.joblib",
    name="Model Example",
    author="John Doe",
    license="Apache-2.0",
    accuracy=9.876543210,
)

Or, using the CLI:

omlmd push localhost:8080/mmortari/mlartifact:v1 model.joblib --metadata md.json --plain-http
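The CLI variant reads metadata from a JSON file. A minimal sketch that writes such an md.json, assuming the file simply holds free-form key/value metadata mirroring the keyword arguments of the SDK example:

```python
import json

# Assumption: md.json holds free-form key/value metadata, mirroring the
# keyword arguments passed to omlmd.push() in the SDK example above.
metadata = {
    "name": "Model Example",
    "author": "John Doe",
    "license": "Apache-2.0",
    "accuracy": 9.876543210,
}

with open("md.json", "w") as f:
    json.dump(metadata, f, indent=2)
```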

Pull

Fetch everything in a single pull:

omlmd.pull(target="localhost:8080/matteo/ml-artifact:latest", outdir="tmp/b")

Or, using the CLI:

omlmd pull localhost:8080/mmortari/mlartifact:v1 -o tmp/a --plain-http

Or fetch only the ML model assets:

omlmd.pull(
    target="localhost:8080/matteo/ml-artifact:latest",
    outdir="tmp/b",
    media_types=["application/x-mlmodel"],
)

Or, using the CLI:

omlmd pull localhost:8080/mmortari/mlartifact:v1 -o tmp/b --media-types "application/x-mlmodel" --plain-http
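Media-type filtering selects which layers of the artifact to download. A toy sketch of the selection logic (the descriptor list and the second media type are fabricated for illustration, not taken from omlmd):

```python
# Fabricated layer descriptors; in a real OCI manifest each layer
# carries a mediaType field like the ones below.
layers = [
    {"mediaType": "application/x-mlmodel", "path": "model.joblib"},
    {"mediaType": "application/x-config", "path": "md.json"},
]

# Keep only the layers whose media type was requested.
wanted = ["application/x-mlmodel"]
selected = [layer["path"] for layer in layers if layer["mediaType"] in wanted]
print(selected)  # → ['model.joblib']
```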

Custom Pull: just metadata

These features can be composed to expose higher-level capabilities, such as retrieving only the metadata information. The implementation intends to follow the OCI Artifact conventions.

md = omlmd.get_config(target="localhost:8080/matteo/ml-artifact:latest")
print(md)

Or, using the CLI:

omlmd get config localhost:8080/mmortari/mlartifact:v1 --plain-http
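Assuming the retrieved config is JSON text, it can be parsed with the standard library. A minimal sketch — the payload below is hypothetical; in practice the string comes from omlmd.get_config():

```python
import json

# Hypothetical config payload for illustration only; the real structure
# is whatever metadata was pushed with the artifact.
md = '{"name": "Model Example", "customProperties": {"accuracy": 9.87}}'

config = json.loads(md)
print(config["customProperties"]["accuracy"])  # → 9.87
```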

Crawl

Client-side crawling of metadata.

Note: a server-side analogue is coming soon; see the blueprints for reference.

crawl_result = omlmd.crawl([
    "localhost:8080/matteo/ml-artifact:v1",
    "localhost:8080/matteo/ml-artifact:v2",
    "localhost:8080/matteo/ml-artifact:v3",
])

Or, using the CLI:

omlmd crawl localhost:8080/mmortari/mlartifact:v1 localhost:8080/mmortari/mlartifact:v2 localhost:8080/mmortari/mlartifact:v3 --plain-http

Example query

Demonstrates integration of crawling results with querying (in this case using jq).

Of the crawled ML OCI artifacts, which one exhibits the max accuracy?

import jq
jq.compile("max_by(.config.customProperties.accuracy).reference").input_text(crawl_result).first()

Or, using the CLI:

omlmd crawl --plain-http \
    localhost:8080/mmortari/mlartifact:v1 \
    localhost:8080/mmortari/mlartifact:v2 \
    localhost:8080/mmortari/mlartifact:v3 \
    | jq "max_by(.config.customProperties.accuracy).reference"
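For environments without jq, the same query can be expressed in plain Python. A sketch under the assumption that the crawl result is a JSON array of artifact descriptors shaped like the jq query expects (the sample data below is fabricated for illustration):

```python
import json

# Fabricated sample of a crawl result; in practice this JSON string
# comes from omlmd.crawl() over real artifacts.
crawl_result = json.dumps([
    {"reference": "localhost:8080/matteo/ml-artifact:v1",
     "config": {"customProperties": {"accuracy": 0.91}}},
    {"reference": "localhost:8080/matteo/ml-artifact:v2",
     "config": {"customProperties": {"accuracy": 0.97}}},
    {"reference": "localhost:8080/matteo/ml-artifact:v3",
     "config": {"customProperties": {"accuracy": 0.89}}},
])

# Equivalent of: max_by(.config.customProperties.accuracy).reference
entries = json.loads(crawl_result)
best = max(entries, key=lambda e: e["config"]["customProperties"]["accuracy"])
print(best["reference"])  # → localhost:8080/matteo/ml-artifact:v2
```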

To be continued...