Tracking model artifacts used in machine learning


What are effective working patterns and tools to ensure we can easily reproduce the model artifacts deployed in production? (The customer is using DVC and GitHub for version control of all key aspects: data, training scripts, model specification, hyperparameters, etc.) I'd like to share AWS best practices and recommendations with them.

asked 3 years ago
862 views
1 Answer
Accepted Answer

A few good practices help put a robust model governance and tracking strategy in place. Regulators usually require this, especially in the financial industry, and frameworks such as SR 11-7 define it:

• Identify models, owners, and associated usage: This is usually achieved by giving data scientists a controlled landing zone with clear authentication and authorization strategies, so that user activities and model owners can be tracked.

• Cover all aspects of the model life cycle and MLOps: Use experiment tracking, proper documentation, feature tagging, testing, deployment environments, and pipelines so that model validators can validate models independently without having to go back to the model developers.

• Maintain a centralized model inventory and track the current validation status: Keep track of the different model versions along with their associated risk and validation processes.

• Ongoing monitoring for production models: Implement mechanisms to continuously assess accuracy and drift, enforce constraints on the data feed used for inference, and analyze outcomes for different model versions.
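As a rough illustration of the centralized inventory idea, the following Python sketch keeps one record per model version with its owner, validation status, and the Git commit and data hash needed to reproduce it. All names here (`ModelVersion`, `ModelInventory`, the status values, the sample commits and hashes) are hypothetical and not part of any AWS or DVC API; a real inventory would live in a managed registry rather than in memory.

```python
from dataclasses import dataclass, field
from enum import Enum


class ValidationStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ModelVersion:
    name: str
    version: int
    owner: str
    git_commit: str  # commit of the training code (made-up value below)
    data_hash: str   # content hash of the training data (made-up value below)
    status: ValidationStatus = ValidationStatus.PENDING


@dataclass
class ModelInventory:
    # Maps a model name to its list of versions, oldest first.
    _versions: dict = field(default_factory=dict)

    def register(self, name, owner, git_commit, data_hash):
        """Add a new version; version numbers start at 1 per model name."""
        versions = self._versions.setdefault(name, [])
        model_version = ModelVersion(
            name, len(versions) + 1, owner, git_commit, data_hash
        )
        versions.append(model_version)
        return model_version

    def set_status(self, name, version, status):
        """Record the outcome of independent validation for one version."""
        self._versions[name][version - 1].status = status

    def latest_approved(self, name):
        """Return the newest approved version, or None if there is none."""
        approved = [
            v for v in self._versions.get(name, [])
            if v.status is ValidationStatus.APPROVED
        ]
        return approved[-1] if approved else None


inventory = ModelInventory()
inventory.register("credit-risk", "alice", "a1b2c3d", "md5:1111")
inventory.register("credit-risk", "alice", "e4f5a6b", "md5:2222")
inventory.set_status("credit-risk", 1, ValidationStatus.APPROVED)

deployable = inventory.latest_approved("credit-risk")
print(deployable.version, deployable.git_commit)
```

Storing the code commit and data hash next to the validation status is what lets a validator rebuild the exact artifact without going back to the developer.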

That said, many tools can help implement all of the above, but they are scattered, which makes implementation and integration a challenge. SageMaker, however, has different modules that cover most of the practices mentioned above:

  • SageMaker Experiments: Tracks the different steps of the ML lifecycle in a construct called a trial component. A group of trial components forms a trial, and a trial belongs to an experiment.

  • SageMaker Pipelines: Helps build reproducible experiments. Each stage of the ML lifecycle fits into the pipeline, is automatically tracked as a trial component, and is recorded in the pipeline's execution history.

  • SageMaker Model Registry: Builds a centralized catalog for models to manage different model versions, associate metadata with each version, and manage approvals to enforce ownership.

  • SageMaker Model Monitor: Manages data feed constraints, model performance analysis, and drift detection.

  • SageMaker Feature Store: Properly tagging feature groups, and recording why certain features were engineered and by whom, is also a must.

  • SageMaker ML Lineage: From my point of view, this is the most important component for model tracking, auditing, and governance. SageMaker ML Lineage is the glue that builds a graph to trace a given model back to its origins. It can tell which artifact contributed to which trial component and which data produced which model.
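To make the lineage idea concrete, here is a minimal sketch of tracing a model back to its origins through a directed graph. It is plain Python, not the SageMaker ML Lineage API; the entity names and the edges between them are invented for illustration.

```python
from collections import defaultdict

# Each edge points from a source entity to the entity it contributed to.
# All entity names below are hypothetical.
edges = [
    ("dataset:raw-loans", "step:preprocess"),
    ("step:preprocess", "dataset:features"),
    ("dataset:features", "step:train"),
    ("code:train.py@a1b2c3d", "step:train"),
    ("step:train", "model:credit-risk-v1"),
]

# Reverse index: for each entity, the entities that fed into it.
upstream = defaultdict(list)
for source, destination in edges:
    upstream[destination].append(source)


def trace_origins(entity):
    """Return every upstream entity reachable from `entity`, sorted by name."""
    seen = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)


ancestors = trace_origins("model:credit-risk-v1")
datasets = [e for e in ancestors if e.startswith("dataset:")]
print(datasets)
```

Walking the graph backwards from the deployed model answers the audit question of which data and which code version produced it, which is the same question the original poster wants to answer with DVC and GitHub.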

answered 3 years ago
reviewed a year ago
