Tracking model artifacts used in machine learning

What are effective working patterns and tools to ensure we can easily reproduce the model artifacts deployed in production? (The customer is using DVC and GitHub to version-control all key aspects: data, training scripts, model specification, hyperparameters, etc.) I'd like to share AWS best practices and recommendations with them.

Expert
Asked 3 years ago, viewed 899 times
1 answer
Accepted answer

A few good practices help ensure that a robust model governance and tracking strategy is in place. Regulators often require this, especially in the financial industry, and it is defined by frameworks such as SR 11-7:

• Identify models, owners, and associated usage: This is usually achieved by providing a controlled landing zone for data scientists, with clear authentication and authorization strategies, to keep track of user activities and model owners.

• Cover all aspects of the model life cycle and MLOps: Use experiment tracking, proper documentation, feature tagging, testing, deployment environments, and pipelines so that model validators can validate models independently without having to go back to the model developers.

• Maintain a centralized model inventory and track the current validation status: Keep track of the different model versions along with their associated risk and validation processes.

• Ongoing monitoring of production models: Implement mechanisms to continuously assess accuracy and drift, build constraints on the data feed used for inference, and run outcome analysis for different model versions.
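To make the inventory practice concrete, here is a minimal, tool-agnostic sketch in plain Python. It is not an AWS API: `ModelRecord`, `ModelInventory`, and every field name are hypothetical, chosen only to show how each model version can carry its owner, the Git commit of the training code, a data hash (for example, one taken from a DVC file), and a validation status.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional

VALID_STATUSES = {"pending", "approved", "rejected"}


@dataclass(frozen=True)
class ModelRecord:
    """One entry in a hypothetical model inventory."""
    name: str
    version: int
    owner: str
    git_commit: str  # commit of the training code
    data_hash: str   # hash of the training data, e.g. from a DVC file
    validation_status: str = "pending"


@dataclass
class ModelInventory:
    """Hypothetical in-memory inventory, keyed by model name."""
    records: Dict[str, List[ModelRecord]] = field(default_factory=dict)

    def register(self, name: str, owner: str,
                 git_commit: str, data_hash: str) -> ModelRecord:
        # Versions are assigned sequentially per model name, starting at 1.
        versions = self.records.setdefault(name, [])
        record = ModelRecord(name, len(versions) + 1, owner,
                             git_commit, data_hash)
        versions.append(record)
        return record

    def set_status(self, name: str, version: int, status: str) -> ModelRecord:
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown validation status: {status}")
        versions = self.records[name]
        # Records are immutable, so store an updated copy in place.
        updated = replace(versions[version - 1], validation_status=status)
        versions[version - 1] = updated
        return updated

    def latest_approved(self, name: str) -> Optional[ModelRecord]:
        # Walk from the newest version back to the oldest.
        for record in reversed(self.records.get(name, [])):
            if record.validation_status == "approved":
                return record
        return None
```

In a real setup, a managed catalog would play this role; the sketch only shows which pieces of information are worth tying to each version so that a deployed artifact can be reproduced.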

That said, many tools can help implement all of the above, but they are scattered, which can make implementation and integration a challenge. However, SageMaker has different modules that cover most of the practices mentioned above.

  • SageMaker Experiments: Tracks the different steps of the ML lifecycle in a construct called a trial component. A group of trial components forms a trial, and a trial belongs to an experiment.

  • SageMaker Pipelines: Helps build a reproducible experiment. Each stage of the ML lifecycle fits into the pipeline and can automatically be tracked as trial components and through the pipeline's execution history.

  • SageMaker Model Registry: Builds a centralized catalog of models to manage different model versions, associate metadata with each version, and manage approvals to enforce ownership.

  • SageMaker Model Monitor: Manages data feed constraints, model performance analysis, and drift detection.

  • SageMaker Feature Store: Properly tagging feature groups, and recording why certain features were engineered and by whom, is also a must.

  • SageMaker ML Lineage: From my point of view, this is the most important component for model tracking, auditing, and governance. SM ML Lineage is the glue that builds a graph to trace a given model back to its origins. It can tell which artifact contributed to which trial component and which data produced which model.
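The lineage idea in the last bullet can be illustrated with a small graph walk. This is a conceptual sketch in plain Python, not the SageMaker API: the entity names, the `EXAMPLE_EDGES` data, and the `trace_origins` helper are all hypothetical, and an edge `(source, destination)` is assumed to mean "source contributed to destination".

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# (source, destination): "source contributed to destination"
Edge = Tuple[str, str]


def build_upstream_index(edges: List[Edge]) -> Dict[str, List[str]]:
    """Map each entity to the entities that directly contributed to it."""
    upstream: Dict[str, List[str]] = {}
    for source, destination in edges:
        upstream.setdefault(destination, []).append(source)
    return upstream


def trace_origins(entity: str, edges: List[Edge]) -> Set[str]:
    """Return every entity that directly or indirectly contributed to `entity`."""
    upstream = build_upstream_index(edges)
    seen: Set[str] = set()
    queue = deque([entity])
    # Breadth-first walk against the direction of the edges.
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical lineage: raw data -> preprocessing -> features -> training -> model
EXAMPLE_EDGES: List[Edge] = [
    ("dataset:raw-v3", "trial-component:preprocess"),
    ("trial-component:preprocess", "dataset:features-v3"),
    ("dataset:features-v3", "trial-component:train"),
    ("code:train.py@abc123", "trial-component:train"),
    ("trial-component:train", "model:churn-v1"),
]

if __name__ == "__main__":
    for origin in sorted(trace_origins("model:churn-v1", EXAMPLE_EDGES)):
        print(origin)
```

Walking upstream from a model yields the datasets, code versions, and processing steps behind it, which is the question an auditor or model validator typically needs answered.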

AWS
Will_B
Answered 3 years ago
Reviewed by an expert 1 year ago
