
Questions tagged with Machine Learning & AI


Browse through the questions and answers listed below or filter and sort to narrow down your results.

ClientError: An error occurred (UnknownOperationException) when calling the CreateHyperParameterTuningJob operation: The requested operation is not supported in the called region.

Hi Dears, I am building an ML model using the DeepAR algorithm. I hit this error when I reached the hyperparameter tuning step:

Error: `ClientError: An error occurred (UnknownOperationException) when calling the CreateHyperParameterTuningJob operation: The requested operation is not supported in the called region.`

Code:

```python
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)
from sagemaker import image_uris

container = image_uris.retrieve(region="af-south-1", framework="forecasting-deepar")

deepar = sagemaker.estimator.Estimator(
    container,
    role,
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    use_spot_instances=True,  # use spot instances
    max_run=1800,  # max training time in seconds
    max_wait=1800,  # seconds to wait for spot instance
    output_path="s3://{}/{}".format(bucket, output_path),
    sagemaker_session=sess,
)

freq = "D"
context_length = 300

deepar.set_hyperparameters(
    time_freq=freq,
    context_length=str(context_length),
    prediction_length=str(prediction_length),
)

hyperparameter_ranges = {
    "mini_batch_size": IntegerParameter(100, 400),
    "epochs": IntegerParameter(200, 400),
    "num_cells": IntegerParameter(30, 100),
    "likelihood": CategoricalParameter(["negative-binomial", "student-T"]),
    "learning_rate": ContinuousParameter(0.0001, 0.1),
}

objective_metric_name = "test:RMSE"

tuner = HyperparameterTuner(
    deepar,
    objective_metric_name,
    hyperparameter_ranges,
    max_jobs=10,
    strategy="Bayesian",
    objective_type="Minimize",
    max_parallel_jobs=10,
    early_stopping_type="Auto",
)

s3_input_train = sagemaker.inputs.TrainingInput(
    s3_data="s3://{}/{}/train/".format(bucket, prefix), content_type="json"
)
s3_input_test = sagemaker.inputs.TrainingInput(
    s3_data="s3://{}/{}/test/".format(bucket, prefix), content_type="json"
)

tuner.fit({"train": s3_input_train, "test": s3_input_test}, include_cls_metadata=False)
tuner.wait()
```

Can you please help in solving the error? I have to do this in the af-south-1 region. Thanks, Basem
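For quick triage it can help to confirm whether the tuning APIs are reachable in a region at all, before debugging the job itself. A minimal boto3 sketch (the region list is illustrative, and this assumes credentials are already configured):

```python
import boto3
from botocore.exceptions import ClientError

# A read-only call on the same API surface as CreateHyperParameterTuningJob;
# in a region where tuning is unavailable this raises UnknownOperationException.
for region in ["af-south-1", "eu-west-1"]:
    sm = boto3.client("sagemaker", region_name=region)
    try:
        sm.list_hyper_parameter_tuning_jobs(MaxResults=1)
        print(f"{region}: hyperparameter tuning API reachable")
    except ClientError as e:
        print(f"{region}: {e.response['Error']['Code']}")
```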
0 answers · 0 votes · 0 views · Basem · asked 10 hours ago

XGBoost Error: Allreduce failed - 100GB Dask Dataframe on AWS Fargate ECS cluster dies with 1T of memory.

Overview: I'm trying to run an XGBoost model on a bunch of parquet files sitting in S3 using Dask, by setting up a Fargate cluster and connecting it to a Dask cluster. The total dataframe size comes to about 140 GB of data. I scaled up a Fargate cluster with these properties: Workers: 40, Total threads: 160, Total memory: 1 TB. So there should be enough memory to hold the data and tasks; each worker has 9+ GB with 4 threads. I do some very basic preprocessing and then I create a DaskDMatrix, which does cause the task bytes per worker to get a little high, but never above the threshold where it would fail. Next I run `xgb.dask.train`, which uses the xgboost package, not the dask_ml.xgboost package. Very quickly the workers die and I get the error `XGBoostError: rabit/internal/utils.h:90: Allreduce failed`. When I attempted this with a single file of only 17 MB of data, I still got this error, but only a couple of workers died. Does anyone know why this happens, since I have double the memory of the dataframe?

```python
X_train = X_train.to_dask_array()
X_test = X_test.to_dask_array()
y_train = y_train
y_test = y_test

dtrain = xgb.dask.DaskDMatrix(client, X_train, y_train)

output = xgb.dask.train(
    client,
    {"verbosity": 1, "tree_method": "hist", "objective": "reg:squarederror"},
    dtrain,
    num_boost_round=100,
    evals=[(dtrain, "train")],
)
```
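One pattern sometimes tried for Allreduce failures like this, offered as a guess rather than a fix, is to pin the training data in worker memory before building the `DaskDMatrix`, so that lazy task execution is not competing with the training phase. A sketch reusing the names from the snippet above:

```python
from dask.distributed import wait

# persist() starts computing the arrays on the workers and keeps them in RAM;
# wait() blocks until they have fully materialized.
X_train = X_train.persist()
y_train = y_train.persist()
wait([X_train, y_train])

dtrain = xgb.dask.DaskDMatrix(client, X_train, y_train)
```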
1 answer · 0 votes · 5 views · AWS-User-7732475 · asked 7 days ago

Amazon SageMaker Data Wrangler now supports additional M5 and R5 instances for interactive data preparation

Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio, the first fully integrated development environment (IDE) for ML. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization, from a single visual interface. SageMaker Data Wrangler runs on ml.m5.4xlarge by default. It includes built-in data transforms and analyses written in PySpark, so you can process large data sets (up to hundreds of gigabytes (GB) of data) efficiently on the default instance. Starting today, you can use additional M5 or R5 instance types with more CPU or memory in SageMaker Data Wrangler to improve performance for your data preparation workloads. Amazon EC2 M5 instances offer a balance of compute, memory, and networking resources for a broad range of workloads. Amazon EC2 R5 instances are memory-optimized instances. Both M5 and R5 instance types are well suited for CPU- and memory-intensive applications, such as running built-in transforms on very large data sets (up to terabytes (TB) of data) or applying custom transforms written in pandas to medium data sets (up to tens of GB). To learn more about the newly supported instances with Amazon SageMaker Data Wrangler, visit the [blog](https://aws.amazon.com/blogs/machine-learning/process-larger-and-wider-datasets-with-amazon-sagemaker-data-wrangler/), the [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-data-flow.html), and the [pricing page](https://aws.amazon.com/sagemaker/pricing/). To get started with SageMaker Data Wrangler, visit the [AWS documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler.html).
0 answers · 0 votes · 5 views · AWS-Huong-Nguyen · asked 9 days ago

Rekognition search faces API endpoint

Hi everyone! Currently, I've accomplished detecting all the faces in a collection and then generating sub-galleries of each subject with all their associated photos, using the Ruby SDK '~> 1.65'. To do this, I index the faces of all photos within a collection, list all the faces (https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Rekognition/Client.html#list_faces-instance_method), then grab each recognized face_id and search for the faces related to that face_id (https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Rekognition/Client.html#search_faces-instance_method), and delete the face id used to make the API call along with all the returned ones, to tell where a newly detected subject starts and ends. My issue is that the search faces API returns different results depending on which face id param you make the request with. For example, if there are 10 face ids detected that belong to a person (1, 2, 3, ..., 10), the search faces call with face id = 1 should return the face ids (2, 3, 4, ..., 10), but if you continue to do this with the other face ids, this is not always the case; in some scenarios the search faces call with face id = 3 has returned only a subset of the previously mentioned ids, like just (4, 5, 6). Is there any other way to achieve this and prevent this kind of "error"? If not, this is a real concern for us, because the result depends on the order in which we call search faces with different face ids, and sometimes it seems like there is more than one subject detected with almost the same photos, when in reality it's the same person. Thanks in advance!
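For reference, the grouping flow described above looks roughly like this as a Python (boto3) sketch; the asker's actual code uses the Ruby SDK, the collection ID here is hypothetical, and `FaceMatchThreshold` strongly affects which faces each call returns:

```python
import boto3

rekognition = boto3.client("rekognition")
COLLECTION_ID = "my-collection"  # hypothetical name

# Walk every indexed face; group faces into subjects by repeatedly searching
# and removing whatever the search returns, mirroring the flow in the question.
faces = rekognition.list_faces(CollectionId=COLLECTION_ID)["Faces"]
remaining = {f["FaceId"] for f in faces}
subjects = []

while remaining:
    seed = next(iter(remaining))
    matches = rekognition.search_faces(
        CollectionId=COLLECTION_ID, FaceId=seed, FaceMatchThreshold=90
    )["FaceMatches"]
    group = {seed} | {m["Face"]["FaceId"] for m in matches}
    subjects.append(group)
    remaining -= group
```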
3 answers · 0 votes · 4 views · AWS-User-5694207 · asked 13 days ago

How to create a (Serverless) SageMaker Endpoint using an existing TensorFlow .pb (frozen model) file?

Note: I am a senior developer, but am very new to the topic of machine learning. I have two frozen TensorFlow model weight files: `weights_face_v1.0.0.pb` and `weights_plate_v1.0.0.pb`. I also have some Python code using TensorFlow 2 that loads the models and handles basic inference. The models detect faces and license plates respectively, and the surrounding code converts an input image to a numpy array and applies blurring to the areas of the image that had detections. I want to set up a SageMaker endpoint so that I can run inference on the model. I initially tried using a regular Lambda function (container based), but that is too slow for our use case. A SageMaker endpoint should give us GPU inference, which should be much faster. I am struggling to find out how to do this. From what I can tell from reading the documentation and watching some YouTube videos, I need to create my own Docker container. As a start, I can use for example `763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.8.0-gpu-py39-cu112-ubuntu20.04-sagemaker`. However, I can't find any solid documentation on how to plug in the rest of my code. How do I send an image to SageMaker? What tells it to convert the image to a numpy array? How does it know the tensor names? How do I install additional requirements? How can I use the detections to apply blurring to the image, and how can I return the resulting image? Can someone here please point me in the right direction? I searched a lot but can't find any example code or blogs that explain this process. Thank you in advance! Your help is much appreciated.
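For orientation: with the SageMaker TensorFlow inference containers, pre- and post-processing typically live in an `inference.py` placed in a `code/` directory packaged with the model, rather than in a fully custom image. A minimal sketch, assuming the image arrives as raw bytes; the handler names follow the TensorFlow Serving container convention, and the decoding details here are illustrative:

```python
# inference.py -- loaded by the SageMaker TensorFlow Serving container from code/
import io
import json

import numpy as np
from PIL import Image


def input_handler(data, context):
    """Pre-processing: turn the raw request body into the payload TF Serving expects."""
    image = Image.open(io.BytesIO(data.read()))
    array = np.asarray(image).astype("float32")
    # TF Serving's REST predict API expects {"instances": [...]}
    return json.dumps({"instances": [array.tolist()]})


def output_handler(response, context):
    """Post-processing: pass detections through; blurring logic could live here instead."""
    return response.content, context.accept_header
```

Extra Python dependencies can be listed in a `code/requirements.txt`, which these containers install at startup.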
1 answer · 0 votes · 2 views · George · asked 20 days ago

Lex / Lambda OrderFlowers blueprints give Invalid Lambda Response error

The OrderFlowers blueprint runs fine in Lex through the console, but when the Lambda blueprint is attached, the test fails with "Invalid Lambda Response: Received error response from Lambda: Unhandled". The log shows the error:

```
[ERROR] KeyError: 'userId'
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 206, in lambda_handler
    return dispatch(event)
  File "/var/task/lambda_function.py", line 182, in dispatch
    logger.debug('dispatch userId={}, intentName={}'.format(intent_request['userId'], intent_request['currentIntent']['name']))
```

The reuse of the function for fulfillment and validation is very hard to follow, because the doc says: "In the Editor, choose AWS Lambda function as Fulfillment, and select the Lambda function that you created in the preceding step (OrderFlowersCodeHook). Choose OK to give Amazon Lex permission to invoke the Lambda function." The problem is there is no choice for the Lambda function in that form, nor is there one in the validation form! There is a reference to the function being stored in an alias, but I can't find a way to tie this together. I have really tried to follow the documentation systematically but can't get through it. Where am I going off the rails? Shouldn't the blueprint run without this stumbling block? This has cost me a month of hair-pulling agony, and it is really the first step in building an index for my knowledge system already running on AWS, where a session could build a compound index such as "Application:Billing\Function:Encounters\Process:Create\Plan:BlueCross\Product:PPO\Coverage:Medical\Service:Physical\FromDate:01-01-2022\". It seems like a natural way to specify types and values in a modern interface. Thanks, AEH
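For what it's worth, a `KeyError: 'userId'` usually means the event shape doesn't match what the handler expects: Lex V2 events carry `sessionId` and `sessionState.intent.name`, while the V1 blueprints read `userId` and `currentIntent.name`. A defensive sketch, assuming the handler may receive either format:

```python
def get_user_and_intent(intent_request):
    """Read the user id and intent name from either a Lex V1 or V2 event."""
    if "currentIntent" in intent_request:  # Lex V1 event shape
        return intent_request["userId"], intent_request["currentIntent"]["name"]
    # Lex V2 event shape
    return intent_request["sessionId"], intent_request["sessionState"]["intent"]["name"]
```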
2 answers · 1 vote · 6 views · AWS-User-9536514 · asked 2 months ago

Trying SageMaker example but getting error: AttributeError: module 'sagemaker' has no attribute 'create_transform_job'

Hi, I keep getting this error: AttributeError: module 'sagemaker' has no attribute 'create_transform_job', when using a batch transform example that AWS graciously had in the notebook instances. Also, I updated the sagemaker package to the newest version and it's still not working. Code:

```python
%%time
import time
from time import gmtime, strftime

batch_job_name = "Batch-Transform-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
input_location = "s3://{}/{}/batch/{}".format(
    bucket, prefix, batch_file
)  # use input data without ID column
output_location = "s3://{}/{}/output/{}".format(bucket, prefix, batch_job_name)

request = {
    "TransformJobName": batch_job_name,
    "ModelName": "xgboost-parquet-example-training-2022-03-28-16-02-31-model",
    "TransformOutput": {
        "S3OutputPath": output_location,
        "Accept": "text/csv",
        "AssembleWith": "Line",
    },
    "TransformInput": {
        "DataSource": {"S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_location}},
        "ContentType": "text/csv",
        "SplitType": "Line",
        "CompressionType": "None",
    },
    "TransformResources": {"InstanceType": "ml.m4.xlarge", "InstanceCount": 1},
}

sagemaker.create_transform_job(**request)
print("Created Transform job with name: ", batch_job_name)

# Wait until the job finishes
try:
    sagemaker.get_waiter("transform_job_completed_or_stopped").wait(TransformJobName=batch_job_name)
finally:
    response = sagemaker.describe_transform_job(TransformJobName=batch_job_name)
    status = response["TransformJobStatus"]
    print("Transform job ended with status: " + status)
    if status == "Failed":
        message = response["FailureReason"]
        print("Transform failed with the following error: {}".format(message))
        raise Exception("Transform job failed")
```

Everything else is working well. I've had no luck with this on any other forum.
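A likely cause, offered as a guess: `create_transform_job` is a low-level API on the boto3 SageMaker client, not on the `sagemaker` Python SDK module. A minimal sketch of the swap, keeping `request` and `batch_job_name` from the snippet above:

```python
import boto3

sm_client = boto3.client("sagemaker")

# The low-level client exposes the CreateTransformJob API directly.
sm_client.create_transform_job(**request)

# The waiter and describe calls in the example are also client methods.
sm_client.get_waiter("transform_job_completed_or_stopped").wait(TransformJobName=batch_job_name)
response = sm_client.describe_transform_job(TransformJobName=batch_job_name)
```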
1 answer · 0 votes · 9 views · AWS-User-7732475 · asked 2 months ago

Training Metric logging on SageMaker experiment tracking: how to get time-series metrics with visualisation

I am using the SageMaker Python SDK to train a bespoke model. I have defined my `metric_definitions` regexes and passed them to the estimator like:

```python
num_re = "([0-9\\.]+)(e-?[[01][0-9])?"
metrics = [
    {"Name": "learning-rate", "Regex": f"lr: {num_re}"},
    {"Name": "training:loss", "Regex": f"loss: {num_re}"},
    # ...
]
estimator = Estimator(
    image_uri=training_image_uri,
    # ...
    metric_definitions=metrics,
    enable_sagemaker_metrics=True,
)
```

When I run training, these metrics are visible in my logs and I can also see them in SageMaker Studio in `Trial Components > Metrics (tab)` as a grid of numbers like:

| Name | Minimum | Maximum | Standard Deviation | Average | Count | Final value |
| --- | --- | --- | --- | --- | --- | --- |
| learning-rate | 8.889 | 8.907 | 0.010392304845413657 | 8.898 | 4 | 8.907 |

which suggests that the regexes are correctly matching on the logs. However, I am not able to visualise any graphs for my metrics. I have tried all of:

- `SageMaker Studio > Trial components > charts`: it is only possible to plot things like `learning-rate_min` (i.e. a point value, not a time-series metric).
- `SageMaker AWS console > Training > Training jobs > <select job> > scroll to Monitor section`: here I can see metrics like CPUUtilization over time, but for each of my own metrics there is just an empty graph that says 'No data available'.
- `SageMaker AWS console > Training > Training jobs > <select job> > scroll to Monitor section > View algorithm metrics (opens in CloudWatch) > Browse > select metric (e.g. learning-rate) and 'Add to graph'`: I filter by the correct time period and go to the `Graphed metrics (1)` tab; even after updating the period to `1 second` I am not able to see anything on the graph.

I'm not sure what the issue is here, but any help would be much appreciated.
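One way to pull the captured time series programmatically, as a cross-check that the data actually exists: the SDK's `TrainingJobAnalytics` returns the recorded metric values as a dataframe. A sketch with a placeholder job name:

```python
from sagemaker.analytics import TrainingJobAnalytics

# Fetches the metric time series SageMaker recorded for the job;
# each row has a timestamp, metric_name, and value.
analytics = TrainingJobAnalytics(
    training_job_name="my-training-job",  # placeholder
    metric_names=["learning-rate", "training:loss"],
)
df = analytics.dataframe()
print(df.head())
```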
2 answers · 0 votes · 5 views · AWS-User-6087540 · asked 2 months ago

How can I feed an outputted augmented manifest file as input to BlazingText in a pipeline?

I'm creating a pipeline with multiple steps: one to preprocess a dataset, and another that takes the preprocessed output as input to train a BlazingText model for classification. My first `ProcessingStep` outputs augmented manifest files:

```python
step_process = ProcessingStep(
    name="Nab3Process",
    processor=sklearn_processor,
    inputs=[
        ProcessingInput(source=raw_input_data, destination=raw_dir),
        ProcessingInput(source=categories_input_data, destination=categories_dir),
    ],
    outputs=[
        ProcessingOutput(output_name="train", source=train_dir),
        ProcessingOutput(output_name="validation", source=validation_dir),
        ProcessingOutput(output_name="test", source=test_dir),
        ProcessingOutput(output_name="mlb_train", source=mlb_data_train_dir),
        ProcessingOutput(output_name="mlb_validation", source=mlb_data_validation_dir),
        ProcessingOutput(output_name="mlb_test", source=mlb_data_test_dir),
        ProcessingOutput(output_name="le_vectorizer", source=le_vectorizer_dir),
        ProcessingOutput(output_name="mlb_vectorizer", source=mlb_vectorizer_dir),
    ],
    code=preprocessing_dir,
)
```

But I'm having a hard time when I try to feed my `train` output as a `TrainingInput` to the model step to use it for training:

```python
step_train = TrainingStep(
    name="Nab3Train",
    estimator=bt_train,
    inputs={
        "train": TrainingInput(
            step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
            distribution="FullyReplicated",
            content_type="application/x-recordio",
            s3_data_type="AugmentedManifestFile",
            attribute_names=["source", "label"],
            input_mode="Pipe",
            record_wrapping="RecordIO",
        ),
        "validation": TrainingInput(
            step_process.properties.ProcessingOutputConfig.Outputs["validation"].S3Output.S3Uri,
            distribution="FullyReplicated",
            content_type="application/x-recordio",
            s3_data_type="AugmentedManifestFile",
            attribute_names=["source", "label"],
            input_mode="Pipe",
            record_wrapping="RecordIO",
        ),
    },
)
```

And I'm getting the following error:

```
'FailureReason': 'ClientError: Could not download manifest file with S3 URL "s3://sagemaker-us-east-1-xxxxxxxxxx/Nab3Process-xxxxxxxxxx/output/train". Please ensure that the bucket exists in the selected region (us-east-1), that the manifest file exists at that S3 URL, and that the role "arn:aws:iam::xxxxxxxxxx:role/service-role/AmazonSageMakerServiceCatalogProductsUseRole" has "s3:GetObject" permissions on the manifest file. Error message from S3: The specified key does not exist.'
```

What should I do?
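One thing worth checking, offered as a guess: the error URL points at the output *prefix*, not at a manifest *file*. With `AugmentedManifestFile`, the S3Uri must end at the manifest object itself, so a common pattern is to join the step property with the file name at pipeline execution time. A sketch, where the manifest file name is hypothetical:

```python
from sagemaker.workflow.functions import Join

# Append the manifest file name to the processing output prefix;
# "train.manifest" is a placeholder for whatever the preprocessing
# script actually writes.
train_manifest_uri = Join(
    on="/",
    values=[
        step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri,
        "train.manifest",
    ],
)
```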
0 answers · 0 votes · 2 views · Muhammad Badawy · asked 2 months ago

Can we/How to provide an array as contextual metadata on GetRecommendations?

This is a follow-up question to this [Question](https://repost.aws/questions/QUeFSnzs3sRAiSRpJFo5GquA/how-to-correctly-get-the-recommendation-if-the-user-data-or-item-data-change-where-the-previous-interactions-were-based-on-the-old-data). I am storing contextual metadata on interactions whose value is an array, so it is stored as a |-separated string. The data stores the user's licenses at the time of interaction. Users may buy new licenses at any time. Say the definition of the contextual metadata on the interaction is:

```
{
    "name": "LICENSES",
    "type": "string",
    "categorical": true
}
```

And the data can be "L1", "L1|L3", or "L1|L2|L3", etc., where L1, L2, L3 are different licenses. Users may have any combination of them. Now I want to get recommendations using GetRecommendations based on that context data. I would like to pass the licenses the user currently has in the context. Can I pass the |-separated string and get the desired result? For example, a user has licenses L1 and L2:

```
context = {
    'LICENSES': 'L1|L2'
}
```

Now my question is, can we pass context like this? And will it work? I want to show items based on the licenses users have. The documentation never explicitly says that we can provide such categorical data in context. All the examples use a single string, not a |-separated array of strings. So I am wondering if I can pass such a |-separated array of strings in the context and have it work. One might suggest filtering the items based on license, but we want to show items without filtering in our case.
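For concreteness, the call in question would look something like this boto3 sketch; the ARN and user ID are placeholders, and whether Personalize treats the |-separated value as multiple categorical values or as one opaque string is exactly what is being asked:

```python
import boto3

personalize_runtime = boto3.client("personalize-runtime")

# Context values are passed as plain strings.
response = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/my-campaign",  # placeholder
    userId="user-42",  # placeholder
    context={"LICENSES": "L1|L2"},
)
for item in response["itemList"]:
    print(item["itemId"])
```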
1 answer · 0 votes · 3 views · AWS-User-6949763 · asked 2 months ago

Unable to resolve "Learn Python on AWS Workshop" error involving looping over JSON

I haven't been able to resolve the following error, which keeps showing up at the end of the [Looping over JSON](https://catalog.us-east-1.prod.workshops.aws/workshops/3d705026-9edc-40e8-b353-bdabb116c89c/en-US/loops/lab-6/step-2#looping-over-dictionaries-and-json) portion of Lab 6 in the "Learn Python on AWS Workshop" module:

```
error: the following arguments are required: --file
```

Following the directions throughout the lab, I created the JSON file called `translate_input.json` and copied in the list of dictionaries. Then I created a new Python file called `lab_6_step_2_loops.py`, typed in the text as directed, and ran the program with the terminal command `python lab_6_step_2_loops.py --file translate_input.json`, after which the above error appears. I reached out to some coworkers who were also working on this, but none of them have gotten back to me yet. Additionally, I've gone over all of the previous labs of this Python workshop to see what (if anything) I missed, read numerous explanations, and watched several tutorials on YouTube regarding argparse and json. All of this was helpful but danced around the general issue without actually helping me resolve it; it leads me to think the issue is related to the first section of the [code](https://catalog.us-east-1.prod.workshops.aws/workshops/3d705026-9edc-40e8-b353-bdabb116c89c/en-US/loops/lab-6/step-2#looping-over-dictionaries-and-json):

```
parser = argparse.ArgumentParser(description="Provides translation between one source language and another of the same set of languages.")
parser.add_argument(
    '--file',
    dest='filename',
    help="The path to the input file. The file should be valid json",
    required=True)
```

Leaving this code as is, inputting the file name, file path, destination, or some combination keeps bringing up the same error as above. When I follow the directions in the link and literally just copy and paste the text (no typing), I still get this error. Any thoughts?
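As a sanity check, here is a minimal self-contained version of the argparse-plus-JSON pattern the lab uses; if this runs with the same command line, the issue is likely in how the original script reaches `parse_args()` (the loop body here is illustrative):

```python
import argparse
import json

parser = argparse.ArgumentParser(
    description="Provides translation between one source language and another of the same set of languages."
)
parser.add_argument(
    "--file",
    dest="filename",
    help="The path to the input file. The file should be valid json",
    required=True,
)
args = parser.parse_args()  # argparse only sees --file if this line is reached

with open(args.filename) as f:
    records = json.load(f)

# Loop over the list of dictionaries, as in the lab
for record in records:
    print(record)
```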
1 answer · 0 votes · 5 views · paugoede · asked 3 months ago

SageMaker - All metrics in statistics.json by Model Quality Monitor are "0.0 +/- 0.0", but confusion matrix is built correctly for multi-class classification!!

I have scheduled an hourly model-quality-monitoring job in AWS SageMaker. Both the jobs, ground-truth-merge and model-quality-monitoring, complete successfully without any errors. But all the metrics calculated by the job are "0.0 +/- 0.0", while the confusion matrix gets calculated as expected. I have done everything as mentioned in [this notebook for model-quality-monitoring from sagemaker-examples](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker_model_monitor/model_quality/model_quality_churn_sdk.ipynb), with very few changes, and they are:

1. I have changed the model from XGBoost churn to a model trained on my data.
2. My input to the endpoint was CSV as in the example notebook, but the output was JSON.
3. I have changed the problem type from BinaryClassification to MulticlassClassification wherever necessary.

The confusion matrix was built successfully, but all metrics are 0 for some reason. I would like the monitoring job to calculate the multi-class classification metrics on the data properly.

**All Logs**

Here's the `statistics.json` file that the model-quality monitor saved to S3, with the confusion matrix built but 0s in all the metrics:

```
{
  "version" : 0.0,
  "dataset" : {
    "item_count" : 4432,
    "start_time" : "2022-02-23T03:00:00Z",
    "end_time" : "2022-02-23T04:00:00Z",
    "evaluation_time" : "2022-02-23T04:13:20.193Z"
  },
  "multiclass_classification_metrics" : {
    "confusion_matrix" : {
      "0" : { "0" : 709, "2" : 530, "1" : 247 },
      "2" : { "0" : 718, "2" : 497, "1" : 265 },
      "1" : { "0" : 700, "2" : 509, "1" : 257 }
    },
    "accuracy" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_recall" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_precision" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_f0_5" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_f1" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "weighted_f2" : { "value" : 0.0, "standard_deviation" : 0.0 },
    "accuracy_best_constant_classifier" : { "value" : 0.3352888086642599, "standard_deviation" : 0.003252410977346705 },
    "weighted_recall_best_constant_classifier" : { "value" : 0.3352888086642599, "standard_deviation" : 0.003252410977346705 },
    "weighted_precision_best_constant_classifier" : { "value" : 0.1124185852154987, "standard_deviation" : 0.0021869336610830254 },
    "weighted_f0_5_best_constant_classifier" : { "value" : 0.12965524348784485, "standard_deviation" : 0.0024239410000317335 },
    "weighted_f1_best_constant_classifier" : { "value" : 0.16838092925822584, "standard_deviation" : 0.0028615098045768348 },
    "weighted_f2_best_constant_classifier" : { "value" : 0.24009212108475822, "standard_deviation" : 0.003326031863819311 }
  }
}
```

Here's what a couple of lines of the captured data look like (*prettified for readability; in the actual file each record is a single line with no tab spaces*):

```
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "0,1,628,210,30",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "application/json",
      "mode": "OUTPUT",
      "data": "{\"label\":\"Transfer\",\"prediction\":2,\"probabilities\":[0.228256680901919,0.0,0.7717433190980809]}\n",
      "encoding": "JSON"
    }
  },
  "eventMetadata": {
    "eventId": "a7cfba60-39ee-4796-bd85-343dcadef024",
    "inferenceId": "5875",
    "inferenceTime": "2022-02-23T04:12:51Z"
  },
  "eventVersion": "0"
}
{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "0,3,628,286,240",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "application/json",
      "mode": "OUTPUT",
      "data": "{\"label\":\"Adoption\",\"prediction\":0,\"probabilities\":[0.99,0.005,0.005]}\n",
      "encoding": "JSON"
    }
  },
  "eventMetadata": {
    "eventId": "7391ac1e-6d27-4f84-a9ad-9fbd6130498a",
    "inferenceId": "5876",
    "inferenceTime": "2022-02-23T04:12:51Z"
  },
  "eventVersion": "0"
}
```

Here's what a couple of lines from the ground truth that I uploaded to S3 look like (*prettified for readability; in the actual file each record is a single line*):

```
{
  "groundTruthData": {
    "data": "0",
    "encoding": "CSV"
  },
  "eventMetadata": {
    "eventId": "1"
  },
  "eventVersion": "0"
}
{
  "groundTruthData": {
    "data": "1",
    "encoding": "CSV"
  },
  "eventMetadata": {
    "eventId": "2"
  },
  "eventVersion": "0"
}
```

Here's what a couple of lines from the ground-truth-merged file look like (*prettified for readability; in the actual file each record is a single line*). This file is created by the ground-truth-merge job, which is one of the two jobs that the model-quality-monitoring schedule runs:

```
{
  "eventVersion": "0",
  "groundTruthData": {
    "data": "2",
    "encoding": "CSV"
  },
  "captureData": {
    "endpointInput": {
      "data": "1,2,1050,37,1095",
      "encoding": "CSV",
      "mode": "INPUT",
      "observedContentType": "text/csv"
    },
    "endpointOutput": {
      "data": "{\"label\":\"Return_to_owner\",\"prediction\":1,\"probabilities\":[0.14512373737373732,0.6597074314574313,0.1951688311688311]}\n",
      "encoding": "JSON",
      "mode": "OUTPUT",
      "observedContentType": "application/json"
    }
  },
  "eventMetadata": {
    "eventId": "c9e21f63-05f0-4dec-8f95-b8a1fa3483c1",
    "inferenceId": "4432",
    "inferenceTime": "2022-02-23T04:00:00Z"
  }
}
{
  "eventVersion": "0",
  "groundTruthData": {
    "data": "1",
    "encoding": "CSV"
  },
  "captureData": {
    "endpointInput": {
      "data": "0,2,628,5,90",
      "encoding": "CSV",
      "mode": "INPUT",
      "observedContentType": "text/csv"
    },
    "endpointOutput": {
      "data": "{\"label\":\"Adoption\",\"prediction\":0,\"probabilities\":[0.7029623691085284,0.0,0.29703763089147156]}\n",
      "encoding": "JSON",
      "mode": "OUTPUT",
      "observedContentType": "application/json"
    }
  },
  "eventMetadata": {
    "eventId": "5f1afc30-2ffd-42cf-8f4b-df97f1c86cb1",
    "inferenceId": "4433",
    "inferenceTime": "2022-02-23T04:00:01Z"
  }
}
```

Since the confusion matrix was constructed properly, I presume that I fed the data to sagemaker-model-monitor the right way. But why are all the metrics 0.0, while the confusion matrix looks as expected? EDIT 1: Logs for the job are available [here](https://controlc.com/1e1781d2).
0 answers · 1 vote · 5 views · Naveen Reddy Marthala · asked 3 months ago

Where to get started as a C/C++/C# developer?

Hello everyone, I've been a software and game developer for well over a decade, and I feel very at home with C, C++ and C#/.NET. I've done a lot of programming with DirectX SDKs and Unity, as well as desktop and mobile development. However, I don't really know much at all about the web and networking. But now I'm becoming very interested in what kinds of things I can do with AWS in game development, as well as with blockchain, AI/ML and cloud computing power. There seem to be more AWS services and packages than I can count, though, and I'm really not sure which ones I should be pursuing and trying to learn more about and which ones are irrelevant to me and my job. The list of services, SDKs and packages is as impressive and inspiring as it is overwhelming and confusing! I'd like to be able to deploy some .NET applications to AWS to provide remote APIs for games and applications. To start with and get the hang of it, I just want to make a little minigame that lives on a server that a Unity or Unreal game can interact with through HTTP requests, and I'll incrementally add some features to it as I start to get the hang of it. I'd also like to do some experimentation with blockchain software and services and see what kinds of interesting things can be accomplished with it. And beyond that, I'd eventually like to get into using AI/ML to accomplish goals in games and apps, harness cloud-based CPU/GPU processing power for heavy lifting, and even set up some real-time multiplayer game servers. I just don't know what tools/services/packages I need to dive into and start figuring out. I've found [the Visual Studio 2022 AWS package here](https://marketplace.visualstudio.com/items?itemName=AmazonWebServices.AWSToolkitforVisualStudio2022&ssr=false#overview), as well as [this AWS GitHub page](https://github.com/aws/dotnet) for .NET. I've also heard about Lambda. I really have no idea which ones are relevant to me or where I should start with this. To begin with, I want to run a .NET application on a server that applications can query and interact with, and have it store and supply data for them, offer some remote APIs and requests, etc. I'd also potentially be interested in running some native C/C++ modules on servers. Can someone point me in the right direction and tell me what I need to set up and dive into first? I'm doing this in my free time when I'm not working on projects for work, so my time is very limited and researching and learning the wrong things would be a big setback for me. Any help or guidance is greatly appreciated!
1 answer · 0 votes · 7 views · atc · asked 3 months ago

LexRuntimeV2::LexRuntimeV2Client::StartConversationAsync not generating events

We're trying to get streaming working with the C++ LexRuntimeV2Client. I'm making a successful call to "PutSession", then attempting to start the stream by calling "StartConversationAsync". The code looks like this:

```cpp
LexRuntimeV2::Model::StartConversationHandler convoHandler = LexRuntimeV2::Model::StartConversationHandler();
convoHandler.SetTextResponseEventCallback(textEvent);
convoHandler.SetIntentResultEventCallback(intentEvent);
convoHandler.SetOnErrorCallback(errorEvent);
convoHandler.SetHeartbeatEventCallback(heartbeatEvent);

startConvo.WithBotId(BotId)
    .WithBotAliasId(BotAliasId)
    .WithLocaleId(Locale)
    .WithSessionId(ConvoSession)
    .WithEventStreamHandler(convoHandler)
    .WithConversationMode(Aws::LexRuntimeV2::Model::ConversationMode::TEXT);

client.StartConversationAsync(startConvo, readyHandler, responseHandler);
```

I'm getting a callback to "readyHandler", but none of the other callbacks are firing. I've read some sources that suggest the issue is the WinHttp client, but it's not clear how to switch to curl on Windows to test. Currently testing this on Windows, but the target deployment will be a Linux Docker container. Documentation is very light for this API, so I may well be calling it incorrectly. I can successfully call "RecognizeText" on the bot and get proper results, so I know the client and bot are configured properly. Any help with how to make this call would be appreciated.
0 answers · 0 votes · 6 views · Ryan_Motive · asked 3 months ago

Unsupported pytorch version 1.10.0 with SM Elastic Inference Accelerators

Hi Team, Greetings!! We are not able to deploy to a real-time endpoint with Elastic Inference accelerators. Could you please have a look? SageMaker version: 2.76.0

Code:

```python
from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role

endpoint_name = 'ner-bert'

model = PyTorchModel(entry_point='deploy_ei.py',
                     source_dir='code',
                     model_data=model_data,
                     role=get_execution_role(),
                     framework_version='1.10.0',
                     py_version='py38')

predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.m5.xlarge',
                         accelerator_type='ml.eia2.medium',
                         endpoint_name=endpoint_name)
```

Error details:

```
Unsupported pytorch version: 1.10.0. You may need to upgrade your SDK version (pip install -U sagemaker) for newer pytorch versions. Supported pytorch version(s): 1.3.1, 1.5.1, 1.3, 1.5.
```

Note: We are able to deploy without an Elastic Inference accelerator with the above code, and we want to use Python 3.8 because we have some dependency libraries which support only Python 3.8. I looked at "Available DL containers" at https://github.com/aws/deep-learning-containers/blob/master/available_images.md, and according to the section "SageMaker Framework Containers (SM support only)", SageMaker supports PyTorch 1.10.0 with Python 3.8. But we would like to deploy on Elastic Inference, and according to the "Elastic Inference Containers" section of the same page, EI containers support only PyTorch 1.5.1 with Python 3.6. Why are these containers so outdated? What could be the solution? Can we specify the latest version of Python in the requirements.txt file and get it installed? Thanks, Vinayak
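For reference, a deploy call that stays inside the versions the error message lists would look like this sketch; whether the model and the entry-point script actually work under PyTorch 1.5.1 and the older Python of the EI containers is a separate question:

```python
from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorchModel

# Pin to the newest PyTorch version the EI containers advertise support for.
model = PyTorchModel(entry_point='deploy_ei.py',
                     source_dir='code',
                     model_data=model_data,  # from the question's notebook
                     role=get_execution_role(),
                     framework_version='1.5.1',
                     py_version='py3')

predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.m5.xlarge',
                         accelerator_type='ml.eia2.medium',
                         endpoint_name='ner-bert-ei')
```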
0 answers · 0 votes · 2 views · Vinayak Shanawad · asked 3 months ago

Host a fine-tuned BERT Multilingual model on SageMaker with Serverless inference

Hi All, Good day!! The key point to note here is that we have a pre-processing script for the text document and the deserialization required for prediction, and then a post-processing script for generating NER entities. I went through the SageMaker material and decided to try the following options.

1. Option 1: Bring our own model, write an inference script, and deploy it on a SM real-time endpoint using the PyTorch container. I went through Suman's video (https://www.youtube.com/watch?v=D9Qo5OpG4p8), which is really good; I need to try it with our pre-processing and post-processing scripts and see whether it works or not.
2. Option 2: Bring our own model, write an inference script, and deploy it on a SM real-time endpoint using the Huggingface container. I went through the Huggingface docs (https://huggingface.co/docs/sagemaker/inference#deploy-a-%F0%9F%A4%97-transformers-model-trained-in-sagemaker), but there is no reference for how to use our own pre- and post-processing scripts to set up the inference pipeline. If you know any examples of using your own pre- and post-processing scripts with the Huggingface container, please share them.
3. Option 3: Bring our own model, write an inference script, and deploy it on SM Serverless Inference using the Huggingface container. I went through Julien's video (https://www.youtube.com/watch?v=cUhDLoBH80o&list=PLJgojBtbsuc0E1JcQheqgHUUThahGXLJT&index=35), which is excellent, but he has not shown how to use our own pre- and post-processing scripts with the Huggingface container. Please share if you know any examples (see the sketch after this question for the handler shape this would take).

Could you please help? Thanks, Vinayak
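Since options 2 and 3 both hinge on custom pre/post-processing with the Huggingface container, here is a rough sketch of the override mechanism that container supports: a `code/inference.py` in the model archive whose handler functions replace the defaults. The model-loading details are illustrative, and `preprocess`/`postprocess` are hypothetical helpers standing in for the asker's scripts:

```python
# code/inference.py -- picked up by the SageMaker Huggingface Inference Toolkit
from transformers import pipeline


def model_fn(model_dir):
    """Load the fine-tuned model from the unpacked model.tar.gz."""
    return pipeline("ner", model=model_dir, tokenizer=model_dir)


def transform_fn(model, input_data, content_type, accept):
    """Custom pre-processing, prediction, and post-processing in one hook."""
    text = preprocess(input_data)    # hypothetical pre-processing helper
    entities = model(text)
    return postprocess(entities)     # hypothetical post-processing helper
```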
1 answer · 0 votes · 4 views · Vinayak Shanawad · asked 3 months ago