
Questions tagged with AWS Deep Learning Containers



SKLearn Processing Container - Error: "WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager."

Hey all, I am trying to run the script below in the writefile titled "vw_aws_a_bijlageprofile.py". This code has worked for me using other data sources, but now I am getting the following error message in the CloudWatch Logs: "***2022-08-24T20:09:19.708-05:00 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv***" Any idea how I get around this error? Full code below. Thank you in advance!

```
%%writefile vw_aws_a_bijlageprofile.py
import os
import sys
import subprocess

def install(package):
    subprocess.check_call([sys.executable, "-q", "-m", "pip", "install", package])

install('awswrangler')
install('tqdm')
install('pandas')
install('botocore')
install('ruamel.yaml')
install('pandas-profiling')

import awswrangler as wr
import pandas as pd
import numpy as np
import datetime as dt
from dateutil.relativedelta import relativedelta
from string import Template
import gc
import boto3
from pandas_profiling import ProfileReport

client = boto3.client('s3')
session = boto3.Session(region_name="eu-west-2")

def run_profile():
    # switch table name below
    query = """
    SELECT * FROM "intl-euro-archmcc-database"."vw_aws_a_bijlage"
    ;
    """
    tableforprofile = wr.athena.read_sql_query(query,
                                               database="intl-euro-archmcc-database",
                                               boto3_session=session,
                                               ctas_approach=False,
                                               workgroup='DataScientists')
    print("read in the table queried above")
    print("got rid of missing and added a new index")

    profile_tblforprofile = ProfileReport(tableforprofile,
                                          title="Pandas Profiling Report",
                                          minimal=True)
    print("Generated table profile")
    return profile_tblforprofile

if __name__ == '__main__':
    profile_tblforprofile = run_profile()
    print("Generated outputs")
    # switch profile name below
    output_path_tblforprofile = '/opt/ml/processing/output/profile_vw_aws_a_bijlage.html'
    print(output_path_tblforprofile)
    profile_tblforprofile.to_file(output_path_tblforprofile)
```

```
import boto3
import sagemaker
from sagemaker.processing import ProcessingInput, ProcessingOutput

session = boto3.Session(region_name="eu-west-2")
bucket = 'intl-euro-uk-datascientist-prod'
prefix = 'Mark'
sm_session = sagemaker.Session(boto_session=session, default_bucket=bucket)
sm_session.upload_data(path='vw_aws_a_bijlageprofile.py',
                       bucket=bucket,
                       key_prefix=f'{prefix}/source')
```

```
import boto3
#import sagemaker
from sagemaker import get_execution_role
from sagemaker.sklearn.processing import SKLearnProcessor

region = boto3.session.Session().region_name
S3_ROOT_PATH = "s3://{}/{}".format(bucket, prefix)
role = get_execution_role()
sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                     role=role,
                                     sagemaker_session=sm_session,
                                     instance_type='ml.m5.24xlarge',
                                     instance_count=1)
```

```
sklearn_processor.run(code='s3://{}/{}/source/vw_aws_a_bijlageprofile.py'.format(bucket, prefix),
                      inputs=[],
                      outputs=[ProcessingOutput(output_name='output',
                                                source='/opt/ml/processing/output',
                                                destination='s3://intl-euro-uk-datascientist-prod/Mark/IODataProfiles/')])
```
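For reference, the pip message quoted above is a warning rather than a hard error, so the job failure itself likely lies elsewhere in the logs. Below is a minimal sketch of one way the question's `install()` helper could be adjusted to keep that warning out of CloudWatch: pip 22.1 and later accept a `--root-user-action=ignore` flag. Whether the SKLearn 0.20.0 container ships a pip that new is an assumption; if its pip is older, upgrading pip first (or simply ignoring the warning) would apply instead.

```
# Sketch only: assumes the container's pip is >= 22.1, which understands
# --root-user-action; if it is older, upgrade pip first or ignore the warning.
import subprocess
import sys

def install(package):
    subprocess.check_call([
        sys.executable, "-m", "pip", "install",
        "--quiet",                      # keep the CloudWatch log terse
        "--disable-pip-version-check",  # skip the "newer pip available" notice
        "--root-user-action=ignore",    # suppress the root-user warning (pip >= 22.1)
        package,
    ])

install("awswrangler")
```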
1 answer · 0 votes · 31 views · asked a month ago

Setting up data for DeepAR, targets and categories for simultaneous data?

I would like to try out DeepAR for an engineering problem that I have some sensor datasets for, but I am unsure how to set it up for ingestion into DeepAR to get a predictive model. The data is essentially the positions, orientations, and a few other time-series sensor readings of an assortment of objects (animals, in this case, actually) over time. The data is both noisy and sometimes missing.

So, in this case, there are N individuals, and for each individual there are Z variables of interest. None of the variables are "static" (color, size, etc.); they are all expected to be time-varying on the same time scale. Ultimately, I would like to predict all Z targets for all N individuals. How do I set up the time series to feed into DeepAR? The premise is that all these individuals are implicitly interacting in the observed space, so all the target values have some interdependence on each other, which is what I would like to see if DeepAR can take into account when making predictions.

Should I be using a category vector of length 2, such that the first cat variable corresponds to the individual and the second corresponds to one of the variables associated with that individual? Then there would be N*Z targets in my input dataset, each with `cat = [ n , z ]`, where there are N distinct values for n and Z distinct values for z?
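For reference, a minimal sketch of the layout the question proposes: one DeepAR series per (individual, variable) pair, written in the JSON Lines training format with a two-element `cat` vector `[n, z]`. The series length, start timestamp, random values, and file name are illustrative assumptions, not a statement of how the real sensor data should be prepared.

```
import json
import numpy as np

N, Z, T = 5, 3, 200                    # individuals, variables, time steps (assumed)
start = "2023-01-01 00:00:00"          # common start timestamp (assumed)

with open("train.json", "w") as f:     # one JSON object per line, one line per series
    for n in range(N):
        for z in range(Z):
            values = np.random.randn(T)                # stand-in for one sensor channel
            values[np.random.rand(T) < 0.05] = np.nan  # simulate missing observations
            series = {
                "start": start,
                # DeepAR accepts null for missing target values in JSON input
                "target": [None if np.isnan(v) else float(v) for v in values],
                "cat": [n, z],         # [individual index, variable index]
            }
            f.write(json.dumps(series) + "\n")
```

With this layout the `cardinality` hyperparameter would be `[N, Z]`; whether the shared model then captures the cross-individual interactions is exactly the open question in the post.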
1 answer · 0 votes · 64 views · asked 7 months ago