
Unanswered Questions tagged with Amazon Machine Images (AMI)



Deploying a Random Forest Model on Amazon SageMaker: always getting an UnexpectedStatusException with Reason: AlgorithmError

Hey, I am trying to deploy my RandomForest classifier on Amazon SageMaker but get an UnexpectedStatusException, even though the script worked fine before. Run locally, the script prints the confusion matrix and accuracy as expected. When I try to deploy the model to Amazon SageMaker using the same script, it does not work.

>>! python script.py --n-estimators 100 \
    --max_depth 2 \
    --model-dir ./ \
    --train ./ \
    --test ./ \

Confusion Matrix:
[[13  8]
 [ 1 17]]
Accuracy: 0.7692307692307693

I used the SKLearn estimator from the SageMaker Python SDK:

>>from sagemaker.sklearn.estimator import SKLearn
>>sklearn_estimator = SKLearn(
    entry_point='script.py',
    role = get_execution_role(),
    instance_count=1,
    instance_type='ml.m4.xlarge',
    framework_version='0.20.0',
    base_job_name='rf-scikit')

I launched the training job as follows:

>>sklearn_estimator.fit({'train':trainpath, 'test': testpath}, wait=False)

Here I wait for the job and read the model artifact before deploying, which leads to the UnexpectedStatusException that I cannot seem to fix:

>>sklearn_estimator.latest_training_job.wait(logs='None')
>>artifact = m_boto3.describe_training_job(
    TrainingJobName=sklearn_estimator.latest_training_job.name)['ModelArtifacts']['S3ModelArtifacts']
>>print('Model artifact persisted at ' + artifact)

>>2022-08-25 12:03:27 Starting - Starting the training job....
>>2022-08-25 12:03:52 Starting - Preparing the instances for training............
>>2022-08-25 12:04:55 Downloading - Downloading input data......
>>2022-08-25 12:05:31 Training - Downloading the training image.........
>>2022-08-25 12:06:22 Training - Training image download completed. Training in progress..
>>2022-08-25 12:06:32 Uploading - Uploading generated training model.
>>2022-08-25 12:06:43 Failed - Training job failed

---------------------------------------------------------------------------
UnexpectedStatusException                 Traceback (most recent call last)
<ipython-input-37-628f942a78d3> in <module>
----> 1 sklearn_estimator.latest_training_job.wait(logs='None')
      2 artifact = m_boto3.describe_training_job(
      3     TrainingJobName=sklearn_estimator.latest_training_job.name)['ModelArtifacts']['S3ModelArtifacts']
      4
      5 print('Model artifact persisted at ' + artifact)

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in wait(self, logs)
   2109             self.sagemaker_session.logs_for_job(self.job_name, wait=True, log_type=logs)
   2110         else:
-> 2111             self.sagemaker_session.wait_for_job(self.job_name)
   2112
   2113     def describe(self):

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in wait_for_job(self, job, poll)
   3226             lambda last_desc: _train_done(self.sagemaker_client, job, last_desc), None, poll
   3227         )
-> 3228         self._check_job_status(job, desc, "TrainingJobStatus")
   3229         return desc
   3230

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in _check_job_status(self, job, desc, status_key_name)
   3390                 message=message,
   3391                 allowed_statuses=["Completed", "Stopped"],
-> 3392                 actual_status=status,
   3393             )
   3394

UnexpectedStatusException: Error for Training job rf-scikit-2022-08-25-12-03-25-931: Failed. Reason: AlgorithmError: framework error:
Traceback (most recent call last):
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_trainer.py", line 84, in train
    entrypoint()
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 39, in main
    train(environment.Environment())
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/training.py", line 35, in train
    runner_type=runner.ProcessRunnerType)
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/entry_point.py", line 100, in run
    wait, capture_error
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/process.py", line 291, in run
    cwd=environment.code_dir,
  File "/miniconda3/lib/python3.7/site-packages/sagemaker_training/process.py", line 208, in check_error
    info=extra_info,
sagemaker_training.errors.ExecuteUserScriptError: ExecuteUserScriptError:
ExitCode 1
ErrorMessage ""
Command "/miniconda3/bin/python script.py"
ExecuteUserScriptErr

I am happy for some help.
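The post reads `['ModelArtifacts']['S3ModelArtifacts']` straight from `describe_training_job`, but `wait()` raises before that line runs when the job has failed. A minimal sketch of a more defensive lookup follows, assuming only the `describe_training_job` response keys `TrainingJobStatus`, `FailureReason` and `ModelArtifacts`. The helper name `summarize_training_job` and the stub client are hypothetical and exist only so the example runs without AWS credentials; this is not a diagnosis of the failure above.

```python
def summarize_training_job(client, job_name):
    """Return status, failure reason and artifact location for a training job.

    `client` is assumed to be a boto3 SageMaker client, for example
    boto3.client("sagemaker"); any object exposing a compatible
    describe_training_job method works.
    """
    desc = client.describe_training_job(TrainingJobName=job_name)
    return {
        "status": desc["TrainingJobStatus"],
        # FailureReason is expected when the job has failed; .get avoids a
        # KeyError for jobs that completed normally.
        "failure_reason": desc.get("FailureReason"),
        # .get chain so a missing key yields None instead of raising.
        "artifact": desc.get("ModelArtifacts", {}).get("S3ModelArtifacts"),
    }


class _StubClient:
    """Hypothetical stand-in for a SageMaker client (illustration only)."""

    def describe_training_job(self, TrainingJobName):
        return {
            "TrainingJobName": TrainingJobName,
            "TrainingJobStatus": "Failed",
            "FailureReason": "AlgorithmError: ExecuteUserScriptError: ExitCode 1",
        }


summary = summarize_training_job(_StubClient(), "rf-scikit-example")
print(summary["status"], "-", summary["failure_reason"])
```

With a real client, the call would look like `summarize_training_job(boto3.client("sagemaker"), sklearn_estimator.latest_training_job.name)`. Because `ErrorMessage` is empty in the failure reason, the script's own output is more likely to be found in the job's CloudWatch log stream (log group `/aws/sagemaker/TrainingJobs`) than in this summary.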
0 answers, 0 votes, 11 views, asked a month ago

MikroTik CHR server connection lost

I have created a server using the MikroTik-published AMI for their CHR software. I keep losing connection to the server entirely: no Winbox, no SSH, no console connection from the Instances page. Each time, I have to spin up a new server and rebuild my work. No other AWS server on our account (mostly Ubuntu AMIs) has had this issue, so I assume I am missing something about the CHR AMI that is causing it.

I am attempting to set up a VPN using OpenVPN to connect the field devices my employer creates. A previous VPN project ran last year; that server was up for nearly the full year and we could still connect to it, until I removed the PPTP setup and replaced it with the MikroTik built-in OpenVPN server.

Here is the config export for the CHR.

```
# mar/31/2022 17:55:47 by RouterOS 6.44.3
# software id =
#
#
#
/interface bridge
add arp=local-proxy-arp fast-forward=no name=afads priority=0x8192 \
    transmit-hold-count=1
/interface ethernet
set [ find default-name=ether1 ] disable-running-check=no
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=afadpool ranges=10.8.0.1-10.8.127.255
/ppp profile
set *0 bridge=afads change-tcp-mss=default local-address=10.8.0.1 only-one=\
    yes use-encryption=yes
add bridge=afads local-address=10.8.0.1 name=SmartFlaggerL3 only-one=yes \
    remote-address=afadpool use-encryption=yes
/interface bridge port
add bridge=afads hw=no interface=ether1
add bridge=afads interface=*F005C9
add bridge=afads interface=*F004E9
add bridge=afads interface=dynamic
/interface ovpn-server server
set auth=sha1 certificate=[ServerCertName] cipher=aes256 default-profile=\
    SmartFlaggerL3 enabled=yes keepalive-timeout=30 netmask=17
/ip firewall address-list
add address=10.8.40.1 list=undeployed
[Removed approx 4000 lines, similar to the one above]
/ip firewall filter
add action=accept chain=forward comment=\
    "Allows units in the Test group to communicate." dst-address-list=test \
    src-address-list=test
add action=accept chain=forward comment=\
    "Allows all traffic from Internal Trusted Servers to units." \
    dst-address-list=!InternalTrustedServers src-address=0.0.0.0 \
    src-address-list=InternalTrustedServers
add action=accept chain=forward comment=\
    "Allows all traffic from units to Internal Trusted Servers." \
    dst-address-list=InternalTrustedServers
add action=accept chain=forward comment="Test of unit to unit communication" \
    disabled=yes dst-address-list=test src-address-list=test
add action=accept chain=forward comment=\
    "Accept Forward for Established and Related Connections" \
    connection-state=established,related,untracked
add action=accept chain=forward comment="Allow Forwarding by OVPN Clients" \
    src-address=192.168.22.128/25
add action=accept chain=input comment=\
    "Accept Input for Established and Related Connections" connection-state=\
    established,related,untracked
add action=accept chain=input comment="Allow OpenVPN Connection" dst-port=\
    1194 protocol=tcp
add action=accept chain=input comment="Allow Input by OVPN Clients" \
    in-interface=all-ppp
add action=accept chain=input comment="Allow Winbox Input" dst-port=8291 \
    protocol=tcp
add action=accept chain=input comment="Allow HTTPS Input" dst-port=443 \
    protocol=tcp
add action=drop chain=input comment="Input drop for all other connection" \
    disabled=yes
add action=drop chain=forward comment="Forward drop for all other connection" \
    disabled=yes
add action=drop chain=forward comment="Invalid drop for all other connection" \
    connection-state=invalid disabled=yes
add action=drop chain=forward comment="PREVENT ALL TALK BETWEEN UNITS." \
    disabled=yes src-address=!10.8.0.5
/ip firewall nat
add action=masquerade chain=srcnat out-interface=all-ppp
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set api-ssl disabled=yes
/ppp secret
add name=AFD0001 password=[Redacted] profile=SmartFlaggerL3 remote-address=\
    10.8.80.1 service=ovpn
[Removed nearly 4000 lines, similar to the one above]
/system identity
set name=[AWS instance auto-generated name]
/system logging
add topics=ovpn
add topics=debug
```
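The export mixes several address ranges (the `afadpool` range, `netmask=17` on the OpenVPN server, and a `192.168.22.128/25` forward rule), and a quick arithmetic check shows how they relate. The following is a sketch using Python's standard `ipaddress` module with values copied from the export. It only verifies subnet membership and makes no claim about the cause of the lost connection; the `checks` dictionary and its labels are illustrative.

```python
import ipaddress

# Values copied from the config export.
pool_first = ipaddress.ip_address("10.8.0.1")      # afadpool start
pool_last = ipaddress.ip_address("10.8.127.255")   # afadpool end
local_address = ipaddress.ip_address("10.8.0.1")   # ppp profile local-address
ovpn_netmask = 17                                  # ovpn-server netmask

# strict=False derives the enclosing network from a host address.
vpn_net = ipaddress.ip_network(f"{local_address}/{ovpn_netmask}", strict=False)

checks = {
    "pool start inside VPN net": pool_first in vpn_net,
    "pool end inside VPN net": pool_last in vpn_net,
    "pool start equals local-address": pool_first == local_address,
    "address-list 10.8.40.1 inside VPN net":
        ipaddress.ip_address("10.8.40.1") in vpn_net,
    "ppp secret 10.8.80.1 inside VPN net":
        ipaddress.ip_address("10.8.80.1") in vpn_net,
    "forward rule 192.168.22.128/25 overlaps VPN net":
        ipaddress.ip_network("192.168.22.128/25").overlaps(vpn_net),
}

print("VPN network:", vpn_net)
for label, result in checks.items():
    print(f"{label}: {result}")
```

The same pattern can be extended to the thousands of elided `address-list` and `ppp secret` entries by reading their addresses from the full export.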
0 answers, 0 votes, 19 views, asked 6 months ago