How can I troubleshoot common errors when using the OEM_AGENT option with Amazon RDS for Oracle?


I have an Amazon Relational Database Service (Amazon RDS) DB instance that is running Oracle. I receive an error when I try to use the OEM_AGENT option with my DB instance. How can I troubleshoot and resolve common errors when using the OEM_AGENT option?

Short description

Amazon RDS supports the Oracle Enterprise Manager (OEM) Management Agent through use of the OEM_AGENT option. You might receive one of the errors described in this article when using the OEM_AGENT option with your Amazon RDS for Oracle DB instance.

Note: Before proceeding with the troubleshooting steps, run the following pre-checks on your DB instance:

1.    After you add the OEM_AGENT option to the option group and apply the option group to the DB instance, check the option group status by running the following AWS Command Line Interface (AWS CLI) command:

aws rds describe-db-instances --db-instance-identifier <db-instance-name> --query 'DBInstances[*].[Engine,DBInstanceStatus,OptionGroupMemberships]'

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.

Verify the output to be sure that the option group status is in-sync. If the option group status is in-sync, then you get the following output:

[
    [
        "oracle-ee",
        "available",
        [
            {
                "OptionGroupName": "custom-oracle-option",
                "Status": "in-sync"
            }
        ]
    ]
]

If the option group status is INVALID, then the OEM_AGENT option wasn't installed successfully because of issues with the network configuration or other prerequisites.
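The status check can also be scripted. The following is a minimal sketch, assuming that the AWS CLI is installed and configured with credentials and a default Region. The function names option_group_status and option_group_in_sync are hypothetical, and the query is only a narrower form of the one shown earlier.

```shell
# Print the option group status for one DB instance, for example "in-sync".
# Assumes the AWS CLI is installed and configured.
option_group_status() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].OptionGroupMemberships[*].Status' \
    --output text
}

# Succeed only when the reported status is exactly "in-sync".
option_group_in_sync() {
  [ "$(option_group_status "$1")" = "in-sync" ]
}

# Usage (replace the placeholder with your DB instance identifier):
#   option_group_in_sync my-db-instance && echo "Option group is in sync"
```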

You can verify the installation status of the OEM_AGENT option by reviewing the Events section of your Amazon RDS DB instance from the AWS Management Console. You can also use the AWS CLI command describe-events.
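As a rough illustration of the describe-events approach, the sketch below lists the last 24 hours of events for a DB instance and keeps the ones that mention OEM_AGENT. The function names are hypothetical, and filtering on the text "OEM_AGENT" is an assumption about how the event messages are worded, so review the unfiltered output if nothing matches.

```shell
# List events from the last 24 hours for one DB instance.
# The --duration value is given in minutes.
recent_db_events() {
  aws rds describe-events \
    --source-type db-instance \
    --source-identifier "$1" \
    --duration 1440 \
    --query 'Events[*].[Date,Message]' \
    --output text
}

# Keep only the events that mention the OEM_AGENT option (case-insensitive).
oem_agent_events() {
  recent_db_events "$1" | grep -i 'OEM_AGENT'
}

# Usage:
#   oem_agent_events my-db-instance
```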

If the events indicate an issue with the installation of the OEM_AGENT option or with the network configuration, then be sure that the following prerequisites are met.

2.    Check your DB instance's network configuration. The security group of your DB instance must allow inbound traffic from the OMS host on the OEM_AGENT port (default is 3872) and the database listener port (default is 1521).

3.    Run a telnet test from the Oracle Management Service (OMS) server to your DB instance on the OEM agent port and database port to check connectivity.

4.    Check your network configurations, including network access control lists (ACLs) and route tables. Verifying the configurations rules out the possibility of blockers or an explicit deny.

5.    Make sure that the firewall between OMS and RDS allows traffic on both the DB listener port and OEM Agent port.
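If telnet isn't available on the OMS server, the connectivity checks above can be approximated with bash alone. This is a sketch under stated assumptions: it relies on bash's /dev/tcp redirection and the coreutils timeout command being present, and the function names check_port and check_rds_ports are hypothetical.

```shell
# Return success if a TCP connection to host:port opens within 5 seconds.
# Uses bash's /dev/tcp redirection, so the probe is run through bash explicitly.
check_port() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Check both ports that the OMS server needs to reach on the DB instance.
# Defaults follow this article: 3872 (agent) and 1521 (database listener).
check_rds_ports() {
  local endpoint="$1" agent_port="${2:-3872}" db_port="${3:-1521}"
  local port rc=0
  for port in "$agent_port" "$db_port"; do
    if check_port "$endpoint" "$port"; then
      echo "open: ${endpoint}:${port}"
    else
      echo "unreachable: ${endpoint}:${port}"
      rc=1
    fi
  done
  return "$rc"
}

# Usage, run from the OMS server:
#   check_rds_ports <RDS-instance-endpoint>
```

A failed check doesn't say which layer blocked the traffic, so the security group, network ACL, route table, and firewall still need to be reviewed individually.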

After you complete these checks, check if OEM_AGENT upload is working. For more information on OEM_AGENT prerequisites, see Oracle Management Agent for Enterprise Manager Cloud Control.

Resolution

To troubleshoot issues when using the OEM_AGENT option with your RDS for Oracle instance, review the OEM agent logs after exporting these logs to Amazon CloudWatch. For more information, see Publishing Oracle logs to Amazon CloudWatch Logs.
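The sketch below shows one way to turn on the log export and follow the log from the AWS CLI. It assumes that the RDS log type for the agent is named oemagent and that RDS publishes it to a log group named /aws/rds/instance/<db-instance-name>/oemagent; confirm both against Publishing Oracle logs to Amazon CloudWatch Logs. The function names are hypothetical, and aws logs tail requires AWS CLI version 2.

```shell
# Enable publishing of the OEM agent log to CloudWatch Logs.
enable_oemagent_log_export() {
  aws rds modify-db-instance \
    --db-instance-identifier "$1" \
    --cloudwatch-logs-export-configuration '{"EnableLogTypes":["oemagent"]}' \
    --apply-immediately
}

# Print the log group name that RDS is assumed to use for the OEM agent log.
oemagent_log_group() {
  printf '/aws/rds/instance/%s/oemagent\n' "$1"
}

# Follow new OEM agent log events (AWS CLI version 2 only).
tail_oemagent_log() {
  aws logs tail "$(oemagent_log_group "$1")" --follow
}

# Usage:
#   enable_oemagent_log_export my-db-instance
#   tail_oemagent_log my-db-instance
```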

The DB instance doesn't show up in the Auto Discovery of targets on the OEM console

This issue occurs because the OMS server can't connect to the underlying host of the DB instance over SSH. This connection is a prerequisite at the OS level for Auto Discovery to work correctly, and Amazon RDS doesn't provide that host access. Instead of using the wizard-based Auto Discovery to add a target Oracle DB instance, you must manually add your Oracle DB instance as the target. For more information on limitations for the Management Agent, see Limitations for Management Agent.

Error: Unable to install the Oracle OEM_AGENT because the agent password is incorrect or expired

Make sure that the agent password is correct and that it's not expired. On the OEM server, you can modify the existing agent registration password, or create a new password.

Error: Unable to install the Oracle OEM_AGENT because the DB instance cannot reach the OMS host

You receive this error when the OEM_AGENT fails to install because the OMS host/port can't be reached from the RDS host. To troubleshoot this error, check whether the OMS host can be reached from your DB instance.

To validate the network connectivity between the OMS server and OEM_AGENT, test the connection from the RDS for Oracle instance to the OMS server. To do so, you can use network access control lists (ACLs) together with the UTL_TCP package:

  • Use the DBMS_NETWORK_ACL_ADMIN package, which provides the interface to administer the network ACLs.
  • Use UTL_TCP.CONNECTION, a PL/SQL record type that represents a TCP/IP connection.
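The two packages above can be combined into a short test script. The following sketch is a shell function that prints PL/SQL for you to run in SQL*Plus or another client. It assumes that your database user is allowed to run DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE and UTL_TCP; the function name oms_connect_test_sql, the host name, and the user name are hypothetical placeholders, and none of the PL/SQL calls appear in this article.

```shell
# Emit PL/SQL that grants the "connect" privilege for the OMS host and port
# to a database user, then tries to open a TCP connection with UTL_TCP.
# Arguments: OMS host, OMS port, database user name (uppercase).
oms_connect_test_sql() {
  local oms_host="$1" oms_port="$2" db_user="$3"
  cat <<SQL
SET SERVEROUTPUT ON
BEGIN
  DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(
    host       => '${oms_host}',
    lower_port => ${oms_port},
    upper_port => ${oms_port},
    ace        => xs\$ace_type(
                    privilege_list => xs\$name_list('connect'),
                    principal_name => '${db_user}',
                    principal_type => xs_acl.ptype_db));
END;
/
DECLARE
  c UTL_TCP.CONNECTION;
BEGIN
  c := UTL_TCP.OPEN_CONNECTION(remote_host => '${oms_host}',
                               remote_port => ${oms_port});
  DBMS_OUTPUT.PUT_LINE('Connected to ${oms_host}:${oms_port}');
  UTL_TCP.CLOSE_CONNECTION(c);
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('Connection failed: ' || SQLERRM);
END;
/
SQL
}

# Usage (connection string is a placeholder):
#   oms_connect_test_sql oms.example.com 4903 ADMIN | sqlplus -s admin@<connect-string>
```

Printing the script instead of running it keeps the database credentials out of the function and lets you review the ACL change before applying it.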

You can also do the following:

1.    Launch an Amazon Elastic Compute Cloud (Amazon EC2) instance with the same network setup (SG/ACL) as your DB instance.

2.    Run a telnet command from the Amazon EC2 instance to the OMS host on port 4903:

telnet OMS_HOST 4903

3.    Validate connectivity in the other direction by running a telnet test from the OMS server to your DB instance:

telnet RDS-instance-endpoint 1521 (RDS default port)

4.    Check if the RDS host is able to resolve the OMS hostname into an IP address:

SQL> SELECT UTL_INADDR.get_host_address('OMS_Host_Name') FROM dual;

5.    Run a TCP Traceroute to check where the traffic is blocked.
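For the TCP traceroute in step 5, one option on Linux is the traceroute utility's TCP SYN mode. The sketch below assumes a Linux traceroute that supports the -T and -p flags and that you run it with root privileges from the Amazon EC2 instance in step 1; the function name tcp_trace is hypothetical, and the default port matches the telnet example above.

```shell
# Trace the TCP path to host:port with TCP SYN probes (-T) to the given port (-p).
# Usually needs root privileges.
tcp_trace() {
  local host="$1" port="${2:-4903}"
  traceroute -T -p "$port" "$host"
}

# Usage:
#   tcp_trace OMS_HOST 4903
```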

Error: You successfully installed the OEM_AGENT option on your DB instance. Your security group might not be configured correctly

Even if your installation completes correctly, the RDS security group associated with your DB instance might be missing configurations to allow communication between the OMS host and DB instance.

To resolve this error, verify that the security group of the agent allows inbound traffic, and that your OMS host belongs to a security group that has access to the agent port. For more information, see Oracle Management Agent for Enterprise Manager Cloud Control.
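If the inbound rule is missing, it can be added from the AWS CLI. This is a minimal sketch, assuming that the OMS host runs in the same VPC and has its own security group; the function name allow_oms_to_agent and the security group IDs are hypothetical placeholders. If the OMS host is outside the VPC, a CIDR-based rule would be needed instead.

```shell
# Allow inbound TCP traffic on the agent port (default 3872) to the
# DB instance's security group from the OMS host's security group.
allow_oms_to_agent() {
  local db_sg="$1" oms_sg="$2" agent_port="${3:-3872}"
  aws ec2 authorize-security-group-ingress \
    --group-id "$db_sg" \
    --protocol tcp \
    --port "$agent_port" \
    --source-group "$oms_sg"
}

# Usage (placeholder security group IDs):
#   allow_oms_to_agent sg-0123456789abcdef0 sg-0fedcba9876543210
```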

Error: Unable to install the Oracle OEM_AGENT because your OMS host version x.x.x.x is not compatible with the agent version x.x.x.x.

You receive this error when your OEM_AGENT version and your OMS host version are incompatible. OEM_AGENT integrates with the OMS only when both components are at compatible versions. To resolve this error, choose compatible versions for both the OMS host and the OEM agent. For information on reviewing the compatibility matrix, see Prerequisites for Management Agent.

Error: Your OMS host is using an untrusted third-party certificate

You receive this error when you successfully install the OEM_AGENT option, but your OMS host is using a third-party certificate that isn't trusted. Configure your OMS host with the required trusted certificates from your third-party provider.

Error: OEM_AGENT option is missing required option settings (Service: AmazonRDS; Status Code: 400; Error Code: InvalidParameterValue;

You receive this error when OEM_AGENT is missing one of the required settings, and you need to specify this setting. For more information on the required settings for OEM_AGENT, see Option settings for Management Agent.

Error: Heartbeat Status: OMS responded illegally [ERROR - Failed to Update Target Type Metadata]

You receive this error when the OMS host is replaced after the OEM_AGENT option is attached to Amazon RDS.

1.    Clear the agent state, or restart the OEM_AGENT using the steps in Performing database tasks with the Management Agent.

2.    Re-establish your connection with the OMS host.

3.    Check for compatibility issues with the OMS version and OEM_AGENT version. Run the following query to check if the table lists the OEM_AGENT version used in the option group:

select type_meta_ver from sysman.mgmt_target_type_versions where target_type = 'oracle_emd';

4.    If the mgmt_target_type_versions output doesn't contain the OEM_AGENT version used in the option group, install the OEM_AGENT version that is listed in the command output.

This error might also indicate that the required OMS side patches and plugins are missing. Be sure that OMS is set up correctly and all the required patches are applied.

If the agent is blocked, do the following in the OEM console to resync the agent:

  1. Sign in to the Cloud Control console.
  2. Choose Setup, choose Manage Cloud Control, and then choose Agents.
  3. Choose the agent that you want to resync.
  4. From the Agent dropdown list, choose Resynchronization....
  5. Select Unblock agent on successful completion of agent resynchronization.
  6. Choose Continue.
    The resync operation is submitted as a job.
  7. Check the resynchronization job's status by choosing the job name's link.

After the job is completed successfully, verify the status of the agent that you resynchronized and all monitored targets.

Note: The DNS server must function continuously for OEM monitoring to work effectively. The agent emits a heartbeat and pushes status updates to the OMS host. If the OMS host isn't reachable from the agent for an extended period of time, then OMS might consider the agent and the database to be down. Therefore, be sure that the DNS server is functioning.

To make the Oracle Management Agent upload to the OMS that it's associated with, run the following query. Running this query is equivalent to running the emctl upload agent command.

SELECT rdsadmin.rdsadmin_oem_agent_tasks.upload_oem_agent() as TASK_ID from DUAL;

To restart the OEM agent after clearing the agent state, run the following query:

SELECT rdsadmin.rdsadmin_oem_agent_tasks.restart_oem_agent() as TASK_ID from DUAL;
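Each of these queries returns a task ID instead of the task's result. As a sketch, the helper below prints a query that reads the task's log file. It assumes that RDS writes the output to a file named dbtask-<task-id>.log in the BDUMP directory and that rdsadmin.rds_file_util.read_text_file is available to your user; verify both in Performing database tasks with the Management Agent. The function name oem_task_log_sql is hypothetical.

```shell
# Emit SQL that reads the log file for one rdsadmin OEM agent task.
# Argument: the TASK_ID value returned by the upload or restart query.
oem_task_log_sql() {
  printf "SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('BDUMP','dbtask-%s.log'));\n" "$1"
}

# Usage (connection string is a placeholder):
#   oem_task_log_sql <task-id> | sqlplus -s admin@<connect-string>
```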

With Oracle Enterprise Manager Cloud Control 13c Release 4 (13.4.0.0.0), you can deploy only Oracle Management Agent 13c Release 4 (13.4.0.0.0). Fresh deployment of older Oracle Management Agent versions (13.2 and 13.3) isn't supported after OMS is upgraded to 13c Release 4. For more information, see Before you begin installing an Enterprise Manager System.

Error: Unable to install the OEM_AGENT option because the agent port conflicts with the OMS port. Update the option settings and try again

You receive this error when the OEM_AGENT option settings are misconfigured. For example, you might have specified the same port number for both the OMS port and the OEM_AGENT port. To resolve this error, change either the OMS port or the OEM_AGENT port number.

Review the following Management Agent option settings:

  • AGENT_PORT - This port on the DB instance listens for the OMS host. The default is 3872. Your OMS host must belong to a security group that has access to this port.
  • OMS_PORT - This HTTPS port on the OMS Host listens for the Management Agent.
    To find the HTTPS upload port, connect to the OMS host and run the following command:
emctl status oms -details
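To review the settings that are currently stored in the option group, you can query it from the AWS CLI. The sketch below assumes that the option group contains the OEM_AGENT option and that the agent port is reported in the option's Port field rather than in its settings list; the function names are hypothetical.

```shell
# Print the Name/Value pairs of the OEM_AGENT option settings in an option group.
oem_agent_option_settings() {
  aws rds describe-option-groups \
    --option-group-name "$1" \
    --query "OptionGroupsList[0].Options[?OptionName=='OEM_AGENT'].OptionSettings[].[Name,Value]" \
    --output text
}

# Print the port configured for the OEM_AGENT option itself.
oem_agent_option_port() {
  aws rds describe-option-groups \
    --option-group-name "$1" \
    --query "OptionGroupsList[0].Options[?OptionName=='OEM_AGENT'].Port" \
    --output text
}

# Usage:
#   oem_agent_option_settings custom-oracle-option
#   oem_agent_option_port custom-oracle-option
```

Comparing the port from the second function with the OMS_PORT value from the first is a quick way to spot the conflict described in this section.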

Error: Unable to install the Oracle OEM_AGENT because your DB instance does not have enough storage. Confirm that option group is supported on your DB instance class and configuration. If so, verify all option group settings and retry

You receive this error when your DB instance doesn't have enough available storage to meet the OEM_AGENT prerequisites. For more information, see Prerequisites for Management Agent. Increase the storage space, and then reinstall the OEM_AGENT option.

Error: Filesystem / has X.XX% available space

You receive this error because of a limitation to using the OEM_AGENT option with an RDS for Oracle instance. Host metrics and the process list might not reflect the actual system state. Therefore, avoid using OEM to monitor the root file system or mount point file system. For more information, see Limitations for Management Agent.

The root file system of an Amazon RDS instance is maintained by the internal automation system. This automation system monitors the root file system at regular intervals to make sure that the file system has adequate space. If insufficient storage is detected, the automation system adds adequate space to the root file system. Because the automation system manages the space in the root file system, you can ignore this error.


Related information

Oracle Management Agent for Enterprise Manager Cloud Control

Prerequisites for Management Agent

Performing database tasks with the Management Agent

AWS OFFICIAL. Updated 2 years ago