Agent running yet OEM shows unreachable/metric collection error…what gives?

One of the most confusing (and frustrating) things about Oracle Enterprise Manager is figuring out why agents stop uploading from time to time.  This issue was worse in previous versions; in OEM12c the upload problems have been somewhat corrected and do not seem to be that big of an issue.  What I have found to be more of an issue in OEM12c is an agent being reported as unreachable.

An unreachable agent can be caused by almost anything: a firewall blocking the required upload port, invalid DNS entries, hostname configuration problems, or other network-related issues.  In most of these cases, except for agent configuration issues, the DBA doesn’t have the access needed to resolve them and requires the assistance of other departments within IT.  Sometimes these external resources are not available to help troubleshoot and resolve network issues; for example, if you are configuring OEM12c at home, as I have.
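When DNS is not in play, as in a home lab, hostname problems usually come down to what is in the hosts file. The sketch below is one way to check that a name is actually mapped there. The function name `hosts_has_entry` and the sample host names are my own illustrations, not anything from OEM.

```shell
# Hypothetical helper (not part of OEM): succeed if a hosts-format file
# maps the given name, either as the canonical name or as an alias.
# Usage: hosts_has_entry <hostname> [hosts_file]
hosts_has_entry() {
  name="$1"
  file="${2:-/etc/hosts}"
  # Drop comments, then compare the name against every column after the IP.
  awk -v n="$name" '
    { sub(/#.*/, "") }
    { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
    END { exit found ? 0 : 1 }
  ' "$file"
}

# Example (assumed host name):
#   hosts_has_entry dbhost.example.com && echo "mapped in /etc/hosts"
```

Running it on both the OMS host and the agent host, with each side's name for the other, is a cheap way to rule out the hostname cause before digging any deeper.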

The problem I recently ran into while adding an agent to a local server without DNS was that the agent would install and run, yet the OMS couldn’t sync with it due to metric collection errors.  After the install the agent was up and running, yet OEM still wouldn’t recognize it.  This prompted me to uninstall the agent and push it again.  From the OEM12c management server, I was able to push the agent and add the target successfully through the Add Host Targets wizard.  Although the host target was added successfully, its status in OEM12c still showed “Unreachable” due to a Metric Collection Error.  Why was this target unreachable when I had just added the agent and had no problems with the push from OEM12c?

I was perplexed, to say the least; however, in searching MOS I came across a note that helped resolve this issue.  The note number, for reference, is 1440682.1.  It outlines symptoms similar to what I was seeing and provides a workable solution.  What I found interesting is that the note attributes this issue to an unpublished bug.  The note also gives examples of the notification messages that OEM may send.

The incorrect message that may be received via notification emails is:

Message=Agent is Unreachable (REASON = unable to connect to the agent at https://hostname.domain:3872/emd/main/ [Connection timed out]). Host is unreachable (REASON = Unknown Error pinging the host of URL https://hostname.domain:3872/emd/main/.1)
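If you want to test the endpoint named in a message like this, it helps to pull the host and port out first. The snippet below is a minimal sketch that assumes the URL has the `https://host:port/emd/main/` shape shown above. `agent_endpoint` is a name I made up for illustration.

```shell
# Hypothetical helper: print "<host> <port>" for the agent URL found in a
# notification message, or nothing if no such URL is present.
agent_endpoint() {
  printf '%s\n' "$1" |
    sed -n 's|.*https://\([^/:]*\):\([0-9]*\)/emd/main/.*|\1 \2|p' |
    head -n 1
}

# Example with the placeholder host from the message above:
#   agent_endpoint "$message"    # prints: hostname.domain 3872
```

With the host and port in hand you can try the same connection yourself from the OMS server, which tells you whether the timeout is real or only reported.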

As outlined in the MOS note, the workaround for this issue is to check to see if the ping property is set.  If so, then it needs to be disabled to allow for the target host status to be changed.  The steps below will assist in resolving this issue:

On the OMS, check the property via emctl:

./emctl get property -name oracle.sysman.core.omsAgentComm.ping.pingCommand

If it returns this status:

Emdrep.ping.pingCommand=%EM_PING_COMMAND%

then this setting is the reason for the invalid agent status and metric collection error in OEM12c.
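If you are checking more than one OMS, the same test can be scripted against the captured output. This is only a sketch: `ping_property_is_placeholder` is a hypothetical helper, and it simply looks for the `%EM_PING_COMMAND%` text shown above rather than relying on any exact output format.

```shell
# Hypothetical helper: succeed when captured "emctl get property" output
# still contains the %EM_PING_COMMAND% placeholder.
ping_property_is_placeholder() {
  case "$1" in
    *%EM_PING_COMMAND%*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (emctl may prompt for credentials while the output is captured):
#   out=$(./emctl get property -name oracle.sysman.core.omsAgentComm.ping.pingCommand)
#   ping_property_is_placeholder "$out" && echo "ping property needs to be removed"
```

A result reporting the property as null would not match, so in that case the cause lies elsewhere.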

Disable the ping command:

./emctl delete property -name "oracle.sysman.core.omsAgentComm.ping.pingCommand"

With this property removed, the OMS falls back to an alternative method (getPingCmdForOS) that pings targets successfully.

Now stop and restart the OMS:

./emctl stop oms

./emctl start oms
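For anyone who wants to rehearse the sequence before touching a live OMS, here is a small wrapper with a dry-run switch. It is a sketch, not a supported procedure: `EMCTL`, `DRY_RUN`, `run` and `reset_ping_property` are my own names, and only the emctl subcommands themselves come from the steps above.

```shell
# Illustrative wrapper around the workaround steps. EMCTL and DRY_RUN are
# variables invented for this sketch, not emctl features.
EMCTL="${EMCTL:-./emctl}"
PROP="oracle.sysman.core.omsAgentComm.ping.pingCommand"

run() {
  # With DRY_RUN=1, print the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

reset_ping_property() {
  # Stop at the first failing step so the OMS is not bounced needlessly.
  run "$EMCTL" delete property -name "$PROP" &&
  run "$EMCTL" stop oms &&
  run "$EMCTL" start oms
}

# Rehearsal:  DRY_RUN=1; reset_ping_property
```

In a real run, emctl may prompt for credentials on the delete step, so this is meant to be run interactively from the OMS bin directory.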

Lastly, ensure that the agent is started and that its status in OEM shows as up.  (This may take a few minutes because of agent uploads.)

Let me know what you think about this resolution to an interesting problem.

3 comments

  1. Richard

    Not sure if you can assist but I’m having the same problem with a Windows 12c CC not connecting to a Redhat 6 Linux x64 OS. When I run the command you mentioned on the Windows server, I get the following:

    D:\oracle\product\12.1.0\middleware\oms\BIN>emctl get property –name oracle.sysman.core.omsAgentComm.ping.pingCommand
    Oracle Enterprise Manager Cloud Control 12c Release 3
    Copyright (c) 1996, 2013 Oracle Corporation. All rights reserved.
    Found unrecognized argument: ûname
    Invalid arguments!

    Any Ideas?

    1. Richard

      I was able to see the information by using .\emctl (windows go figure); however it said it was set to null (Value for property oracle.sysman.core.omsAgentComm.ping.pingCommand for oms All Management Servers is null). Still unable to get the Linux host agents to send information to the Windows EMCC.

      1. Glad you found the answer to your problem.
