Oracle Enterprise Manager 12c is a great monitoring tool for the enterprise. I think I’ve said that more than once over the last two years; however, with every release small, simple things change, and it is always the small things that will get you. I had set up monitoring for a client using monitoring templates within OEM12c; everything was being monitored, or so I thought! I got a call from my client asking why nobody was alerted when the Fast Recovery Area (FRA) filled up with archive logs. My initial response was that it should have alerted and that I would look into what happened.
Naturally, the first place I started was with the monitoring template (Enterprise –> Monitoring –> Monitoring Templates –>View/Edit desired template) to check and make sure that the Archive Area Used (%) metric is set.
The monitoring template for the database instances had the Archive Area Used (%) metric, and it was set to email a warning at 80% full and a critical alert at 90% full. Why were the emails not triggered? The template had been applied to all database instances.
The easiest way to find out what this metric is supposed to do is to look at the reference documentation on supported metrics. This particular metric is listed under Database Instance. In reading the description of the Archive Area Used (%) metric, I found a note that leads directly to what the issue was.
As the note says, if the database is using the Fast Recovery Area (FRA) for archive logs, then the metrics associated with archive logs do not apply. The metric Recovery Area Free Space (%) has to be used to monitor the Fast Recovery Area. OK, simple enough; let’s just add the metric to the template.
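To make the idea of a free space percentage concrete, here is a minimal Python sketch of how such a figure could be derived from the numbers the database exposes in V$RECOVERY_FILE_DEST. The function name is hypothetical, and treating reclaimable space as free is an assumption of this sketch; it is not taken from how OEM itself calculates Recovery Area Free Space (%).

```python
def recovery_area_free_pct(space_limit, space_used, space_reclaimable=0):
    """Return free FRA space as a percentage of the configured limit.

    The arguments mirror the SPACE_LIMIT, SPACE_USED and SPACE_RECLAIMABLE
    columns of V$RECOVERY_FILE_DEST (all in bytes). Counting reclaimable
    space as free is an assumption of this sketch.
    """
    if space_limit <= 0:
        raise ValueError("space_limit must be positive")
    free_bytes = space_limit - space_used + space_reclaimable
    return round(100.0 * free_bytes / space_limit, 2)
```

For example, a 10 GB recovery area with 9 GB used and nothing reclaimable would come out at 10% free under this formula.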
When trying to add Recovery Area Free Space (%) to the template using the Database Instance target type, there are no metrics for the Fast Recovery Area (image shows a partial list of metric categories). Where are the Fast Recovery Area metrics?
Again, I go back to the reference guide and look up the Fast Recovery metrics. Section 5.34 of the reference guide has a good bit of information on the metrics related to the Fast Recovery Area, but no definitive answers on where these metrics are stored or how to add them to a template.
At this point, what do I know?
- Archive Area Used (%) cannot be used to monitor the Fast Recovery Area.
- I know which metrics are needed to monitor the Fast Recovery Area, but I cannot find them to add them to a template.
Maybe “All Metrics” under a database target would shed some light on the situation.
To access “All Metrics” for a database instance, follow Targets –> Databases –> Database Instance. Once I was at the database instance I wanted to look at, I went to Oracle Database –> Monitoring –> All Metrics.
Once in “All Metrics”, I can see every metric that is associated with an Oracle Database Instance. At the top of the metric tree, there is a search box for finding a metric. When I search for “Fast”, I find all the Fast Recovery metrics.
Great, I found all the metrics that I want related to Fast Recovery Area. Now how do I get them into a template so I can set thresholds for monitoring? Back to the template (Enterprise –> Monitoring –> Monitoring Templates).
When I edit the template, I notice (and have always noticed) the tabs at the top: General, Metric Thresholds, Other Collected Items, Access. Normally, I’m only worried about the metrics on the Metric Thresholds tab; since I haven’t had any luck adding the metrics I wanted, let’s take a look at the “Other Collected Items” tab.
Scrolling down through the “Other Collected Items” tab, I find the Fast Recovery category for metrics.
Apparently, the Fast Recovery metrics are already part of the template. So how do the metrics on the “Other Collected Items” tab work, and how are they alerted against? Again, back to the documentation.
This time, I needed to look up templates in the documentation. Section 8.2 of the Oracle Enterprise Manager Cloud Control Administrator’s Guide had the answer I needed. Here is why the Fast Recovery Area metrics are not configurable with thresholds:
Oracle has made all the metrics related to the Fast Recovery Area non-metric! That is right, OEM is gathering the information but not allowing you to alert on it with thresholds! The items are part of the template, so the template will gather the information, but in the end I would need to go to “All Metrics” to see the results.
If you want to monitor the Fast Recovery Area and have thresholds against metrics, the solution is to use Metric Extensions. Metric Extensions allow the end user to create custom metrics for monitoring. Once a Metric Extension is created, it will be seen in “All Metrics” and can then be added to a monitoring template with thresholds assigned.
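As a rough illustration of what such a custom metric might boil down to, the Python sketch below pairs a candidate query against V$RECOVERY_FILE_DEST with a threshold check. Everything here is an assumption for illustration: the query text and its `free_pct` alias, the `severity` function, and the 20% and 10% free thresholds (chosen only to mirror the 80% and 90% used thresholds mentioned earlier). It does not show how Metric Extensions are actually defined or evaluated inside OEM.

```python
# Hypothetical query a SQL-based custom metric could run. The column names
# come from V$RECOVERY_FILE_DEST; the alias and the formula are assumptions.
FRA_FREE_PCT_SQL = """
SELECT name,
       ROUND((space_limit - space_used + space_reclaimable)
             / space_limit * 100, 2) AS free_pct
  FROM v$recovery_file_dest
 WHERE space_limit > 0
"""

# Assumed thresholds on free space, mirroring 80% / 90% used.
WARNING_FREE_PCT = 20.0
CRITICAL_FREE_PCT = 10.0


def severity(free_pct, warning=WARNING_FREE_PCT, critical=CRITICAL_FREE_PCT):
    """Classify a free-space percentage; lower values are more severe."""
    if free_pct <= critical:
        return "CRITICAL"
    if free_pct <= warning:
        return "WARNING"
    return "CLEAR"
```

Note that the comparison is inverted relative to a "used" metric: because the value measures free space, an alert fires when it drops to or below a threshold rather than when it rises above one.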
Instead of going into how to develop Metric Extensions in this post, I have provided links below to some really great posts on how to implement and use Metric Extensions. I have also provided a link to a similar post by Courtney Llamas of Oracle, which shows how the metric extensions are set up.
Almost everyone now is using the Fast Recovery Area to store their backups and archive logs. Monitoring of this area is critical; however, out of the box, Oracle Enterprise Manager 12c needs to be adjusted to monitor the Fast Recovery Area with the correct metrics. This slight change in metric monitoring came as a surprise compared with previous editions of OEM. In the end, OEM is still a good monitoring tool for the enterprise; we just need to make some small adjustments now.
Friendly Oracle Employees – Pete Sharman (would say find him on twitter as well but he doesn’t tweet)
Oracle Enterprise Manager 12c Documentation (http://docs.oracle.com/cd/E24628_01/index.htm)