2 Services
EVA provides two management-protocol-independent services: the basic Event and Alarm service and the Log Control service. The basic Event and Alarm service provides clients with an API for registering and sending events and alarms. The Log Control service provides a mechanism for controlling generic logs. Also included is a specialization of the generic log function for logging events and alarms.
Each service provides client functions that can be used from applications in the system to, for example, send alarms. There is also an API that management applications can use to monitor and control the system. This API can be extended for specific management protocols, such as SNMP or CORBA.
2.1 Basic Event and Alarm Service
This service contains functions for the client API to EVA. EVA is a distributed global application, which means that clients can access the EVA functionality from any node.
Clients can register and send events and alarms. Management applications can subscribe to events and alarms, and control how they are treated.
An event is a notification sent from the NE to a management application. An event is uniquely identified by its name. A special form of an event is an alarm. An alarm represents a fault in the system that needs to be reported to the manager. An example of an alarm could be equipment_on_fire.
When an alarm is sent, it becomes active and is stored in an active alarm list. When the application that sent the alarm notices that the fault that caused the alarm is no longer present, it clears the alarm. When an alarm is cleared, it is deleted from the active alarm list, and a clear_alarm event is generated by EVA. Each fault may give rise to several alarms, possibly with different severities. However, there can only be one active alarm for each fault at any time. For example, two alarms may be associated with disk space usage, disk_80_percent_filled and disk_90_percent_filled. These two alarms represent the same fault, but only one of them can be active at a time. An active alarm is identified by its fault_id. In contrast to alarms, ordinary events do not represent faults and are not stored in the active alarm list.
The basic EVA server is a global server to which all events and alarms are sent. The server updates its tables (e.g. the active alarm list), and sends the event or alarm to the
alarm_handler process that runs on the same node as the global server. alarm_handler is a gen_event process defined in the SASL application.
Before a client can send an event or alarm, the name of the event must be registered in EVA. To register an event, a client calls register_event/2. The parameters of this function are the name of the event and whether the event should be logged by default. A manager can change this value later. To register an alarm, a client calls register_alarm/4. The parameters of this function are the name and logging parameter, as for events, and the class and default severity of the alarm.
EVA stores the definitions of events and alarms in the Mnesia tables eventTable and alarmTable respectively. Since an alarm is a special form of an event, each alarm is present in both of these tables. The active alarm list is stored in the Mnesia table alarm. The records for all these tables are defined in the header file eva.hrl, available in the include directory in the distribution.
2.1.1 Event Definition Table
All registered events are stored in the eventTable. It has the following attributes:
name
log
generated
The event is uniquely identified by its name, which is an atom.
The log attribute is a boolean flag that tells whether this event should be stored in some log when it is generated. This attribute is writable.
The generated attribute is a counter that counts how many times the event has been generated.
2.1.2 Alarm Definition Table
The alarmTable extends the eventTable, and has the following attributes:
name
class
severity
The alarm is uniquely identified by its name, which is an atom. Note that each alarm is present in the eventTable as well.
The class attribute categorizes the alarm, and is defined when the alarm is registered. The classes are as defined in X.733, ITU Alarm Reporting Function:
communications. An alarm of this class is principally associated with the procedures or processes required to convey information from one point to another.
qos. An alarm of this class is principally associated with a degradation in the quality of service.
processing. An alarm of this class is principally associated with a software or processing fault.
equipment. An alarm of this class is principally associated with an equipment fault.
environmental. An alarm of this class is principally associated with a condition relating to an enclosure in which the equipment resides.
The severity parameter defines five severity levels, which provide an indication of how it is perceived that the capability of the managed object has been affected. Those severity levels which represent service affecting conditions, ordered from most severe to least severe, are critical, major, minor and warning. The levels used are as defined in X.733, ITU Alarm Reporting Function:
indeterminate. The Indeterminate severity level indicates that the severity level cannot be determined.
critical. The Critical severity level indicates that a service affecting condition has occurred and an immediate corrective action is required. Such a severity can be reported, for example, when a managed object becomes totally out of service and its capability must be restored.
major. The Major severity level indicates that a service affecting condition has developed and an urgent corrective action is required. Such a severity can be reported, for example, when there is a severe degradation in the capability of the managed object and its full capability must be restored.
minor. The Minor severity level indicates the existence of a non-service affecting fault condition and that corrective action should be taken in order to prevent a more serious (for example, service affecting) fault. Such a severity can be reported, for example, when the detected alarm condition is not currently degrading the capacity of the managed object.
warning. The Warning severity level indicates the detection of a potential or impending service affecting fault, before any significant effects have been felt. Action should be taken to further diagnose (if necessary) and correct the problem in order to prevent it from becoming a more serious service affecting fault.
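The ordering of the severity levels listed above can be sketched in a few lines of Erlang. This is only an illustration and not part of EVA: the module name severity_sketch and its functions are hypothetical, and the numeric ranks are an arbitrary encoding of the order critical, major, minor, warning. Because indeterminate has no place in that order, the sketch deliberately gives it no rank.

```erlang
-module(severity_sketch).
-export([rank/1, more_severe/2, worst/1]).

%% Hypothetical numeric rank for the four ordered levels; a higher
%% number means a more severe level. indeterminate is intentionally
%% not handled, so calling rank(indeterminate) fails with
%% function_clause instead of returning a misleading value.
rank(critical) -> 4;
rank(major)    -> 3;
rank(minor)    -> 2;
rank(warning)  -> 1.

%% true if severity A is strictly more severe than severity B.
more_severe(A, B) ->
    rank(A) > rank(B).

%% Most severe level in a non-empty list of ordered severities.
worst([First | Rest]) ->
    lists:foldl(fun(S, Acc) ->
                        case more_severe(S, Acc) of
                            true  -> S;
                            false -> Acc
                        end
                end, First, Rest).
```

A management application could use a helper of this kind to decide, for example, which of several alarms to display first.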
When an alarm is cleared, a clear_alarm event is generated. This event clears the alarm with the fault_id contained in the event. It is not required that the clearing of previously reported alarms is reported. Therefore, a managing system cannot assume that the absence of a clear_alarm event for a fault means that the condition that caused the generation of previous alarms is still present. Managed object definers shall state if, and under which conditions, the clear_alarm event is used.
2.1.3 Active Alarm List
The active alarm list is stored in the ordered Mnesia table alarm. The corresponding record is sent to the alarm_handler when an alarm is sent. It has the following read-only attributes:
index
fault_id
name
sender
cause
severity
time
extra
A row in the active alarm list is uniquely identified by its fault_id. However, to make the table ordered, the alarms use the integer index as the key into the table. For each new alarm, EVA allocates a new index that is greater than the index of all other active alarms.
The name is the name of the corresponding alarm type, defined in alarmTable.
sender is a term that uniquely identifies the resource that generated the alarm.
cause describes the probable cause of the alarm.
severity is the perceived severity of the alarm.
time is the UTC time the alarm was generated.
extra is any extra information describing the alarm.
2.1.4 Event
When an event is generated, the event record is sent to alarm_handler. It has the following attributes:
name
sender
time
extra
The name is the name of the corresponding event type, defined in eventTable.
sender is a term that uniquely identifies the resource that generated the event.
time is the UTC time the event was generated.
extra is any extra information describing the event.
2.1.5 Example
As an example of how to register and send events and alarms, consider the following code:
    %%%-----------------------------------------------------------------
    %%% Resource code
    %%%-----------------------------------------------------------------
    reg() ->
        eva:register_event(boardRemoved, true),
        eva:register_event(boardInserted, false),
        eva:register_alarm(boardFailure, true, equipment, minor).

    remove_board(No) ->
        eva:send_event(boardRemoved, {board, No}, []).

    insert_board(No, BoardName, BoardType) ->
        eva:send_event(boardInserted, {board, No}, {BoardName, BoardType}).

    board_on_fire(No) ->
        FaultId = eva:get_fault_id(),
        %% Cause = fire, ExtraParams = []
        eva:send_alarm(boardFailure, FaultId, {board, No}, fire, []),
        FaultId.
Two events and one alarm are defined. Board removal is an event that is logged by default, and board insertion is an event that is not logged by default. The alarm boardFailure is a minor alarm that is logged by default.
When the application detects that board N is on fire, board_on_fire(N) is called. This function is responsible for sending the alarm. It gets a new fault identifier for the fault and calls eva:send_alarm/5, pointing out the faulty board (N) and suggesting that the probable cause of the equipment trouble is fire.
The board_on_fire function returns the fault identifier for the new alarm. This fault identifier can be used at a later time in a call to eva:clear_alarm(FaultId) to clear the alarm.
2.2 Log Control Service
The Log Control service contains functions for monitoring logs, and functions for transferring logs to remote hosts, e.g. management stations. The main purpose of the Log Control service is to provide one entity through which all logs in the system can be controlled by a management station. Regardless of the type of log, all logs are controlled in a similar fashion.
Clients can register their logs in the log server. Management applications can control the logs, and transfer the logs to a remote host.
2.2.1 Log Monitoring
This service uses a log server that monitors all logs in the system. Each log uses the standard module disk_log for the actual logging.
Each log has an administrative and an operational status, both of which can be either up or down. If the operational status is up, the log is working, and if it is down, the log does not work. The administrative status is writable, and reflects the desired operational status. Normally they are both the same. If the administrative status is set to up, the operational status will be up as well. However, if the log for some reason does not work, e.g. if the disk partition is full, the operational status will be down. When the operational status is down, no events are logged in the log.
2.2.1.1 Alarms
The Tlog service defines two EVA alarms: log_file_error and log_wrap_too_often.
log_file_error. This alarm is generated if a file error occurs when an item is logged. Default severity is critical. The cause for this alarm can be any Reason as returned from file:write in case of error. The alarm is cleared if the file system starts working again. For example, the alarm can be generated if the partition is full, and cleared when space is available.
log_wrap_too_often. This alarm is generated when the log wraps more often than the wrap time. Default severity is major. The cause for this alarm is undefined. The alarm is cleared if the log wraps within the wrap time, the next time it wraps.
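The relation between the administrative status, the operational status and a file error such as the one behind log_file_error can be summarized as a small decision function. The following is a minimal sketch under stated assumptions, not the log server's implementation: the module log_status_sketch, its functions, and the boolean LogWorking argument (standing in for "the underlying log can currently be written") are all hypothetical.

```erlang
-module(log_status_sketch).
-export([oper_status/2, accepts_items/2]).

%% AdminStatus: up | down, the desired status set by the manager.
%% LogWorking:  true | false, hypothetical flag that is false when,
%%              for example, the disk partition is full.
%% The operational status is up only when the log is both wanted
%% (admin up) and actually working.
oper_status(up, true)   -> up;
oper_status(up, false)  -> down;
oper_status(down, _Any) -> down.

%% Items are stored only while the operational status is up.
accepts_items(AdminStatus, LogWorking) ->
    oper_status(AdminStatus, LogWorking) =:= up.
```

In this model, a full partition corresponds to oper_status(up, false), which yields down: the manager still wants the log to be up, but nothing is logged until the file system works again.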
2.2.1.2 Example
The following is an example of code that creates a log to be controlled by the generic Log Control function:
    start() ->
        disk_log:open([{name, "ex_log"}, {file, "ex_log/ex_log.LOG"},
                       {type, wrap}, {size, {10000, 4}}]),
        log:open("ex_log", ex_log_type, 3600).

    test() ->
        %% Log an item
        disk_log:log("ex_log", {1, "log this"}),
        %% Set the administrative status of the log to 'down'
        log:set_admin_status("ex_log", down),
        %% Try to log - this one won't be logged
        disk_log:log("ex_log", {2, "won't be logged"}),
        Logs1 = log:get_logs(),
        %% Set the administrative status of the log to 'up'
        log:set_admin_status("ex_log", up),
        %% Log an item
        disk_log:log("ex_log", {3, "log this"}),
        Logged = disk_log:chunk("ex_log", start),
        {Logs1, Logged}.
2.2.2 Log Transfer
It is possible to transfer a log to a remote host. When the log is transferred, the log may be filtered, and the log records may be formatted.
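The filter-then-format step of a transfer can be pictured as follows. This is a hedged sketch only: the module log_transfer_sketch and the function prepare/3 are hypothetical and do not belong to the Log Control API, and the sketch assumes that the log records have already been read into a list. It returns iodata that could then be written to a single file.

```erlang
-module(log_transfer_sketch).
-export([prepare/3]).

%% Records: list of log records (any Erlang terms), already read
%%          from the log.
%% Filter:  fun(Record) -> boolean(), selects the records to transfer.
%% Format:  fun(Record) -> iodata(), formats one selected record.
%% Returns the formatted records, in their original order, as iodata.
prepare(Records, Filter, Format) ->
    [Format(R) || R <- Records, Filter(R)].
```

For example, calling prepare/3 with a Format fun such as fun(R) -> io_lib:format("~p~n", [R]) end produces one printed term per line, and a Format fun that returns term_to_binary(R) would correspond to transferring the records without human-readable formatting.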
As the logs are implemented as disk_log logs, each log consists of several log files. When the log is transferred, it is written to one single file on the remote host. When disk_log is used, the log records are normally not formatted when they are stored in the log, in order to increase log performance. However, a manager will probably need the log in a human-readable format. Thus, when the log is being transferred, each log record may be formatted in a log-specific way. Of course, to further increase performance, the log can be transferred as is, leaving it to the manager to format the log off-line.
2.3 EVA Log Service
The EVA log service uses the generic Log Control service to implement log functionality for events and alarms defined in EVA.
In the rest of this description, the term event refers to both events and alarms as defined in EVA.
This log functionality supports logging of events from EVA. It uses the module disk_log for logging of events. There can be several event logs active at the same time. It is possible to create new event logs dynamically, either from within an application, or from a management system. Each log uses a filter function to decide whether an event should be stored in the log or not.
There is a concept of a default log. The default log is used to log any event that has the log flag in eventTable set to true, but that no log is currently able to store (or for which no other log is defined). The usage of the default log is optional.
For example, suppose that we want to define an alarm log that logs all alarms in the system. We can do this with the following code:
    -module(alarm_log).
    %% Assumed location of eva.hrl, which defines the alarm record
    -include_lib("eva/include/eva.hrl").
    -export([alarm_filter/1, make_alarm_log/0]).

    %% Accept alarm records only; reject every other item
    alarm_filter(Item) when is_record(Item, alarm) -> true;
    alarm_filter(_) -> false.

    make_alarm_log() ->
        disk_log:open([{name, "alarm_log"}, {format, internal},
                       {type, wrap}, {size, {10000, 10}}]),
        eva_log:open("alarm_log", {alarm_log, alarm_filter, []}, 36000).
If we set the administrative status of this log to down, and an alarm that should be logged according to its definition in the eventTable is generated, the alarm is stored in the default log instead of "alarm_log" (provided there are no other logs that are defined to log the alarm).
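The fallback to the default log described above can be modelled as a small routing function. The sketch below is an illustration under assumptions, not EVA's implementation: the module default_log_sketch, the function route/3, the atom default_log and the {Name, OperStatus, Filter} tuples are all hypothetical. It also assumes that an event whose log flag is false is not stored in any log.

```erlang
-module(default_log_sketch).
-export([route/3]).

%% Event:   the event or alarm item to be logged.
%% LogFlag: the log attribute of the event in eventTable (boolean).
%% Logs:    list of {Name, OperStatus, Filter}, where OperStatus is
%%          up | down and Filter is fun(Event) -> boolean().
%% Returns the names of the logs that store the event. If the event
%% should be logged but no log can store it, the hypothetical
%% default_log is chosen instead.
route(_Event, false, _Logs) ->
    [];
route(Event, true, Logs) ->
    %% Only logs that are up and whose filter accepts the event count
    Accepting = [Name || {Name, up, Filter} <- Logs, Filter(Event)],
    case Accepting of
        [] -> [default_log];
        _  -> Accepting
    end.
```

With a single alarm log whose status is down, route/3 returns [default_log], which mirrors the situation in the paragraph above. Since the usage of the default log is optional, a real system might drop the event at this point instead.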