Cisco ACI faults life cycle [Explained]
What is a Fault in ACI:
A fault is represented as a stateful and persistent managed object (MO). When a specific condition occurs, such as a component failure or an alarm, the system creates a fault MO as a child object of the MO that is primarily associated with the fault.
For a fault object class, the fault conditions are defined by the fault rules of the parent object class. An MO class can have multiple defined faults, each with its own fault code and fault rule.
For a given fault code, a parent MO instance can have only one fault MO.
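The parent/child relationship and the one-fault-per-code rule can be sketched in a few lines of Python. This is only an illustrative model, not APIC code: the `ManagedObject` class, the `raise_fault` method, the fault codes `F0001` and `F0002`, and the `fault-<code>` naming of the child DN are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedObject:
    """Hypothetical stand-in for a parent MO; not an actual APIC class."""
    dn: str
    # Fault children keyed by fault code, so each code appears at most once.
    faults: dict = field(default_factory=dict)

    def raise_fault(self, code: str, severity: str) -> dict:
        """Create the fault child for `code`, or return the existing one."""
        if code not in self.faults:
            self.faults[code] = {
                "dn": f"{self.dn}/fault-{code}",  # assumed naming, for illustration
                "code": code,
                "severity": severity,
            }
        return self.faults[code]


port = ManagedObject(dn="topology/pod-1/node-101/sys/phys-[eth1/1]")
first = port.raise_fault("F0001", "warning")
again = port.raise_fault("F0001", "warning")  # same condition reported again
other = port.raise_fault("F0002", "minor")    # a different fault code
```

Here `first` and `again` refer to the same fault object, while `other` is a second child of the same parent because its fault code differs.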
There are four types of fault triggers:
- Fault Rules
- Counters crossing thresholds
- Task or state failures
- Object resolution failures
A fault raised by the system can transition through more than one severity during its life cycle. The possible fault severities, in decreasing order of severity, are critical, major, minor, warning, info, and cleared.
Cisco ACI Fault States:
APIC fault MOs are stateful, and a fault raised by the APIC transitions through more than one state during its life cycle. In addition, the severity of a fault might change due to its persistence over time, so a change in the state may also cause a change in severity.
- Soaking: A fault MO is created when a fault condition is detected. The initial state is Soaking, and the initial severity is specified by the fault policy for the fault class. Because some faults are important only if they persist over a period of time, a soaking interval begins, as specified by the fault policy.
During the soaking interval, the system observes whether the fault condition persists or whether it is alleviated and reoccurs one or more times. When the soaking interval expires, the next state depends on whether the fault condition remains.
- Soaking-Clearing: If the fault condition is alleviated during the soaking interval, the fault MO enters the Soaking-Clearing state, retaining its initial severity. A clearing interval begins.
If the fault condition returns during the clearing interval, the fault MO returns to the Soaking state. If the fault condition does not return during the clearing interval, the fault MO enters the Retaining state.
- Raised: If the fault condition persists when the soaking interval expires, the fault MO enters the Raised state. Because a persistent fault might be more serious than a transient fault, the fault is assigned a new severity, the target severity.
- Raised-Clearing: When the fault condition of a raised fault is alleviated, the fault MO enters the Raised-Clearing state. The severity remains at the target severity, and a clearing interval begins. If the fault condition returns during the clearing interval, the fault MO returns to the Raised state.
- Retaining: When the fault condition is absent for the duration of the clearing interval in either the Raised-Clearing or Soaking-Clearing state, the fault MO enters the Retaining state with the severity level cleared. A retention interval begins, during which the fault MO is retained for the length of time that is specified in the fault policy. If the fault condition has not returned before the retention interval expires, or if the fault is acknowledged by the user, the fault MO is deleted.
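The state transitions described above can be summarized as a small lookup table. The following Python sketch is one way to model them, assuming the event names shown (they are labels invented for this example, not APIC identifiers). It does not model severity changes, and it leaves out what happens if the condition returns during the retention interval, since the text above does not describe that case.

```python
from enum import Enum


class State(Enum):
    SOAKING = "soaking"
    SOAKING_CLEARING = "soaking-clearing"
    RAISED = "raised"
    RAISED_CLEARING = "raised-clearing"
    RETAINING = "retaining"
    DELETED = "deleted"


class Event(Enum):  # hypothetical event labels for this sketch
    CONDITION_CLEARED = "condition cleared"
    CONDITION_RETURNED = "condition returned"
    SOAKING_EXPIRED = "soaking interval expired"
    CLEARING_EXPIRED = "clearing interval expired"
    RETENTION_EXPIRED = "retention interval expired"
    ACKNOWLEDGED = "acknowledged by user"


# (current state, event) -> next state, following the list above.
TRANSITIONS = {
    (State.SOAKING, Event.CONDITION_CLEARED): State.SOAKING_CLEARING,
    (State.SOAKING, Event.SOAKING_EXPIRED): State.RAISED,
    (State.SOAKING_CLEARING, Event.CONDITION_RETURNED): State.SOAKING,
    (State.SOAKING_CLEARING, Event.CLEARING_EXPIRED): State.RETAINING,
    (State.RAISED, Event.CONDITION_CLEARED): State.RAISED_CLEARING,
    (State.RAISED_CLEARING, Event.CONDITION_RETURNED): State.RAISED,
    (State.RAISED_CLEARING, Event.CLEARING_EXPIRED): State.RETAINING,
    (State.RETAINING, Event.RETENTION_EXPIRED): State.DELETED,
    (State.RETAINING, Event.ACKNOWLEDGED): State.DELETED,
}


def step(state: State, event: Event) -> State:
    """Return the next state; pairs not in the table leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, start: State = State.SOAKING) -> State:
    """Apply a sequence of events to a fault that starts in Soaking."""
    state = start
    for event in events:
        state = step(state, event)
    return state


# A transient fault: clears while soaking and never comes back.
transient = run([Event.CONDITION_CLEARED, Event.CLEARING_EXPIRED])

# A persistent fault: raised, later cleared, retained, then acknowledged.
persistent = run([
    Event.SOAKING_EXPIRED,
    Event.CONDITION_CLEARED,
    Event.CLEARING_EXPIRED,
    Event.ACKNOWLEDGED,
])
```

In this sketch the transient fault ends in `State.RETAINING` without ever being raised, and the persistent fault ends in `State.DELETED` once it is acknowledged.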
Cisco ACI Faults Life cycle:
The fault policy sets the three intervals that drive these state transitions:
- Clearing Interval: The range is 0 to 3600 seconds. The default is 120 seconds.
- Retention Interval: The range is 0 to 31536000 seconds. The default is 3600 seconds.
- Soaking Interval: The range is 0 to 3600 seconds. The default is 120 seconds.
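As a quick sanity check before changing these timers, the ranges and defaults listed above can be captured in a small validator. This is a minimal sketch: the `LifecycleIntervals` class and the `LIMITS` table are names invented for the example and do not correspond to any APIC object or API.

```python
from dataclasses import dataclass

# (minimum, maximum, default) in seconds, taken from the list above.
LIMITS = {
    "soaking": (0, 3600, 120),
    "clearing": (0, 3600, 120),
    "retention": (0, 31536000, 3600),
}


@dataclass(frozen=True)
class LifecycleIntervals:
    """Hypothetical container for the three fault life cycle timers."""
    soaking: int = LIMITS["soaking"][2]
    clearing: int = LIMITS["clearing"][2]
    retention: int = LIMITS["retention"][2]

    def __post_init__(self):
        # Reject any interval that falls outside its documented range.
        for name, (low, high, _default) in LIMITS.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(
                    f"{name} interval {value}s is outside {low}-{high}s"
                )


defaults = LifecycleIntervals()
one_year = 365 * 24 * 3600  # 31536000 seconds, the retention maximum
long_retention = LifecycleIntervals(retention=one_year)
```

Constructing `LifecycleIntervals(soaking=3601)` raises `ValueError`, because the soaking interval cannot exceed 3600 seconds.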
Check the Cisco ACI faults reference to identify the cause of a fault and how to resolve it according to its fault code.