While the promise of lights out may be unattainable for most IT departments, achieving a "lights-dimmed" status is well within the reach of most system managers.
"Lights out," a term used to describe an IT operation that is completely automated, is often viewed as the ideal state of systems management, given the implied cost savings and error elimination it would produce. In reality, environments can sustain lights-out state over selected segments and still benefit substantially. This real-world compromise can still yield more productive time for staff and fewer problems in the immediate and long term.
Systems-management solution provider CCSS has released a Best Practice Guide to help IT managers implement a lights-out automation state on their systems. The guide to lights-out automation suggests that IT managers should begin the assessment of their systems by examining the dependence of users on key system events, applications, and processes, especially those under stress or threatened by potential disruptions.
Where these events, jobs, and processes exist, automation can reduce the impact of unforeseen circumstances that might otherwise quickly throw all subsequent events, tasks, or processing into disarray. Such disruptions could leave an already-stretched team and its management to cope with the consequences. The guide cites real-world examples of such situations. One involves a large retail organization that must upload its daily price lists to the system. When this one event does not start or end on time, the impact can resonate throughout the organization, affecting every aspect, from stock replenishment to distribution.
Ray Wright, CEO of CCSS, explains why the quest for lights out remains the Holy Grail for system managers today: "Part of the lights-out challenge is to look beyond the systems environment and identify areas of vulnerability. It may be that a specific job must be started by a manager from a specific division, but if there is a delay that coincides with a shift change, or perhaps even an external delay, then these small problems could set in motion a chain of delays which quickly add up to a serious situation should it be after hours, where on-call staff is needed to help resolve the problem, or when even an unnoticed problem prevents users from working, and it could quickly become a very expensive issue. Avoiding that type of needless expense is absolutely essential for IT managers today."
While lights out in its most extreme form may be unattainable for most IBM shops, a "lights-dimmed" status is well within the reach of most system managers. Automating distinct areas of work, processes, events, and procedures brings an economic advantage, and the benefits are cumulative.
In addition to monitoring critical job areas, such as whether end-of-day tasks start and finish on time so that they do not affect other dependent jobs or processes, managers can automate the vast majority of daily operational checklists and the handling of non-essential system-generated messages, which frees up operator time for use elsewhere. When exception conditions occur, tailored escalation procedures send alerts according to staff availability, priority, shift patterns, or other requirements. These alerts can be delivered to various devices (email, mobile phone, etc.) so that they are not overlooked in a busy environment.
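To make the idea concrete, the following is a minimal sketch in Python of the two pieces described above: a check that a job has started and finished within its expected window, and an escalation rule that picks recipients by shift and priority. It is an illustration only, not the CCSS product or anything taken from the guide. The names JobWindow, Contact, check_job, and route_alert are hypothetical, and the sketch assumes each job's window falls within a single calendar day.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import List, Optional


@dataclass
class JobWindow:
    """Hypothetical description of when a job is expected to start and finish."""
    name: str
    start_by: time
    finish_by: time


@dataclass
class Contact:
    """Hypothetical alert recipient with a shift pattern and a priority threshold."""
    name: str
    channel: str        # for example "email" or "sms"
    shift_start: time
    shift_end: time
    min_priority: int   # only receives alerts at or above this priority


def check_job(window: JobWindow, now: datetime,
              started: Optional[datetime],
              finished: Optional[datetime]) -> Optional[str]:
    """Return a description of the exception condition, or None if all is well.

    Assumption: the window lies within one calendar day, so times of day
    can be compared directly.
    """
    current = now.time()
    if started is None and current > window.start_by:
        return f"{window.name} has not started by {window.start_by:%H:%M}"
    if finished is None and current > window.finish_by:
        return f"{window.name} has not finished by {window.finish_by:%H:%M}"
    return None


def on_shift(contact: Contact, current: time) -> bool:
    """True if the contact is on shift; handles shifts that span midnight."""
    if contact.shift_start <= contact.shift_end:
        return contact.shift_start <= current < contact.shift_end
    return current >= contact.shift_start or current < contact.shift_end


def route_alert(contacts: List[Contact], priority: int,
                current: time) -> List[Contact]:
    """Select contacts who are on shift and whose threshold the alert meets."""
    return [c for c in contacts
            if on_shift(c, current) and priority >= c.min_priority]


if __name__ == "__main__":
    upload = JobWindow("PRICE_UPLOAD", start_by=time(18, 0), finish_by=time(19, 0))
    staff = [
        Contact("day operator", "email", time(8, 0), time(20, 0), min_priority=1),
        Contact("night on-call", "sms", time(20, 0), time(8, 0), min_priority=2),
    ]
    now = datetime(2024, 1, 8, 18, 30)
    problem = check_job(upload, now, started=None, finished=None)
    if problem:
        for person in route_alert(staff, priority=2, current=now.time()):
            print(f"[{person.channel}] to {person.name}: {problem}")
```

In this sketch, low-priority messages raised overnight reach no one, while higher-priority ones go to the on-call contact by SMS. That mirrors the article's point that routine messages can be handled automatically and only genuine exceptions need to interrupt staff.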