By Sphera’s Editorial Team | June 13, 2016

Going to the doctor for an annual checkup is a smart approach because even if you don’t feel sick, the visit can catch problems like high cholesterol or high blood pressure before they become serious. And what does high cholesterol feel like anyway? Similarly, the systems that support complex, business-critical enterprise functions such as environmental performance and compliance assurance may appear to be operating just fine. Yet their health can slowly deteriorate in ways that will only be caught with regular check-ups.

Why would a system “get sick” when it just does the same thing every day? Much like us, these systems don’t operate in a static environment. Changes to key personnel, business processes, and technologies may affect individual transactions and base configurations in subtle ways, rippling throughout the system and causing great organizational pain until they get treated. Maintaining a healthy system requires careful monitoring.

Pay attention to personnel changes. The ecosystem around an enterprise solution involves numerous stakeholders who have various roles, responsibilities, and skills. Understandably, when these people move into and out of specific roles, there can be a loss of key knowledge unless the transitions are carefully managed. Often a company has enough organizational and technical checks and balances in place to mitigate this risk, but in times of great change such as a significant workforce reduction or reorganization, the effectiveness of those safeguards can be compromised.

Recently a Sphera mining client performed a health check on one of their mature systems. They uncovered an error in the definition of a unit of measure that had been added to the system for modeling a new permit requirement. Technically there was nothing wrong with the way the unit of measure was entered into the system, but the numeric value of a conversion factor was in error. The management procedures in place to review changes like this weren’t followed. After the company reorganized, a new administrator took over and some of the domain experts were no longer available to review the change. So the new administrator implemented the configuration change to the best of his ability, unaware that the erroneous conversion factor was propagating throughout the system, misrepresenting emissions and compromising compliance.

In this case, the system appeared to be operating just as it had before the change was made. It continued to calculate emissions, generate reports and metrics, and send alerts as expected. However, a health check uncovered that the system was compromised because of this simple numeric error in a conversion factor contained deep within the system, out of sight of daily observation. Fortunately the client identified the issue and its root cause, and implemented specific corrective actions before submitting the annual emissions inventory.
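One way a health check might surface this kind of error is to compare every configured conversion factor against an independently maintained reference table. The following is only a minimal sketch of that idea, not the client's actual procedure: the function name, the unit keys, and the sample configuration (including the deliberately mistyped factor) are all hypothetical. The reference values themselves are the standard definitions of the pound, short ton, and tonne in kilograms.

```python
import math

# Reference factors to kilograms (standard definitions).
REFERENCE_TO_KG = {
    "lb": 0.45359237,
    "short_ton": 907.18474,
    "tonne": 1000.0,
}


def find_bad_factors(configured, reference=REFERENCE_TO_KG, rel_tol=1e-6):
    """Return {unit: (configured, expected)} for factors that look wrong.

    A unit with no reference value is reported with expected=None so a
    domain expert can review it instead of it passing silently.
    """
    bad = {}
    for unit, factor in configured.items():
        expected = reference.get(unit)
        if expected is None:
            bad[unit] = (factor, None)
        elif not math.isclose(factor, expected, rel_tol=rel_tol):
            bad[unit] = (factor, expected)
    return bad


# Hypothetical system configuration with two digits transposed
# in the short-ton factor (970... instead of 907...).
configured = {"lb": 0.45359237, "short_ton": 970.18474, "tonne": 1000.0}
issues = find_bad_factors(configured)
for unit, (actual, expected) in issues.items():
    print(f"{unit}: configured {actual}, expected {expected}")
```

A check like this would not replace expert review of a new unit of measure, but it could run on a schedule and catch a bad factor even when the usual reviewers are unavailable.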

Take a holistic approach to process changes. Small changes in one area can have significant and potentially unexpected implications in other areas. Many operational business processes overlap and interact with environmental, health, safety, and risk management processes across the enterprise. For example, take the real-time data collected in a process historian and passed to the emissions management system for validation, substitution, aggregation, and calculation according to specific business rules. The results are used to demonstrate compliance status, and generate or update actions that in some cases become work orders for maintenance or operations to complete in the course of normal duties.

Changes to one step of the workflow, like the calculation rules, must be made with knowledge of the incoming data as well as the outgoing reporting or compliance requirements. Otherwise, making process changes in isolation from the larger interdependencies is much like suddenly deciding to do a triathlon without considering the impact the activity will have on joints and muscles, not to mention the cardiovascular system.

More exercise sounds like a good idea, but the form it takes should take individual needs into account or else that activity risks doing more harm than good. When one of our clients decided to change the way risk severity was defined and communicated across the enterprise, they implemented a simple 1, 2, or 3 designation in the systems managing elements of risk. These codes contributed to a monthly report that the board of directors reviewed and acted upon. However, a health check on the system revealed that the risk severity in the emissions and compliance system was configured with 1 being the most severe and 3 being the least severe, while in the maintenance management system, 1 was least severe and 3 the most. For months, the board of directors had misunderstood the top operational risks and were misdirecting resources.
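A mismatch like this can be made explicit by having each system declare which end of its scale is most severe, then translating every code to one common scale before it reaches a report. The sketch below is an illustration under assumptions, not a description of the client's systems: the class name, the system labels, and the choice of "1 is most severe" as the common scale are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeverityScale:
    """Describes how one system encodes risk severity."""
    system: str
    most_severe: int
    least_severe: int

    def normalize(self, code):
        """Map a system-specific code to a common scale where 1 is most severe."""
        low, high = sorted((self.most_severe, self.least_severe))
        if not low <= code <= high:
            raise ValueError(f"{self.system}: code {code} outside {low}..{high}")
        if self.most_severe < self.least_severe:
            return code - low + 1   # already ascending from most severe
        return high - code + 1      # reversed scale, so flip it


# The two configurations described above (labels are illustrative).
emissions = SeverityScale("emissions_compliance", most_severe=1, least_severe=3)
maintenance = SeverityScale("maintenance_management", most_severe=3, least_severe=1)

# The same raw code means opposite things in the two systems.
report = {s.system: s.normalize(3) for s in (emissions, maintenance)}
print(report)
```

Here a raw code of 3 normalizes to 3 (least severe) for the emissions system and to 1 (most severe) for the maintenance system, which is exactly the difference the board's monthly report was hiding.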

Monitor activities that affect the technology infrastructure. Awareness and understanding of technology changes are important to assuring the ongoing health of a business solution. This is difficult given the complex relationships among the hardware, software, operating system, browsers, and communications that make up an enterprise technology infrastructure. Like brain surgery, major upgrades of servers, browsers or operating systems across an entire company tend to command significant resources and organizational attention in order to produce a successful outcome. However, sometimes smaller activities create major unintended issues for the health of critical systems.

One client with a well-established solution and informed support team had developed a small customization to streamline data loading. Although customizations that write data directly to the database are risky because they circumvent safeguards built into the application, the client believed that the risk was contained by thorough testing. The custom procedure ran quietly in the background every evening. Everything seemed to work fine. When the next small request came into the client support team, another customization was made.

Over time, numerous stored procedures, triggers, tables, views, and functions had been created or altered. Applying even a minor upgrade like a service pack became problematic, which prevented the client from taking advantage of core product enhancements that could have retired legacy customizations and provided much-needed functionality to the users. By the time the system health was assessed, more than 400 customizations were found in their database relative to the off-the-shelf product. Moreover, some of these customizations were improperly writing data to the tables in a way that didn’t look like a problem from the front end, but actually degraded the performance of the system by creating orphan records. Correcting this required significant effort as well as enforcing a new policy to avoid future customizations.
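Orphan records of this kind are invisible from the front end but are usually straightforward to find with a query that looks for child rows whose parent no longer exists. The following is a rough sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names are invented for illustration and do not reflect the actual product schema, and a real check would run against the production database engine instead.

```python
import sqlite3


def find_orphans(conn, child, fk, parent, pk):
    """Return rowids of child rows whose foreign key matches no parent row.

    Rows with a NULL foreign key are reported too. Identifiers are
    interpolated directly into the SQL, so pass only trusted names.
    """
    sql = (
        f"SELECT c.rowid FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk} = p.{pk} "
        f"WHERE p.{pk} IS NULL ORDER BY c.rowid"
    )
    return [row[0] for row in conn.execute(sql)]


# Hypothetical schema: readings written directly to the database,
# bypassing the application's checks that the facility exists.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE facility (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emission_reading (
        id INTEGER PRIMARY KEY, facility_id INTEGER, value REAL
    );
    INSERT INTO facility VALUES (1, 'Plant A');
    INSERT INTO emission_reading VALUES (10, 1, 4.2), (11, 99, 3.1);
""")

orphans = find_orphans(conn, "emission_reading", "facility_id", "facility", "id")
print(orphans)
```

In this sample data, reading 11 points at facility 99, which does not exist, so it is the only row reported.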

Good system health, like our physical health, is best assessed by an expert who understands what data to collect, how to interpret it, and when to take action. An initial health check also establishes a baseline for future evaluations. Waiting until you actually feel the symptoms of a condition such as high cholesterol to seek treatment could mean undergoing complex corrective surgery. Schedule regular check-ups so that you and your business perform well for years to come.