As I’ve been working on Problem Management at my job, I’ve realized that problem solving in this organization is almost like running a mini project. I’m sure that in the perfect ITIL world Problem Management flows beautifully into Change Management once a Root Cause is found and a Change is needed to fix it, but that flow depends on a full understanding of what’s going on in the I.T. environment. Since we don’t have a CMDB, Problem Management and RCA are as difficult as having a blind person drive in the Indianapolis 500; they can still do it, but it’s going to be messy. Keeping with the analogy of the blind driver, getting to the finish line means someone has to give detailed instructions on where to go. This is why I’m thinking of treating Problems as projects and defining a Problem Life-cycle.
So far I’ve broken the life-cycle down into six (6) phases. The first phase is Initiation, in which detection, logging and categorization occur. The second phase is Assessment, where Prioritization is based on an assessment of the impact on the users. The third phase is Investigation, where investigation and diagnosis occur and a workaround is sought to resolve Incidents and minimize the impact on the organization.
The fourth phase is Confirmation/Elimination of Root Causes. I borrowed part of the Kepner-Tregoe problem-solving methodology here: several possible Root Causes will probably be found during the Problem Life-cycle, so each one needs to be tested. This is the phase where the majority of the work will occur if Event Management or a CMDB is not in place. Tasks need to be assigned, results need to be reported, and incomplete tasks need follow-ups. It is also the phase that most resembles a project, with meeting minutes, action items and a lot of work to ensure involvement from the different technical teams within the I.T. department. All of this has one goal: finding the Root Cause.
Problem resolution efforts will probably also go back and forth between phase three and phase four, since it’s possible that every candidate root cause will be eliminated and efforts will have to “go back to the drawing board,” so to speak. Phase five is Analysis, in which the costs and benefits of implementing the Resolution are compared to the costs of staying with a workaround, assuming a workaround has actually been found. If it’s decided to implement a resolution, the final phase is reached: Change and Validation. Here a Request for Change is submitted and, once it is complete, reporting needs to take place to validate the reduction in Incidents. Validation is extremely important and often overlooked in my department, because of the assumption that our work is flawless and a Change to resolve a Problem will therefore work. This phase also lines up with the “Check” phase of the Deming Cycle.
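One way to picture the six phases is as a small state machine. The Python sketch below is only an illustration of the life-cycle as I’ve described it, not a tool we actually run: the names `Phase`, `ALLOWED` and `Problem` are made up for this example, and I’m assuming the only backward step is from phase four to phase three.

```python
from enum import Enum


class Phase(Enum):
    """The six phases of the Problem Life-cycle (names are illustrative)."""
    INITIATION = 1
    ASSESSMENT = 2
    INVESTIGATION = 3
    CONFIRMATION = 4           # Confirmation/Elimination of Root Causes
    ANALYSIS = 5
    CHANGE_AND_VALIDATION = 6


# Allowed moves between phases. Assumption: the only backward step is
# Confirmation -> Investigation, taken when every candidate root cause
# has been eliminated.
ALLOWED = {
    Phase.INITIATION: {Phase.ASSESSMENT},
    Phase.ASSESSMENT: {Phase.INVESTIGATION},
    Phase.INVESTIGATION: {Phase.CONFIRMATION},
    Phase.CONFIRMATION: {Phase.INVESTIGATION, Phase.ANALYSIS},
    Phase.ANALYSIS: {Phase.CHANGE_AND_VALIDATION},
    Phase.CHANGE_AND_VALIDATION: set(),
}


class Problem:
    """A hypothetical Problem record that tracks its current phase."""

    def __init__(self, title):
        self.title = title
        self.phase = Phase.INITIATION
        self.history = [Phase.INITIATION]

    def move_to(self, next_phase):
        """Advance to next_phase, or raise ValueError if the move is not allowed."""
        if next_phase not in ALLOWED[self.phase]:
            raise ValueError(
                f"cannot go from {self.phase.name} to {next_phase.name}"
            )
        self.phase = next_phase
        self.history.append(next_phase)
```

A Problem that ends up staying on its workaround would simply stop at `ANALYSIS` in this sketch. Keeping the `history` list makes the trips back to “the drawing board” visible, which is the kind of thing I’d want to report on later.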
As with a project, I’ve broken the Problem Management Process down into distinct phases, each with its own defined goals, and as I work on this idea I’ll also develop documentation for each phase. I know there are other ways to approach Problem Management, but the beautiful thing about ITIL is that it’s a set of best practices, which means I have the flexibility to mold the processes to my department’s culture. As my organization changes, the implementation of those processes will change as well.