S-R-P : SYSTEMIC-RESILIENT-PRECISION: July 2008

Thursday, July 31, 2008

One week after Westpac .. NAB's payroll processing system fails

Only one week after Westpac's computerised processing resulted in double pay or non-payments to over a million customers, it was reported on Wednesday 30 July that, as of 1.40pm, NAB's bulk payment systems had still not cleared the processing backlog from the failure that occurred on Monday night.

Some of NAB's customers ensured that their staff were paid by resorting to tried and true methods, including withdrawing $60,000.00 in cash and placing it in envelopes to pay staff. Other customers entered the individual salary payments and bank details of a whole payroll via online banking – one transaction at a time.

While it may not be proof positive that accountants are part of the creative class, it does show commitment to employees and is evidence that some organisations do truly put their staff first. Hopefully their heroic efforts will pay off when their employees consider other employment options in this talent-short employment market.

When a small process failure can affect a million customers it really highlights the incredible job the banks do each day keeping so many complex processes working seamlessly.

The processes that failed for NAB would have been repeated every night, hundreds of times with 100% precision. This time a small change would have connected with a few latent conditions and the 100% precision, over a whole complex processing procedure, fell to zero - in a very public way.

When the IT Problem Solving Team is able to move away from just fixing the problem (it was reported that many of NAB's 2000 technology workers were prioritising clearing the processing backlog), they will have time to analyse what went wrong and set about changing their business process to ensure that it just can't happen again.

When we look at process management, we ask our clients to consider their core business process in a matrix of two dimensions, "Criticality" and "Frequency".

When considering the type of incident that Westpac and NAB faced, it is clear that transaction automation is a high-criticality and high-frequency business process. Every near miss, and certainly any process failure, demands that the highest level of rigour be applied to analysing the causal structure that allowed even the smallest near miss to occur.
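The two-dimensional matrix described above can be sketched in a few lines of Python. This is only an illustration: the 1 to 5 scoring scale, the cut-off of 4, and the wording of each rigour level are assumptions made for the example, not part of any published methodology.

```python
from dataclasses import dataclass

# Assumed cut-off on a hypothetical 1-5 scale; tune to your own enterprise.
HIGH_THRESHOLD = 4

# Hypothetical mapping from (criticality band, frequency band) to the
# level of analysis applied when something goes wrong.
RIGOUR = {
    ("high", "high"): "full causal analysis of every near miss",
    ("high", "low"): "full causal analysis of every failure",
    ("low", "high"): "periodic trend review",
    ("low", "low"): "log only",
}


@dataclass
class BusinessProcess:
    name: str
    criticality: int  # 1 (minor) to 5 (business critical), assumed scale
    frequency: int    # 1 (rare) to 5 (runs continuously), assumed scale


def _band(score: int) -> str:
    """Collapse a 1-5 score into the 'high' or 'low' half of the matrix."""
    return "high" if score >= HIGH_THRESHOLD else "low"


def quadrant(process: BusinessProcess) -> str:
    """Name the matrix quadrant a process falls into."""
    return f"{_band(process.criticality)}-criticality/{_band(process.frequency)}-frequency"


def analysis_rigour(process: BusinessProcess) -> str:
    """Look up the (assumed) level of analysis for the process's quadrant."""
    return RIGOUR[(_band(process.criticality), _band(process.frequency))]


payroll = BusinessProcess("bulk payroll processing", criticality=5, frequency=5)
print(quadrant(payroll), "->", analysis_rigour(payroll))
```

A bank's nightly bulk payment run would sit in the high/high quadrant under these assumed scores, which is why even a near miss there would warrant a full causal analysis.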

NAB and Westpac are very mature organisations with well-established quality processes. Whatever their internal process for problem solving, part of the analysis will include documenting the causal structure and searching their Lessons Learned Database to see whether there were any near misses or other incidents that should have resulted in a warning being raised.

Our clients would use the REASON® methodology to guide the analysis process and to document the full causal structure of the incident and any early warning signs.

We would ask our clients to pay particular attention to any latent conditions contributing to a major adverse outcome; if these conditions played a significant role in this process failure then they are just sitting there waiting for some other small change to interact with them, in an undesirable way, so they can play a role in the next process failure.

REFERENCE:
http://www.australianit.news.com.au/story/0,25197,24101547-15306,00.html
http://www.news.com.au/adelaidenow/story/0,22606,24100923-5006301,00.html

Monday, July 28, 2008

Qantas - the Missing Oxygen Cylinder

Monday is certainly not a quiet news day for Australian enterprises. The weekend papers published images of luggage slipping out of a 3-metre hole in the fuselage of a Qantas Boeing 747.

Qantas is envied the world over for its exemplary safety record; this is especially commendable given that the airline industry is one of the world’s safest.

To see such a graphic image from such a high-performance enterprise helps us all remember that there is always room to improve our process and practices; and that this commitment to continuous improvement is directly linked to the value of our corporate brand and customer confidence in the product that is delivered.

It was reported in today’s newspapers that the initial focus of the safety investigation is the oxygen cylinders that are usually stored between the luggage compartment and the fuselage. These cylinders provide back-up oxygen for the aircraft. Mr. Neville Blyth, an investigator from the Australian Transport Safety Bureau, advised that one of the cylinders was missing.

The Age newspaper also reported that “some months ago” the US Federal Aviation Administration ordered airlines under its jurisdiction to examine their emergency oxygen cylinders because many of them had not been properly heat treated and needed to be replaced. The Age article quoted the reasoning for the directive as including “… which would cause the oxygen cylinder to come loose and leak oxygen”.

The Brisbane Times has posted a video news report from Reuters that shows images of the inside of the aircraft and the hole in the fuselage; here is the link:
http://media.brisbanetimes.com.au/?rid=39943

There are reports of passenger distress as the aircraft rapidly descended from 29,000 feet to 10,000 feet; all due credit goes to the crew for their good work in getting all 346 passengers and 19 crew safely to Manila Airport.

A preliminary report is due to be released by the Australian Transport Safety Bureau in two to three months; as information is released we will be building a REASON® incident model that will be available for free download.

If you would like to be sent an email when that incident model becomes available (or participate in online discussion to finalise the model) please let us know at rca@reason4rca.com

REFERENCES:
http://media.brisbanetimes.com.au/?rid=39943 VIDEO REPORT

http://www.brisbanetimes.com.au/news/national/qantas-blast-airlines-were-warned/2008/07/28/1217097102406.html

http://www.theage.com.au/action/printArticle?id=167718

http://www.theage.com.au/articles/2008/07/28/1217097102556.html

Thursday, July 24, 2008

Lessons Learned System – COLLECTION Considerations

The post earlier in the week outlined some common process flows for Lessons Learned Systems.

By looking in some detail at just one of these systems, we will provide you with tips on the things to consider when you set the functionality requirements for your automated Lessons Learned System.

The model we will look at in more detail is Model #1 from the earlier post.
This post is a closer look at the COLLECTION element of the process flow.

COLLECTION

Your people are only human. If you minimise the extra effort required to submit lessons to your Lessons Learned System, and maximise the flexibility of using the lessons to solve local operational problems, use of the Lessons Learned System will be maximised.

The RAID™ human factor RCA identifies this process in the context of the forces pressing the person to behave in the desired way (the requirement and assignment) and the potentially opposing forces (inducement and disposition).

The recommendations in this post aim to guide your functionality considerations towards minimising resistance and maximising encouragement for use of your Lessons Learned System.

Data collection for your Lessons Learned System is more likely to be taken up if it can be set up as the next logical step in the problem resolution process. Wherever possible, we recommend that the submission of lessons to the Lessons Learned System involve only a small amount of extra work; it should mostly be a matter of submitting already prepared problem resolution analysis.

We also recommend you consider building scalable data-entry options into your Lessons Learned System. The amount of time an individual is required to invest in preparing the problem resolution information, ready to submit to the Lessons Learned System, should be scalable to the complexity and criticality of the problem being resolved. Submitting the fix for a printer that has repeatedly jammed should not require the same level of analysis as submitting a lesson from a fatality or major outage.

By offering data-entry options with a sliding scale of time commitment, you can maximise the data collected. With maximum longitudinal data, regular reviews will identify trends before they escalate to critical events.
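The sliding scale of data entry described above could be expressed as a simple rule that decides which fields a submitter must complete. The sketch below is one possible arrangement; the tier names, field names, and 1 to 5 scores are all hypothetical and would need to be replaced with your own.

```python
# Hypothetical tiers: each tier lists the fields a submitter must fill in.
REQUIRED_FIELDS = {
    "quick": ["summary", "fix_applied"],
    "standard": ["summary", "fix_applied", "cause_code", "context"],
    "full": [
        "summary",
        "fix_applied",
        "cause_code",
        "context",
        "causal_analysis",
        "early_warning_signs",
    ],
}


def entry_tier(criticality: int, complexity: int) -> str:
    """Pick a data-entry tier from assumed 1-5 criticality and complexity scores."""
    score = max(criticality, complexity)
    if score >= 4:
        return "full"
    if score >= 2:
        return "standard"
    return "quick"


def missing_fields(lesson: dict, criticality: int, complexity: int) -> list:
    """Return the required fields that are absent or empty for this lesson."""
    tier = entry_tier(criticality, complexity)
    return [field for field in REQUIRED_FIELDS[tier] if not lesson.get(field)]


printer_jam = {"summary": "Printer jams repeatedly", "fix_applied": "Replaced feed roller"}
print(missing_fields(printer_jam, criticality=1, complexity=1))
print(missing_fields(printer_jam, criticality=5, complexity=3))
```

Under these assumed rules the jammed printer needs nothing beyond a summary and the fix, while the same two fields would be far from sufficient for a lesson scored as a major outage.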

In line with the recommended systemic approach to data collection, we recommend that enterprises set up a cause code matrix (to identify the classifications of each Lesson) then give as many people as possible the rights to submit relevant data to your Lessons Learned System.

Your Lessons Learned System will be a silent monitor that records all the little failures and near misses so that you can identify the trend and linkages before it becomes a problem. You won’t need to pay an expensive consultant to come in and find out what has been happening; you will know just as soon as the trend is identifiable and before it is a problem.

For enterprise organisations dealing with multiple worksites, often in different countries, the collection method for a Lessons Learned System should allow unique situations to be recorded. There is no point in forcing people to choose an answer from a fixed list of possible solutions; this can perpetuate any existing system flaws and limit the opportunity for true innovation or quantum process improvements.

Significant benefit can be derived from recording contextual information. The capacity to rapidly adapt lessons to the unique contextual environment allows users to quickly bypass recommended actions that were linked to erroneous contextual causal factors, and to home in on only the corrective action options that specifically relate to their situation and context.

Beyond the conceptual considerations, for the Collection stage of the Lessons Learned System process, there is the practical on-the-job collection and recording of the physical data for problem analysis and problem resolution.

Reducing the need for Investigators to re-enter data when they return to the office is a significant consideration for accuracy and take up of the Lessons Learned System.

We recommend that your Lessons Learned System allows for mobile use (without a direct connection to the central server). Synchronisation between a laptop and the central server will allow problem resolution data to be collected on operational lessons in any location.
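As a minimal sketch of the mobile collection idea above, the example below keeps lessons in a local file on the laptop and uploads them when a connection is available. The class name, the file format, and the `upload` callable are all assumptions for illustration; a real system would use whatever interface its central server provides.

```python
import json
from pathlib import Path


class OfflineLessonQueue:
    """Stores lessons locally until they can be sent to the central system."""

    def __init__(self, path):
        self.path = Path(path)
        # Reload anything recorded in an earlier session and not yet synced.
        self.pending = json.loads(self.path.read_text()) if self.path.exists() else []

    def record(self, lesson: dict) -> None:
        """Save a lesson on the laptop; no network connection is needed."""
        self.pending.append(lesson)
        self._save()

    def sync(self, upload) -> int:
        """Send pending lessons with the supplied (hypothetical) upload callable.

        Lessons that fail with ConnectionError stay queued for the next attempt.
        Returns the number of lessons successfully sent.
        """
        still_pending = []
        sent = 0
        for lesson in self.pending:
            try:
                upload(lesson)
                sent += 1
            except ConnectionError:
                still_pending.append(lesson)
        self.pending = still_pending
        self._save()
        return sent

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.pending))
```

Because the queue is written to disk after every change, an investigator can record lessons on site, shut the laptop, and synchronise back at the office without re-entering anything.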

SUMMARY
1. Minimise the extra work required to submit completed problem analysis to the Lessons Learned System.
2. Scale the analysis and detail to fit the complexity and criticality of the Lesson being submitted to the Lessons Learned System.
3. Within a cause code framework, allow as many people as possible to submit their lessons to the Lessons Learned System.
4. Submit lessons to the Lessons Learned System in a way that allows for innovation and quantum improvements in process.
5. Submit lessons to the Lessons Learned System in a way that allows users to quickly identify and discard erroneous contextual data and customise for their own unique situation.
6. Allow independent workers mobility to analyse and record lessons in a format that is ready for upload to the central Lessons Learned System.

If you would like more information about your Lessons Learned System or designing a Lessons Learned System for your enterprise
visit us at www.systemic-resilient-precision.biz

© Systemic-Resilient-Precision Pty Ltd - 2008

Monday, July 21, 2008

Lessons Learned Systems – Mining & LLS Process Flows

The local Australian mining and engineering sector is currently booming. The cyclical nature of demand for this sector, and the transient nature of the workforce, help keep the mining sector at the forefront when it comes to embedding efficiency gains in the business process.

There is currently a high level of interest in automating that systemic change through the practical deployment of Lessons Learned Systems.

While most of the recent interest has been from the mining sector, any enterprise-size organisation can benefit from using a Lessons Learned System; it is an invaluable tool to ensure that investments made in process innovation and risk management are rapidly transferred across the whole organisation.

Regulators and industry bodies can apply the same principles to quickly transfer learnings across a whole sector.

The mining sector, like other multi-site enterprises (banks, supermarkets, government departments, hospitals & manufacturing plants), has many sites performing similar functions and/or using similar equipment to deliver similar goods and services; lessons learned are readily transferred.

Failing to effectively record and share information about innovations and solutions to new and old problems is not only a systemic waste of resources; it can also result in latent risks not being corrected in a timely manner. This latter point is often revealed only after an accident or significant adverse event has occurred; when it is all too late, the special investigation reveals that some work locations had good practices in place that could have prevented the critical event.

Posted below are three of the most common approaches to the Lessons Learned System, each shown as a process flow.

Blog posts still to come will give a more detailed explanation of each step in the process flow.

For more information please feel free to call or send an email to
srp@systemic-resilient-precision.biz

Lessons Learned System Model #1

Lessons Learned System Model #2

Lessons Learned System Model #3

Thursday, July 17, 2008

Employee Engagement: Threatens Growth

Large Enterprises are increasingly reporting that the biggest threat to achieving their growth targets is the attraction and retention of skilled employees. The Age reported that economists are warning that the No. 1 "success disease" of the Australian economy is skills shortages and associated wages pressure.

The longitudinal research by the Gallup Institute has demonstrated significant productivity increases can be released, at the same time as increased employee satisfaction and engagement, through the use of a strengths-based approach to individual and team management.

The Gallup Institute findings came from studying high-performing individuals and teams in over 100 enterprises. Their research found that managers who consistently perform at the top of their cohort tend to naturally adopt a strengths-based approach to their team management.

Moving away from a rigid focus on job descriptions to a strengths focus leverages the individual strengths of team members in new and innovative ways, improving productivity and minimising individual resistance to change.

Marcus Buckingham (previously of the Gallup Institute and now in his own company) has authored or co-authored a number of books on human performance that include practical application examples and strategies that managers can deploy to leverage team member strengths for improved productivity.

The Gallup Institute research findings are consistent with the findings of Positive Psychology specialist Mihály Csíkszentmihályi, an American psychology professor at Claremont Graduate University in Claremont, California and the former head of the department of psychology at the University of Chicago.

Mihály Csíkszentmihályi has authored a number of books that document the increased levels of individual satisfaction, happiness, creativity and well-being that are attained when people spend time in a state of “Flow”.

He is best known for his seminal work, 'Flow: The Psychology of Optimal Experience'. To achieve a flow state, a balance must be struck between the challenge of the task and the skill of the performer. If the task is too easy or too difficult, “Flow” cannot occur.

Marcus Buckingham is clear that a strength is not necessarily something that your supervisor says you do well. In fact, “Activities that you happen to perform well can actually deplete you if [you] don’t also enjoy them. That makes them a weakness for you.” He defines a strength as “the work activities that consistently make you feel productive, energised and engaged.”

The extensive Gallup Institute longitudinal research demonstrated conclusively that teams comprised of people who spend most of their time using their strengths deliver higher performance than those who spend less time working to their strengths.

Marcus Buckingham has designed some practical tools to help individuals determine whether an activity is a strength, including the 4 signs that something is a strength. He suggests that managers can help employees improve their productivity by supporting them to leverage their strengths:
• Listen to them and trust their judgement; they are the only ones that know if they are energised by an activity.
• Adjust their jobs (wherever possible); be open to other ways that an activity could be completed. This releases the employee from the feeling of being stuck.
• Actively support individuals to find ways to make less desirable activities less onerous; consider partnering, doing an activity as a team exercise etc.

A focus on leveraging the strengths of your team also guides organisations towards respectful communication between departments and divisions. It also supports an innovation culture by reinforcing daily the concept that there are many ways to achieve a good outcome.

REFERENCES:
http://www.cgu.edu/pages/4751.asp
http://www.marcusbuckingham.com/home.php
http://www.gallup.com/consulting/positive/107755/2008-Gallup-WellBeing-Forum.aspx

Monday, July 14, 2008

Annual IT Spend - iPod or BlackBerry; Linux, Mac or Windows?

A Process for Prioritising the rapidly escalating annual IT Spend

Failing to keep up with technology can be competitively fatal, but annual technology spend is like every other business decision: it requires a process. The business case needs to support an informed decision that allows priorities to be set on the basis of a justifiable ROI.

Andrew McAfee, an associate professor at Harvard, noted that US spending on physical IT was $5,100 per employee per year in 2004, having trebled since 1987; later figures are not yet available.

As IT spend has become a significant expense item with real bottom-line impact, Andrew McAfee recommends a process to help executives determine priorities and support decision making in this rapidly expanding expense area.

He recommends a process where IT purchases are broken down into three distinct categories.
1. Function IT
- Supports execution of tasks e.g. spreadsheets, CAD.
2. Network IT
- Supports collaboration and connections e.g. e-mail, wiki, blog.
3. Enterprise IT
- Specifies a business process i.e. defines tasks and sequences, mandates data formats, use is mandatory.

His HBR article also provides a list of questions that senior executives can ask their CIOs to help clarify priorities and ROI. A couple of the questions for each category are:
1. Function IT
- Will any of these new software options allow our operations people to do their jobs more efficiently?
- Is any of our current IT out of date – what changed?
2. Network IT
- What technologies are our people using to collaborate?
- Do we know what they think on hot issues?
3. Enterprise IT
- Are there best practices that should be embedded in our Enterprise IT?
- Are there important business activities, events or trends that we should monitor?

There is a word to the wise in Andrew McAfee’s article on ROI: he notes that while business cases will inevitably present a can’t-lose scenario, the reality is that IT deployments “are never a sure bet because they rely on a complex interplay between technologies, capabilities and complements”.

Finding a structured way to work through all of the IT capability options, and ask bottom line focused questions of your CIO, is a solid governance approach.

We especially like the Enterprise dimension of his model with its focus on the automation of non-negotiable core business process.

REFERENCES:
The “complements” referred to are the process complements: better-skilled workers, higher levels of teamwork, redesigned processes and new decision rights.

Andrew McAfee’s Blog: http://blog.hbs.edu/faculty/amcafee/
HBR Article: http://harvardbusinessonline.hbsp.harvard.edu/hbsp/hbr/articles/article.jsp?ml_action=get-article&articleID=R0611J&ml_page=1&ml_subscriber=true

Thursday, July 10, 2008

Apache Energy Varanus Island Gas Plant Explosion wipes $A65 million from Alcoa's bottom line

A systematic way to analyse and respond to ambiguous signals of impending crisis

While the newspaper reports may make it appear that Apache Energy management have behaved less competently than other organisations, even if it is all proven to be true, their behaviour may not be that unusual.

In fact this type of behaviour is often the norm for large enterprises, including NASA (Columbia Disaster), Merck Pharmaceutical (Vioxx Incident), Kodak & Schwinn Cycles.

Enterprises that have a culture that is highly reliant on evidence and data can have a cultural weakness when faced with problem-solving in an environment where the available data is ambiguous. There is a systematic way to analyse and respond to weak signals of impending crisis.

M. Roberto (Bryant University) and R. Bohmer and A. Edmondson (Harvard) conducted an in-depth 2-year study, following the Columbia Disaster, which revealed that responding poorly to ambiguous threats is a naturally occurring behavioural trait in enterprises.

This problem-solving skill shortfall was more pronounced in enterprises where hard facts and data are the cultural norm for decision making. Of itself this data-focused trait is a strength, but when combined with a mature organisation that has many accepted norms it tends to limit individuals' ability to speak up when ambiguous threats start to appear.

Post-incident analysis inevitably reveals that there was a “recovery window” where the enterprise could have responded in a way that would have averted the disaster:
· NASA management did not authorise a space-walk or approve additional satellite images to study the possible effects of the foam strike after take-off.
· Merck did not respond quickly enough to ambiguous data linking Vioxx to cardiovascular risk; it brought the product to market and suffered significant reputational damage.
· Kodak dismissed early signs that their film business was in decline.
· Schwinn did not act quickly enough when mountain bikes hit the market, signalling the rapid decline of their road bike business.

Why do organisations fail to deal with ambiguous threats well?

Roberto, Bohmer and Edmondson found that the big three contributing causes were:
· Human Cognition – suppression of danger, emphasising information that confirms our existing belief and supports our existing execution strategy.
· Group Dynamics – Teams for high-risk and high-complexity projects are often designed with a strong focus on the expertise of the individuals; limited planning is invested in the dynamics of the group to ensure it works effectively as a team, including the creation of an environment of ‘psychological safety’.
· Organisational Culture – Strong bias to data-driven decision making combined with a presumption that how things are being done (and have been done for years) is a sound way to respond. Any challenger is therefore required to present data that is not going to be available in the short term.

For Apache Energy, the Varanus Island Gas Plant has been shut down since June 3; the immediate consequence of the explosion was that WA’s domestic gas supplies were reduced by 30%. Wesfarmers is reported to be facing a total cost blowout of $A120 million, paying $A20 million a month for replacement gas supplies, and Alcoa is now facing a loss this quarter of $A65 million due to the same incident.

A valve replacement is reported to be a major directly contributing cause of the explosion but from a systemic change perspective the problem-solving culture may play a bigger role in deploying a corrective action strategy that would prevent a similar incident from re-occurring.

Police Superintendent Dave Parkinson is reported to have said “Apache had been told of a potential problem with a valve similar to the one the company is now trying to replace as a result of the explosion”. When told to buy a duplicate valve, Apache had commented "how can we justify having an $8 million component sitting on the shelf?" Mr. Parkinson commented "I got the impression they were not taking the need for contingencies too seriously".

The study by Roberto, Bohmer and Edmondson noted that enterprises that respond well to an ambiguous threat do not improvise during the “recovery window”; they set in place a rigorous process of detection and response capabilities that they have developed and practised prior to the crisis.

This rigorous process of detection and response includes:
· Rapid problem-solving and teamwork skills
· Recognise and take advantage of the ‘recovery window’
· Amplify the threat, making it culturally ‘safe’ for employees to ask potentially disconcerting ‘what-if’ questions.
· Explore (scenario model) possible responses to threats through low-cost experimentation.

We look forward to seeing what the findings of the investigation reveal and hopefully they will be using a root cause analysis tool like REASON® that will reveal causes that take them beyond the immediate engineering fix to the real structural and systemic issues that could provide valuable lessons to all involved in the Energy Sector.

If you would like more information on developing your own internal problem-solving capacities contact Kimmaree Thompson at srp@systemic-resilient-precision.biz

© 2008 K. A Thompson http://www.systemic-resilient-precision.biz/
Please cite web-link with reference.

REFERENCE & LINKS TO RELATED ARTICLES:
http://www.theaustralian.news.com.au/story/0,25197,23995860-5005200,00.html
http://news.smh.com.au/national/police-warned-varanus-plant-report-20080628-2yez.html
http://www.businessspectator.com.au/bs.nsf/Article/Wesfarmers-counts-up-lost-hatchlings-from-Varanus--FXVBD?OpenDocument
http://www.theaustralian.news.com.au/story/0,25197,23883858-2702,00.html
http://harvardbusinessonline.hbsp.harvard.edu/hbrol/en/search/saSearchResults.jhtml?Ntt=R0611F&N=0&Ntk=hbrsa&Ntx=mode%2Bmatchallpartial&x=19&y=13

Monday, July 7, 2008

Westpac: +$1 Billion Incorrect Transactions

Tens of thousands of people were either double paid or missed out on getting their pay due to a processing error by Westpac Bank.

Westpac is reported to have confirmed that a “processing error” caused $1 billion of incorrect transactions to be processed on their computerised payroll & Direct Debit system.

The Sydney Morning Herald reported that the Finance Sector Union Spokesman attributed the error to a review of backroom processing that had left processing staff under pressure, believing that their jobs may be sent offshore.

In his recent Harvard article Michael Hammer, the father of BPR (business process reengineering), identified 6 factors that made the difference between successful and unsuccessful process innovation.

Michael Hammer also gives Gail Kelly, the new CEO at Westpac, some good support for sticking with it when things go wrong.

The 6 success factors he identified are:
1- Process Focus – describe your enterprise within an enterprise process model (a small number of value-creating end-to-end processes).
2- Process owners – senior executives empowered to make decisions across the enterprise, overcoming silo-focused resistance.
3- Full-time design team – don’t try to get your staff to fit it in while doing their real job!
4- Managerial engagement – don’t let the team’s work sit in reports awaiting endorsement; have a senior executive charged with sponsoring fast approvals.
5- Buy-in – the people at the front will be the ones doing things differently; don’t let this be a surprise to them, keep them informed.
6- Bias for action – “the perfect is the enemy of the good” (Voltaire). To maintain momentum, implement at a defined standard; fine-tune when it is operational.

While the SMH article did not provide enough information to make even an uneducated guess at the contributing factors, when looking at the comments by the Union Spokesperson you get the feeling that Item 5 may be a bit harder to implement if you are considering offshoring the work that the frontline people are doing.

In defence of Michael Hammer on Item 5, the Union Spokesperson’s comments also indicated that any tension or apprehension the frontline staff may feel could be based on just gossip and rumour. If it is just speculation, anxiety levels might have been reduced if the frontline people had been provided with frequent up-to-date information on the progress of the internal review.

In the RCAs we are conducting, the tight employment market is showing itself to be an increasingly relevant causal factor in operations process failure.

We are finding that enterprises are unable to consistently fill jobs with staff that have the same level of experience and skill as the staff they were employing to fill similar jobs three years ago, or even 12 months ago.

The employment of less experienced staff is often just overlooked on a day-to-day operating basis. No changes are being made to increase the resilience of the business process that supports the less experienced staff member. The opportunity for new process failures to occur consequently increases.

Where a more experienced worker will monitor or correct obvious errors, the less experienced staff member just does not notice the error and therefore does not correct it.

A high-profile process failure, like the one at Westpac, would keep any CEO awake at night, but the return on investment from process innovation is huge; in Michael Hammer’s case study a logistics firm generated hundreds of millions of dollars per annum, in just two years, from one process innovation initiative: improving response time on RFPs.

Here is hoping that Gail Kelly keeps on remembering all of her earlier victories during the next couple of days. This event will end up being just a “blip” on what has otherwise been a golden start to her new role as CEO at Westpac.

© 2008 K. A Thompson http://www.systemic-resilient-precision.biz/ Please cite web-link with reference.
REFERENCES: Michael Hammer - http://www.hammerandco.com
Harvard Management Update Reprint # U0504B http://www.hbsp.harvard.edu/
SMH Article: http://www.smh.com.au/news/technology/computer-glitch-sparks-westpac-chaos/2008/07/03/1214950997519.html
