Aviation Investigation Report A02P0261

1.0 Factual Information

1.1 The Accident

The engine (CFM International [CFM]¹ CFM56-5C4, serial number 741-705) shut down without any warning given to the pilots or recorded by the centralized fault display system (CFDS).² The flight crew contacted Cathay Pacific Airways technical personnel in Hong Kong, and, as a result of these discussions, the pilots decided to divert to Vancouver, British Columbia. There were no signs of engine compressor (N1) or turbine (N2) rotation from windmilling, leading the pilots to assess that the engine had seized. Therefore, they did not attempt to restart the engine.

The operation of each of the four engines on the Airbus A340-300 is controlled by the full authority digital engine control (FADEC) system. The FADEC comprises many components, two of which are the electronic control unit (ECU) and the permanent magnet alternator (PMA). The ECU receives electrical power from the aircraft during the engine start sequence. Once the engine has attained sufficient speed, electrical power is provided by the PMA, which is driven by the engine accessory gearbox. Should the PMA fail at any time during engine operation, the ECU, by design, acquires electrical power from another aircraft source. The number 1 engine had accumulated 15 527 hours and 2622 cycles before the shutdown. The accessory gearbox and the PMA itself had accumulated 15 508 hours and 2619 cycles.

On the ground in Vancouver, maintenance personnel printed out a post-flight report from the CFDS. The report revealed no indication of the cause of the shutdown. They then examined the engine by borescope, verifying that the engine had not seized as the engine was rotated during the examination, and checked the accessory gearbox oil filter for contamination. No anomalies were detected. Maintenance personnel performed a non-motoring test to check the engine parameters and the ECU computer system. During this test, the N2 only reached 14 per cent rpm instead of the expected 28 per cent rpm. According to CFM, this lower-than-expected N2 speed is characteristic of a failure of either the PMA or the ECU. The PMA and the ECU computer were then removed. Maintenance personnel noted scoring and burning on the PMA rotor and stator and assessed excessive play in the drive shaft for the PMA rotor. Post-incident analysis shows that this indicates a potentially damaged drive shaft bearing.

Maintenance personnel found neither the procedures for measuring or checking the play of this drive shaft nor any reference to rotor scoring in the approved aircraft maintenance or troubleshooting manuals. Cathay Pacific Airways technical support in Hong Kong were similarly unable to find any information on drive shaft play or rotor scoring. TSB investigators also did not find any pertinent information.

The PMA and the ECU computer were replaced with serviceable units, and another non-motoring test was conducted. In this test, the N2 reached the required 28 per cent rpm. A full engine test run was carried out, but after about 10 minutes, the engine shut itself down. As with the in-flight scenario, there was no advance warning or CFDS record of this shutdown.

When the replacement PMA was removed and inspected, it showed scoring and burning similar to the original PMA. The entire PMA drive shaft assembly - comprising PMA rotor, roller bearing, drive shaft, ball bearing support, and ball bearing - was removed and examined. A visible crack was found in the ball bearing cage that supports the drive shaft where it exits the gearbox (see Photo 1).

Photo 1 - Ball bearing cage

The crack could not be seen with the drive shaft assembly in place in the gearbox. A new drive shaft assembly and a third PMA were then installed, and another engine run was performed, this time without anomaly.

The ECU (part number 1851M42P06, serial number ECDN3879, software version C.3.G) installed on the number 1 engine at the time of the incident was subsequently sent to the component manufacturer for examination and test. No defects were identified, and the ECU was returned to the operator as a serviceable unit.

The PMA drive shaft assembly was sent to CFM for examination and analysis. The CFM analysis of the failed ball bearing (part number 305-100-410-0, serial number UR06967) indicated that there was generalized spalling³ of the balls, wear on the cage pockets (including a fractured pocket), and sectorial spalling on 90º of the inner race (see photos 1 and 2). The ball bearing at this particular location is subject to temperatures as high as 160ºC, and it rotates at about 20 000 rpm. There was no indication of corrosion. The roller bearing (part number 301-480-926-0) had two separate serial numbers on the races (inner race: UR31008, outer race: UR28466). No other anomalies were found on the remainder of the drive shaft components. The root cause of the spalling was not determined by CFM.

Photo 2 - Spalled balls

The ball bearing had been the subject of CFM Service Bulletin (SB) 72-457, issued in July 2001. The purpose of the SB was to force the introduction of the second source bearing (part number 305-100-415-0, manufactured by SNFA) already available in the fleet and with a better service experience. This bearing replaced the bearing made by SNR Roulements (part number 305-100-410-0), which was identified as an infant mortality⁴ issue. The SB applied to accessory gearboxes equipped with drive shaft bearings (part number 305-100-410-0) and having accumulated fewer than 1500 cycles, but applied to only 17 individual CFM56-5C engines. The serial number of the incident engine was not included in the 17 engines prescribed in the SB; accordingly, the SB did not apply and the ball bearing had not been replaced.

1.2 Permanent Magnet Alternator (PMA) Bearing Failures

A search of the Transport Canada (TC) Service Difficulty Report (SDR) database did not reveal any malfunctions of either the PMA or the PMA drive shaft bearing. However, the CFM report of the failed bearing stated that there have been at least 26 reported failures of the PMA drive shaft bearing, from a total of about 3400 engines in the CFM56-5 series, which includes the Airbus A319, A320, A321, and A340 model aircraft. The CFM report revealed that bearings from two separate manufacturers have suffered similar types of failures. The CFM report indicated that there are two main contributors to the failure of the bearing: radial overload stress causing spalling on the inner race, and corrosion causing spalling on the outer race. This ball bearing unit is also used in other locations in the gearbox where no failures have been found or reported.
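For scale, the counts quoted from the CFM report can be turned into a rough fraction of engines affected. The following is a minimal sketch of that arithmetic only. It assumes that each reported failure occurred on a different engine, which the report does not state, and both inputs are approximate ("at least 26", "about 3400"), so the result is indicative rather than a failure rate.

```python
# Counts quoted from the CFM report; both are approximate.
reported_failures = 26      # "at least 26" PMA drive shaft bearing failures
engines_in_series = 3400    # "about 3400" engines in the CFM56-5 series

# Assumption (not stated in the report): one failure per distinct engine.
failure_fraction = reported_failures / engines_in_series

print(f"Reported failures affect roughly {failure_fraction:.2%} of engines")
```

Because the failure count is a lower bound and exposure (hours or cycles per engine) is not given, this figure cannot be converted into a per-hour or per-cycle rate.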

CFM concluded that the incident ball bearing failed as a result of radial overload stress, which induced inner race spalling, the origin of which occurred 50 to 70 µm in depth. Radial overload stress, also known as "contact fatigue," results from two curved surfaces moving over each other in a rolling motion, as seen in a ball bearing over a raceway.⁵ The contact geometry and the motion of the rolling elements produce alternating subsurface shear stress, which accumulates and generates cracking. The cracking then propagates until a surface pit is formed and spalling results. If this degeneration continues, complete bearing failure occurs. Rolling contact components have a fatigue life, that is, a number of cycles to develop a noticeable fatigue spall.

It should be noted that, unlike aircraft cycles, rolling contact cycles are 7 to 10 orders of magnitude greater. Rolling contact life typically involves cycle counts in the order of 10⁶ to 10⁷ before a noticeable fatigue spall develops. Correct and adequate lubrication of all bearings is essential for bearing life; proper oil delivery, temperature, and viscosity reduce wear and spalling, and increase fatigue life.
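The gap between aircraft cycles and rolling contact cycles can be illustrated with figures given elsewhere in this report: a shaft speed of about 20 000 rpm, and 15 508 hours and 2619 cycles accumulated by the accessory gearbox and PMA. The sketch below is a rough estimate only. It assumes the shaft turned at a constant 20 000 rpm for every operating hour, and it counts shaft revolutions, which are not the same as stress cycles at a given point on a race; both simplifications are assumptions, not findings of the investigation.

```python
import math

# Figures taken from the report.
shaft_rpm = 20_000          # approximate PMA drive shaft bearing speed
operating_hours = 15_508    # accessory gearbox / PMA hours at shutdown
aircraft_cycles = 2_619     # accessory gearbox / PMA cycles at shutdown

# Assumption: constant shaft speed over all operating hours.
total_revolutions = shaft_rpm * 60 * operating_hours
revolutions_per_aircraft_cycle = total_revolutions / aircraft_cycles

print(f"Total shaft revolutions:        {total_revolutions:.3e}")
print(f"Revolutions per aircraft cycle: {revolutions_per_aircraft_cycle:.3e}")
print(f"Orders of magnitude (log10):    "
      f"{math.log10(revolutions_per_aircraft_cycle):.2f}")
```

Under these assumptions, each aircraft cycle corresponds to several million shaft revolutions, which is why aircraft cycle counts are a poor proxy for rolling contact fatigue exposure.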

The PMA drive shaft assembly was also examined and analyzed at the TSB regional wreckage examination facility. In particular, the TSB examination focused on indications of electrical arcing that had previously been documented as causing similar spalling on ball bearings. The TSB examination did not find any indication of electrical arcing, and the root cause of the spalling was not determined.

TSB investigators determined that there is an extreme variation in the average total aircraft cycles before failure of the ball bearings due to spalling and other unknown causes. However, there is no direct correlation between these aircraft cycles and rolling contact fatigue cycles; the significant issue is the wide range of aircraft cycles before failure. Despite industry compliance with SB 72-457, failures of the ball bearing continued regardless of part number or manufacturer. The TSB also determined that the accessory gearboxes for the Airbus A319, A320, and A321 aircraft models, and the Boeing 777 aircraft series, are manufactured by the same company, Hispano-Suiza. All of these aircraft types have experienced PMA drive shaft bearing failures.

The roller bearing races are located at the opposite end of the drive shaft and are different from the ball bearings. These roller bearing races, previously identified as having different serial numbers for the inner and outer races, are normally a matched set, and according to CFM, the serial numbers for both races should be the same. The roller bearing did not exhibit any unusual wear characteristics, and neither the TSB nor CFM determined whether this serial number anomaly contributed to the failure of the ball bearing. It is noteworthy that previous ball bearing failures have occurred with correctly matched roller bearing races.

The Airbus A340-300 is a fly-by-wire type of aircraft in which there are no conventional mechanical flight controls or engine controls; operation of the flight control surfaces and the engines by the pilots is routed through onboard computers. The exception is mechanical backup or control of the trimmable horizontal stabilizer and rudder. There is no provision for mechanical engine operation should the FADEC system fail. By design, the ECU automatically acquires electrical power from other aircraft sources when a PMA fails. A failure of the PMA is indicated on the cockpit CFDS.

The U.S. Federal Aviation Administration (FAA), in Federal Aviation Regulation (FAR) 33.28, and Canadian Aviation Regulations, Commercial Air Standards, Part V, Airworthiness Manual, Section 533.28, require, in part, that each engine control system that relies on electrical and electronic means for normal operation must inter alia:

  • (b) Be designed and constructed so that any failure of aircraft-supplied power or data will not result in an unacceptable change in power or thrust or prevent continued safe operation of the engine;
  • (c) Be designed and constructed so that no single failure or malfunction, or probable combination of failures of electrical or electronic components of the control system, results in an unsafe condition;
  • (e) Have all associated software designed and implemented to prevent errors that would result in an unacceptable loss of power or thrust, or other unsafe condition and have the method used to design and implement the software approved by the [Administrator/Minister].

Failure of the ECU to acquire electrical power from other aircraft sources during a PMA failure has caused in-flight shutdown (IFSD) events in several recent aircraft incidents. Notably these include Singapore Airlines (A340-May 1999), Virgin Airlines (A340-May 1999), and Ansett (A320-September 1999).

Further investigation by the TSB determined that the failure of the ECU to acquire other aircraft electrical power is not isolated to the Airbus A340 or to the CFM56-5C engine. An FAA aviation safety report (number 295661) reported an IFSD on an Airbus A320 caused by a faulty PMA. As recently as 14 May 2003, a Boeing 777 equipped with two Rolls-Royce Trent 800 engines suffered a similar failure in flight, and one engine shut down.

2.0 Analysis

2.1 General

Although the root cause of the spalling could not be determined, it is likely that the initial cause is one of design, application, or both. At the time of the incident, the two manufacturers of the bearings were experiencing similar types of failures, but not to the same extent, including an extreme variation in aircraft cycles before failure. Such variation does not lead to reasonable predictability of bearing failure. The bearings were also failing in various aircraft/engine combinations. Likely scenarios that could explain these failures are:

  • The bearing is subject to temperatures of between 120ºC and 160ºC and spins at 20 000 rpm. It may be under-designed for the application, thereby resulting in premature failure.
  • Oil delivery may be inadequate and oil temperature may be excessive, inducing premature wear, spalling, and fatigue. The origin of the failure on the inner race, 50 to 70 µm in depth, indicates that lubrication is a critical factor within the application.
  • Because of the high rpm of the PMA assembly, any instance of incorrect balancing - either initial or after maintenance - may subject the bearing to stresses beyond design tolerances.
  • Corrosion of the bearing due to improper storage or maintenance practices may result in premature failure. However, there was no evidence to suggest that corrosion was a factor in this particular occurrence.

2.2 Airbus A340 Maintenance Manuals

Radial and axial movement of the PMA drive shaft alone is not a conclusive indicator of bearing condition but, combined with scoring on the PMA rotor, is a reliable indicator of a failed PMA drive shaft bearing. Neither the Airbus A340 maintenance manual nor the fault isolation manual prescribes limits for radial or axial movement of the PMA drive shaft, or contains notations that scoring of the PMA rotor may indicate a damaged or worn drive shaft bearing. Without such information, maintenance technicians were unaware that the PMA drive shaft was faulty and dismissed the unusual score marks on the PMA rotor. This additional information would have facilitated more effective troubleshooting and probably precluded the failure of the second PMA during test, but it is unlikely that it would have prevented the in-flight incident.

2.3 Electronic Control Unit (ECU) Electrical Power Transfer

Technical examination revealed that an intermittent short circuit occurred in the PMA when failure of the ball bearing caused the rotor to contact the stator. The PMA was then unable to generate reliable electrical power for the ECU. The ECU continuously monitors the PMA, and, if the PMA no longer generates the required electrical power, the ECU will switch to other aircraft electrical power sources. The switch to other electrical sources, when it occurs, is rapid, usually with no significant change in engine performance. In this incident, the ECU became stuck in an endless loop of re-acquiring and losing PMA power due to the intermittent nature of the PMA failure. With no reliable or consistent source of electrical power, the engine eventually shut itself down. Without electrical power to the ECU, engine conditions were not transmitted to the cockpit instruments or CFDS, thus leading the pilots to assess that the engine had seized. CFM subsequently identified a problem with software version C.3.G, in the ECU, that prevented the switch-over to other sources of aircraft electrical power. The CFM document, entitled CFM56-5 Fleet Highlights (publication 00-01-7263-07), indicates that CFM has been aware of this deficiency since November 1999. Improved ECU software logic for better transfer to aircraft power was developed in early 2000 but was not certified until November 2003. The ECU software revision was identified by Airbus as a non-critical item, and non-critical ECU software revisions have taken two to three years to be implemented.
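The difficulty posed by an intermittent power source can be shown with a small simulation. The sketch below is purely illustrative and is not a reconstruction of ECU software version C.3.G or of any later version; the function names, the sample sequence, and the five-sample threshold are all hypothetical. It compares a selector that follows the PMA whenever the current sample looks healthy with one that leaves the PMA on the first bad sample and returns only after a run of consecutive healthy samples.

```python
from typing import Iterable, List


def naive_select(pma_ok: Iterable[bool]) -> List[str]:
    """Follow the PMA whenever the current sample looks healthy."""
    return ["PMA" if ok else "AIRCRAFT" for ok in pma_ok]


def debounced_select(pma_ok: Iterable[bool], healthy_needed: int = 5) -> List[str]:
    """Leave the PMA on the first bad sample; return to it only after
    `healthy_needed` consecutive healthy samples (hypothetical threshold)."""
    source, streak, chosen = "PMA", 0, []
    for ok in pma_ok:
        if source == "PMA":
            if not ok:
                source, streak = "AIRCRAFT", 0
        else:
            streak = streak + 1 if ok else 0
            if streak >= healthy_needed:
                source, streak = "PMA", 0
        chosen.append(source)
    return chosen


def count_switches(sources: List[str]) -> int:
    """Number of times the selected source changes between samples."""
    return sum(1 for a, b in zip(sources, sources[1:]) if a != b)


# Hypothetical intermittent PMA: three healthy samples, then alternating
# bad/good samples such as rotor-stator contact might produce.
samples = [True] * 3 + [False, True] * 6

naive = naive_select(samples)
debounced = debounced_select(samples)

print("naive switches:    ", count_switches(naive))
print("debounced switches:", count_switches(debounced))
print("debounced final source:", debounced[-1])
```

With this input, the sample-by-sample selector changes source on every sample once the fault begins, while the debounced selector changes once and remains on aircraft power. The design point is only that an intermittent source needs to be treated as failed until it has proven stable; how the certified software achieves this is not described in this report.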

The FADEC system designed for use with the Airbus A340/CFM56-5C aircraft/engine combination was certified, in part, in accordance with FAR 33.28. In general, this rule is to minimize the probability that a FADEC system failure will adversely affect an otherwise serviceable engine. Specifically, the intent of FAR 33.28(c) is to ensure that the FADEC provides an engine control system that is considered equivalent in safety and reliability to one based on hydromechanical technology. To accomplish this, the FADEC system must be designed and certified to degrade in a fail-safe manner. That is, the design and certification process assumes that the FADEC will fail and ensures that the resulting failure condition does not jeopardize continued safe flight and landing. In the case of a loss of PMA electrical power, the FADEC fail-safe design used in the Airbus A340/CFM56-5C aircraft/engine combination relies on ECU software to acquire aircraft electrical power and prevent an unintentional IFSD.

Additionally, FAR 33.28(e) requires that all FADEC software be designed and implemented to prevent errors that would result in an unacceptable loss of power or thrust. Assuming that an unintentional IFSD would be categorized as an unacceptable loss of power or thrust, then a validation of ECU software would be required as part of the certification of the FADEC system. However, as this occurrence illustrates, the failure of the ECU to acquire power from the aircraft, due to a known software deficiency, raises concerns about both the continued airworthiness of the FADEC system and the certification process that approved the Airbus A340/CFM56-5C aircraft/engine combination.

Failure of the ECU to acquire other aircraft electrical power during a PMA failure has caused IFSD events in several other recent aircraft incidents. The failure of the ECU to acquire other aircraft electrical power is not isolated to the Airbus A340 or the CFM56-5C engine.

It is clear that the engine electronic controls should be capable of operation in the event of a total PMA failure; however, with latent deficiencies in the software of CFM56-5C FADEC systems, and potentially with other aircraft/engine combinations, it is likely that an engine will shut down during the loss of electrical power from the PMA.

3.0 Conclusions

3.1 Findings as to Causes and Contributing Factors

  1. As a result of radial overload stress (contact fatigue), spalling damage occurred to the balls in the inner race of the ball bearing on the drive shaft of the permanent magnet alternator (PMA) on the number 1 engine, resulting in bearing failure.
  2. It is likely that oil delivery, component design or inappropriate application, or a combination of factors, led to the contact fatigue of the ball bearing balls.
  3. When the bearing failed, the PMA rotor contacted the stator and created an intermittent short-circuit in the PMA, thereby removing the required electrical power to the electronic control unit (ECU).
  4. Because of a known deficiency in the ECU software, when the ECU lost power due to the intermittent failure of the PMA, it was unable to acquire alternate electrical power from the aircraft, as it was designed to do.
  5. The number 1 engine shut down spontaneously as a result of the ECU losing electrical power.

3.2 Findings as to Risk

  1. Scoring of the PMA rotor, combined with drive shaft play, is a reliable indicator of a damaged or worn drive shaft bearing. The Airbus A340 maintenance manual and the fault isolation manual do not contain information about such scoring, and, as a result, maintenance technicians dismissed the tell-tale score marks on the PMA rotor.
  2. Written procedures regarding the play of the PMA drive shaft, or notations about rotor scoring, would have provided maintenance personnel with the ability to troubleshoot more effectively and identify the failed components in a more timely manner. Failure of the second PMA during test likely would have been avoided.
  3. Revisions that correct software deficiencies in the ECU, when identified by Airbus as non-critical items, can take two to three years to implement across the various engine programs.
  4. The software deficiency that prevented the ECU from acquiring aircraft power was not detected during the certification process, indicating that there is a risk of other software anomalies not being detected during certification.

3.3 Other Findings

  1. The roller bearing in the PMA had two different serial numbers on the inner and outer races instead of being a matched set as required by the manufacturer.

4.0 Safety Action

4.1 Action Taken

CFM International (CFM) issued Service Bulletin (SB) 73-0126 (published as CFM56-5C SB 73-0126, dated 13 November 2003). The SB changes the electronic control unit (ECU) software version from C.3.G to C.3.J and ensures that ECU electrical power successfully reverts to aircraft power in the event of a complete or partial permanent magnet alternator (PMA) failure. While this SB applies only to the Airbus A340 and the CFM56-5C engines, all CFM ECU software for the CFM56-5 series will have the improved logic at the next scheduled version release.

In October 2003, Airbus revised the A340 maintenance manual to include specific checks during the removal of the PMA for evidence of rotor/stator contact and radial play of the PMA drive shaft.

4.2 Action Required

4.2.1 Continuing Airworthiness

SB 73-0126 will update the ECU software to ensure that electrical power will successfully revert to aircraft power. This SB applies only to the Airbus A340 aircraft, and, although CFM recommends implementation within six months, the actual timeframe for accomplishing this SB is at the discretion of the operator. Additionally, Airbus advises that it has launched similar initiatives to incorporate software updates on CFM56-5A and -5B engines used on its A319, A320, and A321 family of aircraft. It is anticipated that compliance for these SBs will likewise be at the discretion of the operator. As of November 2004, the total number of aircraft in the Canadian civil aircraft register affected by these SBs approximated 120, most of which are two-engine aircraft.

Given the number of aircraft affected, the known problem with PMA bearing failures, the critical function that the ECU software provides in ensuring engine reliability, and the discretionary nature of the proposed software updates, the Board is concerned that, without regulatory intervention, this known unsafe condition will remain in service well beyond the manufacturer's recommended six-month timeframe for the implementation of SB 73-0126. The Board therefore recommends that:

The Direction Générale de l'Aviation Civile and the Federal Aviation Administration issue airworthiness directives to require the implementation of all CFM56-5 series jet engine service bulletins whose purpose is to incorporate software updates designed to ensure that, in the event of a permanent magnet alternator failure, the electronic control unit will revert to aircraft power.
A04-03

Assessment/Reassessment Rating: Fully Satisfactory

The Department of Transport ensure the continued airworthiness of Canadian-registered aircraft fitted with the CFM56-5 series engine by developing an appropriate safety assurance strategy to make certain that, in the event of a permanent magnet alternator failure, the electronic control unit will revert to aircraft power.
A04-04

Assessment Rating: Fully Satisfactory

4.3 Safety Concern

The investigation revealed that full authority digital engine control (FADEC) system software anomalies may not be confined solely to the Airbus A340/CFM56-5C aircraft/engine combination. Similar in-service performance anomalies of other Airbus/CFM aircraft/engine combinations have resulted in the initiation of SB action to update the FADEC system software to prevent unintentional in-flight shutdowns (IFSDs). Further, the Boeing 777/Rolls-Royce Trent 800 aircraft/engine combination has also experienced at least one occurrence wherein the ECU did not acquire aircraft power following a PMA failure. CFM categorized as non-critical an ECU software revision whose intended purpose is to prevent an unintentional IFSD. The resulting span of two to three years taken to implement an update designed to bring the software into compliance with its basis of certification is incompatible with Federal Aviation Regulation 33.28.

The Board believes that recommendations A04-03 and A04-04 above will address the safety deficiencies in the existing aircraft fleet, and notes that new engines will be incorporating the changes needed to address the specific software problems identified in this investigation. However, the Board is concerned that the current certification process, specifically as it relates to FAR 33.28(e), may not be sufficiently rigorous to ensure that software deficiencies are identified and corrected prior to the software being put into general use.

This report concludes the Transportation Safety Board's investigation into this occurrence. Consequently, the Board authorized the release of this report on 12 October 2004.

Appendix A - Glossary

C  Celsius
CFDS  centralized fault display system
CFM  CFM International
DGAC  Direction Générale de l'Aviation Civile (France)
ECU  electronic control unit
FAA  Federal Aviation Administration (U.S.)
FADEC  full authority digital engine control
FAR  Federal Aviation Regulation (U.S.)
IFSD  in-flight shutdown
N1   rotational speed of the low-pressure compressor in rpm
N2   rotational speed of the high-pressure compressor in rpm
nm  nautical mile(s)
PMA  permanent magnet alternator
rpm  revolutions per minute
SB  Service Bulletin
SDR  Service Difficulty Report
Snecma  Société Nationale d'Étude et de Construction de Moteurs d'Aviation (France)
TC  Transport Canada
TSB  Transportation Safety Board of Canada
U.S.  United States
W  west
º degree(s)
µm micrometre(s)

1. CFM International (CFM) is a combination of Snecma of France and General Electric of the United States. The Direction Générale de l'Aviation Civile (DGAC) of France and the Federal Aviation Administration (FAA) of the United States have joint certification responsibilities and equally share continuing airworthiness of the CFM engines. The DGAC and FAA cooperate on all certification and continuing airworthiness issues.

2. See Glossary at Appendix A for all abbreviations and acronyms.

3. Small fragments broken from the face or edge of a material.

4. Infant mortality failures are normal and usually predictable. The failures are caused by defects in a product that cause it to fail early in its lifetime.

5. W.A. Glaeser and S.J. Shaffer, Battelle Laboratories