Sequence of events
The Sydney Terminal Control Unit (TCU) provided air traffic services within 45 NM of Sydney Airport, to a height of 28,000 ft. The normal power supply for the TCU consisted of mains or generator supply feeding into two separate, independent Uninterruptible Power Supply (UPS) units consisting of "A" and "B" systems that shared the electrical load.
A routine six-monthly performance inspection of The Australian Advanced Air Traffic System (TAAATS) Sydney TCU UPS was scheduled from 1500 Eastern Standard Time on 6 July until 1800 the next day, under an approved works plan. This performance inspection was conducted ahead of schedule to avoid the Olympics period.
The electrical technical officers scheduled to conduct the inspection were delayed because of higher priority tasking associated with the Sydney control tower. Subsequent approval to start the inspection at 1800 was gained under normal procedure through the Melbourne Technical Customer Interface and the Sydney TCU Traffic Manager. The Traffic Manager's approval of works plans was not based on any structured risk assessment process. Instead, it relied on the Traffic Manager's experience in assessing and forecasting aircraft movements and staff availability, and on knowledge of the likelihood and outcome of a failure of the TCU power supply. Information in the works plan that there would be no interruption to service, combined with the stability of the power supply during past similar works, was used to support the Traffic Manager's decision to approve the works.
Work started on the performance inspection at 1800:36. At 1822:24 the Sydney TCU sustained a total loss of electrical power. Power was restored at 1822:38, 14 seconds later.
This loss of electrical power caused the failure of the TCU Air Traffic Control (ATC) workstations, software switching of voice communications channels, satellite communications, provision of the Sydney Terminal Approach Radar to Melbourne and Brisbane, and operational room lighting.
The ATC workstations automatically began rebooting after the initial 14 second power outage. The workstations were not available for about 7 to 10 minutes longer while they rebooted.
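The timings above can be cross-checked with a short calculation. The following is only an illustrative sketch: the variable names and the `hhmmss` helper are hypothetical, and the 7 to 10 minute reboot figures are the approximate values given above, so the computed window is indicative rather than exact.

```python
from datetime import datetime, timedelta

def hhmmss(text: str) -> datetime:
    """Parse a report-style time such as '1822:24' (HHMM:SS).

    The date component is arbitrary; only differences between times are used.
    """
    return datetime.strptime(text, "%H%M:%S")

# Times as stated in the sequence of events.
work_started = hhmmss("1800:36")
power_lost = hhmmss("1822:24")
power_restored = hhmmss("1822:38")

# Elapsed time from the start of the inspection to the loss of power.
time_into_inspection = power_lost - work_started

# Duration of the power interruption itself.
outage = power_restored - power_lost

# Approximate window in which the workstations became usable again,
# assuming the reboot began when power was restored (an assumption).
earliest_display = power_restored + timedelta(minutes=7)
latest_display = power_restored + timedelta(minutes=10)

print("Time into inspection:", time_into_inspection)
print("Outage duration (s):", int(outage.total_seconds()))
print("Workstations back between about",
      earliest_display.strftime("%H%M:%S"), "and",
      latest_display.strftime("%H%M:%S"))
```

On these figures the power loss occurred a little under 22 minutes into the inspection, and the workstations would have returned at some point between about 1829 and 1833.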
The air traffic controllers in the TCU were unable to determine the relative positions of aircraft under their jurisdiction for about 7 to 10 minutes. By using the emergency bypass air/ground radio, controllers were able to direct flight crews to keep a visual lookout for aircraft and to turn on their Traffic Alert and Collision Avoidance System (TCAS).
Power was not lost in the Sydney Control Tower, and all systems continued to operate normally except the controller workstations, which went into a "degraded mode" following the initial power outage. The Sydney Tower controllers were nevertheless able to provide normal ATC services to aircraft under their jurisdiction.
Full radar display from the Terminal Area Radar was available on the Tower Data Processing and Display System.
The Brisbane and Melbourne TAAATS Centres provided radar services and limited support during the period that the Sydney TCU ATC workstations were not available.
A review of the air traffic control data recorded at the TAAATS centre in Melbourne showed that there was no infringement of separation standards.
Works plan
The works plan mentioned that an inspection would be carried out between 1500 on 6 June and 1800 on 7 June on the Sydney TCU power supply. The works plan wrongly referred to the month as June instead of July. The works plan stated that during the work there would be one UPS and generator set available and there would be no interruption to the power supply.
UPS
The two UPS units at the Sydney TCU had been upgraded from a 6-pulse system to a 12-pulse system in February 1999. They had been reported as stable and as not having caused any interruption to service since commissioning.
The two separate, independent Uninterruptible Power Supply (UPS) units consisted of "A" and "B" systems that normally shared the electrical load for the TCU. The performance inspection needed the full electrical load of the TCU to be placed on the "B" system while the "A" system was being tested off-line. The electrical technical officers reported that the "B" system was switched from mains bypass to take the full load of the TCU, at which point power to the TCU was lost. They also reported that all indications on the "B" system were normal, except that the output currents indicated zero.
Subsequent independent testing of the UPS equipment by Airservices and the UPS manufacturer found the system performed satisfactorily. Exhaustive testing of a similar UPS at the manufacturer's testing facility could not reproduce a power loss similar to that experienced by the Sydney TCU.
During testing by Airservices, an inconsistency was identified between the Rectifier Current Limit settings of the "A" and "B" systems. The "B" system rectifier current limit setting was set at 0.6V while the "A" system had a higher setting of 1.945V. The lower rectifier current limit setting of the "B" system may have occurred during training on the "B" system in May 2000. Airservices' initial investigation found this setting may have caused the power outage. However, further investigation determined the lower rectifier current limit setting of the "B" system, by itself, would not have caused the power outage to the TCU.
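The kind of inconsistency described above could in principle be detected by comparing the recorded settings of the two units. The sketch below is illustrative only: the function, the dictionary layout and the setting name `rectifier_current_limit_v` are hypothetical and do not represent any Airservices or manufacturer tool. Only the two voltage values come from the findings above.

```python
def find_setting_mismatches(unit_a, unit_b, tolerance=0.0):
    """Return {setting: (value_a, value_b)} for settings that differ.

    A setting missing from one unit is reported with None for that unit.
    """
    mismatches = {}
    for name in sorted(set(unit_a) | set(unit_b)):
        a = unit_a.get(name)
        b = unit_b.get(name)
        if a is None or b is None or abs(a - b) > tolerance:
            mismatches[name] = (a, b)
    return mismatches

# Values reported for the two systems (setting name is an assumed label).
ups_a = {"rectifier_current_limit_v": 1.945}
ups_b = {"rectifier_current_limit_v": 0.6}

mismatches = find_setting_mismatches(ups_a, ups_b)
for name, (a, b) in mismatches.items():
    print(f"{name}: A={a} V, B={b} V")
```

A comparison of this kind only shows that the two units differ. It does not indicate which value is correct, nor that the difference caused the outage.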
The state of various conditions within each UPS unit is monitored and recorded independently by the National Technical Monitoring System. There are inconsistencies between these records and the recollection of site staff for the actions immediately prior to the power loss. The recorded data is not sufficiently comprehensive to allow the conclusive determination of the cause of the power loss.
Radar
Brisbane and Melbourne TAAATS centres operated normally and maintained radar services during the power outage but lost the supply of Sydney Terminal Area and Mount Boyce (Blue Mountains, west of Sydney) radar data.
The Sydney TCU "fallback" radar system that was designed to assist controllers in maintaining situational awareness in the event of a complete TAAATS failure also failed because it was powered by the same power supply that was subject to the power outage.
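The dependency described above, a fallback system fed from the same supply as the system it backs up, is a common-mode failure. A minimal sketch of how such a dependency could be flagged follows. All system and supply names are hypothetical labels chosen for illustration, and the mapping is a simplification of the arrangement described in this report, not a record of the actual power distribution.

```python
def common_mode_pairs(backup_of, power_source):
    """Return (primary, backup, source) where a backup shares its primary's supply."""
    shared = []
    for primary, backup in sorted(backup_of.items()):
        source = power_source[primary]
        if power_source[backup] == source:
            shared.append((primary, backup, source))
    return shared

def systems_lost(failed_source, power_source):
    """Return the systems that lose power when the given supply fails."""
    return sorted(name for name, src in power_source.items() if src == failed_source)

# Simplified, assumed mapping of systems to supplies.
power_source = {
    "taaats_workstations": "tcu_ups",
    "fallback_radar": "tcu_ups",
    "tower_display": "tower_supply",
}

# Which system is intended to back up which.
backup_of = {"taaats_workstations": "fallback_radar"}

print(common_mode_pairs(backup_of, power_source))
print(systems_lost("tcu_ups", power_source))
```

In this simplified model, only the tower display survives a loss of the TCU supply, which is consistent with the radar display remaining available in Sydney Tower.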
Display of the Sydney Terminal Area Radar was available in Sydney Tower on the Tower Data Processing and Display System.
Communications
Recordings of air/ground and ground/ground communications for Sydney Tower and TCU were not available for 73 seconds from 1821:37 until 1822:51.
The Voice Switch panels at each console failed causing the loss of normal air/ground and ground/ground communications. Bypass air/ground communication remained available at those consoles which use remote Very High Frequency (VHF) outlets (50 volt battery powered communications equipment). Bypass air/ground communication was available within a few seconds after the 14-second power outage. Some controllers tried to use the air/ground bypass equipment during the 14-second outage and found that it was not available. The air/ground bypass equipment was used inconsistently depending on which controllers had tried to use the system within the 14 seconds and which controllers tried to use it later.
Following the power outage TCU controllers broadcast advice of the failure to air crew and advised them to keep a visual lookout and to ensure that their Traffic Alert and Collision Avoidance System (TCAS) was switched on. Some inbound aircraft were transferred to Sydney Tower controllers and outbound aircraft were transferred to Melbourne and Brisbane TAAATS centres.
There was no air/ground bypass equipment fitted to the Sydney TCU Directed Traffic Information position.
The individual voice switch restored to a ground/ground bypass mode that allowed controllers access to the PABX. However, that system was not used because the secondary display windows that stored the telephone numbers were not accessible.
The Voice Switch Management Station that controlled restarting of the Voice Switch had an electronic latching switch that activated during the initial power outage. This needed a radio technician to manually reset a switch that allowed the Voice Switch to automatically reboot. The voice switch was not available for TCU controller use until 1832.
Sydney Tower air/ground communications were unaffected. However, ground/ground communications from Sydney Tower controllers to the TCU controllers were unavailable until 1832.
The Airport Rescue and Fire Fighting fire alarm system continued to work without any equipment error recorded.
Sydney TCU supervisor's PABX telephones were available during the power outage using separate handsets and dial pads.
The Sydney TCU Team Leader advised the Sydney Tower Traffic Management Coordinator by telephone that "we've lost everything down here". The Tower Traffic Management Coordinator replied "so have we". The Tower Traffic Management Coordinator did not understand the extent of the failure of the TCU. Consequently the TCU Team Leader did not request radar support from the Tower Traffic Management Coordinator.
Brisbane and Melbourne TAAATS centres were advised of the power outage by telephone from the Sydney TCU Supervisor's position and by relay of messages from aircraft near Sydney by using the air/ground bypass system.
There was no message on the Computerised Automatic Terminal Information System advising flight crews of the reduced services being provided by the Sydney TCU.
Air traffic control equipment
The 14-second power interruption caused the Sydney TCU TAAATS workstations to automatically reboot. The estimated time for the computers to reboot was between 7 and 10 minutes. This left the controllers without any air situation display for between 7 and 10 minutes.
During the outage, tools were not available for the controllers to maintain situational awareness.
The Sydney Tower air situation display lost new code/callsign correlation but kept the existing code/callsign correlations and automatically went into bypass (a degraded mode) because of the loss of the TCU radar data processor. This limited radar display and the Tower Data Processing and Display System were available for reference by the TCU controllers but were not used because of the misunderstanding of the extent of the equipment outage.
Airspace design
The segregated airspace design used in the Sydney TCU provided adequate short-term protection for aircraft to remain separated during the power outage without any ATC intervention.
The investigation found that for operations at Sydney Airport within the curfew period between 2300 and 0600, there was no airspace segregation between the arrival and departure stages of flight.
The Standard Terminal Arrival Route (STAR) communication failure procedures advise flight crew to track to the Sydney VOR and then fly the most suitable instrument approach to the nominated runway in accordance with the Enroute Supplement Australia (ERSA) emergency section. The investigation found the design of some of the instrument approaches into Sydney precluded flight crew from making an instrument approach from overhead the Sydney VOR during communication failure. As an example, flight crews that were advised before the power failure to expect runway 34 Right for arrival would track to the VOR but would not be able to start the runway 34 Right ILS approach, because this approach requires radar vectors to intercept final approach.
Maintenance documentation
The UPS switchboard is physically set up so that, when facing the switchboard, the "A" system is on the left side and the "B" system is on the right side. This is the exact opposite of the schematic diagram for this UPS system, where the "A" system is on the right side of the diagram and the "B" system is on the left side.
There were two different UPS handbooks available to maintenance services staff. One was the Airservices controlled document which was not current for the equipment fitted at Sydney; the other was an updated handbook issued by the manufacturer during staff training in May 2000. The uncontrolled handbook contained the incorrect rectifier current limit setting. The switching procedure for removing and returning a UPS to service had been revised between the two handbooks.
Airways engineering instructions
The electrical performance inspection was carried out under Airway Engineering Instruction (AEI) AEI-3.4053 issue number 4. The purpose of this AEI is "to define the electrical tasks to be carried out during the performance inspection of the static UPS equipment... installed at Airservices Australia facilities". This AEI is non-prescriptive and is a generic instruction to cover all UPS equipment installed at Airservices facilities.
The document does not clearly define the tasks to be conducted, nor does it refer to the manufacturer's instructions about how to correctly carry out those tasks.
The AEI required the full load of the Sydney TCU to be placed on one UPS system. This removes the redundancy of the normal two independent UPS unit configuration and creates a single point of failure.
ATC procedures
The separation assurance techniques used by the Sydney TCU controllers before the power outage aided in maintaining separation standards during the power outage.
Degraded modes procedures were available in check list form for the TCU controllers. The degraded modes check lists were "designed to enable TAAATS controller's to be able to quickly identify degraded or abnormal system operation and to list the procedures:
- to enable an immediate response to maintain safety
- to be adopted for continued safe operations in the degraded mode
- to be adopted when upgrade to normal operations is available"
The check lists were not designed for, nor did they address, the multiple failures associated with the power outage. It was expected that the reliability and redundancies incorporated in the system design would preclude a total electrical failure.
The degraded modes check list recommended that "Operators should maintain familiarity with operating in these degraded modes and the need to prioritise actions in accordance with the following:
SEPARATE
- maintain separation by use of alternative standards if necessary
- issue traffic information as required
COORDINATE
- advise others of your degraded status
- adopt full verbal coordination and handoff procedures
UPDATE
- maintain and modify the flight data record (FDR)"
Sydney Tower controllers stopped aircraft departures from Sydney in accordance with degraded mode procedures.
ATC training
Most of the controllers rostered on duty during the power outage had completed their TAAATS conversion training in 1999; some had been more recently trained. The ATC TAAATS training did not cover or simulate the conditions experienced during the power outage.
Refresher training for elements of degraded operation, such as the use of the air/ground bypass system, had not been conducted since the early TAAATS conversion training in 1999.
Flight crew procedures
The flight crews involved during the incident followed communication failure procedures as published in the Aeronautical Information Publication (AIP). The flight crews offered communications support to the Sydney TCU controllers by relaying controller instructions and advice of the power outage to other ATS units.
Maintenance services electrical personnel
All Sydney electrical technical officers took part in a training course provided by the UPS manufacturer in Sydney from 15-17 May 2000. The training was conducted on the operational UPS equipment.
Electrical technical officers usually worked in pairs when servicing the UPS equipment. These officers did not undertake any form of team resource management training to define their tasks while working as a team.
There were two electrical technical officers involved in testing the UPS. One was in the middle of his shift and the other had started duty at 0600 and had extended his shift to conduct the works plan that led to the occurrence.
According to the Airservices National Technical Certification program (TechCert), both staff held suitable electrical qualifications to conduct the test.
The Airservices TechCert program assesses technical officers on their knowledge of, and ability to safely carry out, the removal of equipment from and its restoration to the national airways system. The TechCert program also requires staff to work on the systems within predetermined timeframes for their TechCert certification to remain current. Without this currency, staff cannot remove equipment from, or restore it to, the national airways system.
Contingency plans
The Sydney Contingency Plan was not activated because of the short duration of the power outage.
The Sydney contingency plan did not include any reference to the loss of the Directed Traffic Information service.
The contingency plan included reference to radar fail procedures and directed further reference to the degraded mode handbook, which contained no radar fail procedure. It was noted that this document was under review at the time of the incident.