The concern related to the removal of paper train graphs to accommodate new train planning software prior to the software being adequately tested in an operational environment.
The reporter raised a safety concern regarding the new train planning software, [product name], implemented at control boards [Location 1] and [Location 2].
The reporter advised that [product name] was designed to replace the manual train graphs currently being utilised by network controllers for route planning. The reporter believes that [Product name] was not adequately tested prior to its implementation, resulting in the software effectively being tested in an operational environment for the first time, when it went live.
The reporter stated that the software has numerous glitches that need to be resolved before it is fit for operational use. [Product name], in its current format, has regularly been generating unreliable and unachievable train plans, resulting in controllers needing to manipulate the plans or ignore the system-generated plans completely and develop new ones. Due to the removal of the train graphs when [Product name] was introduced, there has been no mechanism for controllers to accurately record controller-created or amended train plans. This has resulted in controllers memorising plans, which has obvious safety implications.
The reporter advised that they and other controllers have experienced workloads so high that they have exceeded their maximum capacity to process information. The reporter believes that these high workloads have created controller distraction and resulted in numerous controller errors. The reporter states that after three weeks of use, [Operator] recognised that [product name] was not yet suitable for operational use and ceased using the system during peak times from 06:00 – 18:00 daily, with train graphs reinstated for use during these periods. However, [product name] is still required to be used during off-peak times from 18:00 – 06:00 daily.
The reporter acknowledges that whilst there is less traffic during these periods, [product name] remains an inadequate system to create and record train plans. Relying on controllers to memorise train plans, during a night period where fatigue levels are undoubtedly higher, is creating a situation where distractions and mistakes are inevitable.
In addition to [product name], all safe working rules relating to the train plans must be entered into a separate side application, which the reporter states is also not yet fit for purpose. Whilst the safe working rules have always been required to be managed separately from the train plans, the new application simply creates more work for the controller by adding an extra data entry task.
The reporter states that they believe [product name] has the potential to be a highly beneficial system once the software glitches have been rectified. However, the reporter believes that [product name] should be taken completely offline until testing is finalised.
Concerns regarding the Implementation of Digital Train Planning
[Operator] is currently implementing a digital train planning solution and can confirm that the safety-critical nature of a network controller’s work has been emphasised throughout the design, risk review, consultation and involvement of staff, testing, and training of personnel during this project.
The digital train planning solution is the tool that assists network controllers to plan train movements, which they then execute through the Train Control System (TCS). With the removal of paper train graphs, the permanent form recording requirements which were previously written by hand on the paper train graph have been digitised and are input into the solution by the network controller.
Whilst not specifically clear from the advice provided, [Operator] believes the concern here more closely reflects the areas of plan quality where the train plan updates. The solution takes inputs from the operational environment and creates new plans designed to optimise the efficiency of the network; as the operational environment changes, the tool re-plans the network in consideration of these changes.
When network controllers intervene in the automated route plan, the solution also records and displays the altered plan and actual train journey accordingly. The solution does not require a network controller to memorise a plan as the solution produces and displays the plan for application in the TCS. The digital train planning solution captures the history of actual train movements, which a network controller can view at any time.
[Operator] recognises that there are opportunities to enhance the systems and, through engagement and feedback from the network controllers, a list of identified improvements for the user experience exists for consideration in the future release of Supporting Applications. The Safe Working Rules remain unchanged by the implementation of the Digital Train Planning Solution.
Risk Assessment and Planning
For the implementation of the digital train planning solution, a safety assurance plan was developed through consultation and analysis with network controllers and key stakeholder groups. This included a human-factors assessment covering both workload and ergonomics factors related to the safety-critical aspects of a network controller’s role. An output of this work was the development of supporting technical documentation identifying proposed risk treatments that would further reduce risk so far as is reasonably practicable. These reviews highlighted that the implementation into the operational environment required a phased deployment approach. The scale of change for network controllers was previously identified and was critical in the subsequent supporting processes, tools, resourcing, training and product testing/integrity. Engagement with network controllers was fundamental to identifying change plans.
A ‘super user’ group was created comprising network controllers from each respective team to ensure support coverage and ongoing consultation throughout the course of the project. Network controllers were engaged as Subject Matter Experts (SMEs). A detailed training process including 3-day classroom training, rostered time on a digital train planning simulator and assessment for competence on the new train planning solution was also provided.
Solution Testing
Prior to the phased implementation (Go Live), formal solution testing and calibration were undertaken for use of the solution in the live operational environment. This included dry-run testing of more than 250 test cases, completed in a structured format over a four-week period to validate the supporting IT infrastructure. Formal systems integration testing was conducted in five test cycles over a nine-week period on an endorsed suite of over 250 test cases, ending in February 2019 and utilising the SMEs and external specialist testing resources full time.
Following this, user acceptance testing was conducted, involving a formal validation of the end-to-end solution within a defined area, being two network control boards. This was undertaken within the test environment, using live operational train running information. It involved the execution of more than 250 test cases over a period of approximately twelve weeks, run in ‘parallel’ to a network controller’s ‘board’, which satisfied readiness for deployment into live operations.
Upon demonstrating readiness, intensive resourcing and deployment actions were finalised, including ongoing IT systems support and defect management incorporating defined severities of issues across vendors. A further risk analysis was undertaken prior to ‘Go Live’ to ensure that adequate processes were in place, including the roll back plan. Training and supporting processes for the use of a ‘Supporting Applications’ tool, which streamlines the input into the digital train planning solution in one user interface rather than multiple sources, were developed and assessed for readiness.
[Operator] also enacted a series of review points to enable system monitoring including routine involvement of the network control team and leaders to ensure suitability for continuance of the solution implementation.
Solution testing was conducted over months with SMEs experienced with the [Location] network operation and train control functions. Ultimately, after satisfying these requirements, using the solution operationally, with the train signalling within the TCS following the digital train plan, was an essential component of verification processes for sustained operations. Further support was provided through the deployment of specialised support teams 24/7 during the phased implementation to monitor the solution performance and to assist the network controllers (and leaders).
[Operator] does not consider that this controlled phase of deployment should be regarded as ‘testing of the system’, as was raised in the ATSB REPCON; this does not represent the rigorous and voluminous testing conducted since late 2018.
Status of implementation of the Digital Train Planning solutions
An issue/defect identification process is in place which, with network controllers’ assistance, has culminated in the categorisation of eight issues, which [Operator] does not consider ‘glitches’ using the normal definition. Several data and system calibration issues were identified and resolved during the first three weeks of operations.
[Operator], in consultation with its network control teams, has adjusted the digital train planning solution utilisation times to avoid potential for task distraction, including the introduction of changed times of system usage based on feedback from network controllers. Each team has made the respective changeover to using the digital train planning solution based on its own workload, train running status and review of the operational environment at the time.
It is also noted that [Operator] has a structured fatigue management process and supporting tools in place at the network control centre [Location]. These changes, made in prior years based on staff feedback at the time, are generally supported by the network control team. [Operator] has systems defining the way in which events, including fatigue, are reported and investigated, with deliberate intent to understand systemic failures and how to prevent recurrence of similar events. Management will continue to encourage the use of this system and the raising of fatigue concerns.
[Operator] continues to work with its primary vendor to resolve and deploy resolutions to two pre-identified product items that will require further solution testing and calibration activities, including the network controllers during phased deployment.
[Operator] has scheduled a post-implementation review for early 2020, once all boards are ‘live’.
ONRSR has reviewed the reporter’s concerns and the operator’s response and in this instance is satisfied that the operator is addressing the issues raised and is managing the associated risks so far as is reasonably practicable.