We've established that already, as if it ever needed establishing.
Back in the day, before the closed track-circuit (the most important contribution to the safe separation of trains; number 2 being random and mandatory drug and alcohol testing), railroads lacked a method, and a mechanism, for separating field from office, that is, a means for separating the communication of authority from the actual conditions of occupancy in the field that alone determined if the authority was valid or... lapped.
The closed track-circuit began to change all that. With the installation of closed track-circuits, regardless of what authority was communicated by the office, the field determined the restriction to the authority, and communicated that restriction in the field itself.
The vital logic embedded originally in a rule book and timetable became a mechanical and electro-mechanical principle installed in the locking bed of interlockings and in automatic block signal systems, then expanded into the network we call centralized traffic control. It became a logic written into and read by machines, not deus ex machina, but rather deus in machina.
No matter the request from a train dispatcher to provide movement authority, the field apparatus determined, based on field conditions, how much, if any, of the authority could be granted and communicated to the train.
Now railroads have always stipulated that the means of communication from the office were not, themselves, vital. If office communications to the field apparatus were interrupted, no action would be taken; no authority would be granted.
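To make the principle concrete, here is a minimal sketch in Python of the field having the last word. It is a toy model, not any railroad's actual vital logic: the function name `field_grant`, the block names, and the convention that `None` stands for a failed office link are all my assumptions for illustration.

```python
from typing import List, Optional, Sequence, Set


def field_grant(requested: Optional[Sequence[str]], occupied: Set[str]) -> List[str]:
    """Return the part of the requested route the field will actually grant.

    requested: ordered block names the office asks for, or None when the
               office link is down (a hypothetical convention for this sketch).
    occupied:  blocks whose track circuits currently show occupancy.
    """
    if requested is None:
        # The non-vital link has failed: take no action, grant nothing.
        return []
    granted: List[str] = []
    for block in requested:
        if block in occupied:
            # Field conditions restrict the authority, whatever the office asked for.
            break
        granted.append(block)
    return granted


# The office asks for four blocks, but the third is occupied.
print(field_grant(["B1", "B2", "B3", "B4"], occupied={"B3"}))  # ['B1', 'B2']
# The office link is down: no authority is granted.
print(field_grant(None, occupied=set()))  # []
```

The office only ever submits a request. The answer is computed from occupancy, and silence from the office produces the most restrictive result.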
Because the information was vital, but the means were not, the failure of the means of communication had to be treated as if the information had not been transmitted:
214. (2nd paragraph)
If the means of communication fails before an office has repeated an order or has sent the "X" response, the order at the office is of no effect and must be treated as if it had not been sent. (Peter Josserand, Rights of Trains, Fifth Edition, p. 39, Simmons-Boardman 1957)
An order restricting a train's pre-order authority could not be acted upon. No other train then could receive authority conflicting with that first train's original authority.
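The rule quoted above can be read as a small state machine: an order counts only once the repeat or the "X" response is complete, and until then the train's original authority governs. The sketch below is my own illustration of that reading. The class `TrainOrder`, the helper names, and the sample order texts are invented and are not taken from any rule book.

```python
from dataclasses import dataclass


@dataclass
class TrainOrder:
    text: str
    sent: bool = False
    completed: bool = False  # office has repeated the order or sent the "X" response

    def in_effect(self) -> bool:
        # An order is of no effect until the repeat or "X" response is complete.
        return self.sent and self.completed


def transmit(order: TrainOrder, link_fails_before_response: bool) -> TrainOrder:
    """Send an order over a non-vital link that may fail part way through."""
    order.sent = True
    if not link_fails_before_response:
        order.completed = True
    return order


def governing_authority(original: str, order: TrainOrder) -> str:
    """The original authority stands unless the new order is fully in effect."""
    return order.text if order.in_effect() else original


original = "Extra 101 East: Able to Baker"
lost = transmit(TrainOrder("Extra 101 East: Able to Charlie only"),
                link_fails_before_response=True)
# The link failed before the response, so the order is treated as never sent.
print(governing_authority(original, lost))  # Extra 101 East: Able to Baker
```

Because the incomplete order is void at both ends, nobody can act on it, and nobody can issue a conflicting authority on the strength of it.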
And today, or maybe I should say, as recently as today, the means of communication are considered non-vital. Requests for traffic direction, switch alignment and signal display have been (and still are) communicated over telephone lines. Those means of communication can be non-vital, because vitality is preserved in the field.
"As recently as today..." I say that because I'm worried (no big surprise, right? authority, responsibility, worry, the three vertices of the operating officer's triangle). I'm worried because I received this notification just the other week from the government of the UK. With your permission, I'll reproduce the summary of the accident. Without objection? Good.
Summary: On the morning of 20 October 2017, four trains travelled over the Cambrian Coast line, Gwynedd, while temporary speed restriction data was not being sent to the trains by the signalling system. No accident resulted but a train approached a level crossing at 80 km/h (50 mph), significantly exceeding the temporary speed restriction of 30 km/h (19 mph) needed to give adequate warning time for level crossing users.
The line has been operated since 2011 using a pilot installation of the European Rail Traffic Management System (ERTMS) which replaces traditional lineside signals and signs with movement authorities transmitted to trains. These movement authorities include maximum permitted speeds which are displayed to the train driver and used for automatic supervision of train speed.
The temporary speed restriction data was not uploaded during an automated signalling computer restart the previous evening, but a display screen incorrectly showed the restrictions as being loaded for transmission to trains. An independent check of the upload was needed to achieve safety levels given in European standards and the system designer, Ansaldo STS (now part of Hitachi STS), intended that this would be provided by signallers checking the display. A suitable method of assuring that the correct data was provided to the display had not been clearly defined in the software design documentation prepared by Ansaldo STS and the resulting software product included a single point of failure which affected both the data upload and signallers’ display functions. The system safety justification was presented in a non-standard format based on documentation from another project still in development at the time of the Cambrian ERTMS commissioning and which, before completion, made changes that unintentionally mitigated the single point of failure later exhibited on the Cambrian system. Network Rail and the Independent Safety Assessor (Lloyd’s Register Rail, now Ricardo Rail/Ricardo Certification) were required to review the design documentation but did not identify the lack of clear definition in design documents and were not aware of the changes made during the development of the other project.
So as the full report makes painfully clear to the most casual observer, the following occurs (to which I append the origin of the flow of information, office or field).
a) We have a pilot program of ERTMS operating for six years. The system transmitted movement authorities, and restrictions on movement authorities, including speed requirements, directly to trains. (office)
b) Certain safety-critical information, a temporary speed restriction (TSR) at a level crossing, failed to be carried over after an automatic restart of the Radio Block Centre (RBC) train control sub-system the evening prior to the event. (office)
c) The failure to carry over the information after the restart was invisible to the personnel at the Machynlleth signalling control center. (office)
d) While the controllers enter TSRs into a sub-system called GEST, and the GEST then retains the information and transmits it to the RBC, upon a rollover event, the GEST system is required to check its memory of TSRs with the information in the memory of the RBC sub-system. If no anomaly is detected, the controllers are then able to unlock the RBC and operations proceed normally. (office-office)
e) "The Cambrian ERTMS system was vulnerable to failure because the causal factors identified in this report were present in the software development process." (Item 50 of report). (office)
Causal factor 1: the GEST sub-system was in fault status, "probably due to a corrupt database." (office)
Causal factor 2: while in the fault status, the GEST sub-system displayed the TSR at the level crossing from its own memory, not that of the RBC. Information was not received from the RBC. (office-office)
Causal factor 3: (Item 62) "The memory used for storing temporary speed restrictions in the RBC was volatile, allowing temporary speed restriction data to be lost during a rollover." Non-volatile memory was used to store the geography, the "physical characteristics" of the railroad. Temporary data, including TSRs, were stored in volatile memory, which does not retain data during a reset, rollover, or power failure. The data has to be reloaded from the GEST sub-system, which, of course, was in failure status without displaying an error message. (office)
f) The system designer, Ansaldo STS (now part of Hitachi STS), knew that an independent check of data upload and restoration after a rollover was needed, but thought that this could be satisfied by the controllers checking the GEST display (office), not a display tied directly to the RBC output, such as a "twin" of the display in the operating cab of a train. (back office-office, without return input from field)
g) "The system safety justification was presented in a non-standard format based on documentation from another project still in development at the time of the Cambrian ERTMS commissioning and which, before completion, made changes that unintentionally mitigated the single point of failure later exhibited on the Cambrian system. Network Rail and the Independent Safety Assessor ...were required to review the design documentation but did not identify the lack of clear definition in design documents and were not aware of the changes made during the development of the other project." (back office to back office)
h) short version: omissum Deum machinas
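The sequence in items b) through f) can be mimicked in a few lines of Python. This is a toy model built only from the description above, not the Ansaldo software: the class names, method names, and the "level crossing" key are my assumptions, and the real GEST and RBC are far more involved.

```python
class RBC:
    """Toy Radio Block Centre: geography survives a rollover, TSRs do not."""

    def __init__(self, geography):
        self.geography = dict(geography)  # stands in for non-volatile memory
        self.tsrs = {}                    # stands in for volatile memory

    def load_tsrs(self, tsrs):
        self.tsrs = dict(tsrs)

    def rollover(self):
        self.tsrs = {}  # volatile contents are lost, geography is kept


class GEST:
    """Toy TSR terminal that can fail without raising any error."""

    def __init__(self):
        self.tsrs = {}
        self.faulted = False

    def enter_tsr(self, location, kmh):
        self.tsrs[location] = kmh

    def upload_to(self, rbc):
        if self.faulted:
            return  # silent failure: nothing is sent and nothing is reported
        rbc.load_tsrs(self.tsrs)

    def display_from_own_memory(self):
        return dict(self.tsrs)   # what the signallers saw

    def display_from_rbc(self, rbc):
        return dict(rbc.tsrs)    # what the trains would actually be sent


rbc, gest = RBC({"line": "Cambrian Coast"}), GEST()
gest.enter_tsr("level crossing", 30)
gest.upload_to(rbc)      # normal operation: the RBC holds the 30 km/h TSR

gest.faulted = True      # GEST enters fault status
rbc.rollover()           # the restart clears the volatile TSR store
gest.upload_to(rbc)      # the reload silently does nothing

print(gest.display_from_own_memory())  # {'level crossing': 30}
print(gest.display_from_rbc(rbc))      # {}
```

The two displays disagree, and only the second one tells the truth about what the trains will receive. A check that reads the same memory it is supposed to verify is not an independent check.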
This worries me as the loops are closed, all right, but more than closed, they are impenetrable. Information circulates from office to field, and unlike the back in the day days, the means of communication are inseparable from the vitality of the system. Precisely because there is no superiority of information from the field, the inter-communication of office sub-systems to validate movement authorities makes that communication process a vital system.
Clearly, this problem, and these types of problems, can be overcome by setting up a mirror or twin of the locomotive engineer's display in the control center and using that information for a checksum routine. To do that, you cannot be awed by the lines of code supposedly guaranteeing safe operations; and you have to recognize that this machine language has altered our operating terrain to where the means of communication are part of the vital process of safe train operations.
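One way such a checksum routine might look, sketched in Python under my own assumptions: the restrictions are represented as a simple mapping of location to speed, the twin display reports back what it actually received, and the function names `tsr_digest` and `safe_to_unlock` are hypothetical. Nothing here is drawn from the ERTMS specifications.

```python
import hashlib
import json


def tsr_digest(tsrs: dict) -> str:
    """Checksum of a set of restrictions, independent of entry order."""
    canonical = json.dumps(tsrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def safe_to_unlock(entered: dict, twin_received: dict) -> bool:
    """Allow operations to resume only if the twin of the cab display
    received exactly what the controllers entered."""
    return tsr_digest(entered) == tsr_digest(twin_received)


entered = {"level crossing": 30}
print(safe_to_unlock(entered, {"level crossing": 30}))  # True
print(safe_to_unlock(entered, {}))                      # False
```

The point of the design is where the second argument comes from. It has to be read from the output side, the same data a train would get, and not from the memory of the sub-system that did the sending.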
Me worry? Mos' def. Not about code. But about the coders, who have made safe train operations once again office-dependent.
Have a brave new year.
David Schanoes
December 28, 2019
I have had people walk out on me before, but not... when I was being so charming.--Deckard, Blade Runner
Copyright 2012 Ten90 Solutions LLC. All rights reserved.