NASA Safety Panel: Second Starliner OFT Software Error Could Have Been “Catastrophic”

NASA’s Aerospace Safety Advisory Panel (ASAP) revealed today that a second software error was discovered during the uncrewed Boeing Starliner flight test in December.  Had it gone undetected during the flight, it had the potential to cause “catastrophic spacecraft failure” during reentry.  The panel wants a complete review of Boeing’s software verification processes before NASA decides whether a second uncrewed flight test is needed.  In an email this evening, Boeing said it appreciates the input and is working on a plan with NASA to address all the issues and decide what comes next. [UPDATE: NASA announced late today that it and Boeing will hold a media teleconference about this tomorrow, Friday.]

Meeting at Kennedy Space Center for its first quarterly meeting of 2020, ASAP members overall were optimistic that commercial crew flights will begin in the next 12 months, but cited work that remains to be done by both contractors, Boeing and SpaceX.  Their assessment clearly is that SpaceX is closer to that milestone than Boeing.

ASAP member Paul Hill, a former Director of Mission Operations at NASA’s Johnson Space Center, began his report on the status of Boeing’s Starliner program on a positive note, saying Boeing made significant progress with December’s uncrewed Orbital Flight Test (OFT) even though an in-flight software anomaly meant it could not rendezvous and dock with the International Space Station (ISS) as planned.

But then he brought up a second software anomaly that was detected by ground crews during the mission. He did not provide details, but said if it had not been corrected in-flight, it could have meant a catastrophic ending when the Service Module (SM) separated from the crew capsule during the return to Earth.

“However, during the mission, a second Starliner flight software anomaly was detected during ground testing.  While this anomaly was corrected in flight, if it had gone uncorrected, it would have led to erroneous thruster firings and uncontrolled motion during SM separation [during] deorbit with potential for catastrophic spacecraft failure.”  — Paul Hill

Although the spacecraft landed safely, the two software failures raise questions about Boeing’s software testing and verification processes that ASAP wants addressed before NASA decides whether to require Boeing to repeat the test or move on to the next milestone, a Crewed Flight Test.

NASA and Boeing already have established an Independent Review Team (IRT) to determine the root cause of the first software failure that is associated with Starliner’s Mission Elapsed Timer (MET), but ASAP wants them to go much further.

First it wants Boeing’s assessments and corrective actions for its flight software integration and testing processes as well as its Systems Engineering and Integration (SE&I) processes and verification testing more broadly.

Then it recommends that those three assessments — root cause of the MET anomaly, flight software processes, and SE&I processes — be “required input to a formal NASA review to determine flight readiness for either another uncrewed flight test or in proceeding directly to a crewed test flight.”

In an emailed statement this evening, Boeing said it accepts and appreciates the recommendations of the IRT as well as suggestions from ASAP: “They are invaluable to the Commercial Crew Program and we will work with NASA to comprehensively apply their recommendations.”

Boeing agreed that the second software anomaly, a “valve mapping software issue,” would have caused “an incorrect thruster separation and disposal burn” but did not go so far as saying the result would have been catastrophic.  Rather, “[w]hat would have resulted from that is unclear.”

Regarding the Mission Elapsed Timer anomaly, the IRT believes they found root cause and provided a number of recommendations and corrective actions.

The IRT also investigated a valve mapping software issue, which was diagnosed and fixed in flight. That error in the software would have resulted in an incorrect thruster separation and disposal burn. What would have resulted from that is unclear.

The IRT is also making significant progress on understanding the command dropouts encountered during the mission and is further investigating methods to make the Starliner communications system more robust on future missions.

We are already working on many of the recommended fixes including re-verifying flight software code.

Our next task is to build a plan that incorporates IRT recommendations, NASA’s Organizational Safety Assessment (OSA) and any other oversight NASA chooses after considering IRT findings. Once NASA approves that plan, we will be able to better estimate timelines for the completion of all tasks. It remains too soon to speculate about next flight dates. — Boeing statement

Dr. Patricia Sanders, Chair, Aerospace Safety Advisory Panel, testifying at House hearing “Keeping Our Sights on Mars,” May 8, 2019. Photo Credit: (NASA/Bill Ingalls)

ASAP Chair Patricia Sanders used this Boeing example to make a bigger point about the need to apply systems engineering principles to designing software across NASA’s missions.

“We are no longer building hardware into which we install a modicum of enabling software, we are actually building software systems which we wrap up in enabling hardware. Yet we have not matured to where we are uniformly applying rigorous systems engineering principles to the design of that software.  These are serious and pervasive issues that NASA will need to address in all of its programs and certainly will be critical to space exploration endeavors.” — Patricia Sanders

Sanders is a former Director of Test, Systems Engineering, and Evaluation in the Office of the Secretary of Defense and former Executive Director of the Missile Defense Agency.

The panel addressed many other aspects of NASA’s missions, including the Artemis program to return astronauts to the lunar surface by 2024.

Former astronaut Lt. Gen. (Ret.) Susan Helms and Don McErlean, currently a senior engineering consultant specializing in airworthiness, certification and airframe engineering and safety, both used the word “breathtaking” to describe NASA’s acquisition of lunar Human Landing Systems.  Helms said she had not seen such “aggressive outreach” by NASA before and finds it “incredible to see this innovative thinking on NASA’s part” to engage with the commercial sector.  McErlean noted that there is almost no one in the workforce today with experience building these types of systems and it is “very much a breathtaking level of responsibility from an engineering perspective.”

While both used the term approvingly, they also pointed out challenges that lie ahead.  Helms said there is almost no guidance for what type of test and evaluation program is needed for the landers and alerted NASA that this will be a particular interest for ASAP.  McErlean emphasized that NASA must be “absolutely clear” on its hardware performance requirements and not just tell contractors what the hardware must do, but “how are you going to prove to me that your hardware does it.”

Other panel members focused on the critical role that the ISS plays in scientific research, including risk reduction for human missions to the Moon and Mars.  David West, Examinations Director at the Board of Certified Safety Professionals, said “budget uncertainty” is the “number one risk” facing ISS along with assured access.  Sanders stressed the need to replace the 40-year-old spacesuits used for extravehicular activity (EVA).  “We continue to be adamant that NASA put priority on replacement of these suits.”

George Nield, the former head of FAA’s Office of Commercial Space Transportation, pointed out that significant changes are on the horizon for NASA in terms of roles and responsibilities once commercial companies begin flying non-NASA passengers into space.  After they complete their test flights, launches and reentries of the SpaceX and Boeing commercial crew systems will be licensed by the FAA and may have nothing to do with NASA.  It will be important for NASA to convey to its stakeholders in Congress, the White House, the public, and the press what its role will be, if any, in case of a mishap.  He anticipates that the transition from NASA as the primary designer, developer and operator of human spaceflight systems to an “informed, supportive, but demanding” customer will not be easy.
