2018: A Space Odyssey – How NASA uses Machine Learning for Space Exploration

Neural network machine learning algorithms are revolutionizing the classifications of galaxies and giving us a deeper understanding of the origins and evolution of the universe.

In the most recent Age of Exploration, deep space exploration has become a pillar of the field of astronomy. The purpose of having a robust space program is simple: that we, as humans, should “undertake it for the most basic of reasons – our self-preservation as a creative, as opposed to stagnating, society [1].”

In recent years, the amount of data accumulated by spacecraft and satellites has overwhelmed the field. One way astronomers use this data is galaxy morphological classification, whose principal goal is to gain insight into galaxy formation and evolution. Galaxy classification used to require human intervention, as the data took the form of photographic plates that required visual inspection and analysis. But as more and more data come online (the Sloan Digital Sky Survey alone will produce more than 50 million images of galaxies in the coming years), it has become unrealistic to devote human resources to the time-consuming and expensive process that is galaxy/planet/star classification [2]. With the advent of machine learning, agencies like NASA now have the opportunity to automate image analysis through neural network and data mining algorithms [3]. Technological advances in data collection from space had not been matched by similar advances in data analysis (classification), creating a massive imbalance between the rate at which data was collected and the rate at which it was processed. Machine learning offers a way to correct this imbalance: the rate of analysis can scale with computing power rather than with the number of human classifiers.

Neural networks have been one of the most popular machine learning algorithms deployed for morphological classification; in fact, scientists have attempted to use neural networks to classify morphologies since the early 1990s, with limited success. But recently artificial neural network algorithms have become much more sophisticated, and NASA has seen a significant increase in their classification accuracy. In 2017, for example, NASA discovered an eighth planet circling Kepler-90, tying the Kepler-90 system with our own for the largest number of planets in a single solar system. The planet was found by feeding data from NASA’s Kepler Space Telescope into an artificial neural network trained to identify exoplanets. To create the neural network, researchers trained the algorithm “using 15,000 previously-vetted signals from the Kepler exoplanet catalogue.” Once the neural network achieved a certain level of accuracy (96%), they applied the algorithm to a previously unanalyzed set of 670 star systems. With the success of this “first” application, NASA now plans to apply the algorithm to the full set of 150,000 star systems [4].
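The train, validate, then apply workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not NASA's pipeline: the data are synthetic, the two features and every function name (`make_signal`, `make_dataset`, `train`, `accuracy`) are hypothetical, and a single logistic unit stands in for the much larger neural network used on the Kepler data. Only the 96% accuracy gate and the 670 unexamined systems are taken from the text; the training set is kept far smaller than 15,000 signals so the example runs quickly.

```python
import math
import random


def make_signal(is_planet, rng):
    """Return a synthetic two-feature signal (hypothetical stand-in for a light curve)."""
    centre = 1.0 if is_planet else -1.0
    return [rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)]


def make_dataset(n, rng):
    """Build n labelled (features, label) pairs; label 1.0 means 'planet'."""
    data = []
    for _ in range(n):
        is_planet = rng.random() < 0.5
        data.append((make_signal(is_planet, rng), 1.0 if is_planet else 0.0))
    return data


def predict(weights, bias, features):
    """Probability that a signal is a planet, from a single logistic unit."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=50, lr=0.1):
    """Fit the logistic unit with plain stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def accuracy(weights, bias, data):
    """Fraction of labelled signals classified correctly at a 0.5 threshold."""
    hits = sum((predict(weights, bias, f) >= 0.5) == (y == 1.0) for f, y in data)
    return hits / len(data)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Step 1: train on previously vetted (labelled) signals, holding some back.
labelled = make_dataset(1000, rng)
train_set, held_out = labelled[:800], labelled[800:]
weights, bias = train(train_set)

# Step 2: measure accuracy on the held-out labelled signals.
held_out_accuracy = accuracy(weights, bias, held_out)

# Step 3: only if the model clears the 96% gate, apply it to unexamined systems.
unlabelled = [make_signal(rng.random() < 0.5, rng) for _ in range(670)]
candidates = []
if held_out_accuracy >= 0.96:
    candidates = [s for s in unlabelled if predict(weights, bias, s) >= 0.5]

print(f"held-out accuracy: {held_out_accuracy:.3f}")
print(f"candidate signals flagged for follow-up: {len(candidates)}")
```

Gating step 3 on held-out accuracy is the design point: the model is trusted on unexamined data only after it has been checked against signals whose answers are already known, and anything it flags is a candidate for human follow-up rather than a confirmed planet.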

The success of neural networks in these instances has much broader implications for NASA: machine learning applied to astronomy has reached a level of accuracy and sophistication at which NASA can more comfortably deploy these algorithms to process the massive backlog of current and archived astronomical data. These algorithms can now detect some of the weakest signals of morphology, signals that would have been missed entirely by human classification. NASA is not just realizing time and cost savings by no longer needing humans for classification; NASA can now analyze data it knew could never have been analyzed by humans in the first place.

One of the biggest challenges of applying machine learning to astronomy is the risk of creating “black box” applications that give little insight and questionable results [5]. In the short term, NASA needs to invest resources in two things: 1) expanding the “known” datasets of galaxy catalogues so that neural networks have a larger variety of input data fed into them and 2) advancing a technique known as deep convolutional neural networks, which have been shown to be “as good as humans at face recognition”, to mitigate the “black box” risks of these algorithms [6]. In the longer term, NASA faces a major lack of funding, public apathy, and increased competition due to the commercialization of space. NASA must become more “network-oriented to develop and acquire the technologies it needs [7].” Relatedly, NASA must embrace open-source solutions and support crowdsourced ideas; it is through these initiatives that many of the existing galaxy catalogues were developed in the first place. As the galaxy datasets will need to be greatly expanded, and quickly, to feed into machine learning algorithms, NASA’s best option is to rely on amateur astronomers to build these datasets, so as not to impede the major progress being made with machine learning and morphological classification.


[1] https://www.nasa.gov/missions/solarsystem/Why_We_01pt1.html

[2] Blanton, M., et al. (2017). Sloan Digital Sky Survey IV: Mapping the Milky Way, Nearby Galaxies, and the Distant Universe. The Astronomical Journal, 154(1), 28. doi:10.3847/1538-3881/aa7567

[3] Jorge De La Calleja, Olac Fuentes; Machine learning and image analysis for morphological galaxy classification, Monthly Notices of the Royal Astronomical Society, Volume 349, Issue 1, 21 March 2004, Pages 87–93, https://doi.org/10.1111/j.1365-2966.2004.07442.x

[4] https://www.nasa.gov/ames/kepler/briefing-materials-eighth-planet-circling-distant-star-discovered-using-artificial-intelligence

[5] Manda Banerji, Ofer Lahav, Chris J. Lintott, Filipe B. Abdalla, Kevin Schawinski, Steven P. Bamford, Dan Andreescu, Phil Murray, M. Jordan Raddick, Anze Slosar, Alex Szalay, Daniel Thomas, Jan Vandenberg; Galaxy Zoo: reproducing galaxy morphologies via machine learning, Monthly Notices of the Royal Astronomical Society, Volume 406, Issue 1, 21 July 2010, Pages 342–353, https://doi.org/10.1111/j.1365-2966.2010.16713.x

[6] https://www.technologyreview.com/s/536411/how-machine-vision-is-reinventing-the-study-of-galaxies/

[7] https://hbr.org/2018/04/the-reinvention-of-nasa




Student comments on 2018: A Space Odyssey – How NASA uses Machine Learning for Space Exploration

  1. Now, that was an interesting read! It is unbelievable that NASA was so innovative as to use machine learning decades ago. I can relate to your recommendation. Open innovation and the use of open source will be key for NASA’s development, but I think that NASA should also be proactive and reach out to development communities, inviting them to ‘play’ with the data and come up with ideas.

  2. I agree, what a great read! I think it is important to bring in “amateur” astronomers, and other young scientists, to support building data sets. But I would add that, given the other challenges NASA is facing (lack of budget and increased competition), it can be hard to incentivize scientists to want to work at NASA in the future when they could do the same work at the competition for higher compensation. This leads to another problem NASA is facing, which is attrition of its workforce to the growing private space and astronomy sector.

  3. Terrific read. I agree with your points on the “black box” concerns raised about the classification of planets and solar systems. In particular, the concern about public apathy could be further amplified as people begin to understand that the conclusions NASA is coming to are not a product of human discovery but rather of machines taking data and machines processing that data. While I agree that this process is simply highly automated, the general public opinion is still that NASA is “confirming” the existence of planetary objects with the human eye, which is simply implausible. The data being collected is in fact very raw and messy, so the use of machines to sift through it is critical, as you lay out, in order to make any meaningful progress in feasible processing times.

    One concern I have with the use of ‘amateur’ scientists is the issue with sub-par inputs leading to lower accuracy ratings and more work over the long run for NASA scientists. In order to accurately process these data sets coming from satellites, a scientist has to be fairly advanced in their computational capabilities. Because of this, I would recommend specifically targeting academic research institutions to aid in processing this data as these are the scientists that will likely be closest to competency and perhaps one day competing for a job at NASA or a competitor.

  4. This was an amazing read, thank you! It’s a bit of a sad state of affairs that competition from private space companies and the lack of funding resulting from public apathy are contributing to the backlog of data. I often think of the innovations of companies like SpaceX as an unmitigated good since they improve our capabilities to explore space when no one else is doing that work. But the competition for talent and resources that these companies are creating may actually be impeding true scientific progress in the field of astronomy, and I think it’s that science that really serves as the spark for our imaginations and dreams about space exploration in the first place. On ML – thank you for the very clear presentation of the two horns of the dilemma. I agree with you that adopting an open innovation mindset and letting amateur astronomers help build the dataset is the better choice. Perhaps this might be one small way to rekindle our imaginations and our fascination with space and lead to some real progress.

  5. Wonderful read, I really enjoyed reading about the blend of machine learning and space, the final frontier! I agree that this is an opportunity to maximize technological advances and lean on their firepower while optimizing the human resources deployed to operate them. I also like your tie-in about open innovation and the value it could add in this context. I do think there are interdisciplinary fields and sets of expertise that could be utilized here to make sense of the data results. I also imagine it to be an opportunity for government agencies to work more closely together and share their learnings, given the wealth of data being collected. It seems that the general trend has been towards decreasing the number of silos, but NASA has always felt as though it’s been in a league of its own. Maybe its machine learning capabilities can help change some of that.

  6. NASA may consider crowdsourcing some of the analysis, like you mentioned in the final paragraph, in a similar manner to the SETI@home program out of UC Berkeley. Idle computer power from volunteers throughout the world can be used to run machine learning algorithms on data sets that NASA has neither the funding nor the computer power to analyze. Of course, this will still require humans to build appropriate data sets for the distributed system to prevent “garbage in, garbage out.”
