Citizen by Day, Scientist by Night?
The scientific discovery process is being disrupted by an open online community, and they want your help.
Is this a fad, or the future?
This community is called Zooniverse.com, a nonprofit started in 2007 by the University of Oxford, Chicago’s Adler Planetarium, the University of Minnesota, and the University of Portsmouth with a mission to “produce projects that use the efforts and ability of volunteers to help scientists and researchers deal with the flood of data that confronts them.” The first Zooniverse project was “Galaxy Zoo,” which recruited volunteers to classify the shapes of over a million galaxies imaged by the Sloan Digital Sky Survey [1]. Within 24 hours of launch, Galaxy Zoo was receiving roughly 70,000 classifications per hour, and it collected more than 50 million in its first year [2]. After this success, 32 more projects were added, including other astronomy projects such as Planet Hunters; projects to classify bugs, Serengeti animals, and ocean-floor fish; and projects for transcribing historical documents such as the “Anti-Slavery Manuscripts.” There are now more than 1.1 million volunteers globally [2]. Zooniverse is now run by the Citizen Science Alliance and is funded by federal agencies and private foundations, including the NSF, NASA, IMLS, NOAA, the Alfred P. Sloan Foundation, a Google Global Impact Award, Microsoft, STFC, the European Union, and the Leverhulme Trust.
Zooniverse.com lists 188 scientific publications from citizen science projects, and many papers list the names of top contributors as authors. Some have made significant impacts on the scientific community, notably the discovery of Tabby’s Star (KIC 8462852), whose pattern of brightness dips suggested it was surrounded by an unusually large orbiting object [3]. Joe Cox et al. investigated the extent of Zooniverse’s resource savings and found that “Most Zooniverse projects are broadly similar [..] cost savings, with an average across projects of approximately 34 full-time working years saved.” [2]
Crowdsourcing scientific research participation is not without challenges, and the main challenges one faces when creating such a platform include:
- Training users quickly and making it easy to contribute
- Retaining active users
- Attracting users, researchers, and funding
- Facilitating effective project management
Zooniverse is addressing these challenges but still has room for improvement. There is now a smartphone app, which works well for image classification but could be improved for audio classification, as one study found that a “discrepancy in performance among projects might be related to the nature of the subjects that volunteers are asked to classify ([..] both Whale FM and Bat Detective involve the use of audio clips). Those citizen science projects that involve visual tasks might be more likely to succeed compared with those that use other sensory inputs.” [2]
To improve active user retention, it is important to identify potential highly active users and to recognize what motivates them. Corey Jackson et al. found that a small number of newcomers become highly active users and are often especially engaged in discussion boards, debating classifications and discussing research even early on, remarking that “top performers in peer production communities are born, not made.” [4] Interviews with Galaxy Zoo users have yielded motivations that can be classified into 12 categories, ranging from contributing to science to amazement at the vastness of the universe [5].
These motivations are reinforced through the Zooniverse blog, which highlights volunteers’ discoveries and contributions. Other forms of promotion include the BBC show The Sky at Night, presented by Zooniverse co-founder Dr. Chris Lintott.
New tools have made it easier to create new Zooniverse projects, and new project guidelines teach teams about the commitment required: spending time with heavy users on the forums and building an effective contributor community.
Zooniverse should now focus on improving retention and impact. It can organize local meetups and coordinate mentorship between active users and new users who could become active. Gamification can improve retention by awarding badges for contribution milestones, persistence, community mentorship, and more. To improve long-run impact, Zooniverse should expand citizen involvement beyond data processing to the whole discovery pipeline: hypothesis generation, data collection, data processing, and data analysis. Zooniverse is already experimenting not only with using human classifications to train algorithms but also with using machine learning to prioritize difficult images for human analysis, and it could open the ML design to its users to experiment with as well [6].
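One way such a human-machine loop might work is to route the images a model is least sure about to volunteers. A minimal sketch of that idea, using prediction entropy as the uncertainty measure (the subject IDs, class probabilities, and function names here are illustrative, not Zooniverse’s actual pipeline):

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability distribution.
    Higher entropy means the model is less certain about the subject."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def prioritize_for_volunteers(subjects, k):
    """Return the k subject IDs whose model predictions are most uncertain,
    so human classification effort goes where it helps most.
    `subjects` maps a subject ID to its predicted class probabilities."""
    ranked = sorted(subjects, key=lambda s: entropy(subjects[s]), reverse=True)
    return ranked[:k]

# Hypothetical model outputs for three galaxy images:
subjects = {
    "img_a": [0.98, 0.01, 0.01],  # confident prediction: handle automatically
    "img_b": [0.40, 0.35, 0.25],  # very uncertain: send to volunteers first
    "img_c": [0.70, 0.20, 0.10],  # somewhat uncertain
}
print(prioritize_for_volunteers(subjects, 2))  # → ['img_b', 'img_c']
```

Confident subjects can then be labeled by the model alone, reserving scarce volunteer attention for the ambiguous cases, which is the division of labor described in [6].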
Do you agree with these recommendations? As Zooniverse expands, do you think paying users for their time will dampen the intrinsic drive of new users who are seeking a primarily altruistic outlet, or reduce the quality of data labeling?
[1] M. Banks, “Exploring the Zooniverse,” Physics World, Oct. 2013.
[2] J. Cox et al., “Defining and Measuring Success in Online Citizen Science: A Case Study of Zooniverse Projects,” Computing in Science & Engineering 17, 28 (2015).
[3] T. S. Boyajian et al., “Planet Hunters IX. KIC 8462852 – where’s the flux?,” Monthly Notices of the Royal Astronomical Society 457(4), 3988–4004 (2016).
[4] C. B. Jackson et al., “Which Way Did They Go? Newcomer Movement through the Zooniverse,” CSCW ’16, Feb. 27–Mar. 2, 2016.
[5] J. Reed et al., “An Exploratory Factor Analysis of Motivations for Participating in Zooniverse, a Collection of Virtual Citizen Science Projects,” 46th Hawaii International Conference on System Sciences (2013).
[6] L. Fortson et al., “Optimizing the Human-Machine Partnership with Zooniverse,” Collective Intelligence 2018.
I don’t think financial compensation will be worthwhile for Zooniverse’s current audience of engaged users; money does not appear among Galaxy Zoo participants’ stated motivations. Moving forward, I think Zooniverse should consider dividing the work into two streams. The first stream would contain the more engaging tasks likely to produce the discoveries, pleasing images, and amazement that attract the current user base. In the second stream, a separate team could work for pay on the most mundane tasks or classifications. This division of labor would let Zooniverse maintain ongoing engagement with people interested in science while also increasing its capacity to analyze data.
Even if it implements a system like this, I agree that Zooniverse will need to do more to engage its audience in tasks beyond data processing. In the long term it will need to offer its audience more meaningful research and projects to drive retention and collaboration on its platform.
I loved learning about this innovation. It seems like Zooniverse is making it possible to conduct research and studies that were previously impossible, or at least highly infeasible. To answer your question on paying members: I do not think this is necessary. Zooniverse has clearly generated significant participation already without paying members. I think a big reason is that it is not directly selling or making a profit on the outcomes. Things would be different if Zooniverse were making a profit on this work, but since it is not, I do not expect users to demand or expect payment.
My bigger concern is actually not the cost but the quality. How do you ensure that your army of volunteers is actually identifying, labeling, and analyzing things correctly? For the same reason that you cannot accept Wikipedia as a reliable scholarly source, shouldn’t it be true that you can’t accept Zooniverse either?
Gavin,
I found your choice of topic fascinating, and moreover a highly notable and important example of open innovation.
In terms of how Zooniverse should proceed, I do not think contributors should be remunerated, as payment would crowd out nobler altruistic sentiments, as others have already commented. The ideals of Zooniverse, according to its website, are to “enable everyone to take part in real cutting edge research in many fields across the sciences, humanities, and more.” To me this sounds like a democratic, open-source effort where participants gain from the collective wisdom of others and contribute their individual knowledge as the primary form of “payment.”
On the user retention issue, I believe we need to look deeply at what makes researchers successful in their respective fields. Often it is publication in peer-reviewed journals. This means that, in theory, time spent contributing to Zooniverse could instead have been spent writing a paper for a journal, which under the status quo seems more conducive to one’s trajectory as a researcher. With this in mind, if Zooniverse wants to attract and retain researchers of higher caliber, it needs to think about how to become a more respected organ of scientific authority, closing the gap with the level of rigor associated with traditional peer-reviewed journal outlets.
Very interesting concept.
One thought I had reading this: right now, everyone participating in this project is well aware that they are contributing to scientific research, meaning the project is only able to access a relatively small subset of the overall population (after all, how many people are willing to spend countless hours of their leisure time doing something simply because of their “amazement about the vastness of the universe”?). I think the folks behind the Zooniverse community should think about trying to reach a broader user base by capturing users who are completely unaware they are contributing to scientific research. Sound crazy? Similar things have been done before: the data from reCAPTCHAs (those annoying human vs. bot tests required to access many web pages) are actually used to digitize books [1], all without the vast majority of people who use them having any idea of this dual-purpose. Although it would be tricky to design, I think a similar model could be applied here.
[1] https://techcrunch.com/2007/09/16/recaptcha-using-captchas-to-digitize-books/
Interesting article, and something that I was completely unaware of before reading. Also, really interesting point about reCAPTCHA @Matt B. I also support the idea of broadening the base of contributors, and I would argue against paying top contributors or building in an incentive structure. I think what makes this idea so appealing is the nature of altruism and pure curiosity that drives comments today. While I wouldn’t necessarily use the reCAPTCHA model here, I like the premise, and I think that added awareness about Zooniverse can only help improve the quality of posts, and the opportunity for readers to learn from this platform.
I would liken this to Wikipedia, which is notoriously open and profit neutral. That platform relies on natural interest, and the desire to help teach or inform others. Because of this, Wikipedia receives posts from people with individual areas of expertise, and is not dominated by a few voices of “power users”. Zooniverse is very similar to me, and I would encourage the continued use of an unpaid and zero-incentive environment.
Gavin,
Great write-up. One exciting opportunity I see with this mode of crowd-sourced open innovation is its implications for annotating data quickly and at scale.
The closest paid analog to Zooniverse that comes to mind is Amazon Mechanical Turk, which pays users to annotate data ranging from dogs and cats to surgical instruments in laparoscopic video images. In both the free (Zooniverse) and paid (Turk) scenarios, the quality of labeled data is extremely important, especially for scientists who then need to use it to train machine learning models, publish groundbreaking research, etc. One might readily assume that paid users would be intrinsically more motivated to produce better-quality data, since they may be dropped from the Turk network if they annotate poorly. However, I would argue the opposite, and one can look at Wikipedia as a great example of this.
While Ben Newton in the previous post talks about the self-selecting nature of contributors who are intrinsically interested, I would extend this paradigm further by arguing that they also provide a crowd-sourced mode of quality control [1]. What’s really great about this is that Wikipedia does not have to rely on a few people manually scrubbing posts to ensure quality; it can rely on its contributors instead. In a similar vein, I would hope that if and when Zooniverse builds a reliable set of its own annotators, they can also self-select out bad contributors and permanently ban them from the network, similar to Wikipedia. Now, let’s go back to the Amazon example: because its system is closed, Amazon has to manually screen for bad annotators (based on user feedback or random audits) and remove them from the Turk network on a case-by-case basis. This is time consuming and costly!
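A common way crowd platforms implement this kind of quality control without manual audits is to show each subject to several volunteers and accept a label only when enough of them agree. A minimal sketch of that idea (the vote and agreement thresholds here are illustrative, not Zooniverse’s actual rules):

```python
from collections import Counter

def consensus(labels, min_votes=3, min_agreement=0.6):
    """Aggregate redundant volunteer labels for one subject.
    Returns (label, agreement) when consensus is reached; otherwise
    returns None so the subject can be re-queued or sent to an expert."""
    if len(labels) < min_votes:
        return None  # not enough votes yet to decide
    top_label, count = Counter(labels).most_common(1)[0]
    agreement = count / len(labels)
    return (top_label, agreement) if agreement >= min_agreement else None

print(consensus(["spiral", "spiral", "elliptical", "spiral"]))  # → ('spiral', 0.75)
print(consensus(["spiral", "elliptical", "merger"]))            # → None (no consensus)
```

Tracking how often each volunteer agrees with the eventual consensus also gives the platform a per-annotator reliability score, which is one way bad contributors could be identified and removed automatically rather than case by case.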
[1] https://en.wikipedia.org/wiki/Wikipedia:Quality_control