How IBM helps monitor biodiversity in the Amazon

GreenBiz | February 18, 2014

A screenshot from IBM Brazil’s Wikiflora app.

In the future, tracking the growth and development of a certain stand of açaí palm trees in Brazil’s Amazon rainforest might just as likely be done by a student as a government biologist.

That’s because a new data portal and mobile app built by IBM in its São Paulo research lab — one of 12 R&D hubs run by the company experimenting with new approaches to sustainability — will be used by a cadre of citizen scientists to help the Brazilian government monitor and track the biodiversity of the Amazon rainforest. It’s part of a group of projects developed by the company to give users the tools to help manage and monitor their environment via crowdsourced data.

In 2010, according to Sergio Borger, a team lead in the São Paulo lab’s human systems division, Brazil’s Ministry for Environment and Innovation presented his group with a 500-page catalog of all the species recorded in an area of rainforest close to the city of Manaus.

“Though this area was well explored, it comes up as just a big square on Google Maps,” Borger said. “That paper catalog of its biodiversity was limited to the scientists who created it.”

The government challenged IBM to think of ways it could bring the experience of the rainforest alive to younger generations — and simultaneously develop a central repository for all of its data.

But instead of conjuring up the hot and humid climate of the Amazon rainforest as inspiration, Borger was moved by a winter memory from 15 years ago: the time he and his children counted birds for the Audubon Society during its Christmas Bird Count. The 114-year-old count is the oldest known citizen science project in the world, and its data helps scientists monitor biodiversity and gain greater insight from population changes over time.

“I was very passionate about that,” he said. “So I felt crowdsourcing was the way to go, as one of the elements of monitoring our environment is asking our citizens about it.”

As the government’s first priority was to get the project into the schools, Borger and his team first prototyped a website-based platform they dubbed Wikiflora. It took them a year and a half and was completed in October 2011.

Wikiflora enabled students to upload their photos of a plant species, enter specific characteristics and classify it after comparing it against an existing catalog photo. Each photo contained data to describe precisely where it was taken.

A key part of the platform was its identification gaming function. Students could review their peers’ entries and rate how well they thought that student classified photos. A user’s rating would determine the weight of his or her assessments.
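The article does not say how Wikiflora turned peer ratings into weights, so the following is only a minimal sketch of one way such a scheme could work. The function names, the 1-to-5 rating scale, the default weight and the user names are all assumptions made for illustration, not details of IBM's implementation.

```python
from collections import defaultdict

DEFAULT_WEIGHT = 0.5  # assumed weight for users nobody has rated yet


def weight_from_ratings(ratings, scale=5):
    """Map a user's peer ratings (1..scale) to a weight between 0 and 1."""
    if not ratings:
        return DEFAULT_WEIGHT
    return sum(ratings) / (len(ratings) * scale)


def weighted_classification(votes, reviewer_weight):
    """Return the label with the largest weighted support, plus all totals.

    votes: list of (reviewer, species_label) pairs for one photo.
    reviewer_weight: dict mapping reviewer -> weight.
    """
    totals = defaultdict(float)
    for reviewer, label in votes:
        totals[label] += reviewer_weight.get(reviewer, DEFAULT_WEIGHT)
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


# Hypothetical students and ratings, for illustration only.
weights = {
    "ana": weight_from_ratings([5, 5, 4]),
    "bruno": weight_from_ratings([2, 1]),
    "carla": weight_from_ratings([1, 2, 2]),
}
votes = [
    ("ana", "Euterpe oleracea"),
    ("bruno", "Euterpe precatoria"),
    ("carla", "Euterpe precatoria"),
]
winner, totals = weighted_classification(votes, weights)
print(winner, totals)
```

In this toy run, one highly rated student outweighs two poorly rated ones, which is the behavior the weighting is meant to produce.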

Over the next year, the team developed a mobile app, Wikiflora 2.0, which carried over the feature that allowed students to upload photos and rate them with their smartphones. It also gave them the capacity to track and map individual plants and trees, as well as components such as the leaves and trunk, via more photo uploads. To identify the particular plant or tree, students were asked to choose from a long list of similar-looking plants or trees that the app would present for selection.

But because the students were too impatient to scroll through the options, Borger said, his team set out a year ago to revamp Wikiflora 2.0 so the public could collect data and monitor it over time using real-time processing and data aggregation.

“We’re on our third wave of learning now,” he said.

The platform’s third iteration, Missions, uses the company’s IBM InfoSphere Streams product to process collected data coming in (from many sources at any particular moment) before it gets stored in a DB2 database. Borger estimates that it will be released later this year for both Android and iOS smartphones, but in a research capacity only.

Multiple users will be able to add to the data file of a specific plant or tree by using unique identifying characteristics, such as the diameter of a tree trunk.

“That can be done with some level of certainty,” Borger said. “At this point, the system makes an assumption [on which plant or tree the data belongs to], but we’re working to refine it even further.”
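How Missions decides which plant or tree a new data point belongs to is not described beyond the use of identifying characteristics such as trunk diameter. The sketch below shows one plausible approach under stated assumptions: each record carries a GPS position and a trunk diameter, and a new observation is attached to the nearest record within fixed tolerances. The `PlantRecord` type, the thresholds and the scoring rule are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PlantRecord:
    plant_id: str
    lat: float
    lon: float
    trunk_diameter_cm: float
    observations: list = field(default_factory=list)


def distance_m(lat1, lon1, lat2, lon2):
    """Approximate ground distance in metres (adequate over short ranges)."""
    earth_radius_m = 6_371_000
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return earth_radius_m * math.hypot(x, y)


def match_record(records, lat, lon, diameter_cm,
                 max_dist_m=15.0, max_diam_diff_cm=5.0):
    """Return the best-matching record, or None if none is within tolerance."""
    best, best_score = None, float("inf")
    for rec in records:
        dist = distance_m(lat, lon, rec.lat, rec.lon)
        diff = abs(diameter_cm - rec.trunk_diameter_cm)
        if dist > max_dist_m or diff > max_diam_diff_cm:
            continue  # too far away or too different to be the same tree
        score = dist / max_dist_m + diff / max_diam_diff_cm
        if score < best_score:
            best, best_score = rec, score
    return best


# Made-up records near Manaus, for illustration only.
records = [
    PlantRecord("tree-1", -3.10000, -60.00000, 30.0),
    PlantRecord("tree-2", -3.10005, -60.00005, 42.0),
]
match = match_record(records, -3.10001, -60.00001, 41.0)
if match is not None:
    match.observations.append({"diameter_cm": 41.0})
print(match.plant_id if match else "no match: create a new record")
```

Returning `None` when nothing is close enough mirrors Borger's point that the match is made only "with some level of certainty"; a real system would need to handle GPS error under forest canopy and trees that grow between visits.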

His São Paulo team is developing tools to advance the capability of real-time processing and aggregation of images so that when a user takes a photo of a certain element of a plant or tree, for example, Missions will help the user determine its species and give the user five to six options to choose from. After the user makes the final identification, the data is sent to the database.

Tracking mobile species, such as frogs and insects living close to water, tacks on another dimension of complexity altogether. They will be included in the future, Borger says, as his team is determining how to handle this type of monitoring.

IBM is particularly interested in knowing if insects living close to water are present because they can serve as bioindicators.

“If they’re not there, the water may not have enough oxygen for them to live,” Borger said.

Getting the public to crowdsource environmental indicators through technology is not new territory for IBM.

In 2009, an engineer in its San Jose, Calif., Almaden Research Lab developed CreekWatch, an app where users help the state to monitor drought conditions by uploading photos and weighing in on water levels, flow rate and the amount of trash at each location. Borger applied what he learned from this project to the development of the Wikiflora and Missions platforms.

And IBM’s Accessible Way app allows the greater public to map accessibility barriers in urban areas, so that mobility-challenged individuals can select suitable routes well in advance of their trip.

“We look at sustainability,” Borger said, “as a way to make our environment smarter.”

Screenshot of Wikiflora web app courtesy IBM Brazil

IBM, SAP open big data platforms for citizen science

The Guardian US/UK | January 27, 2014

Ectatomma tuberculatum, an ant species that lives in the Amazon Rainforest

Sujeevan Ratnasingham is in a race to identify all living species on earth. With the tally anywhere between 10 million and 100 million, and one-third estimated to become extinct by the next century, it’s a Herculean task to say the least.

But undiscovered species are just as likely to be found in one’s backyard as the Amazon rainforest. So it’s no surprise that in this age of crowdsourcing and citizen science, the bioinformatics expert and his colleagues at the International Barcode of Life (iBOL), a consortium of universities, natural history museums and research institutes, are asking people around the world to gather samples. Then, back in their labs, scientists can identify the species by sequencing a section of its DNA (a procedure known as barcoding).
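As a rough illustration of the barcoding idea, the sketch below compares a sample sequence against a small reference library and reports the closest entry if it is similar enough. Everything here is an assumption for demonstration: the sequences are toy strings rather than real barcodes, the 97 percent threshold is arbitrary, and a position-by-position comparison stands in for the sequence alignment that real pipelines rely on.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def identify(sample, reference, min_identity=0.97):
    """Return (species, score) for the best match, or (None, score) if too weak."""
    best_species, best_score = None, 0.0
    for species, barcode in reference.items():
        score = identity(sample, barcode)
        if score > best_score:
            best_species, best_score = species, score
    if best_score < min_identity:
        return None, best_score
    return best_species, best_score


# Toy reference library; labels and sequences are invented placeholders.
reference = {
    "species A": "ACGT" * 10,
    "species B": "TTGACCGATA" * 4,
}
sample = "ACGT" * 9 + "ACGA"  # differs from species A at one position
print(identify(sample, reference))
print(identify("GGGG" * 10, reference))  # nothing in the library is close
```

A sample that matches no reference entry closely is the interesting case for a project like iBOL, since it may point to a species not yet in the database.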

With hundreds of millions of records to analyze, and even more data per record poised to come in over the next year, iBOL decided to host its database on HANA, SAP’s enterprise platform that makes data available in a computer’s memory. The switch will allow researchers and citizen scientists to quickly analyze the huge volumes of data in the cloud.

By merging their records with other datasets such as weather, researchers can conduct predictive analyses that can reveal patterns between species and location. The results can provide clues into how outside forces – from invasive species to climate change – are affecting the environment, and suggest how to manage wild land and agricultural land more sustainably.

IBM, too, has been working on a platform to support crowdsourced citizen scientist data. Its research lab in São Paulo, Brazil, developed a portal and mobile app as a way to gain more knowledge about biodiversity in the Amazon rainforest. Users of all ages and educational backgrounds will be able to collect data points and identify species.

Sergio Borger, an IBM team lead in São Paulo, devised the crowdsourced approach when Brazil’s Ministry for Environment and Innovation approached the company in 2010. They were looking for a way to create a central repository for the rainforest data.

Borger and his team developed a platform and mobile app that allowed users to upload photos of a plant species and its components, enter its characteristics (such as color and size), compare it against a catalog photo and classify it. The classification results are juried by crowdsourced ratings.

Titled Missions, the platform will enable multiple users to collect data and monitor conditions on the same plant or tree over time through uniquely identifying characteristics such as the diameter of a tree trunk. Borger’s team is currently working through how to handle monitoring of more mobile species, such as frogs and insects.

Borger used the knowledge gained in IBM’s first experiment with gathering crowdsourced data as leverage for the new platform. Developed in partnership with California’s state water agency, the CreekWatch app enables citizens to help the government monitor drought conditions in local watersheds by uploading photos and evaluating water levels, flow rate and amount of trash present.

The company has also developed Accessible Way, an app allowing citizens to report accessibility problems in the urban environment.

The beginning of a new trend?

Could the forays by IBM and SAP signal a larger trend of IT companies opening up their platforms to crowdsourced citizen science projects?

Since the excitement and interest in big data dawned a few years ago, startup Kaggle has helped companies, organizations and researchers gain insight from their data by holding crowdsourced predictive analysis contests, while Crowdflower, another startup, has provided the service of generating the “crowd” itself. Although both Microsoft and Google have engaged in data-related conservation projects, large IT companies have mostly shied away from crowdsourced citizen science, for what could be a seemingly obvious reason, one scientist says.

“Citizen science is not the most lucrative [venture],” said Dawn Wright, academic oceanographer and chief scientist for Esri, the Redlands, California-based company behind the GIS (geographic information system) mapping platform ArcGIS.

Yet despite this disincentive, Wright says she’s seen increased interest in citizen science from industry over the past two years, following its rise in the academic community over the past five.

Benefits to business

But while devoting resources to citizen science projects can be viewed as part of a company’s goodwill efforts, are there other business benefits to be gained?

After all, independent efforts by local groups are already underway. In the San Francisco Bay Area, Nerds for Nature has organized several ‘bioblitz’ events where volunteers document biodiversity using the iNaturalist smartphone app. They’re also working with a small biotech company and hackerspaces to perform DNA barcoding independently.

SAP says it’s not trying to sell HANA to those wanting to analyze iBOL’s biodiversity database. These users will be given access to the data platform at no cost.

“This is not an effort to sell our products,” said SAP’s David Jonker, head of big data marketing. “We’re passionate about using our technology for good in this world and applying it to citizen science.”

Still, Mike Gualtieri, an IT industry analyst for Forrester Research, says that there are reasons why large IT companies might be interested in making their products available for free to a non-enterprise audience.

Gualtieri says that the rise of Hadoop – an open source system enabling storage, processing and quick analysis of big data – has disrupted these companies’ core products such as databases, data analytics and data warehousing.

Although Hadoop will not necessarily replace the larger vendors’ technology, Gualtieri says they will have to work with Hadoop.

“They see a threat, so they figure they better get it out there and let people use it,” he said. “By making them available, they’re building awareness among the average user.”

As a result, Gualtieri expects to see more of these large IT companies use their platforms for more crowdsourced citizen scientist data analysis projects in the future.

Commercial applications

In five to 10 years, SAP says the public will have the ability to identify species on the spot, thanks to a DNA barcoding mobile app it’s working on with the International Barcode of Life. While the technology is being developed in part for the citizen science biodiversity project, Jonker says that the technology can be used in a commercial context.

There does appear to be a demand for it, if recent food mislabeling scandals, from horsemeat masquerading as beef to fox meat sold as donkey meat to mislabeled fish, are any indication. Shopkeepers would be able to verify products by identifying a sample on the spot via DNA barcoding.

SAP is already in talks with a few partners to commercialize the product. In the meantime, the company will release an app enabling anyone to contribute samples to the International Barcode of Life project by uploading a photo (with location metadata) and mailing in a sample for analysis. The app is scheduled for launch in late March.

Photo of Ectatomma tuberculatum, an ant species that lives in the Amazon Rainforest, by Alex Wild via Wikimedia Commons