Rose Rosette Disease diagnosis through Deep Learning

Mikołaj Koziarkiewicz
SoftwareMill Tech Blog
7 min read · Jun 3, 2020


First, an important note. We need your help in spreading the news about our initiative to fight Rose Rosette Disease in the United States.

We have teamed up with pioneering researchers from the University of Arkansas — Fayetteville to provide an easy-to-use disease detection tool for any rosarian.

If you reside in the US and are involved in either the rose cultivation industry or the relevant research, please contact us at roses@softwaremill.com to find out how you can easily express support for the cause.

If you know such a person, please consider forwarding this article to them.

Any of the aforementioned actions will help us secure funding through a USDA FACT grant. Thank you!

This article has been jointly written by Dr. Ioannis Tzanetakis and Dr. Tobiasz Druciarek from the University of Arkansas, and Mikołaj Koziarkiewicz from SoftwareMill.

What is this about?

There are few plant diseases more devastating than Rose Rosette Disease (RRD). We are currently witnessing an epidemic spreading throughout the US, with countless wild (multiflora) and landscape roses being infected and serving as vectors for further spread.

Disease symptoms include: leaf reddening, mosaic, and mottling. Shoots have an overabundance of thorns and tend to bunch together, forming witches’ brooms with deformed flowers.

A popular commercial rose, KnockOut ‘Radrazz’ uninfected (left) and infected (right) with rose rosette.

Infected roses lose their aesthetic appeal and usually die within 3–5 years of infection. The root cause of the disease is rose rosette emaravirus (RRV), spread by microscopic mites so small that they can be carried by wind currents. Every day, asymptomatic infected roses move undetected through the plant nursery trade, posing a significant threat to the rose industry.

Many commercial growers now report declines in rose sales from nursery production, citing the increasing concern among consumers and landscape managers. Nursery producers also suffer tremendous financial losses if the virus is identified in their stock, as all infected plant material must be destroyed.

One of the major concerns in this crisis is the lack of an instant detection method enabling early removal of infected plants. Current detection methods are sensitive, but also expensive, time-consuming, and require specialised personnel.

Fortunately, one avenue to explore is offered by recent breakthroughs in machine learning: the field of developing computer systems able to discover — or “learn” — patterns from provided examples. The breakthrough in question is called Deep Learning.

How Deep Learning can support agriculture

Deep Learning, a branch of machine learning, has appeared increasingly often in the news over the recent years. You may have heard about some of its uses — more effective face recognition, self-driving cars, or paintings generated by computers.

What these — often sensationalist — news stories fail to mention is that Deep Learning’s usefulness lies in recognizing patterns: all kinds of patterns in fact, whether in images, video, or audio. Deep Learning itself is a recent development in artificial neural networks — computer models inspired by biological brains, which are naturally predisposed to notice familiar patterns in the surrounding world.

In fact, Deep Learning is already enjoying substantial research, as well as increased usage, in agriculture in general, and horticulture in particular. Various so-called Deep Learning models count crop yields, determine the quality of fruits, detect weeds, or identify various wild plants. We at SoftwareMill have some experience in that regard, having personally developed models that automatically examine interactions between agriculture and forestation across large land areas — something that takes human experts substantial amounts of time.

One other use of Deep Learning in agriculture, of special relevance here, is detecting plant pests, or infections of various pathogens including bacteria, fungi and viruses.

What has not been considered so far is using Deep Learning for the diagnosis of Rose Rosette Disease.

We are going to take a look into how this new approach can help ensure that your roses are RRD-free and possibly, in the long run, help eradicate the disease.

How can Deep Learning help with managing Rose Rosette Disease?

The primary value that Deep Learning can introduce is automated detection of RRD. Here is how it works:

  • A large number of photographs of rose plants is gathered.
  • Experts label the photographs in various ways: whether the photo shows a healthy or a diseased plant, which parts of the depicted plant are diseased, etc.
  • Machine learning researchers set up a “fresh”, untrained model.
  • The model is trained, using the labelled photographs, to recognize healthy and infected roses.

Parts of this process are repeated, at various steps, until the model achieves an adequate ability to correctly recognize infection.
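For the technically curious, the steps above can be sketched in miniature. The snippet below is a toy illustration in plain Python, not our actual model: the “model” here merely learns a brightness-like threshold from labelled examples (the feature and the numbers are made up), standing in for the real, far more complex Deep Learning training loop.

```python
# Toy illustration of training from expert-labelled examples.
# Each "photo" is reduced to one made-up feature (e.g. average redness),
# since RRD symptoms include leaf reddening.

def train(labeled_examples):
    """Find the feature threshold that best separates healthy from infected."""
    best_threshold, best_correct = 0.0, -1
    for candidate in [x / 100 for x in range(101)]:
        correct = sum(
            1 for redness, label in labeled_examples
            if (redness > candidate) == (label == "infected")
        )
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold

def classify(redness, threshold):
    return "infected" if redness > threshold else "healthy"

# Expert-labelled photographs: (feature value, label) -- made-up data.
data = [(0.2, "healthy"), (0.3, "healthy"), (0.7, "infected"), (0.8, "infected")]
threshold = train(data)
print(classify(0.75, threshold))  # diagnosing a new, unseen photo
```

A real model learns millions of such thresholds jointly from raw pixels, which is why it needs so many labelled photographs.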

The detection process itself may be differentiated into several possible approaches. Let’s describe the two most relevant to the problem at hand: classification and segmentation.

For the purpose of RRD detection, classification means that the Deep Learning model places a photograph of a rose plant into one of at least two categories: healthy or infected. With more data, additional categories representing the stages of infection can be added.

The diagram below depicts the process, with some example inputs and outputs from preliminary research.

The process of classification. Photographs are fed into a trained Deep Learning model, the model estimates their RRD infection status.

With segmentation, a Deep Learning model is trained to actually recognize specific symptoms of infection, such as misshapen leaves, discoloration, an excessive amount of thorns, witches’ brooms, or other growth abnormalities. A segmentation model, in its output, marks the specific areas of the input photographs where symptoms are apparently present.

The diagram below shows, likewise based on preliminary research, how segmentation works.

The process of segmentation (simulated). Photographs are fed into a trained Deep Learning model, the model marks the areas with detected symptoms. In this example, abnormally growing leaves are marked in blue, and suspect stems in magenta.
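The essential difference from classification can be shown in a toy sketch: instead of one verdict per photo, a segmentation model produces a verdict per pixel. In this illustration (plain Python, made-up per-pixel symptom scores standing in for a real model’s output), a tiny 4×4 “image” is turned into a symptom mask.

```python
# Toy illustration of segmentation output: a label for every pixel.
SYMPTOM_THRESHOLD = 0.5

# Made-up per-pixel symptom scores for a tiny 4x4 "photograph".
scores = [
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.7, 0.9, 0.3],
    [0.0, 0.1, 0.2, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]

# 1 marks a pixel flagged as symptomatic, 0 marks apparently healthy tissue.
mask = [[1 if s > SYMPTOM_THRESHOLD else 0 for s in row] for row in scores]

flagged = sum(map(sum, mask))
total = sum(len(row) for row in scores)
print(f"{flagged}/{total} pixels flagged as symptomatic")  # 4/16 here
```

The mask is what lets a rosarian see not just *that* a plant looks infected, but *where*.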

The reason why we bother with both approaches is that they complement each other. Classification offers less information — we can only know how healthy a photographed plant is as a whole. On the other hand, classification Deep Learning models require much less effort across the board, i.e. fewer photographs collected, less researcher work, and a quicker training process. Classification models are also capable of higher detection speeds, as they are simply less complex.

Segmentation models, in turn, prove themselves to be more resource-intensive: precisely marking infected areas of a plant for every photograph is more laborious than just labeling the entire image as “healthy” or “infected”, not to mention other factors. However, the value of seeing which exact parts of the plant have been flagged as suspect cannot be overestimated.

It is additionally worth noting that, in machine learning in general, and Deep Learning in particular, multiple models can be used at the same time, in order to compare their respective results. Whether their interpretations of a given photo agree — or not — can provide more confidence in the results.
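A toy sketch of this idea, again in plain Python with made-up numbers: several models each estimate the probability of infection for the same photograph, and agreement between their verdicts serves as a rough confidence measure. The model names and outputs are purely illustrative.

```python
# Toy illustration of combining several models' opinions on one photo.

def verdict(probability, threshold=0.5):
    return "infected" if probability > threshold else "healthy"

# Made-up infection probabilities from three hypothetical models.
model_outputs = {"classifier_a": 0.92, "classifier_b": 0.88, "segmenter": 0.35}

votes = [verdict(p) for p in model_outputs.values()]
majority = max(set(votes), key=votes.count)
agreement = votes.count(majority) / len(votes)

print(f"{majority} (models agreeing: {agreement:.0%})")  # infected, 2 of 3
```

A unanimous verdict would warrant more trust than the split decision above, which might instead prompt a closer look by a human expert.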

What lies ahead

As already mentioned, the diagrams included in this article are based on actual preliminary research — we have taken a small sample of photographs, and trained prototype Deep Learning models to recognize Rose Rosette Disease symptoms. This may leave you under the mistaken impression that the problem has been mostly solved. Unfortunately, this is far from the truth, and many issues remain.

The most immediate problem is the number of photographs. To sufficiently train a Deep Learning model, thousands, tens of thousands, or even millions of properly labelled images need to be prepared. Unlike Artificial Intelligence from science fiction, real-life computer models are incapable of creativity or originality — they can only properly react and recognize inputs that are “familiar” to them in some way.

When data is insufficient, plants are misclassified, as in our preliminary research. The samples in the top row are healthy plants that have been erroneously classified as infected; the reverse applies to the bottom row.

This variety of images needs to encompass, among others, various plant growth stages, different growth locations, and, last but not least, plants affected by other, similar disorders. We do not want to, for example, misidentify a simple iron deficiency as late-stage RRD symptoms.

Another aspect is the epidemiological situation of RRD in the United States. The primary complication is multiflora rose. This rose species is extremely widespread in some parts of the US, and often considered a weed. Unfortunately, the multiflora rose can also be infected by RRD, and significantly contributes to the spread of the disease across the country. We need to take this factor into account, which entails developing additional models just for multiflora infections.

Regardless, the most crucial factor that contributes to the amount of research effort is the following: we do not want this to be a pie-in-the-sky, ivory tower project. The results of our investigation should be accessible to — and usable by — virtually every rosarian in the United States.

Our end goal is a smartphone application and/or a web page, where users can upload photographs taken from their cultivations for automatic diagnosis, easily and for free.

Controlling conditions. The top row contains photographs used for our preliminary research; the bottom row, sample pictures snapped “in the field”. Detecting RRD in the latter is our target. Note the varied conditions, such as lighting, backgrounds, or distance from the photographed plant.

We need to take into account that the photographs will be taken in various conditions — different lighting, varied backgrounds, incomplete framing of the subject plant. The complexity, therefore, increases further.
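One standard way of preparing a model for such variety is data augmentation: generating altered variants of each collected photograph so that the model sees more conditions than were actually photographed. The toy sketch below (plain Python, with a tiny grid of brightness values standing in for a real image) illustrates the idea; the specific transformations and numbers are just examples.

```python
# Toy illustration of data augmentation for "in the field" conditions.

def adjust_brightness(image, factor):
    """Simulate different lighting; pixel values stay capped at 1.0."""
    return [[min(1.0, pixel * factor) for pixel in row] for row in image]

def flip_horizontal(image):
    """Simulate mirrored framing of the subject plant."""
    return [row[::-1] for row in image]

original = [[0.2, 0.4], [0.6, 0.8]]  # a tiny stand-in "photograph"

augmented = [
    original,
    adjust_brightness(original, 1.5),  # brighter, as if in full sun
    adjust_brightness(original, 0.5),  # darker, as if in shade
    flip_horizontal(original),         # mirrored framing
]

print(f"1 photo became {len(augmented)} training examples")
```

Real augmentation pipelines go much further (rotations, crops, background changes), but the principle is the same: one field photograph can teach the model about many field conditions.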

End goal. A smartphone application and/or web page accessible to everyone in the USA.

We are confident that the end goal is achievable — that a valuable tool for combating RRD in America can be created, considerably complementing existing efforts such as the RRD Monitoring Network.

How you can help

As mentioned in the opening section, we at SoftwareMill have teamed up with the researchers from the University of Arkansas — Fayetteville who have been pioneering RRD research, from the characterization of the virus in 2011 to the current understanding of both the virus and its vector. We want to apply for a USDA grant fostering data-supported innovation in agriculture. As the USDA requires a clear demonstration of the project’s importance to the industry, we ask for your support to improve our chances of obtaining the funding. If this problem and the project are close to your heart, please contact us at roses@softwaremill.com, or Dr. Ioannis Tzanetakis (itzaneta (at)uark(dot)edu) directly, for further information.
