Predicting rice crop disease: a data modelling challenge


Rice is the staple food of more than half of the world’s population [1] – more than 3.5 billion people depend on rice for more than 20% of their calorie intake. This includes 70% of the world’s 1.3 billion poorest people who live in Asia. Hence, any improvement in the production of this crop can have a big impact on the wellbeing of many of these people.

Rice Blast is a disease that is a major problem for rice grown in paddy fields all over the world. It causes heavy crop loss and makes it necessary to apply large amounts of fungicide. Thus, if we can anticipate the conditions in which Rice Blast occurs, we can win on two counts: reducing crop loss and reducing fungicide use. This can be achieved by taking preventive measures in advance, which limits crop loss and greatly shortens the period of fungicide spraying.

IRIS Advanced Engineering has participated in a major EU-funded project, Rice Guard [2], to conduct empirical in situ studies of Rice Blast prediction using the widely referenced Yoshino model [3] and to compare it with a more recent model called WARM [4]. The Yoshino model determines infection periods of the pathogen based on weather conditions.

The Yoshino model has two steps. The first step checks for conditions that are favourable to Rice Blast, and the second step estimates the number of “infection hours”. The first step evaluates three “rules”:

  1. The moving average of air temperature during the past 5 days is between 20 and 25°C.
  2. The rainfall is below 4 mm per hour.
  3. The continuous wet period is 4 hours longer than a base wet-hours value, which is calculated from the ambient temperature while the leaves are wet.
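
As an illustration, the three rules can be written as a single check for one hour of data. This is only a minimal sketch and not the project's implementation: the function and parameter names are hypothetical, the base wet-hours value is passed in because its formula is not given here, and we assume that "4 hours more" means "at least 4 hours longer".

```python
from statistics import mean


def favourable_hour(daily_mean_temps_c, rainfall_mm_per_h, wet_hours, base_wet_hours):
    """Return True when all three first-step rules hold for one hour.

    daily_mean_temps_c: daily mean air temperatures, most recent last
    rainfall_mm_per_h:  rainfall during the hour being checked
    wet_hours:          length of the current continuous wet period
    base_wet_hours:     base wet-hours value (supplied by the caller;
                        its temperature-dependent formula is not given here)
    """
    # Rule 1: 5-day moving average of air temperature between 20 and 25 °C.
    temp_avg = mean(daily_mean_temps_c[-5:])
    rule1 = 20.0 <= temp_avg <= 25.0

    # Rule 2: rainfall below 4 mm per hour.
    rule2 = rainfall_mm_per_h < 4.0

    # Rule 3 (assumption: "4 hours more" read as "at least 4 hours longer").
    rule3 = wet_hours >= base_wet_hours + 4.0

    return rule1 and rule2 and rule3
```

For example, `favourable_hour([21, 22, 23, 24, 22], 0.5, 14, 9)` is true, while the same call with rainfall of 5 mm per hour is false.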

The second step of the Yoshino model then estimates the number of “infection hours”, that is, the hours in which all three conditions of the first step are true. Finally, a risk value is assigned according to the number of “infection hours”. It has four levels: zero (0 hours), low (between 1 and 3 hours), medium (between 3 and 6 hours) and high (more than 6 hours).
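
The mapping from infection hours to a risk level can be sketched as follows. The function names are hypothetical, and because the low and medium ranges above both include 3 hours, we assume here that exactly 3 hours counts as low and exactly 6 hours counts as medium.

```python
def count_infection_hours(hourly_favourable):
    """Count the hours in which all first-step conditions were true."""
    return sum(1 for is_favourable in hourly_favourable if is_favourable)


def risk_level(infection_hours):
    """Map a number of infection hours to one of the four risk levels.

    Assumption: boundary values go to the lower level
    (3 hours -> "low", 6 hours -> "medium").
    """
    if infection_hours <= 0:
        return "zero"
    if infection_hours <= 3:
        return "low"
    if infection_hours <= 6:
        return "medium"
    return "high"
```

A day with five favourable hours, for instance, gives `risk_level(5) == "medium"`.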

The WARM (Water Accounting Rice Model) model, on the other hand, is somewhat more complex, given that it predicts eight different conditions as output (such as soil hydrology, grain quality and floodwater effect) as well as diseases. It is a simulation model: given a set of initial conditions, it evolves over a number of time steps to predict what the situation will be in N steps. The disease module uses several subsets of input variables, such as weather data (rain, wind, temperature), leaf wetness (hourly: wind, humidity, rain) and crop variables (e.g. green leaf area, host tissue).

In [5], Katsantonis et al. conduct a detailed review of many Rice Blast forecasting models. For the Rice Guard project, data capture is a key aspect, and advances in IoT (Internet of Things) technology have allowed the implementation of wireless networks and radio-frequency communications that link in-situ sensors to capture the meteorological data (see Image 1).

 

The sensor development was carried out by IRIS and was a key enabler of data capture, and therefore of prediction. Also, as part of its AgriBox portfolio of solutions, IRIS develops software platforms for tailored sensor networks that respond to different monitoring and decision-support needs in the agricultural and agri-food sectors.

To contrast these two state-of-the-art models, at IRIS we also tried an ad-hoc machine learning approach. We used a machine learning algorithm (M5Rules [6]) to build a data model from the data of one site and used it as an early warning to predict the onset of Rice Blast at another, geographically distant site. As a tough test, we trained the model using data from paddy fields located in Isla Mayor (Seville, Spain), and then tested it on (unseen) data from paddy fields located in Kalochori (Greece). In both cases, the data covered the summer period from mid-June through to the end of August 2016.


Key input attributes include moving averages over the previous 1, 3 and 7 days of the humidity and temperature readings. The output attribute is a FLAG, a continuous value which indicates the risk of Rice Blast occurring. However, rather than just correlating the output of the model with the real flag, we also compared the firing of the rules of the data model over time. To obtain the rules, we trained a rule/decision tree induction algorithm (M5Rules) and then reprogrammed the rules individually in C++, maintaining their firing order based on the decision tree logic. This enabled us to identify individual rule firings (when they become true), plot them over time and see how they correlate with the onset of the real Rice Blast outbreak, as shown in Figs. 1 and 2.
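
A minimal sketch of how such moving-average attributes could be built from a series of humidity readings is shown below. It rests on several assumptions that are ours, not the project's: the readings are hourly, the window includes the current reading, and a shorter window is used at the start of the series. The helper names are hypothetical; the attribute names relhum1, relhum3 and relhum7 follow those that appear in the induced rules.

```python
def trailing_mean(values, window):
    """Mean of the last `window` readings ending at each position.

    At the start of the series, where fewer than `window` readings
    exist, the mean of the readings seen so far is used.
    """
    means = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the reading that has just left the window.
            running_sum -= values[i - window]
        means.append(running_sum / min(i + 1, window))
    return means


def humidity_features(relhum_hourly, hours_per_day=24):
    """Build the 1-, 3- and 7-day moving averages of relative humidity."""
    return {
        "relhum1": trailing_mean(relhum_hourly, 1 * hours_per_day),
        "relhum3": trailing_mean(relhum_hourly, 3 * hours_per_day),
        "relhum7": trailing_mean(relhum_hourly, 7 * hours_per_day),
    }
```

The same helper can be applied to the temperature readings to obtain the corresponding temperature attributes.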

 

 

Rule 3:

IF relhum1 > 57.85 AND relhum3 < 61.25
THEN FLAG = 0.0017 x hour + 0.0012 x relhum + 0.003 x relhum1 - 0.0114 x relhum3 - 0.0096 x relhum7 + 1.9487
[163/0%]

Rule 5:

IF relhum7 < 62.3
THEN FLAG = 0.0053 x hour - 0.0005 x leafwet + 0.0012 x windspeed + 0.0769 x relhum7 - 4.5537
[49/0%]

 

In rule 3 we see that 163 cases were predicted with 0% training error. Rule 3 uses two moving averages (relative humidity over the last day and over the last 3 days) in its antecedent to predict Rice Blast (FLAG), which is quantified in the consequent using the time of day, the current relative humidity and the three humidity moving averages (1 day, 3 days and 7 days). In rule 5, the consequent uses some different variables: leaf wetness and wind speed.
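
To make the rule-firing idea concrete, here is a small sketch of rules 3 and 5 evaluated in order, so that the first rule whose antecedent is true "fires" and supplies the FLAG value. It is written in Python for brevity rather than the C++ used in the project, it contains only the two rules listed above (the remaining rules are not reproduced), and the record layout and function names are hypothetical.

```python
def rule3_fires(r):
    return r["relhum1"] > 57.85 and r["relhum3"] < 61.25


def rule3_flag(r):
    return (0.0017 * r["hour"] + 0.0012 * r["relhum"] + 0.003 * r["relhum1"]
            - 0.0114 * r["relhum3"] - 0.0096 * r["relhum7"] + 1.9487)


def rule5_fires(r):
    return r["relhum7"] < 62.3


def rule5_flag(r):
    return (0.0053 * r["hour"] - 0.0005 * r["leafwet"]
            + 0.0012 * r["windspeed"] + 0.0769 * r["relhum7"] - 4.5537)


# Rules are kept in their firing order; only rules 3 and 5 are included here.
RULES = [
    (3, rule3_fires, rule3_flag),
    (5, rule5_fires, rule5_flag),
]


def evaluate(record):
    """Return (rule number, FLAG value) for the first rule that fires.

    Returns (None, None) when none of the listed rules applies.
    """
    for number, fires, flag in RULES:
        if fires(record):
            return number, flag(record)
    return None, None
```

Recording the rule number returned for each timestamp gives the kind of series that can be plotted against the real flag.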


Fig. 1. Seville (Spain) paddy field data used for model training, 8th July – 28th July 2016

In Fig. 1 we see the modelling results for the Seville paddy field (data used for training). The rule firings of the predictive model (WHICHRULE, red) anticipate and coincide with the Rice Blast event (FLAG, black), and the model output (MODEL, red dotted) follows the flag very closely. The Rice Blast onset event occurs in the middle of the time period shown (from 18/7 to 21/7), as indicated by the FLAG (black). On the left it can be seen that rule 4 fires on 17/7 (the vertical axis indicates which rules are “true”). This is followed by rules 5 and 6, also on 17/7 – 18/7, and then by rules 3, 7 and 8, which coincide with the flag going positive from 18/7 to 21/7.

 

Fig. 2. Kalochori (Greece) paddy field data used for model testing, 20th June – 10th July 2016

In Fig. 2 we see the modelling results for the Kalochori paddy field (data used for testing). Again, the rule triggers (WHICHRULE, red) coincide with the onset of the Rice Blast event (FLAG, black). On the left it can be seen that rule 2 (the vertical axis indicates which rules are “true”) is the first to fire, followed by rules 4, 3 and 5 over the period 23/6 to 30/6. This coincides with the true Rice Blast onset event from 22/6 to 6/7 indicated by the FLAG (black). Hence there is a strong correlation between the rule firings of our model and the onset of the Rice Blast disease.

If we compare our model with the rice leaf blast risk indicators of the Yoshino and WARM models in Fig. 2, it appears that our induced rule model triggers before the other models; however, it is less sensitive to subsequent events.

To summarise, we have used input data constructed from different moving averages (1, 3 and 7 days) to obtain triggers that detect the conditions which precede the onset of Rice Blast disease. The results show that our rule firings give an early warning signal before the outbreak actually becomes evident, making it possible to take effective preventive action, reduce crop loss and minimise fungicide spraying. However, although this seems really promising, further validation is required: for example, false positives have to be studied further, and the model has to be tested in other geographical scenarios and over consecutive crop growing seasons.

Rice Guard is just one of more than a dozen applied research projects in which IRIS participates, making real advances in fields which have a positive impact on economic and social development around the world.

Acknowledgements

The author is grateful to Pau Puigdollers (formerly of IRIS Advanced Engineering) and Dr. Dimitrios Katsantonis (Hellenic Agricultural Organization-DEMETER, Institute of Plant Breeding and Genetic Resources) for their support in providing and explaining the data. Thanks also to Carlos Urrego, Creative Marketing Designer at IRIS, for the photo-composition seen in the first image.

The Rice Guard project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement nº 606583.

References

[1] www.knowledgebank.irri.org/ericeproduction/Importance_of_Rice.htm

[2] Rice Guard project – www.riceguard.eu

[3] Yoshino R. (1979). Ecological studies on the penetration rice blast fungus, Pyricularia oryzae into leaf epidermal cells [in Japanese, English summary]. Bulletin of the Hokuriku National Agricultural Experiment Station 22, 163–221.

[4] Bregaglio S., Titone P., Cappellia G., Tamborini L., Mongiano G., Confalonieri R. (2016). Coupling a generic disease model to the WARM rice simulator to assess leaf and panicle blast impacts in a temperate climate, European Journal of Agronomy 76, pages 107–117.

[5] Katsantonis, D., Kadoglidou, K., Dramalis, C., Puigdollers, P. (2017). Rice blast forecasting models and their practical value: a review, Phytopathologia Mediterranea 56, 2, pages 187–216.

[6] Holmes, G., Hall, M., Frank, E. (1999). Generating Rule Sets from Model Trees, in Twelfth Australian Joint Conference on Artificial Intelligence, pages 1–12.

 

Refs images:

  1. Data capture sensor station located at the paddy field – http://multisite.iris.cat/riceguard/files/2014/09/Riceguard_WeatherStation.jpg
  2. Rice Blast – different grades of leaf lesion – http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0026260
PhD David Nettleton
Data Scientist
PhD in Artificial Intelligence
