A supervised machine learning method to detect anomalous real-time broiler breeder body weight data recorded by a precision feeding system

You, J., E. Lou, M. Afrouziyeh, N. M. Zukiwsky, and M. J. Zuidhof. 2021. A supervised machine learning method to detect anomalous real-time broiler breeder body weight data recorded by a precision feeding system.

A precision feeding (PF) system is an intelligent computer-controlled feeding system that can be used to feed individual broilers, breeders, or layers automatically based on real-time body weight (BW) measurements. Vast amounts of real-time BW data can be obtained every day when birds visit a PF station. However, anomalous observations occur in the real-time BW data and should be removed. Manually cleaning the anomalies is accurate, but it is time-consuming and labor-intensive. The objective of this study was to report a supervised machine learning method to detect anomalies in real-time BW recorded by the PF system.

Approach

The data were obtained from a flock of Ross 708 broiler breeders (n = 95) raised in 2 environmentally controlled chambers. Each chamber had 2 PF stations that fed the birds and recorded data automatically. The breeders were fed by the PF stations from day 15 to the end of the trial (day 306). To manually label anomalous real-time BW observations, 5 breeders were randomly selected from the flock and their real-time BW data from day 15 to day 306 were checked. An observation was defined as an anomaly (anomalous observation) if its real-time BW value deviated markedly from the other observations on the same day and the deviation could not be explained by the bird's feeding activities. Python 3.7.0 was used for all data analysis, including data preprocessing, feature engineering, algorithm selection, and comparison with other common anomaly detection methods.
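The labeling criterion above compares each observation with the others recorded on the same day. A minimal sketch of per-day deviation features of that kind is given below, in Python since that was the language used for the analysis. The function name, the chosen features (visit count, deviation from the daily median, a robust score based on the median absolute deviation), and the sample weights are illustrative assumptions, not the variables or data used in the study.

```python
from collections import defaultdict
from statistics import median


def daily_deviation_features(records):
    """Compute per-observation features relative to the same day's readings.

    records: list of (day, bw_grams) tuples for one bird (hypothetical layout).
    Returns one feature dict per input record, in the same order.
    """
    # Group the body weight readings by day.
    by_day = defaultdict(list)
    for day, bw in records:
        by_day[day].append(bw)

    features = []
    for day, bw in records:
        day_values = by_day[day]
        day_median = median(day_values)
        # Median absolute deviation: a spread estimate that is not
        # dominated by a single extreme reading.
        mad = median(abs(v - day_median) for v in day_values)
        features.append({
            "day": day,
            "bw": bw,
            "n_visits": len(day_values),
            "dev_from_median": bw - day_median,
            "robust_score": (bw - day_median) / mad if mad > 0 else 0.0,
        })
    return features


# Hypothetical readings: one implausibly high value on day 20.
records = [(20, 410), (20, 415), (20, 820), (20, 412), (21, 430), (21, 428)]
feats = daily_deviation_features(records)
for row in feats:
    print(row)
```

Features like these describe only the data distribution. The study additionally used features describing the feeding activity recorded by the PF system, which are not reproduced here.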

Analysis of Results

Four machine learning algorithms were evaluated: k-nearest neighbor (KNN), random forest classifier (RF), support vector machine (SVM), and artificial neural network (ANN). KNN had the highest precision (0.9746) and SVM had the highest recall (0.9917). RF produced the highest F1 score (0.9712) and area under the precision-recall curve (0.9948). Compared with 4 other common anomaly detection methods, namely Z-scores, interquartile range (IQR), density-based spatial clustering of applications with noise (DBSCAN), and local outlier factor (LOF), RF had a higher average F1 score (0.9448), which indicated that RF was a more effective anomaly detection algorithm for this type of data. RF detected almost all anomalous observations while misclassifying few normal observations as anomalies.
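For readers less familiar with the metrics reported here, the sketch below shows how precision, recall, and F1 score are computed from manual labels and predicted labels, with anomalies coded as 1 and normal observations as 0. The function name and the label vectors are hypothetical and serve only to illustrate the arithmetic. They are not the study's data or code.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal readings flagged as anomalies
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical manual labels and classifier predictions.
y_true = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

High precision means few normal observations are discarded by mistake, and high recall means few anomalies are left in the data. F1 balances the two, which is presumably why it was used to rank the algorithms.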

Application

The current study was the first to propose a supervised machine learning method to detect anomalies in real-time BW data of broiler breeders collected by a PF system. Real-time BW data of 5 randomly selected broiler breeders were used. To detect the anomalous observations over the trial period (day 15 to day 306), 12 variables were created for each day, covering the statistical distribution of the data and the feeding activity recorded by the PF system, and machine learning algorithms were then used to distinguish anomalies from normal observations. RF was selected as the best of the 4 machine learning algorithms because it had the highest F1 score (0.9712) and area under the precision-recall curve (AUCPR; 0.9948). Compared with common anomaly detection methods (Z-scores, IQR, DBSCAN, and LOF) that checked only the data distribution, the RF method had a higher average F1 score (0.9448). The current study provided an effective solution for cleaning anomalous observations from real-time BW data of broiler breeders fed by the PF system.
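To make the distribution-only baselines concrete, the sketch below implements two of them, the Z-score rule and the IQR rule, on a hypothetical set of readings from one day. The thresholds (3 standard deviations, 1.5 times the IQR) are common conventions assumed here for illustration. The text does not state which thresholds the study used. Note that `statistics.quantiles` requires Python 3.8 or later.

```python
from statistics import mean, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [False] * len(values)
    return [abs((v - mu) / sigma) > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Hypothetical BW readings (g) for one bird on one day; 820 is implausible.
values = [410, 415, 820, 412, 408, 414, 411, 413]

print("z-score:", zscore_flags(values))
print("IQR:    ", iqr_flags(values))
```

On this hypothetical sample the IQR rule flags the 820 g reading, whereas the Z-score rule at a threshold of 3 does not: with only 8 readings, a single extreme value inflates the standard deviation enough to keep its own z-score below 3. Neither rule can use information about feeding activity, which is the gap the supervised approach addresses.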

Abstract

A precision feeding (PF) system is an intelligent computer-controlled feeding system that can be used to feed individual broilers, breeders, or layers automatically based on real-time body weight (BW) measurements. Vast amounts of real-time BW data can be generated every day when birds visit a PF station. However, anomalous observations occur in the real-time BW data. They are caused by multiple birds entering the station at the same time, by upward or downward variation in the scale measurement due to movement of the bird, or by a misread of the radio frequency identification tag. Known anomalous data should be removed because they have a negative impact on the interpretation of the data. Manually cleaning the anomalies is accurate, but it is time-consuming and labor-intensive. Statistical methods and unsupervised machine learning methods are effective in detecting anomalies only to some extent because they check only the data distribution. The current study reported a supervised machine learning method to detect anomalies in real-time BW recorded by the PF system. Real-time BW data of 5 broiler breeders from day 15 to day 306 were checked and the anomalies were manually labeled. Variables describing the statistical distribution of the data and features describing the feeding activity recorded by the PF system on each day were extracted from the dataset. Among the 4 machine learning algorithms evaluated, namely k-nearest neighbor (KNN), random forest classifier (RF), support vector machine (SVM), and artificial neural network (ANN), RF produced the highest F1 score (0.9712) and area under the precision-recall curve (0.9948). Compared with 4 other common anomaly detection methods, namely Z-scores, interquartile range (IQR), density-based spatial clustering of applications with noise (DBSCAN), and local outlier factor (LOF), RF had a higher average F1 score (0.9448), which indicated that RF was a more effective anomaly detection algorithm for this type of data.