Abstract:
Crying, the universal language of infants, encodes vital information about an infant's
physiological and psychological health. Experienced caregivers can understand
the cause of a cry from its pitch, tone, intensity, and duration. Similarly, pediatricians can
diagnose hearing impairments, brain damage, and asphyxia by analyzing cry signals,
which provides a non-invasive mechanism for early diagnosis in the first few months. Hence, automated
cry classification has gained great importance in the fields of medicine and baby care.
With the emergence of the Internet of Things coupled with Artificial Intelligence,
baby monitors have recently gained huge popularity due to features such as sleep analysis,
cry detection, and motion analysis through multiple sensors. Because cry classification involves
real-time audio processing, most existing solutions either have complex and costly designs or
rely on distributed computing, which raises privacy concerns for users. This research presents a
low-cost intelligent hardware system for real-time infant cry detection and classification. The
proposed solution covers the selection of hardware that suits the requirements of audio processing
while adhering to financial constraints, and the firmware design, which includes voice
activity detection, cry detection, and cry classification. Its novelty lies in using a multi-agent
system as a resource management concept, demonstrating that AI concepts can also be extended
to resource-limited hardware platforms. The firmware and algorithms are designed
to keep accuracy above 90% while processing the audio signal faster
than it is produced, so that the system remains stable. A voice activity detector was designed to filter human
voice through temporal features, while cry detection and classification were based, respectively,
on an Artificial Neural Network and the K-Nearest Neighbor algorithm, both trained with a spectral-domain
feature vector of Mel Frequency Cepstral Coefficients (MFCC). Evaluations under
diverse conditions showed accuracies of 96.76% for cry detection and 77.45% for cry
classification.
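
As an illustration of the first firmware stage, the following is a minimal sketch of a voice activity detector driven by temporal features. The abstract does not say which temporal features are used, so short-time energy and zero-crossing rate are assumptions here, as are the function names and threshold values; it is not the authors' implementation.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Flag a frame as voice-like: enough energy and a low crossing rate.

    Both thresholds are hypothetical and would need tuning on real data.
    """
    return (short_time_energy(frame) > energy_thresh
            and zero_crossing_rate(frame) < zcr_thresh)

def make_tone(freq_hz, sample_rate, n, amplitude):
    """Synthetic sine tone, used only to exercise the detector."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]
```

In a pipeline of the kind described above, only frames flagged by such a detector would be passed on to MFCC extraction and the classifiers, which keeps the more expensive stages idle during silence.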