Towards an Audio-based CNN for Classroom Observation on a Smartwatch

Document Type

Conference Proceeding

Publication Date



Classroom observation is an important tool to help achieve the United Nations' fourth Sustainable Development Goal on quality and inclusive education. However, deploying this tool manually is expensive and does not fit the resource constraints in the parts of the world where it is needed the most: Sub-Saharan Africa and South and Central Asia. This paper presents the design and an initial implementation of an automated classroom observation system based on a convolutional neural network (CNN) that was optimized using the Hyperband approach. The system implements parts of the Stallings class observation system on a teacher's smartwatch and uses audio data only. Based on 'data in the wild' collected in Pakistan, the CNN performed close to the level of human experts on unseen data (Cohen's Kappa = 0.687 against human-annotated data). The F1-measure was 0.78 on unseen data. An Apple 4 smartwatch natively running the CNN was able to provide real-time inference (< 1 second for 3-second audio segments).
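For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's Kappa can be computed between human annotations and model predictions. It is not the authors' evaluation code: the function name `cohen_kappa`, the activity labels, and the six toy segments are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: fraction of segments where both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal frequencies, summed over all labels.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six 3-second audio segments (illustration only).
human = ["lecture", "lecture", "qa", "qa", "silence", "lecture"]
model = ["lecture", "qa", "qa", "qa", "silence", "lecture"]
kappa = cohen_kappa(human, model)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is a stricter comparison with human experts than plain accuracy when some classroom activities are much more frequent than others.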
