A major advance in AI training
This is not the first COVID cough classification algorithm to be developed, but the RMIT model outperforms existing approaches and has another major advantage that makes it more practical to use across different regions – the way it learns.
Study co-author Professor Flora Salim said previous attempts to develop this type of technology, such as those at MIT and Cambridge, relied on huge amounts of meticulously labelled data to train the AI system.
“The annotation of respiratory sounds requires specific knowledge from experts, making it expensive and time-consuming, and involves handling sensitive health information,” she said.
“Using a narrowly-targeted data set – such as cough samples from one hospital or one region – to train the algorithm also limits its performance outside that setting.”
Salim said this limitation had, until now, held back the technology’s practical application in the real world.
“What’s most exciting about our work is we have overcome this problem by developing a method to train the algorithm using unlabelled cough sound data,” she said.
“This can be acquired relatively easily and at larger scale from different countries, genders and ages.”
During the pandemic, many crowdsourcing platforms have been designed to gather respiratory sound recordings from both healthy and COVID-19 positive groups for research purposes.
The team accessed datasets from two of these platforms – COVID-19 Sounds App and COSWARA – to train the algorithm using contrastive self-supervised learning, a method by which a system works independently to encode what makes two things similar or different.
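For readers curious how a contrastive objective works in practice, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RMIT team's implementation: the function name `info_nce_loss`, the `temperature` value and the random stand-in embeddings are all hypothetical, and the loss shown is a generic InfoNCE-style objective. It assumes an encoder has already turned two altered versions of each unlabelled recording into vectors, and it rewards the model when the two versions of the same recording score as more similar to each other than to any other recording in the batch.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative sketch only).

    z_a[i] and z_b[i] are assumed to be embeddings of two views of the
    same unlabelled recording; every other row acts as a negative.
    """
    # L2-normalise so the dot product equals cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarities, scaled by the (assumed) temperature.
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The matching pair sits on the diagonal; no labels are needed.
    return float(-np.mean(np.diag(log_prob)))


# Random vectors stand in for encoder outputs (hypothetical data).
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
mismatched = info_nce_loss(z, np.roll(z, 1, axis=0))
print(aligned, mismatched)
```

In this toy example the loss is lower when each row is paired with a lightly perturbed copy of itself than when the pairs are deliberately misaligned, which is the signal a contrastive model uses to learn from data that has no diagnostic labels.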
The team are open to collaborating with potential partners on developing the technology and expanding its application for a range of respiratory diagnostic tools.
‘Exploring Self-Supervised Representation Ensembles for COVID-19 Cough Classification’ is available now in pre-print, ahead of its presentation at the prestigious data science conference KDD 2021 in Singapore this August.
This research is supported by the Australian Research Council (ARC) Discovery Project DP190101485.
Story: Michael Quin