Posts

Conclusion | Automatic Audio Replacement of Objectionable Content for Sri Lankan Locale

In Sri Lanka, the government temporarily banned a few social media platforms on several occasions between 2018 and 2019 to control violence between communities, which was being accelerated by hate speech and racist fake news spreading through social media. Yet there is no proper mechanism to specifically filter and replace objectionable content in audio for the Sri Lankan locale. To address this issue, we proposed a system for the Sri Lankan locale that automatically detects and replaces objectionable content in audio. To selectively filter out potentially objectionable audio content, the input audio is first preprocessed and converted into text. The filtering mechanism then detects racist, cursing, and sexist content along with its location and timestamps. Afterwards, the detected objectionable content is seamlessly replaced with a predetermined audio input. The model was tested against a
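The replacement step described above can be illustrated with a minimal sketch. It assumes the audio is already available as a mono array of float samples and that the filtering stage has produced a list of flagged time spans in seconds. The function name `replace_spans_with_tone`, the 1 kHz tone, and the example span are hypothetical choices made for illustration; the excerpt does not say which replacement audio or library the system actually uses.

```python
import numpy as np


def replace_spans_with_tone(samples, sample_rate, spans, tone_hz=1000.0, amplitude=0.3):
    """Return a copy of `samples` with each (start_s, end_s) span overwritten by a sine tone.

    `spans` holds timestamps in seconds, as a detection stage might report them.
    The tone stands in for the "predetermined audio input" (an assumption).
    """
    out = np.array(samples, dtype=np.float32, copy=True)
    for start_s, end_s in spans:
        # Convert timestamps to sample indices and clamp them to the signal length.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(len(out), int(round(end_s * sample_rate)))
        if end <= start:
            continue  # Skip empty or out-of-range spans.
        t = np.arange(end - start) / sample_rate
        out[start:end] = amplitude * np.sin(2 * np.pi * tone_hz * t)
    return out


# Usage: two seconds of silence with one flagged span from 0.5 s to 1.0 s.
sample_rate = 16000
audio = np.zeros(sample_rate * 2, dtype=np.float32)
flagged_spans = [(0.5, 1.0)]
cleaned = replace_spans_with_tone(audio, sample_rate, flagged_spans)
```

Only the flagged samples are overwritten, so the output keeps the original length and timing. A production system would likely also cross-fade at the span boundaries to avoid audible clicks.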

System Evaluation

After the models were finalized, the hyperparameters of the classifiers and vectorizers were tuned for optimum accuracy using the pipeline mechanism in Scikit-learn. The classification reports of the final chosen models for Sinhala, Tamil and English are shown in TABLE XV. The performance of the finalized binary classification models was evaluated using the Receiver Operating Characteristic (ROC) curves in Fig. 1, Fig. 2 and Fig. 3, which display the graphical relationships among the metrics shown in TABLE XIV. Using the ‘Area Under the Curve’ (AUC) measure, it can be observed from Fig. 1, Fig. 2 and Fig. 3 that each classifier performs better than a classifier with no discriminative power, i.e. one whose AUC is 0.5.

Fig. 1. Receiver Operating Characteristic Curve of the model for Sinhala.
Fig. 2. Receiver Operating Characteristic Curve of the model for Tamil.
Fig. 3. Receiver Operating Characteristic Curve of the model for English.

The Precision vs Recall curve can be used to measure
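The tuning and ROC evaluation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' configuration: the excerpt does not name the vectorizer or classifier, so `TfidfVectorizer` and `LogisticRegression` are stand-ins, the parameter grid is arbitrary, and the toy sentences use placeholder tokens (`slur_a`, `slur_b`, `slur_c`) instead of real data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Toy corpus (hypothetical): label 1 = objectionable, label 0 = clean.
clean_texts = [
    "good morning everyone", "thank you for the help",
    "see you at the market", "the weather is lovely today",
    "please call me later", "we enjoyed the music",
    "dinner was very tasty", "the train arrives at noon",
] * 3
flagged_texts = [
    "you are a slur_a", "slur_b go away",
    "what a slur_a slur_c", "nobody likes a slur_b",
    "shut up slur_c", "typical slur_a behaviour",
    "that slur_b again", "slur_c and slur_a",
] * 3
texts = clean_texts + flagged_texts
labels = [0] * len(clean_texts) + [1] * len(flagged_texts)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Vectorizer and classifier live in one pipeline so both are tuned together.
pipeline = Pipeline([
    ("vec", TfidfVectorizer()),
    ("clf", LogisticRegression()),
])

# Step-name prefixes ("vec__", "clf__") route each parameter to its stage.
param_grid = {
    "vec__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="roc_auc")
search.fit(X_train, y_train)

# ROC analysis uses the predicted probability of the positive class.
scores = search.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)

print("best parameters:", search.best_params_)
print("test AUC:", auc)
```

Plotting `tpr` against `fpr` gives a curve of the kind shown in the figures. An AUC of 0.5 corresponds to the diagonal of a classifier with no discriminative power, so values above 0.5 indicate performance better than chance.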