Special Algorithms
Mel Frequency Cepstral Coefficients (MFCC)
The ultimate goal of using MFCC in our project is to transform the audio signal into vector form so that the AI model can be trained to predict the words corresponding to a string of phonemes in the audio signal. The key observation about speech is that the sounds produced by a human are filtered by the shape of the vocal tract, including the tongue, teeth and so on. This shape determines what sound comes out. If we can determine the shape accurately, we obtain an accurate representation of the phoneme being produced, and these phonemes are used to identify the words in an automatic speech recognizer. Through this method, most clear audio inputs can be analysed, provided their noise level is minimal.
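The snippet below is a minimal sketch of how an audio clip could be converted into MFCC feature vectors, assuming the librosa library; the file name and parameter values are illustrative and not taken from the project.

```python
# Minimal sketch of MFCC feature extraction (assumes librosa is installed).
import librosa

def extract_mfcc(path, n_mfcc=13):
    # Load the audio clip; sr=None keeps the file's native sampling rate.
    signal, sr = librosa.load(path, sr=None)
    # Compute MFCCs: each column is a feature vector for one short frame.
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    # Return one vector per frame (frames x coefficients) for model training.
    return mfcc.T

features = extract_mfcc("sample.wav")  # "sample.wav" is a placeholder file name
print(features.shape)                  # e.g. (number_of_frames, 13)
```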
Hidden Markov Model (HMM)
For the speech recognition module of the project, HMM is used for the purpose of speech tagging. When converting an audio file to text, the system needs to identify each phoneme in the audio clip and map it to a word. Here, HMM provides an estimate of how well a given sequence of speech segments matches a string of phonemes. Since a string of phonemes can be mapped to a word, HMM becomes a core technique for finding the most probable words for the inserted audio clip. The main motivation for choosing this strategy is its simplicity and the availability of training algorithms for estimating the parameters of the models from finite training sets of speech data.
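As a rough illustration of how an HMM can score how well a sequence of speech frames matches a word model, the sketch below uses the hmmlearn library; the random arrays stand in for real MFCC frames and are purely illustrative.

```python
# Minimal sketch of training and scoring a Gaussian HMM (assumes hmmlearn).
import numpy as np
from hmmlearn import hmm

# Pretend these are MFCC frames from training examples of one word.
train_frames = np.random.randn(200, 13)

# Fit a Gaussian HMM with a small number of hidden states.
model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=50)
model.fit(train_frames)

# For a new clip, the log-likelihood estimates how well its frames match
# this word model; the word whose model scores highest would be chosen.
test_frames = np.random.randn(80, 13)
print(model.score(test_frames))
```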
Support Vector Machine (SVM)

SVMs are built on the Structural Risk Minimization principle from computational learning theory. SVMs are universal learners: in their basic form they learn a linear threshold function, but by simply "plugging in" a suitable kernel function they can learn polynomial classifiers, radial basis function (RBF) networks, and three-layer sigmoid neural nets. One remarkable feature of SVMs is that their ability to learn can be independent of the dimensionality of the feature space. SVMs measure the complexity of a hypothesis by the margin with which it separates the data, not by the number of features. This means it is possible to generalize even in the presence of many features, provided the data is separable with a wide margin using functions from the hypothesis space.
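The following sketch shows how a kernel is "plugged in" when training an SVM with scikit-learn's SVC; the synthetic data and the particular list of kernels are illustrative assumptions, not part of the project.

```python
# Minimal sketch of swapping kernel functions in an SVM (assumes scikit-learn).
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Synthetic binary classification data, for illustration only.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel)      # same learner, different kernel plugged in
    clf.fit(X, y)
    print(kernel, clf.score(X, y))  # training accuracy per kernel
```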
To find out what methods are promising for learning text classifiers, the properties of text should be analyzed.
● High dimensional input space: When learning text classifiers, it is common to deal with a very large number of features, often more than ten thousand. Because SVMs use over-fitting protection that does not necessarily depend on the number of features, they have strong potential to handle these large feature spaces.
● Few irrelevant features: One way to avoid high dimensional input spaces is to assume that most of the features are irrelevant and to remove them through feature selection. In the case of text categorization, however, there are very few irrelevant features.
● Most text categorization problems are linearly separable.
Taking the aforementioned arguments into consideration, and in order to create a model that does not over-fit the training data but generalizes well, the SVM approach was chosen for our binary classification models, as sketched below.
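Below is a minimal sketch of such a binary text classifier built from a TF-IDF representation and a linear SVM, assuming scikit-learn; the tiny corpus and labels are invented for illustration.

```python
# Minimal sketch of a binary text classifier with a linear SVM (assumes scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy corpus and labels, purely for illustration.
texts = ["great clear audio", "noisy unusable clip", "clean recording", "too much noise"]
labels = [1, 0, 1, 0]  # 1 = positive class, 0 = negative class

# TF-IDF produces the high-dimensional feature space discussed above;
# LinearSVC learns the wide-margin linear separator.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)
print(model.predict(["clear clean audio"]))
```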