Evaluation of Acoustic Parameters for Angulimala Sutta using Voiced to Unvoiced Ratio and Vowel Distribution

Pirith is believed to be a protective doctrine preached by the Lord Buddha in the Pali language. The aim of this study is to analyze the acoustic properties of Pirith using computer-aided methods and to identify its special characteristics and patterns. Two methods were used to identify special characteristics of the Angulimala Sutta. The first method calculates the voiced to unvoiced ratio using the zero crossing rate and the energy content of the acoustic signal, while the second method recognizes the vowel distribution using the first and second formant frequencies. Results of the first method indicate that approximately 96% of frames are voiced, while the second method suggests that approximately of vowels concentrate in the square region of and , demonstrating that when the Angulimala Sutta is chanted, the tongue is mostly positioned low and towards the back while the lips are unrounded.


INTRODUCTION
Speech production begins when an idea formed in the speaker's mind is converted into a language code. With the aid of articulatory motion and vocal tract movement, the phonemes, lined up in sequence, propagate outward as an acoustic waveform. "Pirith" means protection from all aspects, and this protection is obtained by reciting or listening to Pirith suttas. The practice of reciting and listening to Pirith suttas began very early in the history of Buddhist culture.
Voiced to unvoiced ratio (V/UV ratio) is an important parameter because it indicates how far speech production involves vibration of the vocal cords. In this work, we combined the results of the V/UV ratio with the zero crossing rate (ZCR) and the energy of short time segments of the signal to strengthen the analysis.
In voiced speech, the vibrating glottis generates periodic pulses which resonate in the vocal tract. Therefore, when vowels are pronounced, similar frequencies are generated. In unvoiced speech, however, the vocal cords are held open and a continuous air stream flows through them. The air stream becomes turbulent because of the narrowed vocal tract and creates nonperiodic, noise-like sounds (Lee and Yoo, 2003). The zero crossing rate measures the number of times a given signal crosses the time axis per unit time in an amplitude-time plot. Voiced speech shows a low zero-crossing rate due to the excitation of the vocal tract by the periodic air flow, whereas unvoiced speech shows a high zero-crossing count because it is produced by turbulent airflow through the narrowed vocal tract (Bachu et al., 2010). Additionally, the voiced part of speech has high energy content because of its periodicity.
According to the acoustic theory of speech production, the vocal tract is modeled as a nonuniform tube closed at the vocal folds and open at the lip end (Stevens and House 1955). The cross-sectional area of the vocal tract depends on the position of the tongue, lips, jaw and velum. Because the cross section varies along the vocal tract, different resonance frequencies are generated in response to varying vocal fold vibrations. Consequently, the complex output voice signal contains several resonance peaks, called formants, which are clearly visible in spectrographic displays of voice segments. Normally, they occur on average at intervals of $c/2L$, where $c$ is the speed of sound and $L$ is the length of the vocal tract (Fant, 1973).
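The average formant spacing can be illustrated with the idealized case of a uniform tube closed at the glottis and open at the lips. This is a simplification of the nonuniform tube described above, and the values of $c$ and $L$ below are assumed, illustrative figures rather than measurements from this study.

```latex
% Resonances of a uniform tube closed at one end and open at the other
F_n = \frac{(2n-1)\,c}{4L}, \qquad n = 1, 2, 3, \dots
\qquad\Longrightarrow\qquad
F_{n+1} - F_n = \frac{c}{2L}

% Assumed illustrative values: c = 350 m/s, L = 0.175 m
F_1 = \frac{350}{4 \times 0.175} = 500\ \text{Hz}, \quad
F_2 = 1500\ \text{Hz}, \quad
F_3 = 2500\ \text{Hz}, \quad
\frac{c}{2L} = 1000\ \text{Hz}
```

In a real vocal tract the varying cross section shifts the individual formants away from these values, which is why the spacing holds only on average.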
Vowels can be mapped using the relationship of lip opening width to the first formant frequency and of tongue constriction width to the second formant frequency. Cardinal vowels, which do not belong to any particular language but form a measuring system for describing the sounds of languages, are used as the set of reference vowels in this work. These vowel sounds are produced with the tongue in an extreme position: front or back, high or low. The cardinal vowel system was modified by Daniel Jones (Jones, 1953) based on the original idea proposed by earlier phoneticians, notably Ellis and Bell (Bell, 1867).

LITERATURE SURVEY
As reported by Jayaratne (2007), an experiment was performed at Kanduboda International Meditation Centre, Sri Lanka, to understand the effect of Pirith on human beings. When a sample of human subjects listened to Pirith chants, it was observed that within of the commencement of the chanting, their heart rate reduced, their heart pulse amplitude halved, and they reached an alpha state similar to that obtained in a meditative trance.
Masaru Emoto, a well-known Japanese researcher, reported evidence that human thoughts can affect the molecular structure of water through words, ideas and human vibrational energy (healingsounds.com, Jan 2018).
Several speech classification systems have been proposed in the literature. According to El Ayadi, Kamel and Karray (2011), speech features can be grouped into four categories: continuous features, qualitative features, spectral features and TEO (Teager Energy Operator) based features. Energy, zero crossing rate, pitch and formants are called traditional speech signal features; energy and pitch are prosodic features in the low-frequency domain, while formants and zero crossing rate are high-frequency features (Chen et al., 2012).

METHODOLOGY
For the analysis, samples of the Angulimala Sutta recited by male monk chanters were recorded under high precision conditions, and 15 samples were analyzed. Each voice recording was then split into smaller segments of frame length using sampling rate of . This frame length was selected because the vocal tract has fixed characteristics over a time interval of the order of . The voiced to unvoiced ratio is calculated by counting frames whose zero-crossing rate is below a reference value and whose short-time energy is above a reference value as voiced frames, and all other frames as unvoiced frames, according to the algorithm shown in Figure 1.
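The frame classification rule described above can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB implementation or the algorithm of Figure 1; the function names, the sampling rate, the frame length and both reference thresholds are assumptions chosen only so the example runs on a synthetic signal.

```python
import math


def frame_signal(samples, frame_len):
    """Split samples into non-overlapping frames, dropping any short tail."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def count_voiced_unvoiced(samples, frame_len, zcr_ref, energy_ref):
    """Count frames as voiced when ZCR < zcr_ref and energy > energy_ref."""
    frames = frame_signal(samples, frame_len)
    voiced = sum(
        1 for f in frames
        if zero_crossing_rate(f) < zcr_ref and short_time_energy(f) > energy_ref
    )
    return voiced, len(frames) - voiced


# Synthetic test signal (hypothetical): three frames of a 100 Hz sine
# followed by one frame of weak, sign-alternating samples.
FS = 8000        # assumed sampling rate in Hz
FRAME_LEN = 160  # assumed frame length (20 ms at 8 kHz)
tone = [math.sin(2 * math.pi * 100 * n / FS) for n in range(3 * FRAME_LEN)]
hiss = [0.01 * (-1) ** n for n in range(FRAME_LEN)]

voiced, unvoiced = count_voiced_unvoiced(
    tone + hiss, FRAME_LEN, zcr_ref=0.1, energy_ref=0.01  # assumed thresholds
)
print(voiced, unvoiced, voiced / (voiced + unvoiced))
```

In practice the two reference values would have to be tuned to the recording level and noise floor of the actual chant recordings.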
In the computational speech model, a pre-emphasis filter is applied to the sampled time series of the voiced segment to cancel out the effect of the glottis. Frame-by-frame analysis was then performed with Hamming windows, linear predictive coding (LPC) and autocorrelation to extract the formant values. In the vowel analysis, the frequencies of the first formant, $F_1$, and the second formant, $F_2$, were extracted, and the vowel distribution was obtained by plotting one against the other. In the analysis of vowel distribution, the primary cardinal vowels introduced by Daniel Jones were used as a reference (Jones, 1953).
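The processing chain described above (pre-emphasis, Hamming window, autocorrelation, LPC) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the pre-emphasis coefficient 0.97, the 10 Hz search grid and the function names are hypothetical, and formant candidates are located by picking peaks of the LPC spectral envelope, since the text does not say how the formants were read from the LPC model.

```python
import cmath
import math


def pre_emphasize(x, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1]."""
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def hamming(n_samples):
    """Hamming window coefficients."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns [1, a1, ..., a_order]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def formant_candidates(frame, fs, order, step_hz=10.0):
    """Frequencies where |A(f)| has a local minimum (LPC envelope peaks)."""
    x = pre_emphasize(frame)
    x = [s * w for s, w in zip(x, hamming(len(x)))]
    a = levinson_durbin(autocorrelation(x, order), order)
    freqs = [i * step_hz for i in range(int(fs / 2 / step_hz) + 1)]
    mags = [abs(sum(c * cmath.exp(-2j * math.pi * f * k / fs)
                    for k, c in enumerate(a))) for f in freqs]
    return [freqs[i] for i in range(1, len(freqs) - 1)
            if mags[i] < mags[i - 1] and mags[i] < mags[i + 1]]


# Synthetic check (hypothetical): a single 700 Hz resonance at 8 kHz.
FS = 8000
frame = [math.sin(2 * math.pi * 700 * n / FS) for n in range(240)]
peaks = formant_candidates(frame, FS, order=2)
print(peaks)
```

For real speech a higher LPC order is needed to capture several formants at once; the two lowest candidates from each frame would then serve as the $F_1$ and $F_2$ estimates.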

RESULTS & DISCUSSION
MATLAB was used for scripting, calculations and analysis. In the frame-by-frame analysis, speech signals are divided into non-overlapping frames of samples. Figure 2 shows the vowel distribution for all 15 samples. The percentage distribution is shown in Figure 3, while further analysis of the denser areas is shown in Figure 4 and Figure 5. Figure 6 compares the vowel distribution with the primary cardinal vowels. Table 1.
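A percentage distribution of this kind can be computed by binning the per-frame formant pairs on a rectangular grid. The sketch below is a hypothetical Python illustration; the bin widths and the four sample points are invented for the example and are not results from this study.

```python
from collections import Counter


def percentage_distribution(points, f1_bin_hz, f2_bin_hz):
    """Percentage of (F1, F2) points falling in each grid cell.

    Keys are the lower-left corner (F1, F2) of each occupied cell in Hz.
    """
    counts = Counter(
        (int(f1 // f1_bin_hz), int(f2 // f2_bin_hz)) for f1, f2 in points
    )
    total = len(points)
    return {
        (i * f1_bin_hz, j * f2_bin_hz): 100.0 * n / total
        for (i, j), n in counts.items()
    }


# Hypothetical formant pairs in Hz, for illustration only.
points = [(650, 1100), (700, 1150), (720, 1050), (300, 2200)]
dist = percentage_distribution(points, f1_bin_hz=200, f2_bin_hz=500)
print(dist)
```

The cell with the largest percentage marks the densest region, which can then be compared with the positions of the reference vowels.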

CONCLUSION
The Angulimala Sutta is rich in vowels, as it shows approximately voiced to unvoiced ratio. Analysis of these vowels suggests that of vowels concentrate around the frequency range of ~ and , showing a high proportion of low back unrounded vowels.
When the results are compared with the cardinal vowel chart, the densest vowel area shows the qualities of cardinal vowels [a] and [ɑ], as shown in Table 2.
It can be concluded that when the Angulimala Sutta is chanted, the tongue tends to be positioned low and towards the back while the lips are unrounded. Fewer vowels are represented by cardinal vowels [i] and [u], the high front unrounded and high back rounded vowels respectively.