ABOUT BEAM
Most of these systems either affect the musical performance in an abstract and non-explicit way, or focus on the extensively studied features of concentration, stress and relaxation. The aim of this project was to study specific, unexplored emotional states of musicians as they perform, so that these states can be detected during a real-time live performance and exploited for the instant reshaping of the sound, without necessarily requiring the performer's awareness.
Juslin and Laukka [JL03] conclude that emotions can be communicated on different instruments, each of which places relatively different acoustic cues at the performer's disposal, largely reflecting the musician's sound production mechanisms. However, as emotional state is highly subjective, we assume that the self-reported states of the musicians who participated in the training procedure were valid.
Our system consists of an EEG biosensor that detects the brainwave signals, software that analyses the biometric data and detects the musician's happiness or sadness in real time, and a multi-effect processor that reshapes the sound.
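To make the data flow concrete, the sketch below outlines one possible shape of this pipeline in Python. The helpers read_band_powers and set_effect, the preset names and the 0.2 sec loop period are our assumptions standing in for the headset SDK and the effect-processor interface, which are not specified here.

\begin{verbatim}
import time

# Illustrative mapping from detected state to an effect preset (our assumption).
PRESETS = {"happiness": "bright_preset", "sadness": "dark_preset"}

def read_band_powers(headset):
    """Placeholder for the biosensor SDK call returning the current
    band-power values (alpha, beta, theta, delta, gamma, ...) as a list."""
    raise NotImplementedError  # hardware-specific

def set_effect(processor, preset):
    """Placeholder for the control message (e.g. MIDI or OSC) that switches
    the multi-effect processor to the given preset."""
    raise NotImplementedError  # hardware-specific

def run_live(headset, processor, model, period_s=0.2):
    """Main loop: classify the current EEG sample and reshape the sound."""
    while True:
        features = read_band_powers(headset)    # one c-dimensional measurement
        state = model.predict([features])[0]    # "happiness" or "sadness"
        set_effect(processor, PRESETS[state])   # map detected state to an effect
        time.sleep(period_s)                    # matches the 0.2 sec sampling step
\end{verbatim}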
This choice is crucial: for categorizing brain signals we need the actual emotion, i.e., the one the user reports, not the one conventionally associated with the track. These two values are not always identical, since the feeling a listener or musician experiences when listening to or playing a piece is purely subjective [SSH13].
In particular, for each musician we tracked their alpha, beta, theta, delta and gamma values with a non-invasive wearable biosensor (headset). The songs were selected by the musicians themselves for each of the aforementioned emotional states and belonged to different genres of music. For each case (listening, performing) and for each state (happiness, sadness) we recorded their EEG values for five minutes, at a sampling interval of 10 sec, i.e., 80 minutes and 480 samples in total across the four participants. For the performing case we also recorded the performance in WAV format.
In total this process took about 20 minutes for each participant. Our system records the measurements every 0.2 sec. We therefore stored 4 matrices (one for each segment of the experiment) $T \in \mathbb{R}^{c \times t}$ for each subject, with $c$ being the 8 different signals recorded and $t$ being the number of measurements made for each signal (5 min $\times$ 60 s / 0.2 s resolution = 1500 samples).
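As an illustration of how each segment could be stored, assuming NumPy and that the headset delivers each 0.2 sec measurement as a vector of the 8 band values (the actual storage format is not specified in the text):

\begin{verbatim}
import numpy as np

N_CHANNELS = 8     # band-power signals recorded per measurement
N_SAMPLES = 1500   # 5 min of recording at one measurement every 0.2 s

def segment_matrix(samples):
    """Stack per-measurement band-power vectors into T in R^(c x t)."""
    T = np.asarray(samples, dtype=float).T   # shape (8, 1500)
    assert T.shape == (N_CHANNELS, N_SAMPLES)
    return T

# Four matrices per subject, one per segment of the experiment, e.g.:
# segments = {("listening", "happiness"): T1, ("listening", "sadness"): T2,
#             ("performing", "happiness"): T3, ("performing", "sadness"): T4}
\end{verbatim}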
The system uses 3 of the 4 subjects for training and the fourth for testing. In total, the dataset had 16260 training and 5520 testing samples.
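A minimal sketch of this subject-wise split, assuming each column of a segment matrix becomes one labelled sample (the exact feature layout is our assumption):

\begin{verbatim}
import numpy as np

def build_dataset(subjects, test_subject):
    """subjects: dict mapping subject id -> list of (T, label) segment pairs.
    Returns train/test feature matrices (samples x channels) and label arrays."""
    X_tr, y_tr, X_te, y_te = [], [], [], []
    for sid, segments in subjects.items():
        for T, label in segments:          # T has shape (channels, samples)
            X = T.T                        # one row per 0.2 s measurement
            y = [label] * X.shape[0]
            if sid == test_subject:
                X_te.append(X); y_te.extend(y)
            else:
                X_tr.append(X); y_tr.extend(y)
    return (np.vstack(X_tr), np.array(y_tr),
            np.vstack(X_te), np.array(y_te))
\end{verbatim}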
For identifying the emotional state of each user we used XGBoost. To find the optimum hyper-parameter values of the model we ran a grid search, testing the model's performance over 9 different hyper-parameters. This search achieved an accuracy of 98.3\% under 7-fold cross-validation, while on the test set we obtained 89\% accuracy.
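The sketch below shows one way such a search could be set up with XGBoost and scikit-learn, reusing the X_tr/y_tr and X_te/y_te arrays from the split above. The nine hyper-parameters and their candidate values are illustrative, not the grid used in the study, and the label-encoding step is our assumption:

\begin{verbatim}
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier

# Encode "happiness"/"sadness" as 0/1, since XGBClassifier expects numeric labels.
le = LabelEncoder()
y_tr_enc = le.fit_transform(y_tr)
y_te_enc = le.transform(y_te)

# Nine hyper-parameters searched; the candidate values are illustrative only.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [3, 6],
    "learning_rate": [0.05, 0.1],
    "subsample": [0.8, 1.0],
    "colsample_bytree": [0.8, 1.0],
    "gamma": [0, 1],
    "min_child_weight": [1, 5],
    "reg_alpha": [0, 0.1],
    "reg_lambda": [1, 2],
}

search = GridSearchCV(XGBClassifier(), param_grid, cv=7, scoring="accuracy")
search.fit(X_tr, y_tr_enc)                  # 7-fold CV on the three training subjects
print(search.best_params_, search.best_score_)
print("test accuracy:", search.score(X_te, y_te_enc))  # held-out fourth subject
\end{verbatim}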