Finger Movements Classification from Electromyography (EMG) using fastai
I’ve been interested in electromyography (EMG) and want to build a bionic arm 🦾. As a first step toward that arm, I decided to start with something easier: finger movement classification.
Dataset
In this project I used a 2-channel surface electromyogram (EMG) dataset from
This dataset contains 10 finger movements; each movement has 60 five-second recordings (divided into 10 sets) collected from eight subjects, six males and two females, aged between 20 and 35 years.
Outline of the project
In this project I used the fastai library on Kaggle to train a CNN based on the pre-trained ResNet34 model.
✧ Why fastai and Resnet34?
- fastai is a deep learning library that includes several pretrained models such as resnet34, squeezenet1_0, densenet121, vgg16_bn, etc.
- ResNet34 is a 34-layer convolutional neural network pre-trained for image classification
Since I don’t yet fully understand how to build layers or train models by myself, and this project is part of the AI Builder program, which has classes on building classification models with fastai, I decided to follow that approach. I also researched fastai and ResNet34 and found that they are easy to use. The architecture is not that big, so it doesn’t take as long to train as larger architectures such as ResNet50.
✧ Data preparation
To train the ResNet34 model I needed to convert the data, which is 1D (x = time (s), y = amplitude), into 2D data (x = time (s), y = frequency (Hz), z = amplitude), i.e. a spectrogram. Each class has 60 files of 5-second data. I chose to predict finger movements from 1 second of data, so I divided each file into 1-second segments, giving 60 × 5 = 300 segments. To increase the amount of data, I overlapped the segments by 0.9 second, i.e. moved the window with a 0.1-second step, giving 60 × 40 = 2,400 segments for each class.
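The segmentation above can be sketched as follows. This is a minimal sketch using a synthetic stand-in signal; the 4000 Hz sampling rate comes from the dataset description, and the variable names are my own:

```python
import numpy as np

fs = 4000                         # sampling rate (Hz), from the dataset description
signal = np.random.randn(5 * fs)  # one 5-second recording (synthetic stand-in)

win = 1 * fs                      # 1-second window
step = int(0.1 * fs)              # 0.1-second step (0.9-second overlap)

# Slide the window over the recording; the end index is exclusive,
# so a 5-second file yields 40 one-second segments.
segments = [signal[start:start + win] for start in range(0, len(signal) - win, step)]
```

With 60 files per class, this gives the 60 × 40 = 2,400 segments described above.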
- The short-time Fourier transform (STFT) is a sequence of Fourier transforms used to analyze how the frequency content of a signal varies over time.
- The Fourier transform analyzes the average frequency content over the entire signal.
The STFT gives frequency information at each time step, converting the signal from 1D in time to 2D in time and frequency, as I needed.
✧ Why overlap?
“ Due to the stochastic nature of the EMG, any instantaneous sample of the EMG contains relatively little information about the overall muscle activity and hence some form of smoothing or windowing must be performed on the data.”
Data preparation
✧ Import data to kaggle
Download the data from the website, then create a new Kaggle dataset and upload it. (link)
✧ Short time Fourier transform (STFT)
Import the data from the dataset; the data will be at kaggle/input/emg2channels. All filenames are contained in all_files.
Create a directory for saving the output data.
Import the Fourier transform function and the other packages.
From the dataset information, the sampling rate is 4000 Hz (4,000 samples collected per second). The duration in seconds can be calculated by dividing the number of samples by the sampling rate (t = number of samples / fs).
- The dynamic range is the difference between the maximum and minimum values of the STFT data. The maximum value is set to max.
Use the STFT to convert the data into 2D with 0.1-second segments. The STFT automatically moves the 0.1-second segment along the signal, overlapping each segment by half its length (0.1 ÷ 2 = 0.05 second), i.e. moving with a 0.05-second step.
After converting the data with the STFT, I concatenated the channel 0 data (data0) and channel 1 data (data1) to obtain a 60 × 40 array (data01): 60 frequency steps at 10 Hz/step (0 to 590 Hz) and 40 time steps at 0.05 s/step.
- The 60 × 40 array is made of two 60 × 20 arrays, one from data0 and one from data1.
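The per-channel STFT step can be sketched with scipy.signal.stft. This is a sketch on a synthetic segment; the exact number of time frames depends on scipy's padding options, so it may differ slightly from the 20 steps described above:

```python
import numpy as np
from scipy.signal import stft

fs = 4000
segment = np.random.randn(fs)  # one 1-second, single-channel segment (synthetic)

# 0.1-second segments (nperseg=400) overlapping by half (noverlap=200)
f, t, Zxx = stft(segment, fs=fs, nperseg=400, noverlap=200)

# The frequency resolution is fs/nperseg = 10 Hz per bin;
# keep the first 60 bins, i.e. 0 to 590 Hz.
spec = np.abs(Zxx)[:60, :]
```

The two channels' spectrograms would then be concatenated along the time axis, e.g. `np.concatenate([spec0, spec1], axis=1)`, to form data01.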
I set the dynamic range of data01 relative to max and scaled the array to image data with a maximum of 255 and a minimum of 0.
Convert the NumPy array files into .png using the Pillow library function Image.fromarray().
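The scaling and saving steps might look like this. This is a sketch: the 80 dB dynamic range and the log scaling are my assumptions, not values from the original post:

```python
import numpy as np
from PIL import Image

spec = np.abs(np.random.randn(60, 40)) + 1e-6  # stand-in for data01

# Clip the spectrogram to a fixed dynamic range below its maximum
# (the 80 dB range here is an assumption, not from the original post).
db = 20 * np.log10(spec)
hi = db.max()
lo = hi - 80.0
db = np.clip(db, lo, hi)

# Scale to 0-255 and convert to an 8-bit grayscale image.
img = ((db - lo) / (hi - lo) * 255).astype(np.uint8)
Image.fromarray(img).save("110000011.png")  # example 9-digit filename
```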
All files are named with a 9-digit number; this makes extracting the files easier.
- The first two digits are the set number + 10, e.g. 11 is set number 1 (11 − 10 = 1)
- The next five digits are the data number + 10000. Because I made 2,400 data points per class, I had to add a number larger than the file count, i.e. 10000, to keep all filenames the same length, e.g. 10295 is data number 295 (10295 − 10000 = 295)
- The last two digits are the label (class) number + 11, e.g. 11 is class number 0 (11 − 11 = 0)
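The naming scheme above can be sketched as a pair of helpers (the function names are mine, not from the original notebook):

```python
def encode_name(set_no, data_no, class_no):
    # set number + 10, data number + 10000, class number + 11
    return f"{set_no + 10}{data_no + 10000}{class_no + 11}"

def decode_name(name):
    # undo the offsets: first 2 digits, next 5 digits, last 2 digits
    return int(name[:2]) - 10, int(name[2:7]) - 10000, int(name[7:9]) - 11
```

For example, set 1, data 295, class 0 encodes to "111029511", and decoding recovers the three numbers.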
I first divided the data into train and test sets in a 70 : 30 ratio, then split 20% of the 70% train set off as a validation set (56 : 14) for model training later. (The ratio of train : validation : test is 56 : 14 : 30.)
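The 56 : 14 : 30 split can be sketched with shuffled indices (numpy only; the actual notebook may use a library splitter):

```python
import numpy as np

n = 2400                       # segments in one class
rng = np.random.default_rng(0)
idx = rng.permutation(n)

n_trainval = int(0.7 * n)      # 70% train+validation, 30% test
trainval, test = idx[:n_trainval], idx[n_trainval:]

n_val = int(0.2 * n_trainval)  # 20% of the 70% -> 14% overall
val, train = trainval[:n_val], trainval[n_val:]
# ratio train : val : test = 1344 : 336 : 720 = 56 : 14 : 30
```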
Training Classification models
I also made a dataset to predict finger movements from 0.2 seconds of data, in the same way as the 1-second dataset. I increased the number of data points by overlapping the segments by 0.1 second, i.e. moving with a 0.1-second step, giving 60 × 49 = 2,940 data points per class. The 0.2-second dataset is trained with the same model as the 1-second dataset.
Import the packages required for the classification.
Create a DataFrame with the path and class.
Randomly split the train set into a validation set. For this model I set the validation percentage to 20%.
Import the ResNet34 model.
From the graph below, the loss starts decreasing at a learning rate of 1e-3 and reaches its minimum around 1e-1, so the base learning rate should be approximately 1e-2 (base_lr = 1e-2).
- The base learning rate should be close to the minimum point of the graph.
```python
learn.lr_find()
```
Use learn.fine_tune() to train the model. learn.fine_tune() is a function with the following parameters:
```python
learn.fine_tune(epochs=10,
                base_lr=1e-2,     # max lr; when unfrozen, base_lr/2
                freeze_epochs=1,  # how many epochs to train frozen
                lr_mult=100,      # train feature extractor with max lr at base_lr/lr_mult
                pct_start=0.2,    # lr starts decreasing at 20% of the schedule
                div=5.0,          # start at base_lr (max lr) / div
                cbs=[SaveModelCallback(monitor='accuracy'),  # monitor accuracy and save best model
                     # WandbCallback(),  # track to wandb
                     ])
```
Result
✧ Confusion matrix
```python
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
```
✧ Test the model with the test set
- 1 second data: accuracy 99.17%
- 0.2 second data: accuracy 76.52%
Demonstration
I tested the models by generating a 20-second signal from one subject. The duration of each finger movement is chosen at random between 0.5 and 1.5 seconds, and the class of each movement is also chosen randomly.
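The random movement schedule can be sketched like this (a sketch under my own assumptions about how the 20-second signal was stitched together):

```python
import numpy as np

rng = np.random.default_rng(0)
total, schedule = 0.0, []

# Stack random movements until the sequence covers 20 seconds.
while total < 20.0:
    duration = rng.uniform(0.5, 1.5)  # 0.5-1.5 s per movement
    label = int(rng.integers(0, 10))  # one of the 10 finger movements
    schedule.append((label, duration))
    total += duration
# The actual signal would then be built by concatenating recordings
# of each chosen movement, trimmed to its duration.
```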
The test results of the 1.0-second model are not in good agreement with the actual finger movements, even though the model's accuracy is more than 90%. The reason is that the model needs a whole second of signal for each prediction, so many predictions are made on windows that span two different finger movements.
The test results of the 0.2-second model agree better with the actual finger movements, even though the model's accuracy is less than 80%. The reason is that fewer predictions are made on windows that span two different finger movements.
Conclusion
The results show that ResNet34 can be used to predict 10 classes of finger movement from 2-channel EMG data by converting the 1-dimensional signals into 2-dimensional signals using the STFT.
Future work
The results are not as good as the original work published for this EMG dataset. Other CNN architectures could be explored to get better results.
What I learned in AI Builder program
As I mentioned at the beginning, this program includes classes, which gave me more knowledge and understanding of machine learning, fastai, transfer learning (learning rate finder, freezing/unfreezing), NLP, ethical problems in the field of data science, etc.
Making this project taught me a lot about how convolutional neural networks (CNNs) work (layers, filters, max pooling, etc.), since I had to read a lot to understand ResNet34, the model I used. I also gained experience and had fun working in this field!