Signal Classifier - Distinguish between EMG and ECG
Difficulty Level:
Tags: train_and_classify, classification, biosignals, emg, ecg

Machine learning is a branch of artificial intelligence that has grown alongside the increase in computational power brought by the evolution of technology. It allows a computer to learn how to solve a wide range of problems by exploiting the internal structure of the datasets given as input.

There are three main settings of machine learning:

  • Unsupervised Learning - Learning the internal structure of a given dataset without prior knowledge. It is often used to find interesting features or similarities in a dataset that may not be perceptible to the human eye. For example, in retail, it may be used to group customers that exhibit similar shopping patterns in order to send them targeted advertising.
  • Semi-Supervised Learning - Learning the internal structure of a given dataset with partial knowledge about it. It is usually used when classes can be distinguished without labelling every instance, which is expensive work. For example, it may be used in anomaly detection scenarios, where normal instances usually vastly outnumber anomalous ones. A concrete example is aircraft failure detection: we know the normal functioning of the system, but we lack examples of anomalies because they are rare or too expensive to produce (e.g. engine malfunction). In these cases, the normal instances are labelled, but the anomalous ones are not.
  • Supervised Learning - Learning the patterns in a given dataset in which every instance is labelled. For example, there are databases of ECG signals that include a large number of arrhythmias, where every heartbeat is labelled either as normal or by its type of arrhythmia.
This Jupyter Notebook shows how to use the scikit-learn package to easily deploy machine learning models that determine the nature of a given biosignal, in this case whether it is an ECG, an EMG or another type of signal.
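
As a rough illustration of the supervised setting, the sketch below trains a scikit-learn classifier on synthetic, fully labelled data and evaluates it on held-out samples. The two Gaussian clusters, the variable names and the model parameters are assumptions made for this illustration only; they are not the biosignal features or settings used in this notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import ShuffleSplit

rng = np.random.default_rng(0)

# Hypothetical 2-D feature vectors for two well-separated classes
class_a = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
class_b = rng.normal(loc=5.0, scale=1.0, size=(100, 2))

features = np.vstack([class_a, class_b])
labels = np.array([0] * 100 + [1] * 100)  # every instance is labelled

# Single random split: 75 % of the samples for training, 25 % for testing
splitter = ShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(features))

# Fit the model on the training samples only
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(features[train_idx], labels[train_idx])

# Evaluate on samples the model has not seen during training
predicted = model.predict(features[test_idx])
accuracy = accuracy_score(labels[test_idx], predicted)
print(f"Accuracy on held-out samples: {accuracy:.2f}")
```

Evaluating on samples that were held out from training gives a more honest estimate of how the model behaves on new data than scoring it on the training set itself.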


1 - Import the required packages

To facilitate our work, we will use the biosignalsnotebooks package, which includes multiple sample signals among other features; scikit-learn, which provides high-level implementations of many methods and models used in machine learning applications; numpy, which implements mathematical functions in an easy-to-use way; and scipy, whose stats module provides the kurtosis and skew functions.

In [1]:
import biosignalsnotebooks as bsnb

from numpy import mean, std, zeros, diff, sign, vstack, concatenate, ravel
from scipy.stats import kurtosis, skew

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit
from sklearn.metrics import classification_report, accuracy_score

2 - Load sample signals

The biosignalsnotebooks Python package provides a set of sample signals that we will use to train and test our model. As we want to distinguish between ECG, EMG and other types of signals, we will use multiple files.
Note: Since we will only use one channel of each signal, we select the channel in the same line where we load the signal. Furthermore, we will use z-standardisation to normalise our signals to zero mean and unit variance.
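
To make the effect of z-standardisation concrete, here is a minimal sketch applied to a synthetic signal. The raw values (a noisy baseline loosely resembling unscaled ADC output) and the variable names are assumptions chosen for illustration; they do not come from the sample files.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical raw signal: noise around a large, non-zero baseline
raw = 32768 + 500 * rng.standard_normal(2000)

# z-standardisation: subtract the mean, then divide by the standard deviation
normalised = (raw - np.mean(raw)) / np.std(raw)

print("mean before:", np.mean(raw), "std before:", np.std(raw))
print("mean after: ", np.mean(normalised), "std after: ", np.std(normalised))
```

After this step, signals acquired with different offsets and amplitudes share a common scale, so the features extracted from them are easier to compare.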

First, we will load the ECG signal, which consists of a 20-second acquisition at a 1000 Hz sampling rate and a resolution of 16 bits.

In [2]:
# Relative path to the signal samples folder
path = "//biosignalsplux.com/signal_samples"
# Load the ECG signal
ECG = bsnb.load(path + "/ecg_20_sec_1000_Hz.h5")['CH1']
# Normalise ECG signal
ECG = (ECG - mean(ECG)) / std(ECG)
In [3]:
ECG_time = bsnb.generate_time(ECG)
bsnb.plot(ECG_time, ECG)
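
For readers who want to see how a time axis relates to the sampling rate, the sketch below builds one by hand with numpy for a 20-second acquisition at 1000 Hz. It is not the implementation of bsnb.generate_time, whose defaults and starting value may differ; the variable names are assumptions for illustration.

```python
import numpy as np

sampling_rate = 1000  # Hz, as stated for the ECG acquisition
duration = 20         # seconds

# Total number of samples in the acquisition
n_samples = sampling_rate * duration

# Each sample index divided by the sampling rate gives its time in seconds
time_axis = np.arange(n_samples) / sampling_rate

print(n_samples, time_axis[0], time_axis[-1])
```

Consecutive samples are 1 / sampling_rate seconds apart, so at 1000 Hz the axis advances in steps of one millisecond.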