
From Strategy to Signals: Kicking Off ECG Analysis
This week, our team, BEATNET, made significant progress in defining the technical roadmap for our project. A productive meeting with our liaisons at Aventusoft provided crucial clarity and set a clear direction for the weeks ahead.
Defining Our Mission: Project Goals and Data Strategy
The primary outcome of our meeting was the confirmation of our two main project goals: arrhythmia classification (specifically targeting conditions like AFib, Flutter, and PVCs) and the detection of ECG landmarks (fiducial points). We learned that while Aventusoft has implemented Q and R point detection, the P, S, and T points remain open tasks for us to tackle.
Since most of Aventusoft’s data is currently unlabeled, we will begin by using well-known public datasets for our initial model development, including PTB-XL and MIT-BIH. This approach will allow us to build and validate our models before applying them to Aventusoft’s data in later stages.
Architecting Our Approach: Preprocessing and Deep Learning
A key technical decision from our meeting was to focus on a deep learning approach where the raw ECG signal is fed directly into the neural network. The network itself will act as a feature extractor, which avoids the need for manual feature engineering and allows the model to learn the most predictive patterns from the data.
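To make the "network as feature extractor" idea concrete, here is a minimal NumPy sketch of the forward pass of a tiny 1D convolutional model. It is an illustration only, not our actual architecture: the weights are random and untrained, and the layer sizes, the `forward` and `conv1d` helpers, and the four output classes are all assumptions chosen for the example.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1D cross-correlation.

    x:       (channels_in, n_samples)
    kernels: (channels_out, channels_in, kernel_size)
    bias:    (channels_out,)
    """
    k = kernels.shape[2]
    # windows has shape (channels_in, n_out, kernel_size)
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)
    # Sum over input channels and kernel taps for every output position.
    return np.einsum("oik,itk->ot", kernels, windows) + bias[:, None]

def forward(signal, params):
    """Map one raw single-lead window to class logits."""
    x = signal[None, :]                                   # (1, n_samples)
    h = np.maximum(conv1d(x, params["w1"], params["b1"]), 0.0)  # conv + ReLU
    features = h.mean(axis=1)                             # global average pooling
    return params["w_out"] @ features + params["b_out"]   # linear classifier

rng = np.random.default_rng(0)
n_filters, kernel_size = 8, 15   # hypothetical layer sizes
n_classes = 4                    # hypothetical, e.g. normal / AFib / flutter / PVC

params = {
    "w1": rng.normal(scale=0.1, size=(n_filters, 1, kernel_size)),
    "b1": np.zeros(n_filters),
    "w_out": rng.normal(scale=0.1, size=(n_classes, n_filters)),
    "b_out": np.zeros(n_classes),
}

window = rng.normal(size=5000)   # stand-in for a 10 s window at 500 Hz
logits = forward(window, params)
print(logits.shape)              # (4,)
```

The point is that no hand-crafted features (RR intervals, QRS widths, and so on) appear anywhere: the convolution filters, once trained, play that role. A real model would stack several such layers and be built in a deep learning framework.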
To prepare the data for our models, we received clear guidance on the preprocessing pipeline. The core steps will include:
- Applying a Butterworth bandpass filter to clean the signal
- Resampling all data to a common 500 Hz sampling rate
- Segmenting the recordings into 5- or 10-second windows for analysis
- Applying z-score normalization to standardize the signal amplitude
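The four steps above can be sketched with NumPy and SciPy as follows. This is a rough draft under stated assumptions rather than our final pipeline: the 0.5 to 40 Hz passband, the filter order, the per-window normalization, and the `preprocess` function name are our own placeholder choices, since the exact parameters have not been fixed yet.

```python
from math import gcd

import numpy as np
from scipy.signal import butter, resample_poly, sosfiltfilt

def preprocess(signal, fs_in, fs_out=500, band=(0.5, 40.0), order=4, window_s=10):
    """Filter, resample, segment, and normalize a single-lead ECG.

    band and order are assumed values, not confirmed project settings.
    Returns an array of shape (n_windows, window_s * fs_out).
    """
    # 1. Butterworth bandpass, applied forward and backward for zero phase shift.
    sos = butter(order, band, btype="bandpass", fs=fs_in, output="sos")
    filtered = sosfiltfilt(sos, signal)

    # 2. Resample to the target rate using a rational up/down factor.
    g = gcd(int(fs_out), int(fs_in))
    resampled = resample_poly(filtered, int(fs_out) // g, int(fs_in) // g)

    # 3. Cut into non-overlapping windows, dropping any incomplete tail.
    win = int(window_s * fs_out)
    n_windows = len(resampled) // win
    segments = resampled[: n_windows * win].reshape(n_windows, win)

    # 4. Z-score each window independently (epsilon guards against flat signals).
    mean = segments.mean(axis=1, keepdims=True)
    std = segments.std(axis=1, keepdims=True)
    return (segments - mean) / (std + 1e-8)

# Synthetic 30 s test signal at 360 Hz (the MIT-BIH sampling rate):
# a 1.2 Hz oscillation plus a constant offset that the filter should remove.
fs = 360
t = np.arange(30 * fs) / fs
ecg_like = np.sin(2 * np.pi * 1.2 * t) + 0.5

segments = preprocess(ecg_like, fs_in=fs)
print(segments.shape)  # (3, 5000)
```

Normalizing per window rather than per recording is one option among several; it keeps each training example self-contained but discards amplitude differences between windows, which we may want to revisit.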
Our dataset exploration revealed that PTB-XL contains 21,837 clinical 12-lead ECGs, each 10 seconds long, from 18,885 patients, with multi-label annotations drawn from 71 SCP-ECG statements (covering diagnostic, form, and rhythm statements). MIT-BIH, by contrast, provides longer recordings of about 30 minutes each and focuses primarily on arrhythmia detection, with beat-level annotations. Training on both should help our models generalize.
Next Steps: Diving into the Data
With a clear plan in place, our immediate focus shifts to hands-on data exploration. For the upcoming week, each team member will download and analyze at least one public dataset, with the goal of exploring six datasets in total. Our primary objectives are to understand the available labels, learn how to load and visualize the data, and assess data quality and class distribution. In parallel, we will begin implementing the preprocessing pipeline and start replicating baseline CNN-based models for arrhythmia detection.
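As a starting point for the class distribution check, a small helper like the one below could be run on whatever labels each dataset provides. It is only a sketch: the `class_distribution` function and the example records are made up for illustration, and it assumes each record carries a list of labels so that multi-label datasets are handled as well.

```python
from collections import Counter

def class_distribution(records):
    """Count how many records carry each label.

    records: list of label lists, one list per record (multi-label friendly).
    Returns {label: (count, fraction_of_records)}, most frequent first.
    """
    counts = Counter(label for labels in records for label in labels)
    n_records = len(records)
    return {label: (n, n / n_records) for label, n in counts.most_common()}

# Hypothetical labels for ten records; real labels come from the dataset files.
records = [["NORM"]] * 6 + [["AFIB"]] * 3 + [["AFIB", "PVC"]]

for label, (n, frac) in class_distribution(records).items():
    print(f"{label}: {n} records ({frac:.0%})")
```

Because a record can carry several labels, the fractions do not need to sum to 100%. A heavily skewed distribution here would be an early signal that we need class weighting or resampling during training.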
Recent research reports that single-lead ECG analysis using deep learning can achieve strong results: models like VGG16 reach F1-scores of 98.11% on certain leads, while lightweight architectures like MobileNetV2 achieve 97.24% accuracy with faster inference times suitable for real-time monitoring. These results support our plan to explore individual lead performance before moving to Aventusoft’s proprietary data.
This week marked a critical transition from high-level planning to detailed technical execution. We are excited to get our hands on the data and begin building the foundation for BEATNET.
See you next week!