Blog Posts

Week 5

Refining Stability and Clinical Plausibility: Validating the Single-Lead ECG-FM

This week marked a transition from initial adaptation to rigorous validation and system stabilization. Our focus shifted toward evaluating the fine-tuned ECG foundation model on real-world single-lead inputs and enforcing physiological constraints to ensure our outputs meet clinical standards.

Rather than just proving feasibility, we are now hardening the system—moving from experimental code to a reproducible, high-performance diagnostic pipeline.


Key Accomplishments This Week

  • Validated High-Performance Single-Lead Adaptation: We evaluated the fine-tuned ECG-FM on duplicated single-lead inputs using real-world samples provided by our sponsor. The model achieved a strong AUROC of approximately 0.96, demonstrating that single-lead adaptation can reach performance levels comparable to dual-lead baselines.
  • Optimized Temporal Aggregation for Prediction Stability: By comparing 10-second and 30-second inference windows, we observed that longer temporal aggregation significantly improves stability. This strategy reduces inconsistent arrhythmia predictions, providing a more reliable output for clinical review.
  • Advanced PQRST Delineation with Physiological Constraints: We improved our post-processing by enforcing rules such as temporal ordering and minimum inter-peak intervals. By clustering nearby candidate peaks, we successfully reduced duplicate detections and began addressing over-generation issues in T-wave localization.
  • Refined Clinical Plausibility and Thresholding: Guided by sponsor feedback, we analyzed multi-label outputs—including PVC, tachycardia, and bundle branch blocks—to assess their clinical plausibility. We investigated thresholding strategies to suppress low-confidence or physiologically impossible diagnoses in a deployment setting.
  • Initiated Model Interpretability Research: We began exploring saliency and attribution maps to localize PVC-related regions. This work supports the debugging of both classification and fiducial detection, ensuring the model is looking at the correct features for its predictions.
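
The physiological post-processing rules above can be sketched in a few lines. This is a minimal illustration rather than our production code: the function names and the 120 ms minimum inter-peak interval are assumptions chosen for the example, with the 500 Hz sampling rate of the target device.

```python
import numpy as np

def cluster_peaks(candidates, fs=500, min_interval_ms=120):
    """Merge candidate peak indices closer than min_interval_ms into one detection.

    candidates: sorted sample indices of candidate peaks for one wave type.
    Returns the mean index of each cluster (duplicate suppression).
    """
    if len(candidates) == 0:
        return []
    min_gap = int(fs * min_interval_ms / 1000)      # minimum gap in samples
    clusters, current = [], [candidates[0]]
    for idx in candidates[1:]:
        if idx - current[-1] < min_gap:
            current.append(idx)                     # same physiological event
        else:
            clusters.append(int(np.mean(current)))
            current = [idx]
    clusters.append(int(np.mean(current)))
    return clusters

def enforce_ordering(p, q, r, s, t):
    """Keep only beats whose fiducials satisfy temporal ordering P < Q < R < S < T."""
    beats = zip(p, q, r, s, t)
    return [b for b in beats if all(b[i] < b[i + 1] for i in range(4))]
```

Two nearby candidates merge into a single detection, while an ordering violation drops the whole beat rather than guessing a correction.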

Next Steps: Toward a Deployment-Ready System

In the coming week, we will:

  • Benchmark window aggregation (10s vs. 30s) alongside tuned confidence thresholds to quantify the trade-offs between system responsiveness and prediction stability.
  • Refine T-wave localization by further adjusting temporal windows and confidence filtering within the post-processing pipeline.
  • Finalize patient-level data splits and document preprocessing workflows to support future regulatory-oriented documentation.
  • Package the fine-tuning pipeline and inference scripts into a clean, reproducible repository for team sharing and review.
  • Draft a structured technical report detailing the single-lead adaptation approach, experimental setup, and key performance results.
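
As a concrete sketch of the window-aggregation benchmark listed above: per-window class probabilities are averaged over a longer horizon, trading responsiveness for stability. The function name and array shapes here are illustrative, not our actual benchmarking harness.

```python
import numpy as np

def aggregate_windows(window_probs, horizon_s, window_s=10):
    """Average per-window class probabilities over an aggregation horizon.

    window_probs: (n_windows, n_classes) probabilities from consecutive windows.
    horizon_s: aggregation horizon in seconds (e.g. 10 vs. 30).
    Returns one smoothed probability vector per complete horizon.
    """
    probs = np.asarray(window_probs, dtype=float)
    k = max(1, horizon_s // window_s)          # windows per horizon
    n = (probs.shape[0] // k) * k              # drop a trailing partial horizon
    return probs[:n].reshape(-1, k, probs.shape[1]).mean(axis=1)
```

A 30 s horizon averages three 10 s windows, so a single noisy window no longer flips the reported label.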

With our core performance validated and physiological constraints in place, we are moving closer to a cohesive, deployment-oriented system.

See you next week!

Week 4

From Validation to Adaptation: Fine-Tuning the ECG-FM for Single-Lead Deployment

This week marked our transition from diagnosing limitations to actively adapting our models for deployment. With last week’s findings confirming that naive channel duplication is insufficient, the team shifted focus toward fine-tuning the ECG foundation model directly on duplicated single-lead inputs and tightening the end-to-end inference pipeline for real-world use.

Rather than treating the foundation model as a fixed black box, we began re-shaping it to better reflect the statistical properties and constraints of wearable ECG data. This week was less about proving what doesn’t work—and more about building what will.


Key Accomplishments This Week

Initiated Fine-Tuning on Duplicated Single-Lead Inputs
We set up the fine-tuning pipeline starting from a pretrained ECG-FM checkpoint that had not yet been biased by 12-lead-specific downstream tasks. Single-lead ECG signals were duplicated across channels to match the model’s expected input format, and supervised fine-tuning was launched for multi-label arrhythmia classification. Early training curves indicate improved stability compared to directly using the frozen multi-lead model, suggesting the model is beginning to adapt to the single-lead distribution shift.
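
The channel-duplication step can be illustrated as follows. The helper name is hypothetical and the per-segment z-score normalization is an assumption for the example; only the tiling of one lead across 12 channels reflects the setup described above.

```python
import numpy as np

def duplicate_lead(single_lead, n_channels=12):
    """Tile one ECG lead across all channels expected by the 12-lead model.

    single_lead: (n_samples,) array, e.g. a 5 s segment at 500 Hz -> 2500 samples.
    Returns a (n_channels, n_samples) array matching the model's input layout.
    """
    x = np.asarray(single_lead, dtype=np.float32)
    x = (x - x.mean()) / (x.std() + 1e-8)   # per-segment normalization (assumed)
    return np.tile(x, (n_channels, 1))
```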

End-to-End Inference Pipeline Prototyping
We implemented a standardized Python-based inference wrapper that takes a 5-second single-lead ECG segment as input and returns calibrated probabilities across all 17 diagnostic classes. This wrapper formalizes preprocessing, normalization, segmentation, and post-processing steps into a single callable interface, laying the groundwork for future cloud and mobile deployment.
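
A skeletal version of such a wrapper is shown below, with a hypothetical class name and a generic model callable standing in for the fine-tuned ECG-FM; the 500 Hz rate, 5-second segments, and 17 classes come from the project setup.

```python
import numpy as np

N_CLASSES = 17          # diagnostic labels
FS = 500                # sampling rate in Hz
SEGMENT_S = 5           # expected segment length in seconds

class ECGInference:
    """Hypothetical single-entry-point wrapper: preprocess -> model -> postprocess."""

    def __init__(self, model):
        self.model = model      # any callable: (12, n_samples) -> (17,) logits

    def __call__(self, segment):
        x = np.asarray(segment, dtype=np.float32)
        assert x.shape == (FS * SEGMENT_S,), "expect one 5 s single-lead segment"
        x = (x - x.mean()) / (x.std() + 1e-8)       # normalize
        x = np.tile(x, (12, 1))                     # duplicate to 12 channels
        logits = np.asarray(self.model(x), dtype=float)
        probs = 1.0 / (1.0 + np.exp(-logits))       # multi-label sigmoid
        return {f"class_{i}": float(p) for i, p in enumerate(probs)}
```

Formalizing the contract this way (fixed input shape, named probability outputs) is what makes the wrapper reusable for cloud and mobile targets later.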

PQRST Delineation Integration Progress
We continued refining the PQRST delineation pipeline and began aligning its input/output schema with the classification module. This included unifying windowing strategies, timestamp conventions, and output formats for fiducial points, which is critical for downstream clinical interpretation and UI integration. Initial tests confirm that both pipelines can now operate on consistent 5-second windows without manual intervention.

Calibration and Post-Processing Strategy Design
Building on last week’s observation of rare clinically implausible label combinations, we began drafting rule-based constraints and threshold calibration strategies to reduce invalid multi-label outputs. This included early experiments with probability threshold tuning and label co-occurrence filtering to improve clinical plausibility without over-constraining the model.
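
A minimal sketch of the co-occurrence filter follows, using hypothetical label indices for LBBB and RBBB; the rule keeps the higher-confidence member of a mutually exclusive pair rather than dropping both.

```python
import numpy as np

# Hypothetical label indices; LBBB and RBBB cannot co-occur physiologically
LBBB, RBBB = 3, 4
EXCLUSIVE_PAIRS = [(LBBB, RBBB)]

def postprocess(probs, threshold=0.5):
    """Threshold multi-label probabilities, then resolve mutually exclusive pairs."""
    probs = np.asarray(probs, dtype=float)
    labels = probs >= threshold
    for a, b in EXCLUSIVE_PAIRS:
        if labels[a] and labels[b]:
            drop = b if probs[a] >= probs[b] else a   # drop the weaker diagnosis
            labels[drop] = False
    return labels
```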

Sponsor Checkpoint and Deployment Alignment
We shared our fine-tuning plan and early pipeline design with Aventusoft mentors. Sponsor feedback reinforced the importance of treating the inference interface as a “product surface,” not just a research artifact—emphasizing clear contracts for inputs, outputs, latency expectations, and failure modes. This helped anchor our development priorities around deployability rather than pure model performance.


Next Steps: Toward a Deployment-Ready System

In the coming week, we will:

  • Continue fine-tuning the ECG-FM on duplicated single-lead data and conduct controlled comparisons against frozen baselines.
  • Run more systematic evaluations on sponsor-provided ECG samples to assess real-world generalization.
  • Finalize the unified inference API for both classification and PQRST delineation, including standardized JSON-style outputs.
  • Begin profiling latency and memory usage to inform future model compression and mobile deployment planning.

With the core assumptions now validated and the adaptation pipeline in motion, the project is moving from experimental prototyping toward a cohesive, deployment-oriented system. The foundation is set—now it’s about refining and hardening the stack.

See you next week!

Week 3

From Assumptions to Evidence: Validating Single-Lead Limits and Refocusing for Deployment

This week was a pivotal checkpoint for Team Aventusoft. After establishing our preprocessing and modeling foundations, we stress-tested a key assumption: whether a pretrained multi-lead ECG foundation model could reliably operate on single-lead data through simple channel duplication. Through large-scale evaluation and detailed analysis, we confirmed that this approach introduces significant performance limitations.

Rather than viewing this as a setback, the results gave us clarity. The team aligned with our sponsor on a more principled direction—fine-tuning directly on duplicated single-lead inputs and prioritizing a deployment-ready inference pipeline that reflects real-world constraints.

These findings mark a transition from exploratory validation into targeted adaptation and system integration.


Key Accomplishments This Week

Single-Lead Adaptation Stress Testing:
We evaluated the pretrained ECG-FM classification model using single-lead ECG inputs duplicated across 12 channels. Testing on approximately 80,000 ECG segments revealed clear performance degradation compared to native multi-lead inputs, confirming that naive duplication is insufficient for stable downstream use.

Multi-Label Output Analysis and Clinical Plausibility Checks:
We analyzed predictions across all 17 diagnostic labels and verified that most outputs remained clinically plausible. However, we also identified rare invalid combinations (such as simultaneous LBBB and RBBB), reinforcing the need for improved calibration and post-processing strategies.

Progress on PQRST Delineation Pipeline:
Using LUDB data, we advanced our PQRST delineation model by training on 5-second windows with overlapping inference and Gaussian-encoded labels. Evaluation using tolerance-based metrics (±10 ms) and visual inspection of predicted versus ground-truth fiducial points demonstrated improved localization accuracy.
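
The Gaussian label encoding and tolerance-based matching can be sketched as follows; function names are hypothetical, while the 500 Hz rate and ±10 ms tolerance come from the evaluation described above.

```python
import numpy as np

def gaussian_targets(n_samples, fiducials, fs=500, sigma_ms=10):
    """Encode fiducial sample indices as a soft 1-D Gaussian heatmap target."""
    sigma = fs * sigma_ms / 1000.0
    t = np.arange(n_samples)[:, None]
    centers = np.asarray(fiducials)[None, :]
    return np.exp(-0.5 * ((t - centers) / sigma) ** 2).max(axis=1)

def within_tolerance(pred, truth, fs=500, tol_ms=10):
    """Fraction of ground-truth fiducials matched by a prediction within ±tol_ms."""
    tol = fs * tol_ms / 1000.0
    pred = np.asarray(pred)
    hits = [np.any(np.abs(pred - g) <= tol) for g in truth]
    return float(np.mean(hits))
```

Soft Gaussian targets give the regression head a smooth gradient around each fiducial instead of a single-sample spike.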

Sponsor Alignment on Deployment Requirements:
We presented both quantitative results and qualitative visualizations to Aventusoft mentors. Sponsor feedback emphasized the importance of a standardized, deployment-ready inference pipeline with clearly defined input–output interfaces for both disease classification and fiducial detection.


Next Steps: Fine-Tuning and Pipeline Consolidation

With assumptions validated and direction aligned, the focus now shifts to execution. Next week, we will begin fine-tuning the ECG-FM model directly on duplicated single-lead inputs, rather than relying on a pretrained multi-lead checkpoint.

In parallel, we will finalize a standardized inference interface that accepts 5-second single-lead ECG segments and outputs calibrated probabilities for all 17 diagnostic classes. We will also continue strengthening the PQRST delineation pipeline by validating performance on more challenging, real-world ECG signals and consolidating classification and delineation components into a unified Python-based library.

Finally, we will test robustness using additional ECG samples provided by Aventusoft to better assess real-world generalization.

The project remains on schedule, and this week’s results provide a strong technical foundation for the fine-tuning and integration work ahead.

See you next week!

Week 2


From Preprocessing to Fine-Tuning: Building the Foundation for Team Aventusoft

This was a pivotal week for our team! After spending time establishing our core data strategy, we successfully completed the preprocessing pipelines required to align single-lead data with the ECG-FM foundation model. We spent the week diving deep into the technical architecture of the fairseq framework, ensuring that our setup is robust and ready for the computational demands of model training.

These foundational steps have moved us from the planning phase into active experimentation, setting the stage for our first supervised fine-tuning runs using single-lead duplicated ECG signals.


Key Accomplishments This Week

  • Multi-Channel Signal Synthesis and Validation: We completed preprocessing pipelines to extract Lead I ECG signals and duplicate them across 12 channels. This ensures full compatibility with the ECG-FM foundation model architecture while maintaining signal alignment and normalization.
  • Architectural Mastery of fairseq-signals: The team conducted an in-depth study of the fairseq and fairseq-signals frameworks. We analyzed model architecture, training pipelines, and the specific mechanisms used to load and freeze pretrained checkpoints for downstream tasks.
  • Fine-Tuning Infrastructure Ready: We reviewed prior ECG-FM training scripts and documentation to identify the necessary manifests, inputs, and hyperparameters. This preparation was essential to ensure our upcoming fine-tuning runs are efficient and stable.
  • Data Integrity and Label Mapping: To ensure clinically meaningful results, we established a plan for patient-level training, validation, and test splits to avoid data leakage. We also began generating supervised diagnostic labels for our dataset using clinical metadata and established mappings.
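
The patient-level split described above can be sketched with the standard library alone; the split fractions and helper name are illustrative. The key property is that all records from one patient land in exactly one partition, so no patient leaks across train and test.

```python
import random
from collections import defaultdict

def patient_level_split(records, val_frac=0.1, test_frac=0.1, seed=42):
    """Split ECG records so no patient appears in more than one partition.

    records: list of (patient_id, record_id) pairs.
    """
    by_patient = defaultdict(list)
    for pid, rid in records:
        by_patient[pid].append(rid)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)           # deterministic shuffle
    n = len(patients)
    n_test, n_val = int(n * test_frac), int(n * val_frac)
    test_p = set(patients[:n_test])
    val_p = set(patients[n_test:n_test + n_val])
    split = {"train": [], "val": [], "test": []}
    for pid, rids in by_patient.items():
        key = "test" if pid in test_p else "val" if pid in val_p else "train"
        split[key].extend(rids)
    return split
```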

Next Steps: Launching Experiments and Benchmarking

With the infrastructure finalized and the project remaining on schedule, next week is all about execution. We will launch the first fine-tuning experiments using the ECG-FM pretrained checkpoint with our single-lead duplicated inputs.

A major priority will be monitoring training stability and loss convergence to verify the integration of our new preprocessing pipeline. We will also begin comparing our fine-tuned results against simpler single-lead baseline models to contextualize performance gains. Finally, we will coordinate with our liaison engineers to confirm preferred evaluation metrics like AUROC and F1-score, ensuring our model meets the high standards required for clinical disease classification.

See you next week!

Week 1

New Year, New Architecture: Setting the Stage on HiperGator

This week, Team Aventusoft hit the ground running with a clear mission: adapting the ECG-FM foundation model for single-lead application.

Our main focus was solving the distribution mismatch challenge. After evaluating our initial approach, we finalized a revised technical strategy to fine-tune the model by duplicating single-lead ECG signals across 12 channels. To support this, we moved our operations to the HiperGator high-performance computing cluster. We successfully downloaded the complete MIMIC-IV-ECG dataset and installed the “fairseq-signals” framework, ironing out all the environment and dependency wrinkles along the way.

With the infrastructure in place, we implemented preprocessing scripts to generate the necessary dataset manifests. The rig is ready. Next steps include generating diagnostic labels and launching our first full-scale GPU fine-tuning run to see how our new strategy performs.

Week 14

Ending the Semester on a High Note: A Successful SLDR for BeatNet

This week marked an exciting milestone for Team BeatNet — the completion of our System-Level Design Review (SLDR) and the wrap-up of the Fall semester. Our presentation showcased how the project has evolved far beyond a standard deep learning classification task. We’ve successfully transitioned into building a Self-Supervised Foundation Model for ECG learning — positioning BeatNet at the forefront of intelligent mobile cardiology solutions.

A Big Thank You

We want to express our sincere gratitude to:

  • Aventusoft, our project sponsor, for continuously challenging us with ambitious goals — especially in adapting advanced AI techniques to single-lead wearable ECG.
  • Dr. Kejun Huang, our team coach, for expert guidance, thoughtful feedback, and helping us refine both our engineering approach and long-term system vision.

Your support and encouragement helped drive every breakthrough we achieved this semester.

Looking Ahead: Ready for the Spring Sprint

What’s next for BeatNet?

With the holidays around the corner, our immediate plan is simple: rest, recharge, and reconnect with family and friends. Once the Spring semester kicks off, we’ll resume at full speed — laser-focused on:

  • Validating segmentation and classification performance
  • Advancing explainability for clinical trust
  • Model compression and deployment to mobile

The momentum is strong, and we’re excited to accelerate our journey toward a real-world, deployable system.

Week 13

11/18 Presentation

From Pretraining to SLDR: A New Milestone for Team BeatNet

This week, we made significant progress by moving from model development to system-level validation in preparation for our upcoming System-Level Design Review (SLDR). After weeks of developing and fine-tuning the key elements of our ECG analysis pipeline, the team achieved important technical and documentation milestones that advance us toward a deployable and clinically valuable system.

Key Accomplishments This Week

SLDR Preparation and Architecture Finalization

A significant portion of our efforts went toward preparing the SLDR report and presentation, which required consolidating architecture decisions, documenting interfaces, validating functional requirements, and aligning our work with Aventusoft’s deployment expectations. This process allowed us to refine our system blueprint and ensure each subsystem—preprocessing, fiducial detection, classification, and deployment—fits into a cohesive and verifiable design.

Progress on the Foundation Model Pipeline

We continued advancing the foundation model workflow by evaluating lightweight architectures and early-stage distillation strategies to support mobile inference. This included comparing smaller CNNs and Transformer variants, analyzing their compute profiles, and assessing whether they meet BeatNet’s <30-second runtime requirement for mobile deployment.

Fiducial Detection & Classification Integration

Work also progressed on the integration between the fiducial regression module and the arrhythmia classifier. We refined the windowing logic, updated segmentation strategies based on coach feedback, and prepared the models for the upcoming validation phase, which will begin after SLDR.

Explainability as a New Priority

Following sponsor discussions, explainability has become a major focus for the next sprint. Physicians must understand why the model makes its decisions. In line with that requirement, the team has begun planning explainability modules such as:

  • Attention heatmaps for rhythm classification
  • P/Q/R/S/T waveform overlays for delineation transparency
  • Interpretability dashboards for both cloud and mobile inference

These additions will support clinical trust and align the system with regulatory expectations for traceability and interpretability.

Documentation & Technical Review

We completed substantial sections of the SLDR document—including system architecture, requirements traceability, software interfaces, and deployment considerations. This helped us validate the technical cohesion of the project and prepare for sponsor review.

Next Steps: Validation, Explainability, and Deployment Readiness

With SLDR approaching, our focus for next week will shift toward running validation experiments for both the segmentation and classification modules, implementing the first generation of explainability tools, and continuing our research on model compression, quantization, and distillation to support mobile deployment. At the same time, we will be finalizing the SLDR presentation materials to clearly communicate our path toward cloud and mobile integration. As we enter a phase where accuracy, interpretability, and efficiency converge, the SLDR checkpoint will serve as an important opportunity to demonstrate our progress and gather targeted feedback from our sponsors, guiding us into the next stage of system development.

See you next week!

Week 12

From Pretraining to Explainability : A New Focus for BeatNet

This was a landmark week for Team BeatNet! After spending the last few weeks building our core pipeline, we successfully completed our first full pretraining run of the foundation model. We presented these exciting initial results—which already show the model is learning the deep structure of ECGs—to our sponsors, Dr. Kejun Huang and Dr. Keider Hoyos.

This successful review meeting not only validated our technical approach but also gave us a critical new focus for the next sprint: clinical interpretability.

Key Accomplishments This Week

First Pretraining Round Complete

We successfully completed the first round of pretraining using our contrastive learning approach. By teaching the model to recognize ECG segments from the same patient as “positive pairs,” we are forcing it to learn the fundamental, patient-invariant features of a heartbeat. Early results are promising, showing emerging latent structure in the embeddings, which validates our entire model setup.
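
Our positive-pair objective is in the SimCLR / NT-Xent family; the NumPy sketch below illustrates the loss (it is not our actual training code, which runs inside the deep learning framework). Each row of `z1` and `z2` is assumed to be the embedding of two windows from the same patient, with all other rows in the batch acting as negatives.

```python
import numpy as np

def nt_xent(z1, z2, temperature=0.1):
    """NT-Xent contrastive loss for paired embeddings.

    z1[i] and z2[i] embed two ECG windows from the same patient (positive
    pair); all other rows in the batch act as negatives.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature          # (N, N) scaled cosine similarities
    # cross-entropy with the diagonal (the true pair) as the target
    log_softmax = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_softmax)))
```

When same-patient embeddings align and cross-patient ones do not, the loss approaches zero; a mismatched batch is heavily penalized.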

Strategic Focus on Clinical Interpretability

During our technical review, Dr. Hoyos emphasized that for this tool to be trusted by physicians, it cannot be a “black box.” It’s not enough for the AI to be accurate; doctors must understand why it’s making a specific diagnosis. Following this guidance, the sponsors approved our plan to integrate explainability modules (like attention heatmaps) in the next sprint.

Full Pipeline Integration and Fine-Tuning Initiated

With the foundation model prototype finalized (combining CNN encoders with a Transformer decoder), we have begun attacking the downstream tasks. We have already started the first fine-tuning experiments for arrhythmia classification and have implemented the pipeline for fiducial landmark (P/Q/R/S/T) regression.

Deployment & Optimization Research

In parallel, we have been optimizing the model and preparing for deployment. We worked on benchmarking different window lengths (5s vs. 10s) to find the sweet spot for context learning. We also continued research on model distillation and lightweight architectures, assessing how we can shrink this powerful model to run efficiently on BeatNet’s embedded system.

Next Steps: Validation, Explainability, and Deployment

With the pretrained model in hand, next week is all about validation and implementation. We will begin cross-validation testing, using the model’s embeddings for both arrhythmia classification and fiducial detection.

A major priority will be implementing the new explainability modules—generating waveform attention maps and fiducial localization overlays. We will also evaluate model compression and quantization strategies for mobile deployment. Finally, all of this will be packaged into our updated slides for the upcoming System-Level Design Review (SLDR), where we will highlight our model’s explainability and a clear path to deployment.

See you next week!

Week 11

Prototype Inspection Day

From Public Validation to Pipeline Implementation

All of our efforts this week were geared toward one major goal: validating our new foundation-model strategy with the wider UF community and beginning the core implementation of our self-supervised pipeline. After a successful presentation at the Prototype Inspection Day (PID), Team BeatNet is now fully focused on building the components necessary to pre-train our model on large-scale datasets.

This week was about turning plans into concrete action—coding the contrastive learning framework, benchmarking encoder backbones, and preparing scripts for downstream fine-tuning.

Key Accomplishments This Week

Successful Prototype Inspection Day (PID)
We presented our prototype progress and foundation-model strategy to UF faculty and alumni. Judges praised the team’s clear communication and cohesive technical direction. Dr. Chenhao Wang highlighted our clarity, Dr. Tingsao Xiao called our foundation-model approach “intuitive and reasonable,” and Dr. Catia Silva encouraged us to begin embedding implementation as soon as possible.

Contrastive Learning Pipeline Initiated
We began implementing the SimCLR-style contrastive learning pipeline, developing an augmentation module that generates positive pairs through noise, jitter, and scaling—an essential step toward robust patient-invariant embeddings.
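
The augmentation module might look like the sketch below; the noise, jitter, and scaling parameters are placeholder values, not our tuned settings. Two independent calls on the same segment yield a SimCLR-style positive pair.

```python
import numpy as np

def augment(segment, rng=None, noise_std=0.02, max_shift=25, scale_range=(0.8, 1.2)):
    """Generate one augmented view of an ECG segment (noise, jitter, scaling)."""
    rng = rng or np.random.default_rng()
    x = np.asarray(segment, dtype=np.float32).copy()
    x += rng.normal(0.0, noise_std, size=x.shape)             # additive noise
    x = np.roll(x, rng.integers(-max_shift, max_shift + 1))   # temporal jitter
    x *= rng.uniform(*scale_range)                            # amplitude scaling
    return x
```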

Downstream Pipeline Development
In parallel, we started building the downstream fine-tuning workflow to adapt pretrained embeddings for arrhythmia classification and later for fiducial landmark regression.

Encoder Benchmarking
We benchmarked multiple 1D encoder backbones—including ResNet1D and EfficientNet1D—to identify the optimal architecture for feature extraction efficiency and downstream transferability.

Improved Communication Flow
Following judge feedback, the team is also refining presentation visuals, ensuring balanced speaking roles, and better illustrating the link between preprocessing, embedding, and disease classification.

Next Steps: Pre-training and Visualization

Next week, we will execute the first pre-training runs on the PTB-XL dataset, finalizing and debugging the contrastive learning script. We will develop a t-SNE visualization notebook to evaluate whether embeddings effectively cluster normal and arrhythmic segments, and improve visualization of time-positional encoding within our Transformer prototype.

We will also begin preparing documentation for the upcoming System Level Design Review (SLDR) and conduct internal comparisons between baseline and foundation-model performance.

See you next week!

Week 10

External Engagement – UF AI Days 2025:
We presented our poster “BeatNet ECG AI: Foundation Model for Cardiac Signal Understanding.” During the event, we met Dr. David Winchester, a UF Health cardiologist, whose insights on ECG morphology and diagnostic workflows reinforced the importance of interpretability and clinical trust in AI-driven cardiology.

Advancing the Foundation Model

This week was a significant milestone for us as we moved from architectural design to actively prototyping our ECG foundation model. Following last week’s design discussions, we concentrated on developing and testing the initial version of our self-supervised pretraining pipeline, turning the idea of contrastive learning from theory into practice.

Our collective goal was to transform unlabeled ECG data into structured, patient-invariant representations that can serve as the backbone for future landmark detection and disease classification models. The week was defined by rigorous experimentation, cross-validation, and early visualization of emerging cardiac signal embeddings.

Key Accomplishments This Week

Foundation-Model Strategy Defined

We conducted a detailed technical meeting with Dr. Kejun Huang and Dr. Keider Hoyos and finalized the move toward a foundation model built on large-scale ECG corpora such as PTB-XL and MIMIC-III/IV. The model will leverage contrastive pre-training, treating 5-second and 10-second ECG windows from the same patient as positive pairs, while windows from different patients act as negative pairs. This approach allows the model to learn invariant, patient-specific embeddings.
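
Sampling a positive pair under the 5-second / 10-second windowing described above might look like the following sketch; the helper name and the handling of the two window lengths are illustrative assumptions.

```python
import numpy as np

def sample_positive_pair(recording, fs=500, lens_s=(5, 10), rng=None):
    """Draw a 5 s and a 10 s window from one recording as a positive pair.

    Windows drawn from different patients serve as negatives in the batch.
    """
    rng = rng or np.random.default_rng()
    views = []
    for sec in lens_s:
        n = sec * fs
        start = rng.integers(0, len(recording) - n + 1)   # random window start
        views.append(np.asarray(recording[start:start + n], dtype=np.float32))
    return views
```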

Explored Alternate Self-Supervised Approaches

We evaluated potential pre-training routes including masked autoencoder (MAE) learning and distance-map regression, which predicts landmark-wise distance functions rather than explicit waveform peaks.

Semester 2 Modeling Pillars Finalized

Our Semester 2 (Spring 2026) deliverables are now centered around three primary components:

  • Landmark Detection Model: multi-output regression to identify P/Q/R/S/T time-stamps.
  • Arrhythmia & Disease Classification: fine-tuning foundation embeddings for AFib, PVC, LBBB/RBBB, and conduction block detection.
  • Model Distillation for Mobile Deployment: quantization and pruning to adapt the foundation model for Aventusoft’s single-lead (500 Hz) device.

Single-Lead Adaptation Framework

Discussions also clarified strategies for retraining 12-lead models to function effectively with single-lead input, aligning with Aventusoft’s mobile ECG system requirements.

Next Steps: Prototype Implementation

In the coming week, our focus will shift toward implementing the prototype foundation-model pipeline. We will begin with contrastive pre-training using PTB-XL segments to establish the base embedding space and develop an augmentation module capable of simulating inter-patient variability through signal inversion, noise injection, and time-warping. Next, we will visualize these learned embeddings using t-SNE to validate whether the model can effectively cluster normal and arrhythmic patterns. Simultaneously, we plan to benchmark different encoder backbones, such as ResNet and EfficientNet-1D, to identify the most efficient architecture for downstream fine-tuning. Finally, our team will document the complete workflow from pre-training to distillation for inclusion in the upcoming System Level Design Review (SLDR) and coordinate with Aventusoft regarding access to internal anonymized ECG data and available GPU resources.

This week was about transforming technical insight into implementation readiness—laying the foundation for self-supervised ECG intelligence that bridges research innovation with real-world device deployment.

See you next week!