This paper evaluates various interpretable machine learning (ML) models for predicting blood glucose in Type 1 diabetes patients by combining heart rate variability (HRV) features extracted from ECG signals with continuous glucose monitoring (CGM) data. The key conclusion is that the Decision Tree model excels in both range classification accuracy and interpretability. Additionally, incorporating HRV features improves predictive power, pointing toward future possibilities for expanded datasets and non-invasive blood glucose monitoring.
1. Introduction: The Importance of Blood Glucose Prediction in Type 1 Diabetes
Type 1 diabetes (T1D) is an autoimmune disease caused by a complex interplay of genetic, environmental, and immune system factors. The disease leads to insufficient insulin production, making blood glucose regulation difficult, and accounts for approximately 2% of all diabetes cases worldwide.
"The primary characteristic of this disease is hyperglycemia due to insulin deficiency. Ultimately, the body loses its ability to properly regulate blood glucose."
Because no complete cure has been established, continuous management through insulin administration and healthy lifestyle habits is critically important. To prevent complications, predicting short-term blood glucose fluctuations is essential.
Recently, continuous glucose monitors (CGMs) have become widely available, and artificial intelligence (AI)-based prediction models using this data have advanced considerably. However, most existing AI models are "black box" structures that do not explain the reasoning behind their results, making it difficult for users to trust them.
In contrast, interpretable AI models clearly reveal the decision-making process, which earns greater trust from healthcare professionals and can contribute to personalized treatment and improved outcomes.
2. Prior Research: Achievements and Limitations of Existing AI Models
Various AI techniques have been introduced in the blood glucose prediction field, including regression, classification, artificial neural networks (ANN), recurrent neural networks (RNN), and LSTM. While deep neural networks often deliver high performance, they remain black boxes whose internal workings are difficult to understand.
"Deep learning neural networks show high performance, but their internal operations cannot be understood. In medical data contexts, this can be a critical limitation."
Recently, studies have emerged that experimentally compare the performance of interpretable models such as LASSO regression, multiple linear regression (MLR), and decision trees to strengthen interpretability and clinical applicability.
Key prior studies summarized (partial list):
- Zhu et al.: DRNN, not interpretable, no ECG/HRV, small sample
- McShinsky et al.: Multiple ML algorithms, best-performing LASSO is interpretable, multiple features used
- Zhang et al.: The best 30-minute predictor is not interpretable; at 60 minutes, the interpretable MLR performs best
- This study: Interpretable regression and tree models, includes HRV features, allows detailed explanation of predictions, and could be extended toward non-invasive monitoring
"Our study is the first to utilize the D1NAMO dataset's BGL and HRV features with interpretable models."
3. Data and Methodology
3.1. Dataset
- D1NAMO dataset: Data collected over 8 weeks from 29 participants (20 healthy, 9 T1D patients), including real-time ECG, respiration, acceleration signals, and blood glucose data.
- Only data from 3 diabetic patients (002, 007, 008) were used (remaining patient data excluded due to quality/synchronization limitations).
3.2. ECG Signal Preprocessing
- Rigorous preprocessing steps including filtering, R-peak detection, and QRS complex analysis were performed to remove noise from ECG signals.
- Segments below a certain signal quality threshold were excluded.

Figure: ECG signal before filtering (top) and after filtering (bottom).
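The preprocessing steps named above (noise removal followed by R-peak detection) can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the moving-average baseline removal, the threshold rule, the refractory period, and the synthetic test signal are all assumptions chosen to keep the example self-contained.

```python
import math

def remove_baseline(signal, window):
    """Subtract a centered moving average to suppress slow baseline wander."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - sum(signal[lo:hi]) / (hi - lo))
    return out

def detect_r_peaks(signal, fs, threshold_ratio=0.6, refractory_s=0.25):
    """Return indices of local maxima above a fraction of the global maximum,
    keeping at most one peak per refractory period (assumed values)."""
    threshold = threshold_ratio * max(signal)
    refractory = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= threshold:
            if not peaks or i - peaks[-1] > refractory:
                peaks.append(i)
    return peaks

def synthetic_ecg(fs, duration_s, beat_interval_s):
    """Toy signal: narrow spikes on a slow sinusoidal drift (not real ECG)."""
    signal = []
    for i in range(int(fs * duration_s)):
        t = i / fs
        drift = 0.5 * math.sin(2 * math.pi * 0.2 * t)
        phase = (t - 0.5) % beat_interval_s
        dist = min(phase, beat_interval_s - phase)  # time to nearest beat
        signal.append(drift + math.exp(-(dist / 0.01) ** 2))
    return signal

fs = 250  # sampling rate in Hz (assumed for the toy signal)
ecg = synthetic_ecg(fs, duration_s=5.0, beat_interval_s=0.8)
filtered = remove_baseline(ecg, window=fs // 2)
peaks = detect_r_peaks(filtered, fs)
# RR intervals in milliseconds, the raw material for HRV features
rr_ms = [(b - a) * 1000 / fs for a, b in zip(peaks, peaks[1:])]
```

A real pipeline would typically use a band-pass filter and a validated QRS detector; the point here is only that R-peak positions yield the RR intervals from which HRV is computed.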
3.3. Feature Extraction
- Only time-domain and frequency-domain HRV features (e.g., SDNN, RMSSD, NN50, HF/LF band power) were used.
- The final data structure was as follows:
i = subid; BGL_t; HRV features...; BGL_{t+30}
- That is, the blood glucose level at each time point, together with the HRV features from the same moment, is used to predict blood glucose 30 minutes later.
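As a rough sketch of the feature extraction and row layout described above, the snippet below computes three time-domain HRV features from RR intervals and pairs each time point with the glucose value 30 minutes later. The function names, the dictionary-based input format, the `BGL_target` key, and the sample values are hypothetical; the definitions of SDNN, RMSSD, and NN50 are the standard ones.

```python
import math

def hrv_time_domain(rr_ms):
    """Compute SDNN, RMSSD, and NN50 from RR (NN) intervals in milliseconds."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the intervals
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    # RMSSD: root mean square of successive differences
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # NN50: number of successive differences larger than 50 ms
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {"SDNN": sdnn, "RMSSD": rmssd, "NN50": nn50}

def build_rows(subject_id, bgl_by_minute, hrv_by_minute, horizon_min=30):
    """Pair BGL and HRV features at time t with the BGL at t + horizon_min."""
    rows = []
    for t in sorted(bgl_by_minute):
        target_t = t + horizon_min
        if target_t in bgl_by_minute and t in hrv_by_minute:
            row = {"subid": subject_id, "BGL_t": bgl_by_minute[t]}
            row.update(hrv_by_minute[t])
            row["BGL_target"] = bgl_by_minute[target_t]
            rows.append(row)
    return rows

# Made-up values for illustration only (not taken from D1NAMO)
features = hrv_time_domain([800, 810, 790, 860, 795])
bgl = {0: 6.1, 5: 6.3, 30: 7.0, 35: 7.4}   # minute -> mmol/L
hrv = {0: features, 5: features}           # minute -> HRV feature dict
rows = build_rows("002", bgl, hrv)
```

Time points without a glucose reading 30 minutes later are simply dropped here; how the original study handled such gaps is not stated in this summary.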
3.4. Machine Learning Models and Evaluation Metrics
- Models: Linear, Ridge, LASSO, Elastic Net, Bayesian Ridge, Decision Tree (all interpretable)
- Cross-validation: 10-fold
- Regression metric: RMSE (chosen because it penalizes large prediction errors more heavily, which matters in medical contexts)
- Converted to classification: Blood glucose divided into 7 categories, evaluated by F1 score
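The evaluation setup listed above can be illustrated with a small sketch of RMSE and 10-fold cross-validation. Everything below is an assumption made for illustration: the single-feature least-squares model stands in for the study's regressors, the data are synthetic, and the fold assignment scheme is one simple choice among many.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error; squaring makes large misses dominate."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cross_validated_rmse(xs, ys, k=10):
    """Train on k-1 folds, score RMSE on the held-out fold, and average."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], preds))
    return sum(scores) / len(scores), scores

# Two error patterns with the same mean absolute error (1.0) but different RMSE
spread_out = rmse([0, 0, 0, 0], [1, 1, 1, 1])
one_big_miss = rmse([0, 0, 0, 0], [0, 0, 0, 4])

# Synthetic stand-in data: current BGL vs. BGL 30 minutes later (mmol/L)
rng = random.Random(42)
bgl_now = [rng.uniform(3.0, 15.0) for _ in range(200)]
bgl_later = [0.9 * x + 0.8 + rng.gauss(0.0, 1.0) for x in bgl_now]
mean_rmse, fold_scores = cross_validated_rmse(bgl_now, bgl_later, k=10)
```

The `spread_out` and `one_big_miss` values show why RMSE suits this task: a single large miss is scored worse than several small ones with the same average absolute error.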
"The model's goal is to predict blood glucose 30 minutes ahead, or to predict the correct class among 7 ranges (very low to very high)."
4. Results
4.1. Regression Performance
- Best models per patient and RMSE (significant error reduction):
  - Patient 002: Ridge, Elastic Net, Bayesian Ridge (1.98 mmol/L)
  - Patient 007: LASSO, Elastic Net, Linear (1.33 mmol/L)
  - Patient 008: Bayesian Ridge, Ridge (1.45 mmol/L)
4.2. Multi-Class Classification (F1 Score) Performance
- The Decision Tree model achieved the highest weighted F1 scores across all 3 patients:
  - Patient 002: 0.87
  - Patient 007: 0.84
  - Patient 008: 0.82
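For readers unfamiliar with the metric, a weighted F1 score over glucose ranges can be computed roughly as below. The seven range boundaries are hypothetical placeholders, since this summary does not give the thresholds used in the study, and the glucose values are made up. The weighting follows the common convention of averaging per-class F1 by the number of true samples in each class.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical boundaries in mmol/L that split glucose into 7 ranges (0..6).
# The study's actual thresholds are not stated in this summary.
HYPOTHETICAL_EDGES_MMOL = [3.0, 3.9, 5.6, 7.8, 10.0, 13.9]

def to_class(bgl):
    """Map a glucose value to a range index from 0 (lowest) to 6 (highest)."""
    return bisect_right(HYPOTHETICAL_EDGES_MMOL, bgl)

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's true count."""
    support = Counter(y_true)
    total = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n_true
    return total / len(y_true)

# Made-up true and predicted glucose values (mmol/L), for illustration only
true_bgl = [2.5, 3.5, 5.0, 6.0, 9.0, 12.0, 15.0, 6.5]
pred_bgl = [2.8, 4.2, 5.1, 6.4, 8.1, 12.5, 14.5, 8.0]
y_true = [to_class(v) for v in true_bgl]
y_pred = [to_class(v) for v in pred_bgl]
score = weighted_f1(y_true, y_pred)
```

Note that a regression prediction can be numerically close to the truth yet fall in a neighboring range, which is why the regression and classification rankings of models need not agree.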
"The Decision Tree model most accurately distinguished blood glucose ranges and could play a pivotal role in providing early warnings when entering dangerous (hypo/hyperglycemia) categories."
- Interpretability of Decision Trees:
  - Each node clearly visualizes which feature and threshold value were used for branching.
  - Blood glucose range predictions are easily visible through color coding from blue (hypoglycemia) to orange (hyperglycemia).
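The kind of transparency described here can be shown with a toy tree and a function that reports every feature and threshold test on the way to a prediction. The tree below is entirely invented for illustration: its features, thresholds, and leaf labels are not taken from the study and make no physiological claim.

```python
# Hypothetical tree: each internal node tests one feature against one threshold.
HYPOTHETICAL_TREE = {
    "feature": "BGL_t", "threshold": 3.9,
    "left": {"label": "low"},
    "right": {
        "feature": "BGL_t", "threshold": 10.0,
        "left": {
            "feature": "RMSSD", "threshold": 25.0,
            "left": {"label": "upper normal"},
            "right": {"label": "normal"},
        },
        "right": {"label": "high"},
    },
}

def explain(tree, sample):
    """Walk from the root to a leaf, recording each feature/threshold test."""
    path = []
    node = tree
    while "label" not in node:
        value = sample[node["feature"]]
        go_left = value <= node["threshold"]
        op = "<=" if go_left else ">"
        path.append(f'{node["feature"]} = {value} {op} {node["threshold"]}')
        node = node["left"] if go_left else node["right"]
    return node["label"], path

label, path = explain(HYPOTHETICAL_TREE, {"BGL_t": 6.2, "RMSSD": 18.0})
for step in path:
    print(step)
print("predicted range:", label)
```

Because every prediction reduces to a short list of comparisons like this, a clinician can check each step against their own judgment, which is not possible with a black-box network.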

"The structure of the Decision Tree is transparent, allowing users to easily identify which features contribute to decision-making at each node."
4.3. Additional Performance (Speed and Memory)
- Decision Tree average training time: 2.97 seconds; inference: 5 ms per case; memory: approximately 134 kB.
- Linear Regression is faster and smaller, but its prediction performance falls short of the Decision Tree.
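Figures like these could be collected with a small profiling helper such as the one below. This is a sketch under assumptions: the summary does not say how the authors measured time or memory, so wall-clock timing with `time.perf_counter` and the size of the pickled model are stand-ins, and the mean-predicting placeholder model is hypothetical.

```python
import pickle
import time

def fit_mean_model(targets):
    """Placeholder 'model' that always predicts the training mean."""
    return {"mean": sum(targets) / len(targets)}

def predict_one(model, features):
    """Predict for a single case (the placeholder ignores the features)."""
    return model["mean"]

def profile(fit, predict, rows, targets):
    """Measure training time, per-case inference time, and serialized size."""
    start = time.perf_counter()
    model = fit(targets)
    train_s = time.perf_counter() - start

    start = time.perf_counter()
    for row in rows:
        predict(model, row)
    infer_ms = (time.perf_counter() - start) * 1000 / len(rows)

    # Pickled size is only a proxy for the model's memory footprint
    size_kb = len(pickle.dumps(model)) / 1000
    return {"train_s": train_s, "infer_ms_per_case": infer_ms, "size_kB": size_kb}

rows = [[6.0 + 0.01 * i] for i in range(1000)]   # made-up feature rows
targets = [row[0] + 0.5 for row in rows]         # made-up targets
report = profile(fit_mean_model, predict_one, rows, targets)
```

Timings from such a helper depend on hardware and load, so they are best compared between models on the same machine rather than read as absolute numbers.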
5. Discussion
- Interpretable models are generated as patient-specific personalized structures, so tree structure and branching features may differ for each patient.
- Integrating HRV features resulted in a 5% to 14% RMSE reduction for each patient, clearly outperforming blood glucose-only prediction.
- Combining ECG-based HRV features suggests the potential for developing non-invasive blood glucose monitoring systems.
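The percentage improvement quoted above is presumably a relative RMSE reduction. The summary does not state the formula, so the usual definition is assumed here, comparing a model trained on blood glucose alone with one that also uses HRV features:

```latex
\[
\text{RMSE reduction (\%)} = 100 \times
\frac{\mathrm{RMSE}_{\text{BGL only}} - \mathrm{RMSE}_{\text{BGL + HRV}}}
     {\mathrm{RMSE}_{\text{BGL only}}}
\]
```

As a purely hypothetical example, an RMSE that falls from 2.00 mmol/L to 1.80 mmol/L corresponds to a reduction of 100 × (2.00 − 1.80) / 2.00 = 10%.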
"Integrating HRV features noticeably improves personalized blood glucose prediction accuracy."
6. Conclusions and Future Research
- This study demonstrated that interpretable ML models (particularly Decision Trees) perform well at predicting blood glucose ranges 30 minutes ahead for 3 Type 1 diabetes patients.
- Validation with larger, more diverse blood glucose datasets and longer prediction horizons (45, 60, 180 minutes, etc.) is needed.
- The goal is to conduct in-depth analysis of HRV-blood glucose interactions to establish a practical technological foundation for non-invasive blood glucose prediction.
Closing Thoughts
This paper demonstrates how interpretable AI can enhance trust and applicability in real clinical settings, going beyond mere prediction accuracy. It is expected to serve as a foundation for personalized, non-invasive diabetes management systems that both medical professionals and patients can easily understand and confidently use.
"Interpretable machine learning models produce prediction results that both patients and physicians can understand and trust, enabling safer and more personalized diabetes management."
