Early prediction of bronchopulmonary dysplasia in extremely premature infants: a cohort study
- Authors: Permyakova A.V.1, Bakhmetyeva O.B.2, Mamunts M.A.1, Kuchumov A.G.3, Koshechkin K.A.4
Affiliations:
- E.A. Vagner Perm State Medical University
- Perm Regional Perinatal Center
- Perm National Research Polytechnic University
- I.M. Sechenov First Moscow State Medical University (Sechenov University)
- Issue: Vol 41, No 3 (2024)
- Pages: 120-128
- Section: Methods of diagnosis and technologies
- Submitted: 04.03.2024
- Published: 19.07.2024
- URL: https://permmedjournal.ru/PMJ/article/view/628590
- DOI: https://doi.org/10.17816/pmj413120-128
- ID: 628590
Abstract
Objective. To develop a model for early prediction of clinically significant bronchopulmonary dysplasia in extremely premature infants.
Materials and methods. A retrospective study conducted at the Perm Regional Perinatal Center included 226 premature infants with a gestational age of less than 31 weeks, a birth weight of 490 to 999 g, an age of 0 to 7 days, and respiratory failure requiring ventilatory support. Machine learning algorithms (logistic regression, support vector machine, random forest, and gradient boosting) were used to build the prognostic model. Five variables were used: birth weight, Apgar score at the 5th minute of life, Silverman score, number of days of invasive ventilatory support, and median fraction of inspired oxygen measured daily during the first seven days of life.
Results. At 36 weeks of postconceptional age, 148 of 182 infants (81.3 %) in the study cohort had developed bronchopulmonary dysplasia (BPD): 15.4 % had a mild form, 29.7 % a moderate form, and 36.3 % a severe form. Among the four prediction algorithms studied, logistic regression was chosen as the final model, with the following metrics: AUC = 0.840, accuracy 0.818, sensitivity 0.972, specificity 0.666. The modeling results were implemented for practical use as a probability calculator.
Conclusions. In the early neonatal period of extremely premature infants, a combination of clinical predictors (birth weight, Apgar score at the 5th minute of life, Silverman score, number of days of invasive ventilatory support, and median fraction of inspired oxygen measured during the first seven days of life) can be used to predict the development of bronchopulmonary dysplasia. The logistic regression model shows high sensitivity, which minimizes the probability of a type II error (a false negative). Its application is therefore useful in the early prediction of bronchopulmonary dysplasia in premature infants.
Full Text
Introduction
Bronchopulmonary dysplasia (BPD) is one of the most serious complications of preterm birth because it has long-term consequences [1]. Owing to advances in modern neonatal care, the survival rate of profoundly premature infants has improved significantly, which contributes to the increasing incidence of BPD worldwide [2]. Optimizing strategies for the prevention and treatment of BPD relies on evidence-based prediction of the probability of its development, the main goal of which is to ensure a personalized approach to each child.
Many BPD prediction models have been developed in recent years. For example, T. C. Kwok (2023) included 64 studies with 53 prediction models in a review [3], H. B. Peng (2022) described 21 prediction models from 13 studies [4], and M. Romijn (2023) examined 65 studies, including 158 development models and 108 externally validated models [5]. However, the existing models are of varying quality and may produce contradictory results, which makes it difficult to decide which model to use or recommend [5]. Mathematical approaches in medical prediction include statistical methods and machine learning. Statistical methods can be used to analyze disease data, patient data, and epidemiological trends to identify patterns and factors that affect health. Machine learning makes it possible to build models from large amounts of data, which helps in predicting diagnoses, treatment results, and possible complications [6; 7]. Over the past decade, various algorithms have been applied effectively to data generated in neonatology, for example, to predict the hemodynamic significance of a functioning ductus arteriosus in preterm neonates [8; 9]. A support vector machine model proposed in Denmark in 2021 predicted the occurrence of BPD by combining postpartum clinical characteristics with the amount of nitrogen in exhaled gas; its accuracy was about 90 % [10]. Another study built a machine learning model to predict severe BPD from clinical and genomic data; the AUC of the model was 0.872 [11]. Results of BPD prediction based on deep learning, in particular neural networks, have also been published [12; 13].
Traditionally, researchers identify risk factors for BPD by classifying preterm infants according to the presence or absence of BPD at 28 days of postnatal age or 36 weeks of postconceptional age (PCA) and then examining all factors that influenced risk up to the time of diagnosis. Most BPD prediction models use clinical indicators, including prenatal, perinatal, and postnatal factors. Although many studies have examined the correlation between biomarkers and BPD, only a few biomarkers have been included in prediction models [14]. The main known risk factors for BPD listed in studies are low birth weight, gestational age, male sex, open ductus arteriosus, sepsis, and artificial pulmonary ventilation. Nevertheless, because the development of BPD is determined by a large number of factors whose interrelationship is still debated, the optimal set of predictors remains unknown.
The aim of the study is to develop an algorithm for early prediction of clinically significant bronchopulmonary dysplasia in profoundly premature infants. It is hypothesized that there is an optimal combination of predictive features (predictors) that identifies infants with the highest probability of developing BPD.
Materials and methods
A retrospective study conducted at the Perm Regional Perinatal Center included 226 profoundly premature infants born between October 2015 and April 2020. Inclusion criteria were: gestational age less than 31 weeks, birth weight from 490 g to 999 g, age from 0 to 7 days, respiratory insufficiency requiring artificial lung ventilation (ALV), and a main diagnosis according to ICD-10 of P27.1, bronchopulmonary dysplasia originating in the perinatal period. Exclusion criteria were: serious congenital malformations such as chromosomal abnormalities, congenital lung disease, congenital heart defects (except open ductus arteriosus and atrial septal defect), and malformations of the central nervous system, as well as incomplete clinical data. The information was obtained by retrospective review of medical records (reporting form No. 112/у). In our study, we defined BPD according to the wording of R.D. Higgins (2018) stated in the clinical recommendations: bronchopulmonary dysplasia is a chronic diffuse parenchymal (interstitial) lung disease that occurs in premature infants as an outcome of respiratory distress syndrome and/or pulmonary hypoplasia and is diagnosed on the basis of oxygen dependence at 28 days of life and/or 36 weeks of postconceptional age [15]. Sixty potential prognostic features were identified from a literature review and our own hypotheses, and 37 of them were excluded as uninformative in subsequent analysis. As a result, five predictors from the early neonatal period were used to develop the prediction model: birth weight, the 5-minute Apgar score, Silverman score, the number of days of invasive ALV, and the median fraction of inspired oxygen (FiO2) recorded daily during the first seven days of life. Invasive ALV was defined as any type of assisted ventilation requiring intubation and artificial ventilation from a CPAP machine. The indications for ALV were frequent apneas, increasing symptoms of respiratory insufficiency in the form of accessory muscle involvement in breathing, persistent respiratory acidosis on blood gas analysis, and PaCO2 rising above 50 mm Hg at an FiO2 of 60 % in the supplied mixture. Laboratory methods included general clinical blood analysis (Sysmex XN 9000 analyzer) and biochemical blood analysis (Sapphire 400 analyzer).
Echocardiography was performed in all infants on the 1st, 3rd, 7th, and 28th days of life with a Vivid scanner (GE, USA) using 12S-RS and 8S-RS transducers. Neurosonography (NSG) was performed on the 1st and 3rd days of life using a Vivid multifunctional ultrasound scanner (General Electric, USA) with color Doppler flow mapping. A standard ECG was recorded in all infants on the second day of life using an “Alton EKZT-12-03 (2007)” electrocardiograph. Chest radiography was performed on the 1st, 3rd, and 28th days and at 36 weeks of PCA (TMS 300 RDR mobile X-ray unit). Vital functions were monitored around the clock in all infants, including heart rate, oxygen saturation, and blood pressure.
The results were processed statistically using parametric and nonparametric methods. Quantitative variables whose distribution differed from normal were described using the median (Me) with quartiles (Q1 – Q3) corresponding to the 25–75 % interval. Nominal data were described by the absolute value and percentage. For quantitative variables with a normal distribution, the arithmetic mean (M), standard deviation (SD), and 95 % confidence interval (95 % CI) limits were calculated. For comparison of groups, Student's t-test was used for normally distributed data and the Mann – Whitney U test for non-normally distributed data. Nominal data were compared using Pearson's χ2 test. Differences were considered statistically significant at p < 0.05. Associations between quantitative variables were evaluated using Spearman's rank correlation coefficient. The following algorithms were used to develop the model: logistic regression, support vector machines (SVM), Random Forest Classifier, and Gradient Boosting Classifier. Continuous variables were rescaled so that their values ranged from 0 to 1. Non-binary categorical variables were converted to binary variables via One Hot Encoder. Models were developed using the training dataset and evaluated using 5-fold cross-validation. The test dataset was used for internal validation. The area under the ROC curve (AUC) of each model was calculated to evaluate the characteristics of the models. The following metrics were also evaluated: Accuracy, Precision, Recall, and F1 Score.
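The rescaling to the 0 to 1 range and the 5-fold partitioning described above can be illustrated with a minimal Python sketch. It uses only the standard library rather than the scikit-learn classes used in the study, the function names are hypothetical, and the birth weights are made-up values, not study data.

```python
from typing import List, Sequence, Tuple


def fit_min_max(values: Sequence[float]) -> Tuple[float, float]:
    """Return the (min, max) bounds of a training column."""
    return min(values), max(values)


def min_max_scale(value: float, lo: float, hi: float) -> float:
    """Map a value onto [0, 1] using bounds learned from the training data."""
    if hi == lo:
        return 0.0  # constant column: no spread to scale
    return (value - lo) / (hi - lo)


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, validation) pairs."""
    # The first n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


# Illustrative birth weights in grams (made-up values, not study data).
train_weights = [490.0, 720.0, 880.0, 999.0]
lo, hi = fit_min_max(train_weights)
scaled = [min_max_scale(w, lo, hi) for w in train_weights]

# 133 is the size of the training set reported in the article.
folds = k_fold_indices(133, k=5)
```

Learning the bounds on the training data only and reusing them for the test data keeps information from the test set out of model development; whether the study did exactly this is not stated in the text.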
Data accumulation, adjustment, summarization, and visualization were carried out in standard Microsoft Office Excel 2016 spreadsheets. Jamovi and SPSS 26.0 software were used for statistical analysis. All experiments were conducted in Python 3.9.5 using the following libraries: scikit-learn 0.24.1, matplotlib, scipy. All procedures in this research involving human participants were conducted in accordance with the Declaration of Helsinki (revised in 2013). The study was approved by the local ethical committee of the Federal State Budgetary Educational Institution of Higher Education "Perm State Medical University named after Academician E.A. Wagner" of the Ministry of Health of the Russian Federation (Perm, Russia). Written informed consent was obtained from the patients' parents or legal guardians.
Results and discussion
A total of 226 infants born before 30 weeks of gestation were enrolled in the study. Retrospectively, 44 infants were excluded because of death before the 28th day of life, and 21 dropped out for other reasons. Thus, a total of 182 children were included in the final analysis: 94/182 (51.6 %) girls and 88/182 (48.4 %) boys. The median birth weight was 880.0 g with an interquartile range (Q1 – Q3) from 770 to 960.0 g, the mean gestational age was 26.7 ± 1.74 weeks, and the mean maternal age was 27.2 ± 6.5 years. BPD was diagnosed in 148 of 182 infants (81.3 %) at 36 weeks of postconceptional age: 28/182 (15.4 %) cases were categorized as mild, 54/182 (29.7 %) as moderate, and 66/182 (36.3 %) as severe. Considering the insignificant clinical manifestations of mild BPD, the data were divided into two groups: moderate/severe BPD (main group, 120 patients) and absent/mild BPD (comparison group, 62 children). There were significant differences between the groups: the median birth weight was 806 (720–900) g in the main group and 949 (893–990) g in the comparison group (p < 0.001), and the mean gestational age was 26.1 ± 1.5 weeks in the main group and 28 ± 1.5 weeks in the comparison group (p < 0.001). The length of stay in the intensive care unit was longer among infants with BPD (on average 52.2 days vs. 21.7 days without BPD, p < 0.001). The Apgar score was lower among patients who later developed BPD (main group): 6.13 ± 0.91 vs. 7.06 ± 0.86, p < 0.001. The Silverman score (a respiratory distress severity score) was 5.98 ± 0.80 points in the main group vs. 5.11 ± 0.88 points in the comparison group, p < 0.001. The mean number of days on ALV was significantly higher in the main group (5.18 ± 2.54 vs. 1.44 ± 2.51, p < 0.001). The median FiO2 in the first 7 days of life was significantly higher in the main group: 28.70 (25.5–33.0) vs. 23.50 (22.00–27.00), p < 0.001 (Table 1).
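The cohort percentages and group sizes reported above can be re-derived from the raw counts. The following short Python sketch uses only counts stated in the text; the variable and function names are illustrative.

```python
# Counts taken from the text; names are illustrative.
total_infants = 182
bpd_by_severity = {"mild": 28, "moderate": 54, "severe": 66}


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the article."""
    return round(100 * count / total, 1)


bpd_total = sum(bpd_by_severity.values())
# Main group: moderate or severe BPD; comparison group: no BPD or mild BPD.
main_group = bpd_by_severity["moderate"] + bpd_by_severity["severe"]
comparison_group = total_infants - main_group

for severity, count in bpd_by_severity.items():
    print(f"{severity}: {count}/{total_infants} = {percent(count, total_infants)} %")
print(f"any BPD: {bpd_total}/{total_infants} = {percent(bpd_total, total_infants)} %")
print(f"main group: {main_group}, comparison group: {comparison_group}")
```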
Four machine learning algorithms were used to develop a BPD prediction model. The task is binary classification: the target variable is the development of BPD, which takes one of two possible values, 0 or 1, and the independent variables are the set of five studied features. During data preparation, outliers (four values) were removed, so the final dataset for modeling contained 178 observations. The dataset was randomly divided into two subsets: the training dataset, which consisted of 75 % of the cohort (133 children), and the test dataset, which consisted of the remaining 25 % (45 children). The following variables were used as predictors: birth weight, the 5-minute Apgar score, Silverman score, number of days of invasive ALV, and median FiO2 value. For the task of predicting the probability of BPD in preterm infants, we chose Recall as the leading metric because it is important to minimize false negative results: when the model incorrectly predicts the absence of BPD in an infant, the wrong treatment tactics may be chosen. We reduce the number of such errors by choosing the model with the maximum Recall value. The logistic regression model showed the highest Recall value among the four algorithms used (Table 2).
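The 75/25 split described above can be sketched as follows. This is a minimal standard-library illustration and not the authors' code: the function name and the seed are assumptions, and the study itself used scikit-learn.

```python
import random
from typing import List, Tuple


def train_test_split_indices(
    n_samples: int, train_fraction: float = 0.75, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly split sample indices into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed (assumed) for reproducibility
    n_train = int(n_samples * train_fraction)  # rounds 133.5 down to 133
    return indices[:n_train], indices[n_train:]


# 178 observations remained after outlier removal, as stated in the text.
train_idx, test_idx = train_test_split_indices(178)
print(len(train_idx), len(test_idx))  # 133 45
```

Because 75 % of 178 is 133.5, the training size has to be rounded; rounding down reproduces the 133/45 split reported in the article.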
Table 1
Clinical characteristics of profoundly premature infants
Characteristic | Main group, n = 120 | Comparison group, n = 62 | p-value |
Birth weight, g | 806 (720–900) | 949 (893–990) | < 0.001 |
Gestational age at birth, weeks | 26.1 ± 1.5 | 28 ± 1.5 | < 0.001 |
Apgar score, points | 6.13 ± 0.91 | 7.06 ± 0.86 | < 0.001 |
Silverman score, points | 5.98 ± 0.80 | 5.11 ± 0.88 | < 0.001 |
Days on ALV | 5.18 ± 2.54 | 1.44 ± 2.51 | < 0.001 |
FiO2, median, % | 28.70 (25.5–33.0) | 23.50 (22.00–27.00) | < 0.001 |
Table 2
Classification characteristics (metrics) of the final models
№ | Model | Accuracy | Precision | Recall | F1 Score | AUC |
1 | Logistic Regression | 0.818 | 0.795 | 0.972 | 0.875 | 0.840 |
2 | Random Forest | 0.763 | 0.780 | 0.888 | 0.831 | 0.830 |
3 | Gradient Boosting | 0.740 | 0.823 | 0.777 | 0.799 | 0.800 |
4 | SVC | 0.720 | 0.733 | 0.916 | 0.814 | 0.800 |
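As a consistency check on Table 2, the F1 Score can be recomputed as the harmonic mean of Precision and Recall. The sketch below uses only values copied from the table; the dictionary and function names are illustrative.

```python
# (precision, recall, reported F1) copied from Table 2.
table2 = {
    "Logistic Regression": (0.795, 0.972, 0.875),
    "Random Forest": (0.780, 0.888, 0.831),
    "Gradient Boosting": (0.823, 0.777, 0.799),
    "SVC": (0.733, 0.916, 0.814),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for model, (precision, recall, reported) in table2.items():
    computed = f1_score(precision, recall)
    print(f"{model}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

For all four models the recomputed value agrees with the reported F1 Score to within rounding of the three-decimal inputs.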
A logistic regression equation was derived from these results, with the following coefficients: intercept = 1.18; “Birth weight” = –0.68; “Silverman score” = 0.67; “Apgar score” = –0.62; “Number of days on ALV” = 0.37; “FiO2 fraction” = 0.78. The final equation is available to users as a calculator with a Web interface. The final logistic regression model has the following classification metrics: Recall 0.972, AUC 0.840, and Accuracy 0.818, which allow its application in clinical practice (Figure).
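A logistic regression model converts a weighted sum of predictors z into a probability through p = 1 / (1 + exp(−z)). The sketch below shows how the reported coefficients could be plugged into that formula. It is not the authors' calculator. It assumes that each predictor has already been rescaled to the 0 to 1 range used in training, and the bounds needed for that rescaling are not given in the article, so the example inputs are purely illustrative.

```python
import math
from typing import Dict

# Coefficients as reported in the article; key names are illustrative.
INTERCEPT = 1.18
COEFFICIENTS: Dict[str, float] = {
    "birth_weight": -0.68,
    "silverman_score": 0.67,
    "apgar_5min": -0.62,
    "days_on_alv": 0.37,
    "median_fio2": 0.78,
}


def bpd_probability(scaled_features: Dict[str, float]) -> float:
    """Logistic model: p = 1 / (1 + exp(-z)), z = intercept + sum(coef * x).

    Assumes every feature is already rescaled to [0, 1]; the scaling
    bounds are not published, so raw clinical values cannot be used here.
    """
    z = INTERCEPT + sum(
        coef * scaled_features[name] for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical infant with every scaled predictor at mid-range (0.5).
example = {name: 0.5 for name in COEFFICIENTS}
probability = bpd_probability(example)
print(f"Predicted probability: {probability:.3f}")
```

The signs of the coefficients match the group comparison in Table 1: a higher birth weight and a higher Apgar score lower the predicted probability, while a higher Silverman score, more days on ALV, and a higher FiO2 raise it.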
Fig. ROC-curve graph for the logistic regression model
The advantage of this study is that the proposed algorithm can be applied on the seventh day of the infant's life, giving clinicians the opportunity for early prognosis. In addition, the predictors used are simple and available in clinical practice. A limitation of our study is the relatively small number of participants, which can lead to potential bias. Therefore, further larger studies are needed to confirm the findings and determine their clinical utility.
Conclusions
A combination of clinical predictors (birth weight, the 5-minute Apgar score, Silverman score, number of days of invasive ALV, and median FiO2 measured in the first seven days of life) can be used to predict the development of BPD in the early neonatal period of profoundly premature infants. The logistic regression model shows high sensitivity, which minimizes the probability of a type II error (a false negative). This makes it useful for predicting BPD in premature infants with extremely low birth weight (ELBW) in the early neonatal period.
About the authors
A. V. Permyakova
E.A. Vagner Perm State Medical University
Author for correspondence.
Email: derucheva@mail.ru
ORCID iD: 0000-0001-5189-0347
DSc (Medicine), Head of the Department of Childhood Infectious Diseases
Russian Federation, Perm
O. B. Bakhmetyeva
Perm Regional Perinatal Center
Email: derucheva@mail.ru
ORCID iD: 0000-0003-2343-3602
Assistant of the Department of Anesthesiology, Resuscitation and Emergency Medical Aid, Resuscitation Anaesthetist
Russian Federation, Perm
M. A. Mamunts
E.A. Vagner Perm State Medical University
Email: derucheva@mail.ru
ORCID iD: 0000-0001-5326-6740
PhD (Medicine), Associate Professor of the Department of Pediatrics with Polyclinic Pediatrics Course
Russian Federation, Perm
A. G. Kuchumov
Perm National Research Polytechnic University
Email: derucheva@mail.ru
ORCID iD: 0000-0002-0466-175X
DSc (Physics and Mathematics), Associate Professor, Professor of the Department of Computational Mathematics, Mechanics and Biomechanics
Russian Federation, Perm
K. A. Koshechkin
I.M. Sechenov First Moscow State Medical University (Sechenov University)
Email: derucheva@mail.ru
ORCID iD: 0000-0001-7309-2215
DSc (Pharmaceutics), Associate Professor, Professor of the Department of Information and Internet Technologies
Russian Federation, Moscow
References
- Cheong J.L.Y., Doyle L.W. An update on pulmonary and neurodevelopmental outcomes of bronchopulmonary dysplasia. Semin Perinatol. 2018; 42 (7): 478–484. doi: 10.1053/j.semperi.2018.09.013.
- Lui K., Lee S.K., Kusuda S., Adams M., Vento M., Reichman B., Darlow B.A., Lehtonen L., Modi N., Norman M., Håkansson S., Bassler D., Rusconi F., Lodha A., Yang J., Shah P.S. International Network for Evaluation of Outcomes (iNeo) of neonates Investigators. Trends in Outcomes for Neonates Born Very Preterm and Very Low Birth Weight in 11 High-Income Countries. J Pediatr. 2019; 215: 32–40.e14. doi: 10.1016/j.jpeds.
- Kwok T.C., Batey N., Luu K.L., Prayle A., Sharkey D. Bronchopulmonary dysplasia prediction models: a systematic review and meta-analysis with validation. Pediatr Res. 2023; 94 (1): 43–54. doi: 10.1038/s41390-022-02451-8.
- Peng H.B., Zhan Y.L., Chen Y., Jin Z.C., Liu F., Wang B., Yu Z.B. Prediction Models for Bronchopulmonary Dysplasia in Preterm Infants: A Systematic Review. Front Pediatr. 2022; (12): 10: 856159. doi: 10.3389/fped.2022.856159.
- Romijn M., Dhiman P., Finken M.J.J., van Kaam A.H., Katz T.A., Rotteveel J., Schuit E., Collins G.S., Onland W., Torchin H. Prediction Models for Bronchopulmonary Dysplasia in Preterm Infants: A Systematic Review and Meta-Analysis. J Pediatr. 2023; 258: 113370. doi: 10.1016/j.jpeds.2023.01.024.
- Кучумов А.Г., Голуб М.В., Ракишева И.О., Дорошенко О.В. Алгоритм построения метамодели для прогнозирования гемодинамики в аортах детей с врожденными пороками сердца. Сборник научных трудов VII съезда биофизиков России. Сборник материалов съезда: в 2 т. Краснодар 2023; 228–229 / Kuchumov A.G., Golub M.V., Rakisheva I.O., Doroshenko O.V. An algorithm for creation of metamodel for predicting hemodynamics in the aortas of children with congenital heart defects. Sbornik nauchnyh trudov VII kongressa biofizikov Rossii. Sbornik materialov kongressa. Krasnodar 2023; 228–229 (in Russian).
- Ter-Levonian A.S., Koshechkin K.A. Review of machine learning technologies and neural networks in drug synergy combination pharmacological research. Research Results in Pharmacology 2020; 6 (3): 27–32. doi: 10.3897/rrpharmacology.6.49591.
- Породиков А.А., Биянов А.Н., Пермякова А.В., Туктамышев В.С., Кучумов А.Г., Поспелова Н.С., Фурман Е.Г., Оноприенко М.Н. N-терминальный фрагмент мозгового натрийуретического пептида как предиктор гемодинамической значимости функционирующего артериального протока у недоношенных новорожденных. Пермский медицинский журнал 2021; 38 (1): 5–15 / Porodikov A.A., Bijanov A.N., Permjakova A.V., Tuktamyshev V.S., Kuchumov A.G., Pospelova N.S., Furman E.G., Onoprienko M.N. N-terminal probrain natriuretic peptide as a predictor of hemodynamic significance of functioning ductus arteriosus in premature newborns. Perm Medical Journal 2021; 38 (1): 5–15 (in Russian).
- Permyakova A.V., Porodikov A., Kuchumov A.G., Biyanov A., Arutunyan V., Furman E.G., Sinelnkov Y.S. Discriminant Analysis of Main Prognostic Factors Associated with Hemodynamically Significant PDA: Apgar Score, Silverman–Anderson Score, and NT-Pro-BNP Level. J. Clin. Med. 2021; 10 (3729). doi: 10.3390/jcm10163729.
- Verder H., Heiring C., Ramanathan R., Scoutaris N., Verder P., Jessen T.E., Höskuldsson A., Bender L., Dahl M., Eschen C., Fenger-Grøn J., Reinholdt J., Smedegaard H., Schousboe P. Bronchopulmonary dysplasia predicted at birth by artificial intelligence. Acta Paediatr. 2021; 110 (2): 503–509. doi: 10.1111/apa.15438.
- Dai D., Chen H., Dong X., Chen J., Mei M., Lu Y., Yang L., Wu B., Cao Y., Wang J., Zhou W., Qian L. Bronchopulmonary Dysplasia Predicted by Developing a Machine Learning Model of Genetic and Clinical Information. Front Genet. 2021; 2 (12): 689071. doi: 10.3389/fgene.2021.689071.
- Na J.Y., Kim D., Kwon A.M., Jeon J.Y., Kim H., Kim C.R., Lee H.J., Lee J., Park H.K. Artificial intelligence model comparison for risk factor analysis of patent ductus arteriosus in nationwide very low birth weight infants cohort. Sci Rep. 2021: 11 (1): 22353. doi: 10.1038/s41598-021-01640-5.
- Son J., Kim D., Na J.Y., Jung D., Ahn J.H., Kim T.H., Park H.K. Development of artificial neural networks for early prediction of intestinal perforation in preterm infants. Sci Rep. 2022; 12: 12112. doi: 10.1038/s41598-022-16273-5.
- Журавлева Л.Н., Новикова В.И., Деркач Ю.Н. Определение возможности развития бронхолегочной дисплазии путем определения цитокинового профиля у недоношенных детей. Иммунопатология, аллергология, инфектология 2021; 3: 21–27. doi: 10.14427/jipai.2021.3.21. / Zhuravleva L.N., Novikova V.I., Derkach Ju.N. Determining the possibility of developing bronchopulmonary dysplasia by determining the cytokine profile in premature infants. International journal of Immunopathology, allergology, infectology 2021; 3: 21–27. doi: 10.14427/jipai.2021.3.21.
- Higgins R.D., Jobe A.H., Koso-Thomas M., Bancalari E., Viscardi R.M., Hartert T.V., Ryan R.M., Kallapur S.G., Steinhorn R.H., Konduri G.G., Davis S.D., Thebaud B., Clyman R.I., Collaco J.M., Martin C.R., Woods J.C., Finer N.N., Raju T.N.K. Bronchopulmonary Dysplasia: Executive Summary of a Workshop. J Pediatr. 2018; 197: 300–308. doi: 10.1016/j.jpeds.2018.01.043.
