Abstract
Accurate prediction of crack width is essential for serviceability design and durability assessments of prestressed concrete structures. This study presents an explainable machine learning framework for predicting the maximum crack width of prestressed concrete beams based on experimental data. A comprehensive database of 404 specimens, including the bending moment, prestress index, load ratio, effective depth, and section stiffness, was constructed and enhanced through mechanically informed feature engineering. Random forest and XGBoost regression models were developed and systematically tuned using cross-validated hyperparameter optimization. Among the evaluated models, XGBoost achieved the highest predictive accuracy, with a coefficient of determination of 0.6507 and a root mean square error of 0.184 mm. Model interpretability was investigated using feature importance measures and SHapley Additive exPlanations, which identified the bending moment, load ratio, concrete compressive strength, and prestress index as the dominant factors influencing crack width. The observed relationships are consistent with the established flexural cracking theory, confirming that the proposed model captures physically meaningful behavior. The results demonstrate that combining explainable artificial intelligence with structural mechanics provides a robust and transparent tool for crack width prediction, offering valuable support for the performance-based evaluation and design of prestressed concrete members.
1 Introduction
Prestressed and partially prestressed concrete systems are commonly employed in both bridge and building construction due to their favorable strength-to-weight characteristics, good serviceability behavior, and ability to effectively limit tensile stresses under service loads. Nevertheless, cracking continues to be a significant serviceability issue, particularly in beams, where factors such as prestress losses, end-zone behavior, corrosion, or flexural demands may trigger earlier crack initiation or lead to greater crack propagation than typically assumed in design standards. Even when these cracks are not critical to structural capacity, they can still reduce stiffness, accelerate the corrosion of reinforcement, increase maintenance challenges, and diminish overall structural longevity. Therefore, accurate estimation of crack width in prestressed concrete beams is vital for assessing structural performance, durability, and safety throughout the full service life of the system.
Early research on pretensioned elements revealed that cracking in the anchorage zone commonly develops right after detensioning, largely due to stress concentrations and the interaction between prestressing forces and the local concrete characteristics. Okumus et al. reported that end-zone cracks may continue to extend over time as a result of creep, shrinkage, and temperature effects, emphasizing the limitations of simplified analytical models for these areas [1]. Comparable issues have been noted for flexural cracking under service conditions. Chalioris and Karayannis found that conventional crack-control approaches based on reinforced-concrete principles may not reliably represent the response of partially prestressed members, particularly with respect to predicting reinforcement stresses and permissible crack widths [2]. The combined influence of these mechanisms and contributing parameters on crack initiation and development in prestressed concrete beams is summarized in Figure 1.
Figure 1. Conceptual framework illustrating the mechanisms of crack initiation, governing parameters, and crack evolution in prestressed and partially prestressed concrete beams
In addition to mechanical effects, corrosion-driven processes play a major role in initiating and propagating cracks in prestressed beams. Wang et al. reported that the combination of a prestressing force and strand corrosion accelerates the onset of cover cracking, enlarges crack widths, and alters the stress-transfer behavior between tendons and the surrounding concrete [3]. Dai et al. introduced a three-phase global deterioration model—microcracking, cover cracking, and subsequent crack width evolution—to characterize this progression, offering a valuable framework for assessing deterioration in prestressed members [4]. Comprehensive reviews further indicate that corrosion-induced cracking remains among the most common and structurally significant damage mechanisms in prestressed concrete bridge structures [5,6].
Crack formation is also highly sensitive to section shape, reinforcement layout, and the partial prestressing ratio (PPR). The results of large-scale tests on continuous partially prestressed beams revealed that lowering the PPR, which reflects a higher proportion of conventional reinforcement, leads to a greater number of cracks with reduced widths and smaller spacing, illustrating the crack-distributing effect of tensile reinforcement [7]. Similarly, experimental findings for prestressed and posttensioned ledge girders indicate that inadequate hanger reinforcement or elevated localized stresses within the D-region can trigger severe re-entrant corner cracking, a phenomenon associated with several historical bridge failures [8,9]. Collectively, these investigations highlight the multifaceted nature of crack behavior, governed by factors such as tendon geometry, reinforcement ratios, span-to-depth proportions, and load distribution.
Numerous analytical, empirical, and design-code approaches have been developed to estimate crack widths in prestressed members, yet notable gaps persist between predicted behavior and experimental observations. Several investigations have demonstrated that standard provisions, including those in ACI 318 and Eurocode 2, can produce either unconservative or overly conservative results when applied to partially prestressed members without modification [10–12]. Padmarajaiah and Ramaswamy introduced improved crack width formulas for prestressed beams incorporating fiber reinforcement, showing that enhanced confinement effects influence crack initiation and spacing in ways not reflected in conventional design rules [13]. Similarly, numerical analyses employing finite element techniques reveal that crack patterns and widening are governed by complex nonlinear relationships among concrete tension–stiffening behavior, reinforcement strain, and local bond characteristics [14].
In real bridge settings, full-scale experiments further demonstrate the shortcomings of the current crack-prediction approaches. Lu et al. found that the maximum crack widths recorded in a 35-m pretensioned I-girder containing fractured strands were substantially smaller than those estimated using national bridge design codes, suggesting that existing provisions may be overly conservative for certain structural arrangements [15]. Additionally, field surveys of posttensioned ledge girders have documented horizontal cracks ranging from 0.1 to 0.3 mm that have remained for decades, raising concerns about long-term prestress losses and emphasizing the necessity for more accurate predictive and assessment models [16].
Overall, the literature indicates that crack width is governed by a core group of key parameters, such as the bending moment, effective depth, concrete strength, reinforcement ratio, and level of prestressing; however, the interactions among these factors are highly nonlinear and not easily represented by conventional predictive formulas [17,18]. In addition, the absence of standardized experimental datasets and the variability in reporting practices across studies pose major obstacles to creating consistent, data-driven models for crack width prediction [19–22].
Given these obstacles, emerging artificial intelligence (AI) and machine-learning approaches present a promising path forward. Data-driven models are well suited to capturing complex, nonlinear interactions among multiple variables without imposing restrictive theoretical assumptions [23]. By leveraging diverse datasets, ranging from laboratory tests and field measurements to numerical simulations, AI models can learn patterns in crack initiation and propagation that cannot be identified using conventional equations [24]. Furthermore, once trained, such models can provide rapid, accurate predictions and help identify the most influential parameters governing crack behavior. Thus, AI-based prediction tools offer a powerful opportunity to overcome longstanding limitations in crack width assessment and to support more reliable, performance-based designs and evaluations of prestressed concrete structures [25]. Recent advancements in AI have shown promising results for crack width prediction in concrete members. For reinforced concrete beams, machine learning models such as artificial neural networks, support vector machines, and ensemble methods have been applied to flexural crack width estimation, often outperforming empirical formulas by capturing nonlinear interactions among parameters such as the reinforcement ratio, cover, and applied stress. For partially prestressed concrete beams, boosting frameworks (e.g., XGBoost, LightGBM, AdaBoost, and GBRT) have been utilized to predict the maximum crack width, addressing the limitations of traditional semiempirical models, which are not always effective in cases with high-strength materials and complex stress states. Recent studies have also explored interpretable ML for restraint-induced cracking in RC elements using random forest and XGBoost methods, achieving improved accuracy over code provisions.
However, applications specifically to prestressed concrete beams remain limited, with most prior work focusing on RC or PPC without full integration of mechanics-informed features and explainability tools such as SHAP for flexural serviceability assessment. The present study bridges this gap. An explainable, mechanics-informed framework tailored to prestressed beams is developed, leveraging a curated experimental database and emphasizing interpretability aligned with flexural cracking theory. To address the traditional limitations, a curated dataset of the results of 404 prestressed concrete beam tests that are rigorously standardized to resolve missing values, unit inconsistencies, and nonnumeric entries is established. To embed structural mechanics into the prediction process, engineered features—including the effective depth, slenderness ratio, prestressing index, sectional stiffness, and cover-related parameters—are formulated to reflect the fundamental principles governing crack behavior.
Two supervised learning models (Random Forest and XGBoost) were trained using a two-stage optimization strategy, with XGBoost achieving the higher predictive accuracy. An interpretability module demonstrated that applied moment, effective depth, prestressing level, and concrete strength exert the strongest influence on crack predictions, a finding consistent with decades of experimental and analytical research.
A mechanics-informed and fully explainable machine learning framework for predicting crack width in prestressed concrete beams is introduced in this study to address the key limitations of existing empirical and black-box data-driven approaches. Unlike conventional machine-learning models that rely primarily on raw geometric and material inputs, the proposed method embeds structural mechanics directly into the learning process through physically meaningful feature engineering, including the bending moment, prestress index, load ratio, effective depth, and section stiffness. This integration enables the model to learn relationships that are consistent with flexural cracking theory while maintaining high predictive accuracy. Furthermore, the framework emphasizes model transparency and interpretability by combining ensemble learning with explainability techniques, allowing the influence of each governing parameter to be quantified and physically interpreted. This distinguishes the present work from previous studies that reported predictive performance without providing mechanistic insight. By bridging experimental evidence, structural theory, and explainable artificial intelligence, the proposed approach offers a robust and adaptable tool for serviceability assessment, supports future calibration of crack width design provisions, and expands the methodological foundation for applying artificial intelligence in prestressed concrete engineering.
The structure of this paper is organized as follows. Section 2 presents the materials and methods, including the experimental database, data preprocessing procedures, engineered feature development, and explainable AI modeling framework. Section 3 reports the results of the predictive models, with emphasis on crack width estimation performance and key evaluation metrics. In Section 4, the predictive performance of the proposed models is evaluated, and the results are interpreted via feature importance and SHAP analyses, linking the dominant predictors to established flexural cracking theory for prestressed concrete beams. Finally, Section 5 concludes the paper by summarizing the main contributions and outlining recommendations for future research.
2 Materials and Methods
In this section, the materials and methods used to develop the explainable AI framework for analyzing crack width behavior in prestressed concrete beams are described. This study is based on an experimental database containing laboratory-tested prestressed concrete beam specimens, each of which is characterized by geometric properties, concrete material characteristics, prestressing parameters, reinforcement details, and flexural loading conditions. Because the raw dataset included inconsistencies such as list-formatted entries, string-based numerical values, and missing or incomplete measurements, a comprehensive data preprocessing pipeline was implemented. This process involved unit normalization, conversion of nested values into aggregated scalars, removal of invalid entries, and replacement of infinite or undefined values to ensure computational reliability.
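The cleaning steps listed above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the helper names `to_mm` and `to_scalar`, the unit labels, and the choice of the mean as the aggregation rule for list-formatted entries are all assumptions made for illustration.

```python
import ast
import math

# Conversion factors to millimetres (assumed unit labels, for illustration only).
TO_MM = {"mm": 1.0, "cm": 10.0, "m": 1000.0, "in": 25.4}


def to_mm(value, unit):
    """Normalize a length to millimetres."""
    return value * TO_MM[unit]


def to_scalar(raw):
    """Convert a raw cell (number, numeric string, or list-formatted string)
    to a finite float, or return None when the entry is unusable."""
    if raw is None:
        return None
    if isinstance(raw, str):
        text = raw.strip()
        if not text:
            return None
        try:
            raw = ast.literal_eval(text)  # parses "12.5" as well as "[0.1, 0.2]"
        except (ValueError, SyntaxError):
            return None  # nonnumeric entry such as "n/a"
    if isinstance(raw, (list, tuple)):
        values = [to_scalar(v) for v in raw]
        values = [v for v in values if v is not None]
        if not values:
            return None
        raw = sum(values) / len(values)  # assumption: aggregate nested values by their mean
    if isinstance(raw, bool) or not isinstance(raw, (int, float)):
        return None
    value = float(raw)
    # Infinite or undefined (NaN) values are replaced by None.
    return value if math.isfinite(value) else None


cleaned = [to_scalar(cell) for cell in ["12.5", "[0.25, 0.75]", "n/a", float("inf"), ""]]
```

Returning `None` for every unusable entry keeps the later filtering step simple, because invalid, missing, and non-finite values can all be dropped with a single check.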
To enhance the structural relevance of the AI model, a set of engineered features was developed on the basis of established mechanical principles. These included the effective depth, slenderness ratio, prestressing index, sectional moment of inertia, cover-related parameters, and load ratio. These features were selected to capture key behaviors governing crack formation and propagation under flexural loading.
The cleaned dataset and engineered features were then used to train an explainable AI modeling framework. Two predictive models were considered and optimized through a two-stage hyperparameter search process. Model performance was evaluated using standard statistical metrics, and an interpretability module was incorporated to quantify the influence of individual features on crack width predictions.
2.1 Experimental Database
The analyses in this study are based on a curated experimental dataset [26] titled Crack width and crack spacing in reinforced and prestressed concrete elements (V2.0), developed at Eindhoven University of Technology. The database synthesizes results from 30 experimental programs documented in the literature, encompassing cracking behavior in reinforced or prestressed concrete members under bending, axial tension, or combined loading. The goal of the compilation is to provide a unified, high-quality source of empirical measurements for research on crack formation, tension stiffening, bond behavior, and serviceability performance.
The main dataset contains 24,297 total records (rows). Each record corresponds to a single measured quantity (such as the number of cracks, mean crack width, or maximum crack spacing) associated with a specific load level of a tested structural element. The data are categorized using a variable valcat, which specifies the type of measurement. While crack width is not the only descriptor of cracking behavior, it remains the most critical serviceability indicator in prestressed concrete design because of its direct relationship with durability and code compliance. This dataset is particularly rich in crack width data across a wide variety of structural configurations and load states. The database follows a three-level hierarchical structure. At the top level, each Experimental Program (EPID) contains bibliographic details and contextual information about the original publications. The second level describes the Structural Elements (ELID), including the geometric properties, reinforcement and prestressing layouts, concrete material characteristics (compressive strength, tensile strength, modulus of elasticity), concrete mix parameters, curing and preparation information, and properties of both reinforcing and prestressing steel. The third level, Loads and Data Points (LOADID), stores the applied loading steps along with the measured crack-related quantities. By structuring the data in this way, the dataset ensures traceability from individual observations back to the experimental origins and material properties.
Data quality and comparability were ensured through a rigorous selection process. Structural elements were excluded when essential geometry or material data were missing, when reinforcement layouts fell outside the intended scope (e.g., asymmetric layouts or nonmetallic reinforcement), when prestressing data were incomplete, or when the element was damaged prior to testing. To prepare the retained data for analysis, a structured preprocessing pipeline was developed. First, all the variables were normalized into explicit numerical or categorical formats, with consistent SI units across the entire dataset. The list-formatted crack measurements were decomposed into individual records, each expressed as a separate observation. Missing values were standardized following the protocol stated in the original documentation: empty or zero-filled cells were treated as “unknown or not experimentally determinable”. In parallel, the geometric, material, and reinforcement properties were harmonized across the contributing programs. Following these processing steps, all the valid observations were integrated into the finalized dataset, which was used in the analysis.
Through preprocessing, the dataset was transformed from a heterogeneous, multifile source into a unified, analysis-ready structure. The EPIDs link each record to one of the 30 independent experimental programs reported in the literature, each corresponding to a distinct structural typology and testing methodology. In the cleaned dataset, the test series reported by Harajli and Naaman (1985) [27] is used to investigate partially prestressed concrete beams under static and fatigue loading. These beams were constructed with bonded tendons and subjected to four-point bending tests to quantify the evolution of cracking, steel stress, and prestress-related effects. Within each experimental program, ELID differentiates the individual structural members tested. Each ELID corresponds to a unique beam, tie, or partially prestressed element with its own geometric, material, and reinforcement configuration. In the cleaned file, EPID 17 includes elements such as ELID = PD1, which corresponds to a specific partially prestressed beam with a total length ≈ 3.05 m, span length ≈ 2.74 m, section depth h = 228.6 mm, bonded prestressing tendons, and measured concrete properties (e.g., 𝑓𝑐𝑚, 𝑓𝑐𝑡𝑚, and 𝐸𝑐𝑚). All rows sharing the same EPID–ELID pair describe the full loading history and crack-response evolution of that single physical specimen.
The LOADID variable indexes individual load application stages for each structural element. Each LOADID corresponds to a discrete increment of applied load or moment during the experiment and is associated with the measured crack response at that particular stage. For instance, in EPID 17 / ELID PD1, LOADIDs such as 1, 2, 4, 18, and 19 represent sequential measurement points during the beam’s loading history. Each LOADID captures a unique combination of the applied load (𝑃𝐿), bending moment (𝑀), calculated steel stress (𝜎𝑠), and corresponding measured crack width. This structured organization ensures precise traceability from each crack measurement back to the experimental origin, the physical specimen, and its loading conditions, thereby enabling robust statistical and computational analyses of cracking behavior for reinforced and prestressed concrete. The hierarchical structure of the dataset, including the mapping among EPIDs, ELIDs, and their corresponding load steps, is provided in Table 1, and the detailed geometric parameters of the structural elements are presented in Table 2. Together, these tables summarize the physical characteristics of the specimens and the organization of the measurement data used in this study.
The final dataset used in this study consists of 404 laboratory-tested prestressed and partially prestressed concrete beam specimens, extracted from the original database after systematic screening. The selected specimens are predominantly simply supported rectangular beams tested under quasistatic four-point bending, with crack measurements taken in constant-moment regions. The beams cover practical ranges of geometric, material, and prestressing parameters relevant to serviceability assessment: section depths of approximately 200–300 mm, widths of approximately 100–200 mm, and spans between approximately 2.5 and 4.0 m. The concrete compressive strength ranges from approximately 25 to 60 MPa, whereas the prestressing level varies from low to moderate partial prestressing ratios when bonded tendons are used. The measured maximum crack widths are mainly within the serviceability range (approximately 0.05–0.40 mm), with a few specimens exhibiting larger crack widths.
To ensure consistency and reliability, specimens were excluded when essential geometric, material, prestressing, or loading information was missing or inconsistent or when configurations fell outside the scope of this study (e.g., unbonded or external tendons, nonrectangular cross-sections, combined shear–flexure or axial loading). All the retained data were converted to consistent SI units, nonnumeric entries were removed, and only the maximum recorded crack width per specimen was used as the target variable. Although the resulting dataset provides a robust experimental basis for modeling flexural crack width in conventional prestressed concrete beams, its applicability is limited to the statistical and physical bounds of the selected specimens. In particular, long-term effects, cyclic or fatigue loading, environmental degradation, and highly damaged states are not explicitly represented and may require additional data for reliable extrapolation.
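Selecting the maximum recorded crack width per specimen amounts to a group-wise maximum over the EPID–ELID–LOADID hierarchy. The sketch below shows one way to do this in plain Python. The field name `w_max`, the specimen label `X1`, and all crack width values are invented for illustration and are not taken from the database; only the identifiers EPID 17 and ELID PD1 appear in the text.

```python
def max_crack_per_specimen(records):
    """Return {(EPID, ELID): load-step record with the largest crack width}."""
    best = {}
    for rec in records:
        w = rec.get("w_max")
        # Empty or zero-filled cells are treated as unknown and skipped.
        if w is None or w <= 0:
            continue
        key = (rec["EPID"], rec["ELID"])
        if key not in best or w > best[key]["w_max"]:
            best[key] = rec
    return best


# Illustrative load-step records (crack widths in mm are made up).
records = [
    {"EPID": 17, "ELID": "PD1", "LOADID": 1, "w_max": 0.05},
    {"EPID": 17, "ELID": "PD1", "LOADID": 2, "w_max": 0.12},
    {"EPID": 17, "ELID": "PD1", "LOADID": 4, "w_max": 0.21},
    {"EPID": 17, "ELID": "X1", "LOADID": 1, "w_max": None},
    {"EPID": 17, "ELID": "X1", "LOADID": 2, "w_max": 0.0},
    {"EPID": 17, "ELID": "X1", "LOADID": 3, "w_max": 0.08},
]

selected = max_crack_per_specimen(records)
```

Keeping the whole record, rather than only the crack width, preserves the bending moment and load step associated with the selected target value.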
Table 1. Relationships between the experimental programs (EPIDs) and their corresponding structural elements (ELIDs), showing the number of measured load increments associated with each specimen
Table 2. Summary of the geometric parameters for each structural element (ELID), detailing section dimensions, concrete cover characteristics, and the positions of both mild and prestressing reinforcements [26]
Note 1: The height and width of those cross-sections are 228.6 mm and 114.3 mm, respectively.
Note 2: ℎ𝑐 denotes the curing condition. A specific condition is denoted with a number. Multiple curing conditions can be denoted in an array. Different available conditions: 1 = keeping the formwork in place, 2 = covering the concrete surface with vapour-proof sheets, 3 = placing of wet coverings on the surface, 4 = keeping the concrete surface visibly wet with suitable water, 5 = application of a curing compound, 6 = stored inside test hall, 7 = stored inside climate room, 8 = stored outside.
Although the curated dataset of 404 specimens offers comprehensive coverage of cracking behavior in rectangular sections of simply supported prestressed and partially prestressed concrete beams under quasistatic four-point bending with bonded tendons, its scope remains limited to these conditions. The included experimental programs focus on constant-moment regions in laboratory-scale elements, with straight or moderately draped prestressing layouts and monotonic or low-cycle loading. Non-rectangular cross-sections, unbonded tendons, multi-stage or external prestressing, combined bending, shear, or axial loading, cyclic/fatigue regimes, long-term sustained loads, and environmental degradation mechanisms (e.g., corrosion, freeze–thaw, or alkali–aggregate reactions) are not represented. Therefore, the trained models are most reliable within the statistical and configurational bounds of the database and should be applied with caution or supplemented with targeted additional data outside these limits to minimize extrapolation uncertainty.
2.2 Engineered Feature Development
To incorporate fundamental structural mechanics into the predictive framework, a set of engineered features was developed based on standard principles governing flexural behavior and crack formation in prestressed concrete beams. These features provide physically interpretable variables that enhance both predictive accuracy and model explainability. The effective depth, which governs the lever arm of internal forces, was computed as
$$d = h - c - \frac{\phi_p}{2} \tag{1}$$
where ℎ is the total beam depth, 𝑐 is the concrete cover to the prestressing strands, and ϕ𝑝 represents the equivalent diameter of the prestressing steel bundle. The slenderness ratio, reflecting global deformation behavior, was defined as
$$\text{Slenderness ratio} = \frac{L}{h} \tag{2}$$
with 𝐿 denoting the clear span. Increased slenderness is typically associated with larger deflection and curvature, contributing to wider flexural cracks. The moment of inertia of the uncracked rectangular cross-section was evaluated using
$$I = \frac{b h^{3}}{12} \tag{3}$$
where 𝑏 and ℎ denote the section width and height, respectively. This parameter affects flexural stiffness and the curvature response.
(4)
where 𝑀 is the applied bending moment and 𝐸𝑐 is the elasticity modulus of concrete. A prestressing index was formulated to quantify the contribution of the effective prestressing force to resisting tensile stresses:
(5)
where 𝐴𝑝 is the total area of prestressing steel, 𝑓𝑝𝑒 is the effective prestress after losses, and 𝑓𝑐𝑚 is the mean compressive strength of the concrete. This dimensionless parameter characterizes the degree to which prestressing delays cracking. The reinforcement spacing ratio is as follows
(6)
where 𝑆𝑝 denotes the spacing between prestressing strands. Both ratios influence crack spacing and localization, which are directly related to the maximum crack width behavior. Collectively, these engineered features form the vector
(7)
providing physically meaningful descriptors that embed structural behavior into the AI model. Their inclusion enhances interpretability and enables the explainable framework developed in subsequent sections. While the effective depth (𝑑), gross moment of inertia (𝐼), and slenderness ratio (𝐿/ℎ) follow the classical definitions used in design codes and tension-stiffening models, the prestressing index (𝑃𝐼) and load ratio (𝐿𝑅) were formulated to explicitly quantify the relative crack-controlling effect of prestress and the nonlinear progression of cracking under increasing flexural demand. These derived features, though rooted in mechanics principles, are not routinely used together in prior machine-learning applications for prestressed concrete crack width prediction and proved critical for model performance and interpretability.
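The three classical features can be computed in a few lines. The sketch below is illustrative only. The gross moment of inertia of a rectangular section and the slenderness ratio 𝐿/ℎ follow their standard definitions, whereas the effective depth expression (total depth minus cover minus half the equivalent tendon diameter) is an assumption inferred from the symbol definitions in this section. The function names, the cover of 25 mm, and the tendon diameter of 9.5 mm are hypothetical; the section dimensions and span are those quoted for the 228.6 mm × 114.3 mm beams.

```python
def effective_depth(h, c, phi_p):
    """Effective depth in mm, assuming d = h - c - phi_p / 2."""
    return h - c - phi_p / 2.0


def slenderness_ratio(span, h):
    """Span-to-depth ratio L/h (both lengths in the same unit)."""
    return span / h


def gross_moment_of_inertia(b, h):
    """Moment of inertia of an uncracked rectangular section, I = b h^3 / 12 (mm^4)."""
    return b * h ** 3 / 12.0


# Section dimensions and span quoted in the text; cover and tendon diameter are hypothetical.
b, h, span = 114.3, 228.6, 2740.0
d = effective_depth(h, c=25.0, phi_p=9.5)   # about 198.85 mm
ratio = slenderness_ratio(span, h)          # about 11.99
inertia = gross_moment_of_inertia(b, h)     # about 1.138e8 mm^4
```

The prestressing index and load ratio are omitted here because their exact expressions are not recoverable from the text.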
2.3 AI Modeling Framework
An explainable machine learning framework was developed to model crack width behavior in prestressed concrete beams by integrating supervised learning algorithms with mechanics-informed feature engineering. Two ensemble regression models were implemented: the Random Forest Regressor (RF) and the Extreme Gradient Boosting Regressor (XGBoost). These algorithms were selected due to their capacity to approximate highly nonlinear structural responses, mitigate overfitting through ensemble aggregation, and accommodate heterogeneous input variables characteristic of structural engineering datasets.
Ensemble regression algorithms were selected due to their robustness in handling nonlinear, heterogeneous structural datasets with moderate sample sizes and potential multicollinearity among features (e.g., geometric and material parameters). Random Forest was chosen for its bagging-based approach, which reduces variance through parallel tree construction and bootstrapping, provides inherent feature importance via mean decrease in impurity, and offers high interpretability. Extreme Gradient Boosting (XGBoost) was adopted for its sequential boosting mechanism with regularization (preventing overfitting), efficient handling of sparse/missing data, and superior predictive performance on tabular engineering data through gradient-based optimization and tree pruning. Both algorithms excel in capturing complex interactions without restrictive assumptions, align well with mechanics-informed features, and support post-hoc explainability (e.g., SHAP). XGBoost was ultimately favored for its slight edge in accuracy (R² = 0.6507 vs. 0.6306 for RF), while RF provides complementary interpretability advantages for engineering judgment. Alternative methods (e.g., ANN, SVR) were considered but prioritized less due to higher tuning complexity, lower inherent explainability, or inferior performance on this dataset size and feature type.
A unified feature set consisting of geometric parameters, material properties, prestressing metrics, and section-level mechanical indicators (see Section 2.2) was used as input to both models. Twelve input features were included: beam width (𝑏), beam height (ℎ), span length (𝐿𝑠𝑝𝑎𝑛), concrete compressive strength (𝑓𝑐𝑚), prestressing steel area (𝐴𝑝), prestressing steel strength (𝑓𝑝𝑘), prestress index (𝑃𝐼), applied bending moment (𝑀), load ratio (𝐿𝑅), section moment of inertia (𝐼), effective depth (𝑑), and slenderness ratio (𝐿/ℎ). These parameters represent the primary geometric, material, and mechanical factors governing flexural cracking behavior in prestressed concrete beams. The target output variable was the maximum crack width (𝑊𝑚𝑎𝑥). Only samples corresponding to the maximum recorded crack width were used for training, and missing or nonphysical values were removed prior to model development. A two-stage hyperparameter optimization strategy was employed to ensure both predictive accuracy and model generalizability. All the models were trained using an 80/20 train–test split of the filtered dataset. In the initial optimization stage, RandomizedSearchCV was used to explore the global hyperparameter space efficiently. This method enables stochastic sampling of candidate configurations, providing broad coverage of allowable depths, regularization terms, sampling ratios, and ensemble sizes without the need for exhaustive evaluation. Following the randomized exploration stage, a GridSearchCV procedure was applied to the best-performing configurations to refine the hyperparameters within a localized region. This two-tier optimization approach balances global exploration and local precision. The Python programming language was used to script the algorithms. Tables 3 and 4 summarize the full search spaces for both models, including the randomized and grid search ranges.
All models were implemented in Python 3.10 using scikit-learn 1.3.2 (for Random Forest and RandomizedSearchCV/GridSearchCV utilities), XGBoost 2.0.3, pandas 2.1.0, numpy 1.25.2, and SHAP 0.43.0. To ensure reproducibility of results involving randomness (e.g., train–test splitting, bootstrap sampling in Random Forest, subsampling in XGBoost, and randomized hyperparameter search), a fixed global random seed of 42 was set via random.seed(42), np.random.seed(42), and random_state=42 for all relevant models and splitter initializations. The hyperparameter optimization process (RandomizedSearchCV followed by GridSearchCV) was computationally lightweight: on a standard laptop (Intel Core i7-12700H, 32 GB RAM), the full pipeline, including data preprocessing, feature engineering, two-stage tuning, model training, and SHAP computation, required approximately 3–5 minutes per model. No GPU acceleration or distributed computing was used. The exact versions of all dependencies are listed in the repository’s requirements.txt file, and the complete reproducible workflow (including seed setting, data loading, preprocessing steps, model training, evaluation, and visualization) is provided in Jupyter notebooks at https://github.com/BehradEmadi.
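The two-stage tuning strategy described above can be sketched with scikit-learn as follows. This is a minimal illustration and not the authors' notebook: the data are synthetic stand-ins for the crack width database, the search ranges are hypothetical (the actual ranges are those of Tables 3 and 4), and only the Random Forest model is shown. The same pattern would apply to an XGBoost regressor.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

SEED = 42  # fixed seed, as stated in the text

# Synthetic stand-in data (NOT the crack width database): 200 samples, 4 features.
rng = np.random.default_rng(SEED)
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 0.4 * X[:, 0] + 0.2 * X[:, 1] ** 2 + rng.normal(0.0, 0.02, size=200)

# 80/20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED
)

# Stage 1: randomized exploration of a broad (hypothetical) search space.
coarse_space = {
    "n_estimators": [50, 100, 150],
    "max_depth": [3, 5, 8, None],
    "min_samples_leaf": [1, 2, 4],
}
coarse = RandomizedSearchCV(
    RandomForestRegressor(random_state=SEED),
    param_distributions=coarse_space,
    n_iter=5,
    cv=3,
    scoring="r2",
    random_state=SEED,
)
coarse.fit(X_train, y_train)

# Stage 2: grid refinement in a small neighbourhood of the stage-1 optimum.
best = coarse.best_params_
leaf = best["min_samples_leaf"]
fine_grid = {
    "n_estimators": [best["n_estimators"]],
    "max_depth": [best["max_depth"]],
    "min_samples_leaf": sorted({max(1, leaf - 1), leaf, leaf + 1}),
}
fine = GridSearchCV(
    RandomForestRegressor(random_state=SEED), fine_grid, cv=3, scoring="r2"
)
fine.fit(X_train, y_train)

# Held-out performance of the refined model.
test_r2 = fine.best_estimator_.score(X_test, y_test)
```

Because the refinement grid contains the stage-1 optimum and both stages use the same folds and seed, the cross-validated score of the second stage cannot fall below that of the first.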
Table 3. Hyperparameter search space for Random Forest Regressor
Table 4. Hyperparameter search space for XGBoost Regressor
Model performance was quantified using two standard regression metrics: the coefficient of determination (R²) and the root mean square error (RMSE). These metrics were computed directly from the model predictions and the experimental crack width measurements:
\[ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{8} \]
where 𝑦𝑖 denotes the experimentally measured crack width for the 𝑖-th observation and 𝑦̂𝑖 represents the corresponding predicted value obtained from the machine learning model. The term 𝑦̅ refers to the arithmetic mean of all measured crack widths in the dataset and serves as a baseline representing the average structural response. The total number of samples is denoted by 𝑛. The use of the RMSE enables interpretation of the prediction accuracy in physical units (mm), while R² provides a normalized measure of explained variance.
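Both metrics can be computed in a few lines of standard-library Python from their usual definitions (R² as one minus the ratio of the residual to the total sum of squares, RMSE as the square root of the mean squared residual). The sketch below is illustrative only; the four measured and predicted crack widths are hypothetical values, not data from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target (mm here)."""
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return math.sqrt(ss_res / len(y_true))


# Hypothetical crack widths in mm (for illustration only).
measured = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

print(f"R2   = {r_squared(measured, predicted):.3f}")
print(f"RMSE = {rmse(measured, predicted):.4f} mm")
```

A model that always predicts the mean measured crack width gives R² = 0, which is why the mean serves as the baseline in the definition.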
Both the RF and XGBoost models provide intrinsic measures of feature importance. In these importance rankings, the prestressing level, effective depth, applied moment, and section stiffness were consistently identified as dominant predictors, a finding aligned with established structural mechanics principles. The feature importance outputs were generated directly as follows:
• RF: best_model.feature_importances_(bar charts and cumulative curves)
• XGB: internal gain-based importance via plot_importance()
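As an illustration of the RF output listed above, the sketch below ranks the impurity-based importances exposed by feature_importances_ and accumulates them into a cumulative curve. It assumes synthetic data from make_regression and ASCII feature labels chosen here for readability, so the resulting ranking carries no physical meaning; only the mechanics of the calculation are shown.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Labels mirror the twelve inputs named in the text; the names are illustrative.
FEATURES = ["b", "h", "L_span", "f_cm", "A_p", "f_pk",
            "PI", "M", "LR", "I", "d", "L_over_h"]

# Synthetic stand-in data (NOT the experimental database).
X, y = make_regression(
    n_samples=200, n_features=len(FEATURES), n_informative=5,
    noise=5.0, random_state=42,
)
best_model = RandomForestRegressor(n_estimators=200, random_state=42).fit(X, y)

# Impurity-based importances are normalized to sum to 1 across features.
ranked = sorted(
    zip(FEATURES, best_model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

# Cumulative contribution, i.e. the quantity behind a cumulative-importance curve.
cumulative = []
running = 0.0
for name, importance in ranked:
    running += importance
    cumulative.append((name, importance, running))
    print(f"{name:10s} {importance:.3f} {running:.3f}")
```

Reading the last column answers questions such as how many top-ranked features are needed to reach a given share of the total importance.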
The combined modeling and interpretability framework offers two key advantages:
Mechanics-Consistent Predictive Behavior
The dominance of physically meaningful features indicates that the models capture behavior aligned with structural design theory rather than spurious correlations.
Transparent and Actionable Outputs for Engineers
By linking predictions to interpretable feature contributions, the framework supports engineering judgment, expands the applicability of AI tools in structural assessment, and helps connect data-driven methods and established mechanical models.
3 Results
This section presents the outcomes of the machine learning framework used to predict the maximum crack width in prestressed concrete beams. Two ensemble models, the RF and XGB models, were evaluated through accuracy metrics, visual diagnostics, and feature importance analysis.
3.1 Model Training and Performance Evaluation
In the Random Forest model, the 12 engineered input features are passed through 200 parallel decision trees, each trained on a bootstrap sample, and the tree predictions are averaged. The XGBoost model instead builds its trees sequentially, with each boosting round refining the ensemble to reduce the remaining error. The influence of ensemble size on the RF model was examined by varying the number of decision trees (n_estimators). Figure 2 shows that model accuracy changes rapidly for ensembles of up to approximately 200 trees, after which it plateaus. The highest R² occurs for small ensembles (around 50 trees), but performance becomes more stable for 200 trees and above; therefore, 200 trees were selected as a compromise between accuracy, stability, and computational cost.
Figure 2. Effect of the number of trees (n_estimators) on Random Forest model accuracy (R²)
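An ensemble-size study of this kind can be sketched as a simple sweep over n_estimators, recording the held-out R² at each size. The example assumes synthetic data and an illustrative list of tree counts; it does not reproduce the curve reported for the experimental database.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the experimental database).
X, y = make_regression(n_samples=300, n_features=12, noise=20.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

tree_counts = [10, 25, 50, 100, 200, 400]  # illustrative ensemble sizes
scores = {}
for n_trees in tree_counts:
    model = RandomForestRegressor(n_estimators=n_trees, random_state=42)
    model.fit(X_tr, y_tr)
    scores[n_trees] = model.score(X_te, y_te)  # R² on the held-out split

for n_trees, r2 in scores.items():
    print(f"n_estimators={n_trees:4d}  R2={r2:.4f}")
```

On a single split the curve can peak at a small ensemble by chance, so repeating the sweep over several seeds or cross-validation folds gives a better view of stability than one maximum.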
The XGBoost learning behavior is illustrated by the training curve in Figure 3. As previously described, the training RMSE drops rapidly during early iterations, while the test RMSE converges smoothly, indicating controlled model complexity and no evident overfitting.
Figure 3. XGBoost training curve showing train and test RMSE across boosting rounds
3.2 Predictive Performance of the Models
The measured-versus-predicted plot for RF (Figure 4a) shows that most observations align closely with the ideal 1:1 line, particularly in the serviceability crack-width range (0–0.4 mm). A few points with large crack widths (𝑊𝑚𝑎𝑥 > 1.0 mm) exhibit underprediction, consistent with the limited representation of high-crack specimens in the dataset. The RF residual plot (Figure 4b) shows that residuals are tightly centered around zero for moderate crack widths but exhibit several positive spikes for larger values, again reflecting dataset imbalance.
Figure 4. Measured versus predicted crack width (𝑊𝑚𝑎𝑥) and residuals for the Random Forest model
The distribution of RF prediction errors (Figure 5) shows a dense central peak around zero, with most errors falling between −0.05 mm and +0.10 mm. Only a small number of residuals exceed 0.3 mm, indicating that significant deviations are isolated and not systemic.
Figure 5. Distribution of prediction errors for the Random Forest model
The measured-versus-predicted plot for XGBoost (Figure 6a) confirms a strong linear relationship between predicted and experimental crack widths. Most points lie close to the one-to-one dashed line, particularly within the common serviceability range (0.05–0.40 mm). The residual plot for XGBoost (Figure 6b) provides further insight. Two distinct behavioral zones are apparent:
• Serviceability range (0–0.4 mm): Residuals are tightly clustered around zero, demonstrating excellent predictive stability.
• Large crack widths: A few positive residuals appear, consistent with underprediction and reflecting the scarcity of such samples.
Figure 6. Measured versus predicted crack width (𝑊𝑚𝑎𝑥) and residuals for the XGBoost model
The error-distribution histogram (Figure 7) shows a sharply peaked distribution, with most residuals between −0.05 and 0.05 mm. Only a limited number of observations fall in the upper tail (residuals > 0.3 mm), confirming that large prediction errors are isolated rather than systematic.
Figure 7. Error-distribution histogram for XGBoost predictions
After hyperparameter tuning, both models achieve similar accuracy:
XGBoost: R²= 0.6507, RMSE= 0.184 mm
Random Forest: R²= 0.6306, RMSE= 0.1892 mm
XGBoost provides slightly higher accuracy and smoother residual patterns, although both models show strong predictive alignment with experimental data.
3.3 Feature Influence and Explainability Analysis
The RF feature-importance ranking (Figure 8a) identifies the applied bending moment as the most influential feature, followed by the load ratio, prestress index, concrete compressive strength, and effective depth. These variables correspond directly to the established mechanics governing crack formation. The cumulative contribution of RF features (Figure 8b) indicates that the top five variables account for more than 90% of the model’s predictive influence.
Figure 8. Feature importance analysis for the Random Forest model
A representative single decision tree (Figure 9) further illustrates how the RF partitions the input space, with early splits dominated by the moment, prestress index, and concrete strength.
Figure 9. Representative single decision tree extracted from the Random Forest ensemble
The feature-importance values for XGBoost (Figure 10) are likewise consistent with structural behavior: the applied moment is the dominant feature, and the load ratio and concrete strength also contribute strongly. The prestress index and prestressing steel area show moderate influence, and the effective depth has a secondary but meaningful influence. The consistency between RF and XGB strengthens confidence in the model interpretability and indicates that the algorithms rely on physically meaningful variables.
Figure 10. XGBoost feature-importance ranking based on total gain across trees
3.4 Comparative Evaluation of the Models
Both the RF and XGB models capture the essential nonlinear mechanisms governing crack width in prestressed beams, and both emphasize the same structural drivers, which supports the mechanics-informed feature engineering methodology used in this study. XGBoost delivered marginally superior predictive performance (R² = 0.6507, RMSE = 0.184 mm) compared with Random Forest (R² = 0.6306, RMSE = 0.1892 mm), especially in reducing residual scatter for mid-range crack widths. The choice between these ensemble methods nevertheless involves a deliberate tradeoff between accuracy, interpretability, and usability in design and assessment practice.
XGBoost's sequential boosting mechanism and finer regularization enable it to capture more nuanced nonlinear interactions in the data, yielding slightly tighter residuals and better handling of the moderate imbalance in the dataset toward small crack widths. However, this comes at the cost of increased model complexity: XGBoost requires more careful hyperparameter tuning (e.g., learning rate, subsample ratio, tree depth) and results in a larger, less intuitive ensemble structure.
In contrast, Random Forest's parallel bagging approach produces a simpler, more stable model with fewer tuning demands, an inherently lower risk of overfitting on small-to-medium structural datasets, and greater intrinsic interpretability. Its feature-importance scores (mean decrease in impurity) and the ability to inspect individual trees (as illustrated in Figure 9) make it easier for engineers to trace decision paths and communicate model reasoning to stakeholders, code committees, or forensic investigators. Both models benefit from post-hoc explainability tools such as SHAP, which provide comparable global and local insights (e.g., the dominant influence of bending moment, load ratio, and prestress index).
For serviceability-focused applications, where predictions support routine crack-width checks, sensitivity studies, or preliminary performance-based evaluations, Random Forest may offer a preferable balance of reliability, transparency, and ease of adoption by practicing engineers. XGBoost, on the other hand, is better suited when maximizing predictive precision is prioritized (e.g., in research-oriented calibrations of code provisions or when the model is integrated into automated monitoring pipelines with abundant validation data). Ultimately, the framework allows practitioners to select the model that best aligns with project-specific needs regarding accuracy requirements, computational resources, regulatory acceptance, and the necessity for transparent justification of predictions.
The framework is developed and validated for predicting maximum flexural crack width in prestressed and partially prestressed rectangular concrete beams with bonded tendons under quasi-static four-point bending (constant-moment region). It relies on a curated subset of the Eindhoven crack-width database (v2.0) and performs best for serviceability-level crack widths (≈0.05–0.4 mm). Crack widths > 1.0 mm are underpredicted because of data scarcity at the tail. The model is not validated for non-rectangular sections, unbonded/external prestressing, combined actions, cyclic/fatigue loading, long-term effects, or environmental degradation. Applications outside these bounds require further validation or data augmentation.
4 Discussion
The results of this study demonstrate that data-driven models, when combined with mechanics-informed feature engineering, can accurately predict maximum crack width in prestressed concrete beams. Both the Random Forest and XGBoost models achieved strong predictive performance, with the XGBoost model providing the best accuracy (R² = 0.6507, RMSE = 0.184 mm). These results confirm that the engineered features successfully captured the dominant mechanisms governing flexural cracking, an outcome consistent with decades of experimental and analytical research (e.g., tension-stiffening theory, prestress loss mechanics, and flexural stress distribution models). Beyond the predictive model itself, a key outcome of this study is the strong alignment between model explainability and fundamental structural mechanics.
To improve the interpretability of the machine-learning models, the Shapley Additive Explanations (SHAP) framework was employed to quantify the contribution of each input variable to the predicted crack width [28]. SHAP is a game-theoretic method that attributes a “fair share” of the model output to each feature by evaluating its marginal contribution across all possible combinations of predictors. This approach provides a transparent, mathematically consistent explanation of model behavior, even for complex nonlinear algorithms such as XGBoost [29]. SHAP values indicate whether a feature increased or decreased the predicted crack width and by how much, enabling direct comparison between variables on a unified scale. Because the method is based on additive feature attribution, the SHAP values of a prediction, added to the model’s base (expected) value, exactly reconstruct that prediction, allowing both global interpretability (overall feature importance) and local interpretability (case-specific reasoning). The use of SHAP is particularly valuable in the context of structural engineering, where model predictions must align with well-established mechanical principles.
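The additive attribution idea behind SHAP can be made concrete with a brute-force Shapley calculation in standard-library Python. This is a sketch under simplifying assumptions, not the SHAP library's tree algorithm and not the fitted XGBoost model: toy_model is a hypothetical three-feature function, and "absent" features are replaced by a single reference point (background) instead of being averaged over a background dataset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for predict(x); absent features take background values."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their actual value; the rest use the reference.
        z = [x[i] if i in subset else background[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical crack-width surrogate with one interaction term (illustrative)."""
    moment, load_ratio, prestress_index = z
    return (0.002 * moment + 0.3 * load_ratio
            - 0.4 * prestress_index + 0.001 * moment * load_ratio)


x = [120.0, 0.8, 0.5]          # hypothetical beam state to explain
background = [80.0, 0.5, 0.6]  # hypothetical reference state

phi = shapley_values(toy_model, x, background)
base_value = toy_model(background)

for name, contribution in zip(["M", "LR", "PI"], phi):
    print(f"{name}: {contribution:+.4f}")
print("base + sum(phi):", base_value + sum(phi), " prediction:", toy_model(x))
```

The last line shows the additivity property: the base value plus the attributions equals the prediction. The interaction term between moment and load ratio is shared between those two features, while the prestress index, which enters additively, receives exactly its own effect.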
The SHAP analysis performed in this study revealed that the XGBoost model relied primarily on physically meaningful variables (such as applied bending moment, load ratio, prestress index, and concrete compressive strength) to estimate crack width. The dependence plots demonstrated clear and intuitive relationships, with moment and load ratio exerting strong positive effects on crack formation, while higher prestressing levels and stronger concrete mixtures reduced cracking. These trends correspond closely to classical flexural theory and prior experimental research on prestressed beams, confirming that the model’s internal logic is consistent with structural mechanics. By bridging data-driven modeling with transparent explanation, SHAP enhances trust in the predictive framework and supports its potential use in design calibration, structural assessment, and performance-based evaluation.
Feature-importance results from both models identified the applied bending moment as the most influential predictor of crack width. This was further confirmed through SHAP analysis, where the SHAP summary plot (Figure 11a) and SHAP bar plot (Figure 11b) both ranked 𝑀 as the dominant factor, followed by load ratio, prestress index, and concrete compressive strength. These variables directly correspond to the theoretical drivers of cracking predicted by flexural theory, most notably through their influence on tensile-zone stress and crack propagation.
Figure 11. Result of SHAP analysis
The SHAP dependence plots provide deeper insights into the nonlinear interactions influencing crack formation:
• Load ratio dependence (Figure 12a) showed a clear monotonic increase in SHAP values as beams approached ultimate load levels. This finding aligns with prior work showing that crack width grows disproportionately once tension stiffening degrades near serviceability limits.
• Moment–crack interaction (Figure 12b) revealed a strong, nearly linear growth of SHAP value with applied moment, matching the theoretical relationship between bending stress and flexural cracking.
• Prestress index behavior (Figure 12c) showed that higher prestressing levels consistently reduce SHAP values, confirming their role in minimizing tensile stress and delaying crack initiation. This mirrors the observations of previous studies on bonded and unbonded prestressing systems.
• Concrete strength effects (Figure 12d) demonstrated that increases in 𝑓𝑐𝑚 values produce increasingly negative SHAP values, which is consistent with a stiffer tension zone and smaller crack openings.
Notably, SHAP also revealed interaction effects that are difficult to observe directly through classical sensitivity analyses. For example, the interaction between load ratio and prestress index indicated that beams with low prestressing experience disproportionately larger crack widths at high load levels—a trend corroborated by historical experimental studies but now quantified through data-driven modeling.
Figure 12. SHAP dependence plots
Replacing or supplementing raw inputs with dimensionless or mechanics-derived indices reduces multicollinearity, embeds domain knowledge directly into the feature space, and enables the ensemble models to more efficiently learn physically consistent nonlinear interactions, as evidenced by the high rankings of 𝑃𝐼, 𝑀, and 𝐿𝑅 in both gain-based importance and SHAP analyses. Both the RF and XGBoost models slightly underpredicted extreme crack widths (𝑊𝑚𝑎𝑥 > 1.0 mm). This behavior is attributable to dataset imbalance: few samples were representative of highly damaged states. Such phenomena are well documented in ML-based structural predictions, where rare events are associated with wider uncertainty bounds. Despite this limitation, the predictive accuracy for serviceability-level cracking, which is most relevant to design codes (Eurocode 2 [30], ACI 318 [31], fib Model Code [32]), remained excellent, with residuals tightly centered on zero.
Compared with using raw geometric inputs alone, these features improved both the performance and interpretability of the ML models. This finding supports emerging perspectives in the structural AI literature, emphasizing that domain-informed ML models outperform purely data-driven approaches and reduce the risk of spurious correlations. The findings of this study are consistent with those of prior experimental investigations showing that crack width is most sensitive to flexural demand, reinforcement position, concrete tensile strength, and prestressing force. However, the present work advances beyond traditional parametric studies in the following ways:
(1) Quantifying nonlinear interactions(via SHAP),
(2) Providing interpretable ML-driven rankings, and
(3) Introducing predictive models capable of being generalized across multiple test series.
These contributions position explainable AI as a promising complement, not a replacement, to the analytical crack width models used in design codes. The demonstrated ability of ML models to reflect physical crack behavior suggests their potential for improving serviceability predictions in existing code equations, assisting in reliability-based calibration of cracking limits, and supporting continuous condition assessment using sensor- or image-based measurements.
Overall, the findings suggest that explainable AI can serve as a powerful complement to existing analytical and code-based crack width models. By reflecting actual experimental behavior while maintaining transparent, mechanics-consistent reasoning, the proposed framework provides a practical tool that may support design calibration, structural assessment, and improved performance-based evaluation of prestressed concrete systems.
The transparent, mechanics-aligned insights provided by the proposed framework, particularly through SHAP dependence plots(Figure 12) and consistent feature rankings across both models, offer a promising pathway for informing future revisions and reliability-based calibrations of crack width provisions in design codes such as Eurocode 2, ACI 318, and the fib Model Code. For example, the dominant positive monotonic influence of the applied bending moment(𝑀) and load ratio(𝐿𝑅) on the predicted crack width, together with the clear mitigating effect of increases in the prestressing index(𝑃𝐼) and concrete compressive strength, quantitatively confirms and improves the long-standing theoretical understanding of flexural cracking in prestressed members. These relationships could support targeted adjustments to empirical coefficients in code equations(e.g., those governing the effective tension area, bond/transmission length, or partial prestressing reduction factors) by providing data-driven evidence of parameter sensitivity and interaction effects that are difficult to isolate in traditional parametric studies or limited experimental campaigns. Moreover, the observed nonlinearities, such as the disproportionate crack width increase at high load ratios once tension stiffening decreases, highlight regions where the current simplified linear or semiempirical formulations may be unconservative or overly conservative, especially for partially prestressed beams with low prestressing ratios. By quantifying the relative contribution of each governing variable(e.g., via mean absolute SHAP values or dependence trends), the framework enables objective, evidence-based calibrations of safety factors, crack width limits, or partial safety coefficients for serviceability, potentially leading to more economical yet reliable designs without compromising durability. 
While full code revision would require broader validation across diverse structural typologies, loading histories, and environmental conditions(as noted earlier), the present explainable models provide a reproducible, interpretable starting point for such efforts, bridging experimental databases, classic theory, and modern data-driven techniques to support performance-based and reliability-oriented updates to prestressed concrete serviceability provisions.
Future research should include expanding the experimental database, particularly for larger crack widths, and exploring model generalization across different prestressing layouts, shear–span ratios, and environmental conditions. Incorporating additional interpretable AI tools (e.g., partial dependence, counterfactuals) may further enhance structural insight and aid in identifying behavioral anomalies in unique or critical cases.
The explainable AI framework is conceptually adaptable owing to its mechanics-informed feature set, ensemble learning strategy, and transparent interpretability tools (feature importance and SHAP). Nevertheless, the present validation is restricted to the characteristics of the selected database subset, predominantly comprising records for rectangular beams subjected to controlled flexural loading in laboratory settings. Reliable generalization to other configurations faces several challenges: non-rectangular cross-sections would require the inclusion of additional geometric descriptors (e.g., flange-to-web ratios and transformed-section properties); alternative prestressing arrangements (unbonded tendons, multistrand eccentricity variations, and externally prestressed systems) may involve different bond and stress-transfer mechanisms not captured in the current features; diverse loading scenarios (cyclic, shear-dominant, combined actions, or sustained long-term loading) introduce time-dependent, hysteretic, or multiaxial effects absent from the training set; and environmental factors (corrosion-induced cracking and durability-related damage) necessitate explicit damage or time-varying proxies. While the physics-aligned features provide a promising basis for future transfer learning or domain adaptation, direct application to these unrepresented conditions is not advised without further validation and dataset augmentation. This highlights the framework’s current strength as a complementary tool for serviceability-focused assessments of conventional prestressed rectangular beams under quasi-static bending while underscoring the need for broader experimental coverage to expand its practical scope.
5 Conclusions
In this study, an explainable machine learning framework for predicting the maximum flexural crack width in prestressed concrete beams was developed. A curated experimental database of 404 specimens was integrated with mechanics-informed feature engineering (including the prestress index, load ratio, effective depth, and sectional stiffness) and transparent interpretability through random forest, XGBoost, and SHAP analyses.
Compared with traditional empirical and code-based methods (ACI 318, Eurocode 2, and fib Model Code), the framework introduces key innovations: direct embedding of structural mechanics into the feature space to capture nonlinear interactions; improved predictive accuracy within the serviceability range (XGBoost R² = 0.6507; RMSE = 0.184 mm); and full explainability, which supports the quantification of the influence and nonlinear effects of dominant parameters (bending moment, load ratio, prestress level, and concrete strength) in a way that classic formulas cannot. In practical terms, the framework provides a transparent, data-driven complement to existing design schemes, enabling rapid serviceability checks, parameter sensitivity studies, performance-based evaluations of existing structures, and evidence-based calibrations of crack width limits.
The model performs reliably for serviceability-level crack widths (approximately 0.05–0.4 mm) in rectangular-section beams with bonded tendons under quasi-static four-point bending. However, it underpredicted large crack widths (> 1.0 mm) because of data scarcity for extreme damage states and was not validated for non-rectangular sections, unbonded or external prestressing, cyclic or shear-dominant loading, long-term sustained effects, or environmental degradation (e.g., corrosion). These limitations restrict reliable application to severe damage scenarios, ultimate limit states, or deteriorated structures, where extrapolation may lead to unconservative or misleading results.
Beyond simply expanding the dataset, model performance could be improved through weighted or focal loss functions to better handle tail behavior, physics-informed constraints for the loss term, quantile regression for uncertainty quantification, transfer learning or domain adaptation for new configurations, and hybrid approaches that combine ML predictions with analytical or mechanical equations for improved robustness and trustworthiness.
Overall, the proposed mechanics-informed and explainable framework bridges experimental evidence, classical flexural theory, and modern artificial intelligence, offering a reliable tool for serviceability-focused assessment and the design of prestressed concrete beams while laying a strong foundation for future extensions to broader structural and environmental conditions.
Supplementary Materials: Supporting information can be accessed online. The refined experimental dataset, full preprocessing scripts, trained machine-learning models scripts, hyperparameter-tuning workflows, and SHAP explainability scripts used in this study are publicly available at: https://github.com/BehradEmadi
Author Contributions: Conceptualization, S.Emadi; methodology, S.Emadi; validation, S.Emadi and P.Lotfabadi; formal analysis, A.Farmanfarma; investigation, A.Farmanfarma; data curation, S.Emadi; writing—original draft preparation, S.Emadi; writing—review and editing, A.Farmanfarma; visualization, A.Farmanfarma; supervision, S.Emadi; project administration, S.Emadi. All authors have read and agreed to the published version of the manuscript.
Conflict of interest: The authors disclosed no relevant relationships.
Data availability statement: The data that support the findings of this study are available from the corresponding author, Ma, upon reasonable request.

Seyyedbehrad Emadi
D.Eng., Associate Professor at the Department of Civil Engineering, Universita Degli Studi di Salerno, Fisciano, Italy.
Research Direction: Application of Advanced Artificial Intelligence (AI) Techniques for Condition Assessment and Structural Health Monitoring of Civil Infrastructure.
Email: Semadi@unisa.it

Amir Hossein Farmanara Bozorgzad
D.Eng., Research Scientist at the Institute of Building Materials, Concrete Construction and Fire Safety, Technische Universität Braunschweig, Germany.
Research Direction: Numerical Modeling and Experimental Analysis of Concrete and Precast Concrete Structures.
Email: A.Farmanara@ibmb.tu-bs.de

Pooya Lotfabadi
D.Eng., Associate Professor at the Department of Architecture and Fine Arts, Final International University, Girne, Turkey.
Research Direction: Sustainable and Performance-Based Design, Environmental Systems, and Application of Artificial Intelligence in the Built Environment.
Email: pooya.lotfabadi@final.edu.tr

Haiying Ma
D.Eng., Associate Professor at the Department of Bridge Engineering, Tongji University.
Research Direction: Structural Failure and Safety Performance Evaluation, Innovative Materials and Systems, Intelligent Bridge Design and Methods.
Email: mahaiying@tongji.edu.cn