Machine learning assisted discovery of high-efficiency self-healing epoxy coating for corrosion protection


ABSTRACT Machine learning is a powerful means for the rapid development of high-performance functional materials. In this study, we present a machine learning workflow for predicting the


corrosion resistance of a self-healing epoxy coating containing ZIF-8@Ca microfillers. The orthogonal Latin square method was used to investigate the effects of the molecular weight of the


polyetheramine curing agent, molar ratio of polyetheramine to epoxy, molar content of the hydrogen bond unit (UPy-D400), and mass content of the solid microfillers (ZIF-8@Ca microfillers) on


the low-frequency impedance modulus (lg|_Z_|0.01Hz) values of the scratched coatings, generating an initial dataset of 32 samples. The machine learning workflow was divided into two stages. In stage I, five


models were compared and the random forest (RF) model was selected for the active learning. After 5 cycles of active learning, the RF model achieved good prediction accuracy: coefficient of


determination (_R_²) = 0.709, mean absolute percentage error (MAPE) = 0.081, root mean square error (RMSE) = 0.685 (lg(Ω·cm²)). In stage II, the best coating formulation was identified by


Bayesian optimization. Finally, the electrochemical impedance spectroscopy (EIS) results showed that compared with the intact coating ((4.63 ± 2.08) × 10¹¹ Ω·cm²), the |_Z_|0.01Hz value of


the repaired coating was as high as (4.40 ± 2.04) × 10¹¹ Ω·cm². In addition, the repaired coating showed minimal corrosion and only 3.3% adhesion loss after 60 days of neutral salt spray testing.


INTRODUCTION Epoxy (EP) resin is widely used in the field of corrosion protection


because of its strong adhesion, high corrosion resistance, excellent mechanical properties and low cost. However, cracks may arise inside or at the surface of the EP matrix during


long-term service and reduce its corrosion protection performance with time, thus increasing potential safety hazards during its service life1. The application of self-healing coatings will


be the most common and cost-effective method of improving the corrosion protection and thus the durability of metallic structures. A wide range of engineering structures from vehicles to


aircraft, from factories to household equipment, can be effectively protected by self-healing coating systems. Recent efforts have focused on improving the durability of EP coatings in


the presence of damage by granting them self-healing functions, which can be realized through intrinsic repair of the material matrix by reversible covalent bonds2 and noncovalent bonds3,


or via extrinsic strategies depending on the release of healing agents4 and corrosion inhibitors5 into coating defects. In contrast to these extrinsic self-healing mechanisms, the intrinsic


one endows the coating with the ability to mimic natural systems and to be repaired repeatedly. Such mechanisms are typically based on reversible covalent bonds via disulfide bonds6,


Diels–Alder reactions7, and hydrazone bonds8, or non-covalent interactions via metal-ligand9 and hydrogen bonding10,11,12. Among these mechanisms, the most promising one is based on dynamic


hydrogen bonds because of their high reversibility and mild repair conditions, in combination with their directional and tunable self-association properties13. As an indication of the


self-healing ability of the coating, the low-frequency impedance modulus, for example the value measured by electrochemical impedance spectroscopy (EIS) at 0.01 Hz (|_Z_|0.01Hz), has been


extensively used to estimate the overall corrosion resistance of the test area14,15. A higher |_Z_|0.01Hz value represents a higher barrier ability of the coating. Based on the previous


studies16, in our view an ideal self-healing corrosion protective coating should meet the following main criteria: (1) the |_Z_|0.01Hz value of the self-healed coating is


close to that of the intact coating; (2) excellent barrier ability, with a |_Z_|0.01Hz value above 10¹⁰ Ω·cm²; (3) long-term stability in corrosive environments both before and after repair.


For example, in a previous work by our group11, an intrinsic self-healing EP coating was developed by grafting 2-ureido-4[1H]-pyrimidinone (UPy) as a quadruple hydrogen bonding unit onto the


backbones of an EP matrix. The UPy/EP coating demonstrated highly efficient self-healing functionality within 5 min in 3.5 wt.% NaCl solution. The self-healed coating still had a high


|_Z_|0.01Hz value of 4.8 × 10¹⁰ Ω·cm² even after 60 days of immersion in NaCl solution. Often, the achievement of the target performance of self-healing implies synergy between multiple


components of the EP coating formulation, including different resins, curing agents, liquid/solid additives, etc. The conventional trial-and-error design strategy for coating formulation is


time-consuming and labor-intensive. Recently, machine learning methods have been shown to be a promising option for materials design and optimization, especially for systems with complex


properties or compositions17,18,19,20,21. For example, Haik et al.22 developed a machine learning model to predict the stress relaxation properties of EP matrix composites, based on a


three-layer neural network model using initial stress, test temperature and operating time as input variables and stress relaxation behavior as output. The final model was obtained by


training 9000 experimental data samples. This model can predict efficiently the time-dependent mechanical behavior of a viscoelastic or a viscoplastic material. Kan et al.23 constructed a


molecular recognition model for predicting 2000 molecular descriptors from chemical structures using a gated graph neural network, and extracted 32-dimensional vectors representing 2000


molecular descriptors through the molecular recognition model to complete the dimension reduction. This 32-dimensional vector was used as the input value for the next Gaussian regression,


and the machine learning model for predicting electrical conductivity was finally built by training on a large amount of data. Typically, establishing an accurate machine learning model


requires vast training data, which are difficult to obtain for polymer resin formulations considering the heavy experimental workload in synthesis and characterization24,25. Therefore,


machine learning methods that can work with small-sample datasets have major implications for polymer design. The problem of machine learning under small sample


data conditions (<1000 samples) has received much attention in recent years26,27. For the processing of small sample data, the most common methods are the neural-network-based methods28,


hierarchical machine learning29 and active-learning-based methods30. For instance, Li et al.31 proposed a model combining nearest neighbor interpolation (NNI), synthetic minority


oversampling technique (SMOTE) and extreme gradient boosting (XGBoost) models to predict the abrasion of rubber composites with small samples. NNI and SMOTE are two classical models in image


processing that aim at increasing the sample size and solving the problem of sample unevenness. Combining these two models, the original dataset was expanded from 23 to 710 samples.


Finally, the abrasion was predicted by the XGBoost model to yield a better prediction accuracy (MSE = 0.001). Similarly, active learning has been applied to discover EP adhesive strength30,


polymer molecular dynamics32 and high-_T_g polymers33,34, among others, starting from small initial datasets. Herein, we employed a machine learning framework to develop self-healing composite


coatings for corrosion protection applications. A flowchart of the machine learning workflow is shown in Fig. 1. In the machine learning framework, active learning and Bayesian optimization


were used to model and maximize the common logarithm of the low-frequency impedance modulus (lg|_Z_|0.01Hz) obtained from EIS measurements for various scratched self-healing EP composite coatings, with the aim of


improving their self-healing properties. The coating formulation consists of an EP resin, polyetheramines, amino-terminated urea-pyrimidinone monomers (UPy-D400) and ZIF-8@Ca microfillers. The EP


resin mixed with polyetheramine can react to form an EP-based polymer, and the UPy-D400 acts as a quadruple hydrogen bonding unit that can be grafted into the EP network to provide a


self-healing function for the EP polymer via the self-association process. The ZIF-8@Ca microfiller, which is an empty CaCO3 microcontainer with ZIF-8 nanoparticles assembled on


the surface, is incorporated as a model filler that can not only enhance the barrier property of EP coating, but also present a pH-sensitive response to release loaded substance (e.g.,


inhibitors) to achieve useful functions. For the machine learning process, four variables (the molecular weight of polyetheramine, the molar ratio of polyetheramine to EP, the UPy-D400


content, and the ZIF-8@Ca content) were used as input, and the lg|_Z_|0.01Hz value of the scratched coatings was used as output; an initial dataset of 32 samples was obtained from the preliminary experiment.


Among the five common models, the model with the best accuracy was selected, and trained to achieve the best accuracy by active learning. Subsequently, the Bayesian optimization method was


used to search for the scratched self-healing EP composite coating with an extremely high lg|_Z_|0.01Hz value. Finally, the self-healing and corrosion protective properties of the optimal


coating were verified by EIS and salt spray testing. RESULTS AND DISCUSSION EXPERIMENTAL RESULTS FROM THE INITIAL DATASET As seen in Table 1, four parameters with four initial condition


levels were set (total experimental conditions = 4⁴ = 256 sets). The four parameter variables were the molecular weight of polyetheramine, the molar ratio of polyetheramine to EP, the molar


content of UPy-D400, and mass content of the ZIF-8@Ca microfillers. An initial 32 sets of experimental conditions were extracted from the 256 sets by orthogonal Latin square design method35.
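The size of the candidate space can be checked with a short enumeration. The following is only a minimal sketch: the four molecular weights are those listed in the Materials section, whereas the level values for the other three variables are hypothetical placeholders (Table 1 is not reproduced here), and the orthogonal selection of the 32 runs itself is not implemented.

```python
from itertools import product

# Four variables with four levels each. Only the molecular weights come from
# the text (Materials section); all other level values are HYPOTHETICAL.
levels = {
    "polyetheramine_mw_g_per_mol": [230, 400, 2000, 4000],  # from the text
    "amine_to_epoxy_ratio_r": [0.70, 0.80, 0.90, 1.00],     # hypothetical
    "upy_d400_mol_percent": [0, 10, 20, 30],                # hypothetical
    "zif8_ca_wt_percent": [0.0, 2.5, 5.0, 10.0],            # hypothetical
}

# Full factorial design: every combination of one level per variable.
full_factorial = [dict(zip(levels, combo)) for combo in product(*levels.values())]

n_initial = 32  # number of conditions actually measured in the initial round
fraction = n_initial / len(full_factorial)

print(len(full_factorial))  # number of candidate formulations
print(fraction)             # share of the space covered by the initial design
```

Under these assumptions the initial design samples one eighth of the full factorial space, which is why the choice of well-spread points matters.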


The orthogonal design is a method based on mathematical statistics and the orthogonality principle, which can achieve results equivalent to those of a large number of comprehensive tests with a minimum number


of tests. It selects a subset of points that represents the whole experiment according to the orthogonality of the experiments, and these selected points are uniformly distributed in the


whole space36,37. Then, the coatings were prepared for EIS measurements according to these 32 conditions, and the corresponding low-frequency impedance modulus (lg|_Z_|0.01Hz value) of the different


scratched coatings was obtained. The lg|_Z_|0.01Hz value, rather than the |_Z_|0.01Hz value itself, was selected as the output to reduce the undesirable effects of a sample


dataset whose values span several orders of magnitude. The measured lg|_Z_|0.01Hz values of the scratched coatings that comprise our initial dataset are reported in Table 2. Figure 2 shows the


distribution of lg|_Z_|0.01Hz experimental values. As shown in Fig. 2, the average lg|_Z_|0.01Hz experimental values were widely distributed in the range of 4.75–10.87 (lg(Ω·cm²)). According


to a previous experimental study11, the scratched coatings with different self-healing abilities are involved in this distribution, indicating that the selection of the initial preparation


conditions using the orthogonal Latin square method is reasonable. ASSESSMENT AND SELECTION OF AN LG|_Z_|0.01HZ VALUES PREDICTION MODEL Next, the experimental conditions and


the corresponding lg|_Z_|0.01Hz values of the scratched coatings were used as the input and output of the machine learning process, respectively, and five common machine learning models were trained


on the 32 initial data points. A comparison of the predicted and measured lg|_Z_|0.01Hz values for each model is shown in Fig. 3a–e. The black dashed straight line indicates equal measured and


predicted values. A comparison of the accuracy of each model is shown in Fig. 3f. Compared with the other models, the RF model yielded the best accuracy in terms of a higher coefficient of


determination (_R_2) value, and lower mean absolute percentage error (MAPE) and root mean square error (RMSE) values. This may be due to its deeper layers of model structure than general


machine learning models; RF models also handle data with high variability well38,39. Hence, the RF model was chosen to predict the lg|_Z_|0.01Hz values in subsequent


steps. ACTIVE LEARNING AND MACHINE LEARNING MODEL PERFORMANCE For the active learning process, the RF model trained on the 32 initial data points first predicted the lg|_Z_|0.01Hz values of all remaining (256 – 32 = 224 sets)


experimental conditions. The predicted lg|_Z_|0.01Hz values were ranked in descending order. The five top-ranked experimental conditions from the 224 sets of


conditions were selected as proposals for subsequent measurements to be performed in the laboratory. These five measurements were added to the initial 32 data points. Then, the machine learning


model for the prediction of the lg|_Z_|0.01Hz values was trained again on this improved (32 + 5) dataset. The new measurements were re-used in the RF model to improve the accuracy, as this


can enhance the prediction accuracy for high-target performance samples in a targeted manner and improve the active learning efficiency. This process, from the prediction phase to the reuse


phase, represents one cycle of active learning (see Table 3). This active learning process is repeated until the preliminary goal of the best accuracy of the machine learning model is


achieved. In this study, the active learning cycle was stopped once all the evaluation indices (MAPE, RMSE and _R_²) stopped improving. Figures 4a–g present scatter plots of the predicted


versus measured lg|_Z_|0.01Hz values from the initial dataset to the last cycle. The blue and red dots indicate existing and new measurements, respectively. The evolution of the


corresponding _R_2, MAPE and RMSE values for each cycle is summarized in Fig. 4h, i. As shown in Figs. 4a–g, the predicted and measured values gradually approached the black dashed straight


line from the initial dataset to the last cycle, indicating that an increase in the dataset size resulted in predicted lg|_Z_|0.01Hz values that are closer to measured lg|_Z_|0.01Hz values.
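The agreement between predicted and measured values can be quantified with the three indices used in this work (_R_², MAPE and RMSE) under 10-fold cross-validation. The snippet below is a minimal sketch with scikit-learn, not the authors' script: the data are synthetic stand-ins, the forest settings are arbitrary, and the metrics are computed on pooled out-of-fold predictions (an assumption made for brevity; averaging per-fold scores is an equally common choice).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import (
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)
from sklearn.model_selection import KFold, cross_val_predict

# SYNTHETIC stand-in data: 62 samples, four formulation variables, and a
# made-up response playing the role of lg|Z| at 0.01 Hz.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(62, 4))
y = 5.0 + 6.0 * X[:, 1] - 2.0 * (X[:, 2] - 0.5) ** 2 + rng.normal(0.0, 0.3, 62)

# 10-fold cross-validation: each sample is predicted by a model that was
# trained on the other nine folds.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)

r2 = r2_score(y, y_pred)
mape = mean_absolute_percentage_error(y, y_pred)  # a fraction, not a percent
rmse = float(np.sqrt(mean_squared_error(y, y_pred)))

print(f"R2 = {r2:.3f}, MAPE = {mape:.3f}, RMSE = {rmse:.3f}")
```

Because the target is already a logarithm, RMSE is expressed in lg units, so an RMSE of about 0.7 corresponds to roughly a factor of five in the impedance modulus itself.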


As the dataset size increased, _R_² clearly increased, and the MAPE and RMSE decreased gradually. After five active learning cycles, the _R_², MAPE and RMSE values plateaued, and


the active learning process was terminated at this point. For the dataset of 62 samples, the RF model achieved _R_², MAPE and RMSE values of 0.709, 0.081 and 0.685 (lg(Ω·cm²)), respectively.


Compared to the accuracy of the initial dataset, improvements of 246%, 51% and 47% were achieved for _R_2, MAPE, and RMSE, respectively. In this case, _R_2 was greater than 0.7 and both MAPE


and RMSE were stabilized at a low level, indicating that the RF model reached acceptable accuracy. Therefore, the active learning procedure was stopped at this stage and the RF model was


fixed based on the existing dataset. In addition, Table 3 lists the top-five proposed experiments for the five cycles of active learning with the corresponding predicted and measured


lg|_Z_|0.01Hz values. Several measured lg|_Z_|0.01Hz values in Table 3 exceeded 11.00 (lg(Ω·cm²)), above the highest value in the initial dataset (10.87), showing that


the RF model allowed us to predict the experimental conditions of the coating with a potentially high self-healing ability. These additional data on high-performance self-healing coatings


are beneficial for further maximization using Bayesian optimization. In addition, the proposed experiments required polyetheramine of molecular weights 400 and 2000 g·mol–1, with an _r_


value greater than 0.85, 10-20 mol% of UPy-D400, and ZIF-8@Ca microfiller content in the full range. This provided the main guidance for refining the test conditions in the subsequent step.
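One cycle of the active learning procedure described in this section (predict all unmeasured conditions, rank them in descending order, measure the top five, retrain) can be sketched as follows. This is an illustrative outline only: `measure` is a hypothetical stand-in for the laboratory EIS measurement, the 256 candidate conditions are random placeholders, and the forest settings are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def propose_top_k(model, X_pool, k=5):
    """Indices (into X_pool) of the k candidates with the highest prediction."""
    preds = model.predict(X_pool)
    return np.argsort(preds)[::-1][:k]


def measure(X):
    """HYPOTHETICAL stand-in for an EIS measurement of lg|Z| at 0.01 Hz."""
    return 5.0 + 6.0 * X[:, 1] - 4.0 * (X[:, 2] - 0.5) ** 2


rng = np.random.default_rng(0)
X_all = rng.uniform(0.0, 1.0, size=(256, 4))  # placeholder for 256 conditions
labeled = list(range(32))                     # placeholder initial design
pool = [i for i in range(256) if i not in labeled]
y = {i: float(measure(X_all[[i]])[0]) for i in labeled}

n_cycles, k = 5, 5
for cycle in range(n_cycles):
    # Retrain on everything measured so far.
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_all[labeled], [y[i] for i in labeled])

    # Rank the unmeasured conditions and take the top k as proposals.
    chosen = [pool[j] for j in propose_top_k(model, X_all[pool], k=k)]

    # "Measure" the proposals and move them from the pool to the training set.
    for i in chosen:
        y[i] = float(measure(X_all[[i]])[0])
    labeled.extend(chosen)
    pool = [i for i in pool if i not in chosen]

print(len(labeled), len(pool))
```

Selecting purely by predicted value is a greedy strategy: it concentrates new measurements in the high-performance region, which matches the stated aim of improving accuracy for high-target samples, at the cost of exploring less of the space.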


BAYESIAN OPTIMIZATION FOR SCREENING OPTIMAL CANDIDATE In this step, three experimental conditions were refined: _r_ values, molar ratio of UPy-D400, and microfiller content were varied from


0.85 to 1.00, 10 to 20 mol%, and 5.5 to 10.0 wt.%, by increments of 0.1, 1 mol%, and 0.1 wt.%, respectively. The molecular weights of the polyetheramine curing agents were fixed at 400 and


2000 g·mol–1. This search space for the coating formulation is vast, and the machine learning model has limited utility if it does not incorporate uncertainty and the expected


improvement process. Since a machine learning model is built using a limited amount of training data, the selection of candidates using that model may be limited to a local search.
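To illustrate how prediction uncertainty can enter candidate selection, the sketch below computes the expected-improvement score of Eqs. (1)-(2) from a random forest. It is a minimal example under stated assumptions rather than the implementation used in this work: the standard deviation is estimated from the spread of the individual trees (the text does not state how it was obtained), the data are synthetic, and `expected_improvement` and `forest_mean_std` are hypothetical helper names.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor


def expected_improvement(mu, sigma, f_best, eps=0.01):
    """EI = sigma * (z * Phi(z) + phi(z)), with z = (mu - f_best - eps) / sigma."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Limit for sigma -> 0: the improvement is deterministic.
    ei = np.maximum(mu - f_best - eps, 0.0)
    ok = sigma > 0
    z = (mu[ok] - f_best - eps) / sigma[ok]
    ei[ok] = sigma[ok] * (z * norm.cdf(z) + norm.pdf(z))
    return ei


def forest_mean_std(forest, X):
    """ASSUMPTION: use the spread of per-tree predictions as the uncertainty."""
    per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0)


# SYNTHETIC training data and candidate grid, for illustration only.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(40, 3))
y_train = 8.0 + 3.0 * X_train[:, 0] - 2.0 * (X_train[:, 1] - 0.5) ** 2
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

X_cand = rng.uniform(0.0, 1.0, size=(500, 3))
mu, sigma = forest_mean_std(forest, X_cand)
ei = expected_improvement(mu, sigma, f_best=y_train.max())

best_candidate = X_cand[np.argmax(ei)]  # candidate that would be ranked first
print(best_candidate)
```

A candidate can score well either because its predicted mean is high or because the trees disagree about it, which is the exploitation and exploration balance that a plain ranking by predicted value lacks.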


Therefore, we speculate that Bayesian optimization may give better results because this optimization technique considers the uncertainty of the prediction and the balance between local and


global search40. Bayesian optimization works on a surrogate model and evaluates a utility function41. The utility function uses the mean and standard deviation of the candidates estimated by


the surrogate model. The utility function encodes a trade-off between the exploitation (candidate searching at points with high mean) and exploration (candidate searching at points with


high uncertainty). Herein, we used RF as the surrogate model and expected improvement (EI) as the utility function. The EI is defined by Eqs. (1)-(2)42:

$${\rm{EI}}(x)=\sigma (x)\,[z\,\Phi (z)+\phi (z)]$$ (1)

$$z=\frac{\mu (x)-f({x}^{+})-\varepsilon }{\sigma (x)}$$ (2)

where EI(_x_) represents the expected improvement value for each coating formulation candidate, _μ_(_x_) and _σ_(_x_) are the predicted output and standard deviation of the candidate obtained from the surrogate model, and _f_(_x_+) is the maximum value of the target material property observed in the training data set. _Φ_ and _ϕ_ are the cumulative distribution function and the probability density function of the standard normal distribution, respectively, assuming that the target property values follow a normal distribution. The term _ε_ regulates the amount of exploration: the higher the value of _ε_, the more exploration. In this


method, the largest EI value represents the most promising coating formulation candidate. Here, we used 1000 iterations for the BO run, as this was sufficient to predict the optimal


experimental conditions with high accuracy (see Data Availability section for where to access this code), and a series of experiments were conducted starting from rank 1 (Table 4). The new


highest lg|_Z_|0.01Hz value of 11.58 ± 0.28 (lg(Ω·cm²)) was observed, that is, (4.40 ± 2.04) × 10¹¹ Ω·cm². This impedance modulus was considerably higher than those reported in


previous studies on EP-based self-healing coatings11,43,44,45,46, which reported typical lg|_Z_|0.01Hz values in the range of 7.48–10.68 (lg(Ω·cm²)). The suggested experimental conditions from


Bayesian optimization showed that a relatively low molecular weight of polyetheramine and a high molar ratio of polyetheramine to EP were promising conditions for achieving a high


lg|_Z_|0.01Hz value, whereas the molar ratio of UPy-D400 and microfillers content should be in the middle of their defined range. According to previous studies47,48, excessive amine addition


improves the shape recovery rate of EP materials. The intrinsic self-repair process mentioned in this study is realized by a self-healing unit (hydrogen bond) self-association process on


the premise that the damage can be physically closed. A high shape recovery rate is beneficial for the physical closure of scratched material surfaces11. Excess amine (excessive _r_ value)


leads to higher flexibility but lower mechanical strength of EP materials47; an optimum combination of high strength and good flexibility can be achieved by adjusting the _r_ value precisely


through Bayesian optimization. The introduction of self-healing units and microfillers may also affect various performance indicators of the coatings; Bayesian optimization can balance all addition


amounts simultaneously to achieve a reasonable design for the target property. Figure 5 shows the distribution of lg|_Z_|0.01Hz values of scratched coatings from the initial dataset, after the


five active learning cycles, and after a Bayesian optimization process. The lg|_Z_|0.01Hz values from the initial dataset were spread randomly from 4.75 to 10.87 (lg(Ω·cm²)). By comparison,


all samples proposed during the active learning cycles exhibited a high lg|_Z_|0.01Hz value (>8.23 (lg(Ω·cm²))), and one sample from the Bayesian optimization dataset showed an exceptionally


high lg|_Z_|0.01Hz value. These results demonstrate the potential of our machine learning framework for the design and optimization of high-performance functional materials based on small


sample conditions. INTERPRETATION OF MACHINE LEARNING MODEL FOR COATING DESIGN EIS measurements were conducted on the scratched pure commercial EP and ZIF-8@Ca/EP coatings and their


corresponding intact coatings to study the self-healing and corrosion resistance properties. The ZIF-8@Ca/EP coating was prepared based on the best formulation selected by Bayesian


optimization. Nyquist and Bode plots of the intact coatings were obtained by EIS after 30 min of immersion in 3.5 wt.% NaCl solution (Fig. 6a–c). Figure 6d–i show the Nyquist and Bode plots


of the steels with scratched coatings after immersion for 1, 15, 30 and 60 d. The as-used pure EP coating was prepared by mixing E51 with D400 polyetheramine curing agents at a molar ratio


of 5:3. For the pure EP sample, the intact coating initially showed a high barrier property, with a large capacitive arc in the Nyquist plot (Fig. 6a) and a high |_Z_|0.01Hz value (3.98 ×


10¹⁰ Ω·cm²) in the Bode plot (Fig. 6b). The phase angles at high frequencies (10⁵ Hz) were close to –90°, which indicates the capacitive character of the coatings. Compared with the


intact pure EP coating, the intact ZIF-8@Ca/EP coating exhibited a slightly larger capacitive arc in the Nyquist plot, and its |_Z_|0.01Hz value rose to 3.82 × 10¹¹ Ω·cm², indicating


substantial improvement in the barrier property of the coating after the machine learning adjustment. The average and standard deviation of the |_Z_|0.01Hz value for intact coating were


calculated using six parallel samples, expressed as (4.63 ± 2.08) × 10¹¹ Ω·cm². In terms of the scratched coatings, the capacitive arcs of the pure EP coating shrank and the |_Z_|0.01Hz


values declined gradually over the entire immersion time, demonstrating the continuous deterioration of the barrier property (Figs. 6d–e). Subsequently, for the phase diagrams in Fig. 6f,


scratched pure EP showed two time constants: one related to the charge transfer process at the coating/substrate interface (10⁻²–10⁰ Hz), and the other related to the resistance increase by


means of corrosion product formation in the artificial defect (10¹–10⁵ Hz)49. In contrast to the Bode plots of the pure EP coating, those of the scratched ZIF-8@Ca/EP coating showed approximately


–45° straight lines with |_Z_|0.01Hz values in excess of 3.80 × 10¹¹ Ω·cm² at the beginning of immersion. The corresponding phase angles were –90° over the frequency range of 10⁻¹–10⁵ Hz.


This implies that during the immersion, a conductive pathway is not formed through the coating, which largely exhibits a capacitive behavior similar to that of an intact coating50. During


the 60 d of immersion, the |_Z_|0.01Hz values of the ZIF-8@Ca/EP coating only slightly decreased from 3.80 × 10¹¹ Ω·cm² to 1.23 × 10¹¹ Ω·cm², confirming that the scratched ZIF-8@Ca/EP


coating had been well repaired and possessed a satisfactory corrosion resistance. After scratching, the pure EP and ZIF-8@Ca/EP coatings were subjected to salt spray tests following the ASTM


B117/D1654 standard. Figures 7a and 7b show the optical images of the coatings after exposure to the salt spray chamber for different periods. According to the visual assessment in Fig. 7a,


green corrosion products were observed at the scratches of the pure EP coating within 1 d of the salt spray test. After 60 d, large-scale coating delamination and corrosion products


appeared in the scratched region, indicating that the scratched location of the pure EP coating was highly vulnerable to attack by corrosive species. In contrast, for the ZIF-8@Ca/EP coating only slight


scratch traces were observed at the scratched positions, and the coating did not show any signs of degradation (delamination, corrosion, or blistering) after 30 d (Fig. 7b).


Furthermore, as the salt spray exposure time increased to 60 d, only one slight corrosion spot was observed at the scratched site, indicating the corrosion of the scratched ZIF-8@Ca/EP


coating could be controlled in a salt spray environment for a long time. The adhesion strength, an important indicator of coating properties, can be measured using a pull-off test. Figure 7d


shows the adhesion strength/loss values of intact pure EP and ZIF-8@Ca/EP coating before and after the 60 d salt spray test. The optical images of the remaining coatings following the


pull-off test are presented in Fig. 7c. As shown in Fig. 7c, none of the samples exhibited cohesive failure. As shown in Fig. 7d, the dry adhesion strength of the ZIF-8@Ca/EP coating (9.82


MPa) is higher than that of pure EP (4.70 MPa). This is because the introduction of branched-chain amines and UPy units enhanced the hydrogen bonding between the coating and the metal


surface51. After salt spraying, the pure EP coating exhibited a considerable adhesion loss of 79.4% (0.97 MPa). In contrast, the ZIF-8@Ca/EP coating demonstrated not only the highest wet


adhesion strength (9.50 MPa) but also minimal adhesion loss (3.3%) after the 60 d salt spray test. In summary, a design-of-experiments technique combined with active learning and


Bayesian optimization was proposed to predict and optimize the lg|_Z_|0.01Hz values of scratched EP self-healing coatings composed of different molecular weights of polyetheramine curing


agent, molar ratios of polyetheramine to E51 EP resin, molar content of UPy-D400 and mass contents of ZIF-8@Ca microfillers. The active learning process yielded the preferred experimental


conditions to build a predictive RF model of lg|_Z_|0.01Hz values with satisfactory accuracy (_R_² = 0.709, MAPE = 0.081, RMSE = 0.685 (lg(Ω·cm²))) after five cycles of active learning.


Then, an extremely high lg|_Z_|0.01Hz value of 11.58 (|_Z_|0.01Hz = 3.80 × 10¹¹ Ω·cm²) was achieved using the experimental conditions that were refined by Bayesian optimization. As


confirmed by EIS, the ZIF-8@Ca/EP coating exhibited a strong healing effect in barrier property (intact sample: 3.82 × 10¹¹ Ω·cm², repaired sample: 3.80 × 10¹¹ Ω·cm²). In addition, in terms


of the corrosion resistance after repair, the ZIF-8@Ca/EP coating exhibited slight corrosion after 60 d of the salt spray test, and the adhesion loss of the composite coating after the salt


spray test was 3.3%, which was considerably lower than that of the pure EP coating (79.4%). METHODS MATERIALS Polyetheramine curing agents with four different molecular weights (230, 400,


2000 and 4000 g·mol–1) were sourced from the Aladdin Industrial Corporation. The E51 EP resin was sourced from Jiangsu Heli Resin Co., Ltd. The ZIF-8@Ca microfillers and the UPy-D400


monomers were obtained using previously published methods11,51. The Q235 mild steel was used as the substrate. PREPARATION OF COATINGS AND EIS TEST Based on the selected 32 experimental


conditions, the preparation process of the self-healing EP coating containing ZIF-8@Ca microfillers (ZIF-8@Ca/EP) is shown in Fig. 8. In each case, the ZIF-8@Ca microfillers were first mixed


with the E51 EP resin under magnetic stirring. The polyetheramine curing agent and UPy-D400 were then added to the mixture using a mechanical agitator at 500 rpm for 10 min. Prior to the


coating preparation, the steel specimens were wet-polished sequentially with 150-, 240- and 400-grit sandpapers, washed with ethanol and blow-dried in an N2 atmosphere. The resulting mixture


was applied to a steel piece using a bar coater. The coated samples were obtained by drying at room temperature for 48 h. The final thickness of each of the dry films was approximately 85


μm. EIS tests were performed to measure the low-frequency impedance (|_Z_|0.01Hz) values of the coated steel with/without an artificial scratch. All scratches for the EIS tests were


made with a scalpel in a reproducible manner. The EIS results were obtained using a 3.5 wt.% NaCl solution and a CHI-660E electrochemical workstation with a three-electrode cell system


comprising a coated steel substrate as a working electrode, a platinum plate electrode as a counter electrode and a saturated calomel electrode (SCE) as a reference electrode. The test


parameters were set to a frequency range of 10⁻²–10⁵ Hz with a root mean square amplitude of 0.02 V. Prior to EIS measurements, artificial through-coating scratches (approximately 3 mm in length and


approximately 60 µm in width) were made on the different coated steels using a scalpel. The measurements were conducted on the coated steels at least five times to ensure the reproducibility


of the EIS results. In EIS results, the |_Z_|0.01Hz value in the Bode plot usually represents the main performance index for the corrosion resistance of a coating, that is, a higher


|_Z_|0.01Hz value reflects a higher barrier property52. Therefore, this index was used to characterize the repair effect of the barrier properties of the coating after scratching. To further


verify the self-healing and long-term anticorrosion ability of the scratched composite coating obtained from the machine learning process, a salt spray test was performed by


exposing the samples to salt spray for 60 d in accordance with ASTM D1654. DATA PRE-PROCESSING, DATA SPLITTING AND MACHINE LEARNING MODELS Data pre-processing and data splitting were


performed and different machine learning models were implemented using the Python package scikit-learn (version 1.1.1). The four variable parameters (Table 4) in this study were standardized


to follow a standard Gaussian distribution with a mean of 0 and a variance of 1 (ref. 53). The purpose of such scaling is to bring the preprocessed variables onto a common scale (zero mean and unit variance here, rather than a fixed range such as [0,1] or


[–1,1]), thus eliminating the undesirable effects caused by a sample dataset with high variability. The validity and accuracy of all employed machine learning models were evaluated using


k-fold cross-validation. In this step, the data were randomly shuffled and divided into 10 groups. Nine groups were used for training, and the remaining group was used to


validate the model. The average value was obtained by repeating the same process 10 times. To quantify the performance of the model, the MAPE, RMSE and _R_² were used to


evaluate the k-fold cross-validation, using Eqs. (3)–(5) (refs. 54–56):

$$\mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\frac{|y_{i}-\hat{y}_{i}|}{|y_{i}|}$$ (3)

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{(y_{i}-\hat{y}_{i})}^{2}}$$ (4)

$$R^{2}=1-\frac{\sum_{i=1}^{n}{(y_{i}-\hat{y}_{i})}^{2}}{\sum_{i=1}^{n}{(y_{i}-\bar{y})}^{2}}$$ (5)

where \(n\) is the number


of samples, \({y}_{i}\) and \({\hat{y}}_{i}\) are the experimental and predicted values of the _i_th sample, respectively, and \(\bar{y}\) is the mean of the experimental values. The accuracy of the machine learning model was assessed using its MAPE (the MAPE is non-negative, and a value closer to 0 indicates greater accuracy57), RMSE (a lower value indicates greater accuracy30) and _R_2 (a value closer to 1 indicates greater accuracy; an _R_2 greater than 0.7 is taken to represent acceptable accuracy58). Five machine learning models were applied as regression tools to the


dataset: LR, ANN, SVR, DT and RF models. The machine learning methods are described in detail in the related reference59. The interested reader should refer to the Data Availability section
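As an illustration of the procedure described in this section, the following library-free Python sketch shuffles 32 samples into 10 folds, standardizes the input with statistics taken from the training folds, fits a one-variable least-squares line as a stand-in for the LR model, and scores the out-of-fold predictions with Eqs. (3)–(5). The data are synthetic, and the function names, the single input variable and the choice to fit the scaler on the training folds only are assumptions made for illustration; the study itself used scikit-learn.

```python
import math
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Least-squares fit of y = a + b*x (hypothetical stand-in for LR)."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b


def mape(y_true, y_pred):
    """Eq. (3): mean absolute percentage error (requires non-zero y_true)."""
    return sum(abs(y - p) / abs(y) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Eq. (4): root mean square error, in the units of y."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Eq. (5): coefficient of determination."""
    y_bar = statistics.fmean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic data for illustration only (not measurements from this study).
n = 32
x_raw = [float(i) for i in range(n)]
y = [8.0 + 0.1 * xi + 0.05 * ((i % 3) - 1) for i, xi in enumerate(x_raw)]

y_pred = [0.0] * n
for val_idx in k_fold_indices(n, k=10):
    held_out = set(val_idx)
    train_idx = [i for i in range(n) if i not in held_out]
    x_train_raw = [x_raw[i] for i in train_idx]
    # Assumption: the scaler is fitted on the nine training folds only.
    mu = statistics.fmean(x_train_raw)
    sigma = statistics.pstdev(x_train_raw)
    x_train = [(v - mu) / sigma for v in x_train_raw]
    a, b = fit_line(x_train, [y[i] for i in train_idx])
    # Predict the held-out group with the scaler and model from training.
    for i in val_idx:
        y_pred[i] = a + b * (x_raw[i] - mu) / sigma

print(f"MAPE={mape(y, y_pred):.3f}  RMSE={rmse(y, y_pred):.3f}  R2={r2(y, y_pred):.3f}")
```

The sketch pools the out-of-fold predictions before scoring them. Averaging the three metrics over the 10 folds, as described in the text, is an equally common convention and may give slightly different numbers.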


for access to the code used to run these algorithms.

BAYESIAN OPTIMIZATION

Bayesian optimization40 was used to determine the highest lg|_Z_|0.01Hz values by refining the variable conditions from Table 1. Bayesian optimization was performed using the Python package GPyOpt.

DATA AVAILABILITY

The source code for this article is publicly available at


https://github.com/lt1037870521/manuscript-code-EP-Lt.

REFERENCES

* He, Y. et al. Micro-crack behavior of carbon fiber reinforced Fe3O4/graphene oxide modified epoxy composites for cryogenic application. _Compos. Part A Appl. Sci. Manuf._ 108, 12–22 (2018).
* Huang, S. et al. An overview of dynamic covalent bonds in polymer material and their applications. _Eur. Polym. J._ 141, 110094 (2020).
* Utrera-Barrios, S., Verdejo, R., López-Manchado, M. A. & Hernández Santana, M. Evolution of self-healing elastomers, from extrinsic to combined intrinsic mechanisms: a review. _Mater. Horiz._ 7, 2882–2902 (2020).
* Samadzadeh, M., Boura, S. H., Peikari, M., Kasiriha, S. M. & Ashrafi, A. A review on self-healing coatings based on micro/nanocapsules. _Prog. Org. Coat._ 68, 159–164 (2010).
* Shchukin, D. G. Container-based multifunctional self-healing polymer coatings. _Polym. Chem._ 4, 4871–4877 (2013).
* Canadell, J., Goossens, H. & Klumperman, B. Self-healing materials based on disulfide links. _Macromolecules_ 44, 2536–2541 (2011).
* Kuang, X. et al. Facile fabrication of fast recyclable and multiple self-healing epoxy materials through Diels-Alder adduct cross-linker. _J. Polym. Sci. Pol. Chem._ 53, 2094–2103 (2015).
* Wen, N. et al. Recent advancements in self-healing materials: Mechanicals, performances and features. _React. Funct. Polym._ 168, 105041 (2021).
* Han, Y., Wu, X., Zhang, X. & Lu, C. Self-healing, highly sensitive electronic sensors enabled by metal–ligand coordination and hierarchical structure design. _ACS Appl. Mater. Inter._ 9, 20106–20114 (2017).
* Nardeli, J. V., Fugivara, C. S., Taryba, M., Montemor, M. F. & Benedetti, A. V. Self-healing ability based on hydrogen bonds in organic coatings for corrosion protection of AA1200. _Corros. Sci._ 177, 108984 (2020).
* Liu, T. et al. Ultrafast and high-efficient self-healing epoxy coatings with active multiple hydrogen bonds for corrosion protection. _Corros. Sci._ 187, 109485 (2021).
* Kim, G., Caglayan, C. & Yun, G. J. Epoxy-based catalyst-free self-healing elastomers at room temperature employing aromatic disulfide and hydrogen bonds. _ACS Omega_ 7, 44750–44761 (2022).
* Bosman, A., Brunsveld, L., Folmer, B., Sijbesma, R. & Meijer, E. _Macromol. Symp._ 201, 143–154 (2003).
* Rosero-Navarro, N. C., Pellice, S. A., Durán, A. & Aparicio, M. Effects of Ce-containing sol–gel coatings reinforced with SiO2 nanoparticles on the protection of AA2024. _Corros. Sci._ 50, 1283–1291 (2008).
* Wang, J. et al. Two birds with one stone: Nanocontainers with synergetic inhibition and corrosion sensing abilities towards intelligent self-healing and self-reporting coating. _Chem. Eng. J._ 433, 134515 (2022).
* Fan, Z. et al. Self-healing mechanisms in smart protective coatings: a review. _Corros. Sci._ 144, 74–88 (2018).
* Tao, Q., Xu, P., Li, M. & Lu, W. Machine learning for perovskite materials design and discovery. _npj Comput. Mater._ 7, 23 (2021).
* Li, Z. et al. Machine learning in concrete science: applications, challenges, and best practices. _npj Comput. Mater._ 8, 127 (2022).
* Zhong, X. et al. Explainable machine learning in materials science. _npj Comput. Mater._ 8, 204 (2022).
* Taylor, C. D. & Tossey, B. M. High temperature oxidation of corrosion resistant alloys from machine learning. _npj Mater. Degrad._ 5, 38 (2021).
* Li, Q. et al. Long-term corrosion monitoring of carbon steels and environmental correlation analysis via the random forest method. _npj Mater. Degrad._ 6, 1 (2022).
* Al-Haik, M. S., Hussaini, M. Y. & Garmestani, H. Prediction of nonlinear viscoelastic behavior of polymeric composites using an artificial neural network. _Int. J. Plast._ 22, 1367–1392 (2006).
* Hatakeyama-Sato, K., Tezuka, T., Umeki, M. & Oyaizu, K. AI-assisted exploration of superionic glass-type Li(+) conductors with aromatic structures. _J. Am. Chem. Soc._ 142, 3301–3305 (2020).
* Askland, K. D. et al. Prediction of remission in obsessive compulsive disorder using a novel machine learning strategy. _Int. J. Methods Psychiatr. Res._ 24, 156–169 (2015).
* Shao, M., Zhu, X.-J., Cao, H.-F. & Shen, H.-F. An artificial neural network ensemble method for fault diagnosis of proton exchange membrane fuel cell system. _Energy_ 67, 268–275 (2014).
* Xu, P., Ji, X., Li, M. & Lu, W. Small data machine learning in materials science. _npj Comput. Mater._ 9, 42 (2023).
* Sutojo, T. et al. A machine learning approach for corrosion small datasets. _npj Mater. Degrad._ 7, 18 (2023).
* Xiang, K.-L., Xiang, P.-Y. & Wu, Y.-P. Prediction of the fatigue life of natural rubber composites by artificial neural network approaches. _Mater. Des._ 57, 180–185 (2014).
* Menon, A., Thompson-Colón, J. A. & Washburn, N. R. Hierarchical machine learning model for mechanical property predictions of polyurethane elastomers from small datasets. _Front. Mater._ 6, 87 (2019).
* Pruksawan, S., Lambard, G., Samitsu, S., Sodeyama, K. & Naito, M. Prediction and optimization of epoxy adhesive strength from a small dataset through active learning. _Sci. Technol. Adv. Mater._ 20, 1010–1021 (2019).
* Li, D., Liu, J. & Liu, J. NNI-SMOTE-XGBoost: A novel small sample analysis method for properties prediction of polymer materials. _Macromol. Theory Simul._ 30, 2100010 (2021).
* Novikov, I. S., Shapeev, A. V. & Suleimanov, Y. V. Ring polymer molecular dynamics and active learning of moment tensor potential for gas-phase barrierless reactions: Application to S + H2. _J. Chem. Phys._ 151, 224105 (2019).
* Kim, C., Chandrasekaran, A., Jha, A. & Ramprasad, R. Active-learning and materials design: the example of high glass transition temperature polymers. _MRS Commun._ 9, 860–866 (2019).
* Jha, A., Chandrasekaran, A., Kim, C. & Ramprasad, R. Impact of dataset uncertainties on machine learning model predictions: the example of polymer glass transition temperatures. _Model. Simul. Mater. Sci. Eng._ 27, 024002 (2019).
* Mandl, R. Orthogonal Latin squares: an application of experiment design to compiler testing. _Commun. ACM_ 28, 1054–1058 (1985).
* Balak, Z. & Zakeri, M. Application of Taguchi L32 orthogonal design to optimize flexural strength of ZrB2-based composites prepared by spark plasma sintering. _Int. J. Refract. Met. H._ 55, 58–67 (2016).
* Wu, C. J. & Hamada, M. S. _Experiments: planning, analysis, and optimization_ (John Wiley & Sons, 2011).
* Breiman, L. Random forests. _Mach. Learn._ 45, 5–32 (2001).
* Ji, Y. et al. Random forest incorporating ab-initio calculations for corrosion rate prediction with small sample Al alloys data. _npj Mater. Degrad._ 6, 83 (2022).
* Packwood, D. _Bayesian Optimization for Materials Science_ 11–28 (Springer, 2017).
* Wagner, T., Emmerich, M., Deutz, A. & Ponweiser, W. in _Parallel Problem Solving from Nature, PPSN XI: 11th International Conference, Kraków, Poland, September 11–15, 2010, Proceedings, Part I 11_ 718–727 (Springer).
* Mohanty, T., Chandran, K. & Sparks, T. D. Machine learning guided optimal composition selection of niobium alloys for high temperature applications. _APL Mach. Learn._ 1, 036102 (2023).
* Cui, G. et al. Research progress on self-healing polymer/graphene anticorrosion coatings. _Prog. Org. Coat._ 155, 106231 (2021).
* Nawaz, M., Habib, S., Khan, A., Shakoor, R. A. & Kahraman, R. Cellulose microfibers (CMFs) as a smart carrier for autonomous self-healing in epoxy coatings. _New J. Chem._ 44, 5702–5710 (2020).
* Zhang, C., Wang, H. & Zhou, Q. Preparation and characterization of microcapsules based self-healing coatings containing epoxy ester as healing agent. _Prog. Org. Coat._ 125, 403–410 (2018).
* Wang, T. et al. Photothermal nanofiller-based polydimethylsiloxane anticorrosion coating with multiple cyclic self-healing and long-term self-healing performance. _Chem. Eng. J._ 446, 137077 (2022).
* Zheng, N., Fang, G., Cao, Z., Zhao, Q. & Xie, T. High strain epoxy shape memory polymer. _Polym. Chem._ 6, 3046–3053 (2015).
* Li, J., Rodgers, W. R. & Xie, T. Semi-crystalline two-way shape memory elastomer. _Polymer_ 52, 5320–5325 (2011).
* Oliveira, C. & Ferreira, M. Ranking high-quality paint systems using EIS. Part I: intact coatings. _Corros. Sci._ 45, 123–138 (2003).
* Hao, Y., Sani, L. A., Ge, T. & Fang, Q. Phytic acid doped polyaniline containing epoxy coatings for corrosion protection of Q235 carbon steel. _Appl. Surf. Sci._ 419, 826–837 (2017).
* Liu, T. et al. Self-healing and corrosion-sensing coatings based on pH-sensitive MOF-capped microcontainers for intelligent corrosion control. _Chem. Eng. J._ 454, 140335 (2023).
* Tavandashti, N. P. et al. Inhibitor-loaded conducting polymer capsules for active corrosion protection of coating defects. _Corros. Sci._ 112, 138–149 (2016).
* Zheng, X., Zheng, P. & Zhang, R.-Z. Machine learning material properties from the periodic table using convolutional neural networks. _Chem. Sci._ 9, 8426–8432 (2018).
* De Myttenaere, A., Golden, B., Le Grand, B. & Rossi, F. Mean absolute percentage error for regression models. _Neurocomputing_ 192, 38–48 (2016).
* Fukutani, T., Miyazawa, K., Iwata, S. & Satoh, H. G-RMSD: Root mean square deviation based method for three-dimensional molecular similarity determination. _Bull. Chem. Soc. Jpn._ 94, 655–665 (2021).
* Uyanık, T., Karatuğ, Ç. & Arslanoğlu, Y. Machine learning approach to ship fuel consumption: A case of container vessel. _Transp. Res. D.-T. E._ 84, 102389 (2020).
* Chen, S., Cao, H., Ouyang, Q., Wu, X. & Qian, Q. ALDS: An active learning method for multi-source materials data screening and materials design. _Mater. Des._ 223, 111092 (2022).
* Faraji Niri, M., Reynolds, C., Román Ramírez, L. A. A., Kendrick, E. & Marco, J. Systematic analysis of the impact of slurry coating on manufacture of Li-ion battery electrodes via explainable machine learning. _Energy Storage Mater._ 51, 223–238 (2022).
* Bishop, C. M. & Nasrabadi, N. M. _Pattern Recognition and Machine


Learning_. 4 (Springer, 2006).

ACKNOWLEDGEMENTS

This work is supported by the National Key R&D Program of China (2022YFB3808803).

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Beijing Advanced Innovation Center for Materials Genome Engineering, Institute for Advanced Materials and Technology, University of Science and Technology Beijing, Beijing, 100083, China: Tong Liu, Zhuoyao Chen, Jingzhi Yang, Lingwei Ma & Dawei Zhang
* National Materials Corrosion and Protection Data Center, University of Science and Technology Beijing, Beijing, 100083, China: Tong Liu, Zhuoyao Chen, Jingzhi Yang, Lingwei Ma & Dawei Zhang
* College of Materials Science and Engineering, Shenyang University of Chemical Technology, Shenyang, 110142, China: Tong Liu
* Institute of Materials Intelligent Technology, Liaoning Academy of Materials, Shenyang, 110004, China: Lingwei Ma & Dawei Zhang
* Department of


Materials Science and Engineering, Delft University of Technology, Mekelweg 2, Delft, 2628CD, The Netherlands: Arjan Mol

CONTRIBUTIONS

T.L.: investigation,


methodology, and writing (original draft). Z.C.: investigation and methodology. J.Y.: investigation. L.M.: investigation. A.M.: writing (review and editing). D.Z.: supervision, conceptualization, methodology, and writing (review and editing).

CORRESPONDING AUTHOR

Correspondence to Dawei Zhang.

ETHICS DECLARATIONS

COMPETING INTERESTS

The authors declare no competing interests.

ADDITIONAL INFORMATION

PUBLISHER’S NOTE

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

RIGHTS AND PERMISSIONS

OPEN ACCESS

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format,


as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third


party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the


article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright


holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Liu, T., Chen, Z., Yang, J. _et al._ Machine learning assisted discovery of high-efficiency self-healing epoxy coating for corrosion protection. _npj Mater Degrad_ 8, 11 (2024). https://doi.org/10.1038/s41529-024-00427-z


* Received: 11 June 2023
* Accepted: 04 January 2024
* Published: 19 January 2024
* DOI: https://doi.org/10.1038/s41529-024-00427-z