Machine learning assisted discovery of high-efficiency self-healing epoxy coating for corrosion protection


ABSTRACT

Machine learning is a powerful means for the rapid development of high-performance functional materials. In this study, we present a machine learning workflow for predicting the

corrosion resistance of a self-healing epoxy coating containing ZIF-8@Ca microfillers. The orthogonal Latin square method was used to investigate the effects of the molecular weight of the

polyetheramine curing agent, molar ratio of polyetheramine to epoxy, molar content of the hydrogen bond unit (UPy-D400), and mass content of the solid microfillers (ZIF-8@Ca microfillers) on

the low-frequency impedance modulus (lg|_Z_|0.01Hz) values of the scratched coatings, generating an initial dataset of 32 samples. The machine learning workflow was divided into two stages. In stage I, five

models were compared and the random forest (RF) model was selected for the active learning. After 5 cycles of active learning, the RF model achieved good prediction accuracy: coefficient of

determination (_R_²) = 0.709, mean absolute percentage error (MAPE) = 0.081, root mean square error (RMSE) = 0.685 (lg(Ω·cm²)). In stage II, the best coating formulation was identified by

Bayesian optimization. Finally, the electrochemical impedance spectroscopy (EIS) results showed that the |_Z_|0.01Hz value of the repaired coating reached (4.40 ± 2.04) × 10¹¹ Ω·cm², close to that of

the intact coating ((4.63 ± 2.08) × 10¹¹ Ω·cm²). In addition, the repaired coating showed minimal corrosion and only 3.3% adhesion loss after 60 days of neutral salt spray testing.

INTRODUCTION

Epoxy (EP) resin is widely used in the field of corrosion protection

because of strong adhesion properties, high corrosion resistance, excellent mechanical properties and low cost. However, cracks may arise inside or at the surface of the EP matrix during

long-term service and reduce its corrosion protection performance with time, thus increasing potential safety hazards during its service life1. Applying self-healing coatings is

a cost-effective way to improve the corrosion protection, and thus the durability, of metallic structures. A wide range of engineering structures, from vehicles to

aircraft and from factories to household equipment, can be effectively protected by self-healing coating systems. Recent efforts have focused on improving the durability of EP coatings in

the presence of damage by granting them self-healing functions, which can be realized through intrinsic repair of the material matrix by reversible covalent bonds2 and noncovalent bonds3,

or via extrinsic strategies that rely on the release of healing agents4 and corrosion inhibitors5 into coating defects. In contrast to these extrinsic self-healing mechanisms, the intrinsic

approach mimics natural systems and allows the coating to be repaired repeatedly. Such mechanisms are typically based on reversible covalent bonds such as disulfide bonds6,

Diels–Alder reactions7, and hydrazone bonds8, or non-covalent interactions via metal-ligand9 and hydrogen bonding10,11,12. Among these mechanisms, the most promising one is based on dynamic

hydrogen bonds because of their high reversibility and mild repair conditions, in combination with their directional and tunable self-association properties13. As an indication of the

self-healing ability of the coating, the low-frequency impedance modulus obtained from electrochemical impedance spectroscopy (EIS) data at 0.01 Hz (|_Z_|0.01Hz) has been

extensively used to estimate the overall corrosion resistance of the test area14,15. A higher |_Z_|0.01Hz value represents a higher barrier ability of the coating. Based on previous

studies16, we consider that an ideal self-healing anticorrosion coating should meet three main criteria: (1) the |_Z_|0.01Hz value of the self-healed coating is

close to that of the intact coating; (2) excellent barrier ability, with a |_Z_|0.01Hz value above 10¹⁰ Ω·cm²; and (3) long-term stability in corrosive environments both before and after repair.

For example, in a previous work by our group11, an intrinsic self-healing EP coating was developed by grafting 2-ureido-4[1H]-pyrimidinone (UPy) as a quadruple hydrogen bonding unit onto the

backbone of an EP matrix. The UPy/EP coating demonstrated highly efficient self-healing within 5 min in 3.5 wt.% NaCl solution. The self-healed coating still had a high

|_Z_|0.01Hz value of 4.8 × 10¹⁰ Ω·cm² even after 60 days of immersion in NaCl solution. Often, achieving the target self-healing performance requires synergy between multiple

components of the EP coating formulation, including different resins, curing agents, liquid/solid additives, etc. The conventional trial-and-error design strategy for coating formulation is

time-consuming and labor-intensive. Recently, machine learning methods have been shown to be a promising option for materials design and optimization, especially for systems with complex

properties or compositions17,18,19,20,21. For example, Haik et al.22 developed a machine learning model to predict the stress relaxation properties of EP matrix composites, based on a

three-layer neural network model using initial stress, test temperature and operating time as input variables and stress relaxation behavior as output. The final model was obtained by

training 9000 experimental data samples. This model can predict efficiently the time-dependent mechanical behavior of a viscoelastic or a viscoplastic material. Kan et al.23 constructed a

molecular recognition model for predicting 2000 molecular descriptors from chemical structures using a gated graph neural network, and extracted 32-dimensional vectors representing 2000

molecular descriptors through the molecular recognition model to complete the dimension reduction. This 32-dimensional vector was used as the input value for the next Gaussian regression,

and the machine learning model for predicting electrical conductivity was finally built by training on a large amount of data. Typically, building an accurate machine learning model

requires a large amount of training data, which is difficult to obtain for polymer resin formulations given the heavy experimental workload of synthesis and characterization24,25. Therefore,

machine learning methods that work with small datasets have major implications for polymer design. The problem of machine learning under small sample

data conditions (<1000 samples) has received much attention in recent years26,27. For the processing of small sample data, the most common methods are the neural-network-based methods28,

hierarchical machine learning29, active-learning-based method30 and so on. For instance, Li et al.31 proposed a model combined with nearest neighbor interpolation (NNI), synthetic minority

oversampling technique (SMOTE) and extreme gradient boosting (XGBoost) models to predict the abrasion of rubber composites with small samples. NNI and SMOTE are two classical models in image

processing that aim at increasing the sample size and solving the problem of sample unevenness. Combining these two models, the original dataset was expanded from 23 to 710 samples.

Finally, the abrasion was predicted by the XGBoost model to yield a better prediction accuracy (MSE = 0.001). Similarly, active learning has been applied to discover EP adhesive strength30,

polymer molecular dynamics32 and high-_T_g polymers33,34, among others, starting from small initial datasets. Herein, we employed a machine learning framework to develop self-healing composite

coatings for corrosion protection applications. A flowchart of the machine learning workflow is shown in Fig. 1. In this framework, active learning and Bayesian optimization

were used to model and maximize the common logarithm of the low-frequency impedance modulus (lg|_Z_|0.01Hz) obtained from EIS measurements of various scratched self-healing EP composite coatings, and thereby to

improve their self-healing property. The coating formulation consists of an EP resin, polyetheramines, amino-terminated urea-pyrimidinone monomers (UPy-D400) and ZIF-8@Ca microfillers. The EP

resin mixed with polyetheramine can react to form an EP-based polymer, and the UPy-D400 acts as a quadruple hydrogen bonding unit that can be grafted into the EP network to provide a

self-healing function for the EP polymer via the self-association process. The ZIF-8@Ca microfiller, which is an empty CaCO₃ microcontainer with ZIF-8 nanoparticles assembled on

the surface, is incorporated as a model filler that can not only enhance the barrier property of the EP coating, but also respond to pH changes by releasing loaded substances (e.g.,

inhibitors) to achieve useful functions. For the machine learning process, four variables (the molecular weight of polyetheramine, the molar ratio of polyetheramine to EP, the UPy-D400

content, and the ZIF-8@Ca content) were used as input, and the lg|_Z_|0.01Hz value of the scratched coatings was used as output; an initial dataset of 32 samples was obtained from the preliminary experiment.

Five common models were compared, and the most accurate one was selected and further trained by active learning. Subsequently, the Bayesian optimization method was

used to search for the scratched self-healing EP composite coating with an extremely high lg|_Z_|0.01Hz value. Finally, the self-healing and corrosion protective properties of the optimal

coating were verified by EIS and salt spray testing.

RESULTS AND DISCUSSION

EXPERIMENTAL RESULTS FROM THE INITIAL DATASET

As seen in Table 1, four parameters, each with four initial condition

levels, were set (total experimental conditions = 4⁴ = 256 sets). The four parameter variables were the molecular weight of polyetheramine, the molar ratio of polyetheramine to EP, the molar

content of UPy-D400, and mass content of the ZIF-8@Ca microfillers. An initial 32 sets of experimental conditions were extracted from the 256 sets by orthogonal Latin square design method35.
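As a rough illustration of what an orthogonal selection from the full factorial space looks like, the sketch below enumerates all 4⁴ = 256 level combinations and checks a small design for pairwise balance. The 16-run array built here from GF(4) arithmetic is an illustrative assumption, not the 32-run design used in this study, and the helper names (`orthogonal_16`, `is_pairwise_balanced`) are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

LEVELS = 4   # four levels per factor, as in Table 1
FACTORS = 4  # molecular weight, r value, UPy-D400 content, ZIF-8@Ca content

# Full factorial: every combination of level indices 0..3 for the four factors.
full_factorial = list(product(range(LEVELS), repeat=FACTORS))

# Multiplication by the element "2" in GF(4); elements are encoded as 0..3
# and GF(4) addition is bitwise XOR.
MUL2 = {0: 0, 1: 2, 2: 3, 3: 1}


def orthogonal_16():
    """Illustrative 16-run orthogonal array for four 4-level factors."""
    return [(a, b, a ^ b, a ^ MUL2[b]) for a in range(4) for b in range(4)]


def is_pairwise_balanced(design, levels=LEVELS):
    """True if every pair of columns contains each level pair equally often."""
    n_cols = len(design[0])
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((row[i], row[j]) for row in design)
        if len(counts) != levels ** 2 or len(set(counts.values())) != 1:
            return False
    return True


design = orthogonal_16()
print(len(full_factorial))            # number of candidate conditions
print(len(design))                    # number of runs in the illustrative design
print(is_pairwise_balanced(design))   # balance check for the illustrative design
```

Mapping the level indices back to physical values (for example, 230, 400, 2000 and 4000 g·mol⁻¹ for the polyetheramine molecular weight) would use the level table of the study, which is not reproduced here.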

The orthogonal design is a method based on mathematical statistics and the orthogonality principle, which can approximate the results of a comprehensive test using a much smaller number

of tests. It selects a subset of points that can represent the whole experiment according to the orthogonality of the experiments, and these selected points are uniformly distributed in the

whole space36,37. The coatings were then prepared according to these 32 conditions and measured by EIS to obtain the corresponding low-frequency impedance modulus (lg|_Z_|0.01Hz value) of the different

scratched coatings. The lg|_Z_|0.01Hz value, rather than the |_Z_|0.01Hz value, was selected as the output because the raw values span several orders of magnitude, and the logarithm reduces the undesirable effects of a

dataset with such high variability. The measured lg|_Z_|0.01Hz values of the scratched coatings that comprise our initial dataset are reported in Table 2. Figure 2 shows the

distribution of lg|_Z_|0.01Hz experimental values. As shown in Fig. 2, the average lg|_Z_|0.01Hz experimental values were widely distributed in the range of 4.75–10.87 (lg(Ω·cm2)). According

to a previous experimental study11, this distribution covers scratched coatings with different self-healing abilities, indicating that the selection of the initial preparation

conditions using the orthogonal Latin square method is reasonable.

ASSESSMENT AND SELECTION OF A lg|_Z_|0.01Hz PREDICTION MODEL

Next, the different experimental conditions and the

corresponding lg|_Z_|0.01Hz values of the scratched coatings were used as the input and output of the machine learning process, respectively, and five common machine learning models were trained

on the initial dataset of 32 samples. A comparison of the predicted and measured lg|_Z_|0.01Hz values for each model is shown in Fig. 3a–e. A black dashed straight line indicates equal measured and

predicted values. A comparison of the accuracy of each model is shown in Fig. 3f. Compared with the other models, the RF model yielded the best accuracy in terms of a higher coefficient of

determination (_R_²) value, and lower mean absolute percentage error (MAPE) and root mean square error (RMSE) values. This may be because the RF model, as an ensemble of many

decision trees, has a good processing ability for data with high variability38,39. Hence, the RF model was chosen to predict the lg|_Z_|0.01Hz values in subsequent

steps.

ACTIVE LEARNING AND MACHINE LEARNING MODEL PERFORMANCE

For the active learning process, the RF model trained on the initial 32 samples first predicted the lg|_Z_|0.01Hz values of all remaining (256 – 32 = 224 sets)

experimental conditions. The predicted lg|_Z_|0.01Hz values were ranked in descending order. The five top-ranked experimental conditions from the 224 sets of

conditions were selected as proposals for subsequent measurements to be performed in the laboratory. These five measurements were added to the initial 32 samples. Then, the machine learning

model for the prediction of the lg|_Z_|0.01Hz values was trained again on this enlarged (32 + 5) dataset. The new measurements were reused in the RF model because this

enhances the prediction accuracy for high-performance samples in a targeted manner and improves the active learning efficiency. This process, from the prediction phase to the reuse

phase, represents one cycle of active learning (see Table 3). The active learning process is repeated until the accuracy of the machine learning model no longer

improves. In this study, the active learning cycle was stopped when all the evaluation indices (MAPE, RMSE and _R_²) stopped improving. Figures 4a–g present scatter plots of the predicted

versus measured lg|_Z_|0.01Hz values from the initial dataset to the last cycle. The blue and red dots indicate existing and new measurements, respectively. The evolution of the

corresponding _R_2, MAPE and RMSE values for each cycle is summarized in Fig. 4h, i. As shown in Figs. 4a–g, the predicted and measured values gradually approached the black dashed straight

line from the initial dataset to the last cycle, indicating that an increase in the dataset size resulted in predicted lg|_Z_|0.01Hz values that are closer to measured lg|_Z_|0.01Hz values.

As the dataset size increased, _R_² clearly increased, and the MAPE and RMSE decreased gradually. After five active learning cycles, the _R_², MAPE and RMSE values plateaued, and

the active learning process was terminated. For the dataset of 62 samples, the RF model achieved _R_², MAPE and RMSE values of 0.709, 0.081 and 0.685 (lg(Ω·cm²)), respectively.
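The three indices quoted above follow the definitions in Eqs. (3)-(5) of the Methods. A minimal pure-Python sketch of those definitions is given here; the function names and the example values are hypothetical and are not taken from the study's data or code.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed as a fraction (Eq. 3)."""
    return sum(abs(y - p) / abs(y) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error, in the units of y (Eq. 4)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination (Eq. 5)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted lg|Z| values, for illustration only.
measured = [5.0, 7.5, 9.0, 10.5, 11.0]
predicted = [5.6, 7.1, 9.4, 10.0, 10.8]

print(f"MAPE = {mape(measured, predicted):.3f}")
print(f"RMSE = {rmse(measured, predicted):.3f}")
print(f"R2   = {r2(measured, predicted):.3f}")
```

Because the target is a logarithm, an RMSE of 0.685 corresponds to a typical error of a little under one order of magnitude in |_Z_|0.01Hz.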

Compared to the accuracy of the initial dataset, improvements of 246%, 51% and 47% were achieved for _R_2, MAPE, and RMSE, respectively. In this case, _R_2 was greater than 0.7 and both MAPE

and RMSE were stabilized at a low level, indicating that the RF model reached acceptable accuracy. Therefore, the active learning procedure was stopped at this stage and the RF model was

fixed based on the existing dataset. In addition, Table 3 lists the top-five proposed experiments for the five cycles of active learning with the corresponding predicted and measured

lg|_Z_|0.01Hz values. Several measured lg|_Z_|0.01Hz values in Table 3 exceeded 11.00 (lg(Ω·cm²)), above the highest value in the initial dataset, showing that

the RF model allowed us to predict the experimental conditions of coatings with a potentially high self-healing ability. These additional data on high-performance self-healing coatings

are beneficial for further maximization using Bayesian optimization. In addition, the proposed experiments required polyetheramine of molecular weights 400 and 2000 g·mol⁻¹, with an _r_

value (the molar ratio of polyetheramine to EP) greater than 0.85, 10–20 mol% of UPy-D400, and ZIF-8@Ca microfiller content in the full range. This provided the main guidance for refining the test conditions in the subsequent step.
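The predict, rank, measure and retrain cycle described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the `measure` function is a hypothetical stand-in for the laboratory EIS measurement, the initial 32 conditions are drawn at random rather than by the orthogonal design, and the RF hyperparameters are arbitrary.

```python
from itertools import product

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# All 4^4 = 256 candidate conditions, encoded as level indices 0..3.
candidates = np.array(list(product(range(4), repeat=4)), dtype=float)


def measure(x):
    """Hypothetical stand-in for measuring lg|Z|0.01Hz in the laboratory."""
    noise = rng.normal(0.0, 0.2, len(x))
    return 8.0 + 0.5 * x[:, 1] - 0.3 * (x[:, 2] - 2.0) ** 2 + noise


# Start from 32 measured conditions (random here; the study used an orthogonal design).
measured = np.zeros(len(candidates), dtype=bool)
measured[rng.choice(len(candidates), size=32, replace=False)] = True
y = np.full(len(candidates), np.nan)
y[measured] = measure(candidates[measured])

N_CYCLES = 5   # number of active learning cycles
N_PROPOSE = 5  # top-ranked conditions proposed per cycle

for cycle in range(N_CYCLES):
    # Train on everything measured so far.
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(candidates[measured], y[measured])

    # Predict all unmeasured conditions and rank them in descending order.
    pool = np.flatnonzero(~measured)
    predicted = model.predict(candidates[pool])
    proposals = pool[np.argsort(predicted)[::-1][:N_PROPOSE]]

    # "Measure" the proposals and add them to the training set.
    y[proposals] = measure(candidates[proposals])
    measured[proposals] = True

print("measured conditions:", int(measured.sum()))
print("best measured value:", float(np.nanmax(y)))
```

Ranking purely by predicted value is a greedy strategy; it concentrates new measurements in the high-performance region, which is the behavior described above, but it does not account for prediction uncertainty.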

BAYESIAN OPTIMIZATION FOR SCREENING OPTIMAL CANDIDATE

In this step, three experimental conditions were refined: the _r_ value, the molar content of UPy-D400, and the microfiller content were varied from

0.85 to 1.00, 10 to 20 mol%, and 5.5 to 10.0 wt.%, by increments of 0.1, 1 mol%, and 0.1 wt.%, respectively. The molecular weights of the polyetheramine curing agents were fixed at 400 and

2000 g·mol⁻¹. This search space for the coating formulation is vast, and the machine learning model has limited utility if it does not incorporate uncertainty and the expected

improvement process. Since a machine learning model is built using a limited amount of training data, the selection of candidates using that model may be limited to a local search.
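To make the role of prediction uncertainty concrete, the following sketch scores candidates with the expected improvement of Eqs. (1)-(2), given a predicted mean and standard deviation for each candidate. The function names, the handling of zero standard deviation, and the candidate values are illustrative assumptions; how the surrogate produces the mean and standard deviation is left outside the sketch.

```python
import math


def normal_pdf(z):
    """Probability density function of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best, eps=0.0):
    """EI = sigma * (z * Phi(z) + phi(z)) with z = (mu - f_best - eps) / sigma."""
    if sigma <= 0.0:
        # Assumed convention: with no uncertainty, EI reduces to the plain improvement.
        return max(mu - f_best - eps, 0.0)
    z = (mu - f_best - eps) / sigma
    return sigma * (z * normal_cdf(z) + normal_pdf(z))


# Hypothetical candidates: (label, predicted mean, predicted standard deviation).
f_best = 11.0  # hypothetical best lg|Z| value measured so far
candidates = [
    ("A", 11.2, 0.1),  # slightly better mean, low uncertainty
    ("B", 10.8, 0.8),  # worse mean, high uncertainty
    ("C", 10.0, 0.1),  # clearly worse mean, low uncertainty
]

ranked = sorted(
    candidates,
    key=lambda c: expected_improvement(c[1], c[2], f_best),
    reverse=True,
)
for label, mu, sigma in ranked:
    print(label, round(expected_improvement(mu, sigma, f_best), 4))
```

A candidate with a lower predicted mean can still rank above one with a higher mean when its uncertainty is large, which is how the utility function trades exploitation against exploration.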

Therefore, we speculate that Bayesian optimization may give better results because this optimization technique considers the uncertainty of the prediction and the balance between local and

global search40. Bayesian optimization works on a surrogate model and evaluates a utility function41. The utility function uses the mean and standard deviation of the candidates estimated by

the surrogate model. The utility function encodes a trade-off between the exploitation (candidate searching at points with high mean) and exploration (candidate searching at points with

high uncertainty). Herein, we have used RF as the surrogate model and expected improvement (EI) as a utility function. The EI is defined as the following Eqs. (1)-(2)42:

$$\mathrm{EI}(x)=\sigma(x)\,[z\,\Phi(z)+\phi(z)]$$ (1)

$$z=\frac{\mu(x)-f(x^{+})-\varepsilon}{\sigma(x)}$$ (2)

where _EI_(_x_) represents

the expected improvement value for each coating formulation candidate, _μ_(_x_) and _σ_(_x_) are the predicted output and its standard deviation for a candidate obtained from the surrogate model, and

_f_(_x_+) is the maximum value of the target material property observed in the training dataset. _Φ_ is the cumulative distribution function and _ϕ_ is the probability density

function of the standard normal distribution, under the assumption that the target property values are normally distributed. The term _ε_ regulates the amount of exploration: the higher the value of _ε_, the more exploration. In this

method, the largest EI value represents the most promising coating formulation candidate. Here, we used 1000 iterations for the BO run, as this was sufficient to predict the optimal

experimental conditions with high accuracy (see Data Availability section for where to access this code), and a series of experiments were conducted starting from rank 1 (Table 4). A new

highest lg|_Z_|0.01Hz value of 11.58 ± 0.28 (lg(Ω·cm²)) was observed, that is, (4.40 ± 2.04) × 10¹¹ Ω·cm². This impedance modulus is considerably higher than those reported in

previous studies on EP-based self-healing coating11,43,44,45,46, which reported a typical lg|_Z_|0.01Hz value range of 7.48–10.68 (lg(Ω·cm2)). The suggested experimental conditions from

Bayesian optimization showed that a relatively low molecular weight of polyetheramine and a high molar ratio of polyetheramine to EP were promising conditions for achieving a high

lg|_Z_|0.01Hz value, whereas the molar content of UPy-D400 and the microfiller content should be in the middle of their defined ranges. According to previous studies47,48, an excess of amine

improves the shape recovery rate of EP materials. The intrinsic self-repair process in this study is realized by the self-association of the self-healing units (hydrogen bonds), on

the premise that the damage can be physically closed. A high shape recovery rate is beneficial for the physical closure of scratched material surfaces11. However, excess amine (an excessive _r_ value)

leads to higher flexibility but lower mechanical strength of EP materials47. An optimum combination of high strength and good flexibility can be achieved by adjusting the _r_ value precisely

through Bayesian optimization. The introduction of self-healing units and microfillers may also affect the various performance indicators of the coatings, and Bayesian optimization can balance their addition

amounts simultaneously to achieve a reasonable design for the target property. Figure 5 shows the distribution of lg|_Z_|0.01Hz values of scratched coatings from the initial dataset, after the

five active learning cycles, and after a Bayesian optimization process. The lg|_Z_|0.01Hz values from the initial dataset were spread randomly from 4.75 to 10.87 (lg(Ω·cm2)). By comparison,

all samples that followed an active learning cycle exhibited a high lg|_Z_|0.01Hz value (>8.23 (lg(Ω·cm2))), and one sample from the Bayesian optimization dataset showed an exceptionally

high lg|_Z_|0.01Hz value. These results demonstrate the potential of our machine learning framework for the design and optimization of high-performance functional materials based on small

sample conditions.

INTERPRETATION OF MACHINE LEARNING MODEL FOR COATING DESIGN

EIS measurements were conducted on the scratched pure commercial EP and ZIF-8@Ca/EP coatings and their

corresponding intact coatings to study the self-healing and corrosion resistance properties. The ZIF-8@Ca/EP coating was prepared based on the best formulation selected by Bayesian

optimization. Nyquist and Bode plots of the intact coatings were obtained by EIS after 30 min of immersion in 3.5 wt.% NaCl solution (Fig. 6a–c). Figures 6d–i show the Nyquist and Bode plots

of the steels with scratched coatings after immersion for 1, 15, 30 and 60 d. The as-used pure EP coating was prepared by mixing E51 with D400 polyetheramine curing agents at a molar ratio

of 5:3. For the pure EP sample, the intact coating initially showed a high barrier property, with a large capacitive arc in the Nyquist plot (Fig. 6a) and a high |_Z_|0.01Hz value (3.98 ×

10¹⁰ Ω·cm²) in the Bode plot (Fig. 6b). The phase angles at high frequencies (10⁵ Hz) were close to –90°, which indicates the capacitive character of the coatings. Compared with the

intact pure EP coating, the intact ZIF-8@Ca/EP coating exhibited a slightly larger capacitive arc in the Nyquist plot, and its |_Z_|0.01Hz value rose to 3.82 × 10¹¹ Ω·cm², indicating a

substantial improvement in the barrier property of the coating after the machine learning adjustment. The average and standard deviation of the |_Z_|0.01Hz value for the intact coating,

calculated using six parallel samples, were (4.63 ± 2.08) × 10¹¹ Ω·cm². For the scratched coatings, the capacitive arcs of the pure EP coating shrank and the |_Z_|0.01Hz

values declined gradually over the entire immersion time, demonstrating the continuous deterioration of the barrier property (Figs. 6d–e). In the phase diagrams in Fig. 6f, the

scratched pure EP showed two time constants: one related to the charge transfer process at the coating/substrate interface (10⁻²–10⁰ Hz), and the other related to the resistance increase caused by

corrosion product formation in the artificial defect (10¹–10⁵ Hz)49. In contrast to the Bode plots of the pure EP coating, the Bode plots of the scratched ZIF-8@Ca/EP coating showed approximately

–45° straight lines with |_Z_|0.01Hz values in excess of 3.80 × 10¹¹ Ω·cm² at the beginning of immersion. The corresponding phase angles were –90° over the frequency range of 10⁻¹–10⁵ Hz.

This implies that, during the immersion, no conductive pathway formed through the coating, which largely exhibited a capacitive behavior similar to that of an intact coating50. During

the 60 d of immersion, the |_Z_|0.01Hz values of the ZIF-8@Ca/EP coating decreased only slightly, from 3.80 × 10¹¹ Ω·cm² to 1.23 × 10¹¹ Ω·cm², confirming that the scratched ZIF-8@Ca/EP

coating had been well repaired and possessed a satisfactory corrosion resistance. After scratching, the pure EP and ZIF-8@Ca/EP coatings were subjected to salt spray tests following the ASTM

B117/D1654 standards. Figures 7a and 7b show the optical images of the coatings after exposure in the salt spray chamber for different periods. According to the visual assessment in Fig. 7a,

green corrosion products were observed at the scratches of the pure EP coating within the first day of the salt spray test. After 60 d, large-scale coating delamination and corrosion products

appeared in the scratched region, indicating that the scratched location of the pure EP coating was highly vulnerable to attack by corrosive species. In contrast, for the ZIF-8@Ca/EP coating only slight

scratch traces were observed at the scratched positions, and the coating did not show any signs of degradation (delamination, corrosion, or blistering) after 30 d (Fig. 7b).

Furthermore, as the salt spray exposure time increased to 60 d, only one slight corrosion spot was observed at the scratched site, indicating the corrosion of the scratched ZIF-8@Ca/EP

coating could be controlled in a salt spray environment for a long time. The adhesion strength, an important indicator of coating properties, can be measured using a pull-off test. Figure 7d

shows the adhesion strength/loss values of intact pure EP and ZIF-8@Ca/EP coating before and after the 60 d salt spray test. The optical images of the remaining coatings following the

pull-off test are presented in Fig. 7c. As shown in Fig. 7c, none of the samples exhibited cohesive failure. As shown in Fig. 7d, the dry adhesion strength of the ZIF-8@Ca/EP coating (9.82

MPa) is higher than that of pure EP (4.70 MPa). This is because the introduction of branched-chain amines and UPy units enhanced the hydrogen bonding between the coating and the metal

surface51. After salt spraying, the pure EP coating exhibited a considerable adhesion loss of 79.4% (to 0.97 MPa). In contrast, the ZIF-8@Ca/EP coating demonstrated not only the highest wet

adhesion strength (9.50 MPa) but also minimal adhesion loss (3.3%) after the 60 d salt spray test.

In summary, a design-of-experiments approach combined with active learning and

Bayesian optimization was proposed to predict and optimize the lg|_Z_|0.01Hz values of scratched EP self-healing coatings composed of different molecular weights of polyetheramine curing

agent, molar ratios of polyetheramine to E51 EP resin, molar content of UPy-D400 and mass contents of ZIF-8@Ca microfillers. The active learning process yielded the preferred experimental

conditions to build a predictive RF model of lg|_Z_|0.01Hz values with satisfactory accuracy (_R_2 = 0.709, MAPE = 0.081, RMSE = 0.685 (lg(Ω·cm2))) after five cycles of active learning.

Then, an extremely high lg|_Z_|0.01Hz value of 11.58 (|_Z_|0.01Hz = 3.80 × 10¹¹ Ω·cm²) was achieved using the experimental conditions refined by Bayesian optimization. As

confirmed by EIS, the ZIF-8@Ca/EP coating exhibited a strong healing effect on the barrier property (intact sample: 3.82 × 10¹¹ Ω·cm², repaired sample: 3.80 × 10¹¹ Ω·cm²). In addition, in terms

of the corrosion resistance after repair, the ZIF-8@Ca/EP coating exhibited slight corrosion after 60 d of the salt spray test, and the adhesion loss of the composite coating after the salt

spray test was 3.3%, which was considerably lower than that of the pure EP coating (79.4%).

METHODS

MATERIALS

Polyetheramine curing agents with four different molecular weights (230, 400,

2000 and 4000 g·mol⁻¹) were sourced from the Aladdin Industrial Corporation. The E51 EP resin was sourced from Jiangsu Heli Resin Co., Ltd. The ZIF-8@Ca microfillers and the UPy-D400

monomers were obtained using previously published methods11,51. Q235 mild steel was used as the substrate.

PREPARATION OF COATINGS AND EIS TEST

Based on the selected 32 experimental

conditions, the preparation process of the self-healing EP coating containing ZIF-8@Ca microfillers (ZIF-8@Ca/EP) is shown in Fig. 8. In each case, the ZIF-8@Ca microfillers were first mixed

with the E51 EP resin under magnetic stirring. The polyetheramine curing agent and UPy-D400 were then added to the mixture using a mechanical agitator at 500 rpm for 10 min. Prior to the

coating preparation, the steel specimens were wet-polished sequentially with 150-, 240- and 400-grit sandpapers, washed with ethanol and blow-dried in an N2 atmosphere. The resulting mixture

was applied to a steel piece using a bar coater. The coated samples were obtained by drying at room temperature for 48 h. The final thickness of each of the dry films was approximately 85

μm. EIS tests were performed to measure the low-frequency impedance modulus (|_Z_|0.01Hz) values of the coated steel with/without an artificial scratch. All scratches for the EIS tests were

made with a scalpel in a reproducible manner. The EIS results were obtained using a 3.5 wt.% NaCl solution and a CHI-660E electrochemical workstation with a three-electrode cell system

comprising a coated steel substrate as a working electrode, a platinum plate electrode as a counter electrode and a saturated calomel electrode (SCE) as a reference electrode. The test

frequency was set in the 10⁻²–10⁵ Hz range with a 0.02 V root mean square amplitude. Prior to EIS measurements, artificial through-coating scratches (approximately 3 mm in length and

approximately 60 µm in width) were made on the different coated steels using a scalpel. The measurements were conducted on the coated steels at least five times to ensure the reproducibility

of the EIS results. In EIS results, the |_Z_|0.01Hz value in the Bode plot usually represents the main performance index for the corrosion resistance of a coating, that is, a higher

|_Z_|0.01Hz value reflects a higher barrier property52. Therefore, this index was used to characterize the repair effect of the barrier properties of the coating after scratching. To further

verify the self-healing and long-term anticorrosion ability of the scratched composite coating obtained from the machine learning process, a salt spray test was performed by

exposing the samples to salt spray for 60 d in accordance with ASTM D1654.

DATA PRE-PROCESSING, DATA SPLITTING AND MACHINE LEARNING MODELS

Data pre-processing and data splitting were

performed and the different machine learning models were built using the Python package scikit-learn (version 1.1.1). The four variable parameters (Table 4) in this study were standardized

to a mean of 0 and a variance of 1 (ref. 53). The purpose of this standardization is to bring the input variables onto a common scale,

thus reducing the undesirable effects caused by a sample dataset with high variability. The validity and accuracy of all employed machine learning models were evaluated using

k-fold cross-validation. In this step, the data were randomly shuffled and divided into 10 groups. Nine groups were allocated for training, and the remaining group was used to

validate the model. The average value was obtained by repeating the same process 10 times. To obtain the performance level of the model, the MAPE, RMSE and _R_² were introduced to

evaluate the k-fold cross-validation, using the following Eqs. (3)-(5):54,55,56

$$\mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\frac{|y_i-\hat{y}_i|}{|y_i|}$$ (3)

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^{2}}$$ (4)

$$R^{2}=1-\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^{2}}{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}$$ (5)

where n is the number

of samples, ${y}_{i}$ and ${\hat{y}}_{i}$ are the experimental and predicted values of the _i_th sample, respectively, and $\bar{y}$ is the mean of the experimental values. The accuracy of the machine learning model was assessed using its MAPE (a non-negative value; the closer to 0, the greater the accuracy; ref. 57), RMSE (a lower value indicates greater accuracy; ref. 30) and _R_2 (a value closer to 1 indicates greater accuracy; a model with _R_2 greater than 0.7 is considered to have acceptable accuracy; ref. 58).

Five machine learning models were applied as regression tools to the dataset: linear regression (LR), artificial neural network (ANN), support vector regression (SVR), decision tree (DT) and random forest (RF) models. The machine learning methods are described in detail in ref. 59. The code used to run these algorithms can be accessed as described in the Data Availability section.
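The pre-processing, data splitting and scoring steps described in this section can be illustrated with a short sketch in plain Python. It is not the authors' code (that is provided in the repository named in the Data Availability section) and it rests on several assumptions: the 32 × 4 dataset is synthetic, the one-nearest-neighbour predictor is only a placeholder for the five models compared in the study, the function names are hypothetical, and the three metrics are computed once on the pooled out-of-fold predictions rather than averaged over folds, which avoids an undefined _R_2 in folds of only three or four samples.

```python
import math
import random


def standardize(column):
    """Rescale one feature column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]  # assumes a non-constant column


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mape(y, y_hat):  # Eq. (3); requires every y_i != 0
    return sum(abs(a - b) / abs(a) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):  # Eq. (4)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2(y, y_hat):  # Eq. (5)
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def predict_nearest(X_train, y_train, x):
    """Placeholder 1-nearest-neighbour regressor, not one of the paper's models."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X_train]
    return y_train[dists.index(min(dists))]


def cross_validate(X, y, k=10, seed=0):
    """Collect out-of-fold predictions over k folds, then score them."""
    y_hat = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        for i in fold:
            y_hat[i] = predict_nearest(X_train, y_train, X[i])
    return {"MAPE": mape(y, y_hat), "RMSE": rmse(y, y_hat), "R2": r2(y, y_hat)}


# Synthetic stand-in data: 32 samples x 4 variables (not the paper's dataset).
rng = random.Random(1)
raw = [[rng.uniform(0, 1) for _ in range(4)] for _ in range(32)]
y = [6 + 2 * r[0] + r[1] - r[2] + 0.5 * r[3] + rng.gauss(0, 0.1) for r in raw]

# Standardize column by column, then transpose back to one row per sample.
columns = [standardize([row[j] for row in raw]) for j in range(4)]
X = [list(sample) for sample in zip(*columns)]

scores = cross_validate(X, y, k=10)
print(scores)
```

In practice the same steps would more likely be written with scikit-learn, which provides `StandardScaler` and `KFold` for the scaling and splitting shown here; the hand-written versions are used only to keep the correspondence with Eqs. (3)–(5) explicit.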

BAYESIAN OPTIMIZATION

Bayesian optimization (ref. 40) was used to determine the highest lg|_Z_|0.01Hz values by refining the variable conditions from Table 1. Bayesian optimization was performed using the Python package GPyOpt.

DATA AVAILABILITY

Source codes for this article are publicly available at

https://github.com/lt1037870521/manuscript-code-EP-Lt.

REFERENCES

1. He, Y. et al. Micro-crack behavior of carbon fiber reinforced Fe3O4/graphene oxide modified epoxy composites for cryogenic application. _Compos. Part A Appl. Sci. Manuf._ 108, 12–22 (2018).
2. Huang, S. et al. An overview of dynamic covalent bonds in polymer material and their applications. _Eur. Polym. J._ 141, 110094 (2020).
3. Utrera-Barrios, S., Verdejo, R., López-Manchado, M. A. & Hernández Santana, M. Evolution of self-healing elastomers, from extrinsic to combined intrinsic mechanisms: a review. _Mater. Horiz._ 7, 2882–2902 (2020).
4. Samadzadeh, M., Boura, S. H., Peikari, M., Kasiriha, S. M. & Ashrafi, A. A review on self-healing coatings based on micro/nanocapsules. _Prog. Org. Coat._ 68, 159–164 (2010).
5. Shchukin, D. G. Container-based multifunctional self-healing polymer coatings. _Polym. Chem._ 4, 4871–4877 (2013).
6. Canadell, J., Goossens, H. & Klumperman, B. Self-healing materials based on disulfide links. _Macromolecules_ 44, 2536–2541 (2011).
7. Kuang, X. et al. Facile fabrication of fast recyclable and multiple self-healing epoxy materials through Diels-Alder adduct cross-linker. _J. Polym. Sci. Pol. Chem._ 53, 2094–2103 (2015).
8. Wen, N. et al. Recent advancements in self-healing materials: Mechanicals, performances and features. _React. Funct. Polym._ 168, 105041 (2021).
9. Han, Y., Wu, X., Zhang, X. & Lu, C. Self-healing, highly sensitive electronic sensors enabled by metal–ligand coordination and hierarchical structure design. _ACS Appl. Mater. Inter._ 9, 20106–20114 (2017).
10. Nardeli, J. V., Fugivara, C. S., Taryba, M., Montemor, M. F. & Benedetti, A. V. Self-healing ability based on hydrogen bonds in organic coatings for corrosion protection of AA1200. _Corros. Sci._ 177, 108984 (2020).
11. Liu, T. et al. Ultrafast and high-efficient self-healing epoxy coatings with active multiple hydrogen bonds for corrosion protection. _Corros. Sci._ 187, 109485 (2021).
12. Kim, G., Caglayan, C. & Yun, G. J. Epoxy-based catalyst-free self-healing elastomers at room temperature employing aromatic disulfide and hydrogen bonds. _ACS Omega_ 7, 44750–44761 (2022).
13. Bosman, A., Brunsveld, L., Folmer, B., Sijbesma, R. & Meijer, E. _Macromol. Symp._ 201, 143–154 (2003).
14. Rosero-Navarro, N. C., Pellice, S. A., Durán, A. & Aparicio, M. Effects of Ce-containing sol–gel coatings reinforced with SiO2 nanoparticles on the protection of AA2024. _Corros. Sci._ 50, 1283–1291 (2008).
15. Wang, J. et al. Two birds with one stone: Nanocontainers with synergetic inhibition and corrosion sensing abilities towards intelligent self-healing and self-reporting coating. _Chem. Eng. J._ 433, 134515 (2022).
16. Fan, Z. et al. Self-healing mechanisms in smart protective coatings: a review. _Corros. Sci._ 144, 74–88 (2018).
17. Tao, Q., Xu, P., Li, M. & Lu, W. Machine learning for perovskite materials design and discovery. _npj Comput. Mater._ 7, 23 (2021).
18. Li, Z. et al. Machine learning in concrete science: applications, challenges, and best practices. _npj Comput. Mater._ 8, 127 (2022).
19. Zhong, X. et al. Explainable machine learning in materials science. _npj Comput. Mater._ 8, 204 (2022).
20. Taylor, C. D. & Tossey, B. M. High temperature oxidation of corrosion resistant alloys from machine learning. _npj Mater. Degrad._ 5, 38 (2021).
21. Li, Q. et al. Long-term corrosion monitoring of carbon steels and environmental correlation analysis via the random forest method. _npj Mater. Degrad._ 6, 1 (2022).
22. Al-Haik, M. S., Hussaini, M. Y. & Garmestani, H. Prediction of nonlinear viscoelastic behavior of polymeric composites using an artificial neural network. _Int. J. Plast._ 22, 1367–1392 (2006).
23. Hatakeyama-Sato, K., Tezuka, T., Umeki, M. & Oyaizu, K. AI-assisted exploration of superionic glass-type Li(+) conductors with aromatic structures. _J. Am. Chem. Soc._ 142, 3301–3305 (2020).
24. Askland, K. D. et al. Prediction of remission in obsessive compulsive disorder using a novel machine learning strategy. _Int. J. Methods Psychiatr. Res._ 24, 156–169 (2015).
25. Shao, M., Zhu, X.-J., Cao, H.-F. & Shen, H.-F. An artificial neural network ensemble method for fault diagnosis of proton exchange membrane fuel cell system. _Energy_ 67, 268–275 (2014).
26. Xu, P., Ji, X., Li, M. & Lu, W. Small data machine learning in materials science. _npj Comput. Mater._ 9, 42 (2023).
27. Sutojo, T. et al. A machine learning approach for corrosion small datasets. _npj Mater. Degrad._ 7, 18 (2023).
28. Xiang, K.-L., Xiang, P.-Y. & Wu, Y.-P. Prediction of the fatigue life of natural rubber composites by artificial neural network approaches. _Mater. Des._ 57, 180–185 (2014).
29. Menon, A., Thompson-Colón, J. A. & Washburn, N. R. Hierarchical machine learning model for mechanical property predictions of polyurethane elastomers from small datasets. _Front. Mater._ 6, 87 (2019).
30. Pruksawan, S., Lambard, G., Samitsu, S., Sodeyama, K. & Naito, M. Prediction and optimization of epoxy adhesive strength from a small dataset through active learning. _Sci. Technol. Adv. Mater._ 20, 1010–1021 (2019).
31. Li, D., Liu, J. & Liu, J. NNI-SMOTE-XGBoost: A novel small sample analysis method for properties prediction of polymer materials. _Macromol. Theory Simul._ 30, 2100010 (2021).
32. Novikov, I. S., Shapeev, A. V. & Suleimanov, Y. V. Ring polymer molecular dynamics and active learning of moment tensor potential for gas-phase barrierless reactions: Application to S + H2. _J. Chem. Phys._ 151, 224105 (2019).
33. Kim, C., Chandrasekaran, A., Jha, A. & Ramprasad, R. Active-learning and materials design: the example of high glass transition temperature polymers. _MRS Commun._ 9, 860–866 (2019).
34. Jha, A., Chandrasekaran, A., Kim, C. & Ramprasad, R. Impact of dataset uncertainties on machine learning model predictions: the example of polymer glass transition temperatures. _Model. Simul. Mater. Sci. Eng._ 27, 024002 (2019).
35. Mandl, R. Orthogonal Latin squares: an application of experiment design to compiler testing. _Commun. ACM_ 28, 1054–1058 (1985).
36. Balak, Z. & Zakeri, M. Application of Taguchi L32 orthogonal design to optimize flexural strength of ZrB2-based composites prepared by spark plasma sintering. _Int. J. Refract. Met. H._ 55, 58–67 (2016).
37. Wu, C. J. & Hamada, M. S. _Experiments: planning, analysis, and optimization_ (John Wiley & Sons, 2011).
38. Breiman, L. Random forests. _Mach. Learn._ 45, 5–32 (2001).
39. Ji, Y. et al. Random forest incorporating ab-initio calculations for corrosion rate prediction with small sample Al alloys data. _npj Mater. Degrad._ 6, 83 (2022).
40. Packwood, D. _Bayesian Optimization for Materials Science_. 11–28 (Springer, 2017).
41. Wagner, T., Emmerich, M., Deutz, A. & Ponweiser, W. in _Parallel Problem Solving from Nature, PPSN XI: 11th International Conference, Kraków, Poland, September 11-15, 2010, Proceedings, Part I 11_. 718–727 (Springer).
42. Mohanty, T., Chandran, K. & Sparks, T. D. Machine learning guided optimal composition selection of niobium alloys for high temperature applications. _APL Mach. Learn._ 1, 036102 (2023).
43. Cui, G. et al. Research progress on self-healing polymer/graphene anticorrosion coatings. _Prog. Org. Coat._ 155, 106231 (2021).
44. Nawaz, M., Habib, S., Khan, A., Shakoor, R. A. & Kahraman, R. Cellulose microfibers (CMFs) as a smart carrier for autonomous self-healing in epoxy coatings. _N. J. Chem._ 44, 5702–5710 (2020).
45. Zhang, C., Wang, H. & Zhou, Q. Preparation and characterization of microcapsules based self-healing coatings containing epoxy ester as healing agent. _Prog. Org. Coat._ 125, 403–410 (2018).
46. Wang, T. et al. Photothermal nanofiller-based polydimethylsiloxane anticorrosion coating with multiple cyclic self-healing and long-term self-healing performance. _Chem. Eng. J._ 446, 137077 (2022).
47. Zheng, N., Fang, G., Cao, Z., Zhao, Q. & Xie, T. High strain epoxy shape memory polymer. _Polym. Chem._ 6, 3046–3053 (2015).
48. Li, J., Rodgers, W. R. & Xie, T. Semi-crystalline two-way shape memory elastomer. _Polymer_ 52, 5320–5325 (2011).
49. Oliveira, C. & Ferreira, M. Ranking high-quality paint systems using EIS. Part I: intact coatings. _Corros. Sci._ 45, 123–138 (2003).
50. Hao, Y., Sani, L. A., Ge, T. & Fang, Q. Phytic acid doped polyaniline containing epoxy coatings for corrosion protection of Q235 carbon steel. _Appl. Surf. Sci._ 419, 826–837 (2017).
51. Liu, T. et al. Self-healing and corrosion-sensing coatings based on pH-sensitive MOF-capped microcontainers for intelligent corrosion control. _Chem. Eng. J._ 454, 140335 (2023).
52. Tavandashti, N. P. et al. Inhibitor-loaded conducting polymer capsules for active corrosion protection of coating defects. _Corros. Sci._ 112, 138–149 (2016).
53. Zheng, X., Zheng, P. & Zhang, R.-Z. Machine learning material properties from the periodic table using convolutional neural networks. _Chem. Sci._ 9, 8426–8432 (2018).
54. De Myttenaere, A., Golden, B., Le Grand, B. & Rossi, F. Mean absolute percentage error for regression models. _Neurocomputing_ 192, 38–48 (2016).
55. Fukutani, T., Miyazawa, K., Iwata, S. & Satoh, H. G-RMSD: Root mean square deviation based method for three-dimensional molecular similarity determination. _Bull. Chem. Soc. Jpn._ 94, 655–665 (2021).
56. Uyanık, T., Karatuğ, Ç. & Arslanoğlu, Y. Machine learning approach to ship fuel consumption: A case of container vessel. _Transp. Res. D.-T. E._ 84, 102389 (2020).
57. Chen, S., Cao, H., Ouyang, Q., Wu, X. & Qian, Q. ALDS: An active learning method for multi-source materials data screening and materials design. _Mater. Des._ 223, 111092 (2022).
58. Faraji Niri, M., Reynolds, C., Román Ramírez, L. A. A., Kendrick, E. & Marco, J. Systematic analysis of the impact of slurry coating on manufacture of Li-ion battery electrodes via explainable machine learning. _Energy Storage Mater._ 51, 223–238 (2022).
59. Bishop, C. M. & Nasrabadi, N. M. _Pattern Recognition and Machine

Learning_. 4 (Springer, 2006).

ACKNOWLEDGEMENTS

This work is supported by National Key R&D Program of China (2022YFB3808803).

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Beijing Advanced Innovation Center for Materials Genome Engineering, Institute for Advanced Materials and Technology, University of Science and Technology Beijing, Beijing, 100083, China: Tong Liu, Zhuoyao Chen, Jingzhi Yang, Lingwei Ma & Dawei Zhang
* National Materials Corrosion and Protection Data Center, University of Science and Technology Beijing, Beijing, 100083, China: Tong Liu, Zhuoyao Chen, Jingzhi Yang, Lingwei Ma & Dawei Zhang
* College of Materials Science and Engineering, Shenyang University of Chemical Technology, Shenyang, 110142, China: Tong Liu
* Institute of Materials Intelligent Technology, Liaoning Academy of Materials, Shenyang, 110004, China: Lingwei Ma & Dawei Zhang
* Department of Materials Science and Engineering, Delft University of Technology, Mekelweg 2, Delft, 2628CD, The Netherlands: Arjan Mol

CONTRIBUTIONS

T.L.: investigation, methodology, and writing (original draft). Z.C.: investigation and methodology. J.Y.: investigation. L.M.: investigation. A.M.: writing (review and editing). D.Z.: supervision, conceptualization, methodology, and writing (review and editing).

CORRESPONDING AUTHOR

Correspondence to Dawei Zhang.

ETHICS DECLARATIONS

COMPETING INTERESTS

The authors declare no competing interests.

ADDITIONAL INFORMATION

PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

RIGHTS AND PERMISSIONS

OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Liu, T., Chen, Z., Yang, J. _et al._ Machine learning assisted discovery of high-efficiency self-healing epoxy coating for corrosion protection. _npj Mater Degrad_ 8, 11 (2024). https://doi.org/10.1038/s41529-024-00427-z

* Received: 11 June 2023
* Accepted: 04 January 2024
* Published: 19 January 2024
* DOI: https://doi.org/10.1038/s41529-024-00427-z