Harmonizing existing climate change mitigation policy datasets with a hybrid machine learning approach

Play all audios:

ABSTRACT With the rapid proliferation of climate policies in both number and scope, there is an increasing demand for a global-level dataset that provides multi-indicator information on

policy elements and their implementation contexts. To address this need, we developed the Global Climate Change Mitigation Policy Dataset (GCCMPD) using a semisupervised hybrid machine

learning approach, drawing upon policy information from global, regional, and sector-specific sources. Differing from existing climate policy datasets, the GCCMPD covers a large range of

policies, amounting to 73,625 policies of 216 entities. Through the integration of expert knowledge-based dictionary mapping, probability statistics methods, and advanced natural language

processing technology, the GCCMPD offers detailed classification of multiple indicators and consistent information on sectoral policy instruments. This includes insights into objectives,

target sectors, instruments, legal compulsion, administrative entities, etc. By aligning with the sector classification of the Intergovernmental Panel on Climate Change (IPCC) emission

datasets, the GCCMPD serves to help policy-makers, researchers, and social organizations gain a deeper understanding of the similarities and distinctions among climate activities across

countries, sectors, and entities. SIMILAR CONTENT BEING VIEWED BY OTHERS MACHINE LEARNING MAP OF CLIMATE POLICY LITERATURE REVEALS DISPARITIES BETWEEN SCIENTIFIC ATTENTION, POLICY DENSITY,

AND EMISSIONS Article Open access 11 February 2025 MEASURING CHINA’S POLICY STRINGENCY ON CLIMATE CHANGE FOR 1954–2022 Article Open access 31 January 2025 A COMPUTATIONAL APPROACH TO

ANALYZING CLIMATE STRATEGIES OF CITIES PLEDGING NET ZERO Article Open access 26 August 2022 BACKGROUND & SUMMARY With the first World Climate Conference in 1979 marking the start of an

international focus on climate change issues, the number and scope of climate-related policies have increased substantially. At the global level, an international governance scheme

formulated the Kyoto Protocol in 1997 and the Paris Agreement in 2015 to set up national mitigation targets and ancillary mechanisms. National, subnational and sectoral policies have also

flourished in recent years, with some local consideration of other objectives such as environmental protection, economic development, equity thinking and sustainable development.

Quantitative policy studies have emerged accordingly, either assessing the specific policy effect empirically or simulating the policy impacts virtually. The literature focusing on the

analysis of a single policy typically involves only a single country (or even a specific industry in a certain country)1,2,3,4,5,6 or a comparison of a few countries7,8, thus requiring a

limited amount of policy data. A more detailed and consistent climate policy dataset enables large-scale quantitative analysis of global climate policies and provides new information for

policy-makers. For example, through detailed classification based on direct and indirect laws, the differences in the coverage scope of greenhouse gas targets can be analysed9. Through the

classification of instruments, various patterns of policy convergence and divergence across countries and the driving factors behind them can be compared10. On the other hand, the

differences in the circumstances of policy formulation and implementation are driving wide variations with similar policy instruments, which have not been fully captured for better

evaluating the policy effects. In addition, some rising concerns, such as the constraints of policy adoption in developing economies, the synergistic effects of the policy mix, and the

spillover effects of market-based instruments across regions, call for more comparative studies with global or cross-regional perspectives. Such demands in policy science require a more

finely categorized, multi-indicator climate policy dataset that allows for convincing investigation of policy design under the premise of clarifying the implementation context. Furthermore,

the climate policy literature has diversified significantly with the emergence of new climate practices and the availability of more advanced data processing methods. According to SciVal

(see Supplementary Information (SI1) for detailed search terms), the number of articles in climate policy-related research areas has been increasing annually since 1997, reaching 6,946

articles in 2022. Concurrently, the number of topics within the global climate policy research domain has also expanded, rising from 499 in 2018 to 751 in 2022 (Supplementary Figs. 1-2).

Climate policy evolution represents one of the emerging branches within this research domain, encompassing various aspects such as policy coverage9,11,12, factors influencing policy

adoption10,13,14, policy diffusion15,16, and policy themes and trends17,18,19,20,21. The utilization of policy quantity as the primary variable has broadened the scope of research on policy

effects and their relationship with greenhouse gases22,23,24. At the same time, there has been a gradual shift towards more detailed and consistent exploration of policy design and

comparison, adopting a mixed policy perspective25,26,27,28,29,30. In addition, research on the trade-offs between multiple goals31,32,33,34 also requires the support of multi-indicator

policy datasets. To the best of our knowledge, the existing global climate policy datasets are still not sufficient to meet the above research demands. The following points are taken as an

example. (1) The lack of policy instruments for specific sectors prevents in-depth research on the policy mix for specific sectors. The lack of linkage between sectoral policies and sectoral

carbon emissions, coupled with the varying focus of existing datasets on sectors35, further compounds these limitations. (2) Variations in policy versions and coverage can also undermine

the robustness of research that relies on policy density as a metric35. (3) The separation of the law from a large number of supportive and weaker policies will also cause the role of

supportive policies to be ignored36. The above shortcomings can be summarized as follows. First, the numbers and scopes of policies are still not complete, and updates cannot be made in time

due to manual methods of data collection and processing. Second, there is a lack of consistent and more detailed policy categorization in comparable standards, partly due to the overlapping

scope of sectors, complexity of entities involved, and difficulty in identifying indirect policies. These limitations also make it impractical to conduct comparative analysis for the

identification of additional features across datasets. Additionally, studies comparing existing mainstream global datasets have shown that differences in data coverage and inconsistent

classification standards lead to inconsistent results when using different datasets35. Finally, legal enforcement characteristics and information about administrative entities are crucial in

driving changes in policy design. The aforementioned features may impact the practical outcomes of policy implementation and serve as a complement to policy ambitions35,36. We attempted to

construct the Global Climate Change Mitigation Policy Dataset (GCCMPD) to fill the gap between the existing climate policy databsets and the increasing demand for research by harmonizing the

existing datasets with a hybrid machine learning approach. The GCCMPD currently includes 73,625 policies of 216 entities, with a substantial increase in the numbers and scopes of policies

through a complete search of available sources. For each policy in the dataset, we extracted important policy features such as target sector, policy instrument, objective, binding force,

executive/legislative, jurisdiction, etc. We mainly used semisupervised machine learning and combined various natural language processing methods to achieve consistent classification of each

feature of the policy, which enhances the objectivity and extensibility of the data. Compared with the existing datasets, our dataset can better respond to the demands of empirical studies

and cross-national comparisons of policies. First, our dataset can be used to build more accurate policy indicators. On the one hand, our dataset includes more policy records than any

individual dataset and has fewer omitted policies, which can ease the underestimation of the efforts of the entities to mitigate climate change. Coordinated climate policies can also enrich

single or small n-case country case studies30,35,37. On the other hand, our dataset provides features of the policies, including the binding force, the executive/legislative and the

objectives. Since the differences in the circumstances of policy formulation and implementation can vary greatly among policies with different binding forces (for example, laws and

preparative instruments) and strongly influence the output of the policies, future studies can weigh the number of policies with their binding force or add the number of soft and hard laws

into their model as two separate variables rather than simply counting the number of policies23,24,35,36. Second, our dataset enables us to measure the output of sectoral policies. Our

dataset categorizes the policies into sectors according to the IPCC Sixth Assessment Report (AR6) standard and can be easily linked to emission datasets with the same standard, such as the

Emissions Database for Global Atmospheric Research (EDGAR). Thus, with the sectoral emission data from these datasets and climate policy indicators calculated by sector instruments with our

dataset, the sectoral emission reduction performance of the policy system (rather than the emission reduction performance of single policies38,39,40,41) can be measured. Third, the policy

similarity information of our dataset supports a better exploration of climate policy diffusion. We provide information on the similarity of the context of climate policies, facilitating the

inference of policy diffusion using both temporal and contextual data. This approach has the potential to outperform existing studies42, which primarily rely on inferring the probability of

diffusion from one entity to another based solely on the time gap between the adoption of climate policies in the two entities. Finally, our dataset includes information on the binding

force and policy objectives, enabling studies to explore the relationship between these features and other factors. For example, researchers can investigate how various conditions, such as

economic or geographic factors, may influence the binding force of policies, as well as the differences in policy outcomes associated with different levels of binding force. Moreover, our

dataset enables studies to focus on the dynamics between hard law and soft law, such as examining how soft law transitions into hard law and to what extent different types of laws influence

emissions43,44. METHODS We constructed the GCCMPD using a multilevel process and framework (Fig. 1). Specifically, this entailed source identification and selection of policy data, selection

of policy characteristic indicators, data collection, data processing, manual checks, verification, annotation, dataset expansion and some possible uses. The final data are also stored in

different ways to facilitate various types of users. DATA IDENTIFICATION Before identifying the data source, it is important to define the boundaries of the dataset described in this paper.

At present, there is no clear definition of climate policy, and existing attempts to define climate policy have adopted very broad and vague scopes20. The reason is that climate policy

covers a wide range of areas. For example, greenhouse gases and pollutants have common sources45, resulting in considerable overlap between climate policy and energy and environmental

policy. Sometimes climate policy is directly defined as policies to reduce greenhouse gases and air pollution. Our purpose is to construct a detailed classification of multiple-indicator

multipurpose datasets, so we use a relatively broad definition to define climate policy as a set of laws, regulations, strategies, and other measures related to climate change, which can be

further divided into climate mitigation and climate adaptation policies. Dataset users can filter by attributes such as sources to obtain more targeted subsets of data, despite the broad

boundaries of the dataset. According to the above definition, the GCCMPD roughly consists of three sources (see Table 1). First, global datasets, the Climate Change Laws of the World (CCLW),

the International Energy Agency (IEA) Policies and Measures Dataset, and the Climate Policy dataset (CP) were used. Second, regional datasets, such as the Asia Pacific Energy Portal (APEP)

and the European Environment Agency (EEA), cover a certain area and have a considerable number of policies. The third category is datasets that focus on specialized fields such as

technology, policy instruments, and legislation. These data are extremely specialized but narrow in scope. For the ECOLEX dataset, we selected climate-related subjects (see Supplementary

Table 1 for details). We selected IEA, CP, and CCLW as the training sets for manual annotation, accounting for approximately 16% of all data sources (excluding the legal part of ECOLEX,

which accounts for approximately 70%). Since the GCCMPD is a multi-indicator dataset, it is appropriate to choose authoritative and diverse data sources, and these three datasets have

gradually been recognized by the academic community18,22,23. We first harmonized the three datasets to form a “core” dataset via the natural language processing technique and manual

checking. The “core” dataset can support the demand of analysing global climate mitigation policies since these three datasets cover various types of climate policies (laws, commitment

agreements, goals; national level, subnational level) from almost all of the entities in the world, which can basically guarantee the diversity and representativeness of the core dataset.

Additionally, we noticed that the data coverage of the “core” dataset can be further improved by including data from other data sources. Then, we used the machine learning models trained on

the manually checked core dataset to label data from other sources and formed a full dataset. The IEA, CP, and CCLW datasets have great potential to complement each other in terms of country

coverage, sector coverage, laws and nonlaws, policy versions, policy scope, etc. CCLW is a specialized climate law dataset that collects climate laws or equally important policies23, with

the highest policy importance but a relatively small number of policies. The IEA was the first of the three datasets to collect policy data on energy-related policies in the building sector.

CP is a synthesized dataset containing policies from other datasets (including CCLW and IEA) and various reporting sources, characterized by detailed information on policy targets9,12,

instruments and sectors11. After harmonizing with the IEA and CP datasets, the following complements were formed: (1) subnational policies and early policy versions; (2) nonlegal supportive

“weaker” policies; (3) specific sector policies (such as the building sector); and (4) more balanced national policies. A detailed comparison of the main characteristics of the three

datasets is given in Supplementary Table 2. The main purpose of comparison is to find which classification information (the fields in bold in the table) of these three datasets can help us

classify. INDICATOR SELECTION AND DESCRIPTION The GCCMPD aims to provide a multi-indicator dataset that can be researched and consulted by scholars in various fields, policy-makers, and

various social organizations. We mainly chose the following indicators to meet different research needs: sector, instrument, objective, legally binding force, enforcement settings and

jurisdiction. Due to the diverse sources of the GCCMPD, choosing a unified standard and definition is conducive to maintaining the consistency of the dataset. * SECTOR: One of the

characteristics of the GCCMPD is that it can be closely integrated with the carbon emission dataset, which requires sectoral classification of policies to match them with GHG emissions. Most

mainstream GHG emissions datasets46,47 refer to the IPCC 2006 reporting guidelines48 for sectoral classification and generally attribute global GHG emission sources to five broad sectors49

of energy systems, industry, buildings, transport and AFOLU (agriculture, forestry and other land uses). For the above reasons, the GCCMPD also divides sectors into these five sectors,

adopting the same sectoral classification standards as the carbon emissions dataset based on the EDGAR50, and the classification standards and descriptions are shown in Supplementary Table

3. * INSTRUMENT: A taxonomy of policy instruments can help policy-makers formulate policy packages, provide possible options for policy practice, and better evaluate climate policies. Some

taxonomies, such as the NATO scheme51 (A widely-used standard that categorizes policies based on the types of governing resources they rely on: Nodality, Authority, Treasure, and

Organization.), represent general criteria and are less specific to a certain domain of climate mitigation. We chose the classification standard of the IPCC Fifth Assessment Report (AR5)52,

which has the following advantages. First, the classification is authoritative. The classification criteria are summarized by climate experts based on a large amount of literature,

reflecting the climate policies that the academic community focuses on. Second, after the modification from the Third Assessment Report (TAR)53 to the AR5, the classification has strong

completeness and timeliness. Third, there are sector-level classifications of policy instruments for the above five sectors, which can help to understand and use them in conjunction with

sectoral classifications. As a result, policy instruments are divided into seven categories: taxes, tradable allowances, subsidies, government provisions of public goods or services,

information programmes, regulatory approaches, and voluntary actions. Detailed sector-level standards and examples can be found in Supplementary Table 4. * OBJECTIVE: The mitigation of

climate change is only one of many public issues, and sometimes it is necessary to judge whether climate change has co-benefits and adverse side effects on society, the economy, and the

environment. For almost the same reasons as “instruments”, we chose the AR5 taxonomy, which categorizes climate mitigation impacts on economic, social, and environmental objectives/issues.

The detailed subobjective classification can be found in Supplementary Table 5. * BINDING FORCE: Whether the same policy instrument is legally binding affects its effectiveness, which is

also the core difference between hard law and soft law. The concept of soft law was originally common in the EU and international law54. At present, countries need to consider multiple goals

while mitigating climate change. Therefore, soft law has become a flexible choice when making climate commitments cautiously. Both soft and hard laws have advantages and disadvantages. Hard

law usually balances the interests of all parties and is more democratic but has a long promulgation cycle and high costs. Soft law is formulated and adopted more quickly, but at the same

time, there are problems of poor consideration and they can be overly ambitious55. In addition, the soft law is a good complement to the hard law, and its functions can be divided into

prelaw functions, postlaw functions, and para-law functions55. For soft law, we refer to the classification based on function (as an alternative to legislation) proposed by Senden, L55,

which divides soft law into three main categories: preparatory and informative instruments, interpretative and decisional instruments, and steering instruments. For hard laws, we mainly

consider the priority when two hard laws conflict, that is, the legal hierarchy56. We refer to the legal hierarchy summarized by Clegg, M. _et al_.56, combined with the legal hierarchy of

the European Union, the United States and other countries, and divide hard law into three main categories: constitution, statutes/legislation, and regulations/rules. The classification rules

and descriptions of soft law and hard law can be found in Supplementary Table 6, and some examples are shown in Supplementary Table 7. * EXECUTIVE/LEGISLATIVE: The enactment of policies by

executive or legislative bodies can partially measure the durability of policies (assuming that executive measures can more easily be removed by new governments, while laws are more durable,

given that their removal requires the approval of the legislative body), referring to Iacobuta _et al_.9 Policy durability has attracted attention9 because it affects the expectations of

policy implementation objects. For example, expectations regarding the durability of policies may affect short-term and long-term decision-making judgements of enterprises, thereby impacting

the effectiveness of policy implementation. What is slightly different is that, like the CCLW, the GCCMPD considers whether policies are enacted by the legislative or executive branch of

the government on the basis of law or strategy. For example, China’s 14th five-year plan is classified as executive in the CCLW and GCCMPD but not legislative. * JURISDICTION: In addition to

classifying the scope of policy from an industrial perspective, such as sector, the classification of policy jurisdiction can help to better analyse the interaction between central and

local governments57, international competition and multilateral cooperation58, and regional emissions governance59. Therefore, we divided policy jurisdictions into international, national,

subnational area (the broader category with multiple provinces), and subnational (single state/provinces, municipality/city) jurisdictions. DATA COLLECTION Some challenges of data collection

are that there are few websites that can be directly packaged and downloaded, the webpage rules of each data source are different, and the rules are constantly updated. First, important

policy features were manually selected based on the characteristics of each website (all policies include at least title, content, and year). Then, Python toolkits such as Requests (an

elegant and simple library allows you to send HTTP requests) and Selenium (is used to automate web browser interaction from Python) were used to obtain the data. All the raw data are stored

in document form (CSV, EXCEL) and data platform form (MongoDB). DATA PROCESSING Through steps 1-12 of the data processing shown in Fig. 1, we constructed a reference (original) version of

the training set data as of December 31, 2021. To ensure the consistency of manual annotation and reduce manual annotation errors, the classification information of the IEA, CP, and CCLW

datasets was used as much as possible. As shown in Supplementary Table 2, among the main indicators of the GCCMPD, the preliminary classifications of sector, instrument, objective (including

detailed classification of subsector, sector-instrument and subobjective), and jurisdiction can refer to the IEA, CP, and CCLW classification information. The two indicators of binding

force and executive/legislative can only partially refer to the IEA, CP, and CCLW classification information. Therefore, additional legal keywords were constructed with reference to legal

dictionaries, such as Black’s law dictionary (https://thelawdictionary.org/) and Duhaime’s law dictionary (http://www.duhaime.org/), to assist in classification. Therefore, these steps were

intuitively divided into three categories: IEA, CP, and CCLW classification information (steps 2, 4, 6, and 11); IEA, CP, and CCLW information (steps 7, 8, and 9); and simple rules and

mappings (remaining steps). A detailed description of the processing steps can be found in the Supplementary Information (SI2); the mapping dictionary built based on IEA, CP, and CCLW can be

found in Supplementary Table 8-16; and the keywords built with reference to the legal dictionaries can be found in Supplementary Table 17-18. Although there are differences in the

definitions of mitigation and adaptation, thus, the two are strongly related60. Many climate policies often involve both mitigation and adaptation policies, and different datasets differ in

terms of whether the same climate policy involves mitigation or adaptation. The GCCMPD adopts a negative list method to construct keywords (Supplementary Table 19) that significantly

characterizes adaptation policies and remove them (One of the keywords in Supplementary Table 19 appears in the policy title or content, then the policy is judged to be climate adaptation.)

from the dataset. Below are the descriptive statistics of the training set. The training set contains climate policy data from 199 entities (198 countries and the European Union). Table 2

illustrates the distribution of entities and policies on each continent. According to Table 2, the number of African, Asian, and European entities is the largest. For the number of policies,

although the number of European entities is not the largest, European entities issued the most policies among the six continents. The average number of policies issued by each entity is the

highest in Europe (89.96) and lowest in Africa (16.09). There are 10088 policies ranging from 1927 to 2021 (among them, several policies in the IEA dataset are scheduled to enter into force

in 2025) included in our core dataset. As Table 3 shows, national climate policies make up approximately 90% of our sample, while international policies account for 3% and subnational

policies for 7%. We divided the policies into five sectors and found that most of the policies targeted energy systems (3871) and that the fewest policies focused on AFOLU (838). There are

also 3843 policies focused on the whole economic level rather than on a single sector or several other sectors. Policies are divided into seven types according to the instrument they apply.

Most policies employ the government provision of public goods or services (7040) and regulatory approaches (4467), while only 121 policies use the tradable allowance instrument. For the

binding force of the policies, soft laws constitute a greater share of the sample (69.73%), while hard laws constitute only 30.27% of the sample. Law/Act is the most popular hard law, while

Preparatory Instruments is the most widely used soft law. The classification of co-benefit objectives is mainly economic objective, followed by social objective. As Table 4 shows, by

combining these three datasets, the number of policies in each sector has almost doubled. All four datasets have a high share of energy sector policies. In addition, the integration of the

construction sector and the AFOLU sector shows the benefits of coordination. The IEA dataset also has a high proportion of construction sector policies, while the CCLW dataset has a

relatively high proportion of AFOLU sector policies. GCCMPD generates complementary advantages through the integration of these datasets, further highlighting the significant potential for

complementarity35 between the IEA, CP and CCLW. MANUAL CHECK, VERIFICATION, ANNOTATION CHECK FOR DUPLICATES Although merging the three datasets can combine their respective strengths, there

will be some duplication of policies, as the three datasets share many of the same sources. Duplicate policies do not affect analysis, such as policy coverage9,11,12 and policy

sequencing18,21, but do affect analysis based on policy strength23,24. Based on the Best Match 25 algorithm (BM2561, a ranking algorithm based on probabilistic relevance framework for

generating the similarity score between two texts), we used all policy titles as a corpus, grouped the data by country, year, and jurisdiction, and calculated the text similarity score

between policies. Human checks combined with text similarity scores eventually revealed duplicate policies (Supplementary Fig. 3). VERIFICATION AND ANNOTATION It is also necessary to

manually check, verify and annotate each indicator. Based on the reference version, the manual labelling bias was reduced. During manual verification, we also found that there were some

policy instruments and sectors that cannot be found directly from the policy content, and the reference version was used to assist the manual work in making more accurate annotations. After

manual checking and correcting the dictionary mapping several times, we found that even graduate students majoring in energy and climate were prone to label mistakes. Here are a few

illustrations of how to manually check policy sectors. We judged whether the policies belonged to the energy system or building sector based on whether the home solar PV system was connected

to the grid because self-sufficiency alone cannot be considered the energy supply. Heating and cooling generally refer to household heating and cooling appliances (buildings) rather than

district heating (energy systems). Fossil fuel extraction (energy systems) depends mainly on how the carbon emissions dataset is classified. Standards for building parking spaces to promote

the use of new energy vehicles are categorized as transport and building policies. Equipment safety and photovoltaics are considered industry policies. Biofuels are considered transport

policies, biomass power generation is considered for energy systems, and the manufacture of biofuels may involve industry and AFOLU. It should be noted that climate policy data differ from

carbon emissions data. It is very common for a policy to involve multiple sectors. For the instrument indicator, the tax category is relatively easy to mislabel. To distinguish it from

subsidies, new taxes and tax increases are classified as taxes, and tax reductions and accelerated depreciation are classified as subsidies. Since the tax classification of the IEA is based

on whether a certain type of tax appears in the policy text, the tax reduction will also be recorded as a tax, so it needs to be manually corrected. In addition, grants, awards, R&D

subsidies, etc., also need to be classified as subsidies or/and government provisions of public goods or services according to the content of the policy. Since the government provision of

public goods or services involves capacity building and the removal of barriers, it includes the establishment of institutions, the provision of financial support, etc. Another thing to

point out is that subsectors (sector-instrument and subobjective) are more detailed but not completely classified; that is, this does not mean that a policy classified as a certain sector

(instrument or objective) will necessarily be further classified into a certain subsector (sector-instrument and subobjective) (Supplementary Figs. 4-6). For example, some energy plans

mentioned several sectors but did not involve specific subsectors. Some government funds support renewable energy power generation without specifying whether it is investment, R&D

support, or subsidies. SEARCH, VERIFICATION, AND SUPPLEMENTATION OF POLICY CONTENT The policy content also requires many manual searches and verification. According to Table 5, many policy

contents in the CP and IEA are missing or too short, and this will affect the effectiveness of manual checking indicators, machine learning model training, and topic models. We tried to find

the content of the policy by searching the source webpage. For some countries whose official language is English (e.g., Canada), the source web page was invalid or incorrect, and for

countries whose official language is not English (e.g., Argentina, Spain, Germany, Japan, etc.), there were language problems. We used Google Search (when the source web page was invalid or

incorrect) in addition to optical character recognition (OCR) technology and Google Translate for scanned documents and language issues. Through the above efforts, the missing value of

policy content has been greatly reduced, and the quality of policy content has improved (Table 5). In addition, manual verification also included translating the policy title and content

into English and correcting the policy year and jurisdiction. DATASET EXPANSION After the above data processing and manual inspection, a refined training dataset containing 10088 policies

was formed (GCCMPD-IEA-CP-CCLW). However, similar to other datasets, the GCCMPD-IEA-CP-CCLW still had shortcomings, such as high labour costs, slow update speed, and difficulty expanding

(some data sources do not have relevant indicator information and cannot form dictionary maps). In this section, we used several state-of-the-art natural language processing (NLP) and

machine learning techniques to expand the dataset. As shown in Table 6, through the application of machine translation, multilabel classification, single-label classification, named entity

recognition, text similarity, and topic models, dataset construction was completed only by relying on policy titles and content. MULTILABEL AND SINGLE-LABEL CLASSIFICATION The GCCMPD’s

indicators, sector, instrument, and objective are multilabel classifications, while binding force and executive/legislative are single-label classifications. We chose the ClimateBERT62

algorithm, which is based on the Bidirectional Encoder Representations from Transformers (BERT) model, because it has high performance63,64, relatively low fine-tuning cost compared with

large language models such as Generative pretrained transformer (GPT) or Large Language Model Meta AI (Llama), and concentration on climate change (pretrained on the text corpus of abstracts

of climate-related research papers, corporate and general news, and company reports, and because it is currently the latest model in the field of climate change), and it is widely used in

existing studies65,66. By comparing ClimateBERT and the traditional machine learning models logistic regression classifier (LR), naïve Bayes (NB), and support vector machine (SVM) with

ClimateBERT, the results (Supplementary Table 21–25) demonstrate the advantages of ClimateBERT in the climate field. The detailed model training details and comparisons can be found in the

Supplementary Information (SI3). NAMED ENTITY RECOGNITION (NER) Judging climate policy jurisdictions by identifying states/provinces and countries that appear in policies through NER is an

intuitive and highly interpretable method. Specifically, we first used the Roberta-based67 “en_core_web_trf” model to identify geopolitical entities and then use “countryinfo” to distinguish

“state/province” or “country”. Furthermore, the “Subnational area” and “International” were determined according to the number of different cities and countries in a policy (Supplementary

Table 26). TEXT SIMILARITY In the GCCMPD-IEA-CP-CCLW, after calculating the BM25 score, we determined the duplication of policies from different sources by manual checking in groups.

However, for data expansion, two issues needed to be considered: 1. When the amount of data increases rapidly, methods based on a large amount of labour are no longer applicable. 2. How can

we objectively judge whether two policies are the same? Some text similarity methods, such as bag-of-words/TFIDF combined with cosine similarity, can obtain a standardized value, but

thresholds such as 0.8 still need to be set subjectively, and the methods are relatively rough. We still used the BM25 algorithm and combined the ideas of model training and evaluation to

automatically find duplicates (Fig. 2). Finally, based on the optimal ranking of 6061 and an F1 score of 0.8 (Supplementary Fig. 7), the optimal BM25 score was determined. A policy with a

similarity score greater than the optimal BM25 score was judged to be a duplicate policy. The technical details and detailed instructions can be found in the Supplementary Information (SI2).

TOPIC MODEL Policy topic analysis can aid in understanding policy evolution17 and has become a new branch of research in the field of public policy. Currently, BERTopic68 is a

state-of-the-art topic model that effectively uses word embedding to transform input documents into numerical representations, dimensionality reduction to reduce the dimensionality of the

embeddings to a workable dimensional space and clustering similar embeddings to obtain topic representations through c-TF-IDF (an adjusted TF-IDF to work on cluster/categorical/topic level

instead of document level). The GCCMPD provides 4 topic models for different analysis needs, namely, GCCMPD-IEA-CP-CCLW-Topic, GCCMPD-EXPAND-Topic (counting multilateral policies as

multiple), GCCMPD-Topic (counting multilateral policies as one) and GCCMPD-EXCEPT-ECOLEX-Topic (excluding ECOLEX). The specific model details can be found in the Supplementary Information

(SI2). DATA RECORDS To facilitate the use of various scientific researchers, government staff, and personnel of international organizations, the GCCMPD provides three forms of data storage:

MySQL, MongoDB, and EXCEL. All data records have been uploaded through Figshare: https://doi.org/10.6084/m9.figshare.22590028.v269. The relational data management system (MySQL) can well

reflect the composition of core data (Fig. 3): * POLICY: contains all deduplicated climate mitigation policies (policy ID, policy title and content translated into English, policy source). *

ADDITIONAL INFORMATION: includes the year the policy was published, the original title and content of the policy, and the policy source document web page. * CHARACTERISTIC: contains

indicators, sector, objective, instrument, executive/legislative, and binding force, for each policy. * SIMILAR INFORMATION: provides the total number of policies in each policy group, the

index, the BM25 score, and the title of the most similar policy in the group. * COUNTRY AND JURISDICTION: provides information about the country where the policy was adopted. The table is

also linked to external countries’ economic (e.g., GDP, imports and exports), social (e.g., population, democracy), political (e.g., left-wing or right-wing, federal), and energy (e.g.,

primary energy usage) and characteristic dataset media. * TOPIC. provides the topic category of each policy, topic name, the most representative words under this topic, the probability that

the policy belongs to this topic, and whether it is a representative policy of this topic. In addition to the core data, the GCCMPD also retains the processing results of each step.

Supplementary Table 27 lists some important result files and their descriptions. These files are all stored in EXCEL. TECHNICAL VALIDATION The data quality of the GCCMPD is affected by

several factors. The first is the data source. The comprehensiveness of information from different data sources varies greatly, which directly affects the model training ability. The second

is the reliability of the annotation and training sets and the training ability of the model. Since the construction idea of the GCCMPD is to first form a training set through manual

labelling and then extrapolate to a richer data source, the quality of the annotation and training set is highly important. We took the following steps to enhance the data quality: * MANUAL

DATA RETRIEVAL: The determination of data sources was performed through high-quality literature, Google searches, and data sources from authoritative datasets. The requirements for the data

sources were as follows: 1. Include the title and content of the policy, and there should not be too many missing values; 2. Include the year of the policy; 3. The entity information of the

policy release is contained as much as possible. * MANUAL ANNOTATIONS AND CHECKS: We compared the annotations of PhD students in multiple climate and energy fields (with each person

annotating a small number of fields) and found that the differences in their annotations were concentrated in several places (Manual check, verification, annotation section). Although we had

given classification criteria and explained where it is easy to misclassify, manual annotation was inevitably subjective. Fortunately, objectivity and consistency can be guaranteed through

dictionary mapping. It can be considered that IEA, CP, and CCLW data are the results of authoritative experts. We regard them as real values and the GCCMPD data as predicted values and

compare the results (It is called comparison because the prediction result is not based on the model, but only draws on the metrics to give a more intuitive comparison result) of the two

(Tables 7–11, for the comparison of subsectors, sector instruments, and subobjectives. See Supplementary Table 28-30 for additional information). The resulting recall rates were generally

high, even close to 1, which means that we fully refer to the results of dictionary mapping. Some categories with lower recall results (such as Other Strategy Plan or Target) indicate that

we later focused on checking such results. Some results with low precision rates were mostly because the original policy data source did not have corresponding keyword labels, and we

annotated them during manual verification and inspection. The precision rate of the taxes category was not high because we checked and found that the classification standard of the IEA

dataset was different from that of the GCCMPD dataset, while the precision rate of the subsidies category was not high because keywords such as “Award” and “Grants” cannot determine whether

they are subsidies. Therefore, we did not construct a dictionary for relevant keywords and instead we added annotations through manual inspection. For the manual inspection of duplicate

data, we also tried to be as detailed as possible and record the specific corresponding relationship of policy duplication. * ENHANCED MODEL TRAINING CAPABILITIES: Based on the use of

state-of-the-art models, we improved model learning by translating non-English policies into English and checking and completing policy content. In terms of manually checking policy content,

we selected content that could be used to summarize the policy, such as goals, purposes, and objectives, as well as information that could be used to assist in judging the legal category.

After the GCCMPD data are made public, we will also enhance the data sources and correct errors through interactions with data users. Nonetheless, our dataset still has flaws, one of which

is that our dataset does not contain adaptation policies. Climate adaptation is a very important field70, especially for countries with high climate vulnerability and high mitigation costs.

Future research can improve our study by constructing an adaptation policy dataset. CODE AVAILABILITY Code, dataset and some intermediate results are freely available on the following GitHub

repository: https://github.com/HUANGZHIHAO1994/GCCMPD-climate-policy-dataset. REFERENCES * Zhu, J., Ge, Z., Wang, J., Li, X. & Wang, C. Evaluating regional carbon emissions trading in

China: effects, pathways, co-benefits, spillovers, and prospects. _Climate Policy_ 22, 918–934 (2022). Article Google Scholar * Zhu, J., Fan, Y., Deng, X. & Xue, L. Low-carbon

innovation induced by emissions trading in China. _Nat Commun_ 10, 4088 (2019). Article ADS PubMed PubMed Central Google Scholar * Cui, J., Wang, C., Zhang, J. & Zheng, Y. The

effectiveness of China’s regional carbon market pilots in reducing firm emissions. _Proc. Natl. Acad. Sci. USA_ 118, e2109912118 (2021). Article CAS PubMed PubMed Central Google Scholar

* Yamazaki, A. Jobs and climate policy: Evidence from British Columbia’s revenue-neutral carbon tax. _Journal of Environmental Economics and Management_ 83, 197–216 (2017). Article Google

Scholar * Maestre-Andrés, S., Drews, S., Savin, I. & van den Bergh, J. Carbon tax acceptability with information provision and mixed revenue uses. _Nat Commun_ 12, 7017 (2021). Article

ADS PubMed PubMed Central Google Scholar * Liski, M. & Tahvonen, O. Can carbon tax eat OPEC’s rents? _Journal of Environmental Economics and Management_ 47, 1–12 (2004). Article

Google Scholar * Demailly, D. & Quirion, P. European Emission Trading Scheme and competitiveness: A case study on the iron and steel industry. _Energy Economics_ 30, 2009–2027 (2008).

Article Google Scholar * Nong, D., Simshauser, P. & Nguyen, D. B. Greenhouse gas emissions vs CO2 emissions: Comparative analysis of a global carbon tax. _Applied Energy_ 298, 117223

(2021). Article Google Scholar * Iacobuta, G., Dubash, N. K., Upadhyaya, P., Deribe, M. & Höhne, N. National climate change mitigation legislation, strategy and targets: a global

update. _Climate Policy_ 18, 1114–1132 (2018). Article Google Scholar * Lachapelle, E. & Paterson, M. Drivers of national climate policy. _Climate Policy_ 13, 547–571 (2013). Article

Google Scholar * Nascimento, L. _et al_. Twenty years of climate policy: G20 coverage and gaps. _Climate Policy_ 22, 158–174 (2022). Article Google Scholar * Dubash, N. K., Hagemann, M.,

Höhne, N. & Upadhyaya, P. Developments in national climate change mitigation legislation and strategy. _Climate Policy_ 13, 649–664 (2013). Article Google Scholar * Fankhauser, S.,

Gennaioli, C. & Collins, M. Do international factors influence the passage of climate change legislation? _Climate Policy_ 16, 318–331 (2016). Article Google Scholar * Fankhauser, S.,

Gennaioli, C. & Collins, M. The political economy of passing climate change legislation: Evidence from a survey. _Global Environmental Change_ 35, 52–61 (2015). Article Google Scholar

* Tews, K., Busch, P.-O. & Jorgens, H. The diffusion of new environmental policy instruments1. _Eur J Political Res_ 42, 569–600 (2003). Article Google Scholar * Busch, P. &

Jörgens, H. The international sources of policy convergence: explaining the spread of environmental policy innovations. _Journal of European Public Policy_ 12, 860–884 (2005). Article

Google Scholar * Biesbroek, R., Wright, S. J., Eguren, S. K., Bonotto, A. & Athanasiadis, I. N. Policy attention to climate change impacts, adaptation and vulnerability: a global

assessment of National Communications (1994–2019). _Climate Policy_ 22, 97–111 (2022). Article Google Scholar * Linsenmeier, M., Mohommad, A. & Schwerhoff, G. Policy sequencing towards

carbon pricing among the world’s largest emitters. _Nat. Clim. Chang._ 12, 1107–1110 (2022). Article ADS CAS Google Scholar * Averchenkova, A., Fankhauser, S. & Nachmany, M. _Trends

in Climate Change Legislation_. (Edward Elgar Publishing, Cheltenham, UK; Northampton, MA, USA, 2017). * Meckling, J. & Allan, B. B. The evolution of ideas in global climate policy.

_Nat. Clim. Chang._ 10, 434–438 (2020). Article ADS Google Scholar * Meckling, J., Sterner, T. & Wagner, G. Policy sequencing toward decarbonization. _Nat Energy_ 2, 918–922 (2017).

Article ADS Google Scholar * Le Quéré, C. _et al_. Drivers of declining CO2 emissions in 18 developed economies. _Nat. Clim. Chang._ 9, 213–217 (2019). Article ADS Google Scholar *

Eskander, S. M. S. U. & Fankhauser, S. Reduction in greenhouse gas emissions from national climate legislation. _Nature Climate Change_ 10, 750–756 (2020). Article ADS CAS Google

Scholar * Chen, P. _et al_. The heterogeneous role of energy policies in the energy transition of Asia–Pacific emerging economies. _Nature Energy_ 7, 588–596 (2022). Article ADS Google

Scholar * Lamb, W. F. & Minx, J. C. The political economy of national climate policy: Architectures of constraint and a typology of countries. _Energy Research & Social Science_ 64,

101429 (2020). Article Google Scholar * Peñasco, C., Anadón, L. D. & Verdolini, E. Systematic review of the outcomes and trade-offs of ten types of decarbonization policy instruments.

_Nat. Clim. Chang._ 11, 257–265 (2021). Article ADS Google Scholar * Rogge, K. S. & Reichardt, K. Policy mixes for sustainability transitions: An extended concept and framework for

analysis. _Research Policy_ 45, 1620–1635 (2016). Article Google Scholar * van den Bergh, J. _et al_. Designing an effective climate-policy mix: accounting for instrument synergy. _Climate

Policy_ 21, 745–764 (2021). Article Google Scholar * Schmidt, T. S. & Sewerin, S. Measuring the temporal dynamics of policy mixes – An empirical analysis of renewable energy policy

mixes’ balance and design features in nine countries. _Research Policy_ 48, 103557 (2019). Article Google Scholar * Schmidt, N. M. & Fleig, A. Global patterns of national climate

policies: Analyzing 171 country portfolios on climate policy integration. _Environmental Science & Policy_ 84, 177–185 (2018). Article Google Scholar * Fankhauser, S., Hepburn, C.

& Park, J. Combining multiple climate policy instruments: how not to do it. _Clim. Change Econ._ 01, 209–225 (2010). Article Google Scholar * Viguié, V. & Hallegatte, S. Trade-offs

and synergies in urban climate policies. _Nature Clim Change_ 2, 334–337 (2012). Article ADS Google Scholar * Persha, L., Agrawal, A. & Chhatre, A. Social and Ecological Synergy:

Local Rulemaking, Forest Livelihoods, and Biodiversity Conservation. _Science_ 331, 1606–1608 (2011). Article ADS CAS PubMed Google Scholar * Bryan, B. A. _et al_. Designer policy for

carbon and biodiversity co-benefits under global change. _Nature Clim Change_ 6, 301–305 (2016). Article ADS Google Scholar * Schaub, S., Tosun, J., Jordan, A. & Enguer, J. Climate

Policy Ambition: Exploring A Policy Density Perspective. _Politics and Governance_ 10, 226–238 (2022). Article Google Scholar * Nascimento, L. & Höhne, N. Expanding climate policy

adoption improves national mitigation efforts. _npj Clim. Action_ 2, 12 (2023). Article Google Scholar * Schaffrin, A., Sewerin, S. & Seubert, S. Toward a Comparative Measure of

Climate Policy Output. _Policy Studies Journal_ 43, 257–282 (2015). Article Google Scholar * CLÒ, S. The effectiveness of the EU Emissions Trading Scheme. _Climate Policy_ 9, 227–241

(2009). Article Google Scholar * Sandoff, A. & Schaad, G. Does EU ETS lead to emission reductions through trade? The case of the Swedish emissions trading sector participants. _Energy

Policy_ 37, 3967–3977 (2009). Article Google Scholar * Castro, P. Does the CDM discourage emission reduction targets in advanced developing countries? _Climate Policy_ 12, 198–218 (2012).

Article Google Scholar * Lin, B. & Li, X. The effect of carbon tax on per capita CO2 emissions. _Energy Policy_ 39, 5137–5146 (2011). Article Google Scholar * Desmarais, B. A.,

Harden, J. J. & Boehmke, F. J. Persistent Policy Pathways: Inferring Diffusion Networks in the American States. _American Political Science Review_ 109, 392–406 (2015). Article Google

Scholar * Lawrence, P. & Wong, D. Soft law in the Paris Climate Agreement: Strength or weakness? _Review of European_. _Comparative & International Environmental Law_ 26, 276–286

(2017). Article Google Scholar * Vihma, A. Analyzing Soft Law and Hard Law in Climate Change. in _Climate Change and the Law_ (eds. Hollo, E. J., Kulovesi, K. & Mehling, M.) 143–164

(Springer Netherlands, Dordrecht, 2013). https://doi.org/10.1007/978-94-007-5440-9_7. * Thompson, T. M., Rausch, S., Saari, R. K. & Selin, N. E. A systems approach to evaluating the air

quality co-benefits of US carbon policies. _Nature Clim Change_ 4, 917–923 (2014). Article ADS Google Scholar * Gütschow, J. _et al_. The PRIMAP-hist national historical emissions time

series. _Earth Syst. Sci. Data_ 8, 571–603 (2016). Article ADS Google Scholar * Jeffery, M. L., Gütschow, J., Gieseke, R. & Gebel, R. PRIMAP-crf: UNFCCC CRF data in IPCC 2006

categories. _Earth Syst. Sci. Data_ 10, 1427–1438 (2018). Article ADS Google Scholar * Eggleston, H. S, Buendia, L, Miwa, K, Ngara, T. & Tanabe, K. 2006 IPCC Guidelines for National

Greenhouse Gas Inventories. (2006). * Lamb, W. F. _et al_. A review of trends and drivers of greenhouse gas emissions by sector from 1990 to 2018. _Environ. Res. Lett._ 16, 073005 (2021).

Article ADS CAS Google Scholar * Minx, J. C. _et al_. A comprehensive and synthetic dataset for global, regional, and national greenhouse gas emissions by sector 1970–2018 with an

extension to 2019. _Earth Syst. Sci. Data_ 13, 5213–5252, https://doi.org/10.5194/essd-13-5213-2021 (2021). * Hood, C., Margetts, H. & Hood, C. _The Tools of Government in the Digital

Age_. (Palgrave Macmillan, Basingstoke, 2007). * IPCC. _Climate Change 2014: Mitigation of Climate Change: Working Group III Contribution to the Fifth Assessment Report of the

Intergovernmental Panel on Climate Change_. (Cambridge University Press, New York, NY, 2014). * Metz, B., Davidson, O., Swart, R. & Pan, J. _Climate Change 2001: Mitigation: Contribution

of Working Group III to the Third Assessment Report of the Intergovernmental Panel on Climate Change_. (Cambridge university press, 2001). * Robilant, A. D. Genealogies of Soft Law. _The

American Journal of Comparative Law_ 54, 499–554 (2006). Article Google Scholar * Senden, L. _Soft Law in European Community Law_. vol. 1 (Hart publishing, 2004). * Clegg, M., Ellena, K.,

Ennis, D. & Vickery, C. _The hierarchy of laws: understanding and implementing the legal frameworks that govern election_. International Foundation for Electoral Systems, Arlington, VA

(2016). * Di Gregorio, M. _et al_. Multi-level governance and power in climate change policy networks. _Global Environmental Change_ 54, 64–77 (2019). Article Google Scholar * Doukas, H.,

Karakosta, C. & Psarras, J. RES technology transfer within the new climate regime: A “helicopter” view under the CDM. _Renewable and Sustainable Energy Reviews_ 13, 1138–1143 (2009).

Article Google Scholar * Bulkeley, H. Cities and the Governing of Climate Change. _Annual Review of Environment and Resources_ 35, 229–253 (2010). Article Google Scholar * Parry, M. L.,

Canziani, O., Palutikof, J., Van der Linden, P. & Hanson, C. _Climate Change 2007-Impacts, Adaptation and Vulnerability: Working Group II Contribution to the Fourth Assessment Report of

the IPCC_. vol. 4 (Cambridge University Press, 2007). * Robertson, S. & Zaragoza, H. The Probabilistic Relevance Framework: BM25 and Beyond. _FNT in Information Retrieval_ 3, 333–389

(2009). Article Google Scholar * Webersinke, N., Kraus, M., Bingler, J. & Leippold, M. ClimateBERT: A Pretrained Language Model for Climate-Related Text. _arXiv preprint

arXiv:2110.12010_ (2021). * Gök, A., Antai, R., Milošević, N. & Al-Nabki, W. Building the European Social Innovation Database with Natural Language Processing and Machine Learning. _Sci

Data_ 9, 697 (2022). Article PubMed PubMed Central Google Scholar * Minaee, S. _et al_. Deep learning–based text classification: a comprehensive review. _ACM computing surveys (CSUR)_

54, 1–40 (2021). Article Google Scholar * Kölbel, J. F., Leippold, M., Rillaerts, J. & Wang, Q. Ask BERT: How Regulatory Disclosure of Transition and Physical Climate Risks Affects the

CDS Term Structure*. _Journal of Financial Econometrics_ 22, 30–69 (2024). Article Google Scholar * Bingler, J. A., Kraus, M., Leippold, M. & Webersinke, N. Cheap talk and

cherry-picking: What ClimateBert has to say on corporate climate risk disclosures. _Finance Research Letters_ 47, 102776 (2022). Article Google Scholar * Liu, Y. _et al_. Roberta: A

robustly optimized bert pretraining approach. _arXiv preprint arXiv:1907.11692_ (2019). * Grootendorst, M. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. _arXiv

preprint arXiv:2203.05794_ (2022). * Huang, Z., Wu, L., Zhang, X. & Wang, Y. Global Climate Change Mitigation Policy Database. _Figshare_ https://doi.org/10.6084/m9.figshare.22590028.v2

(2023). * Lesnikowski, A., Ford, J., Biesbroek, R., Berrang-Ford, L. & Heymann, S. J. National-level progress on adaptation. _Nature Clim Change_ 6, 261–264 (2016). Article ADS Google

Scholar Download references ACKNOWLEDGEMENTS This study acknowledges financial support from the National Natural Science Foundation of China (71925010). AUTHOR INFORMATION AUTHORS AND

AFFILIATIONS * School of Data Science, Fudan University, Shanghai, 200433, China Libo Wu, Zhihao Huang, Xing Zhang & Yushi Wang * Institute for Big Data, Fudan University, Shanghai,

200433, China Libo Wu * School of Economics, Fudan University, Shanghai, 200433, China Libo Wu * Shanghai Institute for Energy and Carbon Neutrality Strategy, Fudan University, Shanghai,

200433, China Libo Wu Authors * Libo Wu View author publications You can also search for this author inPubMed Google Scholar * Zhihao Huang View author publications You can also search for

this author inPubMed Google Scholar * Xing Zhang View author publications You can also search for this author inPubMed Google Scholar * Yushi Wang View author publications You can also

search for this author inPubMed Google Scholar CONTRIBUTIONS L.W. conceived the study. Z.H. performed analysis. All authors (Z.H., Y.W., X.Z. and L.W.) interpreted the data. Z.H. prepared

the manuscript. Z.H., Y.W. and L.W. revised the manuscript. CORRESPONDING AUTHOR Correspondence to Libo Wu. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing

interests. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY

INFORMATION SUPPLEMENTARY FIGURES RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing,

adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons

licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a

credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted

use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT

THIS ARTICLE CITE THIS ARTICLE Wu, L., Huang, Z., Zhang, X. _et al._ Harmonizing existing climate change mitigation policy datasets with a hybrid machine learning approach. _Sci Data_ 11,

580 (2024). https://doi.org/10.1038/s41597-024-03411-z Download citation * Received: 01 November 2023 * Accepted: 23 May 2024 * Published: 04 June 2024 * DOI:

https://doi.org/10.1038/s41597-024-03411-z SHARE THIS ARTICLE Anyone you share the following link with will be able to read this content: Get shareable link Sorry, a shareable link is not

currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative