Play all audios:
ABSTRACT Networks of interacting DNA oligomers are useful for applications such as biomarker detection, targeted drug delivery, information storage, and photonic information processing.
However, differences in the chemical kinetics of hybridization reactions, referred to as kinetic dispersion, can be problematic for some applications. Here, it is found that limiting
unnecessary stretches of Watson-Crick base pairing, referred to as unnecessary duplexes, can yield exceptionally low kinetic dispersions. Hybridization kinetics can be affected by
unnecessary intra-oligomer duplexes containing only 2 base-pairs, and such duplexes explain up to 94% of previously reported kinetic dispersion. As a general design rule, it is recommended
that unnecessary intra-oligomer duplexes larger than 2 base-pairs and unnecessary inter-oligomer duplexes larger than 7 base-pairs be avoided. Unnecessary duplexes typically scale
exponentially with network size, and nearly all networks contain unnecessary duplexes substantial enough to affect hybridization kinetics. A new method for generating networks which utilizes
in-silico optimization to mitigate unnecessary duplexes is proposed and demonstrated to reduce in-vitro kinetic dispersions as much as 96%. The limitations of the new design rule and
generation method are evaluated in-silico by creating new oligomers for several designs, including three previously programmed reactions and one previously engineered structure. SIMILAR
CONTENT BEING VIEWED BY OTHERS AUTOMATED DESIGN OF SCAFFOLD-FREE DNA WIREFRAME NANOSTRUCTURES Article Open access 20 May 2025 SYNTHETICALLY ENCODED COMPLEMENTARY OLIGOMERS Article 16
November 2023 REVEALING THE STRUCTURES OF MEGADALTON-SCALE DNA COMPLEXES WITH NUCLEOTIDE RESOLUTION Article Open access 04 December 2020 INTRODUCTION Molecules of single stranded
deoxyribonucleic acid, sometimes referred to as DNA oligomers, can form duplexes where two complementary base-sequences are held together by Watson-Crick base-pairing. Networks of DNA
oligomers interacting through intentional duplexes have been used to fabricate structures1,2,3,4,5,6,7 and program chemical reactions8,9. Demonstrated applications for such networks include
biomarker detection10,11, genomic sequencing12, targeted drug delivery13,14,15, artificial gene regulation16,17, information storage18,19, and photonic information processing20,21.
Hybridization reactions (i.e., reactions where DNA oligomers form new duplexes) are fundamental to these networks. However, DNA oligomers undergoing hybridization reactions sometimes exhibit
a large kinetic dispersion (i.e., a large difference in chemical kinetics)12,22,23. Large kinetic dispersions may cause issues such as false negatives during biomarker detection10,11,
limited throughput of genomic sequencing12, failure to release medication13,14,15, under-expression of a regulated gene16,17, or loss of stored information18,19. Large kinetic dispersions
may also contribute to the inconsistent structure formation3,24,25,26 and high development costs4,27,28,29,30 sometimes associated with networks of DNA oligomers. A given network of DNA
oligomers is often intended to form specific duplexes which either implement a chemical reaction network31,32 or create a spatial arrangement of matter (i.e., a two-dimensional7,33,34 or
three-dimensional35 shape). Such networks may also form other duplexes, referred to here as unnecessary duplexes. It has been previously established that unnecessary duplexes can affect
hybridization reactions9,23,36,37,38,39,40, making them one potential contributor to kinetic dispersion. However, several other factors known to affect hybridization reactions may also
contribute to kinetic dispersion. For separate aqueous solutions which contain oligomers with the same base-sequences, initial state, and final state, these known factors include
temperature12,22,41,42,43,44, ionic strength45,46, and viscosity45,47. For separate aqueous solutions containing oligomers with different base-sequences but similar initial and final states,
these known factors additionally include oligomer length42 and duplex stability12,22,48,49. Here, the relationship between unnecessary duplexes and kinetic dispersion is studied using a
combination of in-silico and in-vitro methods. First, by analyzing existing experimental data, it is found that unnecessary intra-oligomer duplexes are a key contributor to kinetic
dispersion and that nearly all previously reported kinetic dispersion can be explained by known causes. It is recommended that when generating new networks all unnecessary intra-oligomer
duplexes larger than 2 base-pairs and all unnecessary inter-oligomer duplexes larger than 7 base-pairs be avoided if possible. Satisfying both of these conditions is referred to as the “no
3’s and no 8’s” design rule. By randomly sampling networks in-silico, it is found that nearly all networks contain unnecessary intra-oligomer duplexes substantial enough to cause kinetic
dispersion. A new network generation method which utilizes in-silico optimization to mitigate unnecessary duplexes is proposed and is demonstrated to successfully reduce in-vitro kinetic
dispersion by as much as 96%. Finally, the limitations of both the “no 3’s and no 8’s” design rule and the generation method are studied in-silico by generating new networks for several
designs. It is found that the new generation method substantially increases the designs for which a researcher can satisfy the new design rule, presumably enabling reduced kinetic
dispersions and more reliable performance for future applications. RESULTS ANALYSIS OF KINETIC DISPERSION IN EXISTING EXPERIMENTAL DATA Numerous previous studies have characterized the
chemical kinetics of DNA oligomers undergoing a hybridization reaction, and some of these studies characterized enough samples for a meaningful re-analysis of the data12,22,23. Here, this
existing experimental data was used to ascertain how much kinetic dispersion can be explained by unnecessary duplexes and to better understand the relationship between unnecessary duplexes
and kinetic dispersion. Five datasets (i.e., sets of comparable rate-constant values) were identified for analysis and are listed in Fig. 1a. The hybridization reactions within these
datasets range from a relatively simple duplex-formation reaction to a relatively complicated multiple-intermediate catalytic reaction. Other than unnecessary duplexes, all known
hybridization-reaction-affecting factors (including temperature, ionic strength, viscosity, oligomer length, and duplex-stability) were approximately constant within each dataset. Each
dataset was given a label abbreviating the first author, temperature, and reaction type. For example, the values measured by Zhang et al.12 at 37 °C for a duplex-formation reaction were
labeled Z37F. Histograms of the rate-constant values in each dataset are reported within Fig. 1a. Each rate-constant distribution is left skewed with the mean less than the median and
appears poorly approximated by either a Gaussian distribution or a Galton distribution. Each rate-constant distribution spans at least 3 orders of magnitude, verifying the presence of
substantial kinetic dispersion. The kinetic dispersion of each dataset was further quantified using the Inter-Quartile Range of the Natural Logarithm of the rate-constant values (denoted
IQRNL and detailed in the methods section). Supplementary note 1 reports additional information from the analysis of these datasets, with IQRNL values for each dataset reported in
supplementary table S1. The rate-constant values underlying each histogram can be found in the ‘Supplementary Data 1’ file. Sub-populations with the least substantial unnecessary
intra-oligomer duplexes were identified using the oligomer fitness score above baseline (denoted ΔO and detailed in the methods section). This fitness score penalizes a network with 10_L_
fitness points for each unnecessary intra-oligomer duplex of length _L_. A ΔO value of 0 indicates no unnecessary intra-oligomer duplexes, and increasing ΔO values indicate more substantial
unnecessary duplexes. Plots of rate-constant value as a function of ΔO are reported in supplementary fig. S1. Detailed statistical analyses of the most ΔO-fit sub-populations with number of
samples (n) equal to 2, 3, 4, 8 and 16 are reported in supplementary figs. S2 to S6. The rate-constant values of the 3 most ΔO-fit samples are highlighted in orange in Fig. 1a. A different
underlying rate-constant distribution was clearly resolved for many of the ΔO-fit sub-populations (i.e., the sub-populations with the least substantial unnecessary intra-oligomer duplexes).
The null hypothesis was established that a given sub-population was drawn from the same underlying distribution as the other samples in the dataset. When a Kolmogrov–Smirnov50 test with a
threshold of _p_ = 0.05 was used to test this null hypothesis, it was successfully rejected for 21 of the 25 ΔO-fit sub-populations. For the sub-populations containing the 3 most ΔO-fit
samples, this null hypothesis was successfully rejected for 4 of the 5 datasets. All of the ΔO-fit sub-populations which failed to reject the null hypothesis were from the O25C dataset,
which may be explained by the proximity of the peaks of these rate-constant distributions. When the null hypothesis was instead tested using a Wilcoxon-signed-rank test51 or an
Anderson-Darling test52 similar trends were observed, however the Wilcoxon-signed rank test required larger sample sizes to achieve statistical significance. Lower kinetic dispersions were
clearly resolved for many of the ΔO-fit sub-populations (i.e., the sub-populations with the least substantial unnecessary intra-oligomer duplexes). For the 3 most ΔO-fit samples, kinetic
dispersions were reduced 84%, 92%, 86%, 93%, and 94% relative to equally sized random sub-populations. The value and bounds of these estimates are reported in supplementary table S2 and Fig.
1b (orange bars). No overlap existed between the bounds of these kinetic dispersion estimates and those of randomly selected sub-populations. The large kinetic dispersion reductions for
ΔO-fit sub-populations were interpreted as evidence that the kinetic dispersion in each dataset can be largely explained by unnecessary duplexes contributing to ΔO. Since any intra-oligomer
duplex logically implies the existence of two inter-oligomer duplexes, this includes at least some inter-oligomer duplexes. These reductions also verify that the experimental methods of the
prior researchers mitigated other causes of kinetic dispersion sufficiently that unnecessary intra-oligomer duplexes explain most of the remaining kinetic dispersion. Sub-populations with
the least substantial unnecessary inter-oligomer duplexes were identified using the network fitness score above baseline (denoted ΔN and detailed in the methods section). This fitness score
penalizes a network with 10_L_ fitness points for each unnecessary inter-oligomer duplex of length _L_. A ΔN value of 0 indicates no unnecessary inter-oligomer duplexes, and increasing ΔN
values indicate more substantial unnecessary duplexes. Plots of rate-constant value as a function of ΔN are reported in supplementary fig. S7. Detailed statistical analyses of the most
ΔN-fit sub-populations with a number of samples equal to 2, 3, 4, 8 and 16 are reported in supplementary figs. S8 to S12. A different underlying rate-constant distribution was not clearly
resolved for most ΔN-fit sub-populations (i.e., the sub-populations with the least substantial unnecessary inter-oligomer duplexes). The null hypothesis was established that a given
sub-population was drawn from the same underlying distribution as the other samples in the dataset. When a Kolmogorov–Smirnov test50 with a threshold of _p_ = 0.05 was used to test this null
hypothesis, it was successfully rejected for only 1 of the 25 ΔO-fit sub-populations. For the sub-populations containing the 3 most ΔN-fit samples, this null hypothesis was successfully
rejected for 0 of the 5 datasets. When the null hypothesis was instead evaluated using a Wilcoxon-signed-rank test51 or an Anderson–Darling test52 similar trends were observed. Only
marginally reduced kinetic dispersions were observed for the ΔN-fit sub-populations (i.e., the sub-populations with the least substantial unnecessary inter-oligomer duplexes). For
sub-populations containing the three most ΔN-fit samples (_n_ = 3, Fig. 1b green bars), kinetic dispersions were reduced 37%, 51%, 23%, 39%, and 19% relative to equally sized random
sub-populations (Fig. 1b grey bars). The value and bounds of these estimates are reported in supplementary table S2 and Fig. 1b (green bars). For all 5 datasets, the bounds of these kinetic
dispersion estimates overlapped with those of randomly selected sub-populations. The systematic decrease in kinetic dispersion across all five datasets was interpreted as evidence that at
least some of the kinetic dispersion in these datasets may be explained by unnecessary duplexes contributing to ΔN. This may be explained by the fact that each intra-oligomer duplex
logically implies the existence of two inter-oligomer duplexes. For both the ΔO-fit and the ΔN-fit sub-populations, smaller sample sizes generally correlated with larger kinetic dispersion
reductions. This trend is consistent with attempting to select a small number of fit samples from a larger population. Since the kinetic dispersion reductions were calculated relative to
equally sized random sub-populations, it is unlikely that this trend is caused by biased parameter estimation. While the minimum number of samples necessary to estimate IQRNL is 2 (i.e., _n_
= 2), _n_ = 3 often yielded lower kinetic dispersions. This may be explained by the different definitions of median for even and odd numbers of samples. The unnecessary intra-oligomer
duplexes present to explain the observed kinetic dispersion indicate that relatively small intra-oligomer duplexes are substantial enough to cause kinetic dispersion. For all 5 datasets,
unnecessary intra-oligomer duplexes with 3 or more base-pairs exist to explain the increased kinetic-dispersion of non ΔO-fit samples. However, the 16 most ΔO-fit samples in the H22F dataset
exhibited relatively large kinetic dispersions, yet these samples contain no unnecessary intra-oligomer duplex larger than 2 base-pairs. This was interpreted as evidence that unnecessary
3-base-pair intra-oligomer duplexes are substantial enough to affect hybridization kinetics under typical experimental conditions, and that unnecessary 2-base-pair intra-oligomer duplexes
may be substantial enough to affect hybridization kinetics under certain experimental conditions. The ΔO difference associated with these kinetic dispersion reductions indicates that
relatively small ΔO values are necessary to limit kinetic dispersion. For the 5 datasets, the ΔO of the three most ΔO-fit samples differed from other samples by a median of 3.3 x 103, 7.0 x
103, 7.0 x 103, 1.0 x 105, and 1.1 x 105 fitness points. The smallest of these differences (3.3 x 103) came from the H22F dataset and was associated with an 84% reduction in kinetic
dispersion, suggesting that a ΔO difference of 3.3 x 103 is sufficient to cause substantial kinetic dispersion. Since each unnecessary 3-base-pair intra-oligomer duplex contributes 1 x 103
fitness points to ΔO, the value of 3.3 x 103 is equivalent to approximately 3 unnecessary 3-base-pair duplexes. Unnecessary intra-oligomer duplexes may also explain the marginal kinetic
dispersion reductions observed for the most ΔN-fit sub-populations. The three most ΔN-fit samples in each dataset exhibited kinetic dispersions marginally lower than random samples, and the
ΔO of these ΔN-fit samples differed from other samples by a median of 2.4 x 103, 4 x 103, 4 x 103, 4 x 104, and 5 x 104 fitness points. These values are large enough that the observed
correlation between decreasing ΔN and decreasing kinetic dispersion may be explained by reduced unnecessary intra-oligomer duplexes. The decreased ΔO for ΔN-fit samples can be explained by
the fact that each intra-oligomer duplex logically implies the existence of two inter-oligomer duplexes. Kinetic dispersions were also estimated for the most fit sub-populations according to
several versions of the weighted fitness score above baseline (abbreviated ΔWx and detailed in the methods section). These fitness scores are a weighted linear combination of ΔN and ΔO
governed by the weighting parameter x and can be used to simultaneously quantify intra-oligomer and inter-oligomer unnecessary duplexes. A ΔWx fitness score with a specific x value is
denoted by incorporating the x value into the subscript (for example, ΔWx with x = 10,000 is denoted as ΔW10,000). For all ΔWx, a value of 0 indicates no unnecessary duplexes, and increasing
values indicate more substantial unnecessary duplexes. Kinetic dispersion estimates for the most ΔWx-fit sub-populations for a sample-size of _n_ = 3 are reported in Fig. 1c. For x ≥ 104
the most ΔWx-fit sub-populations were identical to the most ΔO-fit sub-populations, and the same large kinetic dispersion reductions were observed. For x ≤ 102, the most ΔWx-fit samples were
similar or identical to the most ΔN-fit samples, and similarly yielded only marginal kinetic dispersion reductions. These results were expected based on the definition of ΔWx and
demonstrate how larger x values can be used to increase the penalty for unnecessary intra-oligomer duplexes. Since ΔWx with x ≥ 104 performed sufficiently for all five datasets, it was
inferred that this fitness score may be useful across a range of experimental conditions and network designs. Based on the analysis of these 5 datasets, the following “no 3’s and no 8’s”
design rule is proposed for mitigating kinetic dispersion caused by unnecessary duplexes. This design rule has two conditions: (1) the network should contain no unnecessary intra-oligomer
duplex composed of 3 or more contiguous base-pairs, and (2) the network should contain no unnecessary inter-oligomer duplex composed of 8 or more base-pairs. The logic underlying the “no 3’s
and no 8’s” design rule is as follows. For intra-oligomer duplexes, evidence was observed that some unnecessary 3-base-pair intra-oligomer duplexes are substantial enough to affect
hybridization kinetics under typical experimental conditions. Thus, it is recommended that all unnecessary intra-oligomer duplexes of 3 or more base-pairs be removed. For inter-oligomer
duplexes, no clear evidence of unnecessary inter-oligomer duplexes affecting hybridization kinetics was observed, even though all 5 datasets contained such duplexes 7 base-pairs or larger.
Since it is well established that large inter-oligomer duplexes can affect hybridization kinetics9,23,36,37,38,39,40, it is conservatively recommended that unnecessary inter-oligomer
duplexes larger than 7 base-pairs be eliminated. SCALING OF UNNECESSARY DUPLEXES WITH NETWORK SIZE For networks of DNA oligomers, it is known that the largest unnecessary duplex generally
increases with both the number and length of oligomers present38,53,54. Here, the scaling of unnecessary duplexes with these factors was studied using in-silico random sampling. Networks
were randomly sampled for the model system design shown in Fig. 2a, which contained: (1) _i_ oligomers, (2) _j_ bases per oligomer, and (3) no intentional base-pairing. Since each of these
networks contain no intentional duplexes, ΔN = N and ΔO = O for each network. Values of ΔN and ΔO estimated for several _i_ and _j_ combinations are shown in Fig. 2b,c. In these figures,
only data for select _j_ values (i.e., 16, 64, 256, and 1024 bases) are shown. Additional _j_ values (i.e., 8, 32, 128, and 512 bases) were used for fitting but are omitted to reduce clutter
in the figure. Both ΔN and ΔO appear to scale exponentially with both _i_ and _j_. This was modeled using the equations ΔN=_a_∙_i__b_∙_j__c_ and ΔO=_a_∙_i__b_∙_j__c_. In these equations,
_a_, _b_, and _c_ are real-valued constants. The units of constant _a_ are fitness points and constants _b_ and _c_ are both unitless. ΔN values were well modeled by values of _a_ = 0.389,
_b_ = 3.19, and _c_ = 3.68 (green lines in Fig. 2b) and ΔO values were well modeled by values of _a_ = 0.0526, _b_ = 1.56, and _c_ = 3.63 (orange lines in Fig. 2c). Both ΔN and ΔO values
were observed to systematically diverge from this model for small _j_ values, which can be explained by the upper limit oligomer length establishes on unnecessary duplex length. It was
inferred from this data that nearly all networks possess unnecessary duplexes substantial enough to cause kinetic dispersion. This is based on the following logic. First, during the analysis
of existing experimental data (Fig. 1), 3-base-pair unnecessary intra-oligomer duplexes were observed to affect hybridization kinetics under typical experimental conditions. When the length
of the largest intra-oligomer unnecessary duplex is directly calculated for randomly sampled networks, 50% of networks contain 3 base-pair intra-oligomer unnecessary duplexes when: (1) the
network contains any oligomer 16 bases or larger, or (2) the network contains any 16 oligomers longer than 8 base-pairs. Networks satisfying these conditions are a small number of the
countably infinite number of possible networks, implying that nearly all networks possess unnecessary duplexes substantial enough to affect hybridization kinetics. This can also be inferred
using the following logic. According to the definition of ΔO, a single 3 base-pair unnecessary intra-oligomer duplex contributes 103 fitness points to ΔO. Since ΔO increases with both _i_
and _j_, networks with ΔO ≤ 103 are only a small fraction of the countably infinite number of possible networks. Thus, nearly all networks possess unnecessary duplexes substantial enough to
affect hybridization kinetics. The model in Fig. 2c implies that this is true for over 50% of networks when (0.0526) ∙ _i_1.56 ∙ _j_3.63 ≥ 103. This condition is satisfied by networks
containing any of the following: (1) an oligomer 16 bases or longer, (2) any 2 oligomers longer than 11 bases, or (3) any 4 oligomers longer than 8 bases. Based on this logic, the number of
8 base-pair oligomers that contain substantial unnecessary duplexes is approximately ¼ that of the prior logic, which can be explained by the fitness points accumulated from 2 base-pair
unnecessary intra-oligomer duplexes. GENERATION OF OPTIMIZED NETWORKS USING THE SEQEVO AND DEVPRO SOFTWARE Networks forming less substantial unnecessary duplexes were generated using the
following process which is depicted graphically in Fig. 3a. At this level of abstraction, the process begins with formalizing a design (i.e., the oligomers and intentional duplexes) for the
network. An initial implementation of this design, referred to as the initial network, is created as a starting point for optimization. A first round of optimization is then performed using
default parameters, yielding the first candidate network. This candidate network is evaluated in greater detail and either accepted as final or rejected. If rejected, either the optimization
parameters or the network design are updated and the optimization/validation cycle is repeated until a candidate network successfully passes validation. For the optimization step shown in
Fig. 3a, the custom-written Sequence Evolver (abbreviated SeqEvo) computer program was created. This program, instructions for its use, and several usage examples are freely available
online55. SeqEvo is a command line tool written in the java programming language. As input, the program requires four files: (1) a parameters file, (2) a fixed-domains file, (3) a
variable-domains file, and (4) an oligomers file. The parameters file specifies runtime parameters such as which fitness score to optimize. The fixed-domains file specifies a list of named
base-sequences, referred to as fixed domains, which will not be modified during optimization. The variable-domains file specifies a list of named base-sequences, referred to as variable
domains, which will be modified during optimization. The oligomers file specifies a list of named oligomers, with each oligomer declared by listing one or more domains and/or domain
complements. The SeqEvo program compiles the initial network from these files and proceeds with optimization. Optimization is performed using a custom evolution-inspired heuristic algorithm
which rearranges the bases within variable domains to minimize a given fitness score. There are 5 key parameters which control the optimization algorithm (labeled CPL, GPC, NDPM, NL, and
NMPC) and default values for these parameters (CPL = 100000, GPC = 1, NDPM = 1, NL = 8, and NMPC = 1) were determined by maximizing algorithm efficiency for a model system. At the end of the
optimization process, SeqEvo outputs a report file detailing the most-fit network encountered and the runtime parameters used for optimization. Supplementary note 2 reports additional
details regarding the operation of SeqEvo. This includes pseudocode explaining the optimization algorithm (supplementary fig. S13) and details of the process used to identify default
parameter values (supplementary table S3 and supplementary fig. S14). For the validation step shown in Fig. 3a, the custom-written Device Profiler (abbreviated DevPro) software was created.
This program, instructions for its use, and several usage examples are freely available online55. Similar to SeqEvo, DevPro is a command line tool written in the java programming language.
The program requires the same four input files (_i.e_., a parameters file, a fixed-domains file, a variable-domains file, and an oligomers file). DevPro compiles the network and uses an
exhaustive linear search to identify all necessary and unnecessary duplexes. The program then outputs a report summarizing these duplexes. If requested in the parameters file, the program
can also output values for select fitness scores and a list of the largest unnecessary duplexes. Supplementary note 3 reports additional details regarding the DevPro program. This includes
pseudocode for the algorithms used to identify duplexes (supplementary figs. S15 and S16). This also includes example score calculations (supplementary figs. S17 and S18). As of writing,
SeqEvo and DevPro have the following requirements: (1) Both programs require Java SE 8 or newer to be installed and (2) both programs require that the network’s design be described using
binding domains. Correct function of both programs has been verified on computers running Windows, Linux, and MacOS operating systems. Both programs require more time and memory for larger
network designs, however no absolute limits were encountered. The largest network analyzed using the current version of the programs is the “10 x 10 x 10 molecular canvas” reported by Ke et
al.56, which consists of 517 total oligomers containing up to 48 bases per oligomer. If default parameters fail to generate a network of sufficient quality, it is recommended that future
researchers tune the following parameters. (1) The CPL parameter. The CPL parameter controls the number of evolutionary cycles performed during optimization. Increasing CPL increases the
duration of the heuristic algorithm, allowing more mutations to be considered and more evolved networks to be generated. (2) The intraSLC and interSLC parameters, which abbreviate the
intra-oligomer scoring length criterion and the inter-oligomer scoring length criterion, respectively. Unnecessary duplexes less than intraSLC or interSLC do not contribute points to fitness
scores. Thus, increasing these values allows the optimization algorithm greater freedom to reduce larger duplexes by introducing unpenalized smaller duplexes. (3) The maxAA, maxCC, maxGG,
and maxTT parameters. These parameters control the maximum number of consecutive bases which are acceptable in generated networks. The default limits of 6, 3, 3, and 6 were set based on the
observations of previous researchers31,57,58. However, these values substantially limit the number of potential networks, especially for larger network designs. Increasing these parameters
allows the optimization algorithm greater freedom to eliminate unnecessary duplexes but may result in adverse experimental outcomes. (4) The scoringWeightX parameter. The scoringWeightX
parameter controls the value of x for the Wx fitness score. Increasing the scoringWeightX parameter increases the contribution of intra-oligomer duplexes to the Wx score, which allows
greater freedom to reduce intra-oligomer duplexes by increasing inter-oligomer duplexes. The following key observations regarding this generation method were made. (1) This generation method
treats unnecessary duplexes as fully independent and neglects any cooperation between multiple duplexes. It is possible that the effects observed here do not arise from a single unnecessary
duplex, but instead from the cooperative effects of multiple unnecessary duplexes. (2) Both the “no 3’s and no 8’s” design rule and the fitness scores used here quantify unnecessary
duplexes based on their size (_i.e_., the number of base-pairs they contain). This is derived from the idea that all duplexes of a given length are equally bad, which becomes a poor
assumption for duplexes containing approximately 4 or more base pairs since some 4 base-pair duplexes are more thermodynamically stable than some 5 base-pair duplexes. Consequently, when
larger unnecessary duplexes are present, these metrics may favor shorter duplexes which are actually more problematic. However, applying some form of thermodynamic criteria to distinguish
between unnecessary duplexes appears complicated by the fact that it has been specifically reported that some thermodynamically unfavorable duplexes can affect hybridization kinetics22. (3)
The fitness scores used here are inefficient in the sense that they penalize all inter-oligomer and all intra-oligomer unnecessary duplexes equally, while not all unnecessary duplexes appear
to affect all hybridization reactions. Consequently, these scores likely overlook many experimentally viable networks where the unnecessary duplexes are either not stable enough or located
in locations which do not affect the intended hybridization reaction. (4) There is an underlying logical link between intra-oligomer duplexes and inter-oligomer duplexes such that any
intra-oligomer duplex implies the existence of two equally sized inter-oligomer duplexes. It is possible that effects attributed to unnecessary intra-oligomer duplexes can also be explained
by these inter-oligomer duplexes. KINETIC DISPERSION OF NEWLY GENERATED NETWORKS To further investigate the relationship between unnecessary duplexes and kinetic dispersion, in-vitro kinetic
dispersions were measured for several newly generated networks. The model system for this study is shown in Fig. 4a and consists of three oligomers (labeled S1, S2, and S3) which are
intended to undergo two separate hybridization reactions (labeled duplex-formation and strand displacement). New networks were generated using one of four network generation methods: (1)
random sequence assignment (the RND group), optimization of the N fitness score (the N-fit group), optimization of the O fitness score (the O-fit group), or optimization of the W1 fitness
score (W1-fit group). Three networks were generated using each generation method. This number of samples was chosen based on the magnitude of previously observed kinetic dispersion
reductions. For Wx optimization, the value of x = 1 was chosen because it was the smallest x value which satisfied the “no 3’s and no 8’s” design rule. Supplementary note 4 reports
additional details regarding the generation and characterization of these networks. This includes the base-sequence for each oligomer, which are reported in supplementary table S4. The
unnecessary duplexes for each network were characterized using the DevPro program. Values of ΔN and ΔO for each network are shown in Fig. 4b. Histograms summarizing the unnecessary duplexes
for each network are shown in supplementary fig. S19. Networks in the RND group contain unnecessary intra-oligomer duplexes as large as 5 base-pairs and unnecessary inter-oligomer duplexes
as large as 8 base-pairs. No N-fit, O-fit, or W1-fit network contained unnecessary intra-oligomer duplexes larger than 2 base-pairs and no N-fit or W1-fit network contained unnecessary
inter-oligomer duplexes larger than 4 base-pairs. The ΔO > 10,000 exhibited by the N-fit, O-fit, and W1-fit networks result from the existence of numerous small intra-oligomer duplexes.
For example, the N-fit-1 network’s ΔO value of 18,400 is the sum of points from 790 1-base-pair duplexes and 105 2-base-pair duplexes. Relative to the other networks, O-fit networks
contained relatively large unnecessary inter-oligomer duplexes, including inter-oligomer duplexes of up to 10 base-pairs. This indicates a logical connection between inter-oligomer and
intra-oligomer duplexes such that networks with less substantial intra-oligomer duplexes tend to possess more substantial inter-oligomer duplexes. The chemical kinetics of each network were
characterized using a fluorescence signal and modeled using second-order rate equations. The duplex-formation reaction was modeled using the following rate-equation governed by rate constant
kf. $${{{\mbox{S}}}}_{1}+{{{\mbox{S}}}}_{2}\mathop{\to }\limits^{{{{\mbox{k}}}}_{{{\mbox{f}}}}}{{{\mbox{D}}}}_{1}$$ (1)
$$\frac{{{\mbox{d}}}\left[{{{\mbox{D}}}}_{1}\right]}{{{\mbox{dt}}}}={{{\mbox{k}}}}_{{{\mbox{f}}}}\left[{{{\mbox{S}}}}_{1}\right]\left[{{{\mbox{S}}}}_{2}\right]$$ (2) The strand-displacement
reaction was modeled using the following rate-equation governed by rate-constant kd: $${{{\mbox{S}}}}_{1}+{{{\mbox{D}}}}_{2}\mathop{\to
}\limits^{{{{\mbox{k}}}}_{{{\mbox{d}}}}}{{{\mbox{D}}}}_{1}+{{{\mbox{S}}}}_{3}$$ (3)
$$\frac{{{\mbox{d}}}\left[{{{\mbox{D}}}}_{1}\right]}{{{\mbox{dt}}}}={{{\mbox{k}}}}_{{{\mbox{d}}}}\left[{{{\mbox{S}}}}_{1}\right]\left[{{{\mbox{D}}}}_{2}\right]$$ (4) Based on these
equations, the rate-constants kf and kd both have units of M-1s-1. Rate-constants were characterized at several experimental temperatures, and the measured rate-constant values are shown in
Fig. 4c. The largest rate-constant value observed was 5.9 x 107 M-1s-1 (kf, O-fit-1, 60 °C), and the smallest was 9.2 x 103 M-1s-1 (kf, RND-1, 10 °C). The range of these rate-constant values
is consistent with values reported elsewhere in the literature12,22,23,43,47,59,60,61,62,63. The mean and standard deviation of triplicate measurements for select data points are shown as
error bars on Fig. 4c. The maximum difference within these triplicate measurements was 4% or less. Reports detailing the measured rate-constant values and associated fluorescence data are
indexed in supplementary table S5 and shown in supplementary figs. S20 to S95. Network generation via W1 optimization yielded the lowest kinetic dispersions. Across the six experimental
temperatures, these kinetic dispersions were reduced by a median of 86% (formation reaction) and 75% (displacement reaction) relative to the randomly generated networks. These reductions are
consistent with those observed while analyzing existing experimental data (Fig. 1). Three key pieces of information were inferred from the low kinetic dispersions of the W1-fit networks:
(1) Substantial kinetic dispersion remained despite the factors held constant in this study (_e.g_., temperature, viscosity, ionic strength, oligomer length, duplex stability, and toehold
sequence), (2) unnecessary duplexes contributing to W1 explain most of this kinetic dispersion, and (3) reducing unnecessary duplexes using _in-silico_ optimization of W1 substantially
mitigated kinetic dispersion for this model system. Optimizing N yielded networks with unnecessary duplexes very similar to the W1-fit group. However, the kf for two of the N optimized
networks (N-Fit-2 and N-fit-3) differed substantially from the W1-fit group. This effect was especially pronounced at low temperatures, where rate-constants differed up to a factor of 7.
This kinetic dispersion can be explained by unnecessary intra-oligomer duplexes containing 2 base-pairs. This is consistent with the prior evidence that 2-base-pair unnecessary
intra-oligomer duplexes are substantial enough to affect hybridization kinetics under some conditions. It was inferred from this data that the kinetic dispersion of 2-base-pair unnecessary
intra-oligomer duplexes were negligible above approximately 30 °C. Optimizing O yielded two networks with kinetics similar to the W1-fit group, and one network with a substantially faster
duplex-formation reaction. This accelerated reaction may be explained by the numerous large unnecessary inter-oligomer duplexes introduced during O optimization (i.e., duplexes as large as 8
base-pairs). Notably, the two O-fit networks with kinetics similar to the W1-fit group also contain large unnecessary inter-oligomer duplexes (i.e., as large as 10 base-pairs) which
indicates that not all unnecessary inter-oligomer duplexes affect all hybridization reactions. For all optimized networks, kinetic dispersion decreased at higher temperatures. This effect
was especially pronounced for the W1-fit oligomers undergoing the displacement reaction at 50 °C and 60 °C. At these temperatures, kinetic dispersions were reduced by 94% and 96% relative to
randomly generated oligomers. The remaining kinetic dispersion for the W1-fit oligomers at these temperatures appears to be beyond the resolution of the current study, which was estimated
at 4%. The trend of decreasing kinetic dispersion with increasing temperature may be explained by the destabilization of smaller intra-oligomer unnecessary duplexes. From this, it was
inferred that operating networks of DNA oligomers at relatively high temperatures may help mitigate kinetic dispersion, especially for networks containing relatively small unnecessary
duplexes. TEMPERATURE DEPENDENCE OF HYBRIDIZATION KINETICS Rate-constants for the newly generated networks were measured at temperatures of 10, 20, 30, 40, 50 and 60 °C. In this temperature
range, rate constants for the duplex-formation reaction were observed to be strictly increasing, suggesting an Arrhenius temperature dependence. Similar reactions where oligomers form a
single duplex have been characterized in previous studies and both Arrhenius41,43,47,64 and non-Arrhenius22,47,65 temperature dependences have been observed. Alternatively, rate constants
for the strand-displacement reaction typically exhibited a broad peak with a maximum rate-constant between 20 and 40 °C. The decreased strand-displacement rate-constants at higher
temperatures can be explained by a destabilizing intermediate, which was presumed to be a three-oligomer complex prior to strand-displacement. The temperature dependence of duplex-formation
rate-constants were modeled using the following equation: $${{{\mbox{k}}}}_{{{\mbox{f}}}}={{\mbox{A}}}\cdot \exp
\left(\frac{-{{{\mbox{E}}}}_{{{\mbox{a}}}}}{{{{\mbox{k}}}}_{{{\mbox{B}}}}\,{{\mbox{T}}}}\right)$$ (5) Referred to as the Arrhenius equation, this equation includes the following values: the
pre-exponential factor (A), the activation energy (Ea), the absolute temperature (T), and the Boltzmann constant (kB). Based on Eq. 5, a plot of the natural logarithm of kf as a function of
inverse temperature is expected to be linear, and such plots are shown in Fig. 5a. The activation energy and pre-exponential factor extracted from these fits are shown in Fig. 5b. The
duplex-formation rate-constants of all twelve networks were well described by the Arrhenius equation, which models the rate-limiting step of a reaction as a constant energy barrier overcome
using thermal energy. Dispersion in the magnitude of this barrier (_i.e_., the activation energy Ea) followed similar trends to kinetic dispersion and was most uniform for the W1 optimized
networks. A plot of the natural logarithm of the pre-exponential factor as a function of activation energy was observed to be highly linear and is shown in Fig. 5c. This implies the
following equation relating these parameters: $${{{{\mathrm{ln}}}}}\left({{\mbox{A}}}\right)={{{\mbox{C}}}}_{1}+{{{\mbox{C}}}}_{2}{{{\mbox{E}}}}_{{{\mbox{a}}}}$$ (6) where C1 and C2 are real
valued constants. A linear fit to the data in Fig. 5c is shown as a solid line. This fit yielded C1 and C2 values of 19.1 log M-1s-1 and 20.6 x 1019 J-1 log M-1s-1, respectively. Equation 6
was simplified by declaring the following constants: $${{{\mbox{C}}}}_{3}\equiv \exp \left({{{\mbox{C}}}}_{1}\right)$$ (7) $${{{\mbox{T}}}}_{{{\mbox{c}}}}\equiv
\frac{1}{{{{\mbox{k}}}}_{{{\mbox{B}}}}{{{\mbox{C}}}}_{2}}$$ (8) which yielded the following empirical model for kf: $${{{\mbox{k}}}}_{{{\mbox{f}}}}={{{\mbox{C}}}}_{3}\cdot \exp
\left(\frac{{{{\mbox{E}}}}_{{{\mbox{a}}}}}{{{{\mbox{k}}}}_{{{\mbox{B}}}}}\cdot \left(\frac{1}{{{{\mbox{T}}}}_{{{\mbox{c}}}}}-\frac{1}{{{\mbox{T}}}}\right)\right)$$ (9) One feature of this
equation is the theoretical critical temperature (Tc) at which formation rates equal the maximum rate (C3) regardless of the value of the activation energy (Ea). The C1 and C2 values from
the fit to Eq. 6 imply C3 and Tc values of 1.97 x 108 M-1s-1 and 352 K. These values are likely specific to the duplex-formation reaction in Fig. 4a and factors such as viscosity, ionic
strength, oligomer length, and duplex stability. Correlations such as Eq. 6 have been observed by previous researchers and were interpreted as evidence of enthalpy/entropy compensation in an
underlying thermodynamic model66. However, this explanation is contradicted by evidence that duplex formation reactions are a dynamic equilibrium governed by a rate-limiting transition
through an intermediate42,61,67,68. As such, it was concluded that Eq. 9 arises from a similarity in the rate-limiting reaction mechanism of the reactions. These results suggest the
following model of how unnecessary duplexes affect hybridization reactions. Hybridization reactions exhibit highly similar chemical kinetics manifesting as low kinetic dispersions when all
known factors, including unnecessary duplexes, are sufficiently controlled. For hybridization reactions where a single duplex is formed, these chemical kinetics exhibit an Arrhenius
temperature dependence. Relative to these chemical kinetics, unnecessary duplexes can either accelerate or decelerate a given hybridization reaction, which manifests as increased kinetic
dispersion. Different unnecessary duplexes affect the reaction mechanism in different ways, leading to a divergence in reaction mechanisms and many possible behaviors. This model appears to
adequately describe both relatively simple reactions such as the duplex-formation reaction in Fig. 4a, and more complicated reactions such as those from Fig. 1. For at least some
hybridization reactions, these divergent reaction mechanisms continue to follow mathematical relationships such as Eq. 9, regardless of the effects of unnecessary duplexes. GENERATION OF
LARGEST POSSIBLE NETWORKS Numerous methods for generating DNA oligomers have been reported and there exist at least fifteen computer programs specifically intended to generate networks of
interacting DNA oligomers7,38,53,54,57,69,70,71,72,73,74,75,76,77,78. The size limitations of SeqEvo and several freely available programs were assessed _in-silico_ by generating the largest
possible networks obeying the “no 3 s and no 8 s” rule. New networks were generated for two separate model systems. The first model system consisted of a single intentional duplex
containing the maximum possible number of base-pairs. The second model system consisted of the maximum possible number of independent 8-base-pair duplexes. The size of the largest networks
generated for these model systems is reported in Fig. 6. The largest networks yielded by default parameters of the SeqEvo program were a single 560-base-pair duplex and a network of 464
8-base-pair duplexes. For comparison, assigning bases randomly yielded only a 12-base-pair duplex and a network of 4 8-base-pair duplexes, indicating a significant improvement in network
size. Relative to the other generation methods, the default parameters of SeqEvo yielded both the second largest single duplex and the second largest network of 8-base-pair duplexes. The
EGNAS program yielded both the largest single duplex and the largest network of 8-base-pair duplexes. However, the EGNAS program is not currently capable of generating oligomers for other
network designs. The large networks generated using this program were attributed to three primary factors: (1) EGNAS appears specifically optimized for this type of network design (i.e.,
networks with no intentional connections between duplexes), (2) EGNAS quantifies unnecessary duplexes based on their length in base-pairs, and (3) the authors of EGNAS included clearly
labeled parameters controlling the size of the largest intra-oligomer unnecessary duplex and the size of the largest inter-oligomer duplex. These factors collectively make EGNAS well suited
for generating networks of independent duplexes obeying the “no 3’s and no 8’s” design rule. The default parameters of the NUPACK software yielded no networks satisfying the “no 3’s and no
8’s” design rule. This was attributed to the following factors: (1) NUPACK differentiates between networks using thermodynamic simulation and not based on the number of base-pairs in
unnecessary duplexes, and (2) NUPACK optimizes for network thermodynamics, and relatively small unnecessary intra-oligomer duplexes rarely affect the thermodynamic stability of intentional
duplexes. It has been specifically reported that some thermodynamically unfavorable duplexes can affect hybridization kinetics22. However, it is possible that the unnecessary duplexes
present in NUPACK-generated networks have no impact on hybridization kinetics. A future study directly comparing the kinetic dispersion of networks generated using thermodynamic and
non-thermodynamic optimization criteria may provide valuable insight into the relationship between unnecessary duplexes and reaction kinetics. Each generation method studied here has
parameters which can be changed by advanced users, and a sufficiently advanced user could almost certainly tune these parameters to generate larger networks satisfying the “no 3’s and no
8’s” design rule. For example, a network of 1024 8-base-pair duplexes was generated using the SeqEvo software by tuning the following parameters: (1) CPL increased from 100,000 to 200,000.
(2) interSLC increased from 1 to 8. (3) intraSLC increased from 1 to 3. A single 1024 base-pair duplex satisfying the “no 3’s and no 8’s” design rule was generated from the SeqEvo software
by tuning the following parameters: (1) CPL increased from 100,000 to 1,000,000. (2) intraSLC increased from 1 to 3. (3) interSLC increased from 1 to 8. Developing optimal methods for
mitigating kinetic dispersion using SeqEvo or other programs is a promising opportunity for future research and may substantially increase the size of networks for which kinetic dispersion
can be mitigated. NEW OLIGOMERS FOR EXISTING DESIGNS A key advantage of SeqEvo relative to other freely available design tools is this program’s ability to generate networks for a variety of
different network designs. However, larger and/or more complicated network designs may introduce issues that limit the effectiveness of this program. To better understand the variety of
networks which can be generated using SeqEvo, new networks were generated for several existing network designs. Attempts were made to generate networks satisfying the “no 3’s and no 8’s”
design rule for 4 existing network designs. These designs included: (1) the “entropy driven autocatalytic network” reported by Zhang et al.59, (2) the “autocatalytic four-arm junction”
reported by Kotani et al.79, (3) the “four-input OR seesaw-gate network” reported by Qian et al.58, and (4) the “10 x 10 x 10 molecular canvas” reported by Ke et al.56. The DevPro software
was used to characterize the unnecessary duplexes for both published and newly generated networks. These unnecessary duplexes are summarized in Table 1. Supplementary note 5 reports
additional information regarding these networks. The base-sequence of the new networks and the number of unnecessary duplexes of each length are detailed in supplementary tables S6 to S13.
Unnecessary duplexes were successfully reduced for each of the designs, however the “no 3’s and no 8’s” design rule could not be satisfied for two of the four designs. The failure to
generate a satisfactory network for the “10 x 10 x 10 molecular canvas” was attributed to the size of this design, which contained 517 total oligomers and oligomers up to 48 bases in length.
The amount of computation required to generate the new networks depended strongly on network design. The least amount of computation was required by the “entropy-driven autocatalytic
network” design. The new network for this design was generated after scoring 800,000 total networks in 2 min using a laptop computer (Intel Xeon E3-1505M v5 processor). The most computation
was required for the “10 x 10 x 10 molecular canvas” design. The new network for this design was generated after scoring 40,000,000 networks using a single node of a computing cluster for 28
h (dual intel Xeon E5-2680 v4 processors). The following SeqEvo parameters were tuned to generate this new network: (1) CPL increased from 100,000 to 1,000,000. (2) intraSLC increased from
1 to 3. (3) interSLC increased from 1 to 5. (4) scoringWeightX increased from 10,000 to 100,000,000. DISCUSSION For networks of interacting DNA oligomers, it is known that stretches of
Watson-Crick base-pairing which are not part of an intended hybridization reaction, here referred to as unnecessary duplexes, can affect the chemical kinetics of hybridization
reactions9,22,23,36,37,38,39,40. However, these chemical kinetics also depend on other known factors such as temperature12,22,41,42,43,44, ionic strength45,46, viscosity45,47, oligomer
length42, and duplex stability12,22,48,49. Here, a difference in chemical kinetics is referred to as kinetic dispersion, and the relationship between unnecessary duplexes and kinetic
dispersion was studied. Several key observations were made regarding unnecessary duplexes and kinetic dispersion. For one, evidence was observed that some unnecessary intra-oligomer duplexes
containing as few as 2 base-pairs are substantial enough to cause kinetic dispersion under certain conditions. Such evidence was observed both while analyzing previously reported in-vitro
kinetic dispersions (Fig. 1) and while analyzing the in-vitro kinetic dispersions of newly generated networks (Fig. 4). Evidence was also observed that not all unnecessary duplexes affect
all hybridization reactions. For the conditions studied here, the effects of 2-base-pair unnecessary intra-oligomer duplexes appear mitigated at temperatures of 30 °C or higher. However,
some 3-base-pair unnecessary intra-oligomer duplexes appear substantial enough to cause kinetic dispersion for a larger range of experimental conditions. Finally, when networks were randomly
sampled _in-silico_, it was observed that unnecessary duplexes typically scale exponentially with network size (Fig. 2) and that that all but the smallest networks contain unnecessary
duplexes substantial enough to cause kinetic dispersion. Results reported here demonstrate that sufficiently controlling factors known to affect hybridization kinetics can greatly limit
kinetic dispersion, and that these reductions are sometimes substantial enough that little to no kinetic dispersion can be observed. For example, networks generated using W1 optimization
(W1-Fit group, orange data in Fig. 4) exhibited exceptionally low kinetic dispersions for the duplex-formation reaction at 50 °C. Based on the kinetic model and experimental conditions used
here, the fastest and slowest of these reactions should require 4.2 s and 5.6 s to reach half-completion, respectively. Supplementary note 6 details the calculation of these half-completion
times and the calculated values are reported in supplementary table S14. When this kinetic dispersion is projected upon an event such as the release of medication from a nanoscale box14,
these 4.2 and 5.6 s half-completion times imply that newly generated boxes may require anywhere between 4.2 to 5.6 s to release one-half of the medication. Results reported here also
indicate that failing to mitigate unnecessary duplexes can lead to substantial kinetic dispersion. For example, new networks generated without _in-silico_ optimization (RND group, grey data
in Fig. 4) exhibited notably high kinetic dispersions for the duplex-formation reaction at 10 °C. Based on the kinetic model and experimental conditions used here, the fastest and slowest of
these reactions should require 160 and 11,000 s to reach half-completion, respectively. By the same logic above, this implies newly generated boxes may require anywhere between 160 to
11,000 s to release one-half of the medication. While either, or neither, of these time scales may be favorable depending on specifics of the application, not knowing if a newly generated
box will release medication in 160 or 11,000 s is almost certainly unfavorable. Such large differences in performance may contribute to problems such as increased development costs or
unreliable treatment outcomes. If the narrow range of performances expected for _in-silico_ optimized networks could be replicated in practice, this may reduce the costs associated with
treatment development or improve the reliability of treatment outcomes. Other applications for networks of DNA oligomers may also be sensitive to the magnitude of kinetic dispersion observed
here10,11,12,13,15,16,17,18,19,20,21, and it is plausible that these other applications could similarly benefit from the kinetic dispersion reductions observed here. The new network
generation method reported here utilizes _in-silico_ optimization to mitigate kinetic dispersion caused by unnecessary duplexes. Exceptionally low kinetic dispersions were demonstrated using
this method. This included the 3 W1-fit networks (blue in Fig. 4) whose rate-constants at 50 °C differ by no more than a factor of 1.33. Numerous other methods for generating networks of
DNA oligomers have been reported7,38,53,54,57,69,70,71,72,73,74,75,76,77,78. However, to the knowledge of the authors, this is the lowest such kinetic dispersion reported in the literature.
Using other reported methods, the lowest kinetic dispersion one could likely achieve would be by generating new networks using a modified version of the rate-constant predicting six-feature
weighted neighbor voting algorithm of Zhang et al.12. This algorithm was demonstrated to accurately predict most rate-constants within a factor of 2, and it is possible similar kinetic
dispersions could be achieved for network generation. The “no 3’s and no 8’s” rule is proposed here as a general guideline for researchers looking to mitigate unnecessary duplexes. The
following key observations were made regarding this design rule. (1) It is more difficult to satisfy this design rule for larger networks. While a formal limit for network size was not
observed, the largest networks reported here which satisfy this design rule include both a single duplex containing 1664 base-pairs and a network of 1927 8-base-pair duplexes. (2) Networks
which satisfy this design rule exhibit reduced kinetic dispersions under a range of typical experimental conditions. However, kinetic dispersions are lowest at experimental temperatures of
30 °C or higher. To help future researchers mitigate unnecessary duplexes, the SeqEvo and DevPro programs underpinning this network generation method have been made freely available. The
following key observations were made regarding these programs. (1) The programs can generate and/or analyze many different network designs. This was specifically demonstrated by generating
new oligomers for previously reported network designs (Table 1), for which SeqEvo was able to successfully eliminate 3-base-pair unnecessary intra-oligomer duplexes in 3 of 4 network
designs. It was speculated that SeqEvo failed to generate a suitable network for one of these networks due to the size of the design, which contained 517 oligomers and up to 48 bases per
oligomer. (2) When the largest possible networks satisfying the “no 3’s and no 8’s design” were generated using SeqEvo, the program yielded both a single duplex containing 1024 base-pairs
and a network of 1024 8-base-pair duplexes. These network sizes were larger than several other freely available design tools and were substantially improved by tuning SeqEvo parameters. The
only software which successfully generated networks larger than SeqEvo is not currently capable of generating networks for more complicated designs. Results here indicate the following
promising opportunities for future research. (1) There are likely existing applications for which the “no 3’s and no 8’s” design rule mitigates kinetic dispersion sufficiently to improve
performance and/or reliability. Characterizing the effects of the “no 3’s and no 8’s” design rule on such applications may both improve the outcomes of these applications and help develop
future design rules. (2) There appear to be some designs for which networks satisfying the “no 3’s and no 8’s” design rule exists but cannot be generated using current software. Improved
network generation software may increase the designs for which the “no 3’s and no 8’s” design rule can be satisfied, which may help expand the applications for which kinetic dispersion can
be mitigated. (3) While the “no 3’s and no 8’s” design rule and the fitness scores utilized here appear highly effective, they were observed to be limited by the assumption that unnecessary
duplexes of a given length are equally bad. Development of new fitness scores, design rules and network generation methods could greatly increase the applications for which kinetic
dispersion can be mitigated. METHODS IQRNL Kinetic dispersion was quantified as the inter-quartile range of the natural logarithm of rate-constant values (abbreviated IQRNL). The following
observations regarding this metric were made. Small IQRNL values correspond with smaller, and more favorable, kinetic dispersion. IQRNL values can only be meaningfully compared if they are
derived from the same rate-equations and represent the same rate-constant of these equations. IQRNL values are expected to be robust to both occasional outliers and the several orders of
magnitude expected for these rate constants. IQRNL values are typically insensitive to the outer 50% of data, making them a conservative metric for quantifying kinetic dispersion. The IQRNL
of a sample is expected to systematically underestimate the value of a population, marking it as a biased estimator and making the most meaningful comparisons of IQRNL values those
calculated from the same number of samples. DATASETS Datasets which reported 30 or more measurements of a specific rate-constant under consistent experimental conditions were identified for
analysis. The five datasets identified include: (1) 47 measurements by Hata et al.22 of a duplex-formation reaction at room temperature (approximated as 22 °C), (2) 51 measurements by Olson
et al.23 of a catalytic reaction at 25 °C, (3) 51 measurements by Olson et al.23 of a leak reaction at 25 °C, (4) 98 measurements by Zhang et al.12 of a duplex-formation reaction at 37 °C,
and (5) 95 measurements by Zhang et al.12 of a duplex-formation reaction at 55°C. The samples within each dataset are expected to have approximately constant temperature, ionic strength,
viscosity, oligomer length, and duplex-stability. Either no other oligomers, or only poly-T oligomers were consistently present in each dataset. It is expected that unnecessary duplexes are
the only known cause of kinetic dispersion not controlled within each dataset. The oligomers which yielded this data were neither randomly generated nor rationally designed. However, they
were approximated as independent and identically distributed80,81 for the statistical analysis. N, O, AND WX Inter-oligomer duplexes were summarized using the network fitness score
(abbreviated N). N was calculated by accumulating 10L fitness points for each duplex connecting two oligomers: $${{\mbox{N}}}\equiv \mathop{\sum
}\limits_{{{\mbox{inter}}}-{{\mbox{oligo}}}}^{{{\mbox{i}}}}{10}^{{{{\mbox{L}}}}_{{{\mbox{i}}}}}$$ (10) where L is the number of base-pairs in the duplex. Calculation of N included fitness
points for both duplexes which are part of larger duplexes and for each oligomer interacting with an identical oligomer. The value of N is zero when the oligomers can form no inter-oligomer
duplexes, and larger values of N indicate more substantial (i.e., larger or more numerous) inter-oligomer duplexes. Unnecessary inter-oligomer duplexes were quantified using the network
fitness score above baseline (abbreviated ΔN), which was calculated as network fitness score minus the network fitness score of all necessary duplexes. Intra-oligomer duplexes were
summarized using the oligomer fitness score (abbreviated O). O was calculated by accumulating 10 L fitness points for each duplex connecting an oligomer with itself: $${{\mbox{O}}}\equiv
\mathop{\sum }\limits_{{{\mbox{intra}}}-{{\mbox{oligo}}}}^{{{\mbox{i}}}}{10}^{{{{\mbox{L}}}}_{{{\mbox{i}}}}}$$ (11) where L is the number of base-pairs in the duplex. Calculation of O
included fitness points for duplexes which are part of larger duplexes. The value of O is zero when the oligomers can form no intra-oligomer duplexes, and larger values of O indicate more
substantial intra-oligomer duplexes. Unnecessary intra-oligomer duplexes were quantified using the oligomer fitness score above baseline (abbreviated ΔO), which was calculated as oligomer
fitness score minus the oligomer fitness score of all necessary duplexes. Inter-oligomer and intra-oligomer duplexes were collectively summarized using a class of weighted fitness scores
(generally abbreviated Wx). These fitness scores were calculated as weighted linear combinations of N and O: $${{{\mbox{W}}}}_{{{\mbox{x}}}}\equiv {{\mbox{N}}}+{{\mbox{x}}}\cdot
{{\mbox{O}}}$$ (12) where x is a positive real number determining the relative contribution of O. Specific Wx are denoted with the value of x in the subscript (_e.g_., W1 denotes Wx with x =
1). The use of larger x values allows one to place an increasing emphasis on intra-oligomer duplexes. The value of all Wx are zero when no duplexes are present, and larger values indicate
more substantial duplexes. Unnecessary duplexes were quantified using the weighted fitness scores above baseline (abbreviated ΔWx), which were calculated as the weighted fitness score minus
the weighted fitness score of any necessary duplexes. The following three observations were made regarding these fitness scores. First, N and O are not orthogonal since any intra-oligomer
duplex physically implies the existence of two inter-oligomer duplexes. Alternatively stated, if an oligomer can form base-pairs with itself, it can also form these base-pairs with an
identical oligomer. Consequently, any structure which contributes to O also contributes to N, and N is always greater than or equal to 2 ∙ O. Second, while oligomers with ΔN and ΔO values
approaching zero converge to the design, oligomers with larger values may diverge from the design by a variety of mechanisms resulting in a variety of behaviors. This makes ΔN and ΔO
appropriate for quantifying unnecessary duplexes, but less appropriate for predicting the behavior of oligomers. Third, DNA oligomers are known to participate in other interactions not
captured by N and O, so some oligomers with abnormal behavior may be expected. Such other interactions include non-canonical base-pairing, triplexes, quadruplexes, the binding of oligomers
to container walls, or aptamer-like binding to other targets82,83. ESTIMATING KINETIC DISPERSION IN EXISTING EXPERIMENTAL DATA Fitness score values for each sample were calculated using the
custom-written Device Profiler (abbreviated DevPro) computer program. This program utilizes an exhaustive linear search to identify both necessary and unnecessary duplexes. The source code
of the DevPro computer program has been made freely available online55. Supplementary note 3 provides more information regarding the DevPro program. The kinetic dispersions of randomly
selected samples were estimated using the following process. First, the number of samples to analyze, _n_, was chosen. The rate-constants of _n_ samples were selected randomly. These rate
constants were resampled with replacement 1000 times and IQRNL was calculated for each of the resamplings. The 25th, 50th, and 75th percentiles of these IQRNL values were recorded. This
process was repeated 1,000 times, and the median of the 25th, 50th, and 75th percentiles are reported in Fig. 1 as the kinetic dispersion estimates. The kinetic dispersions of the most-fit
samples were estimated using the following process. First, the number of samples to analyze, _n_, was chosen. The rate-constants of the _n_ most-fit samples were selected. These rate
constants were resampled with replacement 1000 times and IQRNL was calculated for each of these resamplings. The 25th, 50th, and 75th percentiles of these IQRNL values were recorded.
Normalized percentiles were calculated by dividing these values by the 50th percentile of the randomly selected samples and are reported in Fig. 1 as the kinetic dispersion estimates. For
both N-fit and O-fit samples, similar trends were observed when a greater number of samples were selected. However, kinetic dispersion reduction was most pronounced for small sample sizes.
This can be explained by the decrease in sample fitness when increasing the number of samples. Since IQRNL values were all calculated using the same number of samples, the increasing kinetic
dispersion reduction with decreasing number of samples is not expected to be an artifact of biased parameter estimation. SCALING OF UNNECESSARY DUPLEXES The typical unnecessary
inter-oligomer duplexes for a given combination of _i_ and _j_ was estimated using the following process. First, 10,000 networks with random base-sequence were generated. N was calculated
individually for each network. Since these networks contain no intentional duplexes, all duplexes are unnecessary duplexes and N = ΔN. The 50th percentile of these N values was taken to be
an estimate of the typical unnecessary inter-oligomer duplexes. The 25th and 75th percentile of the N values were taken as lower and upper bounds for this estimate. The unnecessary
intra-oligomer duplexes were estimated similarly, except using the O fitness score. GENERATION OF NEW NETWORKS New networks of oligomers forming less substantial unnecessary duplexes were
generated by using _in-silico_ optimization of ΔN, ΔO, or ΔWx. In terms of the design paradigms established by Dirks et al.84, this generation method contains a network design acting as a
positive design component and base-sequence optimization acting as a negative design component. New networks were generated using the following process described visually in Fig. 3a. First,
the intentional duplexes for the network were formalized as a domain-based design57. This involved describing each oligomer as a sequence of binding domains and binding domain complements.
Binding domains were declared as either fixed (i.e., not to be mutated) or variable (i.e., free to be mutated). Next, initial base-sequences for each domain were specified. The domain-based
design and initial domain sequences were used as input for the custom-written Sequence Evolver (abbreviated SeqEvo) computer program, which optimized ΔN, ΔO, or ΔWx via a custom
evolution-inspired algorithm. Supplementary note 2 details operation of SeqEvo. Supplementary note 3 details fitness score calculation. All mutations performed by SeqEvo rearrange bases
within a variable domain. Thus, all oligomers generated by SeqEvo have domains with the same length and base-composition as the initial base-sequences. For the first optimization, default
SeqEvo parameters were used. The network resulting from SeqEvo optimization, referred to as a candidate network, was then analyzed by a human looking for obvious flaws. For this purpose, the
custom-written Device Profiler (abbreviated DevPro) computer program was used to identify and profile unnecessary duplexes. It is possible to incorporate other software such as the nucleic
acid analysis package (NUPACK)75 thermodynamic simulator at this step if necessary. Based on the human analysis, the candidate network was either accepted as final or the SeqEvo parameters
were updated and optimization repeated. The source code of both the SeqEvo and DevPro programs have been made freely available online55. MEASUREMENT OF IN-VITRO RATE-CONSTANTS FOR NEW
NETWORKS The network design for this model system (Fig. 4a) included three DNA oligomers (labeled S1, S2, and S3) composed of two binding domains (α and β). From their 5’ ends: oligomer S1
contains domains α then β, S2 contains the binding complement of β then the binding complement of α, and S3 contains only domain α. These three oligomers form two intentional duplexes: D1
(which forms between S1 and S2) and D2 (which forms between S3 and S2). Domain α is a variable sequence of 10 C’s, 10 G’s, 10 A’s, and 11 T’s. Domain β is the fixed sequence TCTCCATG, which
was adopted from a previous study in order to mitigate the known effects of toehold stability on hybridization kinetics85. Hybridization reactions were characterized using the following
procedure. Oligomers were purchased HPLC purified from Integrated DNA Technologies with Cy3 and “Black Hole Quencher 1” modifications, respectively. Oligomer concentrations were calculated
using the absorbance coefficients reported by Integrated DNA Technologies and absorption measurements at 260 nm (Thermo Nanodrop One spectrophotometer). Reactants were prepared in 1x TE
buffer (supplemented with 1 M NaCl) at 1 μM concentration. Reactants were combined at an initial concentration of 10 nM in 1x TE (supplemented with 1 M NaCl) inside of one of two Cary
Eclipse spectrophotometers. The concentration of unquenched Cy3 dye, which was proportional to the concentration of unreacted S1, was monitored using excitation and emission wavelengths of
548 nanometers and 574 nanometers. Under these conditions, a plot of inverse-reactant-concentration as a function of time is expected to be linear with slope equal to the reaction
rate22,37,44. Linear fits were applied to the first 5, 10, 20, 50, 100, 200, and 500 s of data and the fit with the largest coefficient of determination (R2) was taken to be the best
rate-constant measurement. This method yielded median R2 values of 0.9968 and 0.9951 for the duplex-formation and oligo-displacement reactions. Supplementary note 4 includes the fluorescence
traces, concentration plots, and rate-constant values for each measurement. The precision of the rate-constant measurements was sampled using triplicate measurements of the “W1-fit-3”
oligomer set. This included three measurements each for the formation and displacement reactions at 20 °C and 40 °C. In total, this yielded twelve rate-constant values organized into four
groups of three. Within these groups of three, the largest deviation from a group average was 4% and the median deviation was 2%. Errors in commercial oligonucleotide synthesis have been
observed to induce up to a factor of two difference in rate-constant values12 and each group of three rate-constant values were based on a single synthesis. This suggests that, despite the
relatively high precision of the experimental method, all measurements for any given oligomer-set may be systematically offset by up to a factor of two due to synthesis errors. Since the
observed rate-constant values span four orders of magnitude, it is reasonable to expect the majority of observed rate variation to arise from factors other than synthesis errors. This
assumption is further supported by the high reproducibility of certain design groups, such as the W1-fit oligomers at high experimental temperatures. For Fig. 4d, kinetic dispersion was
estimated using the following process. First, the three networks were resampled with replacement 5000 times. For each resampling, the IQRNL was calculated at each of the six temperatures.
For each resampling, the median IQRNL of the six temperatures was calculated. The 50th percentile of the median IQRNL across the 6 temperatures was interpreted as the estimated kinetic
dispersion. The 25th and 75th percentiles were used as upper and lower bounds for this estimate. Normalized kinetic dispersions were calculated relative to the 50th percentile of the RND
group. GENERATION OF LARGEST POSSIBLE NETWORKS New networks were generated using the following freely available design tools: (1) The SeqEvo computer program, (2) the Domain Design (DD)
program57, (3) the DNASequenceGenerator (DSG) program54, (4) the Exhaustive Generation of Nucleic Acid Sequence (EGNAS) program53, (5) the Nucleic Acid Package (NUPACK) software75, and (6)
the Uniquimer3D program74. Networks were generated using default program parameters, with as minimal adjustments as possible made to program settings. For reference, networks were also
generated by randomly assigning base-sequences. Networks were first generated for a network forming a single 8 base-pair duplex. Either the number of duplexes or the number of base-pairs
were repeatedly doubled until 3 runs of the generation program could no longer yield a network satisfying the”no 3’s and no 8’s” design rule. At this point, the previous network size was
revisited. Either the number of duplexes or the number of base-pairs were then increased in 10% increments until 3 runs of the generation program could no longer yield a network satisfying
the “no 3’s and no 8’s” design rule. REPORTING SUMMARY Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article. DATA AVAILABILITY
Numerical data underlying graphs and charts is provided in the ‘Supplementary Data 1’ file. CODE AVAILABILITY The results reported in this manuscript were either created or duplicated using
version 2.0 of the SeqEvo and DevPro programs. The source code of these programs is available on GitHub55. The 2.0 versions of these programs have been archived86 and are also provided in
the ‘Supplementary Software 1’ file. REFERENCES * Seeman, N. C. DNA nanotechnology: from the pub to information-based chemistry. _Methods Mol. Biol._ 1811, 1–9 (2018). CAS PubMed Google
Scholar * Zhang, D. Y. & Seelig, G. Dynamic DNA nanotechnology using strand-displacement reactions. _Nat. Chem._ 3, 103–113 (2011). CAS PubMed Google Scholar * Pinheiro, A. V., Han,
D., Shih, W. M. & Yan, H. Challenges and opportunities for structural DNA nanotechnology. _Nat. Nanotechnol._ 6, 763–772 (2011). CAS PubMed PubMed Central Google Scholar * Chen, Y.
J., Groves, B., Muscat, R. A. & Seelig, G. DNA nanotechnology from the test tube to the cell. _Nat. Nanotechnol._ 10, 748–760 (2015). CAS PubMed Google Scholar * Bathe, M. &
Rothemund, P. W. K. DNA nanotechnology: A foundation for programmable nanoscale materials. _MRS Bull._ 42, 882–888 (2017). CAS Google Scholar * Seeman, N. C. Nucleic acid junctions and
lattices. _J. Theor. Biol._ 99, 237–247 (1982). CAS PubMed Google Scholar * Rothemund, P. W. K. Folding DNA to create nanoscale shapes and patterns. _Nature_ 440, 297–302 (2006). CAS
PubMed Google Scholar * Adleman, L. M. Molecular Computation of Solutions to Combinatorial Problems. _Science_ 266, 1021–1024 (1994). CAS PubMed Google Scholar * Yurke, B., Turberfield,
A. J., Mills, A. P. Jr., Simmel, F. C. & Neumann, J. L. A DNA-fuelled molecular machine made of DNA. _Nature_ 406, 605–608 (2000). CAS PubMed Google Scholar * Cherry, K. M. &
Qian, L. L. Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks. _Nature_ 559, 370 (2018). CAS PubMed Google Scholar * Choi, H. M. T., Beck, V. A.
& Pierce, N. A. Next-Generation in Situ Hybridization Chain Reaction: Higher Gain, Lower Cost, Greater Durability. _Acs Nano._ 8, 4284–4294 (2014). CAS PubMed PubMed Central Google
Scholar * Zhang, J. X. et al. Predicting DNA hybridization kinetics from sequence. _Nat. Chem._ 10, 91–98 (2018). PubMed Google Scholar * Andersen, E. S. et al. Self-assembly of a
nanoscale DNA box with a controllable lid. _Nature_ 459, 73–U75 (2009). CAS PubMed Google Scholar * Zadegan, R. M. et al. Construction of a 4 Zeptoliters Switchable 3D DNA Box Origami.
_Acs Nano._ 6, 10050–10053 (2012). CAS PubMed Google Scholar * Grossi, G., Jepsen, M. D. E., Kjems, J. & Andersen, E. S. Control of enzyme reactions by a reconfigurable DNA nanovault.
_Nat. Commun._ 8, https://doi.org/10.1038/s41467-017-01072-8 (2017). * Siu, K. H. & Chen, W. Riboregulated toehold-gated gRNA for programmable CRISPR-Cas9 function. _Nat. Chem. Biol._
15, 217 (2019). CAS PubMed Google Scholar * MUckl, A., Schwarz-Schilling, M., Fischer, K. & Simmel, F. C. Filamentation and restoration of normal growth in Escherichia coli using a
combined CRISPRi sgRNA/antisense RNA approach. _Plos One_ 13, https://doi.org/10.1371/journal.pone.0198058 (2018). * Dickinson, G. et al. An alternative approach to nucleic acid memory.
_Nat. Commun._ 12, https://doi.org/10.1038/s41467-021-22277-y (2021). * Banal, J. et al. Random access DNA memory using Boolean search in an archival file storage system. _Nat. Mater._ 20,
1272 (2021). +. CAS PubMed PubMed Central Google Scholar * Cannon, B. et al. Excitonic AND Logic Gates on DNA Brick Nanobreadboards. _Acs Photonics_ 2, 398–404 (2015). CAS PubMed
PubMed Central Google Scholar * Dunn, K. & Elfick, A. Harnessing DNA Nanotechnology and Chemistry for Applications in Photonics and Electronics. _Bioconjugate Chem._
https://doi.org/10.1021/acs.bioconjchem.2c00286 (2022). * Hata, H., Kitajima, T. & Suyama, A. Influence of thermodynamically unfavorable secondary structures on DNA hybridization
kinetics. _Nucl. Acids Res._ 46, 782–791 (2018). CAS PubMed Google Scholar * Olson, X. et al. Availability: A Metric for Nucleic Acid Strand Displacement Systems. _ACS Synth. Biol._ 6,
84–93 (2017). CAS PubMed Google Scholar * Andersen, E. S. Prediction and design of DNA and RNA structures. _N. Biotechnol._ 27, 184–193 (2010). CAS PubMed Google Scholar * Chhabra, R.,
Sharma, J., Liu, Y., Rinker, S. & Yan, H. DNA self-assembly for nanomedicine. _Adv. Drug Deliv. Rev._ 62, 617–625 (2010). CAS PubMed Google Scholar * Doty, D. Theory of algorithmic
self-assembly. _Commun. Acm_ 55, 78 (2012). Google Scholar * Liddle, J. A. & Gallatin, G. M. Nanomanufacturing: A Perspective. _Acs Nano._ 10, 2995–3014 (2016). CAS PubMed PubMed
Central Google Scholar * Kumar, V. et al. DNA nanotechnology for cancer therapy. _Theranostics_ 6, 710–725 (2016). CAS PubMed PubMed Central Google Scholar * Zhirnov, V., Zadegan, R.
M., Sandhu, G. S., Church, G. M. & Hughes, W. L. Nucleic acid memory. _Nat. Mater._ 15, 366–370 (2016). CAS PubMed PubMed Central Google Scholar * Panda, D. et al. DNA as a digital
information storage device: hope or hype? _3 Biotech_ 8, 239 (2018). PubMed PubMed Central Google Scholar * Qian, L., Winfree, E. & Bruck, J. Neural network computation with DNA
strand displacement cascades. _Nature_ 475, 368–372 (2011). CAS PubMed Google Scholar * Chen, Y. J. et al. Programmable chemical controllers made from DNA. _Nat. Nanotechnol._ 8, 755–762
(2013). CAS PubMed PubMed Central Google Scholar * Lin, C., Liu, Y., Rinker, S. & Yan, H. DNA tile based self-assembly: building complex nanoarchitectures. _Chemphyschem_ 7,
1641–1647 (2006). CAS PubMed Google Scholar * Winfree, E., Liu, F., Wenzler, L. A. & Seeman, N. C. Design and self-assembly of two-dimensional DNA crystals. _Nature_ 394, 539–544
(1998). CAS PubMed Google Scholar * Chen, J. H. & Seeman, N. C. Synthesis from DNA of a molecule with the connectivity of a cube. _Nature_ 350, 631–633 (1991). CAS PubMed Google
Scholar * Mir, K. U. & Southern, E. M. Determining the influence of structure on hybridization using oligonucleotide arrays. _Nat. Biotechnol._ 17, 788–792 (1999). CAS PubMed Google
Scholar * Gao, Y., Wolf, L. K. & Georgiadis, R. M. Secondary structure effects on DNA hybridization kinetics: a solution versus surface comparison. _Nucl. Acids Res._ 34, 3370–3377
(2006). CAS PubMed PubMed Central Google Scholar * Seeman, N. C. De novo design of sequences for nucleic acid structural engineering. _J. Biomol. Struct. Dyn._ 8, 573–581 (1990). CAS
PubMed Google Scholar * Lysne, D. et al. Availability-Driven Design of Hairpin Fuels and Small Interfering Strands for Leakage Reduction in Autocatalytic Networks. _J. Phys. Chem. B_ 124,
3326–3335 (2020). CAS PubMed Google Scholar * Srinivas, N., Parkin, J., Seelig, G., Winfree, E. & Soloveichik, D. Enzyme-free nucleic acid dynamical systems. _Science_ 358,
https://doi.org/10.1126/science.aal2052 (2017). * Sikora, J. R., Rauzan, B., Stegemann, R. & Deckert, A. Modeling Stopped-Flow Data for Nucleic Acid Duplex Formation Reactions: The
Importance of Off-Path Intermediates. _J. Phys. Chem. B_ 117, 8966–8976 (2013). CAS PubMed Google Scholar * Menssen, R. J. & Tokmakoff, A. Length-Dependent Melting Kinetics of Short
DNA Oligonucleotides Using Temperature-Jump IR Spectroscopy. _J. Phys. Chem. B_ 123, 756–767 (2019). CAS PubMed Google Scholar * Sanstead, P. J. & Tokmakoff, A. Direct Observation of
Activated Kinetics and Downhill Dynamics in DNA Dehybridization. _J. Phys. Chem. B_ 122, 3088–3100 (2018). CAS PubMed Google Scholar * Morrison, L. E. & Stols, L. M. Sensitive
fluorescence-based thermodynamic and kinetic measurements of DNA hybridization in solution. _Biochemistry_ 32, 3095–3104 (1993). CAS PubMed Google Scholar * Stellwagen, N. C. &
Stellwagen, E. DNA Thermal Stability Depends on Solvent Viscosity. _J. Phys. Chem. B_ 123, 3649–3657 (2019). CAS PubMed Google Scholar * Carrillo-Nava, E., Mejia-Radillo, Y. & Hinz,
H. J. Dodecamer DNA Duplex Formation Is Characterized by Second-Order Kinetics, Positive Activation Energies, and a Dependence on Sequence and Mg2+ Ion Concentration. _Biochemistry_ 47,
13153–13157 (2008). CAS PubMed Google Scholar * Wallace, M. I., Ying, L. M., Balasubramanian, S. & Klenerman, D. Non-Arrhenius kinetics for the loop closure of a DNA hairpin. _Proc.
Natl Acad. Sci._ 98, 5584–5589 (2001). CAS PubMed PubMed Central Google Scholar * Owczarzy, R. et al. Predicting sequence-dependent melting stability of short duplex DNA oligomers.
_Biopolymers_ 44, 217–239 (1997). CAS PubMed Google Scholar * SantaLucia, J., Allawi, H. & Seneviratne, A. Improved nearest-neighbor parameters for predicting DNA duplex stability.
_Biochemistry_ 35, 3555–3562 (1996). CAS PubMed Google Scholar * Solari, M. Chakravarti,IM - handbook of methods of applied statistics. _Nature_ 216, 901 (1967). Google Scholar *
Wilcoxon, F. Individual comparisons by ranking methods. _Biometrics Bull._ 1, 80–83 (1945). Google Scholar * Anderson, T. & Darling, D. Asymptotic theory of certain goodness of fit
criteria based on stochastic processes. _Ann. Math. Stat._ 23, 193–212 (1952). Google Scholar * Kick, A., Bonsch, M. & Mertig, M. EGNAS: an exhaustive DNA sequence design algorithm.
_BMC Bioinforma._ 13, 138 (2012). Google Scholar * Feldkamp, U., Saghafi, S., Banzhaf, W. & Rauhe, H. 23-32 (Springer Berlin Heidelberg). * MTobiason. MTobiason/osp: The Oligomer
Sequence Package. https://github.com/MTobiason/osp, https://doi.org/10.5281/zenodo.8409054 (2023). * Ke, Y. G., Ong, L. L., Shih, W. M. & Yin, P. Three-dimensional structures
self-assembled from DNA bricks. _Science_ 338, 1177–1183 (2012). CAS PubMed Google Scholar * Zhang, D. Y. Towards domain-based sequence design for DNA strand displacement reactions. _Lect
Notes Comput Sci._ 6518, https://doi.org/10.1007/978-3-642-18305-8_15 (2010). * Qian, L. & Winfree, E. Scaling up digital circuit computation with DNA strand displacement cascades.
_Science_ 332, 1196–1201 (2011). CAS PubMed Google Scholar * Zhang, D. Y., Turberfield, A. J., Yurke, B. & Winfree, E. Engineering entropy-driven reactions and networks catalyzed by
DNA. _Science_ 318, 1121–1125 (2007). CAS PubMed Google Scholar * Sekar, M. M. A., Bloch, W. & St John, P. M. Comparative study of sequence-dependent hybridization kinetics in
solution and on microspheres. _Nucl. Acids Res._ 33, 366–375 (2005). CAS PubMed PubMed Central Google Scholar * Sikora, J. R., Rauzan, B., Stegemann, R. & Deckert, A. Modeling
stopped-flow data for nucleic acid duplex formation reactions: the importance of off-path intermediates. _J. Phys. Chem. B_ 117, 8966–8976 (2013). CAS PubMed Google Scholar * Green, S.
J., Lubrich, D. & Turberfield, A. J. DNA hairpins: fuel for autonomous DNA devices. _Biophys. J._ 91, 2966–2975 (2006). CAS PubMed PubMed Central Google Scholar * Bonnet, G.,
Krichevsky, O. & Libchaber, A. Kinetics of conformational fluctuations in DNA hairpin-loops. _Proc. Natl Acad. Sci._ 95, 8602–8606 (1998). CAS PubMed PubMed Central Google Scholar *
Rauzan, B. et al. Kinetics and Thermodynamics of DNA, RNA, and Hybrid Duplex Formation. _Biochemistry_ 52, 765–772 (2013). CAS PubMed Google Scholar * Chen, K. K. et al. Digital Data
Storage Using DNA Nanostructures and Solid-State Nanopores. _Nano Lett._ 19, 1210–1215 (2019). CAS PubMed Google Scholar * Fox, J. M., Zhao, M. X., Fink, M. J., Kang, K. & Whitesides,
G. M. The Molecular Origin of Enthalpy/Entropy Compensation in Biomolecular Recognition. _Annu. Rev. Biophys._ 47, 223–250 (2018). CAS PubMed Google Scholar * Chen, C. L., Wang, W. J.,
Wang, Z., Wei, F. & Zhao, X. S. Influence of secondary structure on kinetics and reaction mechanism of DNA hybridization. _Nucl. Acids Res._ 35, 2875–2884 (2007). CAS PubMed PubMed
Central Google Scholar * Srinivas, N. et al. On the biophysics and kinetics of toehold-mediated DNA strand displacement. _Nucleic Acids Res._ 41, 10641–10658 (2013). CAS PubMed PubMed
Central Google Scholar * Tanaka, F., Kameda, A., Yamamoto, M. & Ohuchi, A. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach. _Nucl. Acids Res._ 33,
903–911 (2005). CAS PubMed PubMed Central Google Scholar * Birac, J. J., Sherman, W. B., Kopatsch, J., Constantinou, P. E. & Seeman, N. C. Architecture with GIDEON, a program for
design in structural DNA nanotechnology. _J. Mol. Graph Model_ 25, 470–480 (2006). CAS PubMed PubMed Central Google Scholar * Andersen, E. S. et al. DNA origami design of dolphin-shaped
structures with flexible tails. _Acs Nano_ 2, 1213–1218 (2008). CAS PubMed Google Scholar * Douglas, S. M. et al. Rapid prototyping of 3D DNA-origami shapes with caDNAno. _Nucl. Acids
Res._ 37, 5001–5006 (2009). CAS PubMed PubMed Central Google Scholar * Williams, S. et al. Tiamat: A Three-Dimensional Editing Tool for Complex DNA Structures. _Lect. Notes Comput. Sci._
5347, 90–101 (2009). Google Scholar * Zhu, J., Wei, B., Yuan, Y. & Mi, Y. UNIQUIMER 3D, a software system for structural DNA nanotechnology design, analysis and evaluation. _Nucl.
Acids Res._ 37, 2164–2175 (2009). CAS PubMed PubMed Central Google Scholar * Zadeh, J. N. et al. NUPACK: Analysis and design of nucleic acid systems. _J. Comput Chem._ 32, 170–173
(2011). CAS PubMed Google Scholar * Beliveau, B. J. et al. OligoMiner provides a rapid, flexible environment for the design of genome-scale oligonucleotide in situ hybridization probes.
_Proc. Natl Acad. Sci._ 115, E2183–E2192 (2018). CAS PubMed PubMed Central Google Scholar * Engelhardt, F. et al. Custom-Size, Functional, and Durable DNA Origami with Design-Specific
Scaffolds. _Acs Nano._ 13, 5015–5027 (2019). CAS PubMed PubMed Central Google Scholar * Jun, H. et al. Automated Sequence Design of 3D Polyhedral Wireframe DNA Origami with Honeycomb
Edges. _Acs Nano._ 13, 2083–2093 (2019). CAS PubMed PubMed Central Google Scholar * Kotani, S. & Hughes, W. L. Multi-Arm Junctions for Dynamic DNA Nanotechnology. _J. Am. Chem. Soc._
139, 6363–6368 (2017). CAS PubMed PubMed Central Google Scholar * Efron, B. 1977 Rietz lecture - bootstrap methods - another look at the jackknife. _Ann. Stat._ 7, 1–26 (1979). Google
Scholar * Efron, B. Second thoughts on the bootstrap. _Stat. Sci._ 18, 135–140 (2003). Google Scholar * Ghosh, A. & Bansal, M. A glossary of DNA structures from A to Z. _Acta
Crystallogr. Sect. D.-Struct. Biol._ 59, 620–626 (2003). Google Scholar * Ellington, A. D. & Szostak, J. W. Selection in vitro of single-stranded DNA molecules that fold into specific
ligand-binding structures. _Nature_ 355, 850–852 (1992). CAS PubMed Google Scholar * Dirks, R. M., Lin, M., Winfree, E. & Pierce, N. A. Paradigms for computational nucleic acid
design. _Nucl. Acids Res._ 32, 1392–1403 (2004). CAS PubMed PubMed Central Google Scholar * Zhang, D. Y. & Winfree, E. Control of DNA strand displacement kinetics using toehold
exchange. _J. Am. Chem. Soc._ 131, 17303–17314 (2009). CAS PubMed Google Scholar * MTobiason. MTobiason/osp: Version 2.0. https://zenodo.org/record/8409055,
https://doi.org/10.5281/zenodo.8409055 (2023). Download references ACKNOWLEDGEMENTS We thank the Nanoscale Materials & Device Research Group at Boise State University. We also thank Dr.
Tim Andersen, Dr. James Alexander Liddle, and Dr. Igor Medintz. We acknowledge the support of the R2 compute cluster (DOI: 10.18122/B2S41H) provided by Boise State University’s Research
Computing Department. This research was funded by the: (1) W.M. Keck Foundation; (2) Higher Education Research Council (HERC) Idaho Global Entrepreneurial Mission (IGEM), (3) Semiconductor
Research Corporation [2018-SB-2842]; (4) National Science Foundation [CMMI 1344915, ECCS 1807809, and ECCS 2227626]; and (5) National Institutes of Health from the National Institute of
General Medical Sciences [K25GM093233]. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Department of Computer Science, Boise State University, Boise, ID, USA Michael Tobiason * Micron School
of Materials Science & Engineering, Boise State University, Boise, ID, USA Michael Tobiason & Bernard Yurke * Department of Electrical & Computer Engineering, Boise State
University, Boise, ID, USA Bernard Yurke * School of Engineering, University of British Columbia Okanagan Campus, Kelowna, BC, Canada William L. Hughes Authors * Michael Tobiason View author
publications You can also search for this author inPubMed Google Scholar * Bernard Yurke View author publications You can also search for this author inPubMed Google Scholar * William L.
Hughes View author publications You can also search for this author inPubMed Google Scholar CONTRIBUTIONS M.T. performed the research and wrote the manuscript. B.Y. and W.L.H. supervised the
research and guided manuscript revisions. CORRESPONDING AUTHORS Correspondence to Michael Tobiason or William L. Hughes. ETHICS DECLARATIONS COMPETING INTERESTS M.T. founded C-TAG
Corporation while this research was in peer-review. The authors declare there are no other competing interests. PEER REVIEW PEER REVIEW INFORMATION _Communications Chemistry_ thanks the
anonymous reviewers for their contribution to the peer review of this work. A peer review file is available. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with
regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION PEER REVIEW FILE SUPPLEMENTARY INFORMATION DESCRIPTION OF ADDITIONAL SUPPLEMENTARY
FILES SUPPLEMENTARY DATA 1 SUPPLEMENTARY SOFTWARE 1 REPORTING SUMMARY RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons
license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit
http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Tobiason, M., Yurke, B. & Hughes, W.L. Generation of DNA oligomers with similar
chemical kinetics via in-silico optimization. _Commun Chem_ 6, 226 (2023). https://doi.org/10.1038/s42004-023-01026-w Download citation * Received: 22 February 2023 * Accepted: 10 October
2023 * Published: 18 October 2023 * DOI: https://doi.org/10.1038/s42004-023-01026-w SHARE THIS ARTICLE Anyone you share the following link with will be able to read this content: Get
shareable link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative