Volume 23, Issue 4 e202300052

RESEARCH ARTICLE

Open Access

Efficient integration of deep neural networks in sequential multiscale simulations

Jendrik-Alexander Tröger (Corresponding Author)
orcid.org/0000-0002-4999-4558
Institute of Applied Mechanics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany

Hamidreza Eivazi
Institute for Software and Systems Engineering, Clausthal University of Technology, Clausthal-Zellerfeld, Germany

Stefan Hartmann
Institute of Applied Mechanics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany

Stefan Wittek
Institute for Software and Systems Engineering, Clausthal University of Technology, Clausthal-Zellerfeld, Germany

Andreas Rausch
Institute for Software and Systems Engineering, Clausthal University of Technology, Clausthal-Zellerfeld, Germany

Correspondence
Jendrik-Alexander Tröger, Institute of Applied Mechanics, Clausthal University of Technology, Adolph-Roemer-Str. 2A, 38678 Clausthal-Zellerfeld, Germany.
Email: [email protected]

First published: 11 October 2023

https://doi.org/10.1002/pamm.202300052

Abstract

Multiscale computations involving finite elements are often infeasible because the numerous microstructure evaluations incur substantial computational costs. This calls for suitable surrogate models that can be evaluated rapidly. In this work, we apply a purely data-based deep neural network as a surrogate model for the microstructure evaluation. More precisely, the surrogate model predicts the homogenized stresses, which are typically obtained by solving (initial) boundary-value problems of a microstructure with subsequent homogenization. Furthermore, the required consistent tangent matrix is computed by leveraging reverse mode automatic differentiation. To improve data efficiency and ensure high prediction quality, the well-known Sobolev training is chosen for creating the surrogate model. This surrogate model is integrated into a Fortran-based finite element code using an open-source library. This integration, combined with just-in-time compilation, leads to a speed-up of more than 6000×, as demonstrated for the example of a plate with a hole examined in this work. Furthermore, the surrogate model proves to be applicable even in load-step size controlled simulations, where it overcomes certain load-step size limitations associated with the microstructure computations.

1 INTRODUCTION

Multiscale computations often rely on finite elements for spatial discretization on both the macro- and the microscale, as explained, for example, in the literature [1]. However, the evaluation of heterogeneous microstructures, frequently represented by representative volume elements (RVEs), is computationally expensive. This evaluation typically involves solving boundary value problems (BVPs), or initial BVPs when dealing with inelastic or viscous constitutive behavior, followed by a homogenization step. Recently, neural network-based surrogate models have become invaluable for substantially accelerating these computations, or for making them feasible in the first place when three-dimensional RVEs or complex constitutive or multiphysical behavior are considered.

Machine learning methods, including deep neural networks (DNNs), have already been applied in multiscale computations, as comprehensively reviewed in the literature [2]. Among other approaches, clustering techniques are chosen in [3] in a data-driven two-scale approach for the prediction of homogenized quantities from the microscale. An adaptive switching between the evaluation of a reduced order model and a neural network is presented in [4] to ensure accurate predictions. Beyond simple feedforward DNNs, as applied in this work, more complex architectures have been applied to multiscale problems as well, see [5-7] and the references cited therein. Moreover, [8] introduced a data-driven multiscale framework coupled with autonomous data mining for multiscale computations. A current trend is the incorporation of physics into neural networks, see, for example, [9, 10] and the literature cited therein, which is beyond the scope of this contribution. Finally, Sobolev training [11], as demonstrated in [12], has proven to enhance the accuracy of neural network-based surrogate models and is therefore also followed in this work, together with reverse mode automatic differentiation (AD).

In this contribution, we begin by briefly summarizing the use of DNN surrogate models in sequential multiscale analyses. Afterward, the architecture and training process of a straightforward data-driven DNN surrogate is explained, which can be constructed directly from RVE evaluations, specifically from homogenized stress and strain quantities. We intentionally omit the evaluation of other approaches, such as those involving the local strain-energy, as carried out, for instance, in the literature [13]. Our focus is primarily on the efficient integration into an existing finite element code, for which a simple model is sufficient. Finally, an application example of a plate with a hole is studied, where non-linear elastic constitutive behavior is present in the microstructure. This example considers three key aspects: first, the prediction accuracy, second, the load-step size behavior, and third, the speed-up compared to classical concurrent multiscale FE² computations. The novelty of this work lies in the efficient integration of a Python-based DNN surrogate model into a finite element code written in Fortran while exploiting just-in-time compilation, which leads to a significant speed-up compared to reference FE² computations. Furthermore, we demonstrate the applicability of a simple data-based DNN surrogate model in load-step size controlled simulations.

Regarding the notation, column vectors and matrices at the global finite element level are symbolized by bold-type italic letters and column vectors and matrices on the local (element) level using bold-type Roman letters A. Here, microscale quantities are denoted by $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0001$ . Further, calligraphic letters $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0002$ denote DNN surrogate models.

2 SEQUENTIAL MULTISCALE ANALYSES WITH DEEP NEURAL NETWORK SURROGATE MODEL

In multiscale analyses, the goal is to incorporate the effective constitutive behavior of a heterogeneous microstructure into a macroscale analysis. Typically, concurrent multiscale FE² computations are carried out as follows: first, the macroscale strains $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0003$ are used to impose boundary conditions on the surface of the RVE, which represents the microstructure. Subsequently, a BVP is solved on the microscale, followed by a homogenization step to obtain the macroscopic stresses $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0004$ and the consistent tangent $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0005$ . The notation $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0006$ denotes a quantity associated with macroscale integration point j of element e. However, the solution of the microscale BVPs can be replaced by the evaluation of a surrogate model to reduce computational costs. This leads to sequential multiscale analyses because the RVE has to be evaluated beforehand to generate the training data. The general assumptions of computational homogenization still hold in this context, that is, the effective constitutive behavior has to be an intrinsic material property that is not influenced by the specific macroscale BVP. Additionally, scale separation between micro- and macroscale is assumed, which justifies first-order homogenization. Moreover, since we apply periodic displacement boundary conditions on the RVE in this work, it is assumed that no discontinuities are present and, thus, that the composite phases are perfectly bonded in the microstructure. Finally, we restrict ourselves to problems in the small strain domain.

Rather than repeatedly solving a microstructural BVP, we employ a common feedforward DNN as a surrogate model. This means that the microscale evaluations are entirely replaced by the surrogate model. In our case, the surrogate model $\mathcal{N}$ is evaluated depending on the macroscale strains $\bar{\mathbf{E}}$ , while parametrized with the trainable parameters $\boldsymbol{\theta}$ . The DNN outputs are the homogenized stresses $\bar{\mathbf{T}}$ . Reverse mode AD, see [14], is then applied to the surrogate model $\mathcal{N}$ to obtain the Jacobian $\partial \mathcal{N} / \partial \bar{\mathbf{E}}$ , which serves as the prediction of the consistent tangent matrix $\bar{\mathbf{C}}$ ,

$\bar{\mathbf{T}} = \mathcal{N}(\bar{\mathbf{E}}; \boldsymbol{\theta}), \qquad \bar{\mathbf{C}} = \frac{\partial \mathcal{N}(\bar{\mathbf{E}}; \boldsymbol{\theta})}{\partial \bar{\mathbf{E}}}$ (1)

The architecture is depicted in Figure 1.

FIGURE 1. Architecture of the feedforward deep neural network surrogate model for predicting the effective constitutive behavior of the microstructure.

In the literature [15] it was shown that employing reverse mode AD with a single neural network surrogate model to obtain $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0015$ is superior to an approach that uses two separate neural networks for predicting $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0016$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0017$ independently.
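The mapping from macroscale strains to homogenized stresses and consistent tangent described in this section can be illustrated with a minimal sketch. It is written in plain Python rather than TensorFlow or JAX so that the reverse sweep is visible; in a framework, the same Jacobian would come from the built-in reverse mode AD. The network size, the random weights, and the names `init_params`, `forward`, and `tangent` are illustrative assumptions, not the trained model of this work.

```python
import math
import random


def swish(z):
    return z / (1.0 + math.exp(-z))


def swish_prime(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s + z * s * (1.0 - s)


def init_params(sizes, seed=0):
    """Glorot-uniform weights and zero biases; sizes = [3, hidden..., 3]."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        limit = math.sqrt(6.0 / (n_in + n_out))
        W = [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]
        params.append((W, [0.0] * n_out))
    return params


def forward(params, strain):
    """Return the predicted stresses and the hidden pre-activations."""
    a, cache = list(strain), []
    for k, (W, b) in enumerate(params):
        z = [sum(w * x for w, x in zip(row, a)) + bi for row, bi in zip(W, b)]
        if k == len(params) - 1:
            a = z                      # linear output layer
        else:
            cache.append(z)            # stored for the reverse sweep
            a = [swish(v) for v in z]
    return a, cache


def tangent(params, strain):
    """Stresses and Jacobian d(stress)/d(strain), one reverse sweep per row."""
    stress, cache = forward(params, strain)
    rows = []
    for i in range(len(stress)):
        g = [1.0 if j == i else 0.0 for j in range(len(stress))]  # seed vector
        for k in range(len(params) - 1, -1, -1):
            W = params[k][0]
            if k < len(params) - 1:    # hidden layer: pass through activation
                g = [gj * swish_prime(zj) for gj, zj in zip(g, cache[k])]
            g = [sum(W[r][c] * g[r] for r in range(len(W))) for c in range(len(W[0]))]
        rows.append(g)
    return stress, rows


params = init_params([3, 8, 8, 3], seed=1)   # hypothetical small network
stress, C = tangent(params, [0.01, -0.02, 0.005])
```

Three reverse sweeps, one per stress component, yield the full 3 × 3 tangent from a single network, which can be checked against central finite differences of `forward`.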

3 DEEP NEURAL NETWORK SURROGATE MODEL

In this work, we restrict ourselves to two-dimensional examples in the small strain domain, that is the macroscale strains $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0018$ , $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0019$ , is the vector of input quantities for the surrogate model and the output quantities are $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0020$ , $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0021$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0022$ . Here, the strains and stresses can be arranged in column vectors since the corresponding tensors are symmetric in the small strain case. Furthermore, the coefficients of the consistent tangent matrix, which is here a symmetric 3 × 3 matrix, are compiled in a column vector as well, $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0023$ .

The DNN surrogate model is developed with two different frameworks, TensorFlow [16] and JAX [17]. The data for the surrogate modeling is generated by performing RVE computations for different prescribed macroscale strains, which are drawn by Latin hypercube sampling [18]. Particular symmetry relations of the constitutive models applied in the RVE are exploited, which substantially reduces the number of RVE computations required for the data generation. We refer to [15] for further details regarding the data generation. The macroscale strains are restricted to a domain with $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0024$ as the maximum strain for each strain component. Further, the inputs and outputs of the model are scaled with their mean and standard deviation to obtain efficient training. In doing so, the scaling of the consistent tangent output has to retain the relationship between the consistent tangent and the homogenized stress, since the tangent is the derivative of the stress with respect to the strain.
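The requirement that the tangent scaling stays consistent with the stress scaling follows from the chain rule. If each strain and stress component is standardized with its own mean and standard deviation, the scaled tangent is the original coefficient multiplied by the strain standard deviation of its column and divided by the stress standard deviation of its row. The following sketch shows this under the assumption of component-wise standardization; the function names and the sample data are hypothetical.

```python
import statistics


def fit_scaler(samples):
    """Column-wise mean and (population) standard deviation of a list of rows."""
    columns = list(zip(*samples))
    mean = [statistics.fmean(c) for c in columns]
    std = [statistics.pstdev(c) for c in columns]
    return mean, std


def scale(row, mean, std):
    return [(v - m) / s for v, m, s in zip(row, mean, std)]


def scale_tangent(C, strain_std, stress_std):
    """d(scaled stress_i)/d(scaled strain_j) = C_ij * strain_std_j / stress_std_i."""
    return [[C[i][j] * strain_std[j] / stress_std[i] for j in range(len(strain_std))]
            for i in range(len(stress_std))]


def unscale_tangent(C_scaled, strain_std, stress_std):
    """Inverse of scale_tangent, applied to the network output before use in FE."""
    return [[C_scaled[i][j] * stress_std[i] / strain_std[j] for j in range(len(strain_std))]
            for i in range(len(stress_std))]


# Hypothetical linear material, used only to make the relation checkable.
C = [[3.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.5]]
strains = [[0.01, 0.0, 0.0], [0.0, 0.02, 0.0], [0.0, 0.0, 0.03], [0.01, 0.01, -0.01]]
stresses = [[sum(C[i][j] * e[j] for j in range(3)) for i in range(3)] for e in strains]

strain_mean, strain_std = fit_scaler(strains)
stress_mean, stress_std = fit_scaler(stresses)
C_scaled = scale_tangent(C, strain_std, stress_std)
```

Note that the means drop out of the tangent scaling because they only shift the data; only the standard deviations enter.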

The influence of three hyperparameters of the DNN on the performance of the model is investigated, namely the number of hidden layers $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0025$ , the number of neurons per hidden layer $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0026$ , and the activation function. For the sake of brevity, Table 1 summarizes the results only for three different sizes of the DNN with the swish activation function. Based on the results, we choose a DNN architecture of eight hidden layers, each with 128 neurons and the swish activation function. The weights and biases are initialized with the Glorot uniform algorithm [19]. The data is split into 80% for training and 20% for validation, and the Adam optimizer [20] is used. Datasets of different sizes, containing 10³ to 10⁶ samples, are considered. The training runs for 4000 epochs with an exponential decay of the learning rate η, which is initially set to $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0027$ . The decay step is 1000 and the decay rate is $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0028$ . The dataset is decomposed into 100 batches and the mean squared error is selected for the loss function,

$\mathcal{L} = \alpha \, \mathrm{MSE}_{\mathrm{T}} + \beta \, \mathrm{MSE}_{\mathrm{C}}$ (2)

Here, $\mathrm{MSE}_{\mathrm{T}}$ and $\mathrm{MSE}_{\mathrm{C}}$ are the mean squared errors of the predicted stresses and of the predicted consistent tangent, and $\alpha$ and $\beta$ denote the weighting factors for these two loss components, respectively. Note that we track the validation loss during the training process and select the model parameters based on the lowest validation loss to avoid overfitting. The DNN surrogate model is implemented into the MPI-parallelized Fortran in-house code TASAFEM using the Forpy library [21]. It should be noted that utilizing more sophisticated DNN models, for example physics-informed NNs, see [10], could improve the performance of the model. However, here we focus on developing an efficient framework for integrating DNN models into finite element codes for sequential multiscale simulations by employing modern scientific computing tools such as just-in-time compilation and efficient Fortran-Python communication. Further, we show the applicability of our framework for load-step size controlled simulations in the following numerical example.
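The weighted loss with one component for the stresses and one for the consistent tangent can be sketched as follows. This is a minimal illustration of a Sobolev-type loss, assuming plain mean squared errors over a batch; the function names, the default weights, and the sample numbers are hypothetical and do not reproduce the values used in this work.

```python
def mse(prediction, target):
    """Mean squared error over all entries of a batch of equal-length rows."""
    n = sum(len(row) for row in target)
    total = sum((p - t) ** 2
                for p_row, t_row in zip(prediction, target)
                for p, t in zip(p_row, t_row))
    return total / n


def sobolev_loss(stress_pred, stress_ref, tangent_pred, tangent_ref,
                 w_stress=1.0, w_tangent=1.0):
    """Weighted sum of the stress error and the tangent (derivative) error."""
    return (w_stress * mse(stress_pred, stress_ref)
            + w_tangent * mse(tangent_pred, tangent_ref))


# One sample with three stress components and a two-entry tangent excerpt.
loss = sobolev_loss([[1.0, 2.0, 3.0]], [[1.0, 2.0, 5.0]],
                    [[0.0, 0.0]], [[1.0, 1.0]],
                    w_stress=1.0, w_tangent=0.5)
```

In an actual training loop, the predicted tangent would be obtained by differentiating the network output with respect to its input, so that both loss components depend on the same trainable parameters.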

TABLE 1. Summary of the results obtained for training and validation losses and the required time of training for different sizes of the deep neural network.

$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0032$	$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0033$	$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0034$	$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0035$	$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0036$	Training time (min.)
32× 2	1.81 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0037$	1.79 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0038$	3.07 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0039$	2.98 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0040$	10.9
64× 4	3.74 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0041$	4.28 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0042$	1.29 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0043$	1.28 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0044$	14.5
128× 8	3.78 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0045$	3.55 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0046$	3.02 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0047$	2.97 $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0048$	31.2

Note: Results are reported for models with swish activation function. A dataset with the size of 10⁵ is employed for training.
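Two ingredients of the training setup in this section, the exponentially decaying learning rate and the selection of the parameters with the lowest validation loss, can be sketched in a few lines. The decay formula is the common form `initial_rate * decay_rate ** (step / decay_steps)`; whether the decay step counts epochs or optimizer steps, and whether a staircase variant is used, is not specified here and is left as an option. The numerical values below are placeholders for illustration only.

```python
def exponential_decay(step, initial_rate, decay_rate, decay_steps=1000, staircase=False):
    """Learning rate after `step` steps of an exponential decay schedule."""
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_rate * decay_rate ** exponent


def select_best(history):
    """Return the (epoch, validation_loss, parameters) entry with the lowest validation loss."""
    return min(history, key=lambda entry: entry[1])


# Hypothetical initial rate and decay rate over the 4000 epochs mentioned above.
rates = [exponential_decay(epoch, 1e-3, 0.5) for epoch in (0, 1000, 2000, 4000)]

# Hypothetical training history; the parameter entries stand in for model weights.
history = [(0, 0.90, "params_0"), (1, 0.40, "params_1"), (2, 0.55, "params_2")]
best_epoch, best_loss, best_params = select_best(history)
```

Keeping the parameters of the best validation epoch, rather than those of the last epoch, is what guards against overfitting in this setup.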

4 NUMERICAL EXAMPLE

The framework explained above, which utilizes a DNN surrogate model to predict the homogenized stress $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0049$ and the consistent tangent matrix $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0050$ , is employed to analyze the mechanical response of a two-dimensional plate with a hole subjected to shear load. Here, we focus on three aspects: the prediction quality of the DNN, the load-step size behavior in load-step size controlled simulations, and the speed-up.

4.1 Problem setup

The macroscale geometry for the numerical example is shown in Figure 2A.

The load functions are defined as follows: $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0051$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0052$ . The plate is discretized using eight-noded quadrilateral elements, leading to $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0053$ elements and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0054$ nodes. In the analysis, a plane strain state is assumed. The corresponding RVE is depicted in Figure 2B and comprises matrix material and fiber sections. The matrix material is modeled with a non-linear elasticity relation,

$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0055$ (3)

where the bulk modulus $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0056$ and the parameters $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0057$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0058$ are the material parameters of the matrix material.

$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0059$ denotes the deviatoric strains and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0060$ represents the Euclidean norm. In contrast, the fibers are assumed to be linear elastic with bulk modulus $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0061$ and shear modulus $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0062$ . The RVE is spatially discretized with $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0063$ eight-noded quadrilateral elements and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0064$ nodes.

4.2 Prediction quality

To evaluate the prediction quality using the DNN surrogate model, which is described in Section 3, we introduce the absolute percentage error of the predicted stresses

$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0065$ (4)

where $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0066$ is the corresponding stress quantity for a reference FE² solution and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0067$ represents the prediction from the DNN surrogate model. Since some of the stresses are zero, we use $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0068$ as the mean value of the stresses in the reference solution to obtain a relative error measure. Results are summarized in Table 2 for different sizes of the training/validation dataset $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0069$ , where $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0070$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0071$ indicate the mean and the standard deviation of the absolute percentage error ε over all the integration points and stress components. It can be observed that the DNN model trained on a dataset with only 10³ samples can provide very accurate results with $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0072$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0073$ of around 1%. Moreover, the error is decreased by employing a larger dataset for training and validation. In the following, we will focus on the results obtained from the DNN model trained on the dataset with $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0074$ of 10⁵.

TABLE 2. Results obtained from deep neural network surrogates trained using varying sizes of the training/validation datasets.

$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0075$	$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0076$ (%)	$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0077$ (%)
10³	1.08	1.14
10⁴	0.14	0.13
10⁵	0.04	0.03
10⁶	0.02	0.03
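The error measure of Equation (4) and the statistics reported in Table 2 can be computed along the following lines. The sketch normalizes the pointwise difference by the mean absolute reference stress so that zero stresses do not cause a division by zero; taking the mean of the absolute values, and doing so per stress component, is an assumption of this example. The function name and the sample values are hypothetical.

```python
import statistics


def absolute_percentage_error(reference, prediction):
    """Pointwise |reference - prediction| in percent of the mean absolute reference value."""
    scale = statistics.fmean([abs(v) for v in reference])
    return [100.0 * abs(r - p) / scale for r, p in zip(reference, prediction)]


# Hypothetical stress values of one component at three integration points.
reference = [0.0, 10.0, 20.0]
prediction = [0.1, 10.0, 19.7]

errors = absolute_percentage_error(reference, prediction)
mean_error = statistics.fmean(errors)
std_error = statistics.pstdev(errors)
```

The first integration point has a zero reference stress, which is exactly the case in which a pointwise relative error would be undefined.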

The absolute percentage errors for the predicted stresses are shown in Figure 3 for the final evaluation time of the applied load. Evidently, the DNN surrogate model is able to precisely predict the stresses in the present example, with the maximum error in the stresses being about 0.10%. As depicted in Figures 3A,B, the absolute error across a large domain is below 0.01%, with only a few integration points in the vicinity of the hole exhibiting slightly larger errors in the normal stresses. However, the prediction of the shear stress τ₁₂ is less accurate than that of σ₁₁ and σ₂₂. This discrepancy could be attributed to the geometry of the RVE and the different stiffness of the constituents, which results in larger differences in the orders of magnitude of the stresses and of the components of the consistent tangent matrix under shear load. Similar behavior has been previously reported for different macroscale problems in the literature [15].

4.3 Load-step size behavior

FE² computations usually utilize a load-step control since the initial load-step (or time-step, respectively) size is often very small to avoid convergence issues. Therefore, it is essential to investigate the performance of the DNN surrogate model in step-size controlled computations.

For both the reference FE² computation and the DNN-enhanced computation, an initial time-step size $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0078$ s is selected. The incremental application of the prescribed load, which can be interpreted as time integration, is performed using the Backward-Euler method. The system of non-linear equations arising in non-linear elastic finite element computations can be formally extended with $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0079$ to obtain a system of differential-algebraic equations, as is common in computations dealing with inelastic or viscous constitutive behavior. The time-step size for the next time-step $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0080$ is determined from the number of global Newton iterations $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0081$ and the current time-step size $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0082$ ,

$urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0083$ (5)

where we choose the coefficients $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0084$ and $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0085$ . The step-size behaviors for reference FE² computation and the multiscale computation using a DNN surrogate model are shown in Figure 4.

In this specific example, the initial time-step size $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0086$ is suitable for both methods. Moreover, both methods exhibit a similar behavior in the first iterations. However, the reference FE² computation shows a certain limit for the time-step size $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0087$ at around $urn:x-wiley:16177061:media:pamm202300052:pamm202300052-math-0088$ s, which appears due to non-convergence of the RVE computations for certain load-increments. In contrast, the evaluation of the DNN surrogate model allows a continuously increasing time-step size. Thus, the application of a DNN surrogate model for the microscale evaluations even extends the applicable time-step sizes, at least for the non-linear elastic problem studied here.
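The qualitative difference between the two step-size histories can be reproduced with a small sketch of an iteration-based step-size controller. The rule below is a generic, hypothetical one (grow after few Newton iterations, shrink after many, halve and retry after a failed step) and is not Equation (5) with its coefficients. The two mock solvers are likewise assumptions: one fails above a fixed step size, mimicking non-converging RVE computations, and the other always converges, mimicking a surrogate evaluation.

```python
def next_step_size(dt, newton_iterations, target=5, grow=2.0, shrink=0.5):
    """Hypothetical rule: enlarge the step after easy steps, reduce it after hard ones."""
    if newton_iterations < target:
        return dt * grow
    if newton_iterations > target:
        return dt * shrink
    return dt


def march(solve_step, t_end, dt0, dt_min=1e-10):
    """Advance from t = 0 to t_end; solve_step(t, dt) returns (converged, iterations)."""
    t, dt, accepted = 0.0, dt0, []
    while t_end - t > 1e-12:
        dt = min(dt, t_end - t)          # do not step past the final time
        converged, iterations = solve_step(t, dt)
        if not converged:
            dt *= 0.5                    # reject the step and retry with half the size
            if dt < dt_min:
                raise RuntimeError("step size underflow")
            continue
        t += dt
        accepted.append(dt)
        dt = next_step_size(dt, iterations)
    return accepted


def limited_solver(t, dt):
    """Mock solver that fails for steps larger than 0.1 (hypothetical limit)."""
    return dt <= 0.1, 3


def unlimited_solver(t, dt):
    """Mock solver that always converges in three iterations."""
    return True, 3


limited_sizes = march(limited_solver, 1.0, 0.01)
unlimited_sizes = march(unlimited_solver, 1.0, 0.01)
```

With the limited solver the accepted step sizes saturate below the failure threshold, whereas with the unlimited solver they keep growing until the final time is reached.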

4.4 Speed-up

In general, the application of a DNN surrogate model for multiscale simulations aims at significantly decreasing the computational time. Here, the speed-up is calculated by dividing the total computation time $t_{\mathrm{FE}^2}$ of the reference FE² simulation by the total computation time $t_{\mathrm{DNN}}$ of the multiscale simulation with the DNN surrogate,

$\text{speed-up} = \frac{t_{\mathrm{FE}^2}}{t_{\mathrm{DNN}}}$ (6)

The simulations with the DNN surrogate model are performed on an 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz CPU with 16 threads. In contrast, the reference simulations are performed on a 2nd Gen Intel(R) Xeon(R) Silver 4216 @ 2.10GHz CPU with 16 processes and one thread per process. For the example of a plate with a hole, the speed-up is 524× using the TensorFlow implementation of the DNN. However, as mentioned above, we also apply the JAX framework together with the neural network library Flax [22] to develop a surrogate model. JAX facilitates just-in-time compilation of the prediction function, that is, the Python code is compiled into machine code just before execution to reduce resource consumption. As demonstrated in the literature [15], just-in-time compilation significantly increases the speed-up in multiscale simulations with surrogate models. Here, we obtain a speed-up of 6048× for the specific example of a plate with a hole.
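The two reported speed-ups can be related to each other with a short calculation. Assuming that both are measured against the same reference FE² computation time, their ratio gives the additional gain of the just-in-time compiled JAX model over the TensorFlow implementation. The function name is hypothetical and the reference time of one hour is chosen only for illustration.

```python
def speed_up(t_reference, t_surrogate):
    """Total reference computation time divided by total surrogate computation time."""
    return t_reference / t_surrogate


tensorflow_speed_up = 524.0   # reported for the TensorFlow implementation
jax_speed_up = 6048.0         # reported for the JAX implementation with just-in-time compilation

# Additional factor gained by the JAX implementation, assuming an identical reference time.
jit_gain = jax_speed_up / tensorflow_speed_up

# Illustration with a hypothetical reference time of 3600 s.
t_reference = 3600.0
t_tensorflow = t_reference / tensorflow_speed_up
t_jax = t_reference / jax_speed_up
```

Under this assumption, the JAX-based surrogate is roughly 11.5 times faster than the TensorFlow-based one for this example.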

5 CONCLUSIONS

As is well known, (deep) neural network surrogate models can significantly reduce the computational expense of multiscale computations. In this contribution, an efficient integration of such a DNN surrogate model into an existing finite element code is explained, utilizing the open-source library Forpy [21] for Fortran-Python interoperability. This integration already results in a speed-up of around 500× for the specific example of a plate with a hole. Furthermore, just-in-time compilation increases the speed-up to around 6000×. In load-step size controlled simulations, where non-linear elastic material behavior is present in the microstructure, it is observed that the surrogate model can overcome specific step-size limitations inherent in the RVE computations. Therefore, the use of a DNN surrogate model can even extend the feasible load-step sizes for multiscale computations.

ACKNOWLEDGMENTS

Hamidreza Eivazi's research was conducted within the Research Training Group CircularLIB, supported by the Ministry of Science and Culture of Lower Saxony with funds from the program zukunft.niedersachsen of the Volkswagen Foundation.

Open access funding enabled and organized by Projekt DEAL.

REFERENCES

[1] Schröder, J. (2014). A numerical two-scale homogenization scheme: The FE²-method. In J. Schröder, & K. Hackl (Eds.), Plasticity and beyond: microstructures, crystal-plasticity and phase transitions (pp. 1–64). Springer Vienna. https://doi.org/10.1007/978-3-7091-1625-8_1
[2] Bishara, D., Xie, Y., Liu, W. K., & Li, S. (2023). A state-of-the-art review on machine learning-based multiscale modeling, simulation, homogenization and design of materials. Archives of Computational Methods in Engineering, 30, 191–222. https://doi.org/10.1007/s11831-022-09795-8
[3] Liu, Z., Bessa, M., & Liu, W. K. (2016). Self-consistent clustering analysis: An efficient multi-scale scheme for inelastic heterogeneous materials. Computer Methods in Applied Mechanics and Engineering, 306(7), 319–341. https://doi.org/10.1016/j.cma.2016.04.004
[4] Fritzen, F., Fernández, M., & Larsson, F. (2019). On-the-fly adaptivity for nonlinear twoscale simulations using artificial neural networks and reduced order modeling. Frontiers in Materials, 6(5), 75. https://doi.org/10.3389/fmats.2019.00075
[5] Rao, C., & Liu, Y. (2020). Three-dimensional convolutional neural network (3D-CNN) for heterogeneous material homogenization. Computational Materials Science, 184(11), 109850. https://doi.org/10.1016/j.commatsci.2020.109850
[6] Li, B., & Zhuang, X. (2020). Multiscale computation on feedforward neural network and recurrent neural network. Frontiers of Structural and Civil Engineering, 14(6), 1285–1298. https://doi.org/10.1007/s11709-020-0691-7
[7] Aldakheel, F., Elsayed, E. S., Zohdi, T. I., & Wriggers, P. (2023). Efficient multiscale modeling of heterogeneous materials using deep neural networks. Computational Mechanics, 72, 155–171. https://doi.org/10.1007/s00466-023-02324-9
[8] Kalina, K. A., Linden, L., Brummund, J., & Kästner, M. (2023). $\mathrm{FE}^{\mathrm{ANN}}$: An efficient data-driven multiscale approach based on physics-constrained neural networks and automated data mining. Computational Mechanics, 71, 827–851. https://doi.org/10.1007/s00466-022-02260-0
[9] Nguyen, L. T. K., & Keip, M.-A. (2018). A data-driven approach to nonlinear elasticity. Computers and Structures, 194, 97–115. https://doi.org/10.1016/j.compstruc.2017.07.031
[10] Linden, L., Klein, D. K., Kalina, K. A., Brummund, J., Weeger, O., & Kästner, M. (2023). Neural networks meet hyperelasticity: A guide to enforcing physics. Journal of the Mechanics and Physics of Solids, 179, 105363. https://doi.org/10.1016/j.jmps.2023.105363
[11] Czarnecki, W. M., Osindero, S., Jaderberg, M., Swirszcz, G., & Pascanu, R. (2017). Sobolev training for neural networks. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing systems. Curran Associates, Inc.
[12] Feng, N., Zhang, G., & Khandelwal, K. (2022). Finite strain FE² analysis with data-driven homogenization using deep neural networks. Computers and Structures, 263, 106742. https://doi.org/10.1016/j.compstruc.2022.106742
[13] Le, B. A., Yvonnet, J., & He, Q.-C. (2015). Computational homogenization of nonlinear elastic materials using neural networks. International Journal for Numerical Methods in Engineering, 104, 1061–1084. https://doi.org/10.1002/nme.4953
[14] Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2017). Automatic differentiation in machine learning: A survey. Journal of Machine Learning Research, 18(1), 5595–5637.
[15] Eivazi, H., Tröger, J.-A., Wittek, S., Hartmann, S., & Rausch, A. (2023). FE² computations with deep neural networks: algorithmic structure, data generation, and implementation. Mathematical and Computational Applications, 28(4), 91. https://doi.org/10.3390/mca28040091
[16] Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., … Zheng, X. (2016). Tensorflow: A system for large-scale machine learning. In Proceedings 12th USENIX Symposium on Operating Systems Design and Implementation (pp. 265–283). USENIX.
[17] Bradbury, J., et al. (2018). JAX: composable transformations of Python+NumPy programs. Available online: https://github.com/google/jax
[18] McKay, M. D., Beckman, R. J., & Conover, W. J. (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2), 239–245. https://doi.org/10.2307/1268522
[19] Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (pp. 249–256). PMLR.
[20] Kingma, D. P., & Ba, J. (2017). Adam: A method for stochastic optimization. arXiv:1412.6980 [cs.LG]. https://doi.org/10.48550/arXiv.1412.6980
[21] Rabel, E., Rüger, R., Govoni, M., & Ehlert, S. (2020). Forpy: A library for Fortran-Python interoperability. Available online: https://github.com/ylikx/forpy
[22] Heek, J., et al. (2023). Flax: A neural network library and ecosystem for JAX. Available online: https://github.com/google/flax

Special Issue: 93rd Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM)

December 2023
