Article

OpenMP and MPI implementations of an elasto-viscoplastic fast Fourier transform-based micromechanical solver for fast crystal plasticity modeling

Journal

ADVANCES IN ENGINEERING SOFTWARE
Volume 126, Pages 46-60

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.advengsoft.2018.09.010

Keywords

Crystal plasticity; Constitutive equations; Cray supercomputer; Parallel computing; MPI

Funding

  1. Los Alamos National Laboratory [388715]
  2. U.S. National Science Foundation [CMMI-1650641]

We explore several parallel implementations of an elasto-viscoplastic fast Fourier transform (EVPFFT) model using Message Passing Interface (MPI), OpenMP, and a hybrid of MPI and OpenMP to efficiently predict the micromechanical response of polycrystals. Performance studies of EVPFFT are based on domain decomposition over the voxels of a periodic cell, which is a representative volume element (RVE) of polycrystalline copper. We begin by parallelizing the computationally intensive Newton-Raphson (NR) single crystal solver within EVPFFT. Next, we compare the performance of the serial and parallel FFTW (Fastest Fourier Transform in the West) using OpenMP (OpenMP-FFTW) and MPI (MPI-FFTW) with the original Numerical Recipes-based FOURN routine within EVPFFT. In the parallel environment, we find that the FFT calculations are best performed using the MPI version of FFTW. Finally, the remainder of the code, except the read/write subroutines, is parallelized. Significant speedups over the original EVPFFT model are achieved using MPI on shared memory multicore workstations. Furthermore, results achieved on a distributed memory Cray supercomputer show promising strong and weak scalability, and in some cases even super scalability, for the single crystal NR solver in EVPFFT. MPI-FFTW also scales perfectly for microstructure RVEs larger than 64^3 FFT voxels. For example, the MPI-EVPFFT parallel version of the code accelerates simulations by approximately two orders of magnitude using 64 cores, relative to the old serial code, for an RVE size of 128^3. The parallel EVPFFT code developed in this work can run massive voxel-based microstructural RVEs, taking advantage of the thousands of logical cores provided by more advanced clusters.
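The domain decomposition over voxels mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: it assumes a slab decomposition, in which the planes of an n x n x n periodic cell are split along one axis as evenly as possible across ranks. The helper names `slab_bounds` and `voxels_per_rank` are hypothetical, and the abstract does not specify the actual partitioning used by EVPFFT or MPI-FFTW.

```python
# Hypothetical sketch of a slab decomposition over voxels (an assumption,
# not the EVPFFT implementation): each rank owns a contiguous block of planes.

def slab_bounds(n_planes: int, n_ranks: int, rank: int) -> tuple[int, int]:
    """Return the half-open plane range [start, stop) owned by `rank`."""
    base, extra = divmod(n_planes, n_ranks)
    # The first `extra` ranks each take one additional plane.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def voxels_per_rank(n: int, n_ranks: int) -> list[int]:
    """Voxel count per rank for an n x n x n cell split into slabs along one axis."""
    bounds = (slab_bounds(n, n_ranks, r) for r in range(n_ranks))
    return [(stop - start) * n * n for start, stop in bounds]


if __name__ == "__main__":
    # RVE size and core count taken from the abstract's example (128^3, 64 cores).
    n, n_ranks = 128, 64
    counts = voxels_per_rank(n, n_ranks)
    print(slab_bounds(n, n_ranks, 0))   # planes owned by rank 0
    print(counts[0], sum(counts) == n ** 3)
```

Under this assumed scheme, a 128^3 cell on 64 ranks gives each rank two planes, so the per-voxel single crystal solves can proceed independently on each rank while only the FFT step requires communication between slabs.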
