Accelerating Ansys Fluent: The Impact of GPU Solving on Simulation Performance

The GPU Study

"So how many CPU cores is a GPU anyway?" Most simulation engineers understand the CPU core language. In other words, "if X problem is 4 million elements in size, I'll likely need Y number of CPU cores to solve that problem in Z hours." By benchmarking several GPUs against a 128-core baseline of 8x Intel Xeon Gold 6242 CPUs (a typical size for a remote system or cluster), we can create a relative speed (and cost) comparison that engineers and businesses can leverage when scoping future hardware needs.

The study uses two simulation models from Ansys specifically intended for benchmarking: a 4 million element mixing tank model and a 24 million element combustor model. These models represent a model size range a heavy simulation user would likely encounter in their industry. One can always use GPU solving for smaller (and larger) problems, but this initial range is a way to start the GPU conversation and show its value to simulation users.

Additionally, the study covers the impact of using a segregated or coupled solving method with Ansys Fluent's native GPU solver. Typically, a segregated solve requires less time per iteration but more iterations to converge a given model. Conversely, a coupled solve requires roughly half the iterations to converge, but each iteration takes longer. Each of these outcomes can influence how many CPU cores a GPU is equivalent to in terms of compute time.

To distill the results of the study, the following metrics were used (a short worked example of the arithmetic appears at the end of this section):

1. Average wall clock time per iteration (seconds): At the completion of each simulation, this data was collected through Fluent's Parallel Usage calculation and serves as the basis for the speed metrics.

2. Relative speed and relative cost: These two metrics are ratios of the baseline case's average wall clock time per iteration and hardware cost to the respective GPU's average wall clock time per iteration and associated hardware cost. For this study, a 128 CPU core cluster serves as the baseline case, with a relative speed and cost value of one (1).

3. Cost effectiveness: This is the ratio of the GPU's (or CPU configuration's) relative speed to its relative cost to purchase. This can be considered a "bang for buck" number and allows for easy comparison between all GPUs and the baseline reference case.

4. Hardware and licensing costs: In this study, hardware costs refer to the entire system cost, not just the equivalent GPU or CPU configuration. Official Ansys Elite Channel Partner Rand Simulation partnered with Exxact Corporation to estimate what those costs would look like from a system perspective. Licensing costs refer to the Ansys licensing solutions required to drive the given hardware configuration (i.e., solver license and HPC Pack licenses).

The GPUs selected for testing represent a wide range of budgets as well as computational capabilities. Ranging from workstation GPUs (NVIDIA RTX 4000 Ada Generation, NVIDIA RTX 6000 Ada Generation, and NVIDIA A800 40GB Active) to a data center GPU (NVIDIA A100 Tensor Core GPU), the study gives potential GPU solver users and customers a holistic way to right-size their GPU needs.
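To make the arithmetic behind these metrics concrete, the following is a minimal Python sketch. The configuration names, prices, and timings are placeholders rather than figures from the study, and the orientation of the relative-cost ratio (candidate system cost divided by baseline system cost) is an assumption chosen so that cost effectiveness behaves as the "bang for buck" number described above.

    from dataclasses import dataclass

    # All values below are illustrative placeholders, not results from the study.
    @dataclass
    class Configuration:
        name: str
        seconds_per_iteration: float  # average wall clock time per iteration
        system_cost: float            # full system cost, not just the CPU/GPU itself

    def relative_speed(baseline: Configuration, candidate: Configuration) -> float:
        # A faster candidate (fewer seconds per iteration) scores above 1.
        return baseline.seconds_per_iteration / candidate.seconds_per_iteration

    def relative_cost(baseline: Configuration, candidate: Configuration) -> float:
        # Assumed orientation: a cheaper candidate scores below 1.
        return candidate.system_cost / baseline.system_cost

    def cost_effectiveness(baseline: Configuration, candidate: Configuration) -> float:
        # "Bang for buck": ratio of relative speed to relative cost.
        return relative_speed(baseline, candidate) / relative_cost(baseline, candidate)

    def total_solve_time(iterations: int, seconds_per_iteration: float) -> float:
        # Relevant to the segregated vs. coupled comparison: a coupled solve may need
        # roughly half the iterations but spend more time on each one.
        return iterations * seconds_per_iteration

    if __name__ == "__main__":
        baseline = Configuration("128-core CPU cluster", seconds_per_iteration=2.0, system_cost=100_000)
        gpu = Configuration("Hypothetical GPU workstation", seconds_per_iteration=0.5, system_cost=25_000)

        print(f"Relative speed:     {relative_speed(baseline, gpu):.2f}")      # 4.00
        print(f"Relative cost:      {relative_cost(baseline, gpu):.2f}")       # 0.25
        print(f"Cost effectiveness: {cost_effectiveness(baseline, gpu):.2f}")  # 16.00

By construction, the baseline cluster evaluates to a relative speed, relative cost, and cost effectiveness of exactly one, matching the study's reference case.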
