Monday, December 31, 2018

Structures Used to Implement on GPU Machine

So, our job is to figure out how best to code the above equations in CUDA C/C++. After examining these basic evolution equations, it is clear we need to keep track of 5 evolution variables at each of the NxNxN points in our grid. These are:

conformal metric (3x3 matrix)

trace-free extrinsic curvature (3x3 matrix)

moving puncture conformal variable (scalar)

K, the trace of the extrinsic curvature (scalar)

moving puncture Gamma (vector)

Note that each scalar has a single value to keep track of at each grid point. Each vector, in contrast, has three values at each grid point: one for the x-direction, one for the y-direction, and one for the z-direction.

The 3x3 matrices have 9 values at each point, one for each pair of directions: xx, xy, xz, yx, yy, yz, zx, zy, and zz.

In addition, at each point in the grid we need to track the Ricci curvature tensor and its component parts. Note that the Ricci curvature tensor and its components are all 3x3 matrices.



One might use a 3x3x3 (Tensor) structure to store the 3x3x3 = 27 values of the Christoffel symbol.
We also need to track the lapse (α) and shift (β^i) variables. While α is a scalar, the shift is a 3-value vector (β^x, β^y, and β^z).

Additionally, there are a lot of first- and second-order derivatives to keep track of. Each time we take the derivative of a tensor, the result is a tensor of one higher order, so we can predict the following. The 3 partial derivatives of a scalar form a 3-vector. The 3 partial derivatives of each component of a 3-vector form a 3x3 matrix. The 3 partial derivatives of each component of a 3x3 matrix form a 3x3x3 tensor. The 3 partial derivatives of each component of a 3x3x3 tensor (or the second derivatives of a 3x3 matrix) form a 3x3x3x3 tensor with 81 values, which, for convenience, will be referred to as a 3x3x3x3 hypercube.
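To make the first of these rules concrete, here is a minimal sketch of taking the 3 partial derivatives of a scalar field at one grid point and getting 3 values back. It is only an illustration under stated assumptions: second-order central differences, a uniform grid spacing h, a flat array in which point (i, j, k) lives at index (i*N + j)*N + k, and interior points only. The names Grad3, flat, and gradient are made up for this example and are not taken from the code in this series.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 3-value result: one partial derivative per direction.
// It has the same shape as a 3-float vector structure.
struct Grad3 {
    float d[3];   // d[0] = d/dx, d[1] = d/dy, d[2] = d/dz
};

// Assumed memory layout: point (i, j, k) of an NxNxN grid in a flat array.
inline std::size_t flat(int i, int j, int k, int N) {
    return (static_cast<std::size_t>(i) * N + j) * N + k;
}

// Central-difference gradient of a scalar field f at an interior point.
// Input: one value per grid point. Output: three values for that point.
inline Grad3 gradient(const float *f, int N, float h, int i, int j, int k) {
    assert(i >= 1 && i <= N - 2);
    assert(j >= 1 && j <= N - 2);
    assert(k >= 1 && k <= N - 2);
    Grad3 g;
    g.d[0] = (f[flat(i + 1, j, k, N)] - f[flat(i - 1, j, k, N)]) / (2.0f * h);
    g.d[1] = (f[flat(i, j + 1, k, N)] - f[flat(i, j - 1, k, N)]) / (2.0f * h);
    g.d[2] = (f[flat(i, j, k + 1, N)] - f[flat(i, j, k - 1, N)]) / (2.0f * h);
    return g;
}
```

Applying the same idea to each of the 3 components of a vector would give 3 x 3 = 9 values, and to each of the 9 components of a matrix 9 x 3 = 27 values, which is where the larger structures come from.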

As a side note, all of the 3x3 matrices we use in the BSSN moving puncture method are symmetric, which is to say that for any such 3x3 matrix F, the components satisfy F_ij = F_ji, leaving only 6 independent values. While exploiting this looks like a good way to save memory, it does involve more complicated indices in many of the for-loops we’ll be using. So, for this simplest of examples, we’ll stick to computing the full 3x3 = 9 values, knowing full well that exploiting these symmetries would be a great way to optimize this code in both time and memory.
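For the curious, the extra index bookkeeping that symmetric storage would require looks roughly like the sketch below. It is not used anywhere in this series; SymMatrix, sym_index, sym_get, and sym_set are hypothetical names, and the ordering xx, xy, xz, yy, yz, zz is just one possible choice. The HD macro is a common idiom that lets the same helpers compile as ordinary C++ or, under nvcc, as functions callable from both host and device code.

```cpp
#include <cassert>

#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical compact storage: only the 6 independent components
// of a symmetric 3x3 matrix, in the order xx, xy, xz, yy, yz, zz.
struct SymMatrix {
    float s[6];
};

// Map (i, j), each in {0, 1, 2}, onto a slot 0..5.
// Because F_ij = F_ji, (i, j) and (j, i) share the same slot.
HD inline int sym_index(int i, int j) {
    assert(i >= 0 && i < 3 && j >= 0 && j < 3);
    if (i > j) { int t = i; i = j; j = t; }   // reorder so that i <= j
    return i * (5 - i) / 2 + j;
}

HD inline float sym_get(const SymMatrix &m, int i, int j) {
    return m.s[sym_index(i, j)];
}

HD inline void sym_set(SymMatrix &m, int i, int j, float value) {
    m.s[sym_index(i, j)] = value;
}
```

Every matrix access inside a loop then goes through sym_index, which is the added complexity mentioned above. In exchange, each matrix shrinks from 9 floats to 6.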

One way to code this in CUDA C/C++ is to create a few simple structures which match our needs. For example:

typedef struct{
      float v[3];
}vector;                         //1x3 array

typedef struct{
      float M[3][3];
}Matrix;                         //3x3 array

typedef struct{
      float T[3][3][3];
}Tensor;                         //3x3x3 array

typedef struct{
      float H[3][3][3][3];
}Hypercube;                      //3x3x3x3 array

The general idea is to create NxNxN arrays of these structures.
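As a rough host-side sketch of that idea, the code below allocates one flat array of N*N*N elements for each of the five evolution variables and addresses point (i, j, k) with a single index. The vector and Matrix structures are repeated from above so the example stands alone. Everything else (Fields, grid_index, alloc_fields, free_fields, the member names, and the (i*N + j)*N + k layout) is an assumption made for illustration, not the code used later in this series. On the GPU, buffers of the same sizes would typically be requested with cudaMalloc instead of calloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

typedef struct {
    float v[3];
} vector;                        // 3 values per grid point

typedef struct {
    float M[3][3];
} Matrix;                        // 9 values per grid point

// Assumed layout: flatten (i, j, k) into one index, with k varying fastest.
inline std::size_t grid_index(int i, int j, int k, int N) {
    assert(i >= 0 && i < N && j >= 0 && j < N && k >= 0 && k < N);
    return (static_cast<std::size_t>(i) * N + j) * N + k;
}

// Hypothetical container: one NxNxN array per evolution variable.
struct Fields {
    int     N;
    Matrix *metric;      // conformal metric
    Matrix *A;           // trace-free extrinsic curvature
    float  *conformal;   // moving puncture conformal variable
    float  *K;           // trace of the extrinsic curvature
    vector *Gamma;       // moving puncture Gamma
};

// Allocate zero-initialized storage. Any pointer may be NULL on failure,
// so a real program should check each one before use.
inline Fields alloc_fields(int N) {
    const std::size_t n = static_cast<std::size_t>(N) * N * N;
    Fields f;
    f.N         = N;
    f.metric    = static_cast<Matrix *>(std::calloc(n, sizeof(Matrix)));
    f.A         = static_cast<Matrix *>(std::calloc(n, sizeof(Matrix)));
    f.conformal = static_cast<float  *>(std::calloc(n, sizeof(float)));
    f.K         = static_cast<float  *>(std::calloc(n, sizeof(float)));
    f.Gamma     = static_cast<vector *>(std::calloc(n, sizeof(vector)));
    return f;
}

inline void free_fields(Fields &f) {
    std::free(f.metric);
    std::free(f.A);
    std::free(f.conformal);
    std::free(f.K);
    std::free(f.Gamma);
}
```

With this layout, the zz component of the conformal metric at point (i, j, k) would be read as f.metric[grid_index(i, j, k, f.N)].M[2][2].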

Next: Implementation on CPU Machine -->

<--Back: Kreiss-Oliger Spatial Filtering
<--Back: Overview
