Wednesday, January 4, 2023

Hello integrator

I have thought many times about posting some kind of tutorial on physics simulation. And my mind always goes back to the basics: the integrator. Actually, the simplest integrator that works: Symplectic Euler. But how would I implement it if I were to start over (or from scratch)?

This is how I would do it today:

$v \mathrel{+}= h g + h M^{-1} f$

$x \mathrel{+}= h v$

The main idea is that I would be using vector notation, just like in Matlab. The main quantities are the positions $x$ and the velocities $v$. Now there is the question of how to actually store these positions: either as an array of vector3 (or vector2), or storing first the x coordinates, then the y coordinates, and so on. Or you can store them as a rectangular matrix where the rows are the individual coordinates and the columns are the vectors. In my experience, the first option is the best, as most operations are done on the vector3 corresponding to each particle. The problems start appearing when you want to switch from 3D to 2D, or from particles to rigid bodies. In fact, you can have any number of degrees of freedom (DOFs) per body and in total, so it's up to the vector abstraction code to deal with such irregularities (the "views" from NumPy and PyTorch may be a way to go, but I'm not sure how expensive they are).
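To make the three options concrete, here is roughly what they look like as raw storage in C++ (a sketch with made-up names, not taken from my code):

#include <cstddef>
#include <vector>

struct vec3 { float x, y, z; };

const std::size_t n = 1000;                  // example particle count

// Option 1: an array of vector3's; most operations act on the vec3 of one particle.
std::vector<vec3> positions(n);

// Option 2: a flat layout; all x coordinates first, then all y's, then all z's.
std::vector<float> positionsFlat(3 * n);     // x's in [0, n), y's in [n, 2n), z's in [2n, 3n)

// Option 3: a rectangular 3-by-n matrix; rows are coordinates, columns are the vectors.
std::vector<float> positionsMat(3 * n);      // element (row, col) lives at positionsMat[row * n + col]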

But in the example above we're just dealing with regular n-dimensional vectors. The only hiccup is the broadcast of the gravitational acceleration $g$ to the bigger vector $v$. But assuming that we are dealing with a list of vector3's (and leaving rigid bodies aside) this should be rather easy. Same goes for scaling by a number, in this case the time-step $h$. Multiplying a general vector by a matrix should also be doable, although in this case we know it is a diagonal matrix and we should probably do it by a pair-wise multiplication instead (would need a special operator notation). (When doing FEM with non-lumped masses, $M$ is no longer diagonal.)
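Just to make these operations concrete, here is a minimal sketch of such an n-dimensional vector built on top of a list of vec2's, with the broadcast, the scaling, and the pair-wise multiplication done by plain loops (the names are placeholders for this sketch, not my actual code):

#include <cstddef>
#include <vector>

struct vec2 { float x = 0, y = 0; };

// A minimal n-dimensional vector built on top of a list of vec2's.
struct Vec
{
    std::vector<vec2> data;

    // broadcast: add the same small vector to every entry (e.g., v += h * g)
    Vec& operator+=(const vec2& b)
    {
        for (vec2& a : data) { a.x += b.x; a.y += b.y; }
        return *this;
    }

    // regular element-wise addition of two big vectors
    Vec& operator+=(const Vec& b)
    {
        for (std::size_t i = 0; i < data.size(); ++i) { data[i].x += b.data[i].x; data[i].y += b.data[i].y; }
        return *this;
    }
};

// scaling by a number, e.g. the time step h
Vec operator*(float s, const Vec& v)
{
    Vec r = v;
    for (vec2& a : r.data) { a.x *= s; a.y *= s; }
    return r;
}

// pair-wise multiplication, standing in for the product with a diagonal matrix
Vec operator*(const Vec& d, const Vec& v)
{
    Vec r = v;
    for (std::size_t i = 0; i < r.data.size(); ++i) { r.data[i].x *= d.data[i].x; r.data[i].y *= d.data[i].y; }
    return r;
}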

The whole idea here is to avoid for loops and let the vector abstraction layer do the work for you. It's simply more elegant and easier to write, if you ask me. And if the vectors are very large, as usually happens, resort to parallel fors or some other parallelization technique (the GPU is a bit trickier to abstract as plain C++ code, but would be nice too). For the time being, I expect all these linear algebra operations to be executed on the spot (in order of operator precedence). Some kind of lazy evaluation (a la Eigen) would be nice, and fully storing the graph of operations (a la PyTorch) for later execution on the GPU or for auto-differentiation would be even nicer. But let's stay on planet Earth for the moment and stick with an in-house implementation of these operations before we consider other linear algebra or tensor libraries.
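As a side note, the C++17 parallel algorithms are one way to get such a parallel for without leaving standard C++; a tiny sketch (not what the code below uses):

#include <algorithm>
#include <execution>
#include <vector>

struct vec2 { float x = 0, y = 0; };

// A "parallel for" over all entries of the big vector, here scaling by a number.
void Scale(std::vector<vec2>& v, float s)
{
    std::for_each(std::execution::par, v.begin(), v.end(),
                  [s](vec2& a) { a.x *= s; a.y *= s; });
}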

And I did implement this in C++ in a pretty straightforward manner:

void Update()
{
    const real h = real(0.16);    // time step
    vec2 gravity(0, 10);          // gravitational acceleration
    // v += h*g + h*M^-1*f: the gravity term is broadcast to all particles and masked
    // by invMass (zero for fixed/kinematic ones), and the force term uses the
    // pair-wise multiplication with the inverse masses.
    vel += mask<vec2>(h * gravity, invMass) + h * invMass * forces;
    // x += h*v
    pos += h * vel;
}

Above, invMass is another vector containing the diagonal entries of $M^{-1}$. But as you can see, this is not the end of the story: we still have the problem of "boundary conditions". Usually, when dealing with particle systems we only deal with one type of boundary condition (or constraint): a particle fixed in space. In some cases this particle can also have a prescribed motion, in which case we call it kinematic rather than fixed. But in either case, such particles are not to be integrated. This results in a simple algorithm where we check if the particle is fixed or kinematic and skip it in the integrator. The convention is usually that such particles have infinite mass (they can't be moved by any force) and thus zero inverse mass. A lot of code out there (especially constraint solvers) relies on this convention, storing only the inverse masses of the particles and exploiting the zero value whenever needed (e.g., a constraint linked to a fixed anchor in space or one coming from animation). But as you can imagine, this makes working with the actual masses harder. However, let's indulge ourselves in this convention and continue.
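In code, the convention boils down to something like this (a tiny sketch with made-up names, just to illustrate the zero-inverse-mass trick and why recovering the actual mass needs a guard):

#include <cstddef>
#include <limits>
#include <vector>

// A fixed or kinematic particle simply gets a zero inverse mass, so no force can move it.
void SetFixed(std::vector<float>& invMass, std::size_t i)
{
    invMass[i] = 0.0f;
}

// Recovering the actual mass needs a guard against the zero value.
float Mass(const std::vector<float>& invMass, std::size_t i)
{
    return invMass[i] > 0.0f ? 1.0f / invMass[i] : std::numeric_limits<float>::infinity();
}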

There are a few ways I have identified in which we can treat boundary conditions (mostly Dirichlet ones):

  1. group all the fixed particles at the beginning of the vector, store an offset to know where the dynamic particles begin and only integrate from there on;
  2. use a "mask" operation (like the ones used in SIMD) to select between the integrator result and another value, most probably zero;
  3. store a list of indices in the big vector to mark the dynamic particles (or fixed ones, respectively) and then use that list to access a sub-set of the vector (a la Matlab or NumPy);
  4. eliminate those fixed particles altogether from the global DOFs vector (through "direct substitution"), but then you have to store them somewhere else and take them into account when computing (spring) forces.

The first option is something that I used extensively in my FEM code (where a node is equivalent to a particle). I also used option 3 when porting the same FEM code to Python (either NumPy or PyTorch), and it's basically inspired by the Matlab way of doing things. That doesn't mean you can't use the other options for a Matlab or Python simulator. It's clear that here I chose option 2, as I now think it's a better choice, at least for the Symplectic Euler code. It pretty much corresponds to the old way of putting an if inside a for loop, although the actual implementation of the abstraction may vary. Finally, option 4 is something I did in some of my constraint solvers, but the downside is that you then have to differentiate between your constraint types: particle-world vs particle-particle. In the end, it's a lot about personal taste and how you like to organize your code. Which one is faster is hard to say, probably 1 or 4, but such options are quite limiting too. For example, you can't dynamically set particles to be kinematic or dynamic without some complicated reshuffling or re-labeling of edge forces. Hence the mask option for now, which can probably be implemented with some SSE instructions.
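As a rough idea, the mask used in the Update() snippet could be implemented along these lines; this is just a scalar sketch of option 2 with placeholder names, not the actual code (an SSE version would replace the branch with a compare-and-blend):

#include <cstddef>
#include <vector>

struct vec2 { float x = 0, y = 0; };

// Broadcast the value to every particle, but select zero where the inverse mass is
// zero, i.e. for fixed or kinematic particles.
std::vector<vec2> mask(const vec2& value, const std::vector<float>& invMass)
{
    std::vector<vec2> result(invMass.size());
    for (std::size_t i = 0; i < invMass.size(); ++i)
        result[i] = (invMass[i] != 0.0f) ? value : vec2{};
    return result;
}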

Finally, the hardest part in this code is not even contained in it: it's computing the actual forces $f$ acting on each particle. I already talked about this in my force accumulation pattern post, and I intend to do it one more time in a future post, as there is not enough space left in this one. But I will mention that a lot of my current motivation is coming from the Simit language. Even if it looks more like GPU code (kernels/shaders) and it's not always vectorized like Python or Matlab code, it has an important feature: reductions (like the force accumulation) are abstracted and automated by the language. The "map" step, as it is called in Simit, allows the user to perform Matlab-like linear algebra operations after the vectors and matrices have been assembled (i.e., vectorized code). And these operations could form either an implicit integrator of some sort, involving a linear solver (or a nonlinear optimization), or an explicit one like the above. In the end, this is what I would like to achieve in my own C++ code too, or even use Simit directly if possible.
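Just to give the flavor of the reduction that Simit automates, this is the classic scatter-add of per-spring forces into the global force vector (a plain C++ sketch, not Simit code and not the full accumulation pattern from the other post):

#include <cmath>
#include <cstddef>
#include <vector>

struct vec2 { float x = 0, y = 0; };
struct Spring { std::size_t i, j; float k, rest; };

void AccumulateSpringForces(const std::vector<Spring>& springs,
                            const std::vector<vec2>& pos,
                            std::vector<vec2>& forces)
{
    for (const Spring& s : springs)
    {
        vec2 d{ pos[s.j].x - pos[s.i].x, pos[s.j].y - pos[s.i].y };
        float len = std::sqrt(d.x * d.x + d.y * d.y);
        if (len == 0.0f) continue;
        float f = s.k * (len - s.rest) / len;              // Hooke's law along the spring direction
        vec2 fij{ f * d.x, f * d.y };
        forces[s.i].x += fij.x; forces[s.i].y += fij.y;    // the reduction: += into shared entries
        forces[s.j].x -= fij.x; forces[s.j].y -= fij.y;
    }
}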

The end goal is to understand all the steps and abstractions necessary for writing simulation code, paving the way to a pipeline design like in graphics, while leaving enough flexibility (and ease of coding) to implement completely different algorithms (integrators, solvers, etc.). I'm not sure if a domain-specific language (DSL) is the way to go here, or if writing everything in C++ once for multiple backends is better (although I'm not sure how to target the GPU from there: SYCL hints in that direction, and PyTorch is an example of success too).
