Thursday, January 26, 2023

Laplacian smoothing and friends

In its simplest form, Laplacian smoothing amounts to replacing a point by the average of its neighbors. For a mesh, a neighborhood can be a flexible concept, but we can reduce it to the one-ring of each vertex, i.e., the vertices connected through an edge to the current vertex. This is valid for triangle and quad meshes, or any polygonal faces. For quad meshes, we can probably extend it to contain the vertices that lie opposite the current vertex (along the diagonal of the quad). This is often done for cloth meshes to define shear springs - in which case we don't really have a mesh anymore, but rather a graph. Note that this extension to quads (and cloth) is not really a second order neighborhood (vertices connected through 2 edges), but rather a subset of it.

As you can see, already the notion of a neighborhood is rich enough for a discussion. I guess we can also apply the same concept in a mesh-free context (like in SPH), where a neighborhood is defined by a radius around the current vertex. The next tricky part is the weights of each vertex in the neighborhood. In my experiments, using the "umbrella" Laplacian operator (e.g. the [1 -2 1] kernel) seems to be just fine for many cases and I'll stick with it for simplicity.

In this blog post, I will look at Laplacian smoothing and its various facets and try to learn something new from it. Mainly: how is it so many things at the same time (a low pass filter, a membrane simulation, a 2D diffusion process, etc.) and how are these views connected? It's definitely not going to be an exhaustive answer.

Let's start with the math:

$p'_i = p_i + \lambda L p_i = (I + \lambda L) p_i = p_i + \lambda \Delta p_i$

The new position $p'$ is the old position plus a delta term obtained by applying the Laplacian operator $L$ (which is in fact a matrix) to the current position $p$. But this operator also depends on the neighboring vertices of $p$. So, in the end it's a sum of the form

$\Delta p_i = \sum_j w_j (p_j - p_i)$

where the $w$ terms are called the weights (like in a weighted average or a convolution).
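To make this concrete, here is a minimal sketch of one smoothing step with uniform one-ring weights $w_j = 1/n$. The vec3 type and the adjacency-list representation are illustrative assumptions, not from any particular library:

#include <cstddef>
#include <vector>

struct vec3 { float x, y, z; };
vec3 operator+(vec3 a, vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
vec3 operator-(vec3 a, vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
vec3 operator*(float s, vec3 a) { return {s * a.x, s * a.y, s * a.z}; }

// One iteration of p' = p + lambda * L p with uniform weights w_j = 1/n.
void SmoothStep(std::vector<vec3>& pos,
                const std::vector<std::vector<int>>& oneRing,
                float lambda)
{
    std::vector<vec3> next = pos; // read old positions, write new ones
    for (size_t i = 0; i < pos.size(); ++i)
    {
        if (oneRing[i].empty())
            continue;
        vec3 delta = {0, 0, 0}; // sum of (p_j - p_i) over the one-ring
        for (int j : oneRing[i])
            delta = delta + (pos[j] - pos[i]);
        next[i] = pos[i] + lambda * ((1.0f / oneRing[i].size()) * delta);
    }
    pos = next;
}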

I won't talk here much about the shrinking effect of Laplacian smoothing. The most common solutions are to use the Taubin method (also known as the "lambda | mu" method) or to try to preserve the volume somehow.
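For reference, the Taubin method simply alternates a shrinking step (positive lambda) with an inflating step (negative mu, slightly larger in magnitude), which is what keeps the mesh from collapsing. A sketch reusing SmoothStep from above:

// Taubin "lambda | mu" smoothing: alternate a smoothing step with a
// compensating inflation step. Typical values: lambda = 0.33, mu = -0.34.
void TaubinSmooth(std::vector<vec3>& pos,
                  const std::vector<std::vector<int>>& oneRing,
                  float lambda, float mu, int iterations)
{
    for (int k = 0; k < iterations; ++k)
    {
        SmoothStep(pos, oneRing, lambda); // shrink (low-pass)
        SmoothStep(pos, oneRing, mu);     // un-shrink
    }
}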

Image filters

Laplacian filters are used in image processing to detect edges or sharpen the image. Usually, the convolution kernels are not "normalized" and look something like [-1 2 -1]. My guess is that this is because they want to exaggerate the effect, as the image colors are clamped between 0 and 255 anyway. Another important distinction is that the sign is reversed from the one used for smoothing curves and meshes. From my experiments on images and meshes so far, my conclusion is that a negative $L$ will sharpen the field values and a positive one will smooth them.

Another interesting thing I noticed is that the resulting averaging kernel (e.g. [0.5 0 0.5]) obtained after adding the Laplacian (with a full step) does not include the original value anymore. The result is still a smoothing, hence a blurring of the image, but very different from a classic Gaussian low pass filter (or sinc, or cubic B-spline for that matter). However, Taubin argues that the averaging is still a Gaussian weighting; maybe the difference is that one works per node (for images) and the other per edge (for meshes). Still, it's pretty clear they are both low pass filters, so it's probably just different ways of doing it, with different derivations.
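This is easy to verify in 1D: with the umbrella kernel [1 -2 1] halved (so the weights average the two neighbors) and a full step $\lambda = 1$,

$p'_i = p_i + \frac{1}{2}(p_{i-1} - 2 p_i + p_{i+1}) = \frac{1}{2} p_{i-1} + 0 \cdot p_i + \frac{1}{2} p_{i+1}$

which is exactly the [0.5 0 0.5] kernel: the center weight cancels out.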

Clearly, the biggest distinction is that images are represented on a regular grid. If we were to represent an image as a scalar field broken down into triangular elements, the situation would be more similar. It would be like working in the uv space of the mesh. An image also has no intrinsic curvature, whereas for meshes curvature is taken into account by the Laplace-Beltrami operator (the cotangent weights). But if we use uniform weights, the analogy is pretty clear, even more so if we used quad meshes, as then we would have something very close to a grid, at least topologically. By the way, there are two flavours of 3x3 Laplacian convolution kernels in image processing: one uses only the direct neighbors and the other also the diagonal ones (as discussed above).

Mesh smoothing

The Laplacian arises from minimizing some kind of membrane (or thin plate) energy. This in turn is usually the squared norm of some strain tensor, which also approximates total curvature. Given this objective, it is easy to see that the end result is a surface without curvature, or at least with the minimal curvature everywhere that still preserves the topology of the mesh. Given no other constraints, the minimization may also shrink the mesh in the process.

Desbrun et al. show that smoothing is in fact a diffusion PDE (time-dependent) integrated on a curved mesh surface. (It's like a temperature becoming uniform everywhere on the mesh; or some spiky hairs on it, like the normal vectors.) But the mesh is already a space discretization, so in the end we only need to integrate an ODE in time. This can be done using explicit Euler, resulting in the classic repeated application of the Laplacian operator, or using implicit Euler, which is called "implicit fairing" and requires solving a linear system, as is usually the case with implicit integration.
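In formulas (following Desbrun et al., with $\lambda$ absorbing the diffusion constant and the time step), the two variants read:

$\text{explicit: } p^{n+1} = (I + \lambda L)\, p^n, \qquad \text{implicit: } (I - \lambda L)\, p^{n+1} = p^n$

so the implicit version indeed requires solving a (sparse) linear system per step, but it remains stable for large $\lambda$.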

Graph Laplacian

A more general way to define the Laplacian operator is to do it on any graph, not only a manifold mesh. In words, this is a matrix that has the degree (the number of incident edges) of each vertex on the diagonal, -1 at the index of each connected vertex, and zeros everywhere else. This can be summarized by the equation
$L = D-A$,
where $D$ is the degree matrix and $A$ the adjacency matrix (1 for each neighboring vertex). But a more interesting way is to use the incidence matrix $B$, which encodes each edge as a pair of the form [1 -1]. And this results in a symmetric factorization:
$L = B B^T$

A mesh is a special case of a graph. So, this tells us that the above Laplacian will always be a symmetric matrix (under normal assumptions) which in turn comes from some kind of "difference" rectangular matrix multiplied with its own transpose. Which makes sense, given that in vector analysis the Laplacian is the "del" operator, or gradient, dotted with itself: $\Delta = \nabla \cdot \nabla$ (sometimes denoted $\nabla^2$).
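As a sanity check, here is a sketch building $L$ from an edge list (a small dense matrix, for illustration only). Each edge contributes the outer product of its incidence column [1 -1] with itself, so accumulating over edges computes $L = D - A$ and $L = B B^T$ at the same time:

#include <utility>
#include <vector>

// Graph Laplacian L = D - A, assembled edge by edge: each edge (i, j)
// adds the 2x2 block [1 -1; -1 1], i.e., the outer product of the
// incidence column [1, -1]^T with itself.
std::vector<std::vector<int>> GraphLaplacian(
    int numVerts, const std::vector<std::pair<int, int>>& edges)
{
    std::vector<std::vector<int>> L(numVerts, std::vector<int>(numVerts, 0));
    for (const auto& e : edges)
    {
        L[e.first][e.first] += 1;   // degree on the diagonal
        L[e.second][e.second] += 1;
        L[e.first][e.second] -= 1;  // -1 per adjacent vertex
        L[e.second][e.first] -= 1;
    }
    return L;
}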

Cloth simulation

Smoothing is equivalent to having a spring on each edge (according to Desbrun et al.), but a special type of spring with zero rest length. This is related to the minimization structure described above. It also explains the shrinking.

We could then build a cloth simulator that reproduces the same behavior as Laplacian smoothing. This could be done using the Baraff-Witkin method (or any other implicit scheme). In this case, the method should be equivalent to "implicit fairing". But it can also be done with an explicit integration step.

Given that a zero-rest-length spring doesn't really have a direction, but is more like a 3-DOF constraint, it is also fully linear: it is equivalent to 3 independent springs acting on each axis. This makes the forces (and thus the Laplacian) a linear operator acting on the positions. Thus the stiffness matrix (gradient of force, Hessian of energy) is the Laplacian matrix itself, up to sign and the stiffness factor. All the pieces fit!
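Spelled out: the zero-rest-length spring energy, the resulting force on a vertex, and its gradient are

$E = \frac{k}{2}\sum_{(i,j)} \|p_i - p_j\|^2, \qquad f_i = -\frac{\partial E}{\partial p_i} = k \sum_{j \in N(i)} (p_j - p_i), \qquad \frac{\partial f}{\partial p} = -k L$

so the force is $k$ times the (umbrella) Laplacian of the positions, and the stiffness matrix is the graph Laplacian scaled by $-k$.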

A first extension that we could think of is to implement a smoothing algorithm based on cloth simulation that doesn't target zero rest length, but only a fraction of the rest length. Of course, it will make things more complicated. For one, the Laplacian operator will no longer be what it used to be, as it's being replaced by the stiffness matrix and nonlinear spring forces. But in my mind it would offer more control over the smoothing process. And the reasoning also goes the other way around: cloth simulation is essentially a mesh smoother. Add on top of that bending springs or volume preservation and you get a powerful geometric tool.

The fact that cloth simulation is a "smoother" is even more clear when doing explicit integration or constraint solving. There is a pattern for accumulating forces in almost any solver that I call the Jacobi fashion. It uses the neighborhood of each vertex, or its incident constraints/springs. You could do it in two passes: one over the edges, computing some edge quantities, and then a sum of those quantities for each vertex (either using the incidence info or not). Or you could do it in one go for each vertex (and its one-ring neighborhood), which starts looking a lot like mesh smoothing or a convolution kernel. And in the end a Jacobi constraint solver (or Gauss-Seidel) is just a smoother; in fact, relaxation solvers are sometimes called smoothers in the context of multigrid. And there are actual results out there in fluid simulation where Jacobi iterations act as recursive convolution kernels - they call them Jacobi-Poisson filters.
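Here is a sketch of the two-pass variant, reusing the vec3 type and operators from the first sketch (the one-pass gather over each vertex's one-ring would produce the same sums):

// Jacobi-style force accumulation: all edge forces are computed from the
// same (old) positions in pass 1, then scattered to the vertices in pass 2,
// so no vertex ever sees a partially updated state.
void AccumulateSpringForces(const std::vector<vec3>& pos,
                            const std::vector<std::pair<int, int>>& edges,
                            float k, std::vector<vec3>& force)
{
    std::vector<vec3> edgeForce(edges.size());
    for (size_t e = 0; e < edges.size(); ++e) // pass 1: per-edge quantities
        edgeForce[e] = k * (pos[edges[e].second] - pos[edges[e].first]);
    for (size_t e = 0; e < edges.size(); ++e) // pass 2: scatter to vertices
    {
        force[edges[e].first] = force[edges[e].first] + edgeForce[e];
        force[edges[e].second] = force[edges[e].second] - edgeForce[e];
    }
}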

And to go a bit into the details, we already know from the context of implicit cloth simulation that the Laplacian operator corresponds to the stiffness matrix $K = k J^T J + K_g$, where $k$ is the spring stiffness (considered the same for all springs), $J$ is the Jacobian of the constraint functions (used by the springs too) and $K_g$ is something that I will call geometric stiffness and ignore here. Here $J$ stands for an equivalent (transposed) of the incidence matrix $B$ - it's actually an equivalent with extra orientation in space. ($J$ also stands for the Jacobian matrix, which is just a generalized and transposed gradient.)

Now, if we take a projection operation, which is at the heart of any constraint solver, we get
$p' = (I - J^T (JJ^T)^{-1}J) p$.
This again gives us a glimpse of an equivalent Laplacian operator. But even more, this is equivalent to solving a linear system
$(JJ^T)x = b$
which in fact is what is done in practice to find the Lagrange multipliers, here denoted by $x$. And the matrix of this system is clearly the Laplacian of some graph itself, and the whole linear system corresponds to a discrete form of a Poisson equation. I posit that this Laplacian is nothing else but the one coming from the dual graph of the constraints/springs. It is something that I've been glimpsing throughout the years - it's not especially hard to see if you look closely at a PBD or impulse solver (especially a Jacobi one). But my current realization, which I've been guessing at all this time, is that a constraint solver is really a smoothing operation on some field (velocity, position) in the end.
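One way to see the dual-graph claim: the entries of the system matrix are dot products of constraint gradient rows,

$(J J^T)_{ab} = \sum_i J_{a,i} \cdot J_{b,i}$

which is nonzero exactly when constraints $a$ and $b$ share a particle $i$ (adjacency in the dual graph), while the diagonal holds the squared gradient norms, playing the role of the degree - a weighted version of the graph Laplacian structure described above.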

However, in the end, smoothing and cloth simulation are still distinct processes. Maybe they are both more or less equivalent to each other and they both act as spatial filters. It is also said that the "cotangent Laplacian" can be derived from FEM, thus it cannot come from springs alone, but rather from triangle elements. So, there are just different flavors of "Laplacians" that appear in mass-spring or constraint-based simulations. Also, smoothing seems to act more on the curvature part of the surface, so bending forces or constraints may play a bigger role on the cloth simulation side. In fact, usually they are all superposed on each other (stretching, shearing, bending), resulting in a global stiffness matrix or Laplacian. So, there are a lot of unanswered questions in my head and experiments to be done.

Bibliography

Desbrun et al., "Implicit Fairing of Irregular Meshes using Diffusion and Curvature Flow", 1999

Wednesday, January 4, 2023

Hello integrator

I thought many times about posting some kind of tutorial for physics simulation. And my mind always goes back to the basics: the integrator. Actually, the simplest integrator that works: Symplectic Euler. But how would I implement it best if I were to start over (or from scratch)?

This is how I would do it today:

$v += h g + h M^{-1} f$

$x += h v$

The main idea is that I would use vector notation, just like in Matlab. The main quantities are the positions $x$ and the velocities $v$. Now there is the question of how you actually store these positions: either as an array of vector3 (or vector2), or storing first all the x coordinates, then the y and so on. Or you can store them as a rectangular matrix where the rows are the individual coordinates and the columns the vectors. In my experience, the first option is the best, as most operations are done on the vector3 corresponding to each particle. The problems start appearing when you want to switch from 3D to 2D or from particles to rigid bodies. In fact, you can have any number of degrees of freedom (DOFs) per body and in total, so it's up to the vector abstraction code to deal with such irregularities (the "views" from NumPy and PyTorch may be a way to go, but I'm not sure how expensive they are).

But in the example above we're just dealing with regular n-dimensional vectors. The only hiccup is the broadcast of the gravitational acceleration $g$ to the bigger vector $v$. But assuming that we are dealing with a list of vector3's (and leaving rigid bodies aside) this should be rather easy. Same goes for scaling by a number, in this case the time step $h$. Multiplying a general vector by a matrix should also be doable, although in this case we know $M^{-1}$ is a diagonal matrix and we should probably do it as a pair-wise multiplication instead (which would need a special operator notation). (When doing FEM with non-lumped masses, $M$ is no longer diagonal.)
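For example, the diagonal $M^{-1} f$ product could be exposed as a pair-wise scale. This is a sketch only, assuming the same real and vec2 types as the integrator code below; the function name CWiseScale is hypothetical, and in practice it would hide behind an operator:

// Pair-wise scale: equivalent to M^-1 * f for a diagonal (lumped) mass
// matrix; each particle's force is scaled by its own inverse mass.
std::vector<vec2> CWiseScale(const std::vector<real>& invMass,
                             const std::vector<vec2>& f)
{
    std::vector<vec2> out(f.size());
    for (size_t i = 0; i < f.size(); ++i)
        out[i] = invMass[i] * f[i];
    return out;
}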

The whole idea here is to avoid for loops and let the vector abstraction layer do it for you. It's simply more elegant and easier to write, if you ask me. And if the vectors are very large, as usually happens, resort to parallel fors or some other parallelization technique (GPU is a bit trickier to abstract as plain C++ code, but would be nice too). For the time being, I expect all these linear algebra operations to be executed on the spot (in order of operator precedence). Some kind of lazy evaluation (a la Eigen) would be nice, and fully storing the graph of operations (a la PyTorch) for later execution on GPU or auto-differentiation would be even nicer. But let's stay on planet Earth for the moment and stick with an in-house implementation of these operations before we consider other linear algebra or tensor libraries.

And I did implement this in C++ in a pretty straightforward manner:

void Update()
{
    // Fixed time step; gravity points along the positive y axis here.
    const real h = real(0.16);
    vec2 gravity(0, 10);
    // v += h g + h M^-1 f, with the gravity term masked out for
    // fixed/kinematic particles (those with zero inverse mass).
    vel += mask<vec2>(h * gravity, invMass) + h * invMass * forces;
    // x += h v: symplectic Euler uses the freshly updated velocity.
    pos += h * vel;
}

Above, invMass is another vector containing the diagonal entries of $M^{-1}$. But as you can see, this is not the end of the story: we still have the problem of "boundary conditions". Usually, when dealing with particle systems we only deal with one type of boundary condition (or constraint): a particle fixed in space. In some cases this particle can also have a prescribed motion, in which case we call it kinematic rather than fixed. But in either case, such particles are not to be integrated. This results in a simple algorithm where we check to see if the particle is fixed or kinematic and skip it in the integrator. The convention is usually that such particles have infinite mass (they can't be moved by any force) and thus zero inverse mass. A lot of code out there (especially constraint solvers) relies on this convention, storing only the inverse masses of the particles and exploiting the zero value whenever needed (e.g., a constraint linked to a fixed anchor in space or one coming from animation). But as you can imagine, this makes working with the actual masses harder. However, let's indulge ourselves in this convention and continue.

There are a few ways I identified in which we can treat boundary conditions (Dirichlet ones mostly):

  1. group all the fixed particles at the beginning of the vector, store an offset to know where the dynamic particles begin and only integrate from there on;
  2. use a "mask" operation (like the ones used in SIMD) to select between the integrator result and another value, most probably zero;
  3. store a list of indices into the big vector to mark the dynamic particles (or the fixed ones, respectively) and then use that list to access a subset of the vector (a la Matlab or NumPy);
  4. eliminate those fixed particles altogether from the global DOFs vector (through "direct substitution"), but then you have to store them somewhere else and take them into account when computing (spring) forces.
The first option is something that I used extensively in my FEM code (where a node is equivalent to a particle). I also used option 3 when porting the same FEM code to Python (either NumPy or PyTorch); it's basically inspired by the Matlab way of doing things. That doesn't mean you can't use the other options for a Matlab or Python simulator. As you can see, here I chose option 2, as I now think it's a better choice, at least for the Symplectic Euler code. It pretty much corresponds to the old way of putting an if inside a for loop, although the actual implementation of the abstraction may vary. Finally, option 4 is something I did in some of my constraint solvers, but the downside is that you then have to differentiate between your constraint types: particle-world vs. particle-particle. In the end, it's a lot about personal taste and how you like to organize your code. Which one is faster is hard to say - probably 1 or 4, but those options are quite limiting too. For example, you can't dynamically set particles to be kinematic or dynamic without some complicated reshuffling or re-labeling of edge forces. Hence the mask option for now, which can probably be implemented with some SSE instructions.
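For completeness, here is one way the mask operation from the code above could be implemented. This is a sketch only; the real abstraction might hide the loop behind expression templates or SIMD selects:

// Select between a broadcast value and zero, per particle: fixed or
// kinematic particles (zero inverse mass) receive no gravity kick.
// Note: V(0, 0) assumes the 2D case from the integrator above.
template <typename V>
std::vector<V> mask(const V& value, const std::vector<real>& invMass)
{
    std::vector<V> out(invMass.size());
    for (size_t i = 0; i < invMass.size(); ++i)
        out[i] = (invMass[i] > 0) ? value : V(0, 0);
    return out;
}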

Finally, the hardest part in this code is not even contained in it: it's computing the actual forces $f$ acting on each particle. I already talked about this in my force accumulation pattern post, and I intend to do it one more time in a future post, as there is not enough space left in this one. But I will mention that a lot of my current motivation comes from the Simit language. Even if it looks more like GPU code (kernels/shaders) and it's not always vectorized like Python or Matlab code, it has an important feature: reductions (like the force accumulation) are abstracted and automated by the language. The "map" step, as it is called in Simit, allows the user to perform Matlab-like linear algebra operations after the vectors and matrices have been assembled (i.e., vectorized code). And these could form either an implicit integrator of some sort, involving a linear solver (or nonlinear optimization), or an explicit one as above. In the end, this is what I would like to achieve in my own C++ code too, or even use Simit directly if possible.

The end goal is to understand all the steps and abstractions necessary for writing simulation code, paving the way to a pipeline design like in graphics, while leaving enough flexibility (and ease of coding) to implement completely different algorithms (integrators, solvers, etc.). I'm not sure if a domain specific language (DSL) is the way to go here or if writing everything in C++ once for multiple backends is better (although I'm not sure how to target the GPU from there: SYCL hints in that direction, and PyTorch is an example of success too).