This post was triggered by a section in the Lanczos book which talks about the physical meaning of the Lagrange multipliers. It is basically the argument that constraint forces are in fact the gradients of a very strong potential. And the usual picture is that this potential has a minimum exactly on the constraint curve and hence a zero force there just as expected. One such example of a potential is a typical inter-molecular force (e.g., van der Waals, Lennard-Jones, Morse) which has an equilibrium point at a certain distance from the molecule, corresponding to a stable point of the chemical bond. Molecules can vibrate around these points (like in a crystal) resulting in a thermal motion that is usually reduced away, leaving only the macroscopic degrees of freedom (DOF). The microscopic DOF oscillations have a reasonably sized amplitude and usually a very high frequency due to the stiff forces.
One first question that comes to mind when dealing with constrained Lagrangian mechanics is why are the Lagrange multipliers non-zero but the constraint is always satisfied $\phi(x)=0$? We know how to compute them algebraically, but where do they come from? And the standard answer is that they are the limiting case of a very strong potential as described above? But what does it mean very strong? Is the force becoming infinite or just the stiffness, i.e., the second derivative value at the equilibrium point?
A partial answer comes from the fact that if initial conditions are compatible with the constraint then the constraint forces have the duty to cancel out external and centrifugal forces that act perpendicular to the constraint direction (i.e., along the normal to the constraint manifold). This is in accord with d'Alembert's principle and the Newtonian view where there is always a reaction force keeping the particles in equilibrium along the constrained direction. This simplifies the problem to one of statics. But what is the potential we are trying to minimize?
Apparently, there should be three potentials summed up: one for inertial terms, another for external fields like gravity and finally the constraints. But if the latter is very strong then the other shouldn't matter anymore, should they? Also, the arguments should not be mixed: we either start from the axiomatic principles of d'Alembert and Lagrange and derive equations with reduced DOF due to constraints; or we start with the maximal DOF, consider very strong potentials and take the limiting case. And at the end ask ourselves if they are the same.
Lanczos and Arnold argue that they are indeed the same. Other authors go a bit deeper and say that the convergence to the Lagrangian equations of motion is only in a weak or averaged sense. Some go as far as saying that the Lagrange equations on a constraint manifold are not even correct: either from a numerical standpoint (the integrator cannot achieve the same results as the reduced DOF solution) or theoretical (the convergence solution depends on many aspects of the real strong potential function).
For small displacements, it is enough to approximate the potential function around the equilibrium point. Thus, the the first order them (the force) is zero and only the second order term survives, resulting always in a quadratic potential. Hence, the ubiquitous analogy with a very strong linear spring. (Of course, with some positive definiteness assumptions.) But what justifies us taking the small displacement limit? Is it the a priori assumption that a constraint should always be satisfied? Or is it a more practical assumption that somehow the physical system will never oscillate too far from the equilibrium point? That is, the initial conditions are either compatible with the constraint or very close to it. The reasoning is a bit circular, because we would never idealize a very strong interaction force with a constraint, unless all the conditions in practice were met for such an idealization (e.g., a rigid body). And this would involve in my view some kind of energy dissipation through which all the particles would fall into their places (condense) and stay put on the constraint manifold.
But there would still be a microscopic vibrating motion, that we can either ignore or not. In the latter case, in molecular dynamics simulations it amounts to thermodynamical properties. Usually, great effort is put into correctly simulating this highly oscillatory motion and preserving the energy. But in mechanical engineering, this motion is totally ignored. It is not clear though what is assumed about this motion? Does it still happen as a perpetual oscillation and the resulting motion is an averaged motion? Or that our numerical method hits only those points where the constraint is satisfied at either position or velocity level? This does happen in practice a lot. Or is this vibrational energy totally damped out?
The view that Lanczos takes is that the microscopic deviation always happen, it cannot be prevented. And that is why there is always a non-zero constraint force (magnitude), i.e., the Lagrange multiplier. So, at a microscopic level, the constraint is never fully satisfied, except at some sample points or on average. This would answer our first question. (This is also the view behind constraint regularization, soft constraints and mixed formulations.) But then there are many authors who claim that an ideal constraint exists only if infinite forces are allowed. So the second question is: are constraint forces bounded? Again, the Lagrangian formalism answers simply by giving us a formula which can never produce infinite values. But the limiting potential theory could, as the derivative of the potential can grow towards infinity.
In my view, the answer is simple. If the constraint force is just a spring force $f = -k \phi(x)$ then taking the limit $k \rightarrow \infty$ then the force would become infinite. But remember that we also considered the deviation $\phi$ to be very small, i.e. $\phi \rightarrow 0$. Or, more often, the compliance $\epsilon = 1 / k$ is used instead, and so we get the ratio of two very small numbers; they both tend towards zero, but their ratio is finite. (Although, this probably doesn't qualify as a proof.) But is the small deviation hypothesis a sensible one?
In practice with numerical methods we can always get configurations where the constraint is badly violated. And what about the real physical world? If we raise the temperature high enough, the bonds will vibrate wildly, not only fast, but with high amplitude. Of course, at some point they would break and the constraint idealization would not make sense anymore. So, even if $k$ is not small we can see visible macroscopic violations of the constraints. But would that matter in an averaged sense? (If we consider a rigid body as the stiff limit of an elastic body, impacts or rigidity itself can be seen as transient dynamic phenomena that dissipate inside the bodies. The energy gets transformed into microscopic vibrations or heat that flows to a cold sink.)
I don't know how these cases should be considered. All I know, is that for the problems at hand in multibody systems, the motion outside the constraint plane does not concern us and it can be either oscillatory or damped, it doesn't matter. Or does it (given all the numerical drift)? And this is why projection methods are often preferred to penalty methods, because the latter usually oscillate visibly (unless carefully tuned and damped). And in my view we should always use projection because it's the only method that can handle incompatible initial conditions or position and velocity drift. But what is the physical limiting phenomenon behind projection? I hope to answer this question in the future, for now it is only a discussion. I think that such a phenomenon should clearly contain damping or impulsive components, but the details are not yet clear.
To conclude, I think whenever you read about infinite forces you should take it with a grain of salt. It's probably that the stiffness of the potential is infinite, but the constraint force should be bounded. Maybe only if you are considering impulsive motion (non-smooth) you can consider the projection of the velocity as the result of an infinite force integrated over a small time (Dirac delta function). But when we are talking about a strong harmonic potential and constraint regularization, there cannot be such forces. The actual motion is in fact highly oscillatory and the constraint is only satisfied on average.