Honest question (am a prof. of Maths for engineering at a Spanish Univ. and am wondering about this a lot):
What is the point of explaining, say the QR factorization or iterative methods like Gauss-Jordan to engineers? I think of them as the method of integration of rational functions (the complete method, not the basic cases): totally useless today...
I understand, for example, the Jordan decomposition theorem in a Linear Algebra course because it gives a clear-cut description of a linear map and the meaning of eigenvalues/eigenspaces, etc.
It sort of depends upon the course. This particular syllabus looks like a really standard linear algebra course (I have no idea what is robotics-specific about it), and so of course Gauss-Jordan would show up there.
More generally, though, I think there are two main reasons to put algorithms like Gauss-Jordan in the engineering curriculum.
1. While it would be nice to treat linear algebra solvers as a perfect black box, in practice this does not work. Engineers have to be aware of numerical stability issues, how to diagnose when this is an issue, and how to reformulate their routines in a way that resolves the problem. And to do this, they need to know the library of techniques.
A reasonable way of teaching this is to teach Gauss-Jordan, then showing how it goes badly awry, and then showing how things like pivoting can fix it.
2. Personally, though, I find the topic of numerical stability to be a little bit depressing, since it focuses on all the ways computers don't work!
To take a more positive view, a huge fraction of the algorithms in an undergraduate CS course -- from finite automata to parsing to relational algebra to graph traversals -- can be understood as basically doing linear algebra using modules over different semirings (rather than just the reals). Eg, for all-pairs shortest paths, the Floyd-Warshall algorithm is doing an LU decomposition, and Kleene's algorithm is doing Gauss-Jordan.
Not every student will enjoy this, but for the ones who are algebraically minded, it's really exciting to be able to offer them a unified perspective. And then you can show them the GraphBLAS library!
> 2. Personally, though, I find the topic of numerical stability to be a little bit depressing, since it focuses on all the ways computers don't work!
Maybe a way to more positively reformulate this would be: There is no a priori reason to assume that floating point numbers are well behaved. The fact that we were able to come up with a structure so that it approximates real numbers adequately, that arithmetic operations on it are fast (which they aren't for infinite precision) and that, if we design the algorithms correctly, errors are well-behaved, is an astonishing feat of engineering.
Computers work perfectly fine. Engineering was done for a long time with slide rules to avoid the tedium of looking up values in a book and grinding out results by hand or adding machine. Using them correctly requires knowing their limitations just as knowing the limitations of a computer is important. They aren't magic oracles that always give correct answers.
> To take a more positive view, a huge fraction of the algorithms in an undergraduate CS course -- from finite automata to parsing to relational algebra to graph traversals -- can be understood as basically doing linear algebra using modules over different semirings (rather than just the reals). Eg, for all-pairs shortest paths, the Floyd-Warshall algorithm is doing an LU decomposition, and Kleene's algorithm is doing Gauss-Jordan.
The blew my mind! I loved these undergrad and grad courses but never realized they are connected. Could you please forward me to few books or resources that go over these relationships?
I really agree with your comments; especially number 1. Often I can't use some black box implementation of an solver (or other algorithm for that matter) without some modifications. Numerical stability is a big one but also just performance. Sometimes the mathematically correct way of doing something is not always the best in practice. Short cuts and approximations can provide huge benefits. It is difficult to make those modifications without understanding the inner workings of the original method.
One example is that GJ stops being generally efficient in arbitrary precision settings. Many people will never have to deal with this, but if you are doing cryptography, it matters.
For example, while you can use GJ to calculate the determinant of a matrix, this can easily become exponential, and for integer matrices (or generally for matrices over division rings), there is an alternative method (the Bareiss algorithm) that is actually efficient.
> What is the point of explaining, say the QR factorization (...) engineers?
As a personal data point, I'm a mathematician that regularly teaches the QR and SVD factorizations to engineers. Not because I find them interesting, but because they need it for their daily jobs and they ask me to.
More precisely, they are engineers working in the geometrical modelling of satellite sensors, doing "computer vision" from satellite images. The QR and SVD factorizations of 3 and 4-dimensional matrices are essential tools for the bundle adjustment steps, to represent the position and orientation of each satellite and the transformations between them. A crucial idea (for them) is that A=QR is a unique decomposition of A which is also continuous (the coefficients of Q and R depend smoothly on the coefficients of A). On the contrary, the SVD decomposition is neither unique nor continuous. Since the SVD provides an easier-to-use decomposition, you want to use it whenever possible, but being careful to never assume continuity nor uniqueness; if you need these properties then you pick the QR.
Deep understanding of the mathematics behind a given problem space can help you later in the applications. So maybe they'll never have to actually implement Gauss Jordan (and they shouldn't), but the core ideas of normalization and reduction are very important to any sort of numerical methods and come up again and again.
As a somewhat related example, our engineering college required a calculus based probability course that probably 95% of the engineers dreaded. Conditional probability, multivariable PDFs etc. A few years after college, I had the (dis)pleasure of needing to generate random values from an unknown PDF, given only the cumulative distribution. To a statistician, the obvious answer is to use the inverse CDF (a simple interpolated lookup table based on the known CDF) and a uniform random, but an engineer who didn't fully grok the probability basics would probably have just hacked around it, trying different approximations until something stuck.
Neither approach is right or wrong, but sometimes simply understanding the essential material and methods gives you unforeseen insights later on.
The most mathematically-relevant stuff I remember was inverse kinematics and using numerical approximation to really cut down on computation when switching between coordinate spaces. This gets more important as the degrees of freedom increase.
For instance, you'd like to know that when a state-space matrix is not full rank, you lose some control. In practical terms, a robot arm can usually move in any arbitrary Cartesian direction. A fully outstretched robot arm cannot.
For computing inverse kinematics... It's much simpler for a processor to have a slow update cycle that computes how every motor's changes will affect the position of the end effector (robot "tip") and then take a numerical inverse and use that to figure out a time-local, space-local approximation of getting to your eventual goal. It's a helluva lot easier than solving exactly.
I also recall some of the mathematics being relevant to collision detection (both in planning and execution stages), but the details are hazy. Basically, computation is expensive and convex hulls and precomputation can save a lot of cycles until it matters.
Yet other linear algebra lands in the realm of computer vision, optimization, and finite-element analysis.
Yes, of course, but those are not specific algorithms (except possibly the Gauss reduction method which deserves a special treatment because it is not "just a method of computing" but "a way of understanding" by itself).
Yep. I haven't really had much need for the techniques I learned in my graduate linear algebra course and found the in-class examples to really lack specific usefulness in the real world.
I just looked up my old class: linear algebra basics; Gaussian/LU/Choleski decomp; determinants; normed spaces; condition number; iterative methods; Euclidean spaces; QR decomp; Hermitian geometry; Eigenscheissen; spectral theorems; finite elements method; SVD and pseudoinverses; quadratic optimization.
Can confirm, totally useless in the non-research world. The most applicable task was computing spline curves. Even the Google Images result for "hermitian geometry" is mostly images of research papers. How is that real-world relevant?
The useful robotics stuff for me is either already written as a library I can call or pseudocode I can find in AIMA. Then again, all of that stuff had to come from somewhere and receive the optimization treatment.
I'd liken that entire Michigan mathematics course to the first week of my FEA course. "Here's how to calculate, by hand, a basic example of stress and strain in a very simple geometry using matrix operations. Cool, now that you see how much of a real hassle that is, never do it again because we have Ansys and Solidworks."
-----
On the other hand, a paper such as http://ras.papercept.net/images/temp/IROS/files/3131.pdf would seem totally inaccessible without a class like this Robotics 501. Maybe that constains good examples of the math being instructed. (Disclaimer: I didn't do more than glance at the course material and watch a few moments from the lectures.)
Can you explain the difference between GJ as "the Gauss reduction method"? Because I haven't seen these terms used before in a way that makes a difference between them.
In robotics specifically, understanding how methods like QR factorization work (and fail) is necessary to debug robot code, even if roboticists rarely have to implement QR factorization from scratch.
Take a look at GTSAM, a library used widely in robotics for mapping and localization. To read the intro tutorial for the library, you need to understand most of the topics listed in the OP repo:
I think the Gauss-Jordan method has some reason to be taught:
- Most people have probably solved systems of equations by hand in school. G-J basically just systematises this process by abstracting away the unnecessary details and it has a clear sequence of steps (although you can still choose different pivots). I think it would not be satisfying to never see that.
- G-J can be done by hand on simple systems.
- It is a rather simple algorithm, but it still has some edge cases to consider (what if you have a 0 in pivot position?). I've seen a number of people try to implement G-J (we used it as a coding challenge and, yes, this was actually job-related in this case ;)) and fail on particular examples, e.g. stopping iteration too early after skipping a column.
- The algorithm runs in polynomial time (assuming floating point arithmetic; it's unfortunately not polynomial for arbitrary precision integers), so it's efficient. Knowing that a system of linear equations can efficiently be solved is, I think, rather important.
- you can discuss numerical stability in the context of different pivot strategies
- other algorithms use it as a building block or are basically a variation of it (LR factorisation, matrix inversion, simplex algorithm, etc.)
Of course, you could probably just tell people that instead of teaching them the algorithm, but I would think that it sticks better if you've actually seen it and it's not that hard of an algorithm (and I don't think you need to require people to memorise it either).
If you'd ask about teaching Jordan decomposition to engineers I would agree. :)
(But I've also never taught anyone, especially not engineers. So this is just what makes sense to me personally.)
Well, if I understand your point you are refering to the fact that in many high level programming languages, the language itself is able to dected the most efficient method to do solve some linaer algebra problems, say solving linear systems.
However QR factorization (or other factorizations) may be usefull when you have to solve different problems but with the same matrix, so that factorizing once and for all gives you an advantage (for example many undetermined linear problems [1]). In this case knowing something about factorizations, or what to use when, may be helpful
Well, I'm currently in my masters in electrical engineering in Germany and we are taught the same stuff. We even have a mandatory course about basic numerical algorithms.
It is unlikely you are going to use the Gauss-Jordan method (if you find a linear problem of huge size, you are just going to plug it into Matlab or whatever) and it does really not give you any information on the problem.
On the other hand, eigenvectors (and the principal components theorem) do give you an idea.
I reckon (as of now) that Theory is much more important than methods: these are going to become obsolete very fast, whereas the geometric/qualitative/quantitative insights of theoretical results are what give you understanding.
> if you find a linear problem of huge size, you are just going to plug it into Matlab or whatever
It depends on your viewpoint. Are you going to be the user who calls the MATLAB function, or the engineer who implements a specific iterative method, optimizes it for new hardware, and so on?
I am trying to make the jump into the second category.
No: the person who implements the method is certainly working alongside (or has handy) a say, mathematician/physicist/specialist in numeric computation. I would not ask "just an Engineer" (no offense) to understand how to properly implement such type of methods (mostly because they are full of special cases which require a specialist).
Hope you make the jump to that job, which is surely interesting (and pretty hard).
> Are you going to be the user who calls the MATLAB function, or the engineer who implements a specific iterative method, optimizes it for new hardware, and so on?
99% of engineers will be in the former.
Of the remaining 1% who would like to be the ones implementing this stuff, about 90% will not get the job that lets them do it.
Of those who do get that job, likely over 50% of them will hate it.
I'm not even in electrical engineering but in Industrial engineering and I still had to take numerical algorithms course this semester. Also had to do the iterations by hand for the exam.
Just recently I solved a problem in two different ways (i.e. two methods that led to an equation of the form Ax=b), and after I'd done that, I wondered if they were merely equivalent. The first thing that popped into my head was to reduce to Row Reduced Echelon Form.
There are probably other ways to show equivalence, but here is one engineer who did use it.
(Albeit for a hobby, not a job).
Sadly, most engineers (including me) do not use on the job over 90% of the math we learn in university. And when I actively did, there were disincentives to do so (e.g. no rewards for using them, culture just doesn't value them, etc).
To expand on some of the other replies: maybe not QR and Gauss-Jordan directly, but understanding linear maps and subspaces is relevant in kinematics, especially kinematic chains like robot arms / manipulators.
You can formulate a task in terms of some nonlinear function mapping the robot's joint positions to a desired "task space" and set a reference point or trajectory in that space (e.g. keep a tool perpendicular to a surface while following a line on that surface).
Approaches for solving these tasks usually look at the Jacobian (inverse) of the task-function and use iterative methods to find a trajectory in joint-space that solves the task, either making a plan offline or as part of the control algorithm in real time.
The singular value decomposition of that Jacobian has a couple of applications like
- find the pseudoinverse for the solution
- use the condition number to detect (proximity to) singularities
- find a Tikhonov-regularized solution to avoid singularities (aka. "singularity robust inverse" or "damped least squares")
- find a projection into the null space of the Jacobian or that of a lower-rank approximation and solve lower priority tasks in that space
- use the range of the Jacobian to measure "manipulability" or agility in task space, e.g. as a secondary optimization objective
One needs to know the math to be able to understand how to transform the physical world to math so as to solve the problems. Math is not a computer module that can be used as a black box except in trivial cases.
One needs to know the math to be able to understand how to transform the physical world to math so as to solve the problems.
Example: Travelling salesman used to repair document gone through the document shredder. There isn't a book you can find this, either you understand the math principles or not.
https://news.ycombinator.com/item?id=27713441
Do you need to know the proofs though? Most of the time no, but sometimes you need to know more.
Many proofs for example invert tables and in real code you better avoid it. Buy you need to know the Gauss Jordan to avoid the wtf when you crash into its common numerical instabilities.
Pay attention to all of the answers you're receiving. It's like they're coming from the University itself! Not one single concrete example of where someone has used GJ in the field, just a bunch of the same excuses and hypotheticals regurgitated in different ways.
*Replying to the answers received as of 17:02 20220127 GMT+1.*
Thanks A HUGE LOT to all and sundry for taking the time.
I am the first one to understand that theory is the only way to really grasp a problem (this is why differential equations and calculus and linear algebra are key in the education of an Engineer). This is something I guess all of you agree on (as a matter of fact, I am a pure mathematician by education).
My problem with *specific algorithms* is that they are deemed to be obsolete, unless they are inherently theoretical. In this sense:
a) Gauss reduction method (some of you have mistaken this for the iterative Gauss-Jordan method, I guess): this is not just a method but a way to understand a linear map (and a system of equations) in a much simpler way, and of getting information by itself (i.e. the eigenvalues). Same for the Jordan decomposition result (but this is a bit overkill to me).
b) Newton-Raphson: the most important use of the "derivative as approximation" that exists, apart from showing the importance of iterations and stability.
c) The condition number, and (in)stability of linear systems: this is a pure theoretical notion (nobody computes the condition number because if you can, then you can also invert the matrix) which has very important *applications as understanding*.
However:
a) Gauss-Jordan (i.e. the iterative method): I understand that it may be good to teach it as a tool to understand convergence and iterations but (you won't believe this): here they ask the students to *perform* iterations of this method BY HAND WITH A CALCULATOR...
b) QR: I never understood teaching this to undergrad engineers (it is just the Gauss reduction method with a little care when your matrix is symmetric). What interest does this have in general for an engineer?
Etc...
Compare the two above with, for instance, the method of integration of rational functions in its most general form, which was taught in Spain in schools of engineering until twenty years ago. Nobody sees this as relevant any more, as it adds absolutely nothing to the understanding of the integral, and is just a waste of time (Wolfram|Alpha does it for free)...
By the way, and you will not believe this either (I have been teaching this course several years by now). Where I work it is *in the first year*. This hurts a lot, as my students have absolutely no way to apply any of this to any engineering problem.
Even as a pure maths student, I've rarely found it useful to have to carry out complicated computations in exam settings. Integrations methods I can even understand up to a point as it is useful for pattern matching – but I had to actually carry out the simplex method on paper in one exam, and worse, I had to memorise several variants of it with different pivoting strategies. Of course, I barely remember how that algorithm works by now.
I guess one reason why such computations are asked for in exams, though, is because it gives people something concrete to study so that if they prepare well, they can score some points. If you only have "applied knowledge" exams, they get harder. And somehow, society wants to perpetuate the notion that you can achieve anything if you put in enough effort (no matter whether that effort is actually valuable).
In an ideal world, we could instead maybe give students a high level description of the algorithm, have them carry it out explicitly on extremely simple input just to see it once, and then maybe ask them to implement the algorithm in a computer program. Sadly, at least for pure maths students, the coding skills are typically low even when there are explicit CS requirements, don't know what it's like for engineering students.
Just want to point out, this is a graduate course(although 1st year). Many students taking it are phd students, who at Michigan can do quite theoretical research. Its not uncommon to take graduate math and cs courses as an engineering phd at Michigan
I wouldn't say that, it is a mix of masters/phd. the robotics department is quite new, the overall curriculum is pretty rigorous mathematically, but also a lot of lab work.
this is more of a catch all to make sure everyone is able to handle the rest of the program, since depending on undergrad school and country, mathematical maturity seems to be a large spectrum for incoming robotics students
Having learned both the implementation details of various QR methods and seen QR pop up in proofs elsewhere I think that it can be helpful as a building block rather than useful standalone.
That being said I’ve seen lots of algorithms that are only practical given that the matrix is in a special form for decomposition/inversion/etc and if you’re trying to implement an existing algorithm, tweak it, etc. that knowledge is critical.
The answer is simple: this course is taught at a university, so they have to fill the time with _something_. And if that something is a rote method that lends itself well to testing, even better! It matters little if actual engineers in the field use it (I've done robotics for years, and knowing that Gauss-Jordan exists has helped me precisely zero times).
Modern Robotics adopts the Product of Exponentials (PoE) mathematical approach (also known as screw-theoretic) over Denavit-Hartenberg (D-H) for forward kinematics . Do robotics companies use PoE in their written software or is D-H still the practical choice?
I work for Viam[0] and while all of our actual calculations are made using Spatial Vector Algebra[1], inputs can be made in a number of different ways and supporting DH is crucial for allowing pre-existing configurations to be easily brought over to our platform. SVA can also conveniently be easily used for both kinematics and dynamics.
Personally I find SVA way more intuitive than DH or PoE.
Thanks for the reference to SVA. I found a pretty understandable set of intro slides [0] on Roy Featherstone's home page [1] under the Teaching Material section.
This is my disappointment as well. Mathematics for robotics is quite challenging due to control theory and non-linear optimization, which do not seem to be the focus of this course.
I suspect this is a prerequisite course for the ones that are more topical.
I'm not familiar with this particular department approach, but scanning the syllabus it seems like a pretty standard introductory "leveling" course you use in a graduate program taking in students from lots of different undergrad degrees.
This is a class in the Robotics program at University of Michigan, which has a long and deep history in control theory and related topics. The detail you're missing is that the control theory classes are for both Robotics students and general EE's, ME's, Aerospace Engineers, etc, so they're not going to get combined with this. The thing this lets people avoid is the EE linear systems theory class, which broadly covers the same topics, probably along with the EE intro to probability class, since that's more for full on signal processing or stochastic control. It fits in nicely with the other curriculum components, just not as a general purpose math for robotics for a non-program student.
Prof Grizzle is very much about practical learning and has a pretty good sense of humor too.
When I did undergrad at Michigan, I got the sense he was one of those exceptional profs who knows what they're doing + truly cares about the non-academics.
What is the point of explaining, say the QR factorization or iterative methods like Gauss-Jordan to engineers? I think of them as the method of integration of rational functions (the complete method, not the basic cases): totally useless today...
I understand, for example, the Jordan decomposition theorem in a Linear Algebra course because it gives a clear-cut description of a linear map and the meaning of eigenvalues/eigenspaces, etc.
Anyone cares to share? Thanks in advance.