ENH: Updated and refactored Mathieu function implementation (part 1).#144
ENH: Updated and refactored Mathieu function implementation (part 1).#144steppi wants to merge 105 commits into
Conversation
main.c to test_mathieu.cpp in the tests dir. I ran clang-format on all .h files.
functions to replace duplicated code.
and that's what I have.
fix the failed builds in CI.
and lapack which are requred for Mathieu fcns. Also fix the same build problem when using pixi. Also rerun clang-format on all files.
as well as the entirety of scipy.
|
Oh wow, GPU CI actually ran after I updated the ubuntu runner to 24.04. Tests are failing, but I think will work if we update CuPy in the pixi.lock file. I'll make a separate PR for that. |
|
The formatting errors seem to be because the clang-format version was updated when I updated the pixi.lock file. Let's leave them here and fix them in another PR. |
|
Great to see this get merged in! |
| if (q > 1.0) { | ||
| double qq = std::log10(q); // I need to use size of q to compute N. | ||
| N = m + 25 + 10 * qq; | ||
| } else { | ||
| return cem_cva(m, -q); | ||
| N = m + 25; |
There was a problem hiding this comment.
If we think of the very negative if (std::abs(q) > 1.0) instead?
There was a problem hiding this comment.
I think you may be right. @brorson, can you comment on this?
There was a problem hiding this comment.
I think that's right. The point here is that we need more terms in the sum when abs(q) > 1. So, yes, if (std::abs(q) > 1.0) is correct. Don't forget to use std::log10(std::abs(q)) on line 167 too.
| // This makes sure the fcn has the right overall sign for q<0. | ||
| // Someday combine this with the above sums into the same for loop. | ||
| double x = 0.0; | ||
| for (decltype(N) l = 0; l < N; l++) { | ||
| x += X(l); |
There was a problem hiding this comment.
I understand that if X is an eigenvector, then so is -X but can you detail the rule "if the coefficients mostly sum negative, we flip the results" implemented here? I guess the coefficients of X(l) can have any sign in this framework (unless I missed something), so if the sum is very close to 0, can this be unstable? is there a better way?
There was a problem hiding this comment.
This is a very good question and I think you've unearthed a potential bug. dstevd will return the eigenvectors normalized to have
so I think to handle FuncParity odd, the above should be replaced with:
if constexpr (FuncParity == Parity::Even) {
x += X(l);
} else {
double r = sqrt_di<Parity::Odd, OrderParity>(l);
x += r * X(l);
}
@brorson, does that sound reasonable?
Details
For
which implies that the sign conventions I mentioned above hold for
the ambiguity of sign being resolved by (28.2.29) when q=0 and by continuity for the other values of q.
It wasn't obvious to me at first, but suppose for instance there is some
There was a problem hiding this comment.
As for stability, the sum of coefficients is 0 at roots, and calculating these Mathieu functions seems to be inherently unstable near roots (or roots of the derivative). I found even the arbitrary precision Mathematica reference struggles near roots. We still have a major improvement over Zhang and Jin, so I think it's fine if there's still some rough edges.
The above was nonsense. Sorry. The sum of coefficients is zero implies mathieu_cem is identically zero for all x since the function and it's derivative would be 0 at x = 0 and
I don't think this situation can occur for any values of m or q.
There was a problem hiding this comment.
Yes, I think sum(k*B[k]) is reasonable. Go ahead and change it. I hope to have students run my test suite on the results at some point so if your idea doesn't work then we will pick it up.
Done. If I can do anything to help with the review process or merge let me know. Once you have a stable branch please tell me and I can check it out and run some of my other tests on it. |
Thanks @brorson! I think this is getting into good shape now. I'll ping you again if we need more help. I will also run out some more tests against Mathematica's implementation. |
Co-authored-by: Florian Bourgey <bourgeyflorian@gmail.com>
Co-authored-by: Florian Bourgey <bourgeyflorian@gmail.com>
|
@fbourgey, I think that should address everything from your first pass. |
Co-authored-by: Florian Bourgey <bourgeyflorian@gmail.com>
|
@fbourgey, can you give this one last look? |
fbourgey
left a comment
There was a problem hiding this comment.
This LGTM, modulo with last comments.
|
|
||
| for (decltype(v.size()) i = 0; i < v.size(); i++) { | ||
| double out, out_diff; | ||
| xsf::mathieu::sum_fourier_series<Even, Even>(AA_span, v[i], out, out_diff); |
There was a problem hiding this comment.
Do we need a test for AngleUnitPolicy::Degrees as well?
There was a problem hiding this comment.
Yeah, we can add some if you like.
| @@ -0,0 +1,230 @@ | |||
| /* Less accurate (but faster) legacy versions of Mathieu functions. */ | |||
There was a problem hiding this comment.
what is the plan for this file? what about backward-compatibility?
should we not keep old tests for smoke coverage?
There was a problem hiding this comment.
This file remains for now because it's still being used for the modified (radial) mathieu functions, and as a large order fallback for mathieu_cv. Once the modified mathieu functions are refactored like this, and we figure out something else for the fallback, then we can get rid of the file. We can't really keep the old tests because they rely on computing the functions from end to end and xsf doesn't offer that capability. The full implementations for CPU live in SciPy and rely on kernels from here that the CuPy implementations will also use. All of those old test cases (using degrees as angular unit) remain in SciPy.
Reference issue
Partially supersedes #99
What does this implement/fix?
This PR is based on @brorson's PR #99, which gives improved implementations of the Mathieu functions. Algorithmically, #99 is excellent work. This is a promised refactor in order to put the work in #99 into the form that would be most useful for xsf and SciPy. The primary departure is that this PR does not give full implementations of Mathieu functions; it only contains the core numerical kernels that would be useful in either a CPU implementation for SciPy or a GPU implementation for CuPy. The primary functions are for generating the recurrence matrices and for summing the Fourier series. Generating the Fourier coefficients requires finding eigenvectors of symmetric tridiagonal recurrence matrices. #99 used LAPACK's dstevd for this, which is out of scope for
xsf. I do not rule out eventually writing a CPU+GPU capable symmetric tridiagonal eigenvector solver, but even then, it would still not be particularly useful for SciPy and CuPy to have full scalar implementations inxsf. For SciPy implementations, it is best to use the stateful-functor with in-ufunc caching pattern established here scipy/scipy#24894, in order to allow reuse of the same computed fourier coefficients across varying values ofx. For CuPy, we will want to use the gufunc expansion + reduction pattern proposed here cupy/cupy#9839.The vision I'm coming to is that xsf will contain simple numerical kernels which will work on both CPU and GPU. If a function is simple enough for the same scalar implementation to work on both, then a full implementation belongs in xsf. Otherwise, xsf should contain the kernels that can be common to both platforms and CPU specific implementations can live in SciPy and GPU specific implementations in CuPy.
This PR only contains kernels used for the characteristic value functions
mathieu_aandmathieu_band the Mathieu functionsmathieu_cemandmathieu_semfrom SciPy. I plan to addressmathieu_modcem1,mathieu_modcem2,mathieu_modsem1, andmathieu_modsem2in an eventual follow-up. This is primarily to cut down the scope of work in this PR. There are also some redundant computations in the Fourier-Bessel summations from #99 that we can eliminate by pushing through #50, which is worth doing on its own as well.I have used extensive templating to cut down on the duplicate code in #99. The implementations here should otherwise be completely faithful to #99, with the exception that I have added an
AngleUnitspolicy template arg to the Fourier series summation so that the angular units can be taken to be either degrees or radians. For backwards compatibility, the companion PR to SciPy still takes arguments in degrees. There is an easy path forward to adding a kwarg to the SciPy functions to select between radians and degrees.The tests here test only the recurrence matrix generation and the fourier summation. The tests for fourier summation use coefficients generated by the SciPy implementation and compare the evaluated sums to reference implementations of Mathieu functions from Mathematica. I've discovered the Wolfram Engine Free Developers program, and used the Python bindings to get reference values. This seems like something that will be useful for xsref too, which has been in a languishing state, but which I still have high hopes for.
I've removed the old Mathieu tests, which were self tests against the inaccurate SciPy code anyway.
Additional information
After cleaning up the stuff adding BLAS and LAPACK to the dependencies, the rebuild of pixi.toml failed. This is because the cupy-tests environment was included in
solve-group = "default", but because the CuPy tests are Linux only and pull down things not available on other platforms, this caused solver errors. I simply removed the CuPy environment from that solve group.AI Generation Disclosure
The test code in
xsf/tests/xsf_tests/test_mathieu.cppwas largely AI generated with Google Gemini 3.1 Pro. I checked carefully that everything is correct. The implementation code is all good-old-fashioned artisanal hand-crafted code; this was too fun to delegate to AI.