Replies: 3 comments 3 replies
-
My question stands, but I would like to note it took 19 hours, 8 minutes to run, and the output has gone from >1 billion operations to just 3,679 FLOPs: 1,090 adds, 3,768 muls, 390 pows, and 12 sin/cos.
This is just barely usable for my case. However, adding a couple extra 'nice to have' things to my function takes the Jacobian from 1 billion terms to half a trillion, where cse() might take a year to run. Also, is there any standard way of gathering statistics about large expression trees? If not, should I clean up my function emitting these summaries and make a pull request? |
Beta Was this translation helpful? Give feedback.
-
We have a function here: https://github.com/sympy/sympy/blob/master/sympy/simplify/_cse_diff.py that you can try. Note that it is private (i.e. subject to change without deprecation). |
Beta Was this translation helpful? Give feedback.
-
Thank you!! It seems that still somehow keeps a lot of common sub expression between Jacobian terms? (I copied the However, a second pass of This functionality is very useful to me. If I can get this working satisfactorily, I would like to help make this a part of the public interface of SymPy. I've never contributed to open source before, so I'll review https://docs.sympy.org/latest/contributing/index.html |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
I have some modestly complicated robotics-style code: a couple hundred lines of code operating on scalars and 3-vectors. I want to take the Jacobian of this (3x20; 3 outputs, 20 inputs), run
cse(optimizations='basic')
on it, and ultimately emit the code into both Python and C++. This works on smaller problems, butcse()
is intractably slow for my entire task.When I walk the expression tree of my function (not its Jacobian), I get 5 million operations, not including leaf nodes like Symbol and Integer. Counting the actual operations I see 1m (1 million) adds, 2.3m muls, 0.2m pows, and 1.7m sin/cos. For example:
Running
cse(f, optimizations='basic', order='none')
on this takes 3.5 minutes and results in just 405 operations total, with 131 adds, 231 muls, 31 pows, and 12 sin/cos. It is more efficient that my carefully hand-written code, and performance is a concern. Love it!Unfortunately, computing the Jacobian of my function takes 25 minutes, and results in a tree that is 1 billion (10^9) terms long. 187 million adds, 450 million muls, 44 million pows, and 75 million sin/cos. I've left
cse(jac_f, optimizations='basic', order='none')
running for 18 hours and it still has not completed.It might yet still return in a "reasonable" length of time, but given the exponential scaling of the number of terms in the expression tree with the depth of the computation I am certainly close to the point where this becomes infeasible.
So far I have tried explicitly breaking the dependency. I used the multivariate chain rule on the Jacobian of f of g of x, and carefully prevented
f
from "seeing"g
, so I could instead optimize multiple smaller expression trees.https://wikimedia.org/api/rest_v1/media/math/render/svg/8e1b5731ed718474bd9d8fa61241cad7e0c7337a
To evaluate the Jacobian of
f(g(x))
w.r.t.x
I look at the output dimension ofg(x)
and use that to create a new ImmutableMatrix of new symbolsg_out
with no symbolic dependency onx
that I plug into the Jacobian off
so I have more, smaller (shallower) expression trees, and then docse
optimization on each individually. To actually evaluate Jac_f_g I simply do a tiny bit of semi-manual surgery to take the output ofg(x)
and plug it intoJac_f
instead ofg_out
.I've only just now experimentally implemented this idea with
lambdify(cse=True)
just to get an idea of the scaling properties ofcse
:I haven't really tested this yet, but it seems to work on a toy multivariate problem. Of course this approach will not optimize away common sub expressions shared between 2 or more of
Jac_f
,g
, andJac_g
.Is this my best path forward? Is there something I can do to make this more efficient, easier, etc? Or should I simply emit source code implementing the cse-optimized form of my function (not its Jacobian), then rely on (say) C++ templates to facilitate forward mode automatic differentiation to compute the Jacobian?
Is there a function already in Sympy which can do something analogous to my
simple_chain_rule(f, g, x)
function? Or is there something I am missing outright?Beta Was this translation helpful? Give feedback.
All reactions