Putting it mildly! Threading bugs are probably the worst class of bugs to debug
Definitely debatable if this is worth the risk of impossible bugs. Python is very slow, and multi threading isn’t going to change that. 4x extremely slow is still extremely slow. If you care remotely about performance you need to use a different language anyway.
Python can be extremely slow, it doesn’t have to be. I recently re-wrote a stats program at work and got a ~500x speedup over the original python and a 10x speed up over the c++ rewrite of that. If you know how python works and avoid the performance foot-guns like nested loops you can often (though not always) get good performance.
Unless the C++ code was doing something wrong there’s literally no way you can write pure Python that’s 10x faster than it. Something else is going on there. Maybe the c++ code was accidentally O(N^2) or something.
In general Python will be 10-200 times slower than C++. 50x slower is typical.
Nope, if you’re working on large arrays of data you can get significant speed ups using well optimised BLAS functions that are vectorised (numpy) which beats out simply written c++ operating on each array element in turn. There’s also Numba which uses LLVM to jit compile a subset of python to get compiled performance, though I didnt go to that in this case.
You could link the BLAS libraries to c++ but its significantly more work than just importing numpy from python.
Numba is interesting… But a) it can already do multithreading so this change makes little difference, and b) it’s still not going to be as fast as C++ (obviously we don’t count the GPU backend).
Putting it mildly! Threading bugs are probably the worst class of bugs to debug
Definitely debatable if this is worth the risk of impossible bugs. Python is very slow, and multi threading isn’t going to change that. 4x extremely slow is still extremely slow. If you care remotely about performance you need to use a different language anyway.
Python can be extremely slow, it doesn’t have to be. I recently re-wrote a stats program at work and got a ~500x speedup over the original python and a 10x speed up over the c++ rewrite of that. If you know how python works and avoid the performance foot-guns like nested loops you can often (though not always) get good performance.
Unless the C++ code was doing something wrong there’s literally no way you can write pure Python that’s 10x faster than it. Something else is going on there. Maybe the c++ code was accidentally O(N^2) or something.
In general Python will be 10-200 times slower than C++. 50x slower is typical.
Nope, if you’re working on large arrays of data you can get significant speed ups using well optimised BLAS functions that are vectorised (numpy) which beats out simply written c++ operating on each array element in turn. There’s also Numba which uses LLVM to jit compile a subset of python to get compiled performance, though I didnt go to that in this case.
You could link the BLAS libraries to c++ but its significantly more work than just importing numpy from python.
Numpy is written in C.
Numba is interesting… But a) it can already do multithreading so this change makes little difference, and b) it’s still not going to be as fast as C++ (obviously we don’t count the GPU backend).