What should I write a math-heavy back end in?

Waborala13 asked 9 months ago

I have an idea for a service whose back end will do quite a lot of math on very large data sets (statistics, FFT, convolutions, sound and image processing, maybe a bit of recognition, and so on). So the question is: which language should I choose?
Some time ago I found out about NumPy and really liked it. Naturally, that gave me the idea of choosing Python. An experienced colleague added fuel to the fire by saying something like “nowadays the whole web should be done in Python” (I mentally discount that for his personal sympathies).
But then I came across an article from which it follows that Python is on average an order of magnitude (!) slower than Go. This greatly shook my opinion. A factor of 2-3 could be forgiven, but an order of magnitude is a very serious reason to think. Servers are not cheap.
So I wondered whether it could be done differently: write the bulk of the back end in plain PHP (simpler and cheaper) and the most critical, math-heavy parts in Go?
On the other hand, I have read that NumPy is implemented in C/C++ inside and only exposes a Python interface. If so, might the tests in the article above not be indicative in this case?
And so as not to ask twice: are there libraries like NumPy for PHP, Go, and so on, with a similar philosophy, a comparable number of functions, and so forth? Of course, something native in C/C++ (such as FFTW) will run fastest, but I don’t want to wrestle with a large zoo of scattered libraries, trying to make them work together. I want one Swiss Army knife.

Curtis replied 9 months ago

Python is more than suitable to start with. You will write it faster and more cheaply.
Later, if the project succeeds, you can rewrite it in whatever is optimal in terms of performance and architectural consistency.
Python is a great language for writing a prototype.

9 Answers
Best Answer
Rich answered 9 months ago

Python slow at calculations!? You just don’t know how to cook it!
I have already written about this:

The fact is that Python has a unique combination of qualities:

  1. It is a general-purpose language (R and MATLAB are, after all, for a narrow audience).
  2. It is a dynamic, interpreted scripting language, which enables very fast development.
  3. NumPy gives access to vector calculations (without explicitly written loops) at speeds close to the hardware limit. A huge ecosystem of mathematical Python has grown on top of it: a whole scientific field whose size is hard to imagine (the SciKits alone comprise several dozen libraries across all areas).
  4. Cython makes it possible to add by hand the small things that NumPy may lack, in a language that is compiled like C but has Python syntax.

In other words, Python is only the dynamically typed script on top (which is what gives you development speed), while NumPy vector calculations run in compiled code close to the hardware, so rewriting them in C/C++ would gain you no more than a few percent.
In addition, for the cases where vector calculations are not enough, there is Cython: a compiled language (not inferior in performance to C/C++) with direct access to the Python objects passed in from the script.
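To make "vector calculations without explicit loops" concrete, here is a minimal sketch that computes the same quantity (the root mean square of a signal) once with a plain Python loop and once as a single NumPy expression. The function names and the test signal are made up for illustration, and the sketch makes no claim about the exact speed ratio on any particular machine.

```python
import numpy as np


def rms_loop(samples):
    """Root mean square computed with an explicit Python loop."""
    total = 0.0
    for x in samples:
        total += x * x
    return (total / len(samples)) ** 0.5


def rms_vectorized(samples):
    """The same computation as one vectorized NumPy expression."""
    return float(np.sqrt(np.mean(np.square(samples))))


# Illustrative input: one full period of a sine wave, 100,000 samples.
signal = np.sin(np.linspace(0.0, 2.0 * np.pi, 100_000, endpoint=False))

loop_result = rms_loop(signal)
vector_result = rms_vectorized(signal)
```

Both functions return the same value up to rounding; timing them with the standard `timeit` module is an easy way to see the difference on your own hardware.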

statistics, fft, convolution, sound and image processing, maybe some recognition

All of this is implemented through vector calculations or through the appropriate libraries, which are likewise not written in the scripting layer and which fall short of an ideal solution by no more than a few percent. Even if there is no ready-made library for some particular task, you always have the option of writing that small piece in Cython.
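As an illustration of the tasks named in the question, the sketch below uses only NumPy's own `np.fft` and `np.convolve` routines. The helper names (`fft_convolve`, `dominant_frequency`) and the test data are hypothetical; a real service would more likely reach for a dedicated signal-processing library.

```python
import numpy as np


def fft_convolve(a, b):
    """Linear convolution of two 1-D real arrays computed via the FFT."""
    n = len(a) + len(b) - 1  # length of the full linear convolution
    spectrum = np.fft.rfft(a, n) * np.fft.rfft(b, n)
    return np.fft.irfft(spectrum, n)


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the strongest non-DC spectral component."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    peak = int(np.argmax(magnitudes[1:])) + 1  # skip the DC bin
    return float(freqs[peak])


# Illustrative data: random noise smoothed by a 5-point moving average.
rng = np.random.default_rng(0)
noise = rng.standard_normal(1000)
kernel = np.ones(5) / 5.0
direct = np.convolve(noise, kernel)   # direct convolution
fast = fft_convolve(noise, kernel)    # same result via the FFT

# Illustrative data: one second of a 50 Hz tone sampled at 1 kHz.
sample_rate = 1000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 50.0 * t)
tone_frequency = dominant_frequency(tone, sample_rate)
```

No explicit loop over samples appears anywhere; all the heavy work happens inside NumPy's compiled routines.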

Matthew replied 9 months ago

Yeah, of course. And then two acrobat brothers take the stage, the GIL and the GC, and all the math becomes very fast 🙂

Brian replied 9 months ago

I’m sorry, but what do the GIL and GC have to do with code written in C?

Matthew replied 9 months ago

It is still called from Python code all the same.

Brian replied 9 months ago

And what prevents it from spreading the calculations across different threads? I just don’t understand the GIL very well, and I can’t find anything on whether the GIL affects libraries written in C.

Rich replied 9 months ago

1. NumPy releases the GIL for the duration of heavy calculations. The effect extends to all libraries that hand their main computational load to NumPy.
2. Cython lets you release and reacquire the GIL not only for a whole function but also around any fragment of code.
3. When you work with vectors, you do not create many small objects, so the load on the GC is close to zero.
4. In Cython, local variables live on the stack (as in C); no heap memory is allocated unless you ask for it explicitly.
All the potential performance problems were thought through long ago and have mostly been resolved.
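A minimal sketch of what point 1 permits: running NumPy work on several blocks of data from ordinary Python threads. Whether a particular NumPy call actually releases the GIL, and how much speedup you get, depends on the operation and on the NumPy build, so treat this as the pattern only, not as a benchmark. The function names and block sizes are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def block_spectrum(block):
    """Magnitude spectrum of one block of samples."""
    return np.abs(np.fft.rfft(block))


def spectra_threaded(blocks, workers=4):
    """Compute the spectrum of every block using a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so rows line up with blocks.
        return np.stack(list(pool.map(block_spectrum, blocks)))


# Illustrative data: 16 blocks of 4096 random samples each.
rng = np.random.default_rng(0)
blocks = rng.standard_normal((16, 4096))

serial = np.stack([block_spectrum(b) for b in blocks])
threaded = spectra_threaded(blocks)
```

The threaded version returns exactly the same array as the serial one; any gain comes only from the time the threads spend inside compiled code with the GIL released.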

Holland answered 9 months ago

Write everything in whatever gets you a result fastest, then look at the server load, find which parts of the software cause it, and fix those quickly. Otherwise tomorrow you will read that assembler, or something like it, beats all the other languages combined, and you will start rewriting again. This way you at least have some kind of working version in a short time. Maybe you will not need to rewrite anything at all; that happens to me often.
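One way to "look at which parts cause the load" is the standard-library profiler. The sketch below is an assumption about how that could look in practice: `handle_request` and the two workloads are hypothetical stand-ins for real service code, and only `cProfile`, `pstats`, and `io` from the standard library are used for the measurement.

```python
import cProfile
import io
import pstats

import numpy as np


def slow_python_part(values):
    """Hypothetical hot spot: sum of squares in a pure Python loop."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def fast_numpy_part(values):
    """The same sum of squares done inside NumPy."""
    return float(np.dot(values, values))


def handle_request(values):
    """Hypothetical request handler that calls both parts."""
    return slow_python_part(values), fast_numpy_part(values)


def profile_report(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


values = np.linspace(0.0, 1.0, 200_000)
result, report = profile_report(handle_request, values)
```

Printing `report` lists the functions by cumulative time, which shows where a rewrite (vectorizing, Cython, or another language) would actually pay off.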

Matthew answered 9 months ago

In C++, but better in C. We do this.

Facan answered 9 months ago

I would take Python for everything else (the web part) and C++ for the math.
Go is for network tasks; in math it is inferior to C++.

Boricurwh answered 9 months ago

C++ for the bottlenecks, and whatever is convenient for everything else.

Oliver answered 9 months ago

from which it follows that Python is on average an order of magnitude (!) slower than Go.

Well, the rumors about an order of magnitude are greatly exaggerated. 😉
Actually it is two orders of magnitude (Python 2) or more (Python 3).
Someone here advised:

In C++, but better in C. We do this.

“We do this” is, of course, a strong argument... but Go is not slower than C/C++. Well, on some kinds of tasks it is up to 2 times slower.

Noah answered 9 months ago

Digital signal processing code is best written in high-performance languages and packaged as libraries. Then you can call those libraries from any web framework, say one in Python.
In general, Fortran, Go, or Java is suitable for the back end. Fortran is mostly written by bearded old-timers, while Go and Java are for the current generation.
As an experiment, I can suggest another language, Julia, which according to its developers runs at speeds close to C. Under the hood this language calls mathematical libraries: FFTW, LAPACK, OpenBLAS, GMP, etc.

Emily answered 9 months ago

Indeed, if you use NumPy or other libraries written in C/C++, you will not be able to write anything faster in Go. The question is how well your logic fits into these C libraries.
People here write that you can now write Python modules in Go:
https://stackoverflow.com/questions/12443203/writing-a-python-extension-in-go-golang
That is, you can rewrite the slow sections of code in Go.

Christopher Haynes answered 9 months ago

I recommend taking a look at Julia. The language is comparable to MATLAB in terms of mathematics and more powerful than Python as a general-purpose language.