Aidvisors are Artificially Intelligent Advisors that evaluate risk probability and suggest preventive actions to help you better manage the risk of your financial investments.
Below is our current technology stack as of Q3 2021.
- Application Platform
- Backtesting Engine
- Machine Learning Engine
- Machine Learning Platform
- Business Intelligence Platform
- This Website
Application Platform
Linux is a lightweight and mature operating system, largely because it was modeled on the already mature UNIX family of operating systems. With Kubernetes and Docker, Linux has become even more versatile.
This application platform lets us run on-premise software and leverage cloud platforms as well. After some trial and error with the many options that exist, we decided to run our own private cluster with Ubuntu and K3s, which made deployment very Very VERY simple and reliable.
There is a lot to say about databases. Briefly, if you screw up when you choose your database, you end up with a slow system that doesn’t scale. Speed was key for us because there’s a lot of data to process, store and restore.
We compared MongoDB with traditional relational databases and the performance gain was huge. With MongoDB and a few other enhancements, we were able to increase the system speed by a factor of 600. So we stuck with MongoDB and added Redis over time. Nowadays, we wonder whether PostgreSQL would be a better fit, but we haven’t benchmarked it yet.
Backtesting Engine
There is a great article titled “Should you build your own backtester?” In our case we did, and it was a good decision because it made it easy to integrate with custom data structures and produce any financial indicator we want. But once again, speed was key for us. We benchmarked our backtesting engine against other backtesting software and it was faster by a factor of 20.
Briefly, we developed our backtester in C# to get speed without having to manage memory. The backtester is based on vectorization, similar to AmiBroker. We developed a Python wrapper on top of Mono to create an ultra-fast cross-platform backtester. In fact, it’s faster to rerun trading algorithms than to store their results in the database, which helps with database scalability because we store less data.
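To illustrate what “based on vectorization” means in practice, here is a minimal Python sketch that computes a moving-average crossover over a whole price series at once, column by column, instead of stepping through an event loop bar by bar. This is not the C# engine described above: the strategy, the function names (`sma`, `backtest_crossover`) and the sample prices are made up for illustration, and plain lists are used so the sketch runs without dependencies.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average for every bar; None until the window is full."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= window:
            running -= prices[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def backtest_crossover(prices: List[float], fast: int, slow: int) -> List[float]:
    """Return the equity curve (starting at 1.0) of a long-only SMA crossover."""
    fast_ma = sma(prices, fast)
    slow_ma = sma(prices, slow)
    # Signal column: 1 when the fast average is above the slow one, else 0.
    signal = [
        1 if f is not None and s is not None and f > s else 0
        for f, s in zip(fast_ma, slow_ma)
    ]
    # Position column: the previous bar's signal, to avoid look-ahead bias.
    position = [0] + signal[:-1]
    # Return column: bar-to-bar percentage change.
    returns = [0.0] + [b / a - 1.0 for a, b in zip(prices, prices[1:])]
    equity, curve = 1.0, []
    for pos, ret in zip(position, returns):
        equity *= 1.0 + pos * ret
        curve.append(equity)
    return curve


# Hypothetical sample data, one price per bar.
prices = [10.0, 10.0, 10.0, 11.0, 12.0, 13.0, 12.0]
curve = backtest_crossover(prices, fast=1, slow=3)
```

Because every intermediate result is a full column, a rerun only needs the raw prices and the parameters, which is in the spirit of recomputing results rather than storing them.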
Machine Learning Engine
What mattered most to us with machine learning was to remain independent and flexible, and to be able to use the latest algorithms as soon as they are available. The best fit was the Python programming language, with its great AI library ecosystem and strong community support.
For most machine learning models we use the Python library scikit-learn. For deep learning, we use TensorFlow and its Python API. We also use other Python libraries such as XGBoost. For hyperparameter optimization we use the Python library scikit-optimize.
Machine Learning Platform
The glue between all the components above is what we call the machine learning platform. It lets us run the machine learning pipelines, collect the results in the database and visualize their performance.
At the time we started, there was nothing nearly as complete as Kubeflow, and we still hesitate to switch because many of our requirements are not supported by Kubeflow.
We developed a simple Python framework that lets us create projects, phases, tasks and operations, all of which are stored and restored as JSON. It can resume a project anytime, anywhere, without having to recompute everything. It also provides metrics for every micro-task that is run. The metrics cover both system performance and financial performance. The system performance metrics were key to improving runtime efficiency.
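The resume-from-JSON idea can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the actual framework: the `run_project` function, the state layout and the task names are hypothetical, and the sketch stops at the task level instead of modeling operations.

```python
import json
import os
import tempfile
import time
from typing import Callable, Dict


def run_project(path: str, phases: Dict[str, Dict[str, Callable[[], float]]]) -> dict:
    """Run every task once; state is reloaded from and saved to a JSON file."""
    state = {"phases": {}}
    if os.path.exists(path):
        with open(path) as fh:
            state = json.load(fh)  # resume from a previous run
    for phase_name, tasks in phases.items():
        phase = state["phases"].setdefault(phase_name, {})
        for task_name, func in tasks.items():
            if phase.get(task_name, {}).get("status") == "done":
                continue  # already computed, do not recompute
            start = time.perf_counter()
            result = func()
            phase[task_name] = {
                "status": "done",
                "result": result,  # stand-in for a financial metric
                "seconds": time.perf_counter() - start,  # system metric
            }
            with open(path, "w") as fh:
                json.dump(state, fh, indent=2)  # checkpoint after each task
    return state


calls = []  # records which tasks actually executed


def make_task(name: str, value: float) -> Callable[[], float]:
    def task() -> float:
        calls.append(name)
        return value
    return task


phases = {
    "features": {"load_prices": make_task("load_prices", 1.0)},
    "training": {"fit_model": make_task("fit_model", 0.5)},
}

state_path = os.path.join(tempfile.mkdtemp(), "project.json")
first = run_project(state_path, phases)
second = run_project(state_path, phases)  # resumes: nothing is rerun
```

After the second call, `calls` still holds only two entries, because every task was already marked as done in the JSON file.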
The tasks are distributed through a very simple queue system called MRQ, which matched our tech stack beautifully and, so far, has been good enough for debugging when required. A single run of one machine learning pipeline over 20 years of data normally ends up producing more than 500,000 micro-tasks. Debugging that many tasks for performance and accuracy is not easy, but MRQ made it simple enough.
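To give a feel for distributing micro-tasks and timing each one, here is a small sketch built only on Python’s standard `queue` and `threading` modules. It does not use MRQ or its API; the task identifiers, the `run_micro_tasks` function and the squaring “work” are placeholders chosen for illustration.

```python
import queue
import threading
import time
from typing import Dict, List, Tuple


def run_micro_tasks(tasks: List[Tuple[str, int]], workers: int = 4) -> Dict[str, dict]:
    """Square each task's input on a worker thread; record result and duration."""
    todo: "queue.Queue[Tuple[str, int]]" = queue.Queue()
    for task in tasks:
        todo.put(task)
    results: Dict[str, dict] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                task_id, value = todo.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            start = time.perf_counter()
            outcome = value * value  # stand-in for real work
            with lock:
                results[task_id] = {
                    "result": outcome,
                    "seconds": time.perf_counter() - start,
                }

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical micro-tasks: one per month of a single year.
tasks = [(f"2001-{month:02d}", month) for month in range(1, 13)]
results = run_micro_tasks(tasks)
# Per-task timings make it easy to find the slowest micro-task.
slowest = max(results, key=lambda task_id: results[task_id]["seconds"])
```

Keeping a duration next to every result is what makes it practical to hunt for slow tasks among hundreds of thousands of them.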
Business Intelligence Platform
We created our own simple BI platform to visualize all the performance and financial metrics created by the machine learning platform.
It’s a simple ReactJS application that runs in our private Kubernetes cluster and leverages the flexibility of the awesome React-Chart-Editor from Plotly. Getting data from MongoDB to build custom charts is really easy with that React component.
This Website
And to finish off the tech stack: WordPress. Yes, the website is built with that good old buddy. Using ReactJS for a simple website is just too much work for a business that has to focus on artificial intelligence.
Essentially, a bunch of JSON files are transferred automatically to the WordPress instance, one way only. The metrics from the JSON files are displayed with PlotlyJS charts. The rest is just traditional WordPress. Thank you, WordPress.
The good news is that this makes the website entirely independent from the machine learning pipelines. Because data only flows one way, from the pipelines to the website, the pipelines cannot be reached, let alone hacked, through the public website. That’s how we offer our customers a system that provides exclusive and privileged information from a unique and secure machine learning platform.
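The export side of that one-way flow can be sketched as follows. This is an assumption-laden illustration, not our actual export job: the `export_metrics` function, the file name and the metric fields are hypothetical, and the transfer to the WordPress instance itself is left out.

```python
import json
import os
import tempfile


def export_metrics(metrics: dict, export_dir: str, name: str) -> str:
    """Write metrics to <export_dir>/<name>.json atomically and return the path."""
    os.makedirs(export_dir, exist_ok=True)
    target = os.path.join(export_dir, f"{name}.json")
    # Write to a temp file first so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=export_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(metrics, fh)
    os.replace(tmp_path, target)
    return target


# Hypothetical metrics in a shape that a chart could consume directly.
export_dir = tempfile.mkdtemp()
metrics = {"dates": ["2021-07-01", "2021-07-02"], "risk": [0.12, 0.15]}
path = export_metrics(metrics, export_dir, "risk_probability")

with open(path) as fh:
    loaded = json.load(fh)
```

The website only ever reads files like this one, so it never needs credentials for, or a network route to, the pipelines that produced them.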
This is it for now!