Mar 24, 2025
Introduction
Hello, everyone. I’m Hideaki Imamura, a committer for Optuna. Optuna is a powerful black-box optimization framework that can be executed as a standalone program. It also supports distributed optimization using multiple processes with an RDB server. However, when thousands of processes access an RDB server, it can impose a heavy load on the server, potentially causing slowdowns. To address this issue in large-scale distributed optimization, the latest Optuna v4.2.0 release introduced a feature called the gRPC Storage Proxy.
In this article, I will explain how to perform distributed optimization using Optuna and introduce the gRPC Storage Proxy, which enables large-scale distributed optimization.
TL;DR
Let’s explore the following simple Optuna code and try executing distributed optimization. This code can be executed as a single program, but it can also be used for distributed optimization without any modifications.
To perform distributed optimization using Optuna, an RDB server is required so that multiple processes can reference the optimization history. In the above code, the MySQL connection URL is passed to the storage argument of the optuna.create_study function. Before running the code, start an RDB server using MySQL on your local PC as follows. If necessary, refer to the installation guide to install MySQL.
mysql -u root -e "CREATE DATABASE IF NOT EXISTS example"
Executing distributed optimization is straightforward. Open multiple terminals on your PC. Then, run the same command in each terminal. If the code is saved in foo.py, simply execute python foo.py in each terminal. By referencing data stored in the RDB server, distributed optimization can be achieved using the same code as a single program. Below are the actual outputs from two terminal sessions:
$ python foo.py
[I xxx] Trial 0 finished with value: 45.35553104173011 and parameters: {'x': 8.73465151598285}. Best is trial 0 with value: 45.35553104173011.
[I xxx] Trial 2 finished with value: 4.6002397305938905 and parameters: {'x': 4.144816945707463}. Best is trial 1 with value: 0.028194513284051464.
…
$ python foo.py
[I xxx] Trial 1 finished with value: 0.028194513284051464 and parameters: {'x': 1.8320877810162361}. Best is trial 1 with value: 0.028194513284051464.
[I xxx] Trial 3 finished with value: 24.45966755098074 and parameters: {'x': 6.945671597566982}. Best is trial 1 with value: 0.028194513284051464.
…
What is the gRPC Storage Proxy?
While distributed optimization in Optuna is simple with an RDB server, as Optuna expands from hyperparameter optimization in machine learning to areas like material discovery, large-scale and computationally intensive optimizations have emerged. For example, at Preferred Networks, in simulations for material discovery, optimization is performed on studies running tens to hundreds of parallel processes, with up to hundreds of thousands of trials per study. As a result, thousands of workers generate a heavy load on the RDB server, and query response times become a bottleneck.
The gRPC Storage Proxy, introduced in Optuna v4.2.0, is designed for large-scale distributed optimization. It acts as an intermediary between optimization workers and the RDB server, relaying Storage API calls. In environments with hundreds or thousands of workers, deploying a gRPC Storage Proxy for every few dozen workers significantly reduces the load on the RDB server, which is otherwise a single point of failure. Additionally, the proxy manages Study and Trial information as a shared cache for its workers, which reduces the load further.
Figure: A concept of gRPC storage proxy
Usage and Benchmarking
To use the gRPC Storage Proxy for distributed optimization, a proxy server must be set up between the optimization workers and the RDB server. The optuna.storages.run_grpc_proxy_server function provides this capability. Below is an example of setting up a proxy server on port 13000 for a MySQL RDB server running on localhost.
To utilize the proxy server, update the optimization worker code by passing optuna.storages.GrpcStorageProxy as the storage argument in optuna.create_study. Below, localhost and port 13000 are used to specify the proxy server.
By changing the port number, multiple proxy servers can be used. Additionally, specifying a different hostname allows the use of proxy servers across networks. In environments managed by Kubernetes, for instance, a proxy server can be deployed per node, with multiple containers within a pod accessing the proxy server, reducing the load on the RDB server.
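One way to spread workers over several proxies is to derive each worker's proxy from its index. The helper below is a hypothetical illustration, not part of Optuna; the proxy list, the ports, and the group size of 10 are assumptions.

```python
def proxy_for_worker(worker_index, proxies, workers_per_proxy=10):
    """Return the (host, port) of the proxy that serves the given worker."""
    if worker_index < 0:
        raise ValueError("worker_index must be non-negative")
    # Consecutive blocks of workers share one proxy; wrap around if there
    # are more worker blocks than proxies.
    return proxies[(worker_index // workers_per_proxy) % len(proxies)]


# Hypothetical layout: three proxies on one host, distinguished by port.
proxies = [("localhost", 13000), ("localhost", 13001), ("localhost", 13002)]
host, port = proxy_for_worker(25, proxies)  # worker 25 falls in the third block
```

The resulting host and port can then be passed to optuna.storages.GrpcStorageProxy when the worker creates its study.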
To validate the effectiveness of the gRPC Storage Proxy, we conducted a performance comparison. With 300 workers, we deployed one proxy server for every 10 workers and executed 10,000 trials using RandomSampler. The following table shows the average processing time per worker:
Note that in scenarios where the MySQL server can handle the worker load efficiently, the gRPC Storage Proxy may introduce performance overhead, resulting in slower execution. In single-worker cases, we observed a 1.2–1.3x slowdown. The gRPC Storage Proxy is specifically designed for large-scale distributed optimization.
Conclusion
Optuna enables easy distributed optimization. For large-scale distributed optimization, using the gRPC Storage Proxy helps reduce the load on the RDB server. We will continue expanding features for large-scale distributed optimization. Give Optuna’s distributed optimization and gRPC Storage Proxy a try!
I wish you a great black-box optimization life!