Many real-world applications—from autonomous driving to robotic manipulation—require solving similar optimization problems sequentially under strict runtime constraints. The performance of local optimizers is highly sensitive to the initial solution: poor initialization can lead to slow convergence or convergence to suboptimal local minima.
We propose MISO (Learning Multiple Initial SOlutions), a framework that trains a single neural network to predict multiple diverse initial solutions for optimization problems. Unlike ensemble methods that require training and running multiple models, MISO uses a single efficient network with diversity-promoting training objectives.
Single-optimizer setting: Select the most promising initial solution using a selection function, then run a single optimizer.
Multiple-optimizers setting: Initialize multiple parallel optimizers, each with a different predicted solution, and select the best final output.
Guaranteed improvement: By including the default initialization among the predictions, MISO is guaranteed to perform at least as well as the baseline.
Efficient inference: Unlike ensembles, MISO's inference time remains nearly constant as $K$ increases.
Figure 1: Prior work (top) predicts a single initial solution from problem parameters. MISO (middle, bottom) predicts multiple diverse initial solutions using a single neural network. In the single-optimizer setting, a candidate evaluator selects the best initial solution. In the multiple-optimizers setting, all solutions initialize parallel optimizers, with the best final output selected.
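The multiple-optimizers setting, and the reason that including the default initialization cannot hurt, can be sketched on a toy problem. This is a minimal illustration and not the authors' implementation: the 1-D `cost`, its gradient, the gradient-descent `optimize` routine, and the hard-coded `candidates` list (standing in for the network's $K$ predictions) are all hypothetical.

```python
# Toy sketch of the multiple-optimizers setting. All names are illustrative
# assumptions; the candidate list stands in for the network's K predictions.

def cost(x):
    # Hypothetical 1-D cost: local minimum near x = -1, global minimum near x = +1.
    return (x * x - 1.0) ** 2 - 0.3 * x

def grad(x):
    # Analytic derivative of the toy cost above.
    return 4.0 * x * (x * x - 1.0) - 0.3

def optimize(x0, steps=200, lr=0.05):
    # Plain gradient descent as a stand-in for a local optimizer.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

default_init = -0.9                     # e.g. a warm start sitting near the worse mode
candidates = [default_init, 0.2, 1.1]   # default initialization included among the K candidates

baseline = optimize(default_init)                 # baseline: one optimizer, default init
finals = [optimize(x0) for x0 in candidates]      # run one after another here; parallel in practice
best = min(finals, key=cost)                      # keep the best final output

print(f"baseline cost: {cost(baseline):.3f}, best-of-K cost: {cost(best):.3f}")
```

Because the default initialization is one of the candidates and the toy optimizer is deterministic, the baseline's final solution is among `finals`, so the selected output can never have a higher cost than the baseline.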
Motivation
Real-World Scenario: Autonomous Driving
Consider an autonomous vehicle following a planned trajectory. Suddenly, a pedestrian steps onto the road, causing the high-level planner to abruptly change the reference path. The previous trajectory (warm-start) is now far from optimal—the optimizer must find a completely different solution within milliseconds. Similar abrupt changes occur with traffic light switches, sudden lane closures, or newly detected obstacles.
Traditional initialization approaches fail in these scenarios:
Warm-start: Reuses the previous solution, which becomes invalid after abrupt changes.
Single-output regression: Learns to predict the mean of multiple modes, often corresponding to a local minimum.
Ensemble of regressors: Each model is biased toward the same mean, failing to cover multiple modes.
Figure 2: Left: A cost function $c(x)$ with global minima at A and C and a local minimum at B. Right: Predicted initial solutions for different methods. Single-output regression and ensembles converge to the mean (local minimum B), while MISO with winner-takes-all and mixture losses successfully predicts both global optima.
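Why single-output regression collapses to the mean can be seen from a short derivation. This is a sketch under simplifying assumptions that the text does not state: a scalar solution, a squared-error training loss, and two modes $x_A$ and $x_C$ that are equally likely for the same problem parameters.

```latex
\begin{aligned}
\hat{x}^{\star}
  &= \arg\min_{\hat{x}} \; \tfrac{1}{2}(\hat{x} - x_A)^2 + \tfrac{1}{2}(\hat{x} - x_C)^2, \\
0 &= (\hat{x}^{\star} - x_A) + (\hat{x}^{\star} - x_C)
  \;\;\Longrightarrow\;\;
  \hat{x}^{\star} = \tfrac{1}{2}(x_A + x_C).
\end{aligned}
```

Under these assumptions the loss-minimizing prediction is the midpoint of the two modes rather than either mode, which in the example of Figure 2 coincides with the local minimum B.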
Key Insight: By explicitly training for diversity using winner-takes-all or mixture losses, MISO learns to cover multiple modes of the solution distribution, enabling the optimizer to reach global optima regardless of which mode is active.
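A winner-takes-all objective can be sketched as follows. This uses the common form of the loss, in which only the prediction closest to the target is penalized; the exact loss used by MISO is not given in this text, so the squared-error choice, the function names, and the toy training loop are assumptions. Two free scalars stand in for the outputs of a network.

```python
# Minimal winner-takes-all (WTA) sketch with K = 2 scalar outputs.
# Assumption: squared error, and only the closest output is updated.

def wta_loss(predictions, target):
    """Return (loss, winner index): squared error of the closest prediction."""
    errors = [(p - target) ** 2 for p in predictions]
    winner = min(range(len(errors)), key=errors.__getitem__)
    return errors[winner], winner

def train_wta(targets, init, lr=0.1, epochs=50):
    """Gradient descent on the WTA loss; free parameters replace a network."""
    preds = list(init)
    for _ in range(epochs):
        for t in targets:
            _, k = wta_loss(preds, t)
            # Gradient of (p_k - t)^2 with respect to the winning output only.
            preds[k] -= lr * 2.0 * (preds[k] - t)
    return preds

# Two equally frequent modes at -1 and +1 (hypothetical data).
trained = train_wta(targets=[-1.0, 1.0], init=[-0.1, 0.1])
print(trained)
```

In this toy run each output specializes on one mode instead of both drifting to the mean, because an output receives gradient only from the targets it is already closest to.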
Method
Given a parameter vector $\boldsymbol{\psi}$ that defines a problem instance (e.g., current state, goal, obstacles), our multi-output predictor outputs $K$ candidate initial solutions, $\hat{\boldsymbol{x}}_{1}^{\mathrm{init}}, \dots, \hat{\boldsymbol{x}}_{K}^{\mathrm{init}}$.
In the single-optimizer setting, a selection function $\Lambda$ chooses the most promising initial solution. A natural choice is selecting the candidate that minimizes the objective: $\Lambda := \arg\min_{k} J(\hat{\boldsymbol{x}}_{k}^{\mathrm{init}}; \boldsymbol{\psi})$. Alternative criteria include constraint satisfaction, robustness measures, or domain-specific requirements.
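The selection step can be sketched in a few lines. This is an illustrative sketch only: `select_initial_solution`, the optional `is_feasible` predicate (one possible way to express a constraint-satisfaction criterion), and the toy objective are hypothetical names, not part of MISO's published interface.

```python
# Sketch of a selection function: pick the candidate with the lowest objective,
# optionally restricted to candidates that satisfy a feasibility check.

def select_initial_solution(candidates, objective, is_feasible=None):
    """Return the lowest-objective candidate, preferring feasible ones if any exist."""
    pool = candidates
    if is_feasible is not None:
        feasible = [c for c in candidates if is_feasible(c)]
        if feasible:          # fall back to all candidates if none is feasible
            pool = feasible
    return min(pool, key=objective)

# Toy problem instance: psi is a scalar goal, J is the squared distance to it.
psi = 2.0
J = lambda x: (x - psi) ** 2
candidates = [0.0, 1.5, 2.2, 3.0]

by_cost = select_initial_solution(candidates, J)
by_cost_feasible = select_initial_solution(candidates, J, is_feasible=lambda x: x <= 2.0)
print(by_cost, by_cost_feasible)
```

Selecting purely by objective picks the candidate closest to the goal, while adding the hypothetical constraint `x <= 2.0` changes the choice to the best candidate that satisfies it.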
Experiments
We evaluate MISO across three optimal control benchmark tasks, each using a different local optimization algorithm:
Cart-pole: Box-DDP optimizer
Reacher: MPPI optimizer
Driving (nuPlan): iLQR optimizer
Main Results
Single Optimizer Setting
Table 1: Mean cost (lower is better) for a single optimizer with different initialization methods, under one-off and sequential optimization. The best result in each column is in bold.

| Method | K | One-off: Reacher | One-off: Cart-pole | One-off: Driving | Sequential: Reacher | Sequential: Cart-pole | Sequential: Driving |
|---|---|---|---|---|---|---|---|
| Warm-start | 1 | 13.48 | 11.69 | 283.86 | 13.48 | 11.69 | 283.86 |
| Regression | 1 | 13.40 | 11.19 | 74.23 | 19.56 | 6.18 | 70.62 |
| Ensemble | 32 | 13.39 | 10.94 | 47.22 | 8.40 | 3.55 | 52.59 |
| MISO WTA | 32 | 13.36 | **10.48** | **30.17** | 2.72 | 0.83 | **30.75** |
| MISO mix | 32 | **12.74** | **10.48** | 33.95 | **2.44** | **0.79** | 33.38 |
Key Results: MISO WTA and MISO mix outperform all baselines on every task, in both one-off and sequential optimization. On the challenging driving task under sequential optimization, MISO WTA reduces mean cost by 89% relative to warm-start (283.86 to 30.75) and by 42% relative to the ensemble (52.59 to 30.75).
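The quoted reductions can be checked directly against the sequential driving column of Table 1. The helper name `percent_reduction` is only for illustration; the three mean costs are those of warm-start, the ensemble, and MISO WTA.

```python
# Check the reported relative cost reductions on the driving task
# (sequential optimization), using the mean costs from Table 1.

def percent_reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline

warm_start, ensemble, miso_wta = 283.86, 52.59, 30.75

vs_warm_start = percent_reduction(warm_start, miso_wta)
vs_ensemble = percent_reduction(ensemble, miso_wta)

print(round(vs_warm_start))  # 89
print(round(vs_ensemble))    # 42
```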
Scaling with Number of Initial Solutions
Figure 3: Mean cost on the driving task (sequential optimization) for varying $K$ values, with (a) a single optimizer and (b) multiple optimizers. MISO scales effectively with the number of predicted initial solutions.
Inference Time Comparison
Figure 4: Mean inference time as $K$ increases. MISO (multi-output) maintains nearly constant inference time, while ensemble inference time grows linearly with $K$.
Mode Frequency Analysis
Figure 5: Heatmap of output selection frequency for MISO WTA on the driving task. All outputs remain active, demonstrating that MISO effectively utilizes all predicted initial solutions.
Qualitative Results
Figure 6: Left: Single optimizer for the driving task. When the high-level planner abruptly modifies the reference path, warm-start and regression fail to adapt, while MISO closely follows the updated reference. Right: MISO's multiple initial solutions for cart-pole capture different modes.
Citation
@article{sharony2024learningmultipleinitialsolutions,
title={Learning Multiple Initial Solutions to Optimization Problems},
author={Elad Sharony and Heng Yang and Tong Che and Marco Pavone and Shie Mannor and Peter Karkus},
journal={arXiv preprint arXiv:2411.02158},
year={2024},
}