class Dynamic
Description
Lower-level model optimization procedure
Implements the LL-problem optimization procedure of two explicit gradient-based methods (EGBMs) under the lower-level singleton (LLS) assumption, the Reverse-mode AutoDiff method (RAD) [1] and the Truncated RAD method (T-RAD) [2], as well as two methods without the LLS assumption, Bi-level Descent Aggregation (BDA) [3] and the Initialization Auxiliary and Pessimistic Trajectory Truncated Gradient method (IAPTT-GM) [4].
Note that this procedure is a mixture of the four methods above, rearranged as two optimization methods (RAD and BDA), two truncation auxiliary methods (T-RAD and PTT), and one initialization auxiliary method (IA). The two optimization methods are mutually exclusive, as are the two truncation methods; apart from that, the methods can be combined freely (see the sketch below).
The implemented LL optimization procedure optimizes a wrapper of the LL model for later use in the subsequent UL optimization.
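To make the combination rules concrete, below is a minimal sketch of how the constructor flags (documented under Parameters) map to the activated methods. The helper select_methods and its validation logic are illustrative assumptions, not the library's actual code:

```python
# Minimal sketch of the method-combination rules described above.
# select_methods is a hypothetical helper, not part of the library;
# the flag names mirror the Dynamic constructor parameters.

def select_methods(alpha=0.0, truncate_iters=0, acquire_max_loss=False):
    # Optimization method: alpha > 0 activates BDA aggregation,
    # otherwise plain RAD is used. RAD and BDA are exclusive by design.
    optimization = "BDA" if alpha > 0 else "RAD"
    # Truncation methods are mutually exclusive: T-RAD truncates the
    # backpropagation trajectory after a fixed number of iterations,
    # while PTT (used by IAPTT-GM) truncates pessimistically at the
    # iterate with maximal UL loss.
    if truncate_iters > 0 and acquire_max_loss:
        raise ValueError("T-RAD and PTT cannot be combined")
    truncation = "T-RAD" if truncate_iters > 0 else (
        "PTT" if acquire_max_loss else None)
    # IA (initialization auxiliary) accompanies PTT in IAPTT-GM.
    initialization = "IA" if acquire_max_loss else None
    return optimization, truncation, initialization

print(select_methods(alpha=0.5, truncate_iters=2))  # ('BDA', 'T-RAD', None)
print(select_methods(acquire_max_loss=True))        # ('RAD', 'PTT', 'IA')
```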
Parameters
- ll_objective: callable
  An optimization problem that serves as the constraint of the LL problem. Callable with signature callable(state); defined based on the modeling of the specific problem to be solved. Computes the loss of the LL problem (see the sketch after this parameter list). The state object contains the following:
  - "data" (Tensor) - Data used in the LL optimization phase.
  - "target" (Tensor) - Target used in the LL optimization phase.
  - "upper_model" (Module) - UL model of the bi-level model structure.
  - "lower_model" (Module) - LL model of the bi-level model structure.
- lower_loop: int
  Number of update iterations for the LL optimization.
- ul_model: Module
  UL model in the hierarchical model structure, whose parameters will be updated with the UL objective.
- ul_objective: callable
  The main optimization problem in the hierarchical optimization problem. Callable with signature callable(state); defined based on the modeling of the specific problem to be solved. Computes the loss of the UL problem. The state object contains the following:
  - "data" (Tensor) - Data used in the UL optimization phase.
  - "target" (Tensor) - Target used in the UL optimization phase.
  - "upper_model" (Module) - UL model of the bi-level model structure.
  - "lower_model" (Module) - LL model of the bi-level model structure.
- ll_model: Module
  LL model in the hierarchical model structure, whose parameters will be updated with the LL objective during LL optimization.
- acquire_max_loss (optional): bool, default=False
  If set to True, the IAPTT-GM method is used as the LL optimization method.
- alpha (optional): float, default=0
  The aggregation parameter for the BDA method, where alpha ∈ (0, 1) denotes the ratio of the LL objective to the UL objective during LL optimization.
- truncate_iters (optional): int, default=0
  Parameter for the T-RAD method, defining the number of iterations to truncate in the backpropagation process during LL optimization.
- ll_opt (optional): Optimizer, default=None
  The original optimizer of the LL model.
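The following sketch shows how the two objective callables and the constructor arguments might fit together. Dict-style access to the state fields and the composition inside ul_objective are assumptions; only the parameter names and the callable(state) signature come from the documentation above:

```python
import torch
import torch.nn.functional as F

def ll_objective(state):
    # LL loss: fit the lower model on the LL data.
    # Dict-style state access is an assumption.
    out = state["lower_model"](state["data"])
    return F.cross_entropy(out, state["target"])

def ul_objective(state):
    # UL loss: here the upper model acts as a feature extractor for the
    # lower model; this composition is an illustrative assumption.
    feats = state["upper_model"](state["data"])
    out = state["lower_model"](feats)
    return F.cross_entropy(out, state["target"])

lower = torch.nn.Linear(16, 4)
upper = torch.nn.Linear(16, 16)
ll_opt = torch.optim.SGD(lower.parameters(), lr=0.01)

# Construction sketch; the import path of Dynamic is an assumption, so
# the call is left as a comment. alpha > 0 selects BDA, truncate_iters > 0
# enables T-RAD truncation, and acquire_max_loss=True selects IAPTT-GM.
# dynamic = Dynamic(ll_objective, lower_loop=5, ul_model=upper,
#                   ul_objective=ul_objective, ll_model=lower,
#                   alpha=0.5, truncate_iters=2, ll_opt=ll_opt)
```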
Methods
optimize(train_data, train_target, auxiliary_model, auxiliary_opt, validate_data, validate_target)
Executes the LL optimization procedure on training data samples using the LL objective. The passed-in wrapper of the LL model will be updated in place (see the usage sketch after this parameter list).
Parameters:
- train_data (Tensor) - The training data used for LL problem optimization.
- train_target (Tensor) - The labels of the samples in the training data.
- auxiliary_model (_MonkeyPatchBase) - Wrapper of the lower model, encapsulated by the higher module; it will be optimized in the LL optimization procedure.
- auxiliary_opt (DifferentiableOptimizer) - Wrapper of the LL optimizer, encapsulated by the higher module; it will be used in the LL optimization procedure.
- validate_data (Tensor, optional, default=None) - The validation data used for UL problem optimization. Required when using the BDA or IAPTT-GM method.
- validate_target (Tensor, optional, default=None) - The labels of the samples in the validation data. Required when using the BDA or IAPTT-GM method.
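A usage sketch for optimize(). The auxiliary wrappers come from the higher library, whose innerloop_ctx context manager yields exactly the _MonkeyPatchBase / DifferentiableOptimizer pair documented above; the surrounding training-step shape and variable names are assumptions:

```python
import higher

def ll_step(dynamic, ll_model, ll_opt, train_data, train_target,
            validate_data=None, validate_target=None):
    # higher.innerloop_ctx wraps the LL model and its optimizer into a
    # functional model (fmodel) and a differentiable optimizer (diffopt).
    with higher.innerloop_ctx(ll_model, ll_opt) as (fmodel, diffopt):
        # Run the LL procedure; fmodel is updated through diffopt, so
        # its update trajectory stays differentiable for the later UL
        # phase. Validation data is only needed for BDA / IAPTT-GM.
        dynamic.optimize(train_data, train_target, fmodel, diffopt,
                         validate_data, validate_target)
        return fmodel  # the updated wrapper, used in the UL optimization
```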
References
[1] L. Franceschi, P. Frasconi, S. Salzo, R. Grazzi, and M. Pontil, "Bilevel programming for hyperparameter optimization and meta-learning", in ICML, 2018.
[2] A. Shaban, C. Cheng, N. Hatch, and B. Boots, "Truncated backpropagation for bilevel optimization", in AISTATS, 2019.
[3] R. Liu, P. Mu, X. Yuan, S. Zeng, and J. Zhang, "A generic first-order algorithmic framework for bi-level programming beyond lower-level singleton", in ICML, 2020.
[4] R. Liu, Y. Liu, S. Zeng, and J. Zhang, "Towards gradient-based bilevel optimization with non-convex followers and beyond", in NeurIPS, 2021.