
class Dynamic


Description

Lower-level model optimization procedure

Implements the ll optimization procedure of two explicit gradient-based methods (EGBMs) with the lower-level singleton (LLS) assumption, the Reverse-mode Automatic Differentiation method (RAD) [1] and the Truncated RAD method (T-RAD) [2], as well as two methods without LLS, Bi-level Descent Aggregation (BDA) [3] and the Initialization Auxiliary and Pessimistic Trajectory Truncated Gradient method (IAPTT-GM) [4].

Note that this procedure is a mixture of the four methods above: they are rearranged into two optimization methods (RAD, BDA), two truncation auxiliary methods (T-RAD, PTT), and an initialization auxiliary method (IA). The two optimization methods are mutually exclusive, as are the two truncation methods; otherwise, the components can be combined freely.
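As an illustration of these combination rules, the following sketch validates a choice of components before building the procedure. All names here are assumptions for illustration, not this library's API:

```python
# Illustrative (hypothetical) validation of the combination rules:
# exactly one optimization method, at most one truncation method,
# and the IA component is freely combinable.
OPTIMIZATION_METHODS = {"RAD", "BDA"}   # mutually exclusive
TRUNCATION_METHODS = {"T-RAD", "PTT"}   # mutually exclusive

def check_combination(optimizer, truncation=None, use_ia=False):
    """Return the chosen combination, rejecting invalid choices."""
    if optimizer not in OPTIMIZATION_METHODS:
        raise ValueError(f"pick exactly one of {OPTIMIZATION_METHODS}")
    if truncation is not None and truncation not in TRUNCATION_METHODS:
        raise ValueError(f"truncation must be one of {TRUNCATION_METHODS}")
    return (optimizer, truncation, use_ia)
```

For example, `check_combination("RAD", "T-RAD")` is valid, while passing two optimization methods is not.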

The implemented ll optimization procedure optimizes a wrapper of the ll model, which is then used in the subsequent ul optimization.
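To make the unrolling concrete, here is a minimal toy sketch of the RAD idea, with T-RAD-style truncation, on a scalar problem. The actual procedure operates on Module parameters via automatic differentiation; every name below is illustrative, and the reverse pass is written out by hand only because the toy update rule is linear:

```python
# Toy sketch of reverse-mode unrolling (RAD) with optional truncation
# (T-RAD). Lower-level objective f(x, y) = (x - y)^2, where y plays
# the role of the upper-level variable.
def unroll_and_backprop(x0, y, lr=0.1, steps=20, truncate=0):
    """Unroll `steps` gradient steps on x, then reverse through the
    trajectory to get dx_T/dy. `truncate` drops that many of the
    earliest steps from the backward pass (the T-RAD idea)."""
    x = x0
    for _ in range(steps):
        g = 2.0 * (x - y)            # d f / d x
        x = x - lr * g               # forward update
    # Reverse pass. Each update is x_{k+1} = (1 - 2*lr)*x_k + 2*lr*y,
    # so the per-step contribution via y is 2*lr and the carry through
    # x_k multiplies by (1 - 2*lr).
    dxT_dy = 0.0
    carry = 1.0                      # d x_T / d x_{k+1}
    for _ in range(steps - truncate):
        dxT_dy += carry * 2.0 * lr   # contribution via y at this step
        carry *= (1.0 - 2.0 * lr)    # propagate through x_k
    return x, dxT_dy
```

With the full backward pass the hypergradient approaches 1 as `steps` grows (the lower-level solution converges to x = y); truncation keeps only the contributions of the last `steps - truncate` updates.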


Parameters

  • ll_objective: callable
    An optimization problem that serves as the constraint of the ll problem.

    Callable with signature callable(state), defined by the modeling of the specific problem to be solved; it computes the loss of the ll problem. The state object contains the following:

    • "data"(Tensor) - Data used in the ll optimization phase.
    • "target"(Tensor) - Target used in the ll optimization phase.
    • "upper_model"(Module) - UL model of the bi-level model structure.
    • "lower_model"(Module) - LL model of the bi-level model structure.
  • lower_loop: int
    Number of update iterations in the ll optimization loop.

  • ul_model: Module
    UL model in a hierarchical model structure whose parameters will be updated with the ul objective.

  • ul_objective: callable
    The main optimization problem in a hierarchical optimization problem.

    Callable with signature callable(state), defined by the modeling of the specific problem to be solved; it computes the loss of the ul problem. The state object contains the following:

    • "data"(Tensor) - Data used in the ul optimization phase.
    • "target"(Tensor) - Target used in the ul optimization phase.
    • "upper_model"(Module) - UL model of the bi-level model structure.
    • "lower_model"(Module) - LL model of the bi-level model structure.
  • ll_model: Module
    LL model in a hierarchical model structure whose parameters will be updated with the ll objective during ll optimization.

  • acquire_max_loss (optional): bool, default=False
    If set to True, the IAPTT-GM method is used as the ll optimization method.

  • alpha (optional): float, default=0
    The aggregation parameter for the BDA method; alpha ∈ (0, 1) is the weight of the ll objective relative to the ul objective during ll optimization.

  • truncate_iters (optional): int, default=0
    Parameter for the T-RAD method: the number of iterations truncated from back propagation during ll optimization.

  • ll_opt (optional): Optimizer, default=None
    The optimizer originally attached to the ll model.
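The two objective callables both receive a state object with the four documented keys. Below is a hedged sketch of how they fit together, using plain functions as stand-ins for the UL/LL Modules and scalars as stand-ins for Tensors; the Dynamic class itself is not constructed here, since its exact import path is library-specific:

```python
# Objective callables matching the documented state keys. The models
# here are placeholder functions, not real Modules, purely to show
# the state contract.
def ll_objective(state):
    """Lower-level loss: squared error of the ll model's prediction."""
    pred = state["lower_model"](state["data"])
    return (pred - state["target"]) ** 2

def ul_objective(state):
    """Upper-level loss: absolute error on the same toy prediction."""
    pred = state["lower_model"](state["data"])
    return abs(pred - state["target"])

state = {
    "data": 2.0,                      # stand-in for a data Tensor
    "target": 5.0,                    # stand-in for a target Tensor
    "upper_model": lambda x: x,       # placeholder UL model
    "lower_model": lambda x: 2.0 * x, # placeholder LL model
}
```

In real use, `state["data"]` and `state["target"]` would be Tensors and the two models would be Modules, with the procedure calling these objectives internally during the ll loop.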


Methods


References

[1] L. Franceschi, P. Frasconi, S. Salzo, R. Grazzi, and M. Pontil, "Bilevel programming for hyperparameter optimization and meta-learning", in ICML, 2018.

[2] A. Shaban, C. Cheng, N. Hatch, and B. Boots, "Truncated backpropagation for bilevel optimization", in AISTATS, 2019.

[3] R. Liu, P. Mu, X. Yuan, S. Zeng, and J. Zhang, "A generic first-order algorithmic framework for bi-level programming beyond lower-level singleton", in ICML, 2020.

[4] R. Liu, Y. Liu, S. Zeng, and J. Zhang, "Towards Gradient-based Bilevel Optimization with Non-convex Followers and Beyond", in NeurIPS, 2021.