
class  Init


Complete Meta-learning Process with MAML and MAML-based methods

Implements the meta-learning procedure of MAML [1] and four MAML-based methods: Meta-SGD [2], MT-net [3], Warp-grad [4], and L2F [5].
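To make the inner/outer structure of MAML concrete, here is a minimal sketch on scalar toy tasks in plain Python. It is not this class's implementation: the function names, the quadratic losses, and the `meta_lr` and `iterations` arguments are all assumptions chosen for illustration. The `lr`, `steps`, and `second_order` arguments play the roles of the `inner_learning_rate`, `inner_loop`, and `use_second_order` parameters documented for this class.

```python
# Toy MAML on scalar tasks. Every name here is illustrative, not the library's API.

def inner_adapt(w, target, lr=0.01, steps=5):
    """Run `steps` gradient-descent updates on L_in(w) = 0.5 * (w - target)**2.

    Returns the adapted weight and d(adapted weight) / d(initial weight).
    """
    dw_dw0 = 1.0
    for _ in range(steps):
        grad = w - target      # dL_in/dw
        w = w - lr * grad      # one inner-loop step
        # Chain rule through the update: 1 - lr * (second derivative of L_in),
        # and the second derivative of this quadratic loss is 1.
        dw_dw0 *= 1.0 - lr
    return w, dw_dw0


def meta_gradient(w0, inner_target, outer_target, lr=0.01, steps=5, second_order=True):
    """Gradient of L_out(w_K) = 0.5 * (w_K - outer_target)**2 with respect to w0."""
    w_k, dw_dw0 = inner_adapt(w0, inner_target, lr, steps)
    outer_grad = w_k - outer_target
    # The first-order approximation treats d(w_K)/d(w0) as 1 and drops the factor.
    return outer_grad * dw_dw0 if second_order else outer_grad


def meta_train(w0, tasks, meta_lr=0.1, iterations=200, **inner_kwargs):
    """Outer loop: average the meta-gradient over tasks and update the initialization."""
    for _ in range(iterations):
        grads = [meta_gradient(w0, a, b, **inner_kwargs) for a, b in tasks]
        w0 -= meta_lr * sum(grads) / len(grads)
    return w0


# Each task is (inner_target, outer_target); two tasks centred on 1.0 and 3.0.
tasks = [(1.0, 1.0), (3.0, 3.0)]
w_meta = meta_train(0.0, tasks, lr=0.1, steps=5)
```

In this toy setting the learned initialization settles midway between the two task optima, since that is the starting point from which a few inner steps help both tasks equally. With `second_order=False` the same point is reached, only with a differently scaled meta-gradient; on non-quadratic losses the two variants generally differ.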


  • model: Module
    Model containing the backbone network and, when other MAML-based methods are used, any auxiliary meta modules.

  • inner_objective: callable
    The inner loop optimization objective.

    Callable with signature callable(state). Defined according to the specific problem being solved; it computes the loss of the inner objective. The state object contains the following:

    • "data"(Tensor) - Data used in inner optimization phase.
    • "target"(Tensor) - Target used in inner optimization phase.
    • "model"(Module) - Meta model to be updated.
    • "updated_weights"(List[Parameter]) - Model weights updated in the inner loop; used for forward propagation.
  • outer_objective: callable
    The outer optimization objective.

    Callable with signature callable(state). Defined according to the specific problem being solved; it computes the loss of the outer objective. The state object contains the following:

    • "data"(Tensor) - Data used in outer optimization phase.
    • "target"(Tensor) - Target used in outer optimization phase.
    • "model"(Module) - Meta model to be updated.
    • "updated_weights"(List[Parameter]) - Model weights updated in the inner loop; used for forward propagation.
  • inner_learning_rate: float, default=0.01
    Step size for inner optimization.

  • inner_loop: int, default=5
    Number of inner optimization steps.

  • use_second_order (optional): bool, default=True
    Optional argument, whether to compute exact second-order gradients through the inner loop.

  • learn_lr (optional): bool, default=False
    Optional argument, whether to update the inner learning rate during outer optimization, i.e. use the Meta-SGD method.

  • use_t (optional): bool, default=False
    Optional argument, whether to use T-layers during optimization, i.e. use the MT-net method.

  • use_warp (optional): bool, default=False
    Optional argument, whether to use warp modules during optimization, i.e. use the Warp-grad method.

  • use_forget (optional): bool, default=False
    Optional argument, whether to add attenuation to each layer, i.e. use the L2F method.
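The state-based contract for inner_objective and outer_objective described in the parameter list can be sketched as follows. This is a dependency-free illustration rather than the library's own code: plain Python lists stand in for Tensors, the `linear_model` function stands in for a Module, and the way the model consumes "updated_weights" (as a second call argument) is an assumption, since the forward-call convention is not specified here.

```python
# Sketch of objective callables that read from a `state` mapping.
# ASSUMPTIONS: lists stand in for Tensors, `linear_model` stands in for a Module,
# and the model is called as model(data, weights); none of this is the library's API.

def linear_model(data, weights):
    """Hypothetical stand-in model: y = w * x + b applied to each input."""
    w, b = weights
    return [w * x + b for x in data]


def mean_squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def inner_objective(state):
    """Loss on the inner-phase data, evaluated with the inner-loop weights."""
    pred = state["model"](state["data"], state["updated_weights"])
    return mean_squared_error(pred, state["target"])


def outer_objective(state):
    """Same form as the inner loss here; the caller passes the outer-phase data."""
    pred = state["model"](state["data"], state["updated_weights"])
    return mean_squared_error(pred, state["target"])


# Example state with the four documented keys.
state = {
    "data": [1.0, 2.0],
    "target": [3.0, 5.0],
    "model": linear_model,
    "updated_weights": [2.0, 1.0],
}
inner_loss = inner_objective(state)
```

Both callables take a single state argument and return a scalar loss. With real tensors the loss would be built from differentiable operations so that gradients can flow back through "updated_weights".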



[1] C. Finn, P. Abbeel, S. Levine, "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks", in ICML, 2017.

[2] Z. Li, F. Zhou, F. Chen, H. Li, "Meta-SGD: Learning to Learn Quickly for Few-Shot Learning", in arXiv, 2017.

[3] Y. Lee and S. Choi, "Gradient-Based Meta-Learning with Learned Layer-wise Metric and Subspace", in ICML, 2018.

[4] S. Flennerhag, A. Rusu, R. Pascanu, F. Visin, H. Yin, R. Hadsell, "Meta-learning with Warped Gradient Descent", in ICLR, 2020.

[5] S. Baik, S. Hong, K. Lee, "Learning to Forget for Meta-Learning", in CVPR, 2020.