Research Plan for ETA

Today is a warm, sunny weekend, and the smell of spring fills the air. After a year spent preparing the supporting techniques, I think it's time to sketch out some new directions for the ETA research I began in December 2021.

Throughout 2022, I spent a lot of time trying to answer two questions: why does a model that works well on the training set degrade significantly in a new environment, and can we predict that degradation before the model makes its predictions? That's where incremental learning and data/concept drift came in. For the first several months I focused on incremental learning algorithms, mainly the Hoeffding tree implementation. However, incremental learning models are deployed quite differently from traditional batch learning models, which makes them harder to adopt in production. The good news is that this research led me to concept drift detection algorithms, then to other forms of drift (covariate drift and label drift), and finally to domain adaptation and transfer learning.
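To make the idea of drift detection concrete, here is a minimal sketch of the Page-Hinkley test, one well-known change detector for a stream of numbers such as per-request prediction errors. It is not the detector used in this research; the class name, the `delta` and `threshold` defaults, and the toy error stream are all assumptions chosen for illustration, and this one-sided variant only flags upward shifts.

```python
class PageHinkley:
    """One-sided Page-Hinkley test: flags an upward shift in a stream's mean.

    delta and threshold are illustrative defaults, not tuned values.
    """

    def __init__(self, delta=0.005, threshold=10.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0
        self.mean = 0.0             # running mean of the stream
        self.cum = 0.0              # cumulative deviation from the mean
        self.min_cum = 0.0          # smallest cumulative deviation seen

    def update(self, x):
        """Consume one value; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_drift(stream, **kwargs):
    """Return the index of the first alarm, or None if no drift is found."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Toy stream (made up): absolute error jumps from 1.0 to 3.0 at index 200.
errors = [1.0] * 200 + [3.0] * 200
drift_index = first_drift(errors)
```

On this toy stream the alarm fires a few steps after the jump, while a stream that stays at 1.0 never triggers it. In practice the threshold trades detection delay against false alarms.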

Once drift is detected, we have to figure out how to fix it. Since deploying incremental learning is impractical in the near future, we are trying to optimize the whole framework through hyperparameter tuning instead: a grid search over several dimensions, such as the training/validation sets, the feature set, and the model, carried out either with AutoML tools like FLAML and auto-sklearn or manually.
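A manual version of such a grid search might look like the sketch below. Everything in it is hypothetical: the feature names (`distance`, `hour`), the two toy models, the window sizes, and the synthetic data are stand-ins for illustration, not the actual ETA pipeline. The point is only the shape of the loop: enumerate every combination of training window, feature, and model, then keep the one with the lowest validation error.

```python
from itertools import product
from statistics import mean


def fit_mean(xs, ys):
    """Baseline model: always predict the mean target, ignoring the feature."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """One-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov / var if var else 0.0
    return lambda x: my + slope * (x - mx)


MODELS = {"mean": fit_mean, "linear": fit_linear}


def grid_search(train, valid, windows, features, models=MODELS):
    """Try every (window, feature, model) combination; return (best MAE, config)."""
    best = None
    for window, feature, name in product(windows, features, models):
        recent = train[-window:]  # train only on the most recent rows
        predict = models[name](
            [r[feature] for r in recent], [r["eta"] for r in recent]
        )
        mae = mean(abs(predict(r[feature]) - r["eta"]) for r in valid)
        if best is None or mae < best[0]:
            best = (mae, {"window": window, "feature": feature, "model": name})
    return best


# Synthetic data (made up): ETA depends linearly on distance only.
rows = [{"distance": d, "hour": d % 3, "eta": 2.0 * d + 1.0} for d in range(1, 21)]
train, valid = rows[:15], rows[15:]
best_mae, best_config = grid_search(
    train, valid, windows=(5, 15), features=("distance", "hour")
)
```

On this synthetic data the search settles on the linear model over `distance`, as expected. AutoML tools automate the same loop with smarter search strategies than exhaustive enumeration.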
