Approach. We propose a framework for offline pretraining and online finetuning of world models directly in the real world, without relying on simulators or synthetic data. Our method iteratively collects new data by planning with the learned model, and finetunes the model on a mixture of pre-existing and newly collected data. By leveraging a novel test-time regularization during planning, our method can be finetuned few-shot to unseen task variations in ≤20 trials.
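To make the collect-then-finetune cycle concrete, below is a minimal, hypothetical Python sketch of the loop described above. All names (`ToyWorldModel`, `ToyEnv`, `rollout`, `finetune`) are illustrative stand-ins, not the paper's implementation; the model update, the planner, and the test-time regularization are stubbed out.

```python
# Hypothetical sketch of the offline-to-online finetuning loop; not the
# authors' code. Planning and regularization are placeholders.
import random


class ToyWorldModel:
    """Stand-in for a learned world model."""

    def update(self, batch):
        pass  # one gradient step on a batch of transitions (omitted)

    def plan(self, obs, regularize=False):
        # Placeholder for planning with the learned model; `regularize`
        # stands in for the paper's test-time regularization.
        return random.uniform(-1.0, 1.0)


class ToyEnv:
    """Stand-in for the real-world environment."""

    def reset(self):
        return 0.0

    def step(self, action):
        next_obs = action
        reward = -abs(action)
        done = random.random() < 0.1
        return next_obs, reward, done


def rollout(env, model, buffer):
    """Collect one trial by planning with the learned model."""
    obs, done = env.reset(), False
    while not done:
        action = model.plan(obs, regularize=True)
        next_obs, reward, done = env.step(action)
        buffer.append((obs, action, reward, next_obs))
        obs = next_obs


def finetune(model, env, offline_data, num_trials=20,
             batch_size=256, updates_per_trial=100):
    """Alternate between collecting data and finetuning the model on a
    mixture of pre-existing (offline) and newly collected (online) data."""
    online_data = []
    for _ in range(num_trials):
        rollout(env, model, online_data)
        for _ in range(updates_per_trial):
            pool = offline_data + online_data
            batch = random.sample(pool, min(batch_size, len(pool)))
            model.update(batch)
    return model


# Usage: pretend the offline dataset is 1000 pre-collected transitions.
model = ToyWorldModel()
offline_data = [(0.0, 0.0, 0.0, 0.0)] * 1000
finetune(model, ToyEnv(), offline_data)
```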
Tasks. [Figure: overview of the real-world task suite.]

Qualitative Results. [Videos: qualitative results on the Reach and Pick tasks.]