In machine learning, fine-tuning is an approach to transfer learning in which the weights of a pre-trained model are trained on new data.[1] Fine-tuning can be done on a subset of the layers of a neural network or on the entire network.[2] In the former case, the layers that are not being fine-tuned are "frozen", meaning their weights are not updated during the backpropagation step.
For some architectures, such as convolutional neural networks, it is common to keep the earlier layers frozen because they have been shown to capture lower-level features, while the later layers often capture higher-level features more closely tied to the specific task the model is trained on.[2][3]
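In practice, freezing is typically implemented by disabling gradient computation for the frozen parameters. The following sketch illustrates this in PyTorch, freezing the earlier layers of a pre-trained ResNet-18 and fine-tuning only its last residual block and a newly added classification head; the choice of model, layer split, and number of classes are illustrative assumptions rather than a prescribed recipe.

```python
# A minimal sketch of partial fine-tuning in PyTorch; the model and
# layer choices below are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

# Load a model pre-trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze every parameter, then unfreeze only the later layers.
for param in model.parameters():
    param.requires_grad = False
for param in model.layer4.parameters():  # last residual block
    param.requires_grad = True

# Replace the classification head with a new layer for the target task
# (an assumed 10-class problem); new layers are trainable by default.
model.fc = nn.Linear(model.fc.in_features, 10)

# The optimizer only receives the parameters that will be updated;
# frozen parameters receive no gradient updates during backpropagation.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```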
Fine-tuning is also common in natural language processing (NLP), especially in the domain of language modeling. Large language models like OpenAI's GPT-2 can be fine-tuned on downstream NLP tasks to produce better results than the pre-trained model can normally achieve.[4] Models that are pre-trained on large, general corpora are usually fine-tuned by reusing the model's parameters as a starting point and adding a task-specific layer trained from scratch.[5] Fully fine-tuning the model is common as well and often yields better results, but it is more computationally expensive.[4] Full fine-tuning is also more prone to overfitting and may cause the model to perform worse out-of-distribution.[6]
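The following sketch illustrates this pattern using the Hugging Face Transformers library (an assumed tooling choice): pre-trained GPT-2 weights are reused as the starting point, and a newly initialized classification head is trained on top. The toy sentiment batch and hyperparameters are placeholders, not drawn from the cited work.

```python
# A minimal sketch of fine-tuning GPT-2 with a task-specific
# classification head via Hugging Face Transformers.
import torch
from transformers import GPT2TokenizerFast, GPT2ForSequenceClassification

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default

# Reuse the pre-trained parameters as the starting point; the
# classification head on top is newly initialized and trained from scratch.
model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

# One illustrative training step on a toy two-example batch.
batch = tokenizer(["great movie", "terrible movie"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(**batch, labels=labels).loss  # full fine-tuning: all weights update
loss.backward()
optimizer.step()
```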
Fine-tuning is normally done with a supervised learning approach, but there are also techniques to fine-tune a model using weak supervision.[7] Language models like ChatGPT (a fine-tuned version of GPT-3) and Sparrow have also been fine-tuned with reinforcement learning from human feedback.[8][9]
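The following is a deliberately simplified, REINFORCE-style sketch of the idea behind reinforcement learning from human feedback: responses sampled from the policy model are scored by a reward model trained on human preferences, and the policy is updated to make highly rewarded responses more likely. The `policy.sample_response` and `reward_model` interfaces here are hypothetical stand-ins; production systems such as ChatGPT use more elaborate policy-gradient algorithms (e.g., PPO with a KL penalty against the pre-trained model).

```python
# A heavily simplified RLHF-style update. `policy`, `reward_model`,
# and `sample_response` are hypothetical stand-ins, not a real API.
import torch

def rlhf_step(policy, reward_model, prompts, optimizer):
    total_loss = 0.0
    for prompt in prompts:
        # Sample a response and keep the log-probability the policy
        # assigned to it (both assumed to be returned by the policy).
        response, log_prob = policy.sample_response(prompt)
        # The reward model scores how well the response matches human
        # preferences; no gradient flows through it.
        with torch.no_grad():
            reward = reward_model(prompt, response)
        # REINFORCE: raise the probability of highly rewarded responses.
        total_loss = total_loss - reward * log_prob
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```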
References
- ^ Quinn, Joanne (2020). Dive into deep learning: tools for engagement. Thousand Oaks, California. p. 551. ISBN 978-1-5443-6137-6. Archived from the original on January 10, 2023. Retrieved January 10, 2023.
- ^ a b "CS231n Convolutional Neural Networks for Visual Recognition". cs231n.github.io. Retrieved 9 March 2023.
- ^ Zeiler, Matthew D; Fergus, Rob (2013). "Visualizing and Understanding Convolutional Networks". arXiv:1311.2901.
- ^ a b Dingliwal, Saket; Shenoy, Ashish; Bodapati, Sravan; Gandhe, Ankur; Gadde, Ravi Teja; Kirchhoff, Katrin (2021). "Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems". arXiv:2112.08718.
- ^ Dodge, Jesse; Ilharco, Gabriel; Schwartz, Roy; Farhadi, Ali; Hajishirzi, Hannaneh; Smith, Noah (2020). "Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping". arXiv:2002.06305.
- ^ Kumar, Ananya; Raghunathan, Aditi; Jones, Robbie; Ma, Tengyu; Liang, Percy (2022). "Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution". arXiv:2202.10054.
- ^ Yu, Yue; Zuo, Simiao; Jiang, Haoming; Ren, Wendi; Zhao, Tuo; Zhang, Chao (2020). "Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach". arXiv:2010.07835.
- ^ "Introducing ChatGPT". openai.com. Retrieved 9 March 2023.
- ^ Glaese, Amelia; McAleese, Nat; Trębacz, Maja; Aslanides, John; Firoiu, Vlad; Ewalds, Timo; Rauh, Maribeth; Weidinger, Laura; Chadwick, Martin; Thacker, Phoebe; Campbell-Gillingham, Lucy; Uesato, Jonathan; Huang, Po-Sen; Comanescu, Ramona; Yang, Fan; See, Abigail; Dathathri, Sumanth; Greig, Rory; Chen, Charlie; Fritz, Doug; Elias, Jaume Sanchez; Green, Richard; Mokrá, Soňa; Fernando, Nicholas; Wu, Boxi; Foley, Rachel; Young, Susannah; Gabriel, Iason; Isaac, William; Mellor, John; Hassabis, Demis; Kavukcuoglu, Koray; Hendricks, Lisa Anne; Irving, Geoffrey (2022). "Improving alignment of dialogue agents via targeted human judgements". arXiv:2209.14375.