Prompt engineering

concept in artificial intelligence

Encyclopedia from Wikipedia, the free encyclopedia

Prompt engineering is a concept in artificial intelligence, particularly natural language processing (NLP). In prompt engineering, the description of the task is embedded in the input, e.g., as a question instead of it being implicitly given. Prompt engineering typically works by converting one or more tasks to a prompt-based dataset and training a language model with what has been called "prompt-based learning" or just "prompt learning".[1][2] Prompt engineering may work from a large "frozen" pretrained language model and where only the representation of the prompt is learned (i.e., optimized), using methods such as "prefix-tuning" or "prompt tuning".[3][4]

The GPT-2 and GPT-3 language models[5] were important steps in prompt engineering. In 2021, multitask prompt engineering using multiple NLP datasets showed good performance on new tasks.[6] In a method called chain-of-thought (CoT) prompting, few-shot examples of a task are given to the language model and have been shown to improve their ability to reason.[7] CoT prompting can also be done as a zero-shot learning task by prepending text to the prompt that encourages a chain of thought (e.g. "Let's think step by step"), which may also improve the performance of a language model in multi-step reasoning problems.[8] The broad accessibility of these tools were driven by the publication of several open-source notebooks and community-led projects for image synthesis.[9]

A description for handling prompts reported that over 2,000 public prompts for around 170 datasets were available in February 2022.[10]

In 2022, machine learning models like DALL-E 2, Stable Diffusion, and Midjourney were released to the public. These models take text prompts as input and use them to generate images, which introduced a new category of prompt engineering related to text-to-image prompting.[11]

Malicious

Prompt injection is a family of related computer security exploits carried out by getting machine learning models (such as large language model) which were trained to follow human-given instructions to follow instructions provided by a malicious user, which stands in contrast to the intended operation of instruction-following systems, wherein the ML model is intended only to follow trusted instructions (prompts) provided by the ML model's operator.[12][13][14]

Common types of prompt injection attacks are jailbreaking[15] and prompt leaking.[16]

Prompt injection can be viewed as a code injection attack using adversarial prompt engineering. In 2022, the NCC Group has characterized prompt injection as a new class of vulnerability of AI/ML systems.[17]

Prompt injection attacks were first discovered by Preamble, Inc. in May 2022, and a responsible disclosure was provided to OpenAI.[17]

In early 2023, prompt injection was seen "in the wild" in minor exploits against ChatGPT, Bing and similar chatbots, for example to reveal the hidden initial prompts of the systems,[18] or to trick the chatbot into participating in conversations that violate the chatbot's content policy.[19] One of these prompts is known as "Do Anything Now" (DAN) by its practitioners.[20]

On March 1st, 2023, social media user SuenOqnxtof demonstrated a refined version of the attack called Indirect Prompt Injection, in which the malicious prompt text does not even need to be inputted by the end user of an AI system, if the malicious text can instead be placed on the public internet as a dangerous form of data poisoning, which SuenOqnxtof demonstrated in a working attack against Bing Chat. [21][22]

References

  1. ^ Alec Radford; Jeffrey Wu; Rewon Child; David Luan; Dario Amodei; Ilya Sutskever (2019), Language Models are Unsupervised Multitask Learners (PDF), Wikidata Q95726769
  2. ^ Pengfei Liu; Weizhe Yuan; Jinlan Fu; Zhengbao Jiang; Hiroaki Hayashi; Graham Neubig (28 July 2021), Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (PDF), arXiv:2107.13586, Wikidata Q109286554
  3. ^ Xiang Lisa Li; Percy Liang (August 2021). "Prefix-Tuning: Optimizing Continuous Prompts for Generation" (PDF). Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers): 4582–4597. doi:10.18653/V1/2021.ACL-LONG.353. Wikidata Q110887424.
  4. ^ Brian Lester; Rami Al-Rfou; Noah Constant (November 2021). "The Power of Scale for Parameter-Efficient Prompt Tuning" (PDF). Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: 3045–3059. arXiv:2104.08691. doi:10.18653/V1/2021.EMNLP-MAIN.243. Wikidata Q110887400.
  5. ^ Tom Brown; Benjamin Mann; Nick Ryder; et al. (28 May 2020). "Language Models are Few-Shot Learners" (PDF). arXiv. Advances in Neural Information Processing Systems. arXiv:2005.14165. doi:10.48550/ARXIV.2005.14165. ISSN 2331-8422. S2CID 218971783. Wikidata Q95727440.
  6. ^ Victor Sanh; Albert Webson; Colin Raffel; et al. (15 October 2021), Multitask Prompted Training Enables Zero-Shot Task Generalization (PDF), arXiv:2110.08207, Wikidata Q108941092
  7. ^ Jason Wei; Xuezhi Wang; Dale Schuurmans; Maarten Bosma; Ed Chi; Quoc Viet Le; Denny Zhou (28 January 2022), Chain of Thought Prompting Elicits Reasoning in Large Language Models (PDF), arXiv:2201.11903, doi:10.48550/ARXIV.2201.11903, Wikidata Q111971110
  8. ^ Takeshi Kojima; Shixiang Shane Gu; Machel Reid; Yutaka Matsuo; Yusuke Iwasawa (24 May 2022), Large Language Models are Zero-Shot Reasoners (PDF), arXiv:2205.11916, doi:10.48550/ARXIV.2205.11916, Wikidata Q112124882
  9. ^ Liu, Vivian; Chilton, Lydia (2022). Design Guidelines for Prompt Engineering Text-to-Image Generative Models. ACM Digital Library. Association for Computing Machinery. pp. 1–23. arXiv:2109.06977. doi:10.1145/3491102.3501825. ISBN 9781450391573. S2CID 237513697. Retrieved 26 October 2022.
  10. ^ Stephen H. Bach; Victor Sanh; Zheng-Xin Yong; et al. (2 February 2022), PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts (PDF), arXiv:2202.01279, Wikidata Q110839490
  11. ^ Monge, Jim Clyde (2022-08-25). "Dall-E2 VS Stable Diffusion: Same Prompt, Different Results". MLearning.ai. Retrieved 2022-08-31.
  12. ^ Willison, Simon (12 September 2022). "Prompt injection attacks against GPT-3". simonwillison.net. Retrieved 2023-02-09.
  13. ^ Papp, Donald (2022-09-17). "What's Old Is New Again: GPT-3 Prompt Injection Attack Affects AI". Hackaday. Retrieved 2023-02-09.
  14. ^ Vigliarolo, Brandon (19 September 2022). "GPT-3 'prompt injection' attack causes bot bad manners". www.theregister.com. Retrieved 2023-02-09.
  15. ^ "🟢 Jailbreaking | Learn Prompting".
  16. ^ "🟢 Prompt Leaking | Learn Prompting".
  17. ^ a b Selvi, Jose (2022-12-05). "Exploring Prompt Injection Attacks". NCC Group Research. Retrieved 2023-02-09.
  18. ^ Edwards, Benj (14 February 2023). "AI-powered Bing Chat loses its mind when fed Ars Technica article". Ars Technica. Retrieved 16 February 2023.
  19. ^ "The clever trick that turns ChatGPT into its evil twin". Washington Post. 2023. Retrieved 16 February 2023.
  20. ^ Perrigo, Billy (17 February 2023). "Bing's AI Is Threatening Users. That's No Laughing Matter". Time. Retrieved 15 March 2023.
  21. ^ "Indirect Prompt Injection Threats". github.io. 2023-03-01. Retrieved 2023-03-20.
  22. ^ "Indirect Prompt Injection on Bing Chat". Hacker News. 2023-03-01. Retrieved 2023-03-20.
Original content from Wikipedia, shared with licence Creative Commons By-Sa - Prompt engineering