• lime!@feddit.nu
    link
    fedilink
    English
    arrow-up
    26
    arrow-down
    1
    ·
    edit-2
    2 days ago

    because the system prompt is not configuration, it’s input. it has the same priority as whatever the user types in, and it takes up valuable space in the context window.

    to add onto what pennomi is saying, this also shows that openai doesn’t understand language models. the only actual functionality the llm has is still “given the previous text, what is the most likely character/phoneme/token?”, so rather than (to use an analogy) change the font in their word document they add in a sentence in the middle of the document that says “everything from here is in comic sans”.

    but it’s not surprising that they’d do this. if we’ve learned anything from the claude frontend leak earlier, where their “sentiment analysis” tool for input text was a regex (you literally have a language model! that’s like the only thing it’s good at!), i think it’s pretty clear most of the big players in the llm space have gotten high on their own supply and can’t be expected to actually reason about the operations the system is actually performing.

    • FishFace@piefed.social
      link
      fedilink
      English
      arrow-up
      8
      arrow-down
      6
      ·
      2 days ago

      But because the system prompt is part of the context, it figures into the estimation of the most likely next token. So in general putting this kind of stuff in the system prompt does change how well it works.

      • lime!@feddit.nu
        link
        fedilink
        English
        arrow-up
        11
        arrow-down
        1
        ·
        2 days ago

        of course. but the larger the context grows the less it affects the output. there is some ways around this, like moving the system prompt last in the context before every answer, but the very existence of the system prompt to begin with is a hack. what’s really needed is a functional rules-based pre- and post-filtering system for a chatbox to be safe. personally i think the chatbox “style” has played out its role and is living on as a gimmick. actual tooling built with language models is stuff like LSP servers and accessibility software, and that needs rigid configuration.

    • theunknownmuncher@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      4
      ·
      edit-2
      2 days ago

      The system prompt is configuration, and configuration is input. Semantics don’t actually challenge my point.

      • lime!@feddit.nu
        link
        fedilink
        English
        arrow-up
        3
        ·
        2 days ago

        configuration is things like temperature, output cutoff, and tool use. those are out-of-band. the system prompt, being in-band, can not be configuration. it’s like calling a http request configuration for the response.