This is the technology worth trillions of dollars, huh?

    • KubeRoot@discuss.tchncs.de
      16 upvotes · 9 hours ago

      That actually sounds like a fun SCP: a word that doesn’t seem to contain a letter, but when you test for that letter with an algorithm that checks only for its presence, it reports that the letter is indeed there. Any attempt to check where in the word the letter is, or to list all the letters in the word, spuriously fails. Containment could be fun, probably involving amnestics and widespread societal influence. I also wonder if they could create an algorithm for checking letter presence that can be performed by hand without leaking any other information to the person performing it, reproducing the anomaly without computers.
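For contrast, here is what the two kinds of check look like on an ordinary, non-anomalous word, where they always agree. This is only a minimal Python sketch; the function names `contains_letter` and `letter_positions` are made up for illustration.

```python
def contains_letter(word: str, letter: str) -> bool:
    """Presence-only check: answers yes/no and reveals no position."""
    return letter.lower() in word.lower()


def letter_positions(word: str, letter: str) -> list[int]:
    """Every 0-based index at which the letter occurs in the word."""
    target = letter.lower()
    return [i for i, ch in enumerate(word.lower()) if ch == target]


word = "Connecticut"
print(contains_letter(word, "d"))   # False
print(letter_positions(word, "d"))  # []
print(letter_positions(word, "c"))  # [0, 5, 8]
```

The anomaly described above would amount to the first function returning `True` for "d" while the second one fails.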

      • leftzero@lemmy.dbzer0.com
        4 upvotes · edited · 5 hours ago

        No, LLMs produce the token that is statistically most likely, given their training data, to follow a certain list of tokens. There’s nothing remotely resembling reasoning going on in there; it’s pure hard statistics, with some error and randomness thrown in. There are probably a lot more lists where Colorado is followed by Connecticut than ones where it’s followed by Delaware, so they’re obviously going to be more likely to produce the former.
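The "more frequent continuation wins" idea can be sketched with a toy count table. This is a deliberate oversimplification: a real LLM is a neural network conditioned on a long context, not a lookup of pair counts, and the `corpus` below is made-up data, not real statistics.

```python
from collections import Counter, defaultdict

# Made-up toy "training data", purely illustrative.
corpus = [
    ["Colorado", "Connecticut", "Delaware"],
    ["California", "Colorado", "Connecticut"],
    ["Colorado", "Delaware"],
]

# Count how often each token follows each other token.
follow_counts = defaultdict(Counter)
for seq in corpus:
    for prev, nxt in zip(seq, seq[1:]):
        follow_counts[prev][nxt] += 1


def most_likely_next(token: str) -> str:
    """Return the continuation seen most often after `token`."""
    return follow_counts[token].most_common(1)[0][0]


print(most_likely_next("Colorado"))  # Connecticut
```

In this toy data "Colorado" is followed by "Connecticut" twice and by "Delaware" once, so the former is picked.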

        Moreover, there aren’t going to be many texts that spell out state names letter by letter (maybe transcripts of spelling bees?), so that information is unlikely to be in their training data. They can’t extrapolate either, because it’s not really something they do, and because their tokens are words or parts of words, not letters. So they literally have no way of listing the letters of a word if that list isn’t in their training data (and, again, that’s not something we tend to write, and if we did we wouldn’t include a d in Connecticut, even if we were reading a misprint). The same goes for counting how many letters a word has, and stuff like that.
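To make the tokens-not-letters point concrete, here is a minimal sketch of a greedy subword tokenizer. The vocabulary `VOCAB` and its IDs are hypothetical; real tokenizers learn their vocabulary from data and would likely split the word differently.

```python
# Hypothetical subword vocabulary (pieces and IDs are invented for this sketch).
VOCAB = {"Conn": 101, "ect": 102, "icut": 103}


def tokenize(text: str) -> list[int]:
    """Greedy longest-match split of `text` into vocabulary piece IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shorter ones.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i:]!r}")
    return ids


print(tokenize("Connecticut"))  # [101, 102, 103]
```

What the model receives is the ID list, not the eleven letters, so a question about individual letters has to be answered from whatever it has learned about those IDs.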

    • I Cast Fist@programming.dev
      4 upvotes · 9 hours ago

      SCP-00WTFDoC (lovingly called “where’s the fucking D of Connecticut” by the foundation workers, also “what the fuck, doc?”)

      People think it’s safe, because it’s “just an invisible D”, not even a dick, just the letter D, and it only manifests verbally when someone tries to say “Connecticut” or write it down. Then, when you least expect it, everyone has heard “Donnedtidut”, everyone has read that thing, and a portal to that fucking place opens and drags you in.

    • ripcord@lemmy.world
      1 upvote · 6 hours ago

      Words are full of mystery! Besides the invisible D, Connecticut has that inaudible C…