• 0 Posts
  • 9 Comments
Joined 3 years ago
Cake day: June 12th, 2023


  • But you can trust the first model to produce the code you want, or at least to give you a baseline for whether it works as expected. To return to the simple example of secure (sanitized) user input via a form: the human sets up the testing environment. All the human needs to do is write a script that reads the entered database entry and hashes the rest of the database and the application in memory.

    It should be simple for the first model to use different languages and approaches, from strongly typed languages like Ada to YOLO implementations in Python.

    The adversarial model’s job is to modify the state of the application or database outside of that entry. This should be possible with some of the first model’s implementations, unless they are already perfect.

    The idea is that, with enough permutations of implementations at different temperatures and with different input context, an almost unlimited number of blue-team and red-team examples can be produced and iterated on for this one specific problem.

    This approach is already being generalized to produce more high-quality software training data for LLMs than exists in the whole corpus of human output.

    This is very hard to do with art or writing. Art is subjective: you cannot automatically validate the output or detect subtle variations without context and opinion.

    This is tangential to why machine learning works so well for weather data. We can objectively validate the output against historical data, but we can also create synthetic weather data using physically based models. It’s different, but similar in principle.
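
The test setup described at the top of this comment (read the entered entry, hash everything else, see whether the hash moves) could look roughly like the sketch below. It assumes SQLite and a made-up `entries` table; `fingerprint_except`, `safe_insert`, `unsafe_insert`, and the payload are all hypothetical stand-ins for what the two models would generate, not part of any real harness.

```python
import hashlib
import sqlite3


def fingerprint_except(conn, skip_id):
    """Hash every row of the hypothetical `entries` table except `skip_id`."""
    digest = hashlib.sha256()
    rows = conn.execute(
        "SELECT id, body FROM entries WHERE id != ? ORDER BY id", (skip_id,)
    )
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def safe_insert(conn, entry_id, text):
    # Parameterized query: the input is stored as data only.
    conn.execute("INSERT INTO entries (id, body) VALUES (?, ?)", (entry_id, text))


def unsafe_insert(conn, entry_id, text):
    # String-built SQL run as a script: a deliberately weak blue-team attempt.
    conn.executescript(
        f"INSERT INTO entries (id, body) VALUES ({entry_id}, '{text}');"
    )


def attack_succeeds(insert_fn, payload):
    """Return True if submitting `payload` changed anything outside the new entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO entries VALUES (1, 'existing row')")
    new_id = 2
    before = fingerprint_except(conn, new_id)
    try:
        insert_fn(conn, new_id, payload)
    except sqlite3.Error:
        pass  # a rejected input is not a state change
    after = fingerprint_except(conn, new_id)
    conn.close()
    return before != after


# Hypothetical red-team payload that tries to rewrite another row.
payload = "x'); UPDATE entries SET body = 'pwned' WHERE id = 1; --"
print(attack_succeeds(safe_insert, payload))    # False
print(attack_succeeds(unsafe_insert, payload))  # True
```

The point is that the pass/fail signal comes from the hash comparison alone, so the human never has to read the generated code or the attack to label the example.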


  • Clunky wording on my part. I mean that results can be tested objectively. In creative fields, there is no objective means of testing outputs. In programming, one model can, for example, build a user input field to match requirements, and another model can test it. The success or failure of those tests can be measured objectively: do stored inputs fall within the desired domain, and does a hash of the memory (sans known changing variables) change?
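
As a rough illustration of the "do stored inputs fall within the desired domain" check, here is a minimal sketch. The requirement (3 to 16 letters, digits, or underscores) and the two candidate `store` functions are assumptions invented for the example; in practice the first model would supply the implementations and the second model the attack strings.

```python
import re

# Assumed requirement: 3-16 characters, letters, digits, or underscore only.
ALLOWED = re.compile(r"[A-Za-z0-9_]{3,16}")


def in_domain(stored_value):
    return ALLOWED.fullmatch(stored_value) is not None


def score(store_fn, attacks):
    """Count attack inputs whose stored result lands outside the allowed domain."""
    failures = 0
    for raw in attacks:
        stored = store_fn(raw)  # None means the input was rejected
        if stored is not None and not in_domain(stored):
            failures += 1
    return failures


def strict_store(raw):
    # Hypothetical candidate: rejects anything outside the domain.
    return raw if in_domain(raw) else None


def lenient_store(raw):
    # Hypothetical candidate: only trims whitespace, so bad characters slip through.
    return raw.strip()


attacks = ["alice", "<script>", "bob; DROP TABLE users", "  carol  "]
print(score(strict_store, attacks))   # 0
print(score(lenient_store, attacks))  # 2
```

The failure count is a number, not an opinion, which is what makes it usable as a training signal.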


  • peanuts4life@lemmy.blahaj.zone to Fuck AI@lemmy.world: On familiarity
    14 days ago

    I don’t know. I worked as a software dev for 5 years and have a BS in CS. I’ve transitioned to ecological work. The progress I’ve seen with Claude Code specifically has convinced me that even moderate gains in intelligence will lead to the functional replacement of several data entry and junior programming jobs.

    It’s true that people overestimate LLM performance in other domains. But software is easy to generate synthetic data for, and it can be tested objectively by adversarial models, among other advantages.


  • I built several nodes. I think it’s most useful as an asset tracking tool, but the battery life isn’t great. For example, I have a couple of premade credit-card-sized nodes. It’s pretty neat to ping them and get their GPS position. But, to be honest, your money for that application would probably be better spent on an iPhone and Apple’s tags.

    For communication: there are dubiously legal, cheap radios you can get off Amazon that would probably be far more useful.

    I did enjoy it, though, and I still have some nodes. Also, it’s illegal to send encrypted messages over Meshtastic.

    Edit: some Meshtastic nodes double as a low-power GPS with a screen. These may be useful on their own.



  • I don’t remember the institution, but I remember reading a paper on a simulated trading environment with several AI agents that didn’t know about each other. The LLMs were pretty conservative with profits and deliberately bought and sold in predictable ways. They all ended up “colluding” with each other by deliberately not competing.