Sorry, that still doesn’t really make sense to me. If you can’t trust the generative model to produce code that does what it’s supposed to do, then you also can’t trust the adversarial model to perform the tests needed to determine that the code does what it’s supposed to do. So if the results have no meaning, then the fact that you can objectively measure them also has no meaning.
Programmers are kind of weird in the context of this post, because we tend to pretty consistently think our job is simpler than it really is, despite constantly being proven wrong.
> objectively testable by adversarial models
This is an odd thing to say. Adversarial models are still learning models and have all the limitations that implies, including objectivity being far from guaranteed.
Hetare King@piefed.social to Technology@lemmy.world • Brussels plots open source push to pry Europe off Big Tech (English) • 15 points · 13 days ago

That sounds more like tinkering around the edges to me. Whipping companies like Twitter into behaving, while it absolutely needs to happen, won’t fundamentally change anything about Europe’s dependency on those companies and the pressure the US can exert through that dependency.
Kind of weird thing to compare it to, given that most “new urbanism” is just old, pre-WW2 urbanism.
Seems to me that the “market performance ratio” should weigh a lot heavier. The whole thing that makes something a bubble is that a lot of money is being put into it while very little is coming out, with very few prospects of that changing in the near enough future outside of religious conviction. Yet this is the only metric suggesting that investments creating real value should matter, and it only accounts for 7.5% of the whole score. Then again, the site doesn’t actually properly define what “market performance ratio” means and doesn’t state its sources beyond a vague description.
Also, the person who made this, Mert Demirdelen, is “head of growth and product” at Mobiversite, an AI app maker. His skills listed on LinkedIn include “AI” and “Blockchain”. So maybe not someone who is completely devoid of the desire to invoke a particular impression of the state of the AI economy.
Hetare King@piefed.social to Fuck AI@lemmy.world • Question for the community that hates Ai (English) • 2 points · 2 months ago

There are valuable uses of learning models, but I’d say they all have the following constraints:
- The relation between input and output is at most 1:1. So the output does not contain any information that cannot be derived from the input.
- The scope is sufficiently constrained so that the error rate can be meaningfully quantified (a rough sketch of what I mean follows after this list).
- Dealing with the errors (including verifying that there are errors, if needed) takes less effort than just doing everything manually.
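As a toy illustration of the second and third constraints (made-up numbers, nothing from a real system): once the task is narrow enough, the error rate is just a number you can measure against a labelled hold-out set and then weigh against the cost of catching and fixing that many mistakes.

```python
def error_rate(predictions, labels):
    """Fraction of held-out examples the model got wrong."""
    assert len(predictions) == len(labels)
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)

# Hypothetical hold-out results: model predictions vs. ground-truth labels.
preds = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
truth = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

print(f"error rate: {error_rate(preds, truth):.1%}")  # -> error rate: 10.0%
```

With a quantified rate like that, the third constraint becomes a concrete question: is reviewing and correcting roughly 10% of the outputs actually cheaper than doing the whole task by hand?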
Hetare King@piefed.social to Fuck AI@lemmy.world • When the bubble pops - How AI will destroy the economy (English) • 6 points · 3 months ago

We’re not currently on a trajectory toward automata like that, at least not with the kind of AI that’s currently being heavily invested in, but even if we were, it would not lead to a positive outcome with the way society is set up right now. The problem is that someone would own the automata and would therefore be in complete control of who the automata work for. Unless the automata are easy to make (and the patents easy to bypass), making it difficult for someone to monopolise them, it would take a fundamental change to the way the economy works for this to benefit everyone, and that’s not an inevitability.
But this video isn’t really about that, it’s about the much more likely scenario that AI does not end up living up to its promises and the money eventually running out, and what the economic fallout of that will be.
The biggest issue with the three-pedalled bicycle is that humans have, at most, two legs.
I would argue that’s actually the last situation you’d want to use an LLM in. With numbers like that, nobody’s going to review each and every letter with the attention that output from an untrustworthy agent ought to get. This sounds to me like it calls for a template. Actually, it would be pretty disturbing to hear that letters like that aren’t already being generated from a template, driven by changes recorded in the system.
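A minimal sketch of what I mean by a template (the fields and wording are entirely made up for illustration): every variable part of the letter comes from structured system data, so there is nothing for a model to invent.

```python
from string import Template

# Fixed letter skeleton; only the $-fields vary per recipient.
LETTER = Template(
    "Dear $name,\n\n"
    "Our records show that your $service changed on $date: $change.\n"
    "If you believe this is incorrect, please contact us quoting reference $ref.\n\n"
    "Kind regards,\nCustomer Services"
)

def render_letter(record: dict) -> str:
    # Every variable piece comes from the system record; nothing is generated.
    return LETTER.substitute(record)

print(render_letter({
    "name": "A. Example",
    "service": "benefit entitlement",
    "date": "1 January 2025",
    "change": "the monthly amount was recalculated",
    "ref": "REF-0001",
}))
```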

I’m not sure that the comparison with the weather data works. Tweaking curves to more closely match the test data, and moving around a model’s probability space in the hope that it sufficiently increases the probability of outputting tokens that fix the code’s problems, seem different enough to me that I don’t know whether the former working well says anything about how well the latter works.
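To make the first half of that comparison concrete, here is what I understand curve fitting on weather-style data to look like (toy, randomly generated numbers, assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "weather" observations: a seasonal-ish signal plus noise.
days = np.arange(30)
temps = 10 + 8 * np.sin(2 * np.pi * days / 30) + rng.normal(0, 1, 30)

# "Tweaking the curve": fit a handful of parameters so the curve matches
# the data, then measure how far off the fitted curve still is.
coeffs = np.polyfit(days, temps, deg=3)
fitted = np.polyval(coeffs, days)

rmse = np.sqrt(np.mean((fitted - temps) ** 2))
print(f"RMSE against the observations: {rmse:.2f}")
```

There the error has a direct, continuous relationship to the few parameters being tweaked, which is exactly the property I’m not convinced carries over to nudging an LLM’s output distribution.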
If I understand what you’re describing correctly, the two models aren’t improving each other, like in adversarial learning; rather, the adversarial model is trying to get the generative model to zero in on output that produces the user’s desired behaviour based on the given test data. But that can only work as well as the adversarial model can be relied upon to actually perform the tasks needed to make this happen. So I think my point still stands that the objectivity of your measurements of the test results is only meaningful if the test results themselves are meaningful, which is not guaranteed given what’s doing the testing.
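For clarity, here is a toy, fully deterministic stand-in for how I read that setup. Every name in it is a placeholder of my own, not anyone’s actual system, and in the real loop the failures would presumably be fed back into the generator’s next attempt rather than just advancing to the next canned candidate:

```python
# Hypothetical candidates the "generative" side might propose for abs(x).
CANDIDATES = [
    lambda x: x,                    # wrong
    lambda x: -x,                   # wrong
    lambda x: x if x >= 0 else -x,  # correct
]

# The given test data the checking side works from.
TEST_DATA = [(-3, 3), (0, 0), (5, 5)]

def check(candidate, tests):
    """The checking side: run a candidate against the supplied test data."""
    return [(inp, expected) for inp, expected in tests if candidate(inp) != expected]

def refine(candidates, tests):
    """Accept the first candidate that passes every check."""
    for candidate in candidates:
        if not check(candidate, tests):
            return candidate  # "passes" only as far as the checks themselves are trustworthy
    return None

winner = refine(CANDIDATES, TEST_DATA)
print(winner(-7) if winner else "no candidate passed")  # -> 7
```

Which is the point: whether the loop terminates on something meaningful depends entirely on how reliable the checking side is.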
How complex is the adversarial model? If it’s anywhere near the generative model, I don’t think you can have actual meaningful numbers about its reliability that allow you to reason about how meaningful the test results it produces are.