It’s to illustrate the alignment problem. What you literally ask isn’t always what you actually want. This is usually obvious to humans but not necessarily to an AI. If you sit in a self-driving car and tell it to take you to the airport as fast as possible, you might arrive three minutes later covered in vomit with the entire police department after you. That’s obviously not what you wanted, but you got exactly what you asked for.
The paperclip maximizer is a cartoon example of this. If you just ask it to make as many paperclips as possible, that becomes its priority number one and everything gets turned into paperclips and you might not get the chance to tell it this isn’t what you meant.
A kind of real-life example is the story of a city that started paying people for rat tails to eradicate the rat population, only for folks to start breeding rats instead to make money. It’s a classic case of unintended results due to unspecific requirements.
Undecidable in the sense that no solution can exist for that problem class. You can start with the definition of what exactly you’re aligning with, how you measure that, how you derive applicable system evolution constraints from your measurements, and just what humanity is, in the iterative context.
Apart from that we’re already in an out of control winner-takes-all arms race where AI is used by competing nations, including social control and battlefield. Ivory tower is a meal ticket with no practical relevance.
It’s to illustrate the alignment problem. What you literally ask isn’t always what you actually want. This is usually obvious to humans but not necessarily to an AI. If you sit in a self-driving car and tell it to take you to the airport as fast as possible, you might arrive three minutes later covered in vomit with the entire police department after you. That’s obviously not what you wanted, but you got exactly what you asked for.
The paperclip maximizer is a cartoon example of this. If you just ask it to make as many paperclips as possible, that becomes its priority number one and everything gets turned into paperclips and you might not get the chance to tell it this isn’t what you meant.
A kind of real-life example is the story of a city that started paying people for rat tails to eradicate the rat population, only for folks to start breeding rats instead to make money. It’s a classic case of unintended results due to unspecific requirements.
Alignment is undecidable, so no point wasting synapseseconds.
It’s not a matter to decide but a problem to try and solve. In most cases we get to learn from our mistakes but when it comes to AGI we might not.
Or are you suggesting we shouldn’t even think about it but rather just roll the dice and see what happens?
Undecidable in the sense that no solution can exist for that problem class. You can start with the definition of what exactly you’re aligning with, how you measure that, how you derive applicable system evolution constraints from your measurements, and just what humanity is, in the iterative context.
Apart from that we’re already in an out of control winner-takes-all arms race where AI is used by competing nations, including social control and battlefield. Ivory tower is a meal ticket with no practical relevance.