It seems that every few weeks some developer makes this same mistake, and a news story gets published each time.
This keeps happening. I can understand using AI to help code, I don’t understand Claude having so much access to a system.
I don’t feel an inkling of sympathy. Play stupid games, win stupid prizes.
Lmao good.
bad backup vibes there boss? backup was the task?
That’s what version control is for.
You either have a backup or will have a backup next time.
Something that is always online and can be wiped while you’re working on it (by yourself or with AI, doesn’t matter) shouldn’t count as backup.
He did have a backup. This is why you use cloud storage.
The operator had to contact Amazon Business support, which helped restore the data within about a day.
AI or not, I feel like everybody has had “the incident” at some point. After that, you obsessively keep backups.
For me it was my entire “Junior Project” in college, which was a music album. My Windows install (Vista at the time - I know, Vista was awful, but it was the only thing that would use all 8 GB of my RAM, because x64 XP wasn’t really a thing) bombed out, and I was like “no biggie, I keep my OS on one drive and all of my projects on the other, I’ll just reformat and reinstall Windows.”
Well… I had two identical 250gb drives and formatted the wrong one.
Woof.
I bought an unformat tool that was able to recover mostly everything, but I lost all of my folder structure and file names. It was just 000001.wav, 000002.wav, etc. I was able to re-record and rebuild, but man… Never made that mistake again. Like I said, I now obsessively back up. Stacks of drives, cloud storage, drives in different locations, etc.
TestDisk has saved my ass before. It’s great at recovering broken partitions. If it’s just a quick format done with no encryption involved, you have a very high chance of having your stuff back. That’s of course if you catch yourself after doing just the format.
Other than that, yeah, I’ve also had my moments. Back in high school not only did I not have money for an external drive - I didn’t even have enough space on my primary one. One time a friend lent me an external drive to do a backup and do a clean reinstall - and I can’t remember the details, but something happened such that the external drive got borked - and said friend had important stuff that was only on that hard drive. Ironically enough it wasn’t even something taking much space - it was text documents that could’ve lived in an email attachment.
AI or not, I feel like everybody has had “the incident” at some point. After that, you obsessively keep backups.
Yup!
Also, totally unrelated helpful tip - triple-check your inputs and outputs when using dd to clone a drive. dd works great to clone an old drive onto a new blank one. It is equally efficient at cloning a blank drive full of nothing but 0s over an old drive that has some 1s mixed in.
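A quick sketch of that triple-check ritual - the device names in the comments are hypothetical (yours will differ), and the runnable part at the bottom uses plain files as stand-ins so nothing real gets wiped:

```shell
#!/bin/sh
# With real drives, confirm which disk is which BEFORE touching dd
# (/dev/OLD and /dev/NEW below are placeholders, not real names):
#   lsblk -o NAME,SIZE,MODEL,SERIAL
#   sudo dd if=/dev/OLD of=/dev/NEW bs=4M status=progress conv=fsync
# Remember: if= is read FROM, of= is written OVER. Swap them and the
# blank drive cheerfully clones its zeros onto your data.

# Safe demo with plain files standing in for drives:
printf 'precious data' > old.img   # the drive with your 1s on it
: > new.img                        # the blank drive, all zeros

dd if=old.img of=new.img bs=1M 2>/dev/null   # correct direction
cmp -s old.img new.img && echo "clone verified"
```

Running it the other way around (`if=new.img of=old.img`) would "verify" just as happily, with both files now empty - which is the whole point of the warning above.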
And that’s a great example where a GUI could be way better at showing you what’s what and preventing such errors.
If you’re automating stuff, sure, scripting is the way to go, but for one-off stuff like this, seeing more than text and maybe throwing in a confirmation dialogue can’t hurt - and the tool might still be using dd underneath.
Nobody wants to point out that Alexey Grigorev changes to being named Gregory after 2 paragraphs?
Slop journalism at its sloppiest. I wouldn’t be surprised to find out that this story was entirely fabricated.
The developer is to blame. Using a cutting edge tool irresponsibly. I have made mistakes using AI to help coding as well, never this bad though. Blaming AI would be like blaming the hammer a roofer was using to hammer nails and slamming their finger accidentally with it. You don’t blame the hammer, you blame the negligence of the roofer.
A developer having the ability to accidentally erase your production db is pretty careless.
An AI agent having the ability to “accidentally” erase your production db is fucking stupid as all fuck.
An AI agent having the ability to accidentally erase your production db and somehow also all the backup media? That requires a special course on complete dribbling fuckwittery.
That’s it Son of Anton is banned.
The lesson: AI cannot bridge an air-gapped backup. This could all have been prevented with a crappy portable hard drive from Costco.
I am still unable to delete the backup. Trying *nuke tool*.
[Enter nuclear codes]: I was able to remove the backup to eradicate the error from both the production and development environments. But wait a second, the user specified not to lose data. But I just eliminated all versions of the data. The user won’t be happy. Oopsie whoopsie!
The best prevention is not letting it happen in the first place. If your backup is a crappy portable hard drive from Costco, you get what you pay for - I wouldn’t have much faith in that either.
The best prevention is not letting it happen in the first place.
Ya think?! We’re past that.
Completely unnecessary for you to preemptively assume someone would choose a “crappy” backup from a retail store when in fact such a backup would still likely have saved the day, and any half-decent dev should at least have some kind of RAID backup on site and better yet an offsite one too.
The flaw was not having any backup, not your straw man of a poor quality choice.
That’s why I have multiple.
<insert Padme meme>: You had a backup, right?
Stop giving chat bots tools with this kind of access.
Wrong answer. If you don’t give them access, the alternative (ruling out not using AI because leadership will never go for that) is to hire high school kids to take a task from a manager, ask the ai to do it, then do what the AI says repeatedly to iterate to the solution. The problem with that alt is that it is no better than giving the ai access, and it leaves you with no senior tech people. Instead, you give it access, but only give senior tech people access to the AI. Ones who would know to tell the AI to have a backup of the database, one designed to not let you delete it without multiple people signing off.
Senior tech people aren’t going to spend their time trying things an AI needs tried to find the solution. So if you don’t give it access, they won’t use it, and eventually they will all be gone. Then you are even further up shit creek than you are now.
The answer, overall, is smarter people talking to the AI, and guardrails to stop a single point of failure. The latter is nothing new.
What is this insane rambling?
The alternative is that the only thing with access to make changes in your production environment is the CI pipeline that deploys your production environment.
Neither the AI, nor anything else on the developers machine, should have access to make production changes.
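That “only the pipeline touches prod” rule can be enforced with something as dumb as a gate at the top of the deploy script. This is only a sketch - the `CI` environment variable is an assumption (most CI systems, including GitHub Actions, set `CI=true`), and the deploy step itself is a placeholder:

```shell
#!/bin/sh
# Hypothetical deploy gate: production changes only run from CI.
# The CI variable name is an assumption; adapt to your pipeline.
deploy_prod() {
    if [ "${CI:-}" != "true" ]; then
        echo "refusing: production deploys only run from CI" >&2
        return 1
    fi
    echo "deploying to production..."   # placeholder for the real step
}

# From a developer (or AI agent) machine, CI is unset, so this refuses;
# inside the pipeline, CI=true and it proceeds.
deploy_prod || echo "blocked on this machine"
```

It won’t stop a determined human, but it does stop an agent that’s just running whatever shell commands seem helpful.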
Nah. As a tech person, I am not going to give an LLM write access to anything in production, period.
What are you even talking about?
The answer is no AI. It’s really simple. The costs for ai are not worth the output.
Do you go on an oncall rotation by chance? Because anyone that has to respond to night time pages would not be saying this lol.
I’m in favour of hiring kids to figure out the solution through iteration and doing web searches etc. If they fuck up, then they learn and eventually become better at their job - maybe even becoming a Senior themselves eventually.
I get what you’re saying - Seniors are more likely to use the tools more effectively, but there are many cases of the AI not doing what it’s told. It’s not repeatably consistent like a bash script.
People are better - always.
No risk, no reward. People are desperate for these tools to help them succeed.
We’ve always been succeeding even without them. I don’t see why anyone would try to work in IT if they don’t… want to work lol
Success bigly, even.
We used to say RAID is not a backup. It’s redundancy.
Snapshots are not a backup. They’re a system restore point.
Only something offsite, off-system, and only accessible with separate authentication details, is a backup.
Circa 1997 I was making some innovative new games, employed by a dude who’d put millions of his own money into the company. He was completely nonplussed when I brought him 20 CDs in a sealed box to remove from the building and store off site. He thought I’d lost my damned mind and blew it off as ravings of a stressed dev. I pointed out real threats to our IP including the hardware failures and even so far as the building burning down. 2 years of custom art and code gone. “Unlikely. Relax.”
After I moved on… an ex co-worker who’s still a longtime friend, tells me a different division lost a huge amount of FMV over some whoops-I-destroyed-the-wrong-drive blunder. 20 days to render on an 8 or 10 machine farm. Poof - No backups. In 1997 even with top-of-the-line gear it took an insane investment to render quality 3D.
The friggin’ carelessness irks the shit out of me as I type ahah
3-2-1 Backup Rule: three copies of your data, on two different types of storage media, with one copy offsite.
AND something tested to restore successfully, otherwise it’s just unknown data that might or might not work.
(i.e. reinforcing your point, no disagreements)
AKA Schrödinger’s Backup. Until you have successfully restored from a backup, it is just an amorphous blob of data that may or may not be valid.
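One way to collapse that wavefunction is a scripted restore drill: restore into a scratch location and compare against a checksum manifest. Everything here (paths, tar as the backup format) is just an illustrative toy, not anyone’s real setup:

```shell
#!/bin/sh
set -e
# Toy restore drill: back up a directory, restore it elsewhere,
# and verify every file against a checksum manifest.
mkdir -p data restore
echo "hello" > data/file.txt          # stand-in for real data

# "Backup": record checksums of everything, then archive it.
( cd data && find . -type f -exec sha256sum {} + > ../manifest.txt )
tar -czf backup.tar.gz -C data .

# "Restore" into a separate directory and check every file.
tar -xzf backup.tar.gz -C restore
( cd restore && sha256sum -c ../manifest.txt >/dev/null ) \
  && echo "restore verified"
```

Until something like that last line has actually printed, the backup is still in the box with the cat.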
I say this as someone who has had backups silently fail. For instance, just yesterday, I had a managed network switch generate an invalid config file for itself. I was making a change on the switch, and saved a backup of the existing settings before changing anything. That way I could easily reset the switch to default and push the old settings to it, if the changes I made broke things. And like an idiot, I didn’t think to validate the file (which is as simple as pushing the file back to the switch to see if it works) before I made any changes.
Sure enough, the change I made broke something, so I performed a factory reset and went to upload that backup I had saved like 20 minutes prior… When I tried to restore settings after the factory reset, the switch couldn’t read the file that it had generated like 20 minutes earlier.
So I was stuck manually restoring the switch’s settings, and what should have been a quick 2 minute “hold the reset button and push the settings file once it has rebooted” job turned into a 45 minute long game of “find the difference between these two photos” for every single page in the settings.
That’s always just one of the worst feelings in the world. This thing is supposed to work and be easy and… nope. Not there. It’s gone. Now you have work to do. heh
Always a fun time when technology decides to just fuck you over for no reason
But the backup software verified the backup!
Schrödinger’s backup
Fukan yes
- Download all assets locally
- proper 3-2-1 of local machines
- duty roster of other contributors with same backups
- automate and have regular checks as part of production
- also sandbox the stochastic parrot
I remember back when I first started, seeing a DR plan with three tiers of restore: 1 hour, 12 hours, or 72 hours. I knew that the 1 hour meant a simple redirect to a DB partition that was a real-time copy of the active DB, and the 12 hours meant that had failed, so it was a restore-point exercise that would mean some data loss - but less than an hour’s worth, or something like that.
I had never heard of 72 hours and so raised a question in the meeting. 72 hours meant having physical tapes shipped to the data center, and I believe meant up to 12 (though it could have been 24) hours of data lost. I was impressed by this, because the idea of having a job that ran either daily or twice daily that created tape backups was completely new to me.
This was in the early aughts. Not sure if tapes are still used…
An LTO drive with a non-consumer interface?
We still say that.