And additionally, isn’t there a way to exploit this so we can store more stuff on PCs?
Edit: can’t thank you all individually but thanks to everyone, I learnt something today, appreciate all of your replies!
Because your operating system is lying to you, for efficiency’s sake.
Imagine an old school library, books on the shelves, and a Dewey decimal card catalog index in the center.
You want to delete a book to make room for future books, so you tell the librarian, “Delete this book.” She removes the card from the card catalog index, turns to you, and says the book is gone!
In this scenario the book is still on the shelf, but the index no longer points to it.
Clearly the book isn’t gone, but from your perspective you don’t have to wait for the book to disappear, and the librarian knows eventually she’s going to clean the shelf, and remove whatever isn’t in the index.
That’s more or less, with a lot of hand-waving, what operating systems do for file systems.
In this analogy, when you add a new book, only then is that “deleted” book actually removed and replaced with the new one. Until then, it just sits there waiting, but since nothing is pointing to it, it’s hard to find.
When someone recovers a file, what they’re doing is going book by book and reconciling the index to see if there’s anything missing. Since this book still exists, it can be recovered.
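If it helps to see the library analogy as code, here is a minimal sketch in Python. Everything in it (the `ToyDisk` class, its methods, the file names) is made up for illustration and is not how any real file system is implemented; it only shows that deleting removes the index entry while the data stays put until the slot is reused.

```python
class ToyDisk:
    """A toy 'disk': a list of blocks plus an index mapping names to blocks."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks   # the shelves
        self.index = {}                     # the card catalog: name -> block number
        self.free = set(range(num_blocks))  # slots the index considers available

    def write(self, name, data):
        block = min(self.free)              # pick the lowest free slot
        self.free.remove(block)
        self.blocks[block] = data           # only now is any old content replaced
        self.index[name] = block
        return block

    def delete(self, name):
        block = self.index.pop(name)        # remove the card from the catalog
        self.free.add(block)                # mark the shelf slot as reusable
        # self.blocks[block] is deliberately left untouched

    def scan_unindexed(self):
        """Recovery: walk every block and report data the index no longer points to."""
        indexed = set(self.index.values())
        return {i: d for i, d in enumerate(self.blocks)
                if d is not None and i not in indexed}


disk = ToyDisk(4)
disk.write("bambi.txt", "once upon a time")
disk.delete("bambi.txt")
recovered_before = disk.scan_unindexed()     # the "deleted" data is still in block 0

disk.write("party.txt", "birthday footage")  # reuses block 0 and overwrites it
recovered_after = disk.scan_unindexed()      # nothing left to recover
```

Here `recovered_before` still holds the old text, and `recovered_after` is empty because the new file landed on the same slot.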
If you remember the VCR days, imagine your hard drive is a copy of Bambi. You, in preparation for a family event, need a tape to store footage of the event on. You decide that you haven’t watched or wanted to watch Bambi in a long time, so you designate that tape as the one you’re gonna use when the party day comes.
At this point your hard drive (the copy of Bambi) has been designated as useable space for new data to be written in the future.
Bambi is not lost yet and won’t be until you write to that tape, so if you wanted to, you could watch Bambi in the time between now and the party even though you plan to overwrite it. Once Bambi is overwritten, it’s no longer recoverable, but in the interim between when you designate it as useable space and when the space is actually used, the data persists.
It’s because deleting a file doesn’t turn every written bit on the hard drive into a 0. Instead, the operating system just marks the region you deleted as free for writing again.
At some point in the future, through normal usage, that region will either be partly overwritten (so the old file reads as corrupt) or have something completely different in it. From our perspective, even though the leftover data may read as corrupt, the region will still work as expected when written into.
If I tell you all the boxes in a warehouse are empty, that doesn’t mean they are. It just means I think they are. You can go and check them manually to see if they’re actually empty or if I was lying or forgot there was stuff in them. The metaphor breaks down a little bit here but if you look at the boxes closely, the ones with dust on top were probably empty for a long time and the ones without were probably emptied recently.
Because of how filesystems work. There’s basically an index that tells the OS what files are stored where on the disk. The quickest way of deletion simply removes the entry in that table. The data is still there, though. So a data recovery program would read the entire disk and try to rebuild the file allocation table or whatever by detecting the beginnings and ends of files. This worked better on mechanical drives than SSDs, since SSDs with TRIM enabled may actually erase freed blocks shortly after deletion.
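To make "detecting the beginnings and ends of files" concrete, here is a rough sketch in Python of scanning raw bytes for JPEG start and end markers. The function name `carve_jpegs` and the fake disk image are invented for this example. Real recovery tools are far more careful: files can be fragmented across the disk, and the end marker can also appear inside a file (for instance in an embedded thumbnail).

```python
JPEG_START = b"\xff\xd8\xff"   # bytes that begin a JPEG file
JPEG_END = b"\xff\xd9"         # bytes that end a JPEG file


def carve_jpegs(raw):
    """Return every byte run in `raw` that starts and ends like a JPEG."""
    found = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break                           # no more start markers
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break                           # start without an end: give up
        end += len(JPEG_END)                # include the end marker itself
        found.append(raw[start:end])
        pos = end                           # keep scanning after this file
    return found


# A fake "disk image": zeroed space, one deleted picture, other data, more zeros.
fake_picture = JPEG_START + b"pixels" + JPEG_END
disk_image = b"\x00" * 16 + fake_picture + b"other data" + b"\x00" * 16

carved = carve_jpegs(disk_image)            # finds the picture without any index
```

The point is that the scan never consults an index; it only looks at what is physically on the "disk".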
Yup, and many security suites will include a tool that writes all 0s or garbage to those sectors so the data can’t be recovered as easily (multiple passes were the traditional advice, though on modern drives a single pass is generally considered enough).
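For the curious, a bare-bones version of "write all 0s before deleting" could look like the Python sketch below. The function names `zero_fill` and `overwrite_and_delete` are made up here, and this is only an illustration of the idea, not a secure-erase tool: on SSDs, and on journaling or copy-on-write file systems, the zeros may be written to different physical blocks than the original data, so the old bytes can survive.

```python
import os


def zero_fill(path, chunk_size=1024 * 1024):
    """Overwrite a file's existing contents with zeros, in place."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as f:            # in-place write, no truncation
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\x00" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())                # ask the OS to push the zeros to the device


def overwrite_and_delete(path):
    """Zero the file first, then remove its index entry as usual."""
    zero_fill(path)
    os.remove(path)
```

Calling `overwrite_and_delete("some_file.bin")` (a placeholder name) zeroes the file and then deletes it, so a later scan of that region would ideally find only zeros.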