When I was a senior at Swarthmore College, Herb Wilf came over from U Penn to teach a course in combinatorial algorithms. I was encouraged to attend.
He claimed that choosing a subset of k integers at random from {1..n} should have a log in its complexity, because one needs to sort to detect duplicates. I realized that if one divided {1..n} into k bins, a duplicate pair would have to land in the same bin, so one could detect duplicates within each bin independently, for a linear algorithm. I chose bubble sort for the bins because the average occupancy was 1, so bubble sort gave the best constant.
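For the curious, here is a sketch of the idea in today's Python. This is my after-the-fact reconstruction, not what I had then; the function name, the bin width, and the redraw loop are illustrative choices, not Wilf's published version:

    import random

    def random_subset(n, k):
        # Draw k distinct integers uniformly at random from {1..n}.
        # A duplicate pair can only collide inside one bin, so each bin
        # is deduplicated on its own; with k bins the expected occupancy
        # is 1, so the per-bin sort costs O(1) on average.
        assert 1 <= k <= n
        width = -(-n // k)                 # ceil(n / k): k bins cover 1..n
        bins = [[] for _ in range(k)]
        remaining = k
        while remaining:
            for _ in range(remaining):     # redraw only the rejected slots
                x = random.randint(1, n)
                bins[(x - 1) // width].append(x)
            remaining = 0
            for b in bins:
                b.sort()                   # tiny bin; I used bubble sort then
                i = len(b) - 1
                while i > 0:
                    if b[i] == b[i - 1]:   # duplicate within the bin
                        b.pop(i)           # keep one copy, redraw the rest
                        remaining += 1
                    i -= 1
        return [x for b in bins for x in b]

Each round redraws only the rejected duplicates, which is ordinary rejection sampling done in batches, so the result is a uniform k-subset and the expected total work stays linear when k isn't close to n.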
I described this algorithm to him around 5pm, at the end of his office hours, as he was facing horrendous traffic home. I looked like George Harrison post-Beatles, and probably smelled of pot smoke. Understandably, he didn't recognize a future mathematician.
Around 10pm the dorm hall phone rang: one of my professors, relaying an apology from Wilf for brushing me off. He had gotten it, and he credited me with many ideas in the next edition of his book.
Of course, I eventually found all of this and more in Knuth's books. I was disillusioned; I had imagined that adults read everything. Later I came to understand how unrealistic that was.
Interesting, when was this? Because that seems obvious today: basically a hash set, right?
Love anecdotes like this! But admittedly I feel a bit lost, so please forgive my ignorance when I ask: why does choosing a subset of k integers at random require deduplication? My naive intuition is that sampling without replacement can be done in linear time (hash table to track chosen elements?). I’m probably not understanding the problem formulation here.
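Concretely, something like this is what I'm picturing (just a sketch, and the function name is mine):

    import random

    def sample_without_replacement(n, k):
        # Rejection sampling with a set: redraw on any collision.
        # Expected O(k) draws as long as k isn't close to n.
        chosen = set()
        while len(chosen) < k:
            chosen.add(random.randint(1, n))
        return list(chosen)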