I've never understood how people can claim that learning kana takes a week. It clearly takes more time than that, considering how similar some of the symbols are: a lot of them differ only by a stroke or by the double dashes (think nu vs me, ne vs re, ro vs ru, chi vs sa, and so on). Then there are the combinations, and even if you manage to learn hiragana, you still have to learn katakana.
Oh, and I forgot: you have to actually learn how to hear, pronounce, and speak them, not just learn a useless romanization mapping. I've heard way too many English speakers just say the romanization with English pronunciation. At that point their learning efforts turn into self-sabotage.
In total that's definitely a month of effort, albeit spread out over the first year of learning.
It takes a week to learn the system and to know the existence of at least all the hiragana characters and memorize the sound of some of them.
It takes two weeks to know the existence of all the kana characters (including katakana), to memorize the sound of enough of them to read some words, and to write some of them.
After a month you should have easily memorized the sound of all of them (maybe a rare one like ム slips by occasionally), be able to write most of them, and be able to read (albeit slowly) anything written in kana.
I think "a week" is slightly optimistic, but I also think "a month" is slightly pessimistic. When I learned hiragana, I spent my free time drilling on RealKana [0]. I'd focus on a new column of the kana table, then bring in columns I'd already practiced, until eventually I could flash reliably on cards drawn from the entire table. This legitimately didn't take much longer than a week, because learning a single column in isolation is very quick, and the real difficulty comes in distinguishing similar kana (as you say). But I was able to drill similar kana by selecting two or three kana that would force me to see them often. (I still struggle slightly with wa, re, and ne, but I definitely know them.)
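For anyone who wants to roll their own, the column-by-column routine is easy to sketch in Python. This is not how RealKana works internally, just a minimal illustration of the idea. The names `COLUMNS`, `build_deck`, `confusable_deck`, `shuffled`, and the `new_weight`/`repeats` parameters are all made up for the example, only three columns are filled in, and the romaji strings are just answer labels.

```python
import random

# Three columns of the hiragana table, for illustration only;
# extend with the remaining columns as you learn them.
COLUMNS = {
    "a": {"あ": "a", "い": "i", "う": "u", "え": "e", "お": "o"},
    "k": {"か": "ka", "き": "ki", "く": "ku", "け": "ke", "こ": "ko"},
    "s": {"さ": "sa", "し": "shi", "す": "su", "せ": "se", "そ": "so"},
}


def build_deck(learned, new, new_weight=3):
    """Cards from every learned column once, plus the new column new_weight times."""
    deck = []
    for name in learned:
        deck.extend(COLUMNS[name].items())
    # Repeat the new column so it shows up more often than the review cards.
    deck.extend(list(COLUMNS[new].items()) * new_weight)
    return deck


def confusable_deck(cards, repeats=5):
    """A tiny deck of look-alike kana, repeated so each one comes up often."""
    return list(cards) * repeats


def shuffled(deck, seed=None):
    """Return a shuffled copy of the deck (seed is only there for repeatability)."""
    cards = list(deck)
    random.Random(seed).shuffle(cards)
    return cards


# Drill the "s" column while reviewing "a" and "k".
session = shuffled(build_deck(learned=["a", "k"], new="s"))

# Separately, force two similar kana to appear over and over.
lookalikes = shuffled(confusable_deck([("ぬ", "nu"), ("め", "me")]))
```

Weighting the new column is a guess at what makes the routine work; the look-alike deck is the "select two or three kana" trick from above.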
I also drilled on a drag-n-drop kana table [1] in a few ways -- sometimes I'd start from the kana and try to figure out where they should go in the table, and sometimes I'd go along rows or columns in the table and try to find the kana that belong there. These two directions drill both recognition and recollection.
Proper pronunciation is a cross-cutting concern. As a whole, it's not something you can reasonably learn solely from kana, but the aspects that are relevant are not difficult to pick up. Every kana breaks into one (vowels and N) or two (the rest) phonemes, and for the most part, the way you pronounce those phonemes is consistent across rows and columns of the table (admitting exceptions like "shi" and "tsu"). If you are taught those basics, learning how to pronounce kana is not hard. Training your ear to "hear" distinctions among English allophones, and to distinguish pitch accent from the more familiar stress accent, is much harder, and really has to come from experience, not just kana.
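To make the "consistent across rows and columns" point concrete, here is a rough Python sketch that builds readings from a consonant plus a vowel and patches in the irregular ones. It assumes Hepburn-style romanization, where the irregular spellings in these rows are "shi", "chi", "tsu", and "fu". The names `VOWELS`, `CONSONANTS`, `EXCEPTIONS`, and `reading` are invented for the example, and the y, w, and N entries are left out because those rows are incomplete. The output is only a label for the sound, not a pronunciation guide.

```python
VOWELS = ["a", "i", "u", "e", "o"]
# "" stands for the plain vowel kana (one phoneme instead of two).
CONSONANTS = ["", "k", "s", "t", "n", "h", "m", "r"]

# Hepburn spellings for the kana that break the regular pattern.
EXCEPTIONS = {"si": "shi", "ti": "chi", "tu": "tsu", "hu": "fu"}


def reading(consonant, vowel):
    """Combine consonant and vowel, substituting the irregular spelling if any."""
    regular = consonant + vowel
    return EXCEPTIONS.get(regular, regular)


# One list of five readings per consonant group.
table = {c: [reading(c, v) for v in VOWELS] for c in CONSONANTS}

for consonant, readings in table.items():
    print(" ".join(readings))
```

Out of 40 entries only four need a special case, which is roughly why the table is quick to pick up once someone explains the pattern.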
[0]: https://realkana.com/hiragana, wow it's improved since I last used it
[1]: https://ohelo.github.io/usagi-chan/hiragana/