Hi dev,
> Warning: This overwrites data starting at sector 64! Use a dedicated blank USB.
This would be much more usable if you put a file system on the USB stick, containing a sufficiently large, empty (blank) file.
Note, you don't need to implement a file system driver. Just create the file system with a big enough file in it and use that file's starting sector instead of 64. You could even update the file's size to match the data written (that's just one more sector write, and again, you don't need to interpret the file system, just patch one integer in a sector).
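One way to find that starting sector at build time, as a rough sketch: this assumes (my assumption, not something the project does) that you write a unique marker at the very beginning of the placeholder file and that the file is stored contiguously, which is typical for a freshly created file system but not guaranteed. `find_start_sector` and the marker are made-up names for illustration.

```python
SECTOR_SIZE = 512  # assumed sector size of the USB stick


def find_start_sector(image: bytes, marker: bytes) -> int:
    """Return the first sector-aligned position of marker, as a sector number.

    Returns -1 if the marker never appears at the start of a sector.
    """
    off = image.find(marker)
    while off != -1:
        if off % SECTOR_SIZE == 0:
            return off // SECTOR_SIZE
        # Skip hits that are not sector-aligned and keep searching.
        off = image.find(marker, off + 1)
    return -1


if __name__ == "__main__":
    # Tiny fake image: two empty sectors, then the "file" starts at sector 2.
    fake = bytes(2 * SECTOR_SIZE) + b"PLACEHOLDER" + bytes(SECTOR_SIZE - 11)
    print(find_start_sector(fake, b"PLACEHOLDER"))
```

The resulting number is what you'd hardcode in place of 64.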
I'd recommend ext2 (FAT is not good, because it limits files to 4G at most, 2G on FAT16, and exFAT is in patent hell). Create it with mkfs.ext2, add a large file with dd, then save the image data as a bunch of "db" lines in your asm source. You'd then assemble the file system along with stage1/2, without the need to interpret what you're writing.
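Converting the image into "db" lines could look something like this. It's only a sketch: `image_to_db_lines` and the file name in the usage part are made up for illustration.

```python
def image_to_db_lines(data: bytes, per_line: int = 16) -> list:
    """Render raw bytes as assembler `db` lines, per_line bytes on each."""
    lines = []
    for off in range(0, len(data), per_line):
        chunk = data[off:off + per_line]
        lines.append("db " + ", ".join(f"0x{b:02x}" for b in chunk))
    return lines


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python img2db.py fs.img > fs.inc
    with open(sys.argv[1], "rb") as f:
        print("\n".join(image_to_db_lines(f.read())))
```

For a big image the generated source gets huge, so including the image file directly may be more practical (nasm has incbin, fasm has the file directive).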
Oh, one more thing: some BIOSes check the first bytes of the boot sector as well (not just the last two bytes), so you should start your boot sector with a short jmp and a nop. Also, it's not guaranteed that the direction flag is cleared, so add a cld after cli.
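A quick sanity check you could run on the assembled boot sector, as a sketch: it only looks for the short jmp opcode (0xEB), a nop (0x90) as the third byte, and the 0x55 0xAA signature at the end. What a particular BIOS really checks is firmware-specific, and `check_boot_sector` is a made-up helper name.

```python
def check_boot_sector(sector: bytes) -> list:
    """Return a list of problems found; an empty list means it looks fine."""
    if len(sector) != 512:
        return ["boot sector must be exactly 512 bytes"]
    problems = []
    if sector[510:512] != b"\x55\xaa":
        problems.append("missing 0x55 0xAA signature at offset 510")
    # 0xEB is the short jmp opcode (followed by a 1-byte offset), 0x90 is nop.
    if sector[0] != 0xEB or sector[2] != 0x90:
        problems.append("does not start with short jmp + nop")
    return problems


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python checkboot.py boot.bin
    with open(sys.argv[1], "rb") as f:
        for problem in check_boot_sector(f.read(512)):
            print(problem)
```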
To get some ideas, here's a boot sector that loads an EFI PE/COFF executable from a FAT file system (without interpreting the fs), sets up long mode and executes it:
https://gitlab.com/bztsrc/easyboot/-/blob/main/src/boot_x86....
(Notes: it's written for flat assembler, which uses an Intel syntax very similar to nasm's. The 2nd stage EFI executable is written in C very carefully, so it doesn't matter whether it's loaded by this boot sector on BIOS or by the UEFI firmware, the same binary just works everywhere.)
As for developing EFI apps, I don't use EDK2, because it's messy and bloated. Instead I've written my own UEFI SDK, which is much easier to use, you might find it useful too: https://gitlab.com/bztsrc/posix-uefi