Interesting work! Floor plans are a great real-world testbed for data-centric AI because the bottleneck is almost always annotation quality, not model architecture.
We’ve seen similar patterns in document layout and indoor mapping projects: cleaning mislabeled walls/doors, fixing class imbalance (e.g., tiny symbols vs large rooms), and enforcing geometric consistency often gives bigger gains than switching models. For example, simply normalizing scale, snapping lines, and correcting room boundary labels can outperform moving from a basic U-Net to a heavier transformer.
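To make the "snapping lines / normalizing scale" point concrete, here is a minimal sketch of the kind of geometric cleanup I mean. Everything here is illustrative: the `snap_lines` function name, the tolerance values, and the assumption that wall segments arrive as `(x1, y1, x2, y2)` tuples in scale-normalized `[0, 1]` coordinates are all mine, not from the project under discussion.

```python
import numpy as np

def snap_lines(segments, angle_tol_deg=3.0, grid=0.05):
    """Snap nearly axis-aligned wall segments to exact horizontal/vertical
    orientation and quantize endpoints to a coordinate grid.

    segments: iterable of (x1, y1, x2, y2) tuples, coordinates in [0, 1].
    """
    snapped = []
    for (x1, y1, x2, y2) in segments:
        # Segment angle folded into [0, 180) so direction doesn't matter.
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180
        if angle < angle_tol_deg or angle > 180 - angle_tol_deg:
            # Nearly horizontal: force both endpoints to the mean y.
            y1 = y2 = (y1 + y2) / 2
        elif abs(angle - 90) < angle_tol_deg:
            # Nearly vertical: force both endpoints to the mean x.
            x1 = x2 = (x1 + x2) / 2
        # Quantize endpoints so coincident walls actually coincide.
        snapped.append(tuple(round(v / grid) * grid for v in (x1, y1, x2, y2)))
    return snapped

# A slightly tilted "horizontal" wall becomes exactly horizontal:
print(snap_lines([(0.1, 0.2, 0.9, 0.21)]))
```

This sort of pass is cheap, model-agnostic, and tends to fix whole families of label errors at once, which is why it often beats an architecture swap.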
A reproducible pipeline plus curated datasets feels especially valuable here for downstream tasks like indoor navigation, energy modeling, or digital twins, where noisy labels quickly compound into bad geometry.
Would be curious how you handle symbol ambiguity (stairs vs ramps, doors vs windows) and cross-domain generalization between architectural styles.
Nice focus on data quality over model churn.