EternalFury • last Monday at 8:59 PM • 1 reply

They learn the value of specific actions in specific contexts, based on the rewards they received during their play time. Those action-context pairs don't transfer to other settings, for various reasons. John mentioned that varying frame rates and variable latency between action and effect really confuse the models.

Replies

nightpool • last Monday at 9:41 PM

Okay, so fuzz the frame rate and latency? That feels very easy to fix.
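
A minimal sketch of what "fuzzing" the timing could look like during training, assuming a simple `reset()`/`step(action)` environment interface. `ToyEnv`, `FuzzedTimingEnv`, and the parameters `max_skip`, `max_delay`, and `noop` are hypothetical names made up for illustration, not from any particular library or from the setup discussed above. Each episode samples a new frame skip (a stand-in for a varying frame rate) and a new action delay (latency between action and effect).

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: the state is a running sum of actions."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += action
        reward = float(action)
        return self.state, reward


class FuzzedTimingEnv:
    """Wraps an env and randomizes frame skip and action latency per episode."""

    def __init__(self, env, max_skip=4, max_delay=3, noop=0, seed=None):
        self.env = env
        self.max_skip = max_skip    # upper bound on frames per agent action
        self.max_delay = max_delay  # upper bound on steps before an action lands
        self.noop = noop            # action applied while the delay queue fills
        self.rng = random.Random(seed)
        self.skip = 1
        self.pending = deque()

    def reset(self):
        # Sample new timing for this episode (assumed: fixed within an episode).
        self.skip = self.rng.randint(1, self.max_skip)
        delay = self.rng.randint(0, self.max_delay)
        self.pending = deque([self.noop] * delay)
        return self.env.reset()

    def step(self, action):
        # Queue the new action and apply the one issued `delay` steps ago.
        self.pending.append(action)
        delayed = self.pending.popleft()
        total = 0.0
        for _ in range(self.skip):  # repeat the action for `skip` frames
            obs, reward = self.env.step(delayed)
            total += reward
        return obs, total


env = FuzzedTimingEnv(ToyEnv(), seed=0)
obs = env.reset()
for _ in range(5):
    obs, reward = env.step(1)
```

Whether training under randomized timing like this is enough for the learned behavior to transfer is exactly the open question in this thread; the sketch only shows that the randomization itself is cheap to add.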
