- Neuroscience
A recurrent neural network and its readout (modeling cortex and striatum) can learn state representations and values online when the temporal-difference reward-prediction error (dopamine) is fed back through random weights. Learning succeeds either through feedback alignment or through the loose alignment induced by the biological constraint that weights be non-negative.
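The idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cue-reward task, the sigmoid units, the network size, the learning rate, and the names `train`, `A`, `B`, `w`, and `c` are all assumptions made for illustration. The readout weights `w` learn by ordinary TD, while the recurrent weights `A` are updated using a fixed random vector `c` in place of `w`. The optional `nonneg` flag clips `w` at zero as a loose stand-in for the non-negative-weight constraint.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(n_units=8, n_trials=200, trial_len=8, cue_t=1, reward_t=4,
          gamma=0.8, lr=0.05, nonneg=False, seed=0):
    """Online TD learning in a small RNN (illustrative sketch only).

    Recurrent weights are trained with fixed random feedback `c`
    rather than with the readout weights `w` that backprop would use.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, 0.1, (n_units, n_units))  # recurrent ("cortical") weights
    B = rng.normal(0.0, 1.0, (n_units, 2))        # fixed input weights for [cue, reward]
    w = np.zeros(n_units)                         # value readout ("striatal") weights
    c = rng.uniform(0.0, 1.0, n_units)            # fixed random feedback, never updated
    reward_rpe = []

    for _ in range(n_trials):
        x_pp = np.zeros(n_units)  # x(t-2)
        x_p = np.zeros(n_units)   # x(t-1)
        o_p = np.zeros(2)         # o(t-1)
        v_p = 0.0                 # v(t-1)
        # One extra step at the end treats the post-trial value as 0.
        for t in range(trial_len + 1):
            terminal = (t == trial_len)
            o = np.array([float(t == cue_t), float(t == reward_t)])
            r = o[1]
            x = sigmoid(A @ x_p + B @ o_p)
            v = 0.0 if terminal else float(w @ x)
            delta = r + gamma * v - v_p  # TD reward-prediction error
            if t == reward_t:
                reward_rpe.append(delta)
            # Recurrent update: `c` replaces `w` in the gradient of v(t-1),
            # and x_p * (1 - x_p) is the sigmoid derivative at step t-1.
            A += lr * delta * np.outer(c * x_p * (1.0 - x_p), x_pp)
            # Readout update: standard TD(0) on the previous state.
            w += lr * delta * x_p
            if nonneg:
                w = np.maximum(w, 0.0)  # assumed form of the constraint
            x_pp, x_p, o_p, v_p = x_p, x, o, v

    norm = np.linalg.norm(w) * np.linalg.norm(c)
    alignment = float(w @ c / norm) if norm > 0 else 0.0  # cosine of angle(w, c)
    return {"reward_rpe": np.array(reward_rpe), "w": w, "c": c,
            "alignment": alignment}

if __name__ == "__main__":
    for flag in (False, True):
        out = train(nonneg=flag)
        print("nonneg =", flag,
              "| first RPE at reward:", out["reward_rpe"][0],
              "| last RPE at reward:", round(float(out["reward_rpe"][-1]), 3),
              "| cos(w, c):", round(out["alignment"], 3))
```

The returned `alignment` is the cosine between `w` and `c`, which gives a rough measure of how closely the learned readout has aligned with the random feedback. With `nonneg=True` both vectors are non-negative, so this cosine cannot be negative; that is the sense in which the constraint alone induces a loose alignment.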