The probability of choosing an alternative in a long sequence of repeated choices is proportional to the total reward derived from that alternative, a phenomenon known as Herrnstein’s matching law. This behavior is remarkably conserved across species and experimental conditions, but its underlying neural mechanisms still are unknown. Here, we propose a neural explanation of this empirical law of behavior. We hypothesize that there are forms of synaptic plasticity driven by the covariance between reward and neural activity and prove mathematically that matching is a generic outcome of such plasticity. Two hypothetical types of synaptic plasticity, embedded in decision-making neural network models, are shown to yield matching behavior in numerical simulations, in accord with our general theorem. We show how this class of models can be tested experimentally by making reward not only contingent on the choices of the subject but also directly contingent on fluctuations in neural activity. Maximization is shown to be a generic outcome of synaptic plasticity driven by the sum of the covariances between reward and all past neural activities.