On-line Learning of Dichotomies: Algorithms and Learning Curves

The performance of on-line algorithms for learning dichotomies is studied. In on-line learning, the number of examples P is equivalent to the learning time, since each example is presented only once. The learning curve, or generalization error as a function of P, depends on the schedule at which the learning rate is lowered. For a target that is a perceptron rule, the learning curve of the perceptron algorithm can decrease as fast as P^{-1}, if the schedule is optimized. If the target is not realizable by a perceptron, the perceptron algorithm does not generally converge to the solution with lowest generalization error. For the case of unrealizability due to a simple output noise, we propose a new on-line algorithm for a perceptron yielding a learning curve that can approach the optimal generalization error as fast as P^{-1/2}. We then generalize the perceptron algorithm to any class of thresholded smooth functions learning a target from that class. For “well-behaved” input distributions, if this algorithm converges to the optimal solution, its learning curve can decrease as fast as P^{-1}.
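
The realizable setting described in the abstract can be illustrated with a minimal sketch. It assumes isotropic Gaussian inputs, a randomly drawn teacher perceptron, and a hypothetical annealing schedule eta0 / (1 + p / tau). None of these choices, nor the names `online_perceptron`, `eta0`, and `tau`, are taken from the paper, and the schedule is not the optimized one the abstract refers to, so no particular decay rate should be read off the output.

```python
import math
import random


def generalization_error(w, w_star):
    """Disagreement probability of two perceptrons under isotropic Gaussian
    inputs: the angle between the weight vectors divided by pi."""
    dot = sum(a * b for a, b in zip(w, w_star))
    norm = math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in w_star))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.acos(cosine) / math.pi


def online_perceptron(n_inputs, n_examples, eta0=1.0, tau=50.0, seed=0):
    """Train a student perceptron on-line; each example is seen exactly once.

    Returns the learning curve: generalization error after each example.
    The schedule eta0 / (1 + p / tau) is an illustrative assumption.
    """
    rng = random.Random(seed)
    w_star = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]  # teacher (target rule)
    w = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]       # student
    curve = []
    for p in range(1, n_examples + 1):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
        target = 1.0 if sum(a * b for a, b in zip(w_star, x)) >= 0 else -1.0
        output = 1.0 if sum(a * b for a, b in zip(w, x)) >= 0 else -1.0
        if output != target:
            # Perceptron rule: update only on errors, with a decaying rate.
            eta = eta0 / (1.0 + p / tau)
            w = [wi + eta * target * xi for wi, xi in zip(w, x)]
        curve.append(generalization_error(w, w_star))
    return curve


if __name__ == "__main__":
    curve = online_perceptron(n_inputs=5, n_examples=5000)
    for p in (10, 100, 1000, 5000):
        print(p, round(curve[p - 1], 4))
```

Plotting `curve` against P on log-log axes for different values of `tau` gives a rough feel for how the annealing schedule shapes the learning curve.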

Authors: H. Sompolinsky; N. Barkai; H. S. Seung
Year of publication: 1995
Journal: Advances in Neural Information Processing Systems. Cowan J.D., Tesauro G. and Alspector J., editors, 7.
