Learning from examples in layered networks at finite temperature is studied within a statistical-mechanical framework. When the training error is a smooth function of continuously varying weights, the generalization error falls off asymptotically as the inverse of the number of examples. By analytical and numerical studies of single-layer perceptrons, we show that when the weights are discrete, the generalization error can exhibit a discontinuous transition to perfect generalization. For intermediate sizes of the example set, the state of perfect generalization coexists with a metastable spin-glass state.
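Schematically, the two regimes described above can be summarized as follows (a sketch only; the symbols $\epsilon_g$ for the generalization error, $P$ for the number of examples, $c$ for a constant prefactor, and $P_c$ for the transition point are notational assumptions, not fixed by the abstract):
\[
  % continuous, smooth weights: inverse-power decay for large P
  \epsilon_g(P) \sim \frac{c}{P}
  \quad \text{(continuous weights, large $P$)},
  \qquad
  % discrete weights: discontinuous drop to perfect generalization
  \epsilon_g(P) = 0 \quad \text{for } P > P_c
  \quad \text{(discrete weights)}.
\]
For intermediate $P$ in the discrete case, the perfectly generalizing state coexists with the metastable spin-glass state mentioned above.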