One way to regularize a model is to make it sparse, so that the majority of its components are exactly zero. Those zero components contribute nothing to the output, so they can be dropped and your model size is in fact reduced. The L1 norm is used to find a sparse solution because of its special shape: the L1 ball has corners that lie on the coordinate axes, which are points where some components are exactly zero.
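To make the sparsity effect concrete, here is a minimal sketch, assuming a toy one-dimensional problem that is not from the text: for each component `a_i` we minimize `0.5*(w - a_i)**2` plus either an L1 penalty `lam*|w|` or a squared L2 penalty `0.5*lam*w**2`. Both minimizers have closed forms. The function names `prox_l1` and `prox_l2`, the vector `a`, and the value of `lam` are illustrative choices.

```python
import numpy as np

def prox_l1(a, lam):
    # Elementwise minimizer of 0.5*(w - a)**2 + lam*|w| (soft-thresholding):
    # any component with |a| <= lam is set to exactly zero.
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def prox_l2(a, lam):
    # Elementwise minimizer of 0.5*(w - a)**2 + 0.5*lam*w**2:
    # every component is scaled down, but none becomes exactly zero.
    return a / (1.0 + lam)

# Hypothetical unregularized weights: two large, three small.
a = np.array([3.0, -0.4, 0.2, -2.5, 0.05])
lam = 0.5  # illustrative penalty strength

w_l1 = prox_l1(a, lam)
w_l2 = prox_l2(a, lam)

# Count the components that are exactly zero under each penalty.
n_zero_l1 = int(np.count_nonzero(w_l1 == 0))
n_zero_l2 = int(np.count_nonzero(w_l2 == 0))

print("L1:", w_l1, "zeros:", n_zero_l1)
print("L2:", w_l2, "zeros:", n_zero_l2)
```

With these inputs the L1 penalty zeroes out the three small components and keeps the two large ones (shrunk by `lam`), while the L2 penalty only rescales all five. This is a per-component toy, not a full regression solver, but the same thresholding behavior is what produces sparse weight vectors in L1-regularized models.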