Why does L1 regularization result in sparse models?
One form of regularization encourages the trained model to be sparse, meaning the majority of its components are zero. Those zero components carry no information and can be dropped, so the effective model size shrinks. The reason the L1 norm produces sparse solutions is its special shape: its unit ball is a diamond whose corners lie on the coordinate axes, so the constrained optimum tends to land at a corner, where some components are exactly zero. The L2 ball, by contrast, is smooth, so the optimum generally touches it at a point where no component is exactly zero.
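The effect can be sketched in plain Python: a tiny least-squares fit where only the first feature actually matters, comparing an L1 penalty (solved by proximal gradient descent with soft-thresholding, the standard Lasso update) against an L2 penalty (plain gradient descent). The data, step size, and penalty strengths below are illustrative assumptions, not tuned values.

```python
import random

random.seed(0)
n, d = 50, 3
# Random features; the target depends only on feature 0 (plus noise),
# so features 1 and 2 are irrelevant.
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [2.0 * row[0] + 0.5 * random.gauss(0, 1) for row in X]

def grad(w):
    # Gradient of the mean squared error (1/n) * sum((x.w - y)^2).
    g = [0.0] * d
    for row, t in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, row)) - t
        for j in range(d):
            g[j] += 2.0 * err * row[j] / n
    return g

def soft(z, t):
    # Soft-thresholding: the proximal operator of the L1 norm.
    # It pushes small values to exactly zero -- the source of sparsity.
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit(lam_l1=0.0, lam_l2=0.0, steps=2000, lr=0.05):
    w = [0.0] * d
    for _ in range(steps):
        g = grad(w)
        # Gradient step; the L2 penalty only adds a shrinkage term.
        w = [wj - lr * (gj + 2.0 * lam_l2 * wj) for wj, gj in zip(w, g)]
        # Proximal step for the L1 penalty.
        if lam_l1:
            w = [soft(wj, lr * lam_l1) for wj in w]
    return w

w_l1 = fit(lam_l1=0.5)
w_l2 = fit(lam_l2=0.5)
print("L1 weights:", w_l1)
print("L2 weights:", w_l2)
```

Running this, the L1-penalized fit sets the coefficients of the two irrelevant features to exactly zero while keeping the relevant one, whereas the L2-penalized fit merely shrinks them toward zero without ever reaching it.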