ParchGrad Paper Structure
Series of experiments conducted for SIP for GPT
This is a full discussion of the structure of the ParchGrad paper.
Our contributions are as follows.
- We are the first to propose channel pruning for convolutional layers in saliency map generation.
- We theoretically analyze the impact of multiple gradients and propose variance conservation to balance gradients from multiple modules.
- We propose class-wise channel selection methods that largely remove noise in gradients while preserving interpretability.
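
To make the idea of pruning gradient channels concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `prune_gradient_channels`, the `keep_ratio` parameter, and the scoring rule (mean absolute gradient per channel) are all assumptions chosen for illustration, since these notes do not specify the class-wise selection criterion. The sketch only shows the general mechanism of zeroing the gradient of low-scoring channels at a convolutional layer.

```python
import numpy as np

def prune_gradient_channels(grad, scores, keep_ratio=0.5):
    """Zero all but the highest-scoring channels of a (C, H, W) gradient.

    Hypothetical helper: `scores` holds one importance value per channel.
    How the paper computes class-wise scores is not stated in these notes.
    """
    num_channels = grad.shape[0]
    k = max(1, int(round(keep_ratio * num_channels)))
    # Indices of the k channels with the largest scores.
    keep = np.argsort(scores)[-k:]
    mask = np.zeros(num_channels, dtype=grad.dtype)
    mask[keep] = 1.0
    # Broadcast the per-channel mask over the spatial dimensions.
    return grad * mask[:, None, None], mask

# Toy gradient at a conv layer output: 8 channels, 4x4 spatial map.
rng = np.random.default_rng(0)
grad = rng.normal(size=(8, 4, 4))
# Assumed stand-in score: mean absolute gradient per channel.
scores = np.abs(grad).mean(axis=(1, 2))
pruned, mask = prune_gradient_channels(grad, scores, keep_ratio=0.25)
```

In this toy setup, `pruned` keeps the gradient of 2 of the 8 channels unchanged and sets the rest to zero. A real class-wise variant would presumably derive `scores` separately for each target class.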
Original Draft