Do we really care about the number of pipeline stages?
Taking multiple clocks per pipeline stage makes sense only if it reduces gate count by reusing hardware resources within the stage
- For the blend operation, a single multiplier-addition block can be reused over two clock cycles to perform the full blend
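
A minimal software sketch of that reuse, for illustration only. It assumes the standard alpha blend (src*alpha + dst*(1 - alpha)) on 8-bit channels with alpha in 0..255; the notes do not give the blend equation or bit widths, and the names `mac`, `blend_two_cycle` and `blend_reference` are hypothetical.

```python
def mac(a, b, c):
    """Model of the one shared multiplier-addition block: a * b + c."""
    return a * b + c


def blend_two_cycle(src, dst, alpha):
    """Blend one 8-bit channel using the single mac block twice.

    Assumed equation: (src * alpha + dst * (255 - alpha)) / 255.
    """
    # Clock 1: weight the destination colour; the accumulator starts at 0.
    acc = mac(dst, 255 - alpha, 0)
    # Clock 2: same block, now adding the weighted source colour.
    acc = mac(src, alpha, acc)
    # Normalise back to 8 bits with rounding to nearest.
    return (acc + 127) // 255


def blend_reference(src, dst, alpha):
    """Single-step form; in hardware this would need two multipliers."""
    return (src * alpha + dst * (255 - alpha) + 127) // 255
```

Both functions compute the same value; the two-cycle form trades one extra clock for one fewer multiplier, which is the gate-count saving described above.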
Other fragment operations are very simple, so the control cost of sharing hardware resources may not be worth it
If we have multiple clocks per fragment to use, we probably just want to increase the Fragment Rate instead!
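
The trade-off can be put in numbers with a rough sketch. The 100 MHz clock is a made-up figure for illustration, not a value from these notes, and `fragment_rate` is a hypothetical helper.

```python
def fragment_rate(clock_hz, clocks_per_fragment):
    """Fragments per second when each fragment occupies a stage for N clocks."""
    return clock_hz / clocks_per_fragment


CLOCK_HZ = 100_000_000  # assumed 100 MHz clock, illustration only

shared = fragment_rate(CLOCK_HZ, 2)    # hardware reused over two clocks
unshared = fragment_rate(CLOCK_HZ, 1)  # one fragment accepted every clock
```

Under these assumptions, dropping from two clocks per fragment to one doubles the fragment rate, at the cost of the extra gates that sharing would have saved.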