I am currently investigating the impact on reliability, data savings and performance of using SQL Server data compression on top of SAN compression.
I have done some research, but there are not many articles on this, or at least none that answer the questions I have. I did find a Dell whitepaper which recommends using one of the two options rather than both. We do not have a Dell SAN, but the advice may well apply to our storage unit (Huawei) too.
Reliability: I have heard stories of data becoming corrupt when you use both. I cannot find anything that documents this, but it could be plausible. We could, however, accept this risk for our relatively small data warehouse. We have ETL timing issues there, and if the data becomes corrupt, it is easy to reload it from the source.
Data savings: The hardware would probably be better at this. Since SQL Server only compresses IN_ROW_DATA, I would expect SAN compression to give a higher overall compression ratio.
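To get a rough feel for the data savings point, here is a small Python sketch of what happens when one compression layer sits on top of another. It rests on several assumptions: zlib is only a stand-in for both layers (neither SQL Server page compression nor the SAN uses exactly this algorithm), the rows are made up, and `stacked_compression_sizes` is a hypothetical helper name. It only illustrates the general idea that data which is already compressed tends to leave little for a second layer to remove.

```python
import random
import zlib


def stacked_compression_sizes(raw):
    """Return (raw size, size after one pass, size after two passes) in bytes."""
    first = zlib.compress(raw, 6)     # stand-in for database-level compression
    second = zlib.compress(first, 6)  # stand-in for SAN compression on top
    return len(raw), len(first), len(second)


# Made-up sample rows; real warehouse data will compress differently.
rng = random.Random(42)
countries = [b"NL", b"DE", b"BE", b"FR"]
rows = b"".join(
    b"%08d,%s,%05d\n" % (i, rng.choice(countries), rng.randrange(100_000))
    for i in range(50_000)
)

raw_size, once, twice = stacked_compression_sizes(rows)
print(f"raw:              {raw_size} bytes")
print(f"compressed once:  {once} bytes")
print(f"compressed twice: {twice} bytes")
```

If the second number is much smaller than the first while the third is close to the second, that would support using one layer rather than both for savings alone. Measuring on a copy of the real data is the only way to know for a given system.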
Performance: Since the data is compressed, SQL Server needs to issue fewer I/O requests, so we could gain a lot here. I don't think this has anything to do with compression at the storage level, because there are simply fewer I/O requests going to the storage unit. The storage unit always has to return data for every I/O request, regardless of whether it is compressed. If data is compressed and 4 I/O requests are made for the same block, the storage unit still has to return 4 blocks of data. So if we can reduce the number of requests, that is a win. Also, SQL Server keeps pages compressed in memory, so we could gain performance there as well.
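The I/O reasoning above can be put into numbers with a back-of-the-envelope sketch. The 8 KB page size is SQL Server's data page size; the compression ratio of 2.0 and the helper name `pages_needed` are assumptions for illustration only, and page header overhead is ignored.

```python
import math

PAGE_SIZE = 8192  # SQL Server data pages are 8 KB


def pages_needed(data_bytes, compression_ratio=1.0):
    """Estimate how many pages a scan must read for the given amount of data.

    compression_ratio is uncompressed size / compressed size (assumed value).
    """
    return math.ceil(data_bytes / compression_ratio / PAGE_SIZE)


table_bytes = 1000 * PAGE_SIZE  # hypothetical table: 1000 uncompressed pages

print(pages_needed(table_bytes))       # 1000 pages without compression
print(pages_needed(table_bytes, 2.0))  # 500 pages at an assumed 2:1 ratio
```

Under these assumptions the number of pages requested from the storage unit halves, whatever the SAN does with the blocks internally.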
What would you recommend regarding the three points above? I cannot really find any good articles on this. Maybe some of you know a few?