Hi,
I've been trying to use SharpCompress to extract 7zip files. It's working well, except that with a solid 7zip archive the extraction is extremely slow (10-20 minutes to extract a 7MB solid archive).
At first I thought I needed to use the forward-only reader for solid archives, but there isn't one for 7-zip archives.
I had a look through the code, and the problem seems to be that for a solid archive it decompresses the entire stream from the beginning for every file. So the first file extracts quickly, and the second fairly quickly (it decompresses the first file again in order to skip it), but the 100th file in the archive is not so fast. Since the archive has ~150 files in it, I guess the first file ends up getting decompressed 150 times, the second file 149 times, and so on? :)
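To put a rough number on that guess, here is a small Python sketch. It is not SharpCompress code, only arithmetic; `passes_without_cache` is a name I made up, and it assumes every extraction restarts at the beginning of the solid stream.

```python
def passes_without_cache(file_count):
    """Number of times each file's data gets decompressed when every
    extraction restarts from the beginning of the solid stream."""
    # File i (0-based) is decompressed once for its own extraction and
    # once more for every later file that has to skip over it.
    return [file_count - i for i in range(file_count)]


counts = passes_without_cache(150)
total = sum(counts)
print(counts[0], counts[1], total)  # 150 149 11325
```

If the stream were only read forwards once, that would be 150 file decompressions instead of 11325, so (assuming roughly equal file sizes) the restart behaviour does about 75 times more work.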
I made a quick hack that resolves this: when a stream is requested from the 7zip folder, the result is cached, and if the next extraction needs a file that is further forward in the same stream, it reuses that stream rather than starting from the beginning again. It works well with my archive at least (8 seconds to extract rather than >10 minutes), but it would be good to have a fix integrated into the project so I can stay up to date with the NuGet version.
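To illustrate the idea (this is not my actual patch and not SharpCompress code), here is a toy model in Python. It uses zlib as a stand-in for the 7zip decoder, and `SolidFolder`, `extract` and `starts` are hypothetical names. The decompressor is kept between calls and only restarted when the requested file lies behind the current position.

```python
import zlib


class SolidFolder:
    """Toy model of a solid folder: several files compressed as one stream."""

    def __init__(self, compressed, sizes):
        self.compressed = compressed
        self.sizes = sizes      # uncompressed size of each file, in order
        self.starts = 0         # how often decompression began from offset 0
        self._reset()

    def _reset(self):
        self._decomp = zlib.decompressobj()
        self._input_pos = 0     # offset into the compressed data
        self._buffer = b""      # decompressed bytes not yet consumed
        self._output_pos = 0    # decompressed offset of the next unread byte
        self.starts += 1

    def _read(self, n):
        """Return the next n decompressed bytes, decompressing only as needed."""
        while len(self._buffer) < n and self._input_pos < len(self.compressed):
            chunk = self.compressed[self._input_pos:self._input_pos + 4096]
            self._input_pos += len(chunk)
            self._buffer += self._decomp.decompress(chunk)
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        self._output_pos += len(data)
        return data

    def extract(self, index):
        start = sum(self.sizes[:index])
        if start < self._output_pos:
            # The file is behind the cached position: start over.
            self._reset()
        # Skip forward from the cached position to the start of the file.
        self._read(start - self._output_pos)
        return self._read(self.sizes[index])


files = [b"a" * 1000, b"b" * 2000, b"c" * 3000]
folder = SolidFolder(zlib.compress(b"".join(files)), [len(f) for f in files])
extracted = [folder.extract(i) for i in range(len(files))]
```

Extracting the files in archive order only ever moves forward, so `folder.starts` stays at 1. Asking for an earlier file afterwards forces one restart, which is the same cost as the current behaviour for that single file.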
I can send you my code if you want (it's <10 lines anyway!), but I don't know whether you would prefer to fix it a different way.
Thanks!