Data Compression


Message boards : Server backend and mirrors : Data Compression

JZD
Send message
Joined: 30 Dec 11
Posts: 89
Credit: 3,109,729
RAC: 5,187
Message 15237 - Posted: 12 Sep 2017, 12:18:21 UTC

At http://burp.renderfarming.net/status.php a new Data Compression item has appeared, showing some finished sessions. Is this a new feature or a bug?

Profile Janus
Volunteer moderator
Project administrator
Avatar
Send message
Joined: 16 Jun 04
Posts: 4479
Credit: 2,094,806
RAC: 0
Message 15238 - Posted: 12 Sep 2017, 16:23:32 UTC
Last modified: 12 Sep 2017, 16:41:04 UTC

Feature, yay! A BARF milestone subproject.

Part of the running cost of BURP is storage. If we are to open up BURP and scale it up by a factor of 10x or 100x in the future, then we need to be much more conscious about server-side storage. Currently, for a number of reasons, the strategy has been to store literally everything (including multiple copies of each frame rendered by each render node, both the individual parts and the stitched frames, etc.). That way, when something goes wrong, it costs very little to recreate the data from a deeper storage layer.
In recent years fewer and fewer things have gone wrong in ways that needed the extra storage, and once the custom Cycles validator project is done, a very large portion of that kind of "almost but not quite duplicate" storage can be freed up for other purposes.
In the future the hope is to make the service a lot slimmer, so that it scales better without unnecessary waste of storage.

The session data that will be put in long-term storage is going to consist primarily of the following (sketched as a simple record after the list):


  • One canonical copy of each rendered, final frame
  • An encoded preview video file with all the frames
  • The original input file(s) used for the session
  • Some logs, checksums and performance data
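
As a rough illustration only (the field names are hypothetical, not BURP's actual schema), an archived session could be represented by a small Java record like this:

    // Illustrative sketch of the per-session archive contents listed above.
    // Field names and types are hypothetical, not the project's real schema.
    import java.util.List;

    public record ArchivedSession(
            long sessionId,
            List<String> framePaths,      // one canonical copy of each rendered, final frame
            String previewVideoPath,      // encoded preview video containing all the frames
            List<String> inputFilePaths,  // the original input file(s) used for the session
            List<String> logPaths,        // logs, checksums and performance data
            String checksumManifest
    ) {}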


In that list the first item is typically the one using the most space. By using lossless compression it is possible to store the exact same data using less space. How much less varies a lot from session to session, but it is typically between 5% and 50%.
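
To give an idea of how that percentage would be measured, here is a small stand-alone Java sketch. It uses java.util.zip's Deflater purely as a stand-in; the actual codec used by the service may be something else entirely:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.zip.Deflater;

    // Losslessly compress a frame file in memory and report the space saved.
    // Deflater is only a stand-in here, not necessarily what the service uses.
    public class CompressionRatio {
        public static void main(String[] args) throws IOException {
            byte[] original = Files.readAllBytes(Path.of(args[0]));

            Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
            deflater.setInput(original);
            deflater.finish();

            byte[] chunk = new byte[8192];
            long compressedSize = 0;
            while (!deflater.finished()) {
                // We only need the size, so the compressed output itself is discarded.
                compressedSize += deflater.deflate(chunk);
            }
            deflater.end();

            double saved = 100.0 * (original.length - compressedSize) / original.length;
            System.out.printf("original %d bytes, compressed %d bytes, saved %.1f%%%n",
                    original.length, compressedSize, saved);
        }
    }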

What you are seeing is the new CruncherService going through the backlog of all the old rendered sessions and performing two tasks: validating that the data is still correct and compressing rendered frames using lossless compression.
At the moment it has been granted very few CPU resources and runs as a background service, so it is progressing fairly slowly - but it will pick up speed once the new server is in place. The process is fairly I/O-heavy, so we're keeping it on the server cluster instead of distributing it to the farm via BOINC.
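
The general shape of that background pass is roughly the sketch below. Session, validate() and compressFrames() are placeholders, not the real CruncherService API:

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    // Rough sketch of a low-priority backlog worker; all names are placeholders.
    public class BacklogWorker implements Runnable {
        record Session(long id) {}

        private final Queue<Session> backlog = new ConcurrentLinkedQueue<>();

        @Override
        public void run() {
            Session session;
            while ((session = backlog.poll()) != null) {
                if (validate(session)) {      // task 1: check the stored data is still correct
                    compressFrames(session);  // task 2: recompress the frames losslessly
                }
            }
        }

        private boolean validate(Session s) { return true; }   // placeholder: checksum the stored frames
        private void compressFrames(Session s) { }             // placeholder: lossless recompression

        public static void main(String[] args) {
            Thread worker = new Thread(new BacklogWorker(), "backlog-worker");
            worker.setPriority(Thread.MIN_PRIORITY); // keep the CPU impact low, as described above
            worker.start();
        }
    }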

Profile noderaser
Project donor
Avatar
Send message
Joined: 28 Mar 06
Posts: 507
Credit: 1,549,902
RAC: 124
Message 15247 - Posted: 16 Sep 2017, 4:02:30 UTC

Storage problems? I'll bet some kind of cloud storage, say a network of mirrors, could be part of a solution there... :P

Profile Janus
Volunteer moderator
Project administrator
Avatar
Send message
Joined: 16 Jun 04
Posts: 4479
Credit: 2,094,806
RAC: 0
Message 15259 - Posted: 25 Sep 2017, 20:27:53 UTC

The initial tests from the past two weeks showed a number of issues with the compression system and how jobs were being scheduled across the server cluster. Most of the performance issues have been ironed out now.

One interesting issue was that the tests themselves were bugged: when a session is compressed, the system carries out a test to verify that it can decompress the compressed session and get the original data back. The original and decompressed data are compared using strict validation (a variant of our normal validator) before the original data is disconnected (discarded from active storage). However, it turned out that when a frame contained no colourful pixels it could be compressed as greyscale instead of RGB (saving even more space), but when the test tried to re-open the image it would incorrectly interpret the greyscale image as being different and mark the frame as failed.
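
The "no colourful pixels" test itself is simple in principle: if every pixel has identical R, G and B samples, the frame can be stored as a single greyscale channel. Roughly like this (illustrative Java only, not the actual validator code):

    import java.awt.image.BufferedImage;
    import java.awt.image.Raster;

    // Illustration of the greyscale-eligibility check, not the project's real code.
    public class GreyscaleCheck {
        public static boolean canStoreAsGreyscale(BufferedImage image) {
            Raster raster = image.getRaster();
            if (raster.getNumBands() < 3) {
                return true; // already a single-channel image
            }
            for (int y = 0; y < image.getHeight(); y++) {
                for (int x = 0; x < image.getWidth(); x++) {
                    int r = raster.getSample(x, y, 0);
                    int g = raster.getSample(x, y, 1);
                    int b = raster.getSample(x, y, 2);
                    if (r != g || g != b) {
                        return false; // found a colourful pixel, so keep RGB
                    }
                }
            }
            return true;
        }
    }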
It turned out to be related to how Java's built-in image reader maps gamma for different kinds of images. Bypassing the gamma correction step and reading the values directly from the raw image not only fixed the issue but also had the interesting side effect of making our normal validator (the one targeted at Blender's internal renderer) a bit faster too.
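
In Java terms the difference is likely along these lines: BufferedImage.getRGB() pushes every pixel through the image's colour model into default sRGB, which is where greyscale values can get shifted, whereas reading samples straight off the Raster returns the stored values untouched. Illustrative sketch only, assuming frames with no colourful pixels so that comparing a single band is representative:

    import java.awt.image.BufferedImage;
    import java.awt.image.Raster;

    // Sketch of comparing raw raster samples instead of colour-converted pixels.
    // Assumes frames with no colourful pixels, so one band is representative.
    public class RawComparison {
        public static boolean samePixels(BufferedImage a, BufferedImage b) {
            if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
                return false;
            }
            Raster ra = a.getRaster();
            Raster rb = b.getRaster();
            for (int y = 0; y < a.getHeight(); y++) {
                for (int x = 0; x < a.getWidth(); x++) {
                    // getSample() returns the stored value; no gamma or colour-model conversion.
                    if (ra.getSample(x, y, 0) != rb.getSample(x, y, 0)) {
                        return false;
                    }
                }
            }
            return true;
        }
    }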

This data compression background task has had a few additional resources assigned to it and will continue at a slightly faster pace over the next few weeks. It will be interesting to see if it hits another snag or if it will just keep going.

Profile Janus
Volunteer moderator
Project administrator
Avatar
Send message
Joined: 16 Jun 04
Posts: 4479
Credit: 2,094,806
RAC: 0
Message 15266 - Posted: 3 Nov 2017, 16:32:26 UTC

Got the new compute node hardware up and running, and the first task assigned to it has been to help with the data compression. We're down to around 1000 sessions left in the compression queue before the backlog is cleared.

There is a small number of sessions that have gotten stuck somehow; still trying to figure out exactly why.


