Blender (GPU)


scole of TSBT
Joined: 20 Feb 15
Posts: 2
Credit: 540,666
RAC: 0
Message 14055 - Posted: 18 Aug 2015, 18:56:03 UTC

Will any Blender (GPU) WUs be released in the future?

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 14056 - Posted: 19 Aug 2015, 9:14:02 UTC - in response to Message 14055.
Last modified: 19 Aug 2015, 9:20:07 UTC

Still crunching the numbers on the last test but so far it looks really bad.

Let me be a little bit more precise: the actual rendering is absolutely perfect; in fact it seems to be the most consistent rendering mode so far, with very few consistency errors (if any?) on both Windows and Linux. The work on credit calculations, however, is really going nowhere.
It was expected that CreditNew would go awry as usual, but even PayDay fails to get consistent results and arrives at a surprising 100% inaccuracy rate (which is a measure of how uncertain it is about the credit estimates it gives). Initially, the investigation into why this could be focused on finding errors in the two credit algorithms; errors that could have been triggered by the special case of using GPUs. Later the investigation turned towards the BOINC client's resource estimation and reporting system, and it turns out that there is simply nothing in BOINC that tracks GPU usage. That part of BOINC simply does not exist.
Since the data being fed to the credit algorithms is essentially garbage, we get garbage out of them too: garbage in, garbage out.
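To make the "garbage in" point concrete, here is a rough sketch (Python, not BOINC's or BURP's actual code; the function name and numbers are made up for illustration) of the classic benchmark-times-time style of credit claim. A Cobblestone is 1/200 of a day of work on a reference machine doing 1 GFLOPS, so the claim depends entirely on a measured run time and a measured speed for the device that did the work; for GPUs the client currently measures neither.

# Rough sketch, not actual BOINC/BURP code: the classic credit claim is just
# "measured time on the device" times "measured speed of the device",
# converted to Cobblestones (1 Cobblestone = 1/200 of a day at 1 GFLOPS).
SECONDS_PER_DAY = 86400.0
COBBLESTONES_PER_GFLOPS_DAY = 200.0

def claimed_credit(measured_seconds, device_gflops):
    # GFLOPS-days of work performed, according to what the client measured
    gflops_days = measured_seconds * device_gflops / SECONDS_PER_DAY
    return gflops_days * COBBLESTONES_PER_GFLOPS_DAY

# CPU case: both inputs are measured by the BOINC client (CPU time and the
# floating point benchmark), so the claim is at least meaningful.
print(claimed_credit(measured_seconds=8956.0, device_gflops=3.2))

# GPU case: the client measures neither a per-task GPU time nor the fraction
# of the GPU's peak speed the app actually got, so both inputs are guesses
# and the resulting claim is garbage.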

Nicolás (2009): [snip] the client doesn't measure "GPU time" at all. Is it even possible to get that information from nvidia APIs? [snip]
David (2009): I couldn't find such an API.
Janus (2015): [snip] is this still the case?
David (2015): It's still the case.
The notion of "GPU time" is not clear, since apps in general use a fraction of the GPU's cores.

This is not just a problem at BURP but a problem for any BOINC-based project that uses CUDA.
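As a side note on why "GPU time" is so fuzzy: the number that is actually easy to get from the driver is a device-wide utilization percentage, not a per-process time. A minimal sketch, assuming the Python NVML bindings (the pynvml module from the nvidia-ml-py package) are installed:

import time
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU in the host
    for _ in range(5):
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        # util.gpu is the percentage of the last sample period during which
        # *any* kernel (from any process) was running on the device;
        # util.memory is the same for memory traffic. Neither says how much
        # of the GPU one particular BOINC task actually used.
        print("GPU busy:", util.gpu, "%  memory busy:", util.memory, "%")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()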

STOP! Q&A time! Here's a random Q&A session in the middle of the post!
Q: Does this mean that we will not get proper credit for the test sessions?
A: No. Although the BOINC resource estimation is unusable, we have a separate set of manual CPU-based estimates for these sessions. During the next PayDay event, credit will be granted based on those sources. Upwards of 100 credit was also granted per workunit in advance.

Q: Can't we just report this as a bug in BOINC and have them fix it?
A: We could, but it doesn't change anything for us right now.

And back to the rest of the post:
On a medium-to-long-term timescale we may have to persuade BOINC to use resource tracking on the platforms that support it. How many platforms that covers is yet unknown. If it turns out to be only Nvidia Tesla, then this is a bad solution (special-casing a single product line is always bad).
A really long-term solution is to work with the compute API providers (Nvidia, AMD, Intel) to include this kind of functionality on a broad range of hardware/software platforms. This is really hard work; there is a lot of inertia in this field, especially because resource tracking is seen as a grid/cloud computing feature and everything in that field has to cost extra.
On a short-term basis I see only these solutions, none of which are ideal:
1) Grant no credit for GPU work.
2) Grant a set amount of credit for a GPU workunit regardless of how long it is. For some sessions it will be really easy to get credit, for others you have to work longer to get the same amount.
3) Use CreditNew or PayDay regardless of the issues. Doing GPU work will be like playing the lottery: sometimes you get nothing for hours of work, sometimes you get 200,000 credit for short workunits.
4) Stop using GPUs. This "fixes" the problem, but at the same time we say "no" to something that looks like a 30-40% increase in total farm rendering capacity.

Regardless of how we decide to proceed, those who want to help can opt in, and those who do not can opt out of GPU work.

What do you guys think about this? Are there any good arguments to be had for one solution over the others?

Profile noderaser
Project donor
Joined: 28 Mar 06
Posts: 507
Credit: 1,551,596
RAC: 223
Message 14057 - Posted: 19 Aug 2015, 11:02:13 UTC

Personally, I think that the additional resources that GPUs add to the farm are too important to consider excluding. I, for one, am not as concerned about credit and am willing to temporarily accept a degraded GPU credit model to keep the sessions crunching smoothly. Are there any plans to have sessions run on both CPU and GPU? Since it's so difficult to calculate running time on the GPU, you could use CPU wingmen to help calculate appropriate credit. Given the apparent high reliability of the GPU rendering, could it also be used to resolve verification issues between platforms?
____________

scole of TSBT
Joined: 20 Feb 15
Posts: 2
Credit: 540,666
RAC: 0
Message 14058 - Posted: 19 Aug 2015, 14:08:42 UTC
Last modified: 19 Aug 2015, 15:01:57 UTC

Went back and saw this statement...
" there is simply nothing in BOINC that tracks GPU usage. That part of BOINC simply does not exist"

I'm not a CUDA or OpenCL programmer, but is there nothing in those APIs that can report utilization, which could be put in a task log and used? There ought to be some way for CUDA and OpenCL to give feedback about how many cores were used, or other statistics.

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 14059 - Posted: 19 Aug 2015, 16:49:46 UTC - in response to Message 14058.
Last modified: 19 Aug 2015, 16:52:34 UTC

is there nothing in those APIs that can report utilization, which could be put in a task log and used? There ought to be some way for CUDA and OpenCL to give feedback about how many cores were used, or other statistics.

I know that "process accounting" is available through 2 different APIs and it is used to bill customers on GPU grid/cloud services. Still researching if this is available on non-grid products.
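(Sketch only; I have not confirmed that this is one of the two APIs in question. NVML has an "accounting mode" that records per-process statistics of roughly this shape; enabling it normally needs administrative rights, and the example assumes the Python pynvml bindings.)

import os
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    # Needs root/admin; records stats for every compute process from now on.
    pynvml.nvmlDeviceSetAccountingMode(handle, pynvml.NVML_FEATURE_ENABLED)

    # ... run the GPU workload in this process here ...

    stats = pynvml.nvmlDeviceGetAccountingStats(handle, os.getpid())
    print("average GPU utilization:", stats.gpuUtilization, "%")
    print("average memory utilization:", stats.memoryUtilization, "%")
    print("peak GPU memory used:", stats.maxMemoryUsage, "bytes")
    print("time the compute context was active:", stats.time, "ms")
finally:
    pynvml.nvmlShutdown()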

am willing to temporarily accept a degraded GPU credit model

Temporary in this case would probably be measured in years.

Are there any plans to have sessions run on both CPU and GPU? Since it's so difficult to calculate running time on the GPU, you could use CPU wingmen to help calculate appropriate credit. Given the apparent high reliability of the GPU rendering, could it also be used to resolve verification issues between platforms?

Not enough data yet to say anything about that. It may very well be that the GPU renders differently from the CPU but is consistent between Linux and Windows. If that is the case and we decide to keep going with GPU, then the most likely scenario is that we split the queue into: Linux, Windows, GPU (Linux+Windows).

Profile DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 392
Credit: 2,168,338
RAC: 22
Message 14060 - Posted: 19 Aug 2015, 19:22:59 UTC - in response to Message 14057.

Janus wrote:
...
It was expected that CreditNew would go awry as usual, but even PayDay fails to get consistent results and arrives at a surprising 100% inaccuracy rate (which is a measure of how uncertain it is about the credit estimates it gives).
...

That surprises me a bit, but given the explanation it's a logical conclusion.


Nicolás (2009): [snip] the client doesn't measure "GPU time" at all. Is it even possible to get that information from nvidia APIs? [snip]
David (2009): I couldn't find such an API.
Janus (2015): [snip] is this still the case?
David (2015): It's still the case.
The notion of "GPU time" is not clear, since apps in general use a fraction of the GPU's cores.

This is not just a problem at BURP but a problem for any BOINC-based project that uses CUDA.

Ah well, the almighty Nicolás... that was probably during the good ol' Renderfarm@Home times... sorry, getting off-track here...
I'm not sure how many, but some of the CUDA-based projects use an algorithm for the credit calculation instead of GPU time.
I'm still wondering, after all these years, why there is no credit system based on the work steps Blender performs inside a WU. That should normally be a solid method; PrimeGrid's credit algorithms, for example, are based on several such factors.
I know I asked that a while ago (maybe years ago) and you answered it, Janus; unfortunately I don't remember the answer anymore, but I believe it wasn't a positive one. ;-)

noderaser wrote:
Personally, I think that the additional resources that GPUs add to the farm are too important to consider excluding. I, for one, am not as concerned about credit and am willing to temporarily accept a degraded GPU credit model to keep the sessions crunching smoothly.

I agree with that; excluding it wouldn't be a wise decision. Sessions like Ben's 2626 would probably still be running if there wasn't a GPU client.

you could use CPU wingmen to help calculate appropriate credit.

Are you sure? I'm already thinking about two extremes:
1. A very fast GPU has a wingman with a very slow CPU.
2. A very slow GPU has a wingman with a very fast CPU.
I'm not sure how this would be possible; working out an algorithm that can produce a consistent result which fits both (and the many, many cases in between) will probably be very hard.
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats

Profile noderaser
Project donor
Joined: 28 Mar 06
Posts: 507
Credit: 1,551,596
RAC: 223
Message 14064 - Posted: 23 Aug 2015, 14:28:55 UTC - in response to Message 14060.
Last modified: 23 Aug 2015, 14:29:17 UTC

DoctorNow wrote:
Ah well, the almighty Nicolás... that was probably during the good ol' Renderfarm@Home times... sorry, getting off-track here...

Sad that it was short-lived, since there were many more artists and crunchers there.

Are you sure? I'm already thinking about two extremes:
1. A very fast GPU has a wingman with a very slow CPU.
2. A very slow GPU has a wingman with a very fast CPU.
I'm not sure how this would be possible; working out an algorithm that can produce a consistent result which fits both (and the many, many cases in between) will probably be very hard.

I was thinking that maybe we could apply some lessons from other projects that make use of GPU and CPU users alongside each other. However, the three projects that I was thinking of (Asteroids, MilkyWay and Moo!) all use flat-credit models for their tasks. So, that probably doesn't provide any useful data for determining credit on a project like BURP. Are there any other projects that are variable-credit, that could yield some useful information?

If I'm not mistaken, the existing BURP credit model already addresses the issue of hosts with different speeds; everyone gets the same amount of credit for a valid task because the product is the same, regardless of the run time. Here's just one example from the top of my task list: three returned results where the "fast" computer completed the unit in 1,670.68 s run / 8,956.53 s CPU and the "slow" computer took 17,723.28 s run / 25,411.95 s CPU. All three hosts got the same credit award of 239.83.

Here are a few examples from Asteroids; their fixed-credit system might not be directly helpful to this situation, but it shows the extreme cases you refer to: one fast GPU paired with a slow CPU (my Phenom 8400), and one slow GPU (my GeForce 870M) paired with a fast CPU. In the last example, my laptop's GPU gets limited by TThrottle, often to a fairly low percentage, to manage heat and noise. I would expect that, if it were allowed to run unrestricted, the GPU would probably outperform the CPU in that case, but I can't really say for certain because I can't re-run that particular task. In fact, due to the variable throttling, that particular host quite often, though not always, gets outpaced by others running CPU applications.
____________

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 14369 - Posted: 3 Apr 2016, 12:42:59 UTC

The next release of Blender (2.77) is about to roll out to the farm and GPU support for Cycles seems to be more mature with every release. Other potential renderers out there (mentioning LuxRender for no apparent reason) also have GPU versions that seem quite mature.

We cannot reliably calculate credit for variable size GPU workunits.

How do you guys feel about solution (2) from above? I.e. that every unit is worth something like 100 cobblestones regardless of runtime?
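Just to put numbers on what "regardless of runtime" means in practice (the runtimes below are made up; the 100 is the figure suggested above):

FLAT_CREDIT = 100.0  # cobblestones per workunit, no matter how long it ran

for hours in (0.25, 1.0, 3.0):
    # Short workunits pay out at a much higher hourly rate than long ones.
    print(hours, "h workunit ->", round(FLAT_CREDIT / hours, 1), "credits/hour")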

Profile noderaser
Project donor
Joined: 28 Mar 06
Posts: 507
Credit: 1,551,596
RAC: 223
Message 14372 - Posted: 4 Apr 2016, 4:49:42 UTC

For me, the credit is not important, but other people might complain. I'm just ready for more BURP on GPUs!
____________

Profile DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 392
Credit: 2,168,338
RAC: 22
Message 14375 - Posted: 4 Apr 2016, 15:00:34 UTC - in response to Message 14369.

We cannot reliably calculate credit for variable size GPU workunits.

How do you guys feel about solution (2) from above? I.e. that every unit is worth something like 100 cobblestones regardless of runtime?

This could end in an endless debate, so I'm not starting a discussion, though I feel 100 is a bit low compared to what I saw on the WUs and to other GPU projects.
But maybe the new version is also a bit better with rendering and runtime, so I will wait and see what comes of it.
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats

Daniel
Joined: 19 Feb 13
Posts: 2
Credit: 143,096
RAC: 0
Message 14406 - Posted: 16 Apr 2016, 11:56:09 UTC - in response to Message 14369.

I don't really mind how you calculate the credit.

I agree with noderaser, I'd just like to start putting my GPUs to work on this project.

Profile Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4483
Credit: 2,094,806
RAC: 0
Message 14606 - Posted: 10 Jul 2016, 21:04:37 UTC

Ok, we'll try it out for a while then - see how it goes.

Profile DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 392
Credit: 2,168,338
RAC: 22
Message 14607 - Posted: 10 Jul 2016, 22:21:37 UTC

Has something changed in the requirements for the CUDA app?
My GTX 560 doesn't seem to get any GPU tasks, while my GTX 760 is crunching steadily...
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats

Profile Bryan
Joined: 25 Apr 13
Posts: 1
Credit: 5,540,968
RAC: 0
Message 14608 - Posted: 11 Jul 2016, 1:48:26 UTC

My GTX 1080 errors out about 10 seconds into the WU. Is there something special that is needed?

Wilhelm
Joined: 26 Feb 16
Posts: 7
Credit: 238,670
RAC: 0
Message 14609 - Posted: 11 Jul 2016, 6:06:24 UTC - in response to Message 14607.

It seems that your GTX 560 only has 1 GB of RAM, according to the Nvidia specs,
and this workunit has a requirement of 1.4 GB of RAM. The GTX 760 is fine because it has 2 GB of RAM.
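If you want to check what your own card reports, here is a minimal sketch assuming the Python pynvml bindings are installed (the 1.4 GB figure is just the requirement quoted above):

import pynvml

REQUIRED_BYTES = 1.4 * 1024**3  # approximate workunit requirement from above

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print("total GPU memory:", mem.total, "bytes")
    print("free right now  :", mem.free, "bytes")
    print("enough for this workunit:", mem.free >= REQUIRED_BYTES)
finally:
    pynvml.nvmlShutdown()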

Profile DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 392
Credit: 2,168,338
RAC: 22
Message 14610 - Posted: 11 Jul 2016, 8:13:58 UTC - in response to Message 14609.

and this workunit has a requirement of 1.4 GB of RAM. The GTX 760 is fine because it has 2 GB of RAM.

Thanks! Oh shoot, I should've known that; I guess I'm too used to reading RAM requirements as referring to normal system memory.
Quite a lot of usage for a GPU anyway; it reminds me of the AP27 tests for PrimeGrid.
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats

zioriga
Joined: 1 Mar 05
Posts: 1
Credit: 298,148
RAC: 0
Message 14612 - Posted: 11 Jul 2016, 12:50:08 UTC

My GTX 970 errors out after 15 seconds.
I will not try with my GTX 1080.

I'll wait until the problem is resolved.
____________

Wilhelm
Joined: 26 Feb 16
Posts: 7
Credit: 238,670
RAC: 0
Message 14613 - Posted: 11 Jul 2016, 16:22:23 UTC - in response to Message 14612.

Have you checked this thread: https://burp.renderfarming.net/forum_thread.php?id=713 ?
It looks like you are running your GPU on Linux.
On a side note, the 10xx series is not supported as of yet; look here for more info: http://burp.renderfarming.net/forum_thread.php?id=2443

Daniel
Joined: 19 Feb 13
Posts: 2
Credit: 143,096
RAC: 0
Message 14614 - Posted: 12 Jul 2016, 5:04:54 UTC - in response to Message 14606.

Ok, we'll try it out for a while then - see how it goes.


Thanks Janus, I downloaded some GPU workunits today and they're crunching nicely.

Profile DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 392
Credit: 2,168,338
RAC: 22
Message 14615 - Posted: 12 Jul 2016, 8:22:10 UTC - in response to Message 14609.

It seems that your GTX 560 only has 1 GB of RAM, according to the Nvidia specs,
and this workunit has a requirement of 1.4 GB of RAM.

It turns out that my GTX 560 has actually been running these tasks for a while now, so the 1.4 GB requirement doesn't seem to apply to all of the WUs; in practice about 800 MB is used on the GPU.

Regarding credits: as I said earlier, 100 is sometimes a bit low, and since all of my WUs from this session have so far taken at least an hour or more on the GPU, it's indeed less than what the same amount of time earns on CPU.
I would suggest at least 500 per WU here, but of course it's not my decision. ;-)
____________
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats
