max number of error/total tasks

Message boards : Number crunching : max number of error/total tasks

rilian
Joined: 8 Aug 09
Posts: 37
Credit: 25,517
RAC: 0
Message 10384 - Posted: 1 Apr 2010, 19:25:59 UTC

why are these (max number of error/total tasks) so high? that's a waste of CPU time

http://burp.renderfarming.net/workunit.php?wuid=1593553
I crunch for Ukraine! The best team in DC universe!

Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4568
Credit: 2,100,409
RAC: 68
Message 10386 - Posted: 1 Apr 2010, 20:04:05 UTC - in response to Message 10384.  
Last modified: 1 Apr 2010, 20:11:20 UTC

Usually workunits do not error out due to errors in the input but rather due to errors in a particular host or a particular combination of workunit and host. A while ago we had some hosts run rampant and eat up workunits and immediately return them as errors. This caused almost every workunit to have one or more errors that weren't related to the workunit at all. So the limit was increased.

Furthermore, when an error does occur it can be hard to see what caused it. Having more sources of debug information is sometimes crucial to finding the bugs.
Also, sometimes a bug only shows up for some hosts - for instance, in the workunit you linked, a single host actually managed to complete the workunit without failing. In these cases it is very nice to have a broad statistical basis for analysis of what went wrong.

If a session produces only errors then it will be cancelled. Typically only very few frames are allowed to run to the limit like this.

A typical workunit will only have 3 hosts working on it.
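
As a rough illustration of how such limits interact, here is a minimal Python sketch. The field names mirror BOINC's per-workunit parameters (min_quorum, max_error_results, max_total_results), but the numbers and the `workunit_state` helper are made up for this example; they are not BURP's actual settings or server code.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # All three values are hypothetical, chosen only for the example.
    min_quorum: int = 3          # agreeing successful results needed
    max_error_results: int = 8   # errors tolerated before giving up
    max_total_results: int = 12  # results tolerated overall before giving up


def workunit_state(successes: int, errors: int, limits: Limits) -> str:
    """Classify a workunit from its result counts.

    Simplified: assumes every successful result agrees with the others.
    """
    if successes >= limits.min_quorum:
        return "validated"
    if errors > limits.max_error_results:
        return "failed: too many errors"
    if successes + errors > limits.max_total_results:
        return "failed: too many total results"
    return "needs more results"


# Five hosts that error out immediately do not kill the workunit
# when the error limit is generous...
print(workunit_state(0, 5, Limits()))
# ...but they do when the limit is tight.
print(workunit_state(0, 5, Limits(max_error_results=2)))
```

With a tight error limit, a handful of misbehaving hosts is enough to fail a workunit that is actually fine, which is the situation the raised limit is meant to avoid.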

rilian
Joined: 8 Aug 09
Posts: 37
Credit: 25,517
RAC: 0
Message 10390 - Posted: 1 Apr 2010, 23:39:24 UTC - in response to Message 10386.  

ok, since project is in Beta - sounds fair
I crunch for Ukraine! The best team in DC universe!

rilian
Joined: 8 Aug 09
Posts: 37
Credit: 25,517
RAC: 0
Message 11332 - Posted: 12 May 2012, 16:26:14 UTC - in response to Message 10390.  

hi

why do we need a quorum of 3 in this project? isn't it an excessive waste of resources?

thanks

http://burp.renderfarming.net/workunit.php?wuid=1695428
I crunch for Ukraine! The best team in DC universe!

Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4568
Credit: 2,100,409
RAC: 68
Message 11333 - Posted: 12 May 2012, 20:21:09 UTC - in response to Message 11332.  

Currently we're running tests that include the number of instances per workunit. You will see everything from 2 to 5 depending on where you look. For instance the Linux clients have a few workunits with 5 right here:
http://burp.renderfarming.net/workunit.php?wuid=1695410

rilian
Joined: 8 Aug 09
Posts: 37
Credit: 25,517
RAC: 0
Message 11334 - Posted: 13 May 2012, 6:22:42 UTC - in response to Message 11333.  

well, i meant the quorum (3), not initial replication (5)

i thought if 2 machines return the same rendered image, it is enough to validate the WU ?
I crunch for Ukraine! The best team in DC universe!

rilian
Joined: 8 Aug 09
Posts: 37
Credit: 25,517
RAC: 0
Message 11342 - Posted: 18 May 2012, 13:26:55 UTC - in response to Message 11334.  

buMp ?
I crunch for Ukraine! The best team in DC universe!

noderaser
Project donor
Joined: 28 Mar 06
Posts: 516
Credit: 1,567,702
RAC: 0
Message 11346 - Posted: 22 May 2012, 3:25:07 UTC

"Quorum" means the number of hosts with a successfully validated WU that is required before it is considered successful. So, a quorum of 3 means that three machines must return the same rendered image to validate.
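
A minimal Python sketch of that idea, assuming each returned result can be reduced to a comparable value such as an image hash. The `canonical_result` helper and the exact-match comparison are assumptions for illustration, not BURP's validator.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def canonical_result(results: Sequence[Hashable], min_quorum: int) -> Optional[Hashable]:
    """Return the output that at least min_quorum hosts agree on, else None."""
    if not results:
        return None
    # most_common(1) yields the output with the highest number of votes.
    output, votes = Counter(results).most_common(1)[0]
    return output if votes >= min_quorum else None


print(canonical_result(["a", "a", "a"], min_quorum=3))  # a
print(canonical_result(["a", "b"], min_quorum=2))       # None (no agreement yet)
print(canonical_result(["a", "b", "a"], min_quorum=2))  # a
print(canonical_result(["x"], min_quorum=1))            # x (anything is accepted)
```

Note that the quorum is separate from the number of copies sent out at first: a workunit can be replicated to 5 hosts and still need only 3 agreeing results.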

DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 403
Credit: 2,189,214
RAC: 143
Message 11347 - Posted: 22 May 2012, 4:52:37 UTC - in response to Message 11334.  

i thought if 2 machines return the same rendered image, it is enough to validate the WU ?

Well, it's fairly easy to explain. Compare two pictures with each other. If you find even a slight pixel difference, how can you say which one of them is really correct? So we need at least two pictures which are absolutely the same, and a third result to settle it when two disagree. That's why a quorum of 3 is the minimum.
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg

rilian
Joined: 8 Aug 09
Posts: 37
Credit: 25,517
RAC: 0
Message 11358 - Posted: 25 May 2012, 22:08:20 UTC - in response to Message 11347.  
Last modified: 25 May 2012, 22:11:06 UTC

DoctorNow, does this mean that rendering here is approximate and sometimes pixels get random color ?

I thought if both WUs are calculated with the same numeric precision (let's say X bytes per float) then they should produce the same output

There is quite a lot of biology software that docks proteins at the atomic level, and there 2 WUs are enough to validate

Image data doesn't really need to be an "exact match" - your eye would not see the difference between JPEG at 100% quality and at 80%, so why can't we have a quorum of at least 2 here? (i believe we could have quorum 1 here as well).
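
To make the "not an exact match" idea concrete, here is a small Python sketch of a tolerance-based comparison. The `images_match` helper, the flat list of 0-255 channel values, and the tolerance are all assumptions for illustration; this is not how the project's validator is known to work.

```python
from typing import Sequence


def images_match(a: Sequence[int], b: Sequence[int], tolerance: int = 0) -> bool:
    """True if both images have the same size and every channel value
    differs by at most `tolerance`."""
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))


host_1 = [10, 20, 30, 255]
host_2 = [10, 21, 30, 254]

print(images_match(host_1, host_2))               # False: exact match required
print(images_match(host_1, host_2, tolerance=2))  # True: small differences allowed
```

The catch is picking the tolerance: too strict and it behaves like an exact match, too loose and a genuinely damaged frame could pass.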

Janus, would you like to try it (quorum=1), for example on one session, and we will see if it is a success? It would be nice to free 66% of the resources, wouldn't it? If it is not a success, try quorum=2 on the same session? I would be happy to help crunching this

thanks for reading :)

noderaser: thanks for clarifying terms :) but i already know what quorum means
I crunch for Ukraine! The best team in DC universe!

Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4568
Credit: 2,100,409
RAC: 68
Message 11360 - Posted: 26 May 2012, 11:13:15 UTC - in response to Message 11358.  

Blender Cycles sessions will be quorum=1 sessions for a while (the validator doesn't support Cycles). Quorum=1 means that any result is always accepted, no matter how wrong it is.

rilian
Posts: 37
Credit: 25,517
RAC: 0
Message 11362 - Posted: 26 May 2012, 11:56:05 UTC - in response to Message 11360.  

cool! let's see if it will work and free the resources
I crunch for Ukraine! The best team in DC universe!

DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 403
Credit: 2,189,214
RAC: 143
Message 11378 - Posted: 4 Jun 2012, 9:01:59 UTC - in response to Message 11358.  

Couldn't answer earlier, was on vacation meanwhile. ;-)

DoctorNow, does this mean that rendering here is approximate and sometimes pixels get random color ?

LOL, certainly not. Normally, if everything goes well, two rendered pictures should be exactly the same here. But of course it can happen that there are pixel differences - I believe there can be several reasons for such a case, but I have to admit I don't know enough about the Blender render code and the rest around it and can't explain it better.
It was the only reason I could think of why we have a quorum of 3, probably only Janus knows the true reason behind it. ;-D
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg

Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4568
Credit: 2,100,409
RAC: 68
Message 11379 - Posted: 5 Jun 2012, 8:00:20 UTC

BURP as such is merely a platform for doing rendering via BOINC. Blender is developing alongside that, and one of the developments in Blender is Cycles, which is indeed an approximate rendering mechanism. With that rendering engine pixel values are "random" and slowly converge towards the correct value. The same is the case for other physics-based renderers.
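
A toy Python sketch of that convergence, assuming a pixel is estimated by averaging noisy samples. The `estimate_pixel` helper and the uniform noise model are invented for illustration and have nothing to do with the actual Cycles code.

```python
import random


def estimate_pixel(true_value: float, samples: int, seed: int) -> float:
    """Average `samples` noisy measurements scattered around true_value."""
    rng = random.Random(seed)  # each host would effectively use its own seed
    total = sum(true_value + rng.uniform(-0.5, 0.5) for _ in range(samples))
    return total / samples


# Few samples: two hosts with different seeds get visibly different values.
print(estimate_pixel(0.5, 16, seed=1), estimate_pixel(0.5, 16, seed=2))

# Many samples: both estimates end up close to the true value of 0.5,
# but they are still not bit-for-bit identical.
print(estimate_pixel(0.5, 100_000, seed=1), estimate_pixel(0.5, 100_000, seed=2))
```

This is why an exact-match comparison between hosts cannot work for this kind of renderer unless every host uses the same random sequence.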

"Blender Internal" which is the rendering engine that we have typically been using here is a good old bucket renderer. It renders pixels one at a time to their final value (in that layer). Typically no randomness is involved. However, there are situations where randomness can be introduced into the image anyways - libraries used on the platform, 32bit vs 64bit, Linux vs Windows vs OSX, errors in Blender code, processor floating point precision etc.
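
One small, generic example of the floating point part, in Python: adding the same numbers in a different order can give a different result, and different compilers or libraries are free to order such operations differently. This is only an illustration of the effect, not something taken from the Blender code.

```python
def sum_left_to_right(values):
    """Add the values in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    """Add the same values in reverse order."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


values = [0.1, 0.2, 0.3]
print(sum_left_to_right(values))  # 0.6000000000000001
print(sum_right_to_left(values))  # 0.6
print(sum_left_to_right(values) == sum_right_to_left(values))  # False
```

A difference in the last digit like this is invisible to the eye, but it is enough to make two image files differ when they are compared exactly.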