Credits with MT app

Message boards : Number crunching : Credits with MT app

zombie67 [MM]
Project donor
Joined: 9 Dec 06
Posts: 90
Credit: 2,118,099
RAC: 0
Message 10719 - Posted: 9 Feb 2011, 7:54:43 UTC
Last modified: 9 Feb 2011, 8:18:29 UTC

I have 40 valid tasks now, with the MT app. Credits per (true) core per hour are ranging from 7 to 26. This is across four machines, 3x 8-core and 1x 4-core. All four are Core2 based machines. In general, anything less than 20 credits per core per hour is a very poor score. Most CPU projects award between 20 and 40. Of course, this assumes that cross-project parity is even possible or desirable (it's neither). If the detailed data is interesting to the admins, just PM me and I will send it.
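For anyone who wants to reproduce this kind of figure, here is a minimal sketch of the arithmetic, assuming the rate is granted credit divided by wall-clock hours times physical cores. The function name and the three sample tasks are made up for illustration; they are not my actual task data.

```python
def credits_per_core_hour(credit: float, run_time_s: float, cores: int) -> float:
    """Granted credit per physical core per hour of wall-clock run time."""
    if run_time_s <= 0 or cores <= 0:
        raise ValueError("run_time_s and cores must be positive")
    return credit / (run_time_s / 3600.0 * cores)


# Hypothetical tasks: (granted credit, run time in seconds, physical cores)
tasks = [(112.0, 7200.0, 8), (208.0, 3600.0, 8), (56.0, 3600.0, 4)]
rates = [credits_per_core_hour(*task) for task in tasks]

for task, rate in zip(tasks, rates):
    print(f"{task}: {rate:.1f} credits per core per hour")
```

Hyper-threaded (logical) cores are deliberately not counted here, which is what "true" core means above.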
Dublin, California
Team: SETI.USA

ID: 10719 · Rating: 0
DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 403
Credit: 2,183,005
RAC: 5
Message 10721 - Posted: 9 Feb 2011, 9:59:35 UTC
Last modified: 9 Feb 2011, 10:00:21 UTC

Well, as Janus stated here, the credits are based on DA's special internal server software. That automatically implies cross-project parity. ;-)
The WUs I have received since yesterday are all between 20 and 26 cr/h/core, with no drop below 20 yet.
But I have seen that in other sessions before, and I'm also not very happy about using this credit scheme.
I would rather stick to fixed credits calculated from which Blender work steps are done, or something along those lines, but I don't know whether that can be realized.
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats
ID: 10721 · Rating: 0
zombie67 [MM]
Project donor
Joined: 9 Dec 06
Posts: 90
Credit: 2,118,099
RAC: 0
Message 10724 - Posted: 9 Feb 2011, 16:09:12 UTC - in response to Message 10721.  

Well, as Janus stated here, the credits are based on DA's special internal server software.


Yep, that is why I made this post, to show actual data.

That implies automatically the cross-project parity. ;-)


Hilarious! =;^)
Dublin, California
Team: SETI.USA

ID: 10724 · Rating: 0
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4559
Credit: 2,097,282
RAC: 0
Message 10727 - Posted: 9 Feb 2011, 21:58:23 UTC

That implies automatically the cross-project parity. ;-)

=)

I would rather stick to fixed credits calculated from which Blender work steps are done, or something along those lines, but I don't know whether that can be realized.

Unfortunately that is not an option due to the nature of 3D rendering: no two workunits are the same. Estimating their "value" requires you to render them first - which kinda defeats the point.

There is a lot to be said in the discussion about the new credit system (especially in the case of a project that cannot produce runtime estimates of its workunits ahead of time), but I'll refrain from doing so until I have a proper statistical basis for my conclusions. In other words: Let's give it a shot and see what happens.
ID: 10727 · Rating: 0
Tomasz R. Gwiazda
Joined: 20 Jul 09
Posts: 6
Credit: 82,543
RAC: 0
Message 10735 - Posted: 10 Feb 2011, 17:27:32 UTC - in response to Message 10727.  

Silly question...

Do the tasks ever complete?

After an app error, they start from 0% again.

And what about the strange CPU usage? On an i7 950, the task uses 13% of the processor, then after some time it goes to 86-90% (and after a few minutes it crashes).
ID: 10735 · Rating: 0
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4559
Credit: 2,097,282
RAC: 0
Message 10737 - Posted: 10 Feb 2011, 18:11:40 UTC
Last modified: 10 Feb 2011, 18:12:43 UTC

Do you have "leave tasks in memory" enabled in your preferences?
ID: 10737 · Rating: 0
Tomasz R. Gwiazda
Joined: 20 Jul 09
Posts: 6
Credit: 82,543
RAC: 0
Message 10738 - Posted: 10 Feb 2011, 18:28:10 UTC - in response to Message 10737.  

Why should I turn it on?

In every other BOINC project it is off and works fine.
In addition, I stop other CPU projects while crunching BURP.
ID: 10738 · Rating: 0
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4559
Credit: 2,097,282
RAC: 0
Message 10739 - Posted: 10 Feb 2011, 18:38:41 UTC - in response to Message 10738.  
Last modified: 10 Feb 2011, 18:44:53 UTC

If "leave applications in memory" is off BOINC will stop the science application at random times and remove it from memory to restart it at the latest checkpoint. This causes BURP's applications to restart from the beginning since Blender does not support checkpoints (and currently it also causes the app to crash for some reason).

In every other BOINC project it is off and works fine.

Nope, it is off and you do not notice the effect. Essentially the workunits will complete slightly faster in any project if you allow them to stay in memory - especially if you are running multiple projects or suspending/resuming things a lot. BURP is kinda extreme because situations can arise where a workunit will simply never complete because it keeps getting interrupted and has to start over.
Any modern operating system will swap out suspended apps without affecting system performance, so there really is no reason for turning it off (unless you are running short on both memory and swap space).

If you absolutely MUST run without swap you should set "Switch between tasks" to something very high so that applications are never switched out and are allowed to run to completion. I cannot recommend this, though.
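
To illustrate the restart behaviour described above, here is a minimal Python sketch. It is not BOINC code. It assumes, for simplicity, that the task is preempted at the end of every "Switch between tasks" interval, and the function name and parameters are made up for illustration.

```python
def minutes_to_finish(work_min, switch_min, keep_in_memory,
                      checkpoint_min=None, max_slices=1000):
    """Total running minutes until the task finishes, or None if it never does.

    Simplified model: the task is preempted after every `switch_min` minutes.
    """
    done = 0.0   # progress that survives a preemption
    total = 0.0  # minutes of run time spent so far
    for _ in range(max_slices):
        remaining = work_min - done
        if remaining <= switch_min:
            return total + remaining          # finishes within this slice
        total += switch_min
        progress = done + switch_min
        if keep_in_memory:
            done = progress                   # suspended in RAM, nothing lost
        elif checkpoint_min:
            # removed from memory: resume from the latest checkpoint
            done = (progress // checkpoint_min) * checkpoint_min
        else:
            done = 0.0                        # no checkpoints: start over
    return None


print(minutes_to_finish(180, 60, keep_in_memory=True))                      # 180.0
print(minutes_to_finish(180, 60, keep_in_memory=False, checkpoint_min=25))  # 210.0
print(minutes_to_finish(180, 60, keep_in_memory=False))                     # None
print(minutes_to_finish(180, 240, keep_in_memory=False))                    # 180.0
```

The third call is the BURP case: without checkpoints and without staying in memory, a 3-hour task never gets past a 60-minute switch interval. The last call corresponds to the fallback of setting "Switch between tasks" higher than the task's run time.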
ID: 10739 · Rating: 0
Tomasz R. Gwiazda
Joined: 20 Jul 09
Posts: 6
Credit: 82,543
RAC: 0
Message 10742 - Posted: 11 Feb 2011, 16:42:03 UTC - in response to Message 10739.  

It works!

Thanks a lot for your patience and the working solution :)
ID: 10742 · Rating: 0
zombie67 [MM]
Project donor
Joined: 9 Dec 06
Posts: 90
Credit: 2,118,099
RAC: 0
Message 10777 - Posted: 18 Feb 2011, 16:43:13 UTC

An update on the recent credits. For my 8-core machine, it is averaging 17 credits per hour per core. Still pretty low, but bearable for now, I guess.

For one of my quad core machines, it is averaging only 7 credits per hour per core. Ouch. Almost not worth crunching at that level.

One thing I noticed that looks pretty odd. The CPU time for the 8-core machine should be 8x the run time (assuming 100% efficiency). In reality, it is showing only about 4x on average (range of 3x-5x). So only about 50% utilization. It is about the same for the quad core with 50% utilization on average, but with a much wider range 1x-3x. Most of the tasks have the CPU time just *barely* greater than the run time. So it's like it's using only one of four cores. It's a good thing that there were a couple of very high utilization tasks sprinkled in to bring up the average to half.

All that considered, it seems like we are wasting about half our resources with the MT app. Is that going to improve significantly soon? If not, it seems like it would be better for the project to go back to the single threaded app. Sure, tasks take longer to complete, but you would get 2x the work done over a similar period of time.
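
The utilization figures above come from simple arithmetic, sketched here in Python. The function name is made up, and the run times are placeholder values chosen only to reproduce the ratios I quoted (4x on the 8-core, 2x and 1x on the quad).

```python
def core_utilization(cpu_time_s: float, run_time_s: float, cores: int) -> float:
    """Fraction of the available core time that the task actually used."""
    return cpu_time_s / (run_time_s * cores)


run_time = 3600.0  # placeholder: one hour of wall-clock time

eight_core_avg = core_utilization(4 * run_time, run_time, 8)  # CPU time = 4x run time
quad_core_avg = core_utilization(2 * run_time, run_time, 4)   # CPU time = 2x run time
quad_core_low = core_utilization(1 * run_time, run_time, 4)   # CPU time = 1x run time

print(f"8-core average: {eight_core_avg:.0%}")
print(f"quad average:   {quad_core_avg:.0%}")
print(f"quad low end:   {quad_core_low:.0%}")
```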
Dublin, California
Team: SETI.USA

ID: 10777 · Rating: 0
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4559
Credit: 2,097,282
RAC: 0
Message 10780 - Posted: 18 Feb 2011, 19:13:56 UTC - in response to Message 10777.  
Last modified: 18 Feb 2011, 19:20:37 UTC

There are several things to notice in this post; I'll try to explain them one by one while quoting you.

One thing I noticed that looks pretty odd. The CPU time for the 8-core machine should be 8x the run time (assuming 100% efficiency)

This assumption is not correct. It would have been really nice if the two (calculations/sec and number of cores) were linearly proportional, though. Adding cores currently only affects the "middle" portion of Sunflower workunits directly, not the preprocessing (startup/loading) and postprocessing (shutdown/compositing/compression) portions which can be a significant part of some of the sessions.

Furthermore, the workunits are not always CPU-bound. For example, they will write out and read in multiple gigabyte-size files to and from the hard drive. While doing this the CPU is simply waiting for I/O to complete. Similarly, some of the sessions will allocate and de-allocate massive amounts of memory. While this happens, and while things are being loaded into memory, most memory and I/O controllers involved will be at 100% utilization while the CPU will be doing very little.

BOINC does not measure (nor credit) these other factors in the computer systems. I'm sure that we can both agree that it should - but also that it is non-trivial how it should do that.
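
The point about only the "middle" portion scaling with cores can be sketched with Amdahl's law. This is my own illustration rather than anything BOINC or Blender computes, and it assumes a perfectly parallel render pass, single-core pre/postprocessing, and no I/O waits (which would lower the ratio further). The function names are made up.

```python
def cpu_to_run_ratio(parallel_fraction: float, cores: int) -> float:
    """Expected CPU time / run time for a task under Amdahl's law.

    parallel_fraction is the share of the single-core work that is spent
    in the multi-threaded render pass; the rest runs on one core.
    """
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def parallel_fraction_for(ratio: float, cores: int) -> float:
    """Invert the formula: which parallel share yields the observed ratio?"""
    return (1.0 - 1.0 / ratio) / (1.0 - 1.0 / cores)


print(cpu_to_run_ratio(0.0, 8))        # 1.0: pure startup/shutdown workunit
print(cpu_to_run_ratio(1.0, 8))        # 8.0: the "100% efficiency" assumption
print(round(parallel_fraction_for(4.0, 8), 3))  # 0.857
```

Under these assumptions, an average ratio of 4x on an 8-core machine corresponds to roughly 86% of the single-core work being inside the render pass, so even a modest startup/shutdown share halves the apparent utilization.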

Most of the tasks have the CPU time just *barely* greater than the run time.

I guess this is the workunit you are referring to.
This is a good example of what I wrote about above. The actual render pass consists of copying in a background image and then rendering the trees that are only just visible in the lower right 10% of the view. Such a workunit would consist almost purely of startup and shutdown parts.

All that considered, it seems like we are wasting about half our resources with the MT app. [...] it seems like it would be better for the project to go back to the single threaded app. Sure, tasks take longer to complete, but you would get 2x the work done over a similar period of time.

Do not confuse multithreading with Sunflower-specific rendering patterns. The fact that the MT clients only allow the rendering passes to scale to multiple cores is particularly visible with the Sunflower sessions. Normal sessions do not use a gigabyte of datafiles with extremely complex scenery linked across multiple libraries - they will not suffer from the same startup "penalties". In the same vein most normal sessions do not use compositing extensively and hence only have to write the rendered result to a file (which is pretty fast when rendering to PNG and smaller resolutions).

Keep in mind, again, that these Sunflower workunits are not "normal". Running two of them in parallel will only be possible on fewer than 1% of the machines currently attached to the farm. It is true that this 1% would then be running at (possibly) twice the speed, but that simply does not make up for the loss of rendering nodes.

Is that going to improve significantly soon?

Is that going to improve? Yes! Certainly it is. Already in Blender 2.5x I am seeing definite improvements in both compositing and scene loading/preparation efficiency. Also, memory and disk controllers are getting faster every day.
Will it change soon? Good question. The Sunflower sessions will be rendered using 2.4x clients - most likely a mix of Blender 2.49b and 2.46. Some of the sessions will be fairly heavy and will scale well with cores (but not the entire workunit, mind you) and some will be smaller and more I/O dependent.

If you would like to optimize the amount of work you can get out of the current clients you should make sure that:
* Your BOINC data and slots directories are on very fast media. Personally I use a RAID0 SSD setup and am seeing extreme improvements.
* Memory controllers and I/O controllers should be running at their optimal speeds. Depending on your CPU clock and memory latencies you may have to change the memory clock up or down to get a perfect match. This is outside the scope of this post - but I'm sure that plenty of OC forums will be able to help you here. (Keep in mind that messing with your system clocking may void your warranty). You may be surprised to see that if you incorrectly overclock your CPU, your computer will be running slower (measured in real time) on the Sunflower sessions due to memory and I/O timing being out of sync.
* Cables: Make sure that all I/O cables are of good quality and are firmly attached. My RAID0 setup (above) failed to deliver more than 200MB/s initially due to bad cables.
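
If you want a rough idea of how fast the disk behind your BOINC data directory is, a quick sequential-write test can be sketched in Python as below. This is only an approximation and not a proper benchmark: the function name is made up, the example writes to the system temp directory (pass your own BOINC data directory instead), and a small file size like the one used here gives noisy numbers.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory: str, size_mb: int = 256, chunk_mb: int = 4) -> float:
    """Write a temporary file into `directory` and return MB/s, including fsync."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # make sure the data really reached the disk
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)           # always clean up the test file
    return chunks * chunk_mb / elapsed


# Example run against the temp directory with a small 16 MB file.
rate = write_throughput_mb_s(tempfile.gettempdir(), size_mb=16)
print(f"sequential write: {rate:.0f} MB/s")
```

Reads are not measured here because the operating system would normally serve them from its cache right after the write.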
ID: 10780 · Rating: 0
DoctorNow
Project donor
Joined: 11 Apr 05
Posts: 403
Credit: 2,183,005
RAC: 5
Message 10781 - Posted: 18 Feb 2011, 19:53:02 UTC - in response to Message 10780.  

* Your BOINC data and slots directories are on very fast media. Personally I use a RAID0 SSD setup and am seeing extreme improvements.

Oh dear, I already knew I should have bought an SSD when you stated earlier that the disk I/O is that heavy for the Sunflower event... ;-)
Life is Science, and Science rules. To the universe and beyond
Proud member of BOINC@Heidelberg
My BOINC-Stats
ID: 10781 · Rating: 0
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4559
Credit: 2,097,282
RAC: 0
Message 10782 - Posted: 18 Feb 2011, 19:55:19 UTC - in response to Message 10781.  

* Your BOINC data and slots directories are on very fast media. Personally I use a RAID0 SSD setup and am seeing extreme improvements.

Oh dear, I already knew I should have bought an SSD when you stated earlier that the disk I/O is that heavy for the Sunflower event... ;-)

It also has the added benefit that programs just instantly open when you click on them. Why do good things have to be so expensive?
ID: 10782 · Rating: 0
zombie67 [MM]
Project donor
Joined: 9 Dec 06
Posts: 90
Credit: 2,118,099
RAC: 0
Message 10784 - Posted: 19 Feb 2011, 5:44:57 UTC

Thanks for the info. That answers a lot of questions, and clears up some misunderstandings on my part. But it also leads to a couple of items:

1) BOINC credits are based on CPU time. So if the app has idle CPU time, for whatever reason, BOINC is going to under-award credits. The only way I can see to bring credits back up to something reasonable is to apply a multiplier to the credits that would normally be awarded. Is that the plan? And if so, how long are you going to gather data before making a change?

2) You talk about Sunflower vs. normal. Or is that unusually demanding Sunflower tasks vs. normal Sunflower tasks? It's not clear to me. In any case, what is the plan going forward? Will there be a mix, instead of all Sunflower like now? And if so, will there be a preference setting to let us choose which to crunch?
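
To put numbers on the multiplier idea in item 1, here is a minimal sketch. It is only my own arithmetic, not something the project has said it will do. The observed rates are the averages from my earlier post, and the target of 20 credits per core per hour is an assumption taken from the low end of what I see at most CPU projects.

```python
def multiplier_for_target(observed_rate: float, target_rate: float) -> float:
    """Factor by which granted credit would have to be scaled to hit the target."""
    if observed_rate <= 0:
        raise ValueError("observed_rate must be positive")
    return target_rate / observed_rate


TARGET = 20.0  # assumed target, credits per core per hour

# Observed averages from my machines (credits per core per hour)
observed = {"8-core": 17.0, "quad-core": 7.0}
multipliers = {name: multiplier_for_target(rate, TARGET)
               for name, rate in observed.items()}

for name, factor in multipliers.items():
    print(f"{name}: x{factor:.2f}")
```

The gap between the two factors shows why a single fixed multiplier would not bring both machines to the same level.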

Thanks!
Dublin, California
Team: SETI.USA

ID: 10784 · Rating: 0
Janus
Volunteer moderator
Project administrator
Joined: 16 Jun 04
Posts: 4559
Credit: 2,097,282
RAC: 0
Message 10785 - Posted: 19 Feb 2011, 9:50:58 UTC - in response to Message 10784.  
Last modified: 19 Feb 2011, 10:14:51 UTC

The only way I can see to bring credits back up to something reasonable is to apply a multiplier to the credits that would normally be awarded. Is that the plan? And if so, how long are you going to gather data before making a change?

"Reasonable" is a relative term.
See this and this and the countless cross-project credit issues on the mailing lists. You are welcome to discuss it here and here.

To boil it down: The system is supposed to adapt itself over time. If it doesn't then you are welcome to restart a 4-year-old "flamewar" about credit - but not in these forums. Feel free to post back here when you find a reasonably practical solution that can be proven mathematically sane (it is harder than you may think, I've been there thrice).

2) You talk about Sunflower vs. normal. Or is that unusually demanding Sunflower tasks vs. normal Sunflower tasks?

Sunflower sessions vs non-Sunflower sessions (even on Sunflower clients).

In any case, what is the plan going forward? Will there be a mix, instead of all Sunflower like now? And if so, will there be a preference setting to let us choose which to crunch?

See the bottom of this announcement.

Apart from the above there will be improvements from BURP over time - what you are seeing right now are the first careful steps into MT/GPU apps - but we simply do not have the manpower to revolutionize the BOINC credit system while at the same time building a distributed renderfarm; focus will be on one or the other. On the other hand I do not want to squelch talk about the credit system - I firmly believe that it is a good thing that people think about it and come up with better solutions in the future. I hope this is clear.
ID: 10785 · Rating: 0