Milkyway issues

homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm
Contact:

Milkyway issues

Post by homefarm »

Hi All

All my CPU threads (96) are on WCG Open Pandemics.

Surprising they haven't implemented GPU routines, which would offer such an immense improvement in throughput. Different skill set, I suppose.

BTW I tried Milky Way with my GPUs and 60% fell over with inconclusives. If anyone knows why, I'd be grateful. All 5 GPUs have returned thousands of perfect results on PG sub-projects. The MW results took about 58 secs each, but until I solve the problem I won't waste time on MW.
Woodles
UBT Contributor
Posts: 11757
Joined: Thu Dec 20, 2007 12:00 am
Location: Cambridgeshire

Re: Milkyway issues

Post by Woodles »

Hi Richard,

Same here :)

Re. Milkyway. You only have one host listed since June last year and it has a 100% success rate. Were you thinking of a different project?

Mark
chriscambridge
Active UBT Contributor 1+ yr
Posts: 2178
Joined: Mon Aug 08, 2016 1:56 pm
Location: UK

Re: Milkyway issues

Post by chriscambridge »

Surprising they haven't implemented GPU routines
I think I read in the forums that the toolkit/API they are using for this research can do GPU crunching, but to get the project out there as quickly as possible they only set up CPUs to begin with; they were also actively looking at GPUs for future revisions.

Not sure about MW as I haven't run it for a while. But if more than one person is getting the same issue then it must be to do with their existing dataset/model, rather than one's setup.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

Woodles wrote: Fri May 15, 2020 12:05 pm Hi Richard,

Same here :)

Re. Milkyway. You only have one host listed since June last year and it has a 100% success rate. Were you thinking of a different project?

Mark
Hi

As far as I know there are only 3 Milky Ways: 1. Galaxy, 2. Choccy bar, 3. The Boinc app I tried ;-)

BUT, guess what, I just logged in and there are 30 valid tasks, GPU & CPU. When I was running them they reported as inconclusive. I presumed that was a failure, but maybe it's just their way of handling uploads?

I'm running some tasks currently and I'll report back.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

OK, MW GPU tasks just uploaded, about 60 secs each run time, and they all went inconclusive.

I'm running 2 GPUs in parallel so I wonder if that's twisting its knickers?

Afterburner shows only 1 GPU working, at around 19% usage (Titan V - GV100). The other GPU is not running MW tasks: 1% usage, 27 degrees C. All tasks are inconclusive at first and then get sorted later. Furthermore it only feeds me tasks on update. Once they are complete there's no inflight refuelling; I have to update MW to get more tasks. Usually 9 tasks arrive, then no replenishments.
Woodles
UBT Contributor
Posts: 11757
Joined: Thu Dec 20, 2007 12:00 am
Location: Cambridgeshire

Re: Milkyway issues

Post by Woodles »

Hi Richard,

If you click on the workunit for each inconclusive task you'll find that the wingman hasn't returned the task yet. Only your result, nothing to check it against = inconclusive.

Look at the Validate state for each task - "Checked, but no consensus yet"

Doesn't look like there's anything wrong with your work, you're just too fast! :D

All your tasks do seem to be running on just your Titan even though Milkyway sees your 2080 ti as well. There are other people running a 2080 ti so the project can use them. Silly question but I assume you've got <use_all_gpus> set in cc_config?

There's a thread on the message boards about not getting work for ten minutes after an update - https://milkyway.cs.rpi.edu/milkyway/fo ... =6&start=0 Looks like it's just the way that the Milkyway server is set up.

Mark
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

homefarm, go to this post in the thread Woodles linked to.
https://milkyway.cs.rpi.edu/milkyway/fo ... 9272#69272

Download the boinc.exe file using the link in that post. It's version 7.15.0.0
Then just stop the boinc client and switch boinc.exe files.

And don't worry about the inconclusives.

I don't know about the GPUs you're running MW on, but on my old AMD 280X GPUs and on my new Radeon VII GPUs I run 4 tasks at a time on each GPU. 4 tasks complete in around 45 seconds on the VIIs, not just 1. It's why I have the VIIs and 3 billion credits in MW.
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

An admin should probably move all these MW posts to the MW section.
Woodles
UBT Contributor
Posts: 11757
Joined: Thu Dec 20, 2007 12:00 am
Location: Cambridgeshire

Re: Milkyway issues

Post by Woodles »

ChelseaOilman wrote: Fri May 15, 2020 4:29 pm An admin should probably move all these MW posts to the MW section.
You think we're a professional bunch?! :o
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

Woodles wrote: Fri May 15, 2020 5:17 pm You think we're a professional bunch?! :o
LOL
The one thing I really miss from my old team is the Slack/Discord channel all the regulars conversed in. It does take away a bit from the forum though.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

Hi, I am running a batch file command to update MW, every 256 secs. It seems to keep higher work levels, but it's not perfect.
I don't want to download old versions of boinc.exe as it may upset other projects. (BTW ChelseaOM, thanks for the suggestion)
I am running Folding again, on GPUs only; and WCG open pandemics for the CPUs. I have also reduced CPU usage to 50%, to keep heat levels down.
Both my GPUs are blowers so that helps things even if it is noisier.
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

It's not an old version of boinc.exe really. It's a specially tweaked version just for MW. You can easily switch back to the newer version anytime just by stopping the client and switching boinc.exe. You don't need to use the batch file to hammer the server with this tweaked exe. It works better than the batch file. You get tasks continuously with it, no batch file needed. I get more PPD using it. I run other projects on it as well and haven't noticed any issues. Currently running Einstein.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

ChelseaOilman wrote: Fri May 15, 2020 9:43 pm It's not an old version of boinc.exe really. It's a specially tweaked version just for MW. You can easily switch back to the newer version anytime just by stopping the client and switching boinc.exe. You don't need to use the batch file to hammer the server with this tweaked exe. It works better than the batch file. You get tasks continuously with it, no batch file needed. I get more PPD using it. I run other projects on it as well and haven't noticed any issues. Currently running Einstein.
OK, thanks, I'll look again tomorrow. BFN.
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

You probably need to run more tasks on your GPUs at the same time as well. MW doesn't load up the GPU. I run 4 at a time on each GPU, NP. They run almost as fast as running a single task.
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

Each of my Radeon VIIs do more than 1.5 million PPD. I have 10 VIIs and have hit 18 million PPD with them. I have 2 more VIIs coming to make the total 12.
Woodles
UBT Contributor
Posts: 11757
Joined: Thu Dec 20, 2007 12:00 am
Location: Cambridgeshire

Re: Milkyway issues

Post by Woodles »

FWIW I run Boinc versions as old as 7.4.22 :) My most 'modern' version is 7.10.2 :D And I've never had any problems with any projects.

If it ain't broke, don't fix it!
Woodles
UBT Contributor
Posts: 11757
Joined: Thu Dec 20, 2007 12:00 am
Location: Cambridgeshire

Re: Milkyway issues

Post by Woodles »

ChelseaOilman wrote: Fri May 15, 2020 9:58 pm Each of my Radeon VIIs do more than 1.5 million PPD. I have 10 VIIs and have hit 18 million PPD with them. I have 2 more VIIs coming to make the total 12.
Impressive, I did 1.4 million total on my best day ever ... and that was with bunkering! :0
UBT - Timbo
UBT Forum Admin
Posts: 9673
Joined: Mon Mar 13, 2006 12:00 am
Location: NW Midlands

Re: Milkyway issues

Post by UBT - Timbo »

ChelseaOilman wrote: Fri May 15, 2020 4:29 pm An admin should probably move all these MW posts to the MW section.
...and if this was a really well-run forum, an admin would also correct the name of the project in the subject line of each of these posts...

oops - they have ;-)

regards
Tim
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

Radeon VII = Double Precision 3.5 TFLOPS (1/4 rate)

That's the name of the game with Milkyway.

Most other consumer GPUs, Nvidia included, run DP at 1/16 rate or slower.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

Hi all
I now have 7.15 boinc.exe in program files/boinc

The boincmgr.exe is still 7.16.5, but I presume that's OK?

In the Milkyway project folder, I have
<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.1</gpu_usage>
      <cpu_usage>0.05</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

It's running 7 tasks on the GV100 (which is strong in DP, like the VII). Duration is 60 secs or so for 7 tasks.

It's getting up to 95% usage on the Titan V, but doesn't use the second GPU (the 2080ti) even with the cc_config GPU option set to 1.

The remaining issue is that MW, under Boinc 7.15, still does not load tasks properly. After running a set of 7 it goes back to deferment cycling but does not reload tasks. I have it set to 3 days of work as well.

Am I missing something pls? ( my 73 year old brain can't seem to get any further ;-) )

As a last resort I have run the batch file update with a 256 sec delay, and that gets me a batch of 7 tasks that run for about 1 min; then it defers again before getting another 7 tasks.

Sorry to be a bore with this problem, and thanks all of you for your help. GREAT FORUM!
UBT - Timbo
UBT Forum Admin
Posts: 9673
Joined: Mon Mar 13, 2006 12:00 am
Location: NW Midlands

Re: Milkyway issues

Post by UBT - Timbo »

Hi Richard

I doubt the version of BOINC is going to be a problem - as the Manager program is separate to the boinc program and as long as both of their respective filenames are what each expects to "see" (irrespective of version number) then neither will care too much.

re: app_config

This is my version:
<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>2</max_concurrent>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.75</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
This is for an older CPU and an older (single) GPU, both of which are less powerful than yours, so I doubt that this is the issue. But if you want to edit your file to the values shown in mine, then try it: suspend any MW tasks, find the "app_config.xml" file within the ProgramData/BOINC/projects/milkyway.cs.rpi.edu_milkyway folder, right click on the filename and select "Notepad" to open it, change the values and then save it.

Note that "max_concurrent" will limit the max number of concurrent tasks but if you are not using CPUs for MW, then this is fine...but in the future if you do use CPUs for this project you'll only get 2 tasks at once.

Then go back to BOINC Manager and click on "Options> Read config files" and then re-allow any MW tasks.

Your current setting of <gpu_usage>0.1</gpu_usage> is allowing up to 10 instances of the project's tasks to run on each GPU... that may be too much even for a high-end GPU. So, using <gpu_usage>1.0</gpu_usage> for now will at least get you going.

Then go back to BOINC Manager and click on "Options> Read config files" and then re-allow any MW tasks.

If that works in terms of allowing tasks to flow, the next step is to change the <gpu_usage>1.0</gpu_usage> value to <gpu_usage>0.5</gpu_usage> - this will then allow 2 instances of the project tasks to run concurrently.

Then go back to BOINC Manager and click on "Options> Read config files" and then re-allow any MW tasks.
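
For anyone following along, the gpu_usage steps above can be sketched as an annotated app_config.xml. This is only an illustration: the values are the ones mentioned in this thread, and the comments assume the BOINC client accepts XML comments in this file (delete them if it complains).

```xml
<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <!-- Fraction of one GPU that each task claims.
           The client runs roughly floor(1 / gpu_usage) tasks per GPU:
           1.0 gives 1 task, 0.5 gives 2, 0.25 gives 4, 0.1 gives 10. -->
      <gpu_usage>0.5</gpu_usage>
      <!-- Fraction of a CPU core budgeted for each GPU task. -->
      <cpu_usage>0.75</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

Stepping down gradually (1.0, then 0.5, then 0.25) makes it easier to spot the point where run time per task stops improving.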

re: app_info.xml

Also: you may want an "app_info.xml" file within the MW project directory which contains the following code:
<app_info>
  <app>
    <name>milkyway</name>
  </app>

  <app_version>
    <app_name>milkyway</app_name>
    <coproc>
      <type>NVIDIA</type>
      <count>1.0</count>
    </coproc>
  </app_version>
</app_info>
re: GPU settings

This time, suspend ALL crunching and close BOINC Manager down.

Within the ProgramData/BOINC directory is a file called "cc_config.xml". Once you find this, open it with Notepad and look within the file for the following line:
<use_all_gpus>1</use_all_gpus>
If it's there (exactly as shown above) then BOINC should use both GPUs for crunching. If the value is "0" then change it to "1".
Then save the file, close Notepad and then resume BOINC Manager.
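
If cc_config.xml does not exist yet, or the line is missing, a minimal file looks something like the sketch below. The <use_all_gpus> option sits inside <options>; everything else can be left out. Treat this as a starting point, not a copy of anyone's actual file.

```xml
<cc_config>
  <options>
    <!-- 1 = use every GPU the client detects, not only the most capable one -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

The startup lines in the Event Log list each GPU the client found, which is a quick way to check whether the second card is being picked up at all.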

re: Deferring tasks

In simple terms, BOINC has a built-in mechanism (the client-side "scheduler") where it "knows" your crunching history and it assigns a "weighting" to each project. This then means that you are not in full control of which tasks are given "priority"...and it will stay that way until BOINC Manager "decides" that you have now crunched enough tasks of each project to give them all equal "weight".

https://boinc.berkeley.edu/wiki/REC-based_scheduler

The advice here is to try a couple of things:

1) Try setting ALL projects to "Won't get new tasks" and then RESET each of the projects listed with BOINC Manager - do this once you have NO tasks to crunch (otherwise it'll simply abort them all and if you are halfway through a task then any crunching would be wasted).

2) Set the cache to be a bit higher than you would normally use on the "troublesome project" (in this case MW), as the revised scheduler (in v7.x of BOINC) needs about 10+ tasks to reset its internal settings for the "REC" (Recent Estimated Credit).

Then only allow MW to download tasks and give it a couple of hours.

See if any of the above helps.

regards
Tim

PS: If any other forum members can correct any mistakes I've made please mention it and I'll update the above.
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

Unless Boinc Manager is showing 7.15.0.0 in the bottom right corner I don't think you have it set up correctly. You have to stop the client, Rename or remove the 7.16.5 boinc.exe file and replace it with the 7.15.0.0 boinc.exe file before restarting the client. They can't both run at the same time, I don't think. It's one or the other.
homefarm wrote: Sat May 16, 2020 9:59 am Am I missing something pls? ( my 73 year old brain can't seem to get any further ;-) )
I'm 72 and struggle with this stuff as well. Especially Linux! lol
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

ChelseaOilman wrote: Sat May 16, 2020 2:53 pm Unless Boinc Manager is showing 7.15.0.0 in the bottom right corner I don't think you have it set up correctly. You have to stop the client, Rename or remove the 7.16.5 boinc.exe file and replace it with the 7.15.0.0 boinc.exe file before restarting the client. They can't both run at the same time, I don't think. It's one or the other.
homefarm wrote: Sat May 16, 2020 9:59 am Am I missing something pls? ( my 73 year old brain can't seem to get any further ;-) )
I'm 72 and struggle with this stuff as well. Especially Linux! lol
Hi, yes, definitely 7.15.0. Also the files were replaced as per instructions.

I worked all morning on this. I think I freaked out the MW server, so it was punishing me. Bit like Oliver Twist: "Please, sir, may I have some more please?"

After a break to do some farming, and selling a young bull, I am now at the peak of mental ability :?
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

UBT - Timbo wrote: Sat May 16, 2020 11:23 am See if any of the above helps.
Hi Tim, Thank you so much for your copious notes and suggestions, much appreciated. I will now re-do the app_config etc.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

Hi Tim

1. Removed MW entirely.
2. Rejoined MW. Using 7.15.0 boinc.exe
3. app_config.xml as below
<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>6</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.1666666</gpu_usage>
      <cpu_usage>.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

4. MW downloads a batch of 6 tasks, then runs 6 on the GV100 only. It takes up to 85% usage of the GPU; run time is 56 secs for 6 tasks simultaneously.
5. MW then goes to sleep, no reloads without "update".
6. Leaving it to run for a while.

Addendum: The other 7890XE with 2 GPUs, running 7.15.0, has developed a cache of tasks (?)
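
One thing that might be worth ruling out with this file: per the BOINC client configuration docs, <max_concurrent> limits how many tasks of the app run at once on the whole host, not per GPU. With a cap of 6 and each task claiming about 1/6 of a GPU, all six can fit on a single card, so a second GPU may have nothing left to run. A sketch that drops the cap, assuming both cards can handle the load (untested on this hardware):

```xml
<app_config>
  <app>
    <name>milkyway</name>
    <!-- No <max_concurrent> here, so the host-wide task count is not capped at 6. -->
    <gpu_versions>
      <!-- About 1/6 of a GPU per task, i.e. up to 6 tasks on each GPU in use. -->
      <gpu_usage>0.1666666</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

This would not explain why the second GPU sat idle before <max_concurrent> was added, so it is at best part of the picture.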
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

This is my app_config.xml file for MW to run 4 tasks per GPU:

<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.25</gpu_usage>
      <cpu_usage>0.50</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
It's pretty simple. I'm not using anything else. With MW you may be able to use less CPU, it doesn't need much.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

ChelseaOilman wrote: Sat May 16, 2020 9:27 pm This is my app_config.xml file for MW to run 4 tasks per GPU:

<app_config>
  <app>
    <name>milkyway</name>
    <gpu_versions>
      <gpu_usage>0.25</gpu_usage>
      <cpu_usage>0.50</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
It's pretty simple. I'm not using anything else. With MW you may be able to use less CPU, it doesn't need much.
Hi, now running 6 on the Titan V CEO, laid out like your app_config but with updated numeric values.

My only issue seems to be its reluctance to send more tasks, even on 7.15.0

<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>6</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.15</gpu_usage>
      <cpu_usage>.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

Maybe MW has to figure out it's dealing with a reliable host before it starts sending extra tasks. I always get a bunch.
I never do MW CPU tasks if that has any effect.
Also, since using the 7.15.0.0 client I never use the batch file to hammer the server and my cache is set to 0.1
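
For reference, the cache can also be set locally through global_prefs_override.xml in the BOINC data directory. A sketch, assuming the 0.1 above means 0.1 days; the additional-days value is purely illustrative:

```xml
<global_preferences>
  <!-- "Store at least" this many days of work (assumed unit: days) -->
  <work_buf_min_days>0.1</work_buf_min_days>
  <!-- "Store up to an additional" this many days (illustrative value) -->
  <work_buf_additional_days>0.25</work_buf_additional_days>
</global_preferences>
```

The same two values are available in BOINC Manager under the computing preferences, which is the easier route for most people.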
UBT - Timbo
UBT Forum Admin
Posts: 9673
Joined: Mon Mar 13, 2006 12:00 am
Location: NW Midlands

Re: Milkyway issues

Post by UBT - Timbo »

homefarm wrote: Sat May 16, 2020 9:35 pm Hi, now running 6 on the Titan V CEO, laid out like your app_config but with updated numeric values.

My only issue seems to be its reluctance to send more tasks, even on 7.15.0

<app_config>
  <app>
    <name>milkyway</name>
    <max_concurrent>6</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.15</gpu_usage>
      <cpu_usage>.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
Hi Richard

A quick question for you:

Did you try the "<gpu_usage>1.0</gpu_usage>" setting as I mentioned in my "long message" above?

If so, did BOTH GPUs get work?

Because I'm wondering this:

If you had the original setting as this: "<gpu_usage>0.1</gpu_usage>" then this might try to run 10 tasks on EACH GPU...as BOINC Manager doesn't know the capabilities of both GPUs, unless you tell it...and I doubt the 2080ti can run 10 MW tasks at once

Using my setting (of 1.0) would only run ONE task per GPU and in theory that should work.

If you are now running "<gpu_usage>0.15</gpu_usage>" that will try to run 6 tasks on BOTH GPUs... and I don't know if the 2080ti can cope with even 6 concurrent tasks.

Obviously if both GPUs were identical, then this shouldn't matter... but they aren't the same and so one might have to create a specific "app_info.xml" that gives the parameters for both GPUs separately...
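
BOINC does not seem to offer a per-GPU gpu_usage in app_config.xml, so a different tool that is sometimes used with mismatched cards is <exclude_gpu> in cc_config.xml, which keeps a project off one device. This is a swapped-in alternative, not the app_info.xml route described above. The project URL and device number below are assumptions: the URL must match what the client shows for the project, and device numbers are listed in the Event Log at startup.

```xml
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <exclude_gpu>
      <!-- Assumed project URL; copy the exact one shown by the client -->
      <url>http://milkyway.cs.rpi.edu/milkyway/</url>
      <!-- Hypothetical device number for the card to keep MW off -->
      <device_num>1</device_num>
      <type>NVIDIA</type>
    </exclude_gpu>
  </options>
</cc_config>
```

The excluded card stays available to other projects, so it could carry on with a different project while the other card runs MW.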

re: getting new tasks.

As mentioned, you need to let BOINC's scheduler adjust to your new settings, and it can only do this once it has processed and completed some MW tasks... so set the GPU usage to 1.0, allow new tasks to download from MW ONLY and then leave it for a few hours, as it needs time to reconfigure the scheduler to allow more tasks to download.

regards
Tim
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

Hi Tim
Yes I tried it, and it limited the tasks but still only finds 1 GPU.

I have detached the MW project on the problem PC, but left the other 2 PCs attached and running.

Next week I will start to do a new build, and will transfer the 2080ti to that new PC. That will make life simpler, and possibly more productive for MW.

Ironically MW is not currently a task I would choose to run for my own interest, but was trying to prepare the project for use in a Sprint. At least I still have 2 PCs happily running MW.

This 7890XE successfully runs Folding on background low priority, whilst also running Prime Grid in Boinc. The 2 projects share the GPUs, and make full use of their power. It's a good combination for me, allowing a beneficial project (Folding ie Covid-19) to run with a fun project (prime Grid).

Many thanks for all the tips, and advice.. :)

Quote for today

"It is the long history of humankind (and animal kind, too) that those who learned to collaborate and improvise most effectively have prevailed."
– Charles Darwin
UBT - Timbo
UBT Forum Admin
Posts: 9673
Joined: Mon Mar 13, 2006 12:00 am
Location: NW Midlands

Re: Milkyway issues

Post by UBT - Timbo »

homefarm wrote: Sun May 17, 2020 5:19 am Hi Tim
Yes I tried it, and it limited the tasks but still only finds 1 GPU.
Hi Richard

OK - well, the GPU config was only to fix whether you could process MW tasks on both GPUs - this "tip" didn't really have anything to do with the task limiting you were experiencing, which I think is down to other factors to do with the scheduler.
homefarm wrote: Sun May 17, 2020 5:19 am I have detached the MW project on the problem PC, but left the other 2 PCs attached and running.

Next week I will start to do a new build, and will transfer the 2080ti to that new PC. That will make life simpler, and possibly more productive for MW.

Ironically MW is not currently a task I would choose to run for my own interest, but was trying to prepare the project for use in a Sprint. At least I still have 2 PCs happily running MW.

This 7890XE successfully runs Folding on background low priority, whilst also running Prime Grid in Boinc. The 2 projects share the GPUs, and make full use of their power. It's a good combination for me, allowing a beneficial project (Folding ie Covid-19) to run with a fun project (prime Grid).

Many thanks for all the tips, and advice.. :)

Quote for today

"It is the long history of humankind (and animal kind, too) that those who learned to collaborate and improvise most effectively have prevailed."
– Charles Darwin
Yup - and fair enough.

Trying to squeeze as much as possible out of a PC by loading it with various extras, such as multiple (and different) GPUs, is always tricky... I know I had some issues a few years back when I had 2x GTX 580s in the same PC. Eventually I got both crunching concurrently, but it took a while to figure out the tweaks needed.

Ideally, of course, when BOINC Manager is installed, it *should* examine all your working hardware and, as you add projects (and other hardware), it *should* re-configure itself and make changes to the various XML files it uses... It should also have a built-in editor, so one can tweak these settings "on the fly", instead of having to use an external editor and keeping fingers crossed that it works.

But BM hasn't evolved to that point and so we're stuck with making manual changes and having to trawl websites finding fixes when things DON'T work as they should. :-(

Hope the new build goes well. :-)

regards
Tim
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm

Re: Milkyway issues

Post by homefarm »

UBT - Timbo wrote: Sun May 17, 2020 9:31 am Hope the new build goes well. :-)
The real oddity is that the other 7890XE, running 1 2080ti and 1 2060, is loaded up, and simultaneously running 8 tasks across the 2 GPUs. Boinc is still 7.14.2, and I made a cc_config change for more than 1 gpu; that's it.

So when MW isn't looking, I am going to sneak up and change the 2060 for a 2080ti from the problem PC. Hopefully it will carry on with a cache and 8 tasks without realising I've switched its coprocessor :pray:
UBT - Timbo
UBT Forum Admin
Posts: 9673
Joined: Mon Mar 13, 2006 12:00 am
Location: NW Midlands
Contact:

Re: Milkyway issues

Post by UBT - Timbo »

homefarm wrote: Sun May 17, 2020 12:06 pm The real oddity is that the other 7890XE, running 1 2080ti and 1 2060, is loaded up, and simultaneously running 8 tasks across the 2 GPUs. Boinc is still 7.14.2, and I made a cc_config change for more than 1 gpu; that's it.

So when MW isn't looking, I am going to sneak up and change the 2060 for a 2080ti from the problem PC. Hopefully it will carry on with a cache and 8 tasks without realising I've switched its coprocessor :pray:
Hi Richard

I can understand your frustration as it does sound crazy that one dual-GPU setup works and the other doesn't.

One thing to note is that if BOINC detects a hardware change it can sometimes abort all the tasks...so I'd be tempted to stop allowing any new work to download (on this other PC) and then run down all the existing tasks and THEN change the GPUs over.

(Or of course you can change it anyways, but don't do it "mid-crunch" while tasks are still being crunched as you could lose those tasks and any time spent on them to date would be wasted).
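
The "stop new work, run down, then swap" sequence can also be done from the command line. Below is a rough sketch, assuming `boinccmd` is installed and on the PATH; it uses the `nomorework` and `allowmorework` project operations. The project URL is a placeholder, not the real MilkyWay address, and the helper names `project_command` and `run` are made up for illustration.

```python
import shutil
import subprocess
from typing import List


def project_command(project_url: str, operation: str) -> List[str]:
    """Build a boinccmd call for one of the two work-fetch operations."""
    allowed = {"nomorework", "allowmorework"}
    if operation not in allowed:
        raise ValueError(f"operation must be one of {sorted(allowed)}")
    return ["boinccmd", "--project", project_url, operation]


def run(cmd: List[str], dry_run: bool = True) -> bool:
    """Run the command, or just print it when dry_run is set or boinccmd is missing.

    Returns True only if the command was actually executed.
    """
    if dry_run or shutil.which(cmd[0]) is None:
        print("would run:", " ".join(cmd))
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    url = "https://example.org/project/"  # placeholder: use your project's URL

    # 1. Before the swap: stop the project fetching new tasks.
    run(project_command(url, "nomorework"))

    # 2. Let the existing tasks finish and report, shut down, change the GPU.

    # 3. After the swap: let the project fetch work again.
    run(project_command(url, "allowmorework"))
```

It defaults to a dry run, so nothing changes on the client until `dry_run=False` is passed.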

regards
Tim
ChelseaOilman
TeAm AnandTech team member
Posts: 807
Joined: Sat May 02, 2020 6:14 pm
Location: Texas/Colorado

Re: Milkyway issues

Post by ChelseaOilman »

Once you move the 2080 Ti out you'll have room for another Titan V in that box. Might empty your bank account unless you can find a good deal on a used one though.

I never mix GPUs in any of my computers. All my boxes have dual GPUs and GPUs in each box are identical. Saves a lot of hassle.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm
Contact:

Re: Milkyway issues

Post by homefarm »

ChelseaOilman wrote: Sun May 17, 2020 2:11 pm Once you move the 2080 Ti out you'll have room for another Titan V in that box. Might empty your bank account unless you can find a good deal on a used one though.

I never mix GPUs in any of my computers. All my boxes have dual GPUs and GPUs in each box are identical. Saves a lot of hassle.
What a good idea :lol:

This one was on eBay, as new and unused. Once I received it and checked it out I found it was the Titan V CEO JH version. It's substantially uprated in almost every way, and very similar to the Tesla GV100, except it does not have ECC RAM, thus making it a consumer/semi-professional model. I got lucky!

I will take your advice and double up on the 2080ti, and move the 2060 to an office i9-9900K, with a MEG mobo that doesn't have an output for the Intel video processor. I can still run remotely on it.
Last edited by homefarm on Sun May 17, 2020 2:28 pm, edited 1 time in total.
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm
Contact:

Re: Milkyway issues

Post by homefarm »

UBT - Timbo wrote: Sun May 17, 2020 1:30 pm
homefarm wrote: Sun May 17, 2020 12:06 pm The real oddity is that the other 7890XE, running 1 2080ti and 1 2060, is loaded up, and simultaneously running 8 tasks across the 2 GPUs. Boinc is still 7.14.2, and I made a cc_config change for more than 1 gpu; that's it.

So when MW isn't looking, I am going to sneak up and change the 2060 for a 2080ti from the problem PC. Hopefully it will carry on with a cache and 8 tasks without realising I've switched its coprocessor :pray:
Hi Richard

I can understand your frustration as it does sound crazy that one dual-GPU setup works and the other doesn't.

One thing to note is that if BOINC detects a hardware change it can sometimes abort all the tasks...so I'd be tempted to stop allowing any new work to download (on this other PC) and then run down all the existing tasks and THEN change the GPUs over.

(Or of course you can change it anyways, but don't do it "mid-crunch" while tasks are still being crunched as you could lose those tasks and any time spent on them to date would be wasted).

regards
Tim
Yes I agree I'll close down gracefully
homefarm
UBT Contributor
Posts: 359
Joined: Thu Dec 15, 2016 6:41 pm
Contact:

Re: Milkyway issues

Post by homefarm »

Hi all,

https://boinc.berkeley.edu/forum_thread ... =&start=20

Massive discussions re the problem I and many others have experienced, with task supply.
I have enjoyed asking, learning and applying the suggestions herein and elsewhere.
I have come to the decision that the PCs working properly can continue with MW, if MW is in Sprints. Otherwise it's cheerio MW, hello "new friend that can feed GPUs properly".
I am running 7.15.0 now (again) and letting it sort itself out.
Should I find anything helpful to forum members I'll post it, but perhaps enough said for now.
Cheers all, have a nice day!