The future for (BOINC) Volunteer Computing..

Post by **chriscambridge** » Mon Jul 10, 2017 12:54 am

Hi Everyone,

I just came across this statement and research funding application from David Anderson/BOINC, which sets out how BOINC VC is about to change and be upgraded, or at least hopes to be.

--

Note from Chris Cambridge..

Interestingly, with the GridCoin community voting to remove the GridCoin team membership requirement, so that anyone in any team could (soon) earn GRC coins for crunching BOINC projects (which can be exchanged for Bitcoin etc., which can then be sold for £/$/€), it's not at all surprising that BOINC is now looking to monetize and reward data processors (crunchers/researchers).

From Open Statement: TBD will act as an allocator of computing power. This will be based in part on user preferences, but there will of necessity also be a higher-level allocation policy, decided on by an organization. The decision process should include merit and need; it may include politics and money as well

From Funding Proposal: Finally, we will do work aimed at increasing volunteer recruitment and retention, including a) conduct PR activities; b) integrate with social media such as Twitter and Facebook; c) develop new incentive and reward features; and d) work with corporate partners (Steam, Blizzard, EE, HTC) to target consumer product areas such as computer game systems and GPU-equipped mobile devices.

--

TBD: a new model for volunteer computing
David Anderson, 1 June 2017

http://boinc.berkeley.edu/tbd.php

After 20 years, volunteer computing has had successes, but has not approached its potential.

VC was supposed to enable ground-breaking research by providing more computing power than was available or affordable otherwise. This has happened but only to a small extent. The set of VC projects has been small and essentially static for 10 years. Of the scientists who use high-throughput computing and could benefit from VC, only a tiny fraction actually do.

VC was supposed to greatly increase global public interest in science; this has happened but only to a small extent. The volunteer population is almost entirely from a single demographic (older, IT-savvy males) and has been gradually shrinking for ~10 years.

These problems can be traced to BOINC's original structural model: the "project ecosystem" model. In this model, there's a dynamic ecosystem of competing projects, the public learns about them and makes informed choices, the best projects get the most computing power, and the public learns and gets excited about science. BOINC is designed to encourage this model (e.g. cross-project IDs and credit).

The model was based on several assumptions:

It's sufficiently easy to create and operate a BOINC project that almost any computational science research group can do it.

Other than providing the software and a list of projects, BOINC should have no centralized functions or control; projects are autonomous.

Volunteers will evaluate the projects (by reading their web sites) and make rational decisions about which ones to support. Furthermore, they will do this repeatedly as new projects arise.

Projects will compete for volunteers by making compelling web sites that explain and promote their research.

The model didn't work as envisioned, for a number of reasons:

Creating and operating a VC project is harder than we realized: it requires a combination of resources and skills (Win/Mac programming, sysadmin, DB admin, web design, PR/outreach) that few academic research groups have.

For a research group, trying to use VC is a risk. There's a substantial investment, with no guarantee of any return, since no one may volunteer. Adding a VC component to a grant proposal adds uncertainty and weakens the proposal.

The computing needs of many research groups are sporadic - e.g. they need a big chunk of throughput every now and then. For such groups, buying computing time on a commercial cloud may be cheaper than using VC.

Attracting volunteers is a marketing exercise. It's difficult to do effective marketing when there are dozens of competing brands (i.e. project names).

Most volunteers aren't willing to survey and assess a large set of projects once, much less repeatedly.

We made little effort to interface, technically or politically, with the mainstream HPC/HTC world (Grid, Supercomputing, Condor, etc.). They came to view VC in negative ways: as a threat, a gimmick, etc. Around 2006 there was a brief and small interest in VC in the academic computing world. Since then, nothing: no conferences on distributed computing list VC as a topic of interest. This has been damaging to VC; e.g. no one is working on solving the hard problems that arise in VC (such as how to grant credit).

A new model

I think we need to take what we've learned from the project ecosystem model and make a new and better model. I brought up this idea at the 2013 BOINC workshop, and proposed a model consisting of two related parts:

Partner with existing HTC computing providers such as supercomputing centers and science portals to add BOINC-based back ends. These projects would be operated by the provider's staff. Tens of thousands of scientists use such computing providers. These scientists would benefit from lower queueing delays, higher throughput and lower cost. But they wouldn't need to do anything; they wouldn't even need to know that VC is being used.

Create an account manager (let's call it "TBD" for now) acting as the primary volunteer interface. TBD lets volunteers express their preferences in terms of keywords (scientific areas and locations) rather than selecting specific projects. Based on these preferences, and corresponding keywords of projects and applications, TBD dynamically assigns computers to a set of vetted projects, which would include both existing (single-group) projects and the new computing-center projects.

On a technical level, the new model is enabled by our ability (thanks to Rom Walton and various people from CERN) to run jobs in virtual machines, and recent refinements by Marius Milea to support Docker on top of this. This makes it possible for HTC providers like TACC and nanoHUB (which already use Docker for app deployment) to run hundreds of existing applications with no porting or other per-app work.

This model addresses most of the problems with the previous one. Notes about the new model:

It doesn't interfere with or preclude existing BOINC activities. Current projects continue as they are. Scientists can create new single-group projects if they want. Volunteers can attach individual projects as they currently do, or use existing account managers like BAM! and Gridrepublic.

TBD will act as an allocator of computing power. This will be based in part on user preferences, but there will of necessity also be a higher-level allocation policy, decided on by an organization. The decision process should include merit and need; it may include politics and money as well. NSF has an organization - XSEDE - that does this for NSF-funded computing resources. I'm in contact with XSEDE, and hope to include them in TBD. Involving NSF in the process is important; but this project needs to be international. This part of the model needs to be worked out at a high level.

The model focuses on large HPC-provider-level projects, but it actually encourages single-group projects, since they can apply for an allocation from TBD and be assured of computing power prior to making any investment.

TBD can serve as a brand for VC marketing purposes. It will also provide a basis for corporate partnerships; if technology or game companies want to support VC, they can support TBD rather than having to select individual projects.
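
The allocation idea in the notes above could be sketched roughly as follows. This is only an illustration, not TBD's actual design: the project names, keywords, and "share" values are invented, and a weighted random choice is just one assumed way to combine volunteer preferences with a higher-level allocation policy.

```python
import random

# Hypothetical vetted projects. Names, keywords, and policy shares are invented
# for illustration; the statement does not define any of them.
PROJECTS = {
    "biomed_portal": {"keywords": {"biomedicine"}, "share": 0.5},
    "climate_group": {"keywords": {"environment"}, "share": 0.3},
    "astro_center": {"keywords": {"astrophysics"}, "share": 0.2},
}


def eligible_projects(projects, wanted, unwanted):
    """Return sorted names of projects the volunteer's keywords allow.

    A project is excluded if it has any 'unwanted' keyword. If the volunteer
    listed 'wanted' keywords, the project must match at least one of them.
    """
    names = []
    for name, info in projects.items():
        keywords = info["keywords"]
        if keywords & unwanted:
            continue
        if not wanted or keywords & wanted:
            names.append(name)
    return sorted(names)


def assign_project(projects, wanted, unwanted, rng):
    """Pick one eligible project at random, weighted by its policy share."""
    names = eligible_projects(projects, wanted, unwanted)
    if not names:
        return None
    weights = [projects[name]["share"] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

With these made-up shares, a volunteer who selects biomedicine and astrophysics would be sent to biomed_portal more often than astro_center, in roughly a 5:2 ratio over many assignments.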

Funding status

In 2014 I started thinking seriously about the new model, and I teamed up with two mainstream computing providers as test cases:

nanoHUB, at Purdue University, which is a nanoscience portal. It provides web interfaces to computational tools used by thousands of scientists. Many of these tools create HTC workloads well-suited to VC.

Texas Advanced Computing Center (TACC) is a major supercomputer center. A good fraction of their jobs are well-suited to VC.

Our goal is to create success stories that inspire all HTC providers to add VC back ends.

In 2014 we sent a proposal to NSF; it got good reviews but was rejected. We revised and resubmitted the next year, and in 2016 we were given 1 year of funding and encouraged to re-apply. We did, and recently learned that our latest proposal was funded for 3 years, starting this month. Yippee!

We didn't get all the money we asked for, which is par for the course. We got enough to pay my salary, and 50% salary for my collaborators at Purdue and TACC. I had hoped to be able to hire a web designer here at UCB; maybe I can find other sources of money to do this.

Relationship to BOINC

In 2016 BOINC became a community-run project; I don't control it. The new project, TBD, will be separate from BOINC. I hope that the BOINC community likes and supports TBD, but some people might not, and I don't want to step on their toes. Of course, I'm interested in hearing comments and criticisms about TBD, and in discussing it.

I've been mostly MIA from BOINC for the last couple of years, because I've been working full-time on other projects. I apologize for this. With this new funding, I'll be able to devote a good chunk of my time to managing and contributing to BOINC, e.g. setting up a functional release management process.

I suspect that relatively few current volunteers will use TBD; it's more for new users with wider demographics. So current projects won't lose computing power, and they should get additional power from TBD. Long term, I think something like TBD is our only hope for going from a few 100K volunteers to millions or tens of millions. And such a rising tide will float all of our ships.

To implement TBD, I'll need to add some features to BOINC, e.g.:

The client will pass credit estimate information to account managers.

Account managers can send clients opaque data to be passed in scheduler requests (preference keywords in this case).

The scheduler will have a "keyword matching" option that takes user and job keywords into account. E.g. it will preferentially send biomed jobs to volunteers who want to support biomed.

These features will have no impact on existing projects.
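
To make the "keyword matching" idea more concrete, here is a minimal sketch. It is not BOINC's scheduler code: the function names, the "yes"/"no" preference values, and the scoring rule are all assumptions made for illustration. Jobs with more keywords the volunteer wants are sent first, jobs with a refused keyword are skipped, and everything else stays eligible.

```python
def keyword_score(job_keywords, user_prefs):
    """Score one job against a volunteer's keyword preferences.

    user_prefs maps keyword -> "yes" or "no" (hypothetical values; keywords
    that are not listed are treated as neutral). Returns None if the job
    carries a refused keyword, else the number of wanted keywords it has.
    """
    score = 0
    for keyword in job_keywords:
        pref = user_prefs.get(keyword)
        if pref == "no":
            return None
        if pref == "yes":
            score += 1
    return score


def pick_jobs(jobs, user_prefs, n):
    """Return up to n job names, highest keyword score first.

    jobs is a list of (name, keywords) pairs. The sort is stable, so jobs
    with equal scores keep their original queue order.
    """
    scored = []
    for name, keywords in jobs:
        score = keyword_score(keywords, user_prefs)
        if score is not None:
            scored.append((score, name))
    scored.sort(key=lambda pair: -pair[0])
    return [name for _, name in scored[:n]]
```

For example, a volunteer with preferences {"biomed": "yes", "physics": "no"} would be offered biomed jobs ahead of unrelated ones and would never receive a job tagged "physics".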

The BOINC web site will link to TBD as well as BAM! and GridRepublic.

The TBD source code will be released under LGPLv3, and will be stored on Github. We'll welcome code contributions.

Names

I've been through a few names for TBD. The latest proposal calls it "Science United". This is OK, but it's a bit long and uninteresting. Also it conjures the ill-fated "United Devices", an early attempt at commercializing VC.

I thought about names starting with "Sci" and came up with:

"Sciborg": volunteers are assimilated into a collective intelligence. Too ominous.

"Sciphon": like we're siphoning off computing power. Has connotations of stealing gasoline.

"SciOn": where the "O" is the power-button icon. Power up Science! I like this one, though Scion is also a former car brand.

The bottom line: computer nerds shouldn't invent brand names. Hopefully I can get help from marketing/branding experts from the business world. The UCB business school teaches classes in this sort of thing.

Funding proposal to NSF for the above:

(Full text: http://boinc.berkeley.edu/nsf_16_proposal.pdf)
(Summary as below: http://boinc.berkeley.edu/nsf_16_summary.pdf)

--

Overview:

Volunteer computing (VC) uses consumer devices such as PCs and smartphones for scientific computing. Currently 500,000 devices participate, with 2.3 million CPU cores and 290,000 GPUs. VC supplies 44 PetaFLOPS of computing power, and it could provide ExaFLOPS. VC has supplied computing power worth hundreds of millions of dollars, but at a much lower cost to funding agencies, because hardware and energy costs are borne by volunteers. Using the BOINC middleware system, VC can support high-throughput computing (HTC) applications in all areas of science. It has enabled breakthroughs in several areas, published in journals such as Science and Nature. However, VC’s potential is largely untapped. We propose work that will make VC a central part of the U.S. research cyberinfrastructure and bring its computing power to many thousands of scientists.
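
As a rough back-of-the-envelope check on the figures in this overview, the arithmetic can be sketched as below. The only inputs taken from the proposal are 500,000 devices and 44 PetaFLOPS; the assumption that average per-device throughput stays the same as the pool grows is mine, not the proposal's.

```python
GIGA = 10**9
PETA = 10**15
EXA = 10**18


def gflops_per_device(total_flops, devices):
    """Average throughput per device, in whole GFLOPS."""
    return total_flops // devices // GIGA


def devices_needed(target_flops, total_flops, devices):
    """Devices needed to reach target_flops at today's per-device average.

    Assumes the average throughput per device does not change as the pool
    grows. Rounds up to a whole device.
    """
    per_device = total_flops // devices
    return -(-target_flops // per_device)


current_total = 44 * PETA      # figure from the proposal overview
current_devices = 500_000      # figure from the proposal overview

print(gflops_per_device(current_total, current_devices))
print(devices_needed(EXA, current_total, current_devices))
```

On these numbers the average works out to about 88 GFLOPS per device, so one ExaFLOPS would take roughly 11.4 million comparable devices, which is in line with the "millions or tens of millions" of volunteers mentioned in the statement.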

We will complete and extend current work adding BOINC-based VC back ends to existing computing providers: nanoHUB and the Texas Advanced Computing Center (TACC). Qualifying HTC jobs will be automatically migrated to VC and run on consumer devices. This will benefit thousands of scientists using nanoHUB and TACC by increasing HTC capacity and freeing HPC resources for other jobs. Scientists won’t have to learn or change anything – they’ll just get faster turnaround. These key integrations will provide software building blocks by which other computing providers (HPC centers and science gateways) can add VC back ends.

Secondly, we will develop “Science@home”, a new Web-based system in which volunteers register for scientific areas (such as biomedicine, environmental research, or astrophysics) rather than for specific projects. This approach simplifies marketing and PR by creating a single “brand”; it provides a flexible mechanism for allocating computing power; and it maintains a coherent public interface as the number of projects and applications increases.

Finally, we will do work aimed at increasing volunteer recruitment and retention, including a) conduct PR activities; b) integrate with social media such as Twitter and Facebook; c) develop new incentive and reward features; and d) work with corporate partners (Steam, Blizzard, EE, HTC) to target consumer product areas such as computer game systems and GPU-equipped mobile devices.

Intellectual merit:

The proposed activity will significantly increase the computing resources available to scientists in areas such as nanotechnology, proteomics, genomics, climate modeling, epidemiology, cancer research, bio-fuels, and astrophysics. It will thereby both accelerate current research and enable new research in these fields. It will also address technical problems in the integration of heterogeneous and untrusted computing resources with existing HTC systems, and in scalable multi-objective resource allocation.

Broader impact:

VC is the most widespread form of “citizen science”. It lets the public participate in cutting-edge scientific research projects. This increases public awareness of and interest in science, creates science-centered online communities, and creates a powerful channel for public outreach and education. The proposed activity will strengthen this impact by increasing the number of volunteers, broadening the range of research they can support and follow, and connecting thousands of scientists to this unprecedented pool of people and computing power.

Post by **UBT - Timbo** » Mon Jul 10, 2017 10:45 am

Hi Chris

Excellent find.

I'd been wondering what had happened to David Anderson after he "left" BOINC and turned it over to the "community".

Seems his brain has been working quite well...and it might work better now he has got some funding for his salary (oops !!).

But seriously: BOINC has done well, but it does need to evolve...every project has been losing volunteers for years...and there are so many things that BOINC tried to do, but failed...esp in terms of communications from the top of the BOINC tree as well as many other issues.

One hopes that "TBD" or "SciOn" or whatever it is called, will resolve these issues and can get more people interested in the Science side of things.

I'll be keeping my fingers crossed something "good" comes out of it soon.

regards
Tim

Post by **Woodles** » Mon Jul 10, 2017 11:18 am

Very interesting Chris.

TBD will act as an allocator of computing power. This will be based in part on user preferences, but there will of necessity also be a higher-level allocation policy, decided on by an organization. The decision process should include merit and need; it may include politics and money as well. NSF has an organization - XSEDE - that does this for NSF-funded computing resources. I'm in contact with XSEDE, and hope to include them in TBD. Involving NSF in the process is important; but this project needs to be international. This part of the model needs to be worked out at a high level.

I wouldn't be happy with my preferences being overridden by a company based on political and monetary concerns, even less so if it was an American Government one!

TBD can serve as a brand for VC marketing purposes. It will also provide a basis for corporate partnerships; if technology or game companies want to support VC, they can support TBD rather than having to select individual projects.

So TBD are going to be profiting from the users' time and money?

An interesting concept and could be promising but as it's currently described, I doubt that this is going to be for me.

Mark

Post by **chriscambridge** » Sat Jul 22, 2017 4:26 pm

After doing more research, I doubt this will be for our account either.

I think I will stick and wait until UBT has a GRC Pool for crunching BOINC projects.

UK BOINC Team Forum
