Discussion:
[openstack-dev] [cyborg] [nova] Cyborg quotas
Nadathur, Sundar
2018-05-16 17:01:44 UTC
Hi,
   The Cyborg quota spec [1] proposes to enforce per-project quotas
(maximum usage) for accelerators, to prevent one project (tenant)
from over-using some resources and starving other tenants. There are
separate resource classes for different accelerator types (GPUs,
FPGAs, etc.), so quotas can be enforced per resource class (RC).
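
For concreteness, here is a rough sketch of what per-project, per-RC
limits look like as data. This is illustrative only: the project IDs
and limit values are made up, and the RC names merely follow Placement
naming conventions.

# Illustrative only: per-project, per-resource-class limits as plain
# data. Project IDs and limit values are made up.
ACCELERATOR_QUOTAS = {
    'project-a': {'FPGA': 2, 'VGPU': 4},
    'project-b': {'FPGA': 8, 'VGPU': 1},
}

def exceeds_quota(project_id, resource_class, current_usage, requested):
    """True if granting `requested` more units would go over the limit."""
    limit = ACCELERATOR_QUOTAS.get(project_id, {}).get(resource_class, 0)
    return current_usage + requested > limit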

The current proposal [2] is to track the usage in the Cyborg
agent/driver. I am not sure that scheme will work, as I have indicated
in the comments on [1]. Here is another possible approach.

* The operator configures oslo.limit in Keystone per-project,
  per-resource-class (GPU, FPGA, ...).
  o Until this lands in Keystone, Cyborg may define its own quota
    table, as defined in [1].
* Cyborg implements a table to track per-project usage, as defined in [1].
* Cyborg provides a filter for the Nova scheduler, which checks
  whether the project making the request has exceeded its quota
  (see the sketch after this list).
  o If so, it removes all candidates, thus failing the request.
  o If not, it updates the per-project usage in its own DB. Since
    this is an out-of-tree filter, at least to start with, it should
    be OK to update the DB directly, without making REST API calls.
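
Here is a minimal sketch of such an out-of-tree filter. Only
BaseHostFilter, host_passes() and run_filter_once_per_request come from
Nova's filter interface; the class name and the _requested_accelerators /
_get_quota / _get_usage helpers are hypothetical placeholders for the
Cyborg-side lookups.

from nova.scheduler import filters


class CyborgQuotaFilter(filters.BaseHostFilter):
    """Reject all hosts when the project is over its accelerator quota."""

    # The check is per-project, not per-host, so one verdict covers the
    # whole request; Nova only needs to run it once per scheduling pass.
    run_filter_once_per_request = True

    def host_passes(self, host_state, spec_obj):
        project_id = spec_obj.project_id
        for rc, requested in self._requested_accelerators(spec_obj).items():
            quota = self._get_quota(project_id, rc)   # Cyborg quota table
            usage = self._get_usage(project_id, rc)   # Cyborg usage table
            if usage + requested > quota:
                # Returning False here returns False for every host,
                # which removes all candidates and fails the request.
                return False
        return True

    def _requested_accelerators(self, spec_obj):
        # Placeholder: a real filter would derive this from the flavor
        # extra specs / device profile, e.g. {'FPGA': 1}.
        return {}

    def _get_quota(self, project_id, rc):
        # Placeholder: look up the per-project limit in Cyborg's DB.
        return 0

    def _get_usage(self, project_id, rc):
        # Placeholder: look up current usage in Cyborg's DB (and bump it
        # there if the request is admitted, per the proposal above).
        return 0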

IOW, resource usage tracking and enforcement are done as part of
request scheduling, rather than at the compute node.

If there are better ways, or ways to avoid a filter, please LMK.

[1] https://review.openstack.org/#/c/560285/
[2] https://review.openstack.org/#/c/564968/

Thanks.

Regards,
Sundar
Jay Pipes
2018-05-16 17:24:24 UTC
> Hi,
>    The Cyborg quota spec [1] proposes to enforce per-project quotas
> (maximum usage) for accelerators, to prevent one project (tenant)
> from over-using some resources and starving other tenants. There are
> separate resource classes for different accelerator types (GPUs,
> FPGAs, etc.), so quotas can be enforced per resource class (RC).
>
> The current proposal [2] is to track the usage in the Cyborg
> agent/driver. I am not sure that scheme will work, as I have indicated
> in the comments on [1]. Here is another possible approach.
>
> * The operator configures oslo.limit in Keystone per-project,
>   per-resource-class (GPU, FPGA, ...).
>   o Until this lands in Keystone, Cyborg may define its own quota
>     table, as defined in [1].
> * Cyborg implements a table to track per-project usage, as defined in [1].
Placement already stores usage information for all resource
allocations. There is even a /usages API endpoint to which you can
pass a project and/or user:

https://developer.openstack.org/api-ref/placement/#list-usages

I see no reason not to use it.
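
For example, here is a minimal sketch of querying that endpoint. The
endpoint URL, token, project ID, and the FPGA entry in the sample
output are all assumptions for illustration; GET /usages requires
placement microversion 1.9 or later.

import requests

# Assumed endpoint and token; a real service would discover the
# Placement endpoint from the Keystone catalog and use a real token.
PLACEMENT = 'http://placement.example.com'
HEADERS = {
    'X-Auth-Token': '<keystone-token>',
    'OpenStack-API-Version': 'placement 1.9',  # GET /usages needs >= 1.9
}

resp = requests.get(PLACEMENT + '/usages',
                    params={'project_id': 'a1b2c3'},  # hypothetical project
                    headers=HEADERS)
resp.raise_for_status()
# The response maps resource class -> total allocated amount, e.g.
# {"usages": {"VCPU": 8, "MEMORY_MB": 16384, "FPGA": 2}}
print(resp.json()['usages'])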

There is actually already a spec to use Placement for quota usage
checks in Nova:

https://review.openstack.org/#/c/509042/

Probably best to have a look at that and see if it will end up meeting
your needs.
> * Cyborg provides a filter for the Nova scheduler, which checks
>   whether the project making the request has exceeded its quota.
Quota checks happen before Nova's scheduler gets involved, so having a
scheduler filter handle quota usage checking is pretty much a non-starter.

I'll have a look at the patches you've proposed and comment there.

Best,
-jay
