Discussion:
[libvirt] [PATCH 2/2] lxc: Only delegate VIR_CGROUP_CONTROLLER_SYSTEMD to containers
(too old to reply)
Richard Weinberger
2014-02-11 22:51:26 UTC
Permalink
Due to security concerns we delegate only VIR_CGROUP_CONTROLLER_SYSTEMD
to containers.
Currently it is not safe to allow a container access to a resource controller.

Signed-off-by: Richard Weinberger <***@nod.at>
---
src/lxc/lxc_container.c | 3 ++-
src/util/vircgroup.c | 5 ++++-
src/util/vircgroup.h | 3 ++-
3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index c6bdc8c..abd2db4 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -1654,7 +1654,8 @@ static int lxcContainerSetupPivotRoot(virDomainDefPtr vmDef,

/* Now we can re-mount the cgroups controllers in the
* same configuration as before */
- if (virCgroupIsolateMount(cgroup, "/.oldroot/", sec_mount_options) < 0)
+ if (virCgroupIsolateMount(cgroup, "/.oldroot/", sec_mount_options,
+ (1 << VIR_CGROUP_CONTROLLER_SYSTEMD)) < 0)
goto cleanup;

/* Mounts /dev */
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index b2666e8..c75de9f 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -3166,7 +3166,7 @@ virCgroupGetFreezerState(virCgroupPtr group, char **state)

int
virCgroupIsolateMount(virCgroupPtr group, const char *oldroot,
- const char *mountopts)
+ const char *mountopts, int controllers)
{
int ret = -1;
size_t i;
@@ -3197,6 +3197,9 @@ virCgroupIsolateMount(virCgroupPtr group, const char *oldroot,
}

for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) {
+ if (!((1 << i) & controllers))
+ continue;
+
if (!group->controllers[i].mountPoint)
continue;

diff --git a/src/util/vircgroup.h b/src/util/vircgroup.h
index 6e00f28..c005d28 100644
--- a/src/util/vircgroup.h
+++ b/src/util/vircgroup.h
@@ -221,7 +221,8 @@ int virCgroupKillPainfully(virCgroupPtr group);

int virCgroupIsolateMount(virCgroupPtr group,
const char *oldroot,
- const char *mountopts);
+ const char *mountopts,
+ int controllers);

bool virCgroupSupportsCpuBW(virCgroupPtr cgroup);
--
1.8.4.5
Daniel P. Berrange
2014-02-13 17:16:48 UTC
Permalink
Post by Richard Weinberger
Due to security concerns we delegate only VIR_CGROUP_CONTROLLER_SYSTEMD
to containers.
Currently it is not safe to allow a container access to a resource controller.
We *do* want to allow all controllers to be visible to the container.
eg it is valid for them to have read access to view things like block
I/O and CPU accounting information. We just don't want to make it writable
for usernamespaces.

I've adjusted your first patch and reposted with you on CC, if you can
confirm it works for you.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Richard Weinberger
2014-02-14 07:49:07 UTC
Permalink
Post by Daniel P. Berrange
Post by Richard Weinberger
Due to security concerns we delegate only VIR_CGROUP_CONTROLLER_SYSTEMD
to containers.
Currently it is not safe to allow a container access to a resource controller.
We *do* want to allow all controllers to be visible to the container.
eg it is valid for them to have read access to view things like block
I/O and CPU accounting information. We just don't want to make it writable
for usernamespaces.
Okay. But what if one does not enable user namespaces?
Then the controllers are writable within the container.
Post by Daniel P. Berrange
I've adjusted your first patch and reposted with you on CC, if you can
confirm it works for you.
Will test it today. :)

Thanks,
//richard
Daniel P. Berrange
2014-02-14 10:30:44 UTC
Permalink
Post by Richard Weinberger
Post by Daniel P. Berrange
Post by Richard Weinberger
Due to security concerns we delegate only VIR_CGROUP_CONTROLLER_SYSTEMD
to containers.
Currently it is not safe to allow a container access to a resource controller.
We *do* want to allow all controllers to be visible to the container.
eg it is valid for them to have read access to view things like block
I/O and CPU accounting information. We just don't want to make it writable
for usernamespaces.
Okay. But what if one does not enable user namespaces?
Then the controllers are writable within the container.
If you don't enable user namespaces, then containers should be considered
insecure unless all processes run non-root and all your filesystems are
mounted no-setuid to prevent escalation fo privileges back to root, or you
have SELinux applying controls.

So once ypou have the requirement that security depends on being non-root
then the cgroups are no longer writable, except when your consider is
already insecure for other reasons.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Richard Weinberger
2014-02-14 11:11:13 UTC
Permalink
Post by Daniel P. Berrange
Post by Richard Weinberger
Post by Daniel P. Berrange
Post by Richard Weinberger
Due to security concerns we delegate only VIR_CGROUP_CONTROLLER_SYSTEMD
to containers.
Currently it is not safe to allow a container access to a resource controller.
We *do* want to allow all controllers to be visible to the container.
eg it is valid for them to have read access to view things like block
I/O and CPU accounting information. We just don't want to make it writable
for usernamespaces.
Okay. But what if one does not enable user namespaces?
Then the controllers are writable within the container.
If you don't enable user namespaces, then containers should be considered
insecure unless all processes run non-root and all your filesystems are
mounted no-setuid to prevent escalation fo privileges back to root, or you
have SELinux applying controls.
Yeah, I hope all users know that too. Do you plan to support non-user namespace
container in future?

Maybe one should communicate this to docker.io folks as well. *scnr*
Post by Daniel P. Berrange
So once ypou have the requirement that security depends on being non-root
then the cgroups are no longer writable, except when your consider is
already insecure for other reasons.
Yep.

Thanks,
//richard
Daniel P. Berrange
2014-02-14 11:18:14 UTC
Permalink
Post by Richard Weinberger
Post by Daniel P. Berrange
Post by Richard Weinberger
Post by Daniel P. Berrange
Post by Richard Weinberger
Due to security concerns we delegate only VIR_CGROUP_CONTROLLER_SYSTEMD
to containers.
Currently it is not safe to allow a container access to a resource controller.
We *do* want to allow all controllers to be visible to the container.
eg it is valid for them to have read access to view things like block
I/O and CPU accounting information. We just don't want to make it writable
for usernamespaces.
Okay. But what if one does not enable user namespaces?
Then the controllers are writable within the container.
If you don't enable user namespaces, then containers should be considered
insecure unless all processes run non-root and all your filesystems are
mounted no-setuid to prevent escalation fo privileges back to root, or you
have SELinux applying controls.
Yeah, I hope all users know that too. Do you plan to support non-user namespace
container in future?
Maybe one should communicate this to docker.io folks as well. *scnr*
Yep, I've gone into this in much detail with Red Hat folks who are
working with Docker on their container impl, so they at least know
the risks in what they're going....

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Loading...