Discussion:
[libvirt] [PATCH 0/2] Fix a couple get all domain stats issues
John Ferlan
2018-11-27 16:23:21 UTC
Permalink
One is longer term (patch1), while the other is sourced in this
release (4.10.0) when IOThread stats were added.

John Ferlan (2):
qemu: Save qemuDomainGetStats error
qemu: Don't fail stats collection due to IOThread capability

src/qemu/qemu_driver.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
--
2.17.2
John Ferlan
2018-11-27 16:23:22 UTC
Permalink
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
out the reason we failed leaving the caller with no idea why the
collection failed.

To fix this, let's save the error and restore it prior to return
so that a caller such as 'virsh domstats' doesn't get the generic
"error: An error occurred, but the cause is unknown".

Signed-off-by: John Ferlan <***@redhat.com>
---
src/qemu/qemu_driver.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7d9e17e72c..2fb8eef609 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -21092,6 +21092,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
unsigned int flags)
{
virQEMUDriverPtr driver = conn->privateData;
+ virErrorPtr save_err = NULL;
virDomainObjPtr *vms = NULL;
virDomainObjPtr vm;
size_t nvms;
@@ -21160,6 +21161,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
if (flags & VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING)
domflags |= QEMU_DOMAIN_STATS_BACKING;
if (qemuDomainGetStats(conn, vm, stats, &tmp, domflags) < 0) {
+ save_err = virSaveLastError();
if (HAVE_JOB(domflags) && vm)
qemuDomainObjEndJob(driver, vm);

@@ -21184,6 +21186,10 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
cleanup:
virDomainStatsRecordListFree(tmpstats);
virObjectListFreeCount(vms, nvms);
+ if (save_err) {
+ virSetError(save_err);
+ virFreeError(save_err);
+ }

return ret;
}
--
2.17.2
Erik Skultety
2018-11-28 11:49:14 UTC
Permalink
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
A (probably) silly question, but why do we have to call virResetLastError as
within virDomainFree in the first place?

Erik
Daniel P. Berrangé
2018-11-28 12:08:51 UTC
Permalink
Post by Erik Skultety
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
A (probably) silly question, but why do we have to call virResetLastError as
within virDomainFree in the first place?
Well virDomainFree can return an error just like any other API can :-)
See virCheckDomainReturn(...)

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Erik Skultety
2018-11-28 12:26:17 UTC
Permalink
Post by Daniel P. Berrangé
Post by Erik Skultety
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
A (probably) silly question, but why do we have to call virResetLastError as
within virDomainFree in the first place?
Well virDomainFree can return an error just like any other API can :-)
See virCheckDomainReturn(...)
In which case we'd get into virRaiseErrorFull which already resets the error
object inside, that's why I'm asking whether there's another reason for it.

Erik
Daniel P. Berrangé
2018-11-28 12:28:21 UTC
Permalink
Post by Erik Skultety
Post by Daniel P. Berrangé
Post by Erik Skultety
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
A (probably) silly question, but why do we have to call virResetLastError as
within virDomainFree in the first place?
Well virDomainFree can return an error just like any other API can :-)
See virCheckDomainReturn(...)
In which case we'd get into virRaiseErrorFull which already resets the error
object inside, that's why I'm asking whether there's another reason for it.
That is insuficient as APIs need to guarantee that when they *don't* take
the error path, the last error is clear.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Erik Skultety
2018-11-28 16:17:17 UTC
Permalink
Post by Daniel P. Berrangé
Post by Erik Skultety
Post by Daniel P. Berrangé
Post by Erik Skultety
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
A (probably) silly question, but why do we have to call virResetLastError as
within virDomainFree in the first place?
Well virDomainFree can return an error just like any other API can :-)
See virCheckDomainReturn(...)
In which case we'd get into virRaiseErrorFull which already resets the error
object inside, that's why I'm asking whether there's another reason for it.
That is insuficient as APIs need to guarantee that when they *don't* take
the error path, the last error is clear.
Right, makes sense, thanks.

Erik
Ján Tomko
2018-12-06 13:49:59 UTC
Permalink
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
out the reason we failed leaving the caller with no idea why the
collection failed.
To fix this, let's save the error and restore it prior to return
so that a caller such as 'virsh domstats' doesn't get the generic
"error: An error occurred, but the cause is unknown".
---
src/qemu/qemu_driver.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7d9e17e72c..2fb8eef609 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -21092,6 +21092,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
unsigned int flags)
{
virQEMUDriverPtr driver = conn->privateData;
+ virErrorPtr save_err = NULL;
git grep virError src/qemu shows most files use orig_err as the variable
name
Post by John Ferlan
virDomainObjPtr *vms = NULL;
virDomainObjPtr vm;
size_t nvms;
@@ -21160,6 +21161,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
if (flags & VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING)
domflags |= QEMU_DOMAIN_STATS_BACKING;
if (qemuDomainGetStats(conn, vm, stats, &tmp, domflags) < 0) {
+ save_err = virSaveLastError();
Since virDomainStatsRecordListFree is the one resetting the error,
I think we should only save it right before that call.

This would cause a memory leak if qemuDomainGetStats would fail for more
than one domain.
Post by John Ferlan
if (HAVE_JOB(domflags) && vm)
qemuDomainObjEndJob(driver, vm);
@@ -21184,6 +21186,10 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
Here. And we have a specific helper for that:
virErrorPreserveLast(&orig_err);
Post by John Ferlan
virDomainStatsRecordListFree(tmpstats);
virObjectListFreeCount(vms, nvms);
+ if (save_err) {
+ virSetError(save_err);
+ virFreeError(save_err);
+ }
virErrorRestore(&orig_err);

With that addressed:
Reviewed-by: Ján Tomko <***@redhat.com>

Jano
Ján Tomko
2018-12-06 15:15:07 UTC
Permalink
Post by Ján Tomko
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
out the reason we failed leaving the caller with no idea why the
collection failed.
To fix this, let's save the error and restore it prior to return
so that a caller such as 'virsh domstats' doesn't get the generic
"error: An error occurred, but the cause is unknown".
---
src/qemu/qemu_driver.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7d9e17e72c..2fb8eef609 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -21092,6 +21092,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
unsigned int flags)
{
virQEMUDriverPtr driver = conn->privateData;
+ virErrorPtr save_err = NULL;
git grep virError src/qemu shows most files use orig_err as the variable
name
Post by John Ferlan
virDomainObjPtr *vms = NULL;
virDomainObjPtr vm;
size_t nvms;
@@ -21160,6 +21161,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
if (flags & VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING)
domflags |= QEMU_DOMAIN_STATS_BACKING;
if (qemuDomainGetStats(conn, vm, stats, &tmp, domflags) < 0) {
+ save_err = virSaveLastError();
Since virDomainStatsRecordListFree is the one resetting the error,
I think we should only save it right before that call.
This would cause a memory leak if qemuDomainGetStats would fail for more
than one domain.
Ehm, disregard this sentence. (Thanks Erik for pointing that out)
But I still consider the cleanup section a beter place for this.

Jano
Post by Ján Tomko
Post by John Ferlan
if (HAVE_JOB(domflags) && vm)
qemuDomainObjEndJob(driver, vm);
@@ -21184,6 +21186,10 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
virErrorPreserveLast(&orig_err);
Post by John Ferlan
virDomainStatsRecordListFree(tmpstats);
virObjectListFreeCount(vms, nvms);
+ if (save_err) {
+ virSetError(save_err);
+ virFreeError(save_err);
+ }
virErrorRestore(&orig_err);
Jano
--
libvir-list mailing list
https://www.redhat.com/mailman/listinfo/libvir-list
John Ferlan
2018-12-06 16:35:28 UTC
Permalink
Post by Ján Tomko
Post by Ján Tomko
Post by John Ferlan
During qemuConnectGetAllDomainStats if qemuDomainGetStats causes
a failure, then when collecting more than one domain's worth of
statistics the loop in virDomainStatsRecordListFree would call
virDomainFree which would call virResetLastError effectively wiping
out the reason we failed leaving the caller with no idea why the
collection failed.
To fix this, let's save the error and restore it prior to return
so that a caller such as 'virsh domstats' doesn't get the generic
"error: An error occurred, but the cause is unknown".
---
src/qemu/qemu_driver.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 7d9e17e72c..2fb8eef609 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -21092,6 +21092,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
                            unsigned int flags)
{
   virQEMUDriverPtr driver = conn->privateData;
+    virErrorPtr save_err = NULL;
git grep virError src/qemu shows most files use orig_err as the variable
name
OK - I'll modify... Of course the example I cut-n-paste'd from was
save_err - chances were slim, but possible.
Post by Ján Tomko
Post by Ján Tomko
Post by John Ferlan
   virDomainObjPtr *vms = NULL;
   virDomainObjPtr vm;
   size_t nvms;
@@ -21160,6 +21161,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
       if (flags & VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING)
           domflags |= QEMU_DOMAIN_STATS_BACKING;
       if (qemuDomainGetStats(conn, vm, stats, &tmp, domflags) < 0) {
+            save_err = virSaveLastError();
Since virDomainStatsRecordListFree is the one resetting the error,
I think we should only save it right before that call.
This would cause a memory leak if qemuDomainGetStats would fail for more
than one domain.
Ehm, disregard this sentence. (Thanks Erik for pointing that out)
But I still consider the cleanup section a beter place for this.
It was a 50/50 coin flip for placement. I'm fine with and will move this
to just before the virDomainStatsRecordListFree. Perhaps the more common
model used throughout the code. I hedged my "bets" by putting it here
because I thought it was clearer as to why an error would be saved.
Post by Ján Tomko
Post by Ján Tomko
Post by John Ferlan
           if (HAVE_JOB(domflags) && vm)
               qemuDomainObjEndJob(driver, vm);
@@ -21184,6 +21186,10 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
  virErrorPreserveLast(&orig_err);
We should be "more consistent" in the code regarding usage to avoid
future copypasta's, but going with Preserve and Restore is fine by me.

And by more consistent the underlying question/tone is - should the
virSaveLastError w/ paired virSetError/virFreeError all be replaced with
the Preserve/Restore... I see that could cause a lot of short term
churn, but perhaps the long term gain is worth it. The second question
then becomes all at one time or module my module...

Thanks for the review - the changes are made here at least.

John
Post by Ján Tomko
Post by Ján Tomko
Post by John Ferlan
   virDomainStatsRecordListFree(tmpstats);
   virObjectListFreeCount(vms, nvms);
+    if (save_err) {
+        virSetError(save_err);
+        virFreeError(save_err);
+    }
virErrorRestore(&orig_err);
Jano
--
libvir-list mailing list
https://www.redhat.com/mailman/listinfo/libvir-list
John Ferlan
2018-11-27 16:23:23 UTC
Permalink
Commit 212dc9286 made a generic qemuDomainGetIOThreadsMon which
would fail if the QEMU_CAPS_OBJECT_IOTHREAD didn't exist. Then
commit d1eac927 used that helper for the collection of all domain
stats. However, if the capability doesn't exist, then the entire
stats collection fails. Since the IOThread stats were meant to be
if available only, thus rather than failing if the capability
doesn't exist, let's just not collect the stats. Restore the caps
failure logic for qemuDomainGetIOThreadsLive.

Signed-off-by: John Ferlan <***@redhat.com>
---
src/qemu/qemu_driver.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2fb8eef609..60e29577ad 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -5490,16 +5490,9 @@ qemuDomainGetIOThreadsMon(virQEMUDriverPtr driver,
virDomainObjPtr vm,
qemuMonitorIOThreadInfoPtr **iothreads)
{
- qemuDomainObjPrivatePtr priv;
+ qemuDomainObjPrivatePtr priv = vm->privateData;
int niothreads = 0;

- priv = vm->privateData;
- if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) {
- virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
- _("IOThreads not supported with this binary"));
- return -1;
- }
-
qemuDomainObjEnterMonitor(driver, vm);
niothreads = qemuMonitorGetIOThreads(priv->mon, iothreads);
if (qemuDomainObjExitMonitor(driver, vm) < 0 || niothreads < 0)
@@ -5514,6 +5507,7 @@ qemuDomainGetIOThreadsLive(virQEMUDriverPtr driver,
virDomainObjPtr vm,
virDomainIOThreadInfoPtr **info)
{
+ qemuDomainObjPrivatePtr priv;
qemuMonitorIOThreadInfoPtr *iothreads = NULL;
virDomainIOThreadInfoPtr *info_ret = NULL;
int niothreads = 0;
@@ -5529,6 +5523,13 @@ qemuDomainGetIOThreadsLive(virQEMUDriverPtr driver,
goto endjob;
}

+ priv = vm->privateData;
+ if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("IOThreads not supported with this binary"));
+ goto endjob;
+ }
+
if ((niothreads = qemuDomainGetIOThreadsMon(driver, vm, &iothreads)) < 0)
goto endjob;

@@ -20874,6 +20875,7 @@ qemuDomainGetStatsIOThread(virQEMUDriverPtr driver,
int *maxparams,
unsigned int privflags ATTRIBUTE_UNUSED)
{
+ qemuDomainObjPrivatePtr priv;
size_t i;
qemuMonitorIOThreadInfoPtr *iothreads = NULL;
int niothreads;
@@ -20882,6 +20884,10 @@ qemuDomainGetStatsIOThread(virQEMUDriverPtr driver,
if (!virDomainObjIsActive(dom))
return 0;

+ priv = dom->privateData;
+ if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD))
+ return 0;
+
if ((niothreads = qemuDomainGetIOThreadsMon(driver, dom, &iothreads)) < 0)
return -1;
--
2.17.2
Ján Tomko
2018-12-06 13:59:06 UTC
Permalink
Post by John Ferlan
Commit 212dc9286 made a generic qemuDomainGetIOThreadsMon which
would fail if the QEMU_CAPS_OBJECT_IOTHREAD didn't exist. Then
commit d1eac927 used that helper for the collection of all domain
stats. However, if the capability doesn't exist, then the entire
stats collection fails. Since the IOThread stats were meant to be
if available only, thus rather than failing if the capability
doesn't exist, let's just not collect the stats. Restore the caps
failure logic for qemuDomainGetIOThreadsLive.
---
src/qemu/qemu_driver.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2fb8eef609..60e29577ad 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -5490,16 +5490,9 @@ qemuDomainGetIOThreadsMon(virQEMUDriverPtr driver,
virDomainObjPtr vm,
qemuMonitorIOThreadInfoPtr **iothreads)
{
- qemuDomainObjPrivatePtr priv;
+ qemuDomainObjPrivatePtr priv = vm->privateData;
int niothreads = 0;
- priv = vm->privateData;
- if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) {
- virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
- _("IOThreads not supported with this binary"));
- return -1;
- }
-
qemuDomainObjEnterMonitor(driver, vm);
niothreads = qemuMonitorGetIOThreads(priv->mon, iothreads);
if (qemuDomainObjExitMonitor(driver, vm) < 0 || niothreads < 0)
@@ -5514,6 +5507,7 @@ qemuDomainGetIOThreadsLive(virQEMUDriverPtr driver,
virDomainObjPtr vm,
virDomainIOThreadInfoPtr **info)
{
+ qemuDomainObjPrivatePtr priv;
qemuMonitorIOThreadInfoPtr *iothreads = NULL;
virDomainIOThreadInfoPtr *info_ret = NULL;
int niothreads = 0;
@@ -5529,6 +5523,13 @@ qemuDomainGetIOThreadsLive(virQEMUDriverPtr driver,
goto endjob;
}
+ priv = vm->privateData;
+ if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD)) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("IOThreads not supported with this binary"));
+ goto endjob;
+ }
+
if ((niothreads = qemuDomainGetIOThreadsMon(driver, vm, &iothreads)) < 0)
goto endjob;
@@ -20874,6 +20875,7 @@ qemuDomainGetStatsIOThread(virQEMUDriverPtr driver,
int *maxparams,
unsigned int privflags ATTRIBUTE_UNUSED)
{
+ qemuDomainObjPrivatePtr priv;
dom is locked when this function is called, you can initialize priv here
Post by John Ferlan
size_t i;
qemuMonitorIOThreadInfoPtr *iothreads = NULL;
int niothreads;
@@ -20882,6 +20884,10 @@ qemuDomainGetStatsIOThread(virQEMUDriverPtr driver,
if (!virDomainObjIsActive(dom))
return 0;
+ priv = dom->privateData;
+ if (!virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_OBJECT_IOTHREAD))
+ return 0;
+
if ((niothreads = qemuDomainGetIOThreadsMon(driver, dom, &iothreads)) < 0)
return -1;
Reviewed-by: Ján Tomko <***@redhat.com>

Jano
John Ferlan
2018-11-30 14:56:57 UTC
Permalink
Thanks for the quick peek on patch1; however, I was hoping to have these
patches be considered for the current release seeing as patch2 fixes a
condition introduced in the release and because the next release won't
be until mid January or so. The downside of not pushing patch2 is that
running 'virsh domstats' will fail completely on a domain that doesn't
have the IOThread capability. Without patch1, you'd only get the generic
cause is unknown reason. If it's felt that the patches can wait for the
next release that's fine, but no harm in asking...

Tks,

John
Post by John Ferlan
One is longer term (patch1), while the other is sourced in this
release (4.10.0) when IOThread stats were added.
qemu: Save qemuDomainGetStats error
qemu: Don't fail stats collection due to IOThread capability
src/qemu/qemu_driver.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
John Ferlan
2018-12-06 12:43:04 UTC
Permalink
ping?

Tks,

John
Post by John Ferlan
One is longer term (patch1), while the other is sourced in this
release (4.10.0) when IOThread stats were added.
qemu: Save qemuDomainGetStats error
qemu: Don't fail stats collection due to IOThread capability
src/qemu/qemu_driver.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
Loading...