Discussion:
[libvirt] [PATCH] libvirt: add daemon itself as shutdown reason
Nikolay Shirokovskiy
2018-10-08 11:21:08 UTC
Permalink
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.

There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.

Signed-off-by: Nikolay Shirokovskiy <***@virtuozzo.com>
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b..11fdab5 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,7 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon have to kill domain */
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..e441723 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")

VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);

diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a26363..f0ad558 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))

VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
--
1.8.3.1
John Ferlan
2018-10-16 12:48:52 UTC
Permalink
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
How about (although reading below may shed some more context):

This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.

This action would occur during Reconnect processing when it's
determined that either the QEMU monitor no longer exists for
some unknown reason or creation of a thread to reconnect the
domain failed.
Post by Nikolay Shirokovskiy
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
If you feel this way, then a followup patch could/should be posted;
otherwise, this'll be lost to commit message heaven where all good ideas
go to die ;-).
Post by Nikolay Shirokovskiy
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b..11fdab5 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,7 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon have to kill domain */
s/have to/decides to/
Post by Nikolay Shirokovskiy
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..e441723 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")
VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
Since we're changing anyway, let's put the stop there, e.g.
"s/later/later./"
Post by Nikolay Shirokovskiy
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
When qemuProcessReconnectHelper was introduced (commit d38897a5d)
reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that was
changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED or
VIR_DOMAIN_SHUTOFF_UNKNOWN. When QEMU_CAPS_NO_SHUTDOWN checking was
removed in commit fe35b1ad6 the conditional state was just left at
VIR_DOMAIN_SHUTOFF_CRASHED.

Still I agree w/ Martin's thoughts, which unfortunately wasn't commented
on in the code leading to the latter revert commit losing that unknown
reason. The reason for the "-no-shutdown" *did* change and that probably
should have been reflected in the qemuProcessReconnect logic.

So I think perhaps we should have an initial patch which references or
uses that first paragraph in order to change the code to:

/* We cannot get the monitor back, so kill the VM to
* remove possibility of it ending up running twice if
* user tries to start it again later.
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
* domain crashed; otherwise we don't really know. */
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;

and uses the short commit msg of "qemu: Restore lost shutdown reason".

Then this followup would change the "UNKNOWN" to "DAEMON" keeping the
CRASHED otherwise.

John
Post by Nikolay Shirokovskiy
diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a26363..f0ad558 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))
VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
Nikolay Shirokovskiy
2018-10-17 08:16:38 UTC
Permalink
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during Reconnect processing when it's
determined that either the QEMU monitor no longer exists for
some unknown reason or creation of a thread to reconnect the
domain failed.
Mostly I'm fine with this one except "monitor no longer exists"
is condition for reason "crashed" or "unknown" (Martin's contribution)
Post by John Ferlan
Post by Nikolay Shirokovskiy
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
If you feel this way, then a followup patch could/should be posted;
otherwise, this'll be lost to commit message heaven where all good ideas
go to die ;-).
Sure!
Post by John Ferlan
Post by Nikolay Shirokovskiy
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b..11fdab5 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,7 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon have to kill domain */
s/have to/decides to/
Post by Nikolay Shirokovskiy
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..e441723 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")
VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
Since we're changing anyway, let's put the stop there, e.g.
"s/later/later./"
Post by Nikolay Shirokovskiy
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
When qemuProcessReconnectHelper was introduced (commit d38897a5d)
reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that was
changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED or
VIR_DOMAIN_SHUTOFF_UNKNOWN. When QEMU_CAPS_NO_SHUTDOWN checking was
removed in commit fe35b1ad6 the conditional state was just left at
VIR_DOMAIN_SHUTOFF_CRASHED.
Still I agree w/ Martin's thoughts, which unfortunately wasn't commented
on in the code leading to the latter revert commit losing that unknown
reason. The reason for the "-no-shutdown" *did* change and that probably
should have been reflected in the qemuProcessReconnect logic.
So I think perhaps we should have an initial patch which references or
/* We cannot get the monitor back, so kill the VM to
* remove possibility of it ending up running twice if
* user tries to start it again later.
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
* domain crashed; otherwise we don't really know. */
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
and uses the short commit msg of "qemu: Restore lost shutdown reason".
Then this followup would change the "UNKNOWN" to "DAEMON" keeping the
CRASHED otherwise.
Then we will get "crashed" reason always for modern qemu and domains
that can reboot which is not true.

However I see now that the issue is more complicated then I thought. I agree
with Martin too but we have to additionally check that connect(2) for domain
monitor socket was called and error was ECONNREFUSED. So finally state is
resolved like this:

if (connect(2) was not called) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
else if (connect(2) result is NOT ECONNREFUSED) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else if (connect(2) result is ECONNREFUSED) {
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else {
state = VIR_DOMAIN_SHUTOFF_DAEMON;
}

Moreover we should not later kill process by remembered pid in all cases except
last one in the above code because we can kill some other process.

Looks like a bit complicated, not sure should we pursuit this one or just left
VIR_DOMAIN_SHUTOFF_UNKNOWN only (in this case we can introduce
VIR_DOMAIN_SHUTOFF_DAEMON for reboot failures where reason is obvisous).

Nikolay
Post by John Ferlan
Post by Nikolay Shirokovskiy
diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a26363..f0ad558 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))
VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
John Ferlan
2018-10-18 15:30:10 UTC
Permalink
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during Reconnect processing when it's
determined that either the QEMU monitor no longer exists for
some unknown reason or creation of a thread to reconnect the
domain failed.
Mostly I'm fine with this one except "monitor no longer exists"
is condition for reason "crashed" or "unknown" (Martin's contribution)
Post by John Ferlan
Post by Nikolay Shirokovskiy
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
If you feel this way, then a followup patch could/should be posted;
otherwise, this'll be lost to commit message heaven where all good ideas
go to die ;-).
Sure!
Post by John Ferlan
Post by Nikolay Shirokovskiy
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b..11fdab5 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,7 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon have to kill domain */
s/have to/decides to/
Post by Nikolay Shirokovskiy
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..e441723 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")
VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
Since we're changing anyway, let's put the stop there, e.g.
"s/later/later./"
Post by Nikolay Shirokovskiy
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
When qemuProcessReconnectHelper was introduced (commit d38897a5d)
reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that was
changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED or
VIR_DOMAIN_SHUTOFF_UNKNOWN. When QEMU_CAPS_NO_SHUTDOWN checking was
removed in commit fe35b1ad6 the conditional state was just left at
VIR_DOMAIN_SHUTOFF_CRASHED.
Still I agree w/ Martin's thoughts, which unfortunately wasn't commented
on in the code leading to the latter revert commit losing that unknown
reason. The reason for the "-no-shutdown" *did* change and that probably
should have been reflected in the qemuProcessReconnect logic.
So I think perhaps we should have an initial patch which references or
/* We cannot get the monitor back, so kill the VM to
* remove possibility of it ending up running twice if
* user tries to start it again later.
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
* domain crashed; otherwise we don't really know. */
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
and uses the short commit msg of "qemu: Restore lost shutdown reason".
I've posted a patch for the above, see:

https://www.redhat.com/archives/libvir-list/2018-October/msg00945.html

while we work through the rest.
Post by Nikolay Shirokovskiy
Post by John Ferlan
Then this followup would change the "UNKNOWN" to "DAEMON" keeping the
CRASHED otherwise.
Then we will get "crashed" reason always for modern qemu and domains
that can reboot which is not true.
However I see now that the issue is more complicated then I thought. I agree
with Martin too but we have to additionally check that connect(2) for domain
monitor socket was called and error was ECONNREFUSED. So finally state is
if (connect(2) was not called) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;> else if (connect(2) result is NOT ECONNREFUSED) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else if (connect(2) result is ECONNREFUSED) {
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else {
state = VIR_DOMAIN_SHUTOFF_DAEMON;
}
Do we really want to dig into why qemuConnectMonitor failed? Or is it
good enough to do something because it did fail. IOW: What condition of
using DAEMON as the reason is most important?

Failure before we try to open monitor?
Failure to open the monitor?
Failure after opening the monitor?

Once we open the monitor, priv->mon is set and we could use that.
Post by Nikolay Shirokovskiy
Moreover we should not later kill process by remembered pid in all cases except
last one in the above code because we can kill some other process.
Not quite sure I follow. This is the reconnect path to connect so some
existing/running domain. The {vm|obj}->pid would be something filled via
/var/run/libvirt/qemu processing for {domain}.{xml|pid}. Different hunk
of code. Maybe I'm misreading your comment.

John
Post by Nikolay Shirokovskiy
Looks like a bit complicated, not sure should we pursuit this one or just left
VIR_DOMAIN_SHUTOFF_UNKNOWN only (in this case we can introduce
VIR_DOMAIN_SHUTOFF_DAEMON for reboot failures where reason is obvisous).
Nikolay
Post by John Ferlan
Post by Nikolay Shirokovskiy
diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a26363..f0ad558 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))
VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
Nikolay Shirokovskiy
2018-10-19 09:24:38 UTC
Permalink
Post by John Ferlan
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during Reconnect processing when it's
determined that either the QEMU monitor no longer exists for
some unknown reason or creation of a thread to reconnect the
domain failed.
Mostly I'm fine with this one except "monitor no longer exists"
is condition for reason "crashed" or "unknown" (Martin's contribution)
Post by John Ferlan
Post by Nikolay Shirokovskiy
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
If you feel this way, then a followup patch could/should be posted;
otherwise, this'll be lost to commit message heaven where all good ideas
go to die ;-).
Sure!
Post by John Ferlan
Post by Nikolay Shirokovskiy
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b..11fdab5 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,7 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon have to kill domain */
s/have to/decides to/
Post by Nikolay Shirokovskiy
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..e441723 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")
VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
Since we're changing anyway, let's put the stop there, e.g.
"s/later/later./"
Post by Nikolay Shirokovskiy
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
When qemuProcessReconnectHelper was introduced (commit d38897a5d)
reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that was
changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED or
VIR_DOMAIN_SHUTOFF_UNKNOWN. When QEMU_CAPS_NO_SHUTDOWN checking was
removed in commit fe35b1ad6 the conditional state was just left at
VIR_DOMAIN_SHUTOFF_CRASHED.
Still I agree w/ Martin's thoughts, which unfortunately wasn't commented
on in the code leading to the latter revert commit losing that unknown
reason. The reason for the "-no-shutdown" *did* change and that probably
should have been reflected in the qemuProcessReconnect logic.
So I think perhaps we should have an initial patch which references or
/* We cannot get the monitor back, so kill the VM to
* remove possibility of it ending up running twice if
* user tries to start it again later.
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
* domain crashed; otherwise we don't really know. */
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
and uses the short commit msg of "qemu: Restore lost shutdown reason".
https://www.redhat.com/archives/libvir-list/2018-October/msg00945.html
while we work through the rest.
Post by Nikolay Shirokovskiy
Post by John Ferlan
Then this followup would change the "UNKNOWN" to "DAEMON" keeping the
CRASHED otherwise.
Then we will get "crashed" reason always for modern qemu and domains
that can reboot which is not true.
However I see now that the issue is more complicated then I thought. I agree
with Martin too but we have to additionally check that connect(2) for domain
monitor socket was called and error was ECONNREFUSED. So finally state is
if (connect(2) was not called) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;> else if (connect(2) result is NOT ECONNREFUSED) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else if (connect(2) result is ECONNREFUSED) {
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else {
state = VIR_DOMAIN_SHUTOFF_DAEMON;
}
Do we really want to dig into why qemuConnectMonitor failed? Or is it
good enough to do something because it did fail. IOW: What condition of
using DAEMON as the reason is most important?
Failure before we try to open monitor?
Failure to open the monitor?
Failure after opening the monitor?
Once we open the monitor, priv->mon is set and we could use that.
Sure if priv->mon != NULL then the reason to shutdown is DAEMON only.

But for example the patch you sent is not quite correct. If we don't even try
to call connect or connect result is something unexpected then we can't be sure
of anything - the process may exist or not exist so we can't tell it crashed if
it was started with -noshutdown option. The virDomainObjIsActive is only
remembered state at this point. We need to set reason UNKNOWN in this case.
Let's call this the first case. The second case is when connect returns
ECONNREFUSED - this means nobody listening on monitor socket - this is the case
your patch should apply to. The problem is you can not distinguish case 1 and
2 now. In both cases you'll have priv->mon = NULL. Moreover in some cases we
can close monitor right after successful connection thus sometimes even if
priv->mon == NULL we need to set DAEMON reason. That's why I don't use
priv->mon in conditions in the above if/else snippet.
Post by John Ferlan
Post by Nikolay Shirokovskiy
Moreover we should not later kill process by remembered pid in all cases except
last one in the above code because we can kill some other process.
Not quite sure I follow. This is the reconnect path to connect so some
existing/running domain. The {vm|obj}->pid would be something filled via
/var/run/libvirt/qemu processing for {domain}.{xml|pid}. Different hunk
of code. Maybe I'm misreading your comment.
We can't be sure domain is running. And if it is not running some unrelated
process can reuse pid. So imagine we don't even have chance to call connect(2)
because of early error and qemu really died and some process with same pid
is running - we call qemuProcessStop on error path and shoot innocent process.

Even if we connect and get ECONNREFUSED which means there is no qemu process
pid can be reused and we should not kill. In short we should clear
vm->pid before calling qemuProcessStop if shutdown reason is different from DAEMON.

Nikolay
Post by John Ferlan
Post by Nikolay Shirokovskiy
Looks like a bit complicated, not sure should we pursuit this one or just left
VIR_DOMAIN_SHUTOFF_UNKNOWN only (in this case we can introduce
VIR_DOMAIN_SHUTOFF_DAEMON for reboot failures where reason is obvisous).
Nikolay
Post by John Ferlan
Post by Nikolay Shirokovskiy
diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a26363..f0ad558 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))
VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
John Ferlan
2018-10-19 16:48:47 UTC
Permalink
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during Reconnect processing when it's
determined that either the QEMU monitor no longer exists for
some unknown reason or creation of a thread to reconnect the
domain failed.
Mostly I'm fine with this one except "monitor no longer exists"
is condition for reason "crashed" or "unknown" (Martin's contribution)
Post by John Ferlan
Post by Nikolay Shirokovskiy
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
If you feel this way, then a followup patch could/should be posted;
otherwise, this'll be lost to commit message heaven where all good ideas
go to die ;-).
Sure!
Post by John Ferlan
Post by Nikolay Shirokovskiy
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b..11fdab5 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,7 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon have to kill domain */
s/have to/decides to/
Post by Nikolay Shirokovskiy
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..e441723 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")
VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
Since we're changing anyway, let's put the stop there, e.g.
"s/later/later./"
Post by Nikolay Shirokovskiy
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
When qemuProcessReconnectHelper was introduced (commit d38897a5d)
reconnection failure used VIR_DOMAIN_SHUTOFF_FAILED; however, that was
changed in commit bda2f17d to either VIR_DOMAIN_SHUTOFF_CRASHED or
VIR_DOMAIN_SHUTOFF_UNKNOWN. When QEMU_CAPS_NO_SHUTDOWN checking was
removed in commit fe35b1ad6 the conditional state was just left at
VIR_DOMAIN_SHUTOFF_CRASHED.
Still I agree w/ Martin's thoughts, which unfortunately wasn't commented
on in the code leading to the latter revert commit losing that unknown
reason. The reason for the "-no-shutdown" *did* change and that probably
should have been reflected in the qemuProcessReconnect logic.
So I think perhaps we should have an initial patch which references or
/* We cannot get the monitor back, so kill the VM to
* remove possibility of it ending up running twice if
* user tries to start it again later.
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
* domain crashed; otherwise we don't really know. */
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
and uses the short commit msg of "qemu: Restore lost shutdown reason".
https://www.redhat.com/archives/libvir-list/2018-October/msg00945.html
while we work through the rest.
Post by Nikolay Shirokovskiy
Post by John Ferlan
Then this followup would change the "UNKNOWN" to "DAEMON" keeping the
CRASHED otherwise.
Then we will get "crashed" reason always for modern qemu and domains
that can reboot which is not true.
However I see now that the issue is more complicated then I thought. I agree
with Martin too but we have to additionally check that connect(2) for domain
monitor socket was called and error was ECONNREFUSED. So finally state is
if (connect(2) was not called) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;> else if (connect(2) result is NOT ECONNREFUSED) {
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else if (connect(2) result is ECONNREFUSED) {
if (priv->monJSON && priv->allowReboot == VIR_TRISTATE_BOOL_YES)
state = VIR_DOMAIN_SHUTOFF_CRASHED;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
} else {
state = VIR_DOMAIN_SHUTOFF_DAEMON;
}
Do we really want to dig into why qemuConnectMonitor failed? Or is it
good enough to do something because it did fail. IOW: What condition of
using DAEMON as the reason is most important?
Failure before we try to open monitor?
Failure to open the monitor?
Failure after opening the monitor?
Once we open the monitor, priv->mon is set and we could use that.
Sure if priv->mon != NULL then the reason to shutdown is DAEMON only.
But for example the patch you sent is not quite correct. If we don't even try
Please comment on it in that patch! I was going for "same check" used
to place the "-no-shutdown" on the command line.
Post by Nikolay Shirokovskiy
to call connect or connect result is something unexpected then we can't be sure
of anything - the process may exist or not exist so we can't tell it crashed if
it was started with -noshutdown option. The virDomainObjIsActive is only
remembered state at this point. We need to set reason UNKNOWN in this case.
Let's call this the first case. The second case is when connect returns
ECONNREFUSED - this means nobody listening on monitor socket - this is the case
your patch should apply to. The problem is you can not distinguish case 1 and
2 now. In both cases you'll have priv->mon = NULL. Moreover in some cases we
can close monitor right after successful connection thus sometimes even if
priv->mon == NULL we need to set DAEMON reason. That's why I don't use
priv->mon in conditions in the above if/else snippet.
Post by John Ferlan
Post by Nikolay Shirokovskiy
Moreover we should not later kill process by remembered pid in all cases except
last one in the above code because we can kill some other process.
Not quite sure I follow. This is the reconnect path to connect so some
existing/running domain. The {vm|obj}->pid would be something filled via
/var/run/libvirt/qemu processing for {domain}.{xml|pid}. Different hunk
of code. Maybe I'm misreading your comment.
We can't be sure domain is running. And if it is not running some unrelated
process can reuse pid. So imagine we don't even have chance to call connect(2)
because of early error and qemu really died and some process with same pid
is running - we call qemuProcessStop on error path and shoot innocent process.
I'd have to dig through all the code that handles the reconnection, but
I'm working under the assumption that the .xml and .pid files found in
the /var/run/libvirt/qemu directory would have been vetted previously to
determine whether the pid mentioned in the file related to a qemu
process or not. I certainly understand the downside of doing something
to someone else's process!

John
Post by Nikolay Shirokovskiy
Even if we connect and get ECONNREFUSED which means there is no qemu process
pid can be reused and we should not kill. In short we should clear
vm->pid before calling qemuProcessStop if shutdown reason is different from DAEMON.
Nikolay
Post by John Ferlan
Post by Nikolay Shirokovskiy
Looks like a bit complicated, not sure should we pursuit this one or just left
VIR_DOMAIN_SHUTOFF_UNKNOWN only (in this case we can introduce
VIR_DOMAIN_SHUTOFF_DAEMON for reboot failures where reason is obvisous).
Nikolay
Post by John Ferlan
Post by Nikolay Shirokovskiy
diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a26363..f0ad558 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))
VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
John Ferlan
2018-11-07 14:29:07 UTC
Permalink
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
[...]

Now that I've pushed commits 296e05b54 and 8f0f8425d, let's revisit
this... Merging in that set of changes with this patch means that
Post by Nikolay Shirokovskiy
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
We'd have:

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 3bf84834ec..023e187729 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7995,10 +7995,14 @@ qemuProcessReconnect(void *opaque)
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
- * domain crashed; otherwise, we don't really know. */
+ * domain crashed; otherwise, if the monitor was started,
+ * then we can blame ourselves, else we failed before the
+ * monitor started so we don't really know. */
if (!priv->mon && tryMonReconn &&
qemuDomainIsUsingNoShutdown(priv))
state = VIR_DOMAIN_SHUTOFF_CRASHED;
+ else if (priv->mon)
+ state = VIR_DOMAIN_SHUTOFF_DAEMON;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;

@@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same
domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);

[...]

So does this essentially meet/handle the condition you were trying to
alter? That is - once we get beyond the monitor reconnection, then if
we end up in error that means the daemon has decided to stop things.
Leaving the UNKNOWN to be the period prior to attempting to reconnect
and CRASHED to the period during which we try to connect to the monitor.

This also captures failures after connection is successful (when
priv->mon in qemuConnectMonitor).

Getting more specific failures of why the connection failed is perhaps a
future task if it really matters.

If this looks good to you let me know, I can merge and push with the
previously noted changes and a commit message of:

"This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.

This action would occur during reconnection when processing encounters
an error once the monitor reconnection is successful."


John
Nikolay Shirokovskiy
2018-11-08 07:03:11 UTC
Permalink
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
[...]
Now that I've pushed commits 296e05b54 and 8f0f8425d, let's revisit
this... Merging in that set of changes with this patch means that
Post by Nikolay Shirokovskiy
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..c4bc9ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7988,15 +7988,12 @@ qemuProcessReconnect(void *opaque)
/* We can't get the monitor back, so must kill the VM
* to remove danger of it ending up running twice if
* user tries to start it again later
- * If we couldn't get the monitor since QEMU supports
- * no-shutdown, we can safely say that the domain
- * crashed ... */
- state = VIR_DOMAIN_SHUTOFF_CRASHED;
- /* If BeginJob failed, we jumped here without a job, let's hope another
+ * If BeginJob failed, we jumped here without a job, let's hope another
* thread didn't have a chance to start playing with the domain yet
* (it's all we can do anyway).
*/
- qemuProcessStop(driver, obj, state, QEMU_ASYNC_JOB_NONE, stopFlags);
+ qemuProcessStop(driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
+ QEMU_ASYNC_JOB_NONE, stopFlags);
}
goto cleanup;
}
@@ -8035,7 +8032,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 3bf84834ec..023e187729 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7995,10 +7995,14 @@ qemuProcessReconnect(void *opaque)
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
- * domain crashed; otherwise, we don't really know. */
+ * domain crashed; otherwise, if the monitor was started,
+ * then we can blame ourselves, else we failed before the
+ * monitor started so we don't really know. */
if (!priv->mon && tryMonReconn &&
qemuDomainIsUsingNoShutdown(priv))
state = VIR_DOMAIN_SHUTOFF_CRASHED;
+ else if (priv->mon)
+ state = VIR_DOMAIN_SHUTOFF_DAEMON;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;
@@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
I think we need to report VIR_DOMAIN_SHUTOFF_UNKNOWN here. This is similar to
qemuProcessReconnect before we try to reconnect. May be this hunk need to be
placed in distinct patch therefore.
Post by John Ferlan
qemuDomainRemoveInactiveJobLocked(src->driver, obj);
[...]
So does this essentially meet/handle the condition you were trying to
alter? That is - once we get beyond the monitor reconnection, then if
we end up in error that means the daemon has decided to stop things.
Leaving the UNKNOWN to be the period prior to attempting to reconnect
and CRASHED to the period during which we try to connect to the monitor.
This also captures failures after connection is successful (when
priv->mon in qemuConnectMonitor).
Getting more specific failures of why the connection failed is perhaps a
future task if it really matters.
This looks good enough for me too.
Post by John Ferlan
If this looks good to you let me know, I can merge and push with the
"This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during reconnection when processing encounters
an error once the monitor reconnection is successful."
Besides the first comment I'm fine with this change. I'm going
to send a patch to change reason to VIR_DOMAIN_SHUTOFF_DAEMON in
appropriate reboot failure cases too. Also one day let's fix the code not to kill
unrelated process.

Nikolay
John Ferlan
2018-11-08 12:58:19 UTC
Permalink
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
[...]
Post by Nikolay Shirokovskiy
Post by John Ferlan
@@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
I think we need to report VIR_DOMAIN_SHUTOFF_UNKNOWN here. This is similar to
qemuProcessReconnect before we try to reconnect. May be this hunk need to be
placed in distinct patch therefore.
Perhaps keeping "FAILED" would be better - that way (and I can add this
to the qemuProcessReconnect description):

* Failure to reconnect will generate the following reasons:
* VIR_DOMAIN_SHUTOFF_FAILED => Tried to create this thread, but failed
* VIR_DOMAIN_SHUTOFF_UNKNOWN => Thread started, but failed before reconnect
* to the monitor
* VIR_DOMAIN_SHUTOFF_CRASHED => Attempted to reconnect to the monitor, but
* failed to open, assume domain crash
* VIR_DOMAIN_SHUTOFF_DAEMON => Reconnected to the monitor, but decided
* afterwards we needed to stop the process

Look good?

Tks -

John


[...]
Nikolay Shirokovskiy
2018-11-08 13:10:51 UTC
Permalink
Post by John Ferlan
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
[...]
Post by Nikolay Shirokovskiy
Post by John Ferlan
@@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
I think we need to report VIR_DOMAIN_SHUTOFF_UNKNOWN here. This is similar to
qemuProcessReconnect before we try to reconnect. May be this hunk need to be
placed in distinct patch therefore.
Perhaps keeping "FAILED" would be better - that way (and I can add this
* VIR_DOMAIN_SHUTOFF_FAILED => Tried to create this thread, but failed
* VIR_DOMAIN_SHUTOFF_UNKNOWN => Thread started, but failed before reconnect
* to the monitor
* VIR_DOMAIN_SHUTOFF_CRASHED => Attempted to reconnect to the monitor, but
* failed to open, assume domain crash
* VIR_DOMAIN_SHUTOFF_DAEMON => Reconnected to the monitor, but decided
* afterwards we needed to stop the process
Look good?
Unfortunately meaning of each is fixed in API:

VIR_DOMAIN_SHUTOFF_UNKNOWN = 0, /* the reason is unknown */

VIR_DOMAIN_SHUTOFF_CRASHED = 3, /* domain crashed */

VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */

So this is definetly not FAILED. I guess we can't change meaning now also, so
we have to use UNKNOWN and explaining reasons again is qemuProcessReconnect is
needless.

By the way how from reconnection POV failing to spawn a thread differ from any
error in qemuProcessReconnect before monitor connectiton.

Nikolay
John Ferlan
2018-11-08 13:52:43 UTC
Permalink
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Post by John Ferlan
Post by Nikolay Shirokovskiy
Let's introduce shutdown reason "daemon" which means we have to
kill running domain ourselves as the best action we can take at
that moment. Failure to pick up domain on daemon restart is
one example of such case. Using reason "crashed" is a bit misleading
as it is used when qemu is actually crashed or qemu destroyed on
guest panic as result of user choice that is the problem was
in qemu/guest itself. So I propose to use "crashed" only for
qemu side issues and introduce "daemon" when we have to kill the qemu
for good.
There is another example where "daemon" will be useful. If we can
not reboot domain we kill it and got "crashed" reason now. Looks
like good candidate for "daemon" reason.
---
include/libvirt/libvirt-domain.h | 1 +
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 11 ++++-------
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 9 insertions(+), 9 deletions(-)
[...]
Post by Nikolay Shirokovskiy
Post by John Ferlan
@@ -8045,7 +8049,7 @@ qemuProcessReconnectHelper(virDomainObjPtr obj,
* is no thread that could be doing anything else with the same domain
* object.
*/
- qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_FAILED,
+ qemuProcessStop(src->driver, obj, VIR_DOMAIN_SHUTOFF_DAEMON,
QEMU_ASYNC_JOB_NONE, 0);
I think we need to report VIR_DOMAIN_SHUTOFF_UNKNOWN here. This is similar to
qemuProcessReconnect before we try to reconnect. May be this hunk need to be
placed in distinct patch therefore.
Perhaps keeping "FAILED" would be better - that way (and I can add this
* VIR_DOMAIN_SHUTOFF_FAILED => Tried to create this thread, but failed
* VIR_DOMAIN_SHUTOFF_UNKNOWN => Thread started, but failed before reconnect
* to the monitor
* VIR_DOMAIN_SHUTOFF_CRASHED => Attempted to reconnect to the monitor, but
* failed to open, assume domain crash
* VIR_DOMAIN_SHUTOFF_DAEMON => Reconnected to the monitor, but decided
* afterwards we needed to stop the process
Look good?
VIR_DOMAIN_SHUTOFF_UNKNOWN = 0, /* the reason is unknown */
VIR_DOMAIN_SHUTOFF_CRASHED = 3, /* domain crashed */
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
So this is definetly not FAILED. I guess we can't change meaning now also, so
we have to use UNKNOWN and explaining reasons again is qemuProcessReconnect is
needless.
By the way how from reconnection POV failing to spawn a thread differ from any
error in qemuProcessReconnect before monitor connectiton.
True, but let's walk through going through history

The FAILED reason was introduced in commit b046c55d4 (v0.9.2).

Then, commit d38897a5d (v0.9.5) introduced qemuProcessReconnectHelper to
run qemuProcessReconnect in a thread. This commit used FAILED for both
APIs because that kept the same reason.

Eventually commit bda2f17d7 (v0.9.12) changed FAILED to CRASHED and
UNKNOWN based on whether the domain was started w/ -no-shutdown. With
this commit, failure to start a thread would still use FAILED.

That stayed that way until commit fe35b1ad6 (v4.3.0) in inadvertently
removed UNKNOWN while removing the capability used to determine whether
-no-shutdown was used. This left CRASHED as the only reason.

That was repaired as a result of this patch by commit 296e05b54 (4.10.0).

So now the proposal which seems fair to me is to create a category
DAEMON which reduces the UNKNOWN window to better describe why we're
stopping the domain after reconnection.

So, going back to FAILED in qemuProcessReconnectHelper IMO is the right
thing to do at this point. If some patch comes along afterwards to
changed FAILED to UNKNOWN it can be debated then.

So, I'll restore the original FAILED reason, but not document the
various values since the reader of the code could figure it out.

John
John Ferlan
2018-11-13 20:36:59 UTC
Permalink
From: Nikolay Shirokovskiy <***@virtuozzo.com>

This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.

This action would occur during reconnection when processing
encounters an error once the monitor reconnection is successful.

Signed-off-by: Nikolay Shirokovskiy <***@virtuozzo.com>
Reviewed-by: John Ferlan <***@redhat.com>
---

Since I hadn't seen a yay or nay on my last response related to
changes - figured I'd post things as they are now just to make it
easier to determine whether the changes made were acceptible
This is essentially your original patch merged with my other
changes and the removal of the qemu_process.c change.

include/libvirt/libvirt-domain.h | 2 ++
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 6 +++++-
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index fdd2d6b8ea..71debd92b8 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -145,6 +145,8 @@ typedef enum {
VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */
VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was
* taken while domain was shutoff */
+ VIR_DOMAIN_SHUTOFF_DAEMON = 8, /* daemon decides to kill domain
+ during reconnection processing */
# ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_SHUTOFF_LAST
# endif
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index e8e0adc819..b45f25fabb 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -779,7 +779,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST,
"migrated",
"saved",
"failed",
- "from snapshot")
+ "from snapshot",
+ "daemon")

VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST,
"unknown",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 1850923914..59a94bb74a 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7978,10 +7978,14 @@ qemuProcessReconnect(void *opaque)
*
* If we cannot get to the monitor when the QEMU command
* line used -no-shutdown, then we can safely say that the
- * domain crashed; otherwise, we don't really know. */
+ * domain crashed; otherwise, if the monitor was started,
+ * then we can blame ourselves, else we failed before the
+ * monitor started so we don't really know. */
if (!priv->mon && tryMonReconn &&
qemuDomainIsUsingNoShutdown(priv))
state = VIR_DOMAIN_SHUTOFF_CRASHED;
+ else if (priv->mon)
+ state = VIR_DOMAIN_SHUTOFF_DAEMON;
else
state = VIR_DOMAIN_SHUTOFF_UNKNOWN;

diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c
index 3a2636377d..f0ad558f87 100644
--- a/tools/virsh-domain-monitor.c
+++ b/tools/virsh-domain-monitor.c
@@ -212,7 +212,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason,
N_("migrated"),
N_("saved"),
N_("failed"),
- N_("from snapshot"))
+ N_("from snapshot"),
+ N_("daemon"))

VIR_ENUM_DECL(virshDomainCrashedReason)
VIR_ENUM_IMPL(virshDomainCrashedReason,
--
2.17.2
Nikolay Shirokovskiy
2018-11-14 06:51:38 UTC
Permalink
Post by John Ferlan
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during reconnection when processing
encounters an error once the monitor reconnection is successful.
---
Since I hadn't seen a yay or nay on my last response related to
changes - figured I'd post things as they are now just to make it
easier to determine whether the changes made were acceptible
This is essentially your original patch merged with my other
changes and the removal of the qemu_process.c change.
include/libvirt/libvirt-domain.h | 2 ++
src/conf/domain_conf.c | 3 ++-
src/qemu/qemu_process.c | 6 +++++-
tools/virsh-domain-monitor.c | 3 ++-
4 files changed, 11 insertions(+), 3 deletions(-)
I thought you are going to push the patch after last letter)

ACK
Erik Skultety
2018-11-14 07:55:10 UTC
Permalink
Post by John Ferlan
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during reconnection when processing
encounters an error once the monitor reconnection is successful.
---
FWIW:
Reviewed-by: Erik Skultety <***@redhat.com>
Andrea Bolognani
2018-11-14 16:19:35 UTC
Permalink
Post by John Ferlan
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during reconnection when processing
encounters an error once the monitor reconnection is successful.
This made language bindings sad:

https://ci.centos.org/view/libvirt/job/libvirt-go-check/106/
https://ci.centos.org/view/libvirt/job/libvirt-perl-check/107/

Interestingly, Python doesn't seem to be bothered at all.

Anyway, it would be great if someone would add support for the new
enum value to Go and Perl bindings and make CI happy again :)
--
Andrea Bolognani / Red Hat / Virtualization
Daniel P. Berrangé
2018-11-14 16:29:55 UTC
Permalink
Post by Andrea Bolognani
Post by John Ferlan
This patch introduces a new shutdown reason "daemon" in order
to indicate that the daemon needed to force shutdown the domain
as the best course of action to take at the moment.
This action would occur during reconnection when processing
encounters an error once the monitor reconnection is successful.
https://ci.centos.org/view/libvirt/job/libvirt-go-check/106/
https://ci.centos.org/view/libvirt/job/libvirt-perl-check/107/
This is normal - the check jobs are explicitly intended to fail
whenever public API additions are made to remind us (me) to add
support there.
Post by Andrea Bolognani
Interestingly, Python doesn't seem to be bothered at all.
python auto-generates some of its code.
Post by Andrea Bolognani
Anyway, it would be great if someone would add support for the new
enum value to Go and Perl bindings and make CI happy again :)
I'm doing it..

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Loading...