Feature: Safely shutdown cloudstack#6755
|
Found UI changes, kicking a new UI QA build |
4d7f30f to
dc36104
Compare
|
Found UI changes, kicking a new UI QA build |
|
@acs-robot a Jenkins job has been kicked to build UI QA env. I'll keep you posted as I make progress. |
|
UI build: ✔️ |
SonarCloud Quality Gate failed. |
DaanHoogland
left a comment
Looks generally good.
A functional question, though: there is a UI component, but will the API let all clustered MS shut down? (I didn't see code for that.) It seems only the MS that happens to handle the API call will shut down.
    private static final int HEARTBEAT_INTERVAL = 2000;
    private static final int GC_INTERVAL = 10000; // 10 seconds

    private boolean allowAsyncJobs = true;
Seems like this should be called shutdownTriggered, or else the exception messages are too specific: with a generic flag, we don't know that async jobs are disallowed because of a shutdown rather than some other maintenance task.
I'm guessing this can be reused later, hence the more generic naming. I'll look into polishing the message accordingly
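To make the naming discussion concrete, here is a minimal sketch of a generically named flag that gates new async jobs. The class `AsyncJobGate` and its methods are hypothetical illustrations, not the code in this PR; only the field name `allowAsyncJobs` and the idea of counting pending jobs come from the diff.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the CloudStack implementation.
final class AsyncJobGate {
    // Generic flag: it says jobs are disallowed, not why.
    private volatile boolean allowAsyncJobs = true;
    private final AtomicInteger pendingJobs = new AtomicInteger();

    void disableAsyncJobs() { allowAsyncJobs = false; }

    void enableAsyncJobs() { allowAsyncJobs = true; }

    // Accepts a job only while the gate is open.
    void submit(String jobName) {
        if (!allowAsyncJobs) {
            // The message stays as generic as the flag name.
            throw new IllegalStateException(
                "Cannot accept job '" + jobName + "': async jobs are currently disabled");
        }
        pendingJobs.incrementAndGet();
    }

    // Marks one previously accepted job as finished.
    void complete() { pendingJobs.decrementAndGet(); }

    int countPendingJobs() { return pendingJobs.get(); }

    public static void main(String[] args) {
        AsyncJobGate gate = new AsyncJobGate();
        gate.submit("deploy-vm");
        gate.disableAsyncJobs();
        try {
            gate.submit("late-job");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        gate.complete();
        System.out.println("pending jobs: " + gate.countPendingJobs());
    }
}
```

With a generic flag like this, the exception message cannot name the shutdown as the cause unless the reason is tracked separately, which is the trade-off the two comments above are weighing.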
    }
    if (shutdownManager.countPendingJobs() == 0) {
        s_logger.info("Shutting down now");
        System.exit(0);
Are other threads allowed to shut down as well (maybe by setting a flag on ManagedContextRunnable and adding a check for it there)? (<== genuine question)
When System.exit() is called, all registered shutdown hooks are run, and the JVM exits only once they have completed.
https://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#exit(int)
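The behaviour described in that reply can be sketched with the standard `Runtime.addShutdownHook` API. The class `ShutdownHookSketch` and the method `registerCleanupHook` are hypothetical names for illustration and do not appear in the PR.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch, not code from this PR.
final class ShutdownHookSketch {
    // Registers a hook that records and prints that it ran.
    // The thread is returned so a caller can unregister it again.
    static Thread registerCleanupHook(List<String> log) {
        Thread hook = new Thread(() -> {
            log.add("cleanup hook ran");
            System.out.println("cleanup hook ran");
        }, "cleanup-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerCleanupHook(new CopyOnWriteArrayList<>());
        System.out.println("Shutting down now");
        // Starts the shutdown sequence: registered hooks run to completion,
        // then the JVM halts.
        System.exit(0);
    }
}
```

Note that this guarantee covers registered hooks only. Per the `Runtime` documentation, other threads keep running during the shutdown sequence and are stopped when the JVM halts, so work running on ordinary threads needs a hook or an explicit check if it must finish cleanly.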
@blueorangutan package |
|
@davidjumani a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
@davidjumani can you explain this? |
|
Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 5789 |
|
@blueorangutan test |
|
@davidjumani a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
Trillian test result (tid-6328)
|
|
@blueorangutan package |
|
@davidjumani a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 5818 |
|
@blueorangutan test |
|
@davidjumani a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
Trillian test result (tid-6345)
|
|
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch. |
|
Thanks, great feature. The banner background colour and text colour aren't in line with our current AntD theme. Should the message use warning (yellow) or error (red)? https://antdv.com/components/alert/#Alert (see banner, to be used on top of the page)
|
Thanks @rohityadavcloud I'll create a PR to address this |
* Safely shutdown feature (ref: apache#6755)
* Updated version and some improvements
* Management Server Maintenance - Prepare and Cancel Maintenance changes
  This is supported for the Cloudstack deployments with multiple management servers.
  - During preparing for maintenance, MS waits for pending jobs to finish, and then Transfer/Migrate the agents to other available MS
  - New APIs: prepareForMaintenance, cancelMaintenance
  - New MS States: PreparingToMaintenance, Maintenance
* check for single active management server
* refactoring plugin name
* updated version, and cleanup
* code improvements
* support list hosts by management server id
* update ui with ms maintenance apis
* code improvements
* ui changes
* ui icons update
* ui fixes
* cond checks for maintenance and shutdown
* fix for management server not down issue on service stop
* continue with other components on error
* agent transfer fixes
* maintenance window timeout and fixes
* ui changes - added connected agents tab, and updated hosts & management servers fields
* marvin test update
* keep maintenance after shutdown/restart, do not update last_updated time in cluster heartbeat during maintenance (notifies node inactive/down after heartbeat threshold)
* listener for ms maintenance updates
* cleanup
* keep last msid in host table
* review comments
* allow only one mgmt server to prepare for maintenance
* added ms uuid in logs
* minor code improvements
* ui fields update
* fix systemvm navigation in connected agents
* algorithm check and input from ui
* check for active ms from host setting
* agent migration code improvements
* minor ui label fix
* fixes & code improvements
* agent reconnect fixes, consider avoid list
* ui fixes
* direct agents transfer and pending jobs timer task fixes
* close unclosed socket channels if any
* Updated pending jobs check timer task with ScheduledExecutorService
* fixes
* keep maintenance state on trigger shutdown call when ms is in maintenance
* direct agent transfer fixes
* add pending jobs count to ms response
* during ms heartbeat, update state to up only when it's down
* allow vm work jobs of async job created before prepare for maintenance
* Revert "keep maintenance state on trigger shutdown call when ms is in maintenance" This reverts commit 4ebbea71ef20a65286bed41a517f03e253a8fe90.
* removed duplicate schema changes from schema-41800to41810.sql (already defined at schema-41811to41812.sql)











Description
Adds a feature to safely shut down CloudStack.
It does the following:
Contains 4 new APIs:
Types of changes
Feature/Enhancement Scale
Screenshots (if appropriate):
How Has This Been Tested?
TODO