-
Notifications
You must be signed in to change notification settings - Fork 56
cri: handle the migration to the new configuration format #1251
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this new command to skuba cluster update config or skuba cluster update config because skuba cluster upgrade apply suggests that it will upgrade all nodes in the cluster
How can this be tested ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This indeed handles the migration (I haven't tested it) of the local files.
However, I think we miss the step in upgrades to also replace the sysconfig in the destination node.
Will this be in a separate PR or just a separate commit in this PR?
|
||
func criGenerateLocalConfiguration() error { | ||
cfg := clusterinit.InitConfiguration{ | ||
PauseImage: kubernetes.ComponentContainerImageForClusterVersion(kubernetes.Pause, kubernetes.LatestVersion()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is LatestVersion correct, or should that be the next version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least this is giving me this result: registry.suse.de/devel/caasp/5/containers/containers/caasp/v5/pause:3.2
, which I believe is fine, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fine for now, but it might not be in the future, if the user is lagging behind more than a version. No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, in this case they should be using the proper skuba
for each scenario (same as other upgrades). I don't think we have a next
version somewhere in the code, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm interesting, I thought we had that, because we have a path in the plan
and apply
commands, when dealing with multiple k8s versions, but I didn't check the code.
418c8e5
to
2d525dd
Compare
Updated the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good start. Here are a few changes to clean up the duties of each of the functions called.
cmd/skuba/cluster/upgrade.go
Outdated
Use: "config", | ||
Short: "Upgrades the local configuration", | ||
Run: func(cmd *cobra.Command, args []string) { | ||
if err := ssh.CriMigrate(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very misleading. You are saying migrating local configuration, yet this function run things on the remote node.
For me the upgrade localconfig should be dealing with just local things, while the things happening on target nodes can be dealt with during node upgrade
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function does not touch files on remote nodes. It just updates the local configuration. Thus, I see two things:
- It should not go into the
ssh
package, as you said in another comment. - We may want to change the subcommand name from
config
tolocalconfig
, so to reinforce the idea of changes only happening locally, as it has been suggested. What do you think?
@@ -0,0 +1,7 @@ | |||
## Path : System/Management |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should NOT provide this. This should be deleted. We should trust the sysconfig coming from the package instead, and copy it from the package (in /usr/
iirc), to the /etc/sysconfig/
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bear in mind that this is just test data. This is not planned to be installed anywhere, and for testing purposes I will not rely on developers having the file already on their machines.
9bd4585
to
4b773e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more changes required IMO.
} | ||
|
||
if _, _, err = t.ssh("mv -f /etc/sysconfig/crio /etc/sysconfig/crio.backup"); err != nil { | ||
if _, _, err := t.ssh("mv -f /usr/share/fillup-templates/sysconfig.crio /etc/sysconfig/crio"); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if mv is a good idea, that would remove the file from its expected location, is that fine for zypper?
Otherwise the location is fine for me.
return err | ||
} | ||
_, _, err = t.ssh("mv -f /tmp/cri.d/default_flags /etc/sysconfig/crio") | ||
return err | ||
return t.target.UploadFile(skuba.CriDockerDefaultsConfFile(), filepath.Join("/etc/cri/default_flags.local")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not even sure we need to do this anymore. I think we can just remove it, as ppl will just upload their own config file anyway... WDYT?
files := clusterinit.CriScaffoldFiles["criconfig"] | ||
for _, file := range files { | ||
if err := clusterinit.WriteScaffoldFile(file, cfg); err != nil { | ||
_ = os.RemoveAll(skuba.CriConfDir()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we remove the CriConfDir ? It means if the customer has custom files, they will be removed, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly. clusterinit.WriteScaffoldFile(file, cfg)
will write scaffold files into that directory. Thus, any files with the same names will be effectively replaced. This means that customers have to back things up before performing the migration (as in with any migration).
That being said, if, for whatever reason, we fail to write these files, we should remove everything instead of sticking into an intermediary result (e.g. imagine that one file was successfully written but the other didn't). We don't want these kinds of scenarios, it's all or nothing.
Therefore, I think that this is actually the safest approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly.
clusterinit.WriteScaffoldFile(file, cfg)
will write scaffold files into that directory. Thus, any files with the same names will be effectively replaced. This means that customers have to back things up before performing the migration (as in with any migration).
I think in this case, the files being replaced are the "caasp defaults", which is fine, and the customer shouldn't touch them, so we are good. Yes, it's indeed safer to backup, while not necessary.
That being said, if, for whatever reason, we fail to write these files, we should remove everything instead of sticking into an intermediary result (e.g. imagine that one file was successfully written but the other didn't). We don't want these kinds of scenarios, it's all or nothing.
That's what I am not getting my head around. For me, we should fail, and say "There is something wrong that happened here, maybe you can figure it out with this error: %err". Not deleting the whole folder, which might contain user configuration on top of our default configs. That's very disruptive to me.
I agree it's better to bail, but I don't think it's better to delete the folder.
Therefore, I think that this is actually the safest approach.
Do you mean that, because this migration only runs ONCE, it should be okay to delete the folder, as there is no user configuration yet, and that code will probably never be used for something else? I am fine with that, but I would prefer to add this explicitly in the comments. A comment just before the dir removal should be enough, to clarify that "this is okay to do so because this process only runs once, so there should be no failure, and there should be no user configuration at this point yet"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see an extra comment because future me will read this badly.
files := clusterinit.CriScaffoldFiles["criconfig"] | ||
for _, file := range files { | ||
if err := clusterinit.WriteScaffoldFile(file, cfg); err != nil { | ||
_ = os.RemoveAll(skuba.CriConfDir()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly.
clusterinit.WriteScaffoldFile(file, cfg)
will write scaffold files into that directory. Thus, any files with the same names will be effectively replaced. This means that customers have to back things up before performing the migration (as in with any migration).
I think in this case, the files being replaced are the "caasp defaults", which is fine, and the customer shouldn't touch them, so we are good. Yes, it's indeed safer to backup, while not necessary.
That being said, if, for whatever reason, we fail to write these files, we should remove everything instead of sticking into an intermediary result (e.g. imagine that one file was successfully written but the other didn't). We don't want these kinds of scenarios, it's all or nothing.
That's what I am not getting my head around. For me, we should fail, and say "There is something wrong that happened here, maybe you can figure it out with this error: %err". Not deleting the whole folder, which might contain user configuration on top of our default configs. That's very disruptive to me.
I agree it's better to bail, but I don't think it's better to delete the folder.
Therefore, I think that this is actually the safest approach.
Do you mean that, because this migration only runs ONCE, it should be okay to delete the folder, as there is no user configuration yet, and that code will probably never be used for something else? I am fine with that, but I would prefer to add this explicitly in the comments. A comment just before the dir removal should be enough, to clarify that "this is okay to do so because this process only runs once, so there should be no failure, and there should be no user configuration at this point yet"
Ensure that the local files of the user are properly migrated to the new cri-o configuration format. Signed-off-by: Miquel Sabaté Solà <msabate@suse.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Why is this PR needed?
Fixes SUSE/avant-garde#1679
Fixes SUSE/avant-garde#1785
See SUSE/avant-garde#1679
See SUSE/caasp-release-notes#51
What does this PR do?
Ensure that the local files of the user are properly migrated to the new cri-o configuration format.
Info for QA
In order to test this you should first create a v4.2.2 cluster with an old skuba. Then, use a skuba built with
master
with this patch applied, and callskuba cluster upgrade config
. You should see:addons/cri/default_flags
should've been removed.addons/cri/conf.d
with01-caasp.conf
andREADME
in it.Anything else a reviewer needs to know?
With this SUSE/caasp-release-notes#51 should not be needed.