Description
tl;dr:
- Do the weekly production deployment
- Request the missing firewall rules, if needed
Weekly production update
Every Monday (we agreed that it is better to always deploy the service at the same time; start around 1 PM so that it is finished, hopefully, before 3 PM):
- Check with the team (at stand-up or in the chat) whether deploying new changes from the main branches is safe.
- Check the current state of the staging deployment, specifically:
- If everything looks OK, proceed with the deployment.
- Change the channel topic on both chats (Matrix and Slack) to notify users that we are going to redeploy Packit Service.
- Move content from main branches to the stable branches.
Warning
Image build takes around 40 minutes since additional architectures were added (mainly because of the aarch64 emulation).
- Go to the Actions tab (example of the respective repos) and make sure the images from the stable branches are built and pushed to Quay.io.
Note
Roughly once a month, check the Quay repository and remove older images that take up unnecessary space.
Ideally sort the repos by “last modified”, as sorting by size doesn't account for different units (e.g., GB vs MB) correctly.
- Clean up older images from Quay.
- Create a blog post about what's been changed in Packit and Packit Service main branches since their last stable branch update.
- If there were any changes in the Kubernetes/OpenShift objects or in the Vault/Packit/! Changelog ! that have not already been applied to production (ask the author of the change), apply them with DEPLOYMENT=prod make deploy.
- A couple of minutes after the new images are imported to Quay.io, check that they are also imported to the production deployment and check the production state. Check:
- Revert the topic change after the images are built and imported to OpenShift.
- Merge and publish your blog post.
- (optional) Create a Mastodon post mentioning the top news.
Warning
Don't forget to redo the repositories used for moving the stable branches, or just clone the additional repo with git clone --recurse-submodules git@github.com:packit/production-monorepo.git
Requesting firewall rules
If users report that Packit cannot access a specific domain (typically for downloading sources in downstream release automation):
- verify the domain is not accessible in the clusters (both prod and preprod), e.g., using curl -I --max-time 5 <domain>
- request the firewall rules as documented here
- track the progress in this issue
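The accessibility check from the first step can be repeated for several domains at once. The sketch below wraps the curl -I --max-time 5 call mentioned above; check_domain and check_domains are names made up for illustration, and the script is assumed to run from a shell inside a pod of the cluster being checked (prod or preprod).

```shell
# Sketch only: check_domain and check_domains are illustrative names.
# Assumption: this runs from a shell inside a pod of the cluster being checked.

# Print "<domain> reachable" or "<domain> unreachable" for one domain.
check_domain() {
    # -I asks for headers only; --max-time 5 gives up after five seconds.
    if curl -s -I --max-time 5 "$1" >/dev/null 2>&1; then
        echo "$1 reachable"
    else
        echo "$1 unreachable"
    fi
}

# Check every domain given as an argument, one result per line.
check_domains() {
    for domain in "$@"; do
        check_domain "$domain"
    done
}
```

A non-zero exit from curl (for example a timeout) is reported as unreachable. An HTTP error status still counts as reachable, because the connection itself succeeded.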
General instructions
Notifying users about the deployment
- Users should be notified in Matrix and Slack channels about the upcoming/ongoing production redeployment.
- This is done by adjusting the topic of the channels.
- Example messages to be added to the topic:
- We are redeploying the Packit Service production. If Packit does not react to your actions for a long time (30 minutes) try retriggering it with /packit build, /packit test, /packit propose-downstream, /packit pull-from-upstream, /packit koji-build, /packit create-update. ⚠️ Production redeployment in progress.⚠️
- Changing the topic should automatically create a message in the channel; if it does not, send a message with the topic to the channel.
Moving the stable branches
- You need to have a clone of the deployment repository.
- Your SSH keys have to be associated with your GitHub account to be able to use the move-stable script.
- Run make move-stable.
- If pushing the stable branch fails for a repo, e.g., because some changes had to be cherry-picked into production, force-push the current state of main as stable to sync up the history.
Redeployment details
- Newer images are automatically imported from Quay.io and pods are recreated from them in the same way as in the staging environment.
- If pods are stuck in a terminating status for a long time, kill them manually with oc delete pod --force <pod-name>.
- If you encounter Error: ImagePullBackOff, import the images manually with DEPLOYMENT=prod make import-images.
. - If the previous deployment has been reverted, check with the team and docs for further steps to be taken.
- Revert the deployment if all jobs fail or there is a significant regression in functionality.
- Double check the latest deployed versions in https://prod.packit.dev/api/system or https://stg.packit.dev/api/system
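Finding the pods that need attention in the steps above can be partly scripted. The sketch below filters the output of oc get pods --no-headers by status; it assumes the default column layout (NAME, READY, STATUS, ...), and pods_with_status is an illustrative name.

```shell
# Sketch only: pods_with_status is an illustrative helper.
# Assumption: input uses the default "oc get pods --no-headers" columns,
# i.e. NAME READY STATUS RESTARTS AGE.

# Read pod lines on stdin and print the names of pods whose STATUS equals $1.
pods_with_status() {
    awk -v status="$1" '$3 == status { print $1 }'
}

# Example invocations (not run here):
#   oc get pods --no-headers | pods_with_status Terminating
#   oc get pods --no-headers | pods_with_status ImagePullBackOff
```

The names printed for Terminating are the candidates for oc delete pod --force <pod-name>; any output for ImagePullBackOff suggests running the manual image import.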
Blog post
- Use title Packit [in] $MONTH $YEAR.
- Use scripts/move_stable.py github-query to get a link to all the PRs from the past week which are marked to have release notes. You can copy those for a start, and check with other team members to make sure nothing was left out.
- You can also use the blogpost template provided by make move-stable.
- The blog post should meet the following requirements:
- It is meant to be read by the general public, so make it easy to read for everyone.
- Focus on new features, notable bug fixes, documentation updates, UX improvements.
- Don't talk about internal things (e.g., most ogr/specfile changes), refactoring, CI changes, etc.
- Write it so that people with little knowledge of the project can get a clue what the change is about:
- NO: Packit's behaviour in dealing with spec files was improved.
- YES: Packit can now find a spec file inside a bottle of rum which is very handy when you're thirsty.
- When sharing on our Mastodon account, be aware that the limit for a post is 500 characters, so write a shorter version or split the content into multiple posts and publish them throughout the week.
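To prepare a multi-post version, the text can be wrapped into chunks that fit the limit. A rough sketch using fold follows; split_posts is an illustrative name. Note that fold measures width in columns or bytes depending on the implementation, while Mastodon counts characters and treats links specially, so the result should still be checked by hand.

```shell
# Sketch only: split_posts is an illustrative helper.
# Read text on stdin and break it at spaces into lines of at most $1
# columns (default 500), giving one candidate post per output line.
split_posts() {
    fold -s -w "${1:-500}"
}

# Example invocation (not run here), with a hypothetical file name:
#   split_posts < summary.txt
```

Breaking at spaces keeps words intact, but a chunk can still end mid-sentence, so some manual editing of the boundaries is likely needed.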