[17:46:46] We got the zuul mariadb production database now.
[17:47:08] I am puppetizing the user/password so that it is in a (config) file on zuul VMs.
[17:48:03] When we were asked if we need backups and how often, I just said "yes, daily". But apparently that isn't standard for the "misc" cluster.
[17:48:26] So that resulted in a new ticket, https://phabricator.wikimedia.org/T396322, which is only about the backup needs for that database, in case you have any comments.
[18:10:47] from a technical pov, zuul doesn't really "need" the database. if you lose it (or lose some recent data), then it "only" means that we will lose metadata about the builds. that means that links to the build result pages (like https://zuul.opendev.org/t/zuul/build/14bf283a39ab476ba683328251d56271) left in gerrit comments won't work, which means that users won't be able to find the logs or
[18:10:53] artifacts for those builds. but otherwise zuul won't care, and new builds will start populating the database. missing build information may be a problem for "important" builds, like those that built production artifacts, or performed continuous deployment actions, etc. but generally, users can "recheck" changes for pre-merge builds and get new, replacement results.
[18:13:04] fwiw, some commercial customers i work with do back up databases, especially because they have builds with long runtimes; opendev has chosen not to, and simply accepts the risk: if it ever has a catastrophic database loss, users will just recheck changes and within a few hours, no one will care.
[18:15:20] so i think one of the questions is, which subset of builds is most important to keep data for? "daily" means you only ever lose a day's worth of builds, so probably not too hard to recheck those changes. you could back up weekly, and then a loss would be more disruptive. but what if no one ever looks at builds older than one week? in that case, it may not be worth backing up at all.
[18:18:48] For Jenkins our "days to keep" is between 3 and 60 (depending on the job). Meaning all jobs older than 60 days are 404s in Jenkins.
[18:20:10] aside from when there are problems with a job, my intuition is that few people look at builds older than a week. And it sounds like the consequence of losing the data is pretty low.
[18:20:46] I think we'd be fine with whatever the default backup is, no need to do more frequent backups.
[18:32:56] thank you both for the valuable comments
[18:33:27] will report back once I know what the default is