[00:44:32] i have a new ganeti VM and i can see DHCPACK and then atftpd serving lpxelinux.0 (PXE boot), but i just never see anything on the console. also nothing at all when i restart the VM.. which has status "up and up"
[00:46:09] if it wasn't virtual i would think console settings are wrong in BIOS
[01:57:33] weird, the only time i've seen nothing / not been able to connect to the console, there was no DHCPACK because I got the MAC wrong
[07:54:18] _joe_: hopefully https://gerrit.wikimedia.org/r/c/operations/puppet/+/543127 will turn some of the UNKNOWNs from check_puppetrun into warnings
[08:17:17] mutante: manually created or via the cookbook?
[08:17:46] and, if manual, what's the use case not (yet?) covered by the cookbook?
[08:47:06] <_joe_> jbond42: I will take a look in a few :)
[08:55:56] _joe_: it's merged, just an FYI
[08:56:08] <_joe_> oh ok thanks for that :)
[08:56:48] np
[09:01:52] <_joe_> XioNoX: netflow2001 is running out of disk space
[09:03:48] <_joe_> it is running out of space, with 105 MB left on /
[09:03:57] <_joe_> so pretty urgent
[09:10:15] _joe_: I'll have a look; it shouldn't store anything on disk, so it's surprising
[09:38:18] _joe_: https://gerrit.wikimedia.org/r/c/operations/puppet/+/543392
[13:58:55] jbond42: can you join us in #wikimedia-cloud-admin to help sort out a puppetmaster issue?
[13:59:26] yep
[15:25:37] hey y'all, any good way to test a new custom fact?
[15:25:51] i'm not sure if my fact is not working, or it just doesn't work in puppet compiler if it is new
[15:26:12] i could set something up in labs, but it would take me another hour or two just to get a test setup in place there
[15:26:39] _joe_: ^?
[15:27:01] ottomata: you could copy the ruby file to /var/lib/puppet/lib/facter/ on a test machine, then run `facter -p myfact`
[15:27:16] oh hm!
[15:27:20] great idea :)
[15:29:13] <_joe_> ottomata: it won't work in the compiler if it's new, no
[15:29:27] ah ok good to know
[15:31:11] <_joe_> the compiler uses exported facts
[15:31:21] <_joe_> it can't run facter on the host itself, right
[15:32:58] jbond42: do I have to do something to get facter to load my new fact?
[15:37:11] ottomata: i don't think so, but facter mostly fails silently if there are errors, so use -d, and make sure you run with sudo if the fact needs sudo
[16:05:37] I am going to run a copy operation from helium to backup1001
[16:05:52] this is a trivial operation, we do it all the time for databases; I am only commenting on it here because helium storage has crashed in the past
[16:06:42] I am not going to downtime anything, because we should be notified if anything crashes, but so you are aware if something bad happens
[16:06:53] also about the increase of network usage
[19:00:40] cdanis: still working? I'm trying to "just statically special-case that section in the configuration files" and I think I must be missing a step.
[19:00:42] Once I have $wgLBFactoryConf['sectionLoads']['labwikitestwiki'] = ['clouddb2001-dev' => 1];
[19:00:54] I assume I have to associated clouddb2001-dev with an actual hostname someplace
[19:00:59] *associate
[19:17:45] * andrewbogott proposes… https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/543664/ ?
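A minimal sketch of the custom-fact test jbond42 describes above, assuming a throwaway fact named `myfact` (the name and its output are invented for illustration; the path and flags are the ones quoted in the log):

    # write a throwaway Ruby fact where the puppet agent keeps synced facts
    sudo tee /var/lib/puppet/lib/facter/myfact.rb >/dev/null <<'EOF'
    Facter.add(:myfact) do
      setcode { 'hello' }  # replace with the real fact logic
    end
    EOF
    # -p loads puppet/custom facts; -d surfaces errors that facter
    # would otherwise swallow silently
    sudo facter -p -d myfact

If the fact prints nothing, the -d output usually shows where it failed to load.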
[19:47:21] andrewbogott: just back now from lunch
[19:47:57] cdanis: you can skip straight to the patch :)
[19:47:57] andrewbogott: you also need an entry in the hostsByName subarray, and I suspect you might also have to have edited the dblist files
[19:48:11] although I do not know too much about the latter
[19:48:32] Hm, I was assuming that 'hostsByName' was the same thing as the dblist
[19:48:33] except they define the mapping between wiki names and database 'sections' (e.g. s1)
[19:48:36] if not… where are the dblist files?
[19:49:03] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/master/dblists/
[19:49:11] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/master/dblists/wikitech.dblist
[19:50:43] I haven't worked with them much; they're not something I needed to worry about when doing dbctl stuff
[19:55:22] * andrewbogott can't make a lot of sense out of those
[19:57:34] um... yeah now I think we have stuff in etcd. urk
[19:57:52] and dblists are something entirely different
[19:58:30] apergos: it's still the dblists that define which wikis are in which database 'section' as referred to in $wgLBFactoryConf['sectionLoads'] though, right? or am I mistaken?
[19:58:38] uh no
[19:58:39] the etcd stuff I can all speak to :) just not whatever happens before it
[19:58:59] the dblist files are um just lists of database names like enwiki arwikt etc
[19:59:17] so eg dblists/all.dblist
[19:59:54] adding a db to a section (in the case it's not in s3, the 'default') is the way to determine which hosts get it
[20:00:03] because they also get assigned to serve certain sections
[20:00:27] so what's the purpose of e.g. s1.dblist?
[20:01:08] apergos: are you looking at https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/543664/ already?
[20:01:12] it maps wikis to db servers
[20:01:19] Seems like that already assigns a host to a database and a database to a db server
[20:01:52] mutante: what is hostsByName then?
[20:02:09] names to IP addresses
[20:02:11] the s*.dblist files, I don't know what uses them but they sure aren't required for the lb conf
[20:03:42] cdanis: so when you say 'name' you mean 'name of a database server' right?
[20:04:15] in the context of hostsByName I mean 'abbreviated hostname, as it is referred to in sectionLoads'
[20:04:43] ok, then what does mutante mean by 'db servers'?
[20:05:11] doesn't dblist map wikis to /database names/?
[20:05:18] And then etcd maps database names to server names
[20:05:23] sorry, groups of servers, like s1 is a group of servers and certain wikis are running on it
[20:05:25] and then hostsByName maps server names to IPs?
[20:05:51] and also the applications nowadays connect to dbproxy* machines, which then connect to db* machines
[20:06:04] at least in the cases i saw
[20:06:13] because that's done (or was done anyway) in sectionLoads in db-codfw.php and db-eqiad.php
[20:06:19] but now that's in etcd somehow
[20:06:22] yes andrewbogott, but now with what apergos is saying I'm not sure of the first step.
I had thought that you would also have to edit wikitech.dblist, as I thought that is telling MediaWiki that the wiki named 'labtestwiki' resides in the group of database servers called 'wikitech'
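Putting together what cdanis and mutante describe above, a hedged sketch of how the two arrays relate in db-eqiad.php / db-codfw.php; the host name is the one from the patch under discussion, and the IP is a documentation-range placeholder, not the real address:

    <?php
    // section name -> weighted list of abbreviated host names
    $wgLBFactoryConf['sectionLoads']['wikitech'] = [
        'clouddb2001-dev' => 1,
    ];
    // abbreviated host name -> IP address (203.0.113.10 is an example
    // from the documentation range, not the real address)
    $wgLBFactoryConf['hostsByName']['clouddb2001-dev'] = '203.0.113.10';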
[20:06:51] (there's no labtestwiki key inside $wgLBFactoryConf['sectionLoads'], but there is a wikitech key)
[20:07:11] cdanis: ok, I bet you're right
[20:07:15] so I need to split them out there as well
[20:07:27] see, this is why I opened the ticket rather than just doing it myself — everything has changed!
[20:08:25] I wish I understood the mapping of wiki names to database section names
[20:10:12] anyway, before you submit your change, I do suggest doing some testing (of a few different wikis) on an mwdebug host, possibly in both eqiad and codfw, as it's easy to break something when editing those files
[20:10:56] cdanis: is it safe to assume that there is no longer any one person who understands how wikis are mapped to db hosts?
[20:11:05] it was in sectionLoads and now is controlled by dbctl https://wikitech.wikimedia.org/wiki/Dbctl#Usage
[20:11:21] apergos: but what controls the mapping between 'enwiki' and 's1' for instance?
[20:11:25] that's the question here
[20:11:35] sectionsByDB
[20:11:41] ohhh
[20:12:13] yeah it is right there, huh
[20:12:27] 🤦
[20:12:30] hey once I get my slides up to snuff I (or someone) can do a quick review of mw dbs for SREs, one friday, and catch up people on the dbctl stuff too
[20:12:37] okay, well, sectionsByDB is not in etcd :)
[20:12:47] no that isn't, the load is but not the sections
[20:12:57] yeah
[20:13:00] andrewbogott: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/master/wmf-config/db-eqiad.php#91
[20:13:12] sorry, I guess i was answering two questions at once and therefore both badly :-D
[20:13:18] then what the heck is the dblist for?
[20:13:30] something else might use those, could be tendril or who knows
[20:13:37] yeah I have no idea
[20:13:52] maybe dbtree.wikimedia.org is built from that
[20:14:16] like I have an lb factory thing in my LocalSettings.php for $reasons and I don't have those s1.dblist or whatever
[20:14:41] and everything works as it ought to
[20:15:06] marostegui would be able to tell us most likely
[20:15:11] i can try to remember to ask tomorrow
[20:16:26] I'm even more confused than I was before, but I have updated my patch nonetheless
[20:17:34] ops/software/software/dbtools/repl_dump_shard.sh:for db in $(cat mediawiki-config/$shard.dblist); do
[20:17:45] so used by some ops tools
[20:19:15] if you folks have enough info to get done whatever you need to get done, I'm going to go back to goofing off, as it's 11:20 pm here and going on naptime
[20:21:55] if the dblists are used by ops tools, then please update them as well, andrewbogott
[20:22:11] ok
[20:22:21] should be another 2-line diff i think
[20:22:39] cdanis: wouldn't I need to add another file named 'labtestwiki'?
[20:23:54] yeah
[20:24:04] but just one line there, and then minus one line in wikitech.dblist :)
[20:24:28] 'k
[20:25:54] cdanis: done
[20:29:33] * andrewbogott doesn't understand what jenkins is on about
[20:31:00] I'm guessing there's an assertion somewhere that all.dblist must be equivalent to the union of s[1-8].dblist+wikitech.dblist?
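If that guess is right, a rough local approximation of such a check might look like the diff below. This is purely an assumption about what the test asserts (the authoritative version turns out to be tests/dblistTest.php, linked a few lines further down):

    # assumed invariant: all.dblist equals the union of the
    # per-section lists; an empty diff would mean it holds
    diff <(sort -u dblists/all.dblist) \
         <(cat dblists/s[1-8].dblist dblists/wikitech.dblist | sort -u)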
[20:33:50] if there is, it's not in the mediawiki-config repo
[20:34:50] I'm making some tea and then I'll take another look
[20:36:35] thanks
[20:36:50] https://github.com/wikimedia/operations-mediawiki-config/blob/master/tests/dblistTest.php
[20:37:00] and I'm running away again, gotta stop looking in here
[20:37:52] ahh, looks like you want line 85, andrewbogott
[20:37:57] ty apergos
[20:38:09] yw, good night and good luck :-D
[20:38:27] * andrewbogott tries it
[20:40:26] hm, nope
[20:42:23] now the test passes but there's a php style violation
[20:42:44] https://integration.wikimedia.org/ci/job/operations-mw-config-php72-composer-test-docker/947/console
[20:57:36] dammit jenkins
[22:38:34] running a workaround script to fix captcha generation; new captchas are being copied to Swift. (it works if done in batches of 900, and previously failed) https://phabricator.wikimedia.org/T230245
[23:18:38] done (we have 9000 new captchas and a workaround is deployed via puppet)
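As a closing note on the dblist question: the ops-tools hit apergos grepped up earlier treats a dblist as a plain word list, one database name per line, which is why the files have to stay in sync even though the LB config itself no longer reads them. A sketch following the quoted line, with the loop body invented:

    # shape of the usage quoted from repl_dump_shard.sh earlier;
    # the echo stands in for whatever per-database work the tool does
    for db in $(cat mediawiki-config/$shard.dblist); do
        echo "$db"
    done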