[00:01:43] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 17% free memory [00:02:38] Ryan_Lane: would be nice if you can find time to "clone" the openid-wiki to an openid-wiki2 . I just "pulled" the recent core and E:OpenID. [00:03:03] and I am logged out from there [00:03:24] I mean I am not logged in to openid-wiki [00:04:49] I can't right now [00:04:54] I'll try to get this done tonight [00:05:18] crunch crunch crunch. Gawd. I haven't seen a package take this long to build since Qt! [00:05:27] ok, that'll be fine ! [00:06:29] "see" you tomorrow, good night (Berlin local time: 01:06) [00:07:46] andrewbogott: can you add me as a reviewer on those changes? [00:07:53] they aren't in my list [00:10:09] Ryan_Lane: OK, done. [00:10:13] cool [00:13:07] * Coren wishes he had OGS installed so he could use qmake to build OGS. [00:29:22] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 146 processes [00:37:12] RECOVERY Total processes is now: OK on bastion1.pmtpa.wmflabs 10.4.0.54 output: PROCS OK: 148 processes [00:37:12] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [00:37:32] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 23% free memory [00:39:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory [00:41:52] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 22% free memory [00:47:23] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 154 processes [00:49:29] andrewbogott: reviewed one [00:49:35] I'm reviewing the other [00:49:46] it would likely be good to stop unused volumes [00:49:51] err [00:49:53] unloved [00:50:24] unused volumes are more difficult because it could lead to flapping, if a user creates/deletes/creates/deletes a single instance [00:50:54] for that situation we could have a cache file that lets us know when a volume was last stopped [00:51:04] if it was stopped within an hour, it would not stop it [00:52:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 18% free memory [00:56:45] OK, I'll look at adding stopping. Getting things to start again will be a bit messy. [01:00:32] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 17% free memory [01:02:22] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 143 processes [01:03:26] Ryan_Lane, are you starting/stopping/messing with gluster on labstore2? [01:05:11] hm, maybe not so messy [01:05:57] andrewbogott: I disabled the script temporarily [01:06:17] OK.
'gluster volume info' was telling me that gluster was down [01:06:19] But no longer [01:09:54] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 19% free memory [01:10:13] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [01:14:53] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 8.50, 7.23, 5.92 [01:20:24] PROBLEM Current Load is now: WARNING on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: WARNING - load average: 8.64, 7.74, 5.70 [01:24:54] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 5.98, 8.00, 6.02 [01:34:30] Ryan_Lane: I'd like to add the starting/stopping as a separate patch. So go ahead and submit comments on the patch in gerrit as it is. [01:35:15] ok [01:46:42] PROBLEM Free ram is now: CRITICAL on bots-4.pmtpa.wmflabs 10.4.0.64 output: Critical: 3% free memory [01:47:20] !log webtools installed libssl0.9.8 (dependency for OGS) [01:47:22] Logged the message, Master [01:51:43] PROBLEM Free ram is now: WARNING on bots-4.pmtpa.wmflabs 10.4.0.64 output: Warning: 6% free memory [02:00:23] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 163 processes [02:01:43] RECOVERY Free ram is now: OK on bots-4.pmtpa.wmflabs 10.4.0.64 output: OK: 78% free memory [02:09:52] RECOVERY Current Load is now: OK on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: OK - load average: 3.87, 4.20, 4.97 [02:11:20] Ryan_Lane, do you know how to make 'gluster volume stop' skip the 'Do you want to continue?' prompt? [02:11:24] Oddly, 'force' does not do that. [02:11:26] yep [02:13:50] Ryan_Lane: So… how? [02:13:57] one sec [02:17:53] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 4.98, 4.93, 5.07 [02:21:10] andrewbogott: --mode=script [02:23:16] works! [02:25:23] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 147 processes [02:44:52] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 3.47, 3.87, 4.87 [02:49:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 156 processes [02:55:32] RECOVERY Current Load is now: OK on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: OK - load average: 4.47, 4.35, 4.96 [03:03:42] Ryan_Lane: I'm noticing that right after I start and stop a volume, subsequent 'gluster volume' commands fail for a few seconds. [03:03:49] yep [03:03:53] it's really annoying [03:04:24] This is suspiciously similar to the behavior in the last version when I manually restarted the service after a start or stop... [03:04:28] we should likely change the cron to run every 5 minutes [03:04:32] Suppose they fixed the memory leak by having the service restart itself? [03:04:39] heh [03:05:12] Anyway… perhaps this script should just exit after it either starts or stops a volume. Since it won't be getting any more work done after that... [03:05:30] yeah. 
maybe so [04:37:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory [04:39:52] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 27% free memory [04:40:17] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [04:40:33] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 22% free memory [04:43:43] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 14% free memory [04:53:44] PROBLEM Free ram is now: CRITICAL on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Critical: 5% free memory [04:58:32] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 17% free memory [04:58:42] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 6% free memory [05:02:52] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 13% free memory [05:03:12] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [05:05:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 17% free memory [06:30:52] PROBLEM Total processes is now: WARNING on parsoid-roundtrip5-8core.pmtpa.wmflabs 10.4.0.125 output: PROCS WARNING: 151 processes [06:35:52] RECOVERY Total processes is now: OK on parsoid-roundtrip5-8core.pmtpa.wmflabs 10.4.0.125 output: PROCS OK: 150 processes [07:24:13] PROBLEM Free ram is now: WARNING on bots-cb.pmtpa.wmflabs 10.4.0.44 output: Warning: 18% free memory [07:29:12] RECOVERY Free ram is now: OK on bots-cb.pmtpa.wmflabs 10.4.0.44 output: OK: 34% free memory [08:37:52] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 22% free memory [08:38:12] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [08:38:32] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 23% free memory [08:38:42] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 27% free memory [08:40:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 21% free memory [08:46:42] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 19% free memory [08:51:32] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 19% free memory [08:56:33] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 20% free memory [09:10:52] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 14% free memory [09:11:12] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [09:13:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 17% free memory [09:18:32] PROBLEM Current Load is now: WARNING on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: WARNING - load average: 6.62, 6.58, 5.59 [09:21:42] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 20% free memory [09:55:32] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 19% free memory [09:58:32] RECOVERY Current Load is now: OK on 
parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: OK - load average: 1.55, 2.67, 4.59 [10:37:44] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 19% free memory [12:02:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 151 processes [12:07:25] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 147 processes [12:07:42] PROBLEM Disk Space is now: WARNING on wikidata-dev-9.pmtpa.wmflabs 10.4.1.41 output: DISK WARNING - free space: / 521 MB (5% inode=71%): [12:37:42] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 27% free memory [12:38:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory [12:40:32] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 24% free memory [12:40:52] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 27% free memory [12:41:12] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [12:50:42] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 19% free memory [12:56:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 17% free memory [12:58:52] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 13% free memory [13:00:16] Change on mediawiki a page Wikimedia Labs/Tools Lab was modified, changed by Hashar link https://www.mediawiki.org/w/index.php?diff=649504 edit summary: [+7] pretty table: border="1" --> class="wikitable" [13:04:13] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [13:13:33] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 19% free memory [13:17:27] Hi! At the moment, where does mail for (instance) users get sent to? I. e., output of user crontabs, at jobs, etc.? [13:30:22] PROBLEM Free ram is now: WARNING on bots-liwa.pmtpa.wmflabs 10.4.1.65 output: Warning: 18% free memory [13:40:37] scfc_de`: to the wikimedia root i think [13:42:15] scfc_de`: in a crontab you can use the MAILTO environment variable [13:45:22] RECOVERY Free ram is now: OK on bots-liwa.pmtpa.wmflabs 10.4.1.65 output: OK: 20% free memory [13:59:08] hashar: Thanks. Do you know a solution for at jobs? [14:00:20] scfc_de`: This needs a more permanent solution, clearly. [14:01:27] Coren: On Toolserver, mail is sent to the address set in LDAP (overwritable in ~/.forward with an ugly hack, "aliasd"). [14:02:48] scfc_de`: Yeah, because of the role accounts on the Tool Labs especially I'm looking at a number of possible alternatives. I'll certainly talk with CT and Geoff about the possibility of having local mailboxen and/or permit procmail. [14:03:22] PROBLEM Free ram is now: WARNING on bots-liwa.pmtpa.wmflabs 10.4.1.65 output: Warning: 19% free memory [14:04:00] scfc_de`: So that things like cron reports can be stuffed in a mbox instead of being emailed. [14:05:57] Coren: Well, whether I read it via "ssh $HOST mail" or it is delivered to my inbox doesn't matter to me, but I think the legal implications are the same :-). [14:07:21] scfc_de`: They may or may not be; I could see reasons why Geoff might want to control and/or add rules to the idea of store-and-forward; the question needs to be posed.
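The non-interactive gluster form Ryan_Lane points andrewbogott to at [02:21:10] would look roughly like the sketch below; the volume name is a placeholder, and exact option placement can vary between gluster releases:

    # skip the "Do you want to continue?" prompt that 'force' alone leaves in place
    gluster --mode=script volume stop myvolume

With --mode=script the CLI assumes "yes" for confirmation prompts, which is what makes the command usable from the unattended volume-manager script discussed above.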
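The crontab MAILTO approach hashar suggests at [13:42:15] looks roughly like this; the address and the job are placeholders, and a working local MTA is assumed:

    MAILTO=someone@example.org
    # stdout/stderr of the job is mailed to MAILTO instead of the local user mailbox
    */10 * * * * /data/project/mytool/bin/update.sh

Note that MAILTO only covers cron; at(1) jobs follow their own mail path, which is why the question about at jobs stays open in the log.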
[14:09:44] !log wikidata-dev wikidata-dev-9: Started to play with WikibaseSolr extension. Solr is in /opt, solarium inside the extension's directory, it's not working. [14:09:46] Logged the message, Master [14:13:37] Coren: Well, as I said: "I think" :-). [14:14:33] scfc_de`: Well, Geoff is paid the Big Bucks to think of such things. i'll just dump that on his desk and think about technology instead. :-) [14:16:53] Coren: If at the end some form of "mail" is implemented, I'm (well, somewhat) fine with that :-). [14:28:52] PROBLEM Current Load is now: CRITICAL on deployment-cache-upload-test.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [14:29:39] PROBLEM Disk Space is now: CRITICAL on deployment-cache-upload-test.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [14:30:02] scfc_de`: Hm, I don't expect that using command-line mail is a reasonable solution for anyone involved. :-) If there are local mailboxen, there would be some better way to get at them. :-) [14:30:12] PROBLEM Free ram is now: CRITICAL on deployment-cache-upload-test.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [14:30:32] scfc_de`: I'm thinking procmail to make logs of automated email, the rest forwarded. :-) [14:31:42] PROBLEM Total processes is now: CRITICAL on deployment-cache-upload-test.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [14:32:22] PROBLEM dpkg-check is now: CRITICAL on deployment-cache-upload-test.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [14:34:55] Coren: What automated email are you thinking of? [14:35:48] scfc_de`: Cron, for one. Possibly things like (hey, your job just went BOOM!) also. [14:38:51] Coren: That's one of the things I definitely want mail for :-). I absolutely hate about SGE that the job output is saved on disk. The nicety of cron is that the output is delivered "to your doorstep", so you don't have to regularly check some log. [14:42:05] Coren: BTW, did the OGS build work? [14:43:12] scfc_de`: Some breakage caused by braindead dependencies. I'm currently porting some debian patches to gridengine. [14:48:13] do we have some documentation about puppetizing stuff? [14:48:17] Coren: gridengine = Ubuntu's gridengine-* package of SGE?! (Even though I have no clue how it is packaged ATM on Ubuntu or Debian.) [14:48:44] scfc_de`: Well, Debian's but yeah. [14:49:12] Platonides: https://labsconsole.wikimedia.org/wiki/Help:Self-hosted_puppetmaster [14:50:07] wtf, it creates a puppet class automatically? [14:50:14] how can it do so? [14:50:27] What do you mean? [14:51:17] (I haven't actually tried puppetmaster::self myself.) [14:51:21] oh, "You now have a local puppet repo" [14:51:37] I understood that as you now have a copy of your system in puppet [14:51:50] That would be nice :-). [14:52:13] that "Force another puppet run, and behold! " didn't imply that it did everything [14:52:51] (What also would be nice: A parameter to puppetmaster::self that is a link to a Git repository/branch so that the back and forth between instance and repo is easier.)
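A sketch of the procmail split Coren floats at [14:30:32] (file automated mail locally, forward the rest); the mbox name and forwarding address are placeholders:

    # ~/.procmailrc -- hypothetical
    # append cron reports to a local mbox (trailing colon takes a lock)
    :0:
    * ^From:.*Cron Daemon
    cron.mbox

    # everything else is forwarded
    :0
    ! someone@example.org

The first recipe catches the automated cron mail; the second forwards whatever remains to a real inbox.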
[14:54:32] PROBLEM Free ram is now: CRITICAL on puppet-dev.pmtpa.wmflabs 10.4.1.85 output: Connection refused by host [14:54:42] PROBLEM dpkg-check is now: CRITICAL on puppet-dev.pmtpa.wmflabs 10.4.1.85 output: Connection refused by host [14:55:12] so I guess I should create modules/webtools, with subfolders and manifests and files [14:55:52] PROBLEM Current Load is now: CRITICAL on puppet-dev.pmtpa.wmflabs 10.4.1.85 output: Connection refused by host [14:56:02] PROBLEM Disk Space is now: CRITICAL on puppet-dev.pmtpa.wmflabs 10.4.1.85 output: Connection refused by host [14:56:02] PROBLEM Total processes is now: CRITICAL on puppet-dev.pmtpa.wmflabs 10.4.1.85 output: Connection refused by host [14:56:11] Yep, that was my idea. It's just 0 and 1s, so later we could move it to "tools" if that becomes necessary. [15:02:12] what's the difference between require and include in puppet? [15:05:06] Feb 20 14:38:25 deployment-cache-upload-test puppet-agent[891]: Did not receive certificate [15:05:12] any idea what "did not receive certificate" is ? [15:05:43] perhaps it is connecting through https? [15:05:48] I guess my lab is broken again [15:05:52] that is a fresh instance though [15:05:56] hashar I believe it does not receive the desired certificate :) [15:06:11] you need to show it more love [15:07:22] every time I create an instance something screws up :( [15:08:12] * hashar gives up [15:08:17] can't create any instance [15:08:18] pff [15:08:52] PROBLEM Current Load is now: CRITICAL on puppet1.pmtpa.wmflabs 10.4.0.251 output: Connection refused by host [15:09:22] hashar: Could that ^^^ be connected? :-) [15:09:28] someone deleted the 000-default symlink from /etc/apache2/sites-enabled :S [15:09:32] PROBLEM Disk Space is now: CRITICAL on puppet1.pmtpa.wmflabs 10.4.0.251 output: Connection refused by host [15:09:40] no wonder why it stopped working [15:10:12] PROBLEM Free ram is now: CRITICAL on puppet1.pmtpa.wmflabs 10.4.0.251 output: Connection refused by host [15:10:27] scfc_de`: maybe [15:10:57] !log bastion puppet1.pmtpa.wmflabs is dead. [15:10:58] Logged the message, Master [15:11:11] I give up for today [15:11:42] PROBLEM Total processes is now: CRITICAL on puppet1.pmtpa.wmflabs 10.4.0.251 output: Connection refused by host [15:12:22] PROBLEM dpkg-check is now: CRITICAL on puppet1.pmtpa.wmflabs 10.4.0.251 output: Connection refused by host [15:28:52] PROBLEM Current Load is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [15:29:32] PROBLEM Disk Space is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [15:30:12] PROBLEM Free ram is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [15:31:42] PROBLEM Total processes is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [15:32:22] PROBLEM dpkg-check is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [15:35:36] hashar, maybe you want to take a look if my manifest looks right? [15:36:15] I could :-D [15:36:24] though it is probably better to have it reviewed directly by someone from ops hehe [15:37:11] I don't see any of them here :) [15:37:21] and you have *some* idea of puppet :P [15:40:05] does the name of the files where you define the classes matter?
[15:40:39] Platonides: in modules yes :-] [15:40:53] /modules/<module>/manifests/<class>.pp [15:40:53] or : [15:41:03] /modules/<module>/manifests/<subdir>/<class>.pp [15:41:18] but can one file contain several classes ? [15:41:30] aka apache::website would be in /modules/apache/manifests/website.pp [15:41:38] grmblbl [15:41:50] my templates above are wrong, but that last example is probably fine [15:42:03] the "apache" class would be : /modules/apache/manifests/init.pp [15:42:25] in a module context, I have no idea whether you can have sub classes in the same file [15:42:32] I guess that might work if you only include it locally [15:42:58] I think I will make another file and play safe for now [15:43:17] Platonides: Do you have a public repo to look at? [15:43:35] what do you mean? [15:45:40] So that not only hashar can view your manifests :-). [15:45:41] it's not hidden [15:45:41] https://gerrit.wikimedia.org/r/50011 [15:45:42] what is the point of asking for review when the manifest does not even lint? :-] [15:45:45] https://integration.mediawiki.org/ci/job/operations-puppet-validate/1729/console [15:46:51] why doesn't it validate? [15:47:02] ah, syntax error [15:47:17] but it's a file [15:47:23] why is it trying to lint it? [15:47:34] if you install puppet on your machine (something like "gem install puppet") , you should be able to lint locally [15:47:38] ah, no [15:48:10] it's the manifest that fails [15:48:15] but I have no idea why [15:48:38] which only shows my ignorance [15:49:42] hmm [15:49:46] ok, a ; at the end of each file [15:50:32] yeah [15:50:33] hmm [15:50:41] Platonides: just put each file in its own {} statement [15:50:54] and there is really no point in enabling the default apache :] [15:51:09] I copied that from changeset 49270 [15:51:16] maybe [15:52:06] Platonides: Is this just testing Puppet? I don't think UserDir will work with this config. [15:52:07] and you probably don't want to use /data/project :D [15:52:18] i am off [15:52:24] * hashar waves [15:52:59] hashar: Au revoir! [15:53:04] bye hashar [15:54:05] ;;-) [15:56:09] scfc_de`, it's not exactly UserDir [15:56:20] but it works so far [15:56:34] in its simplified form [16:01:05] And the tools are executed as the user's account? [16:02:13] not yet [16:02:51] Ah, okay.
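Spelled out, the module layout hashar describes above maps like this (the module and class names are only examples):

    # modules/apache/manifests/init.pp -- defines the top-level class
    class apache {
        package { 'apache2': ensure => present }
    }

    # modules/apache/manifests/website.pp -- autoloaded as apache::website
    class apache::website {
        include apache
    }

Puppet's autoloader resolves apache::website to modules/apache/manifests/website.pp, which is why the file names inside modules matter.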
[16:11:42] RECOVERY Total processes is now: OK on puppet1.pmtpa.wmflabs 10.4.0.251 output: PROCS OK: 84 processes [16:12:22] RECOVERY dpkg-check is now: OK on puppet1.pmtpa.wmflabs 10.4.0.251 output: All packages OK [16:13:52] RECOVERY Current Load is now: OK on puppet1.pmtpa.wmflabs 10.4.0.251 output: OK - load average: 0.23, 0.67, 0.52 [16:14:32] RECOVERY Disk Space is now: OK on puppet1.pmtpa.wmflabs 10.4.0.251 output: DISK OK [16:15:13] RECOVERY Free ram is now: OK on puppet1.pmtpa.wmflabs 10.4.0.251 output: OK: 81% free memory [16:22:52] PROBLEM Current Load is now: WARNING on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: WARNING - load average: 7.08, 7.35, 5.66 [16:23:32] PROBLEM Current Load is now: WARNING on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: WARNING - load average: 7.66, 7.78, 6.22 [16:35:53] PROBLEM Current Load is now: WARNING on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: WARNING - load average: 9.56, 7.55, 6.00 [16:38:33] RECOVERY Free ram is now: OK on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: OK: 24% free memory [16:38:53] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 22% free memory [16:39:13] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [16:40:42] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 27% free memory [16:41:52] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 22% free memory [16:48:43] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 19% free memory [16:56:23] PROBLEM dpkg-check is now: CRITICAL on puppet1.pmtpa.wmflabs 10.4.0.251 output: DPKG CRITICAL dpkg reports broken packages [16:57:13] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [17:06:22] RECOVERY dpkg-check is now: OK on puppet1.pmtpa.wmflabs 10.4.0.251 output: All packages OK [17:06:53] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 13% free memory [17:09:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 17% free memory [17:11:33] PROBLEM Free ram is now: WARNING on nova-precise2.pmtpa.wmflabs 10.4.1.57 output: Warning: 19% free memory [17:29:42] heya, so, the page to sign up for developer access isn't working, it gives me "Incorrect or missing confirmation code. " (This page: https://labsconsole.wikimedia.org/wiki/Special:UserLogin/signup ) [17:35:56] greg-g, have you talked to Ryan Lane? [17:36:18] He doesn't seem to be on IRC at the moment :/ [17:36:35] Krenair: what's his nick? [17:36:46] "Confirmation code"? The one you should receive in your mail? Did you try several times? [17:37:15] I tried a few times, yes :) (thought it might have been captcha confirmation code related, or something of the sort) [17:37:39] Ah yes, there is a captcha too :/ [17:37:40] and I haven't received any email about it, just working through the new tech employee wiki page [17:38:04] you get to that signup page from: https://www.mediawiki.org/wiki/Developer_access [17:42:35] I guess you'll have to wait for someone knowledgeable about this. [17:47:35] Darkdadaah: no worries, no rush [17:47:53] RECOVERY Current Load is now: OK on parsoid-roundtrip6-8core.pmtpa.wmflabs 10.4.0.222 output: OK - load average: 3.53, 3.34, 4.67 [17:52:22] greg-g, still having login trouble?
[17:55:51] andrewbogott: yeah, haven't tried anything new since I first said anything in here :) [17:56:39] greg-g, can you catch me up with what's happened so far? [17:57:39] andrewbogott: yeah, so, I'm just reading the new tech employee docs: https://office.wikimedia.org/wiki/New_tech_employee_orientation [17:57:51] andrewbogott: one of the steps is to request a Developer Account [17:58:23] RECOVERY Current Load is now: OK on parsoid-roundtrip3.pmtpa.wmflabs 10.4.0.62 output: OK - load average: 1.99, 2.89, 4.71 [17:59:29] andrewbogott: and this page: https://www.mediawiki.org/wiki/Developer_access is where I'm brought to the labsconsole account creation page, and there, after filling it out, it gives me the "Incorrect or missing confirmation code" error [18:00:09] greg-g: OK, so, backing up… on that page, under 'Accounts', we're talking about the top two bullet points right? [18:00:50] andrewbogott: right, the first one really (since I don't think I'll need the ops stuff) [18:00:52] RECOVERY Current Load is now: OK on ve-roundtrip2.pmtpa.wmflabs 10.4.0.162 output: OK - load average: 1.65, 2.47, 4.78 [18:01:19] yep, ok. I'm clarifying because the process used to involve 'requesting' but now it's self-serve. In theory. [18:01:28] ah, gotcha, yay theories [18:01:30] OK, so you visited the labsconsole account creation page and filled out the form and... [18:01:47] and then the error message [18:02:14] Ah, ok. Any chance you just got an extra-difficult captcha? [18:02:37] well, I've tried about 3 times, I hope I'm really a human [18:02:53] lemme try again, for completeness's sake [18:03:21] [bz] (UNCONFIRMED - created by: greg.chaki, priority: Unprioritized - normal) [Bug 45202] Personal information posted that needs to be removed - https://bugzilla.wikimedia.org/show_bug.cgi?id=45202 [18:04:08] andrewbogott: same :/ [18:04:37] ok, nothing interesting in /this/ log, lemme look in another [18:04:48] k [18:04:55] thanks, andrewbogott [18:05:53] greg-g, are you in the office perchance? [18:07:16] hm, confirmed in the source that that is indeed a captcha rejection. [18:07:41] (Which is not to say that you aren't human; likely something is going wrong with the validation) [18:07:59] Re: bz, I really wonder what "business disclosure agreements" might have been compromised by that post. Ts, ts, ts. [18:09:23] <^demon|lunch> We don't remove e-mails just because a name appears, really. [18:09:32] <^demon|lunch> Plus, there's zero we can do about 3rd party archives. [18:10:29] Hah [18:10:36] Also, the guy's name is in his e-mail address [18:10:42] There is zero expectation of privacy there [18:11:21] Is his email address 'jowblow@Iusedtobeinthemafia.com'? [18:11:49] <^demon|lunch> He already e-mailed me as list admin, I just hadn't responded yet. [18:12:10] andrewbogott: I am, over by robla's desk [18:12:27] <^demon|lunch> andrewbogott: And no, his e-mail is fname.lname@gmail. [18:12:30] andrewbogott: firstname.lastname@foobar.com is upset that we revealed his name is Firstname Lastname [18:14:26] RoanKattouw: Nitpicking: The post doesn't even say this, and if it did, it wouldn't have been us, but him! :-) [18:15:39] But I try to get my head around why he is trying to purge this message 4.5 years after the fact ... [18:16:04] Yeah... [18:17:07] Perhaps trying to start a new business. Well, whatever. [18:19:32] <^demon|lunch> Marked WONTFIX. [18:19:53] <^demon|lunch> Basically, unless someone included their home address or a phone number, we don't do this. It's hugely disruptive.
[18:20:59] andrewbogott: thanks for the help [18:21:46] [bz] (NEW - created by: Andrew Bogott, priority: Unprioritized - normal) [Bug 45203] Dumb error message on account creation shell name collision - https://bugzilla.wikimedia.org/show_bug.cgi?id=45203 [18:21:59] Argh! [18:22:04] Man that yellow hurts my eyes [18:23:55] <^demon|lunch> Yellow? [18:34:20] ^demon: I assume RoanKattouw means the "NEW" in the bug announce, does your irc client show colors? [18:34:43] <^demon> Nope, and I forget that sometimes :p [18:50:26] Coren: why a database to back the info? [18:50:48] what kind of stuff will go into it? [18:52:01] That's my usual MO. Have management tools used to edit a db (often just a flatfile), and use that db as source to generate config from. The idea is idempotence; if an operation fails just doing the 'regenerate stuff' operation restores to a known state (as opposed to, say, account creation failing to create /some/ things and needing partial fixes) [18:52:58] In this case, the db should just contain: "tool id" "list of maintainers" "other-config-stuff-I-dunno-yet", and be the authoritative source to generate users, groups, wiki pages, directories, etc. [18:53:20] generate users? [18:53:30] we'll manage the users and groups in ldap [18:53:47] the wiki pages themselves will be generated when the users are created [18:53:53] I know, but there would be a script that takes the list of tools, makes a list of users, and syncs up LDAP [18:54:10] RoanKattouw_away so change that [18:54:30] what kind of programmers are you guys, if you can't change the config of a simple bot [18:54:46] This way, the management tools don't need to know how the users are set up on the backend [18:54:57] Makes things MUCH less brittle. [18:55:38] New tool -> just add it to the db. The rest is done backend. [18:56:09] syncs up ldap? [18:56:47] Yep, and would create the directory in /data/projects if it doesn't exist (possibly copying a skeleton), and any other things we add to this in time for the "new project" task. [18:57:12] Like: stuff.d/00sync-users stuff.d/10create-project-dir and so on [18:57:41] Each of those scripts being written to be idempotent. [18:58:13] This is why I like puppet. I'm a big fan of state definitions as opposed to "list of stuff to do" :-) [18:58:50] s/"new project"/"new tool"/ [18:59:23] I'm not sure why we'd write into a database, then into ldap [18:59:24] If one of the steps breaks, fix the breakage and just rerun the "sync everything up" tool. Partial failures are thus gracefully handled. [18:59:47] all of the necessary info will be in ldap and the wiki immediately [18:59:54] I don't see the need for a separate database at all [18:59:59] it's just another thing to fail [19:00:22] the scripts to manage the /data/project directories can pull from ldap [19:00:23] Hm. How flexible is our LDAP schema then? It could very well be "the" database [19:00:35] what information is really needed? [19:00:41] a tool is a user/group [19:00:57] the sudo policy can be injected into ldap immediately when the user/group is created [19:01:18] the tool information can be created immediately when the user/group is created too [19:01:21] At this time? I don't know. A tool is a user/group, but it might also be needed resources, or requirements, or even things like "only instance x" [19:01:23] as a wiki page [19:01:42] Sure, but you don't want to use the wiki page as a source for configuration. [19:02:39] To me, a DB for the tool info is necessary for proper management. Can LDAP be used for this? Sure.
It might even be the best possible place depending on a few things. [19:02:50] I don't see why it is [19:03:00] if we're already using puppet, we should use it for this [19:03:18] those requirements you're mentioning can be puppet vars or classes [19:03:47] Sure, but where do you pull the puppet config /from/? Certainly, you don't intend to have a tool accessible from end-users being able to directly write to puppet config? [19:04:02] why not? [19:04:18] we already do ;) [19:04:28] the puppet config is stored in ldap [19:04:31] ... o_O [19:05:16] hm. maybe we should write an ENC [19:05:30] Wait, wait. I mean, from the labsconsole, project admin clearly can fiddle with the puppet config -- but that's not endusers. [19:05:38] end users can doo [19:05:39] *too [19:05:54] on the manage instances page, there's a "configure" action [19:06:10] also, managing what's available can be done per-project by project admins [19:06:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 155 processes [19:06:23] hm. this wouldn't really be doable per-tool, though [19:06:42] and the puppet config is per instance, not global [19:06:59] of course, there's nothing stopping us from changing how that works with an ENC [19:06:59] Right, and I would have expected we'd be a bit more... miserly with root on instances. [19:07:17] we can control how access works to the puppet config [19:07:45] Alright, I'm open to alternatives; there's much I don't know about the internal SOPs and MOs yet. :-) [19:07:54] for instance, we could let a tool author manage config per tool and only allow them to modify prefix'd variables [19:08:15] let's keep things simple right now [19:08:26] But I do have two objectives: (a) rock solid configuration, (b) proper compartmentalization. :-) [19:08:34] and assume that tools don't need to be configured [19:08:43] or if they do, we can handle it inside of the project [19:09:01] for instance, we could have flat files in /data/project that are config files [19:09:18] then the file/directory permissions would serve well enough [19:09:25] and we wouldn't need a custom interface [19:09:52] creation of tools can honestly be self-service [19:09:58] there's no reason it can't [19:10:03] Oh, I agree. [19:10:09] we can have an interface for that [19:10:13] That's what I am gunning for. [19:10:20] that will create the ldap user/group [19:10:30] and will also create a documentation page [19:10:34] on the wiki [19:10:49] But you'd have that tool create the ldap stuff and the documentation directly? [19:10:54] <^demon> Ryan_Lane: https://gerrit.wikimedia.org/r/#/c/49987/, https://gerrit.wikimedia.org/r/#/c/50029/ and https://gerrit.wikimedia.org/r/#/c/50016/ are all ready to go in. [19:10:58] a script can run that will create the directory in /data/project by pulling the info from ldap [19:11:05] Coren: yes [19:11:18] the script can also put in a basic config file for the tool [19:11:25] See? You have /almost/ the same approach as I do, except you put different things in different places. [19:11:26] it can be yaml [19:11:34] Whereas I say "have exactly one mechanism" [19:11:37] yeah, your approach isn't bad :) [19:11:54] but it's doing it differently from how we're already doing things [19:11:57] The only difference is that I'd have that tool ONLY put the stuff in LDAP, which would be authoritative [19:12:13] And the doc and directory would be created /from/ the info in the LDAP instead.
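A rough illustration of the idempotent stuff.d pattern Coren describes earlier in this exchange; the directory path and step names are hypothetical:

    #!/bin/sh
    # Converge the system toward the authoritative source (LDAP in this discussion).
    # Every numbered step (00sync-users, 10create-project-dir, ...) checks current
    # state before acting, so a partial failure is repaired by simply re-running.
    for step in /usr/local/sbin/stuff.d/[0-9][0-9]*; do
        "$step" || { echo "step $step failed; rerun after fixing" >&2; exit 1; }
    done

Because each step is safe to repeat, "regenerate everything" restores a known state instead of needing the partial fixes Coren warns about.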
[19:12:47] that's fine as well [19:12:55] the documentation can be updated from the script [19:13:09] it can also document how the tool is configured by reading the config file [19:13:19] hello Ryan_Lane :-] [19:13:22] I'm a fan of YAML for configuration that needs to be editable [19:13:30] and of python for the script ;) [19:13:45] and it's also nice if the script is run as a daemon with an init script, rather than via cron [19:14:07] hashar: howdy [19:14:08] I have discovered YAML with OpenStack python tools. That is a nice file format. [19:14:16] <3 yaml [19:14:18] let you do more than .ini [19:14:32] and is way easier to edit than json (*cough* trailing comma, no comments) [19:14:56] so anyway, I had a puppet issue this afternoon on a fresh labs instance [19:15:03] some certificate is not received : http://pastebin.com/3iX8PvW5 [19:15:15] instance is deployment-cache-upload-test2 i-000005e9 [19:15:22] https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=consoleoutput&project=deployment-prep&instanceid=ea0947ba-3b5b-44ad-b3c8-357fddcbaa78&region=pmtpa [19:15:43] I wanted to reboot it but then thought that you might be interested in investigating that issue :-] [19:16:00] hashar: that's a newly created instance? [19:16:03] yup [19:16:13] did it have the same name as another instance? [19:16:15] though I might have rebooted it once [19:16:22] I don't think [19:16:24] ooohhhh [19:16:26] I know the issue [19:16:38] anyone against me submitting my last patchset of E:RSS Version 2.18 ?? https://gerrit.wikimedia.org/r/#/c/3925/ Code Review was done for previous patchsets [19:16:57] Wikinaut: #mediawiki / #wikimedia-dev :-] [19:17:01] by CSteipp & Ori.Livneh [19:17:24] crossposted FROM #mediawiki [19:17:54] hashar: i don't know the EXACT policy for Extension authors and maintainer (like me) [19:18:02] Let me know this... [19:18:04] hashar: fixed [19:18:19] puppet isn't running properly on virt0 [19:18:27] and it didn't update a changed password [19:18:33] ohh [19:18:42] bye [19:19:02] also I was wondering if we could have a puppet run to be forced on instance startup [19:19:27] (ideally a "Run puppet" button in labsconsole would be ideal hehe) [19:19:49] I sometimes find myself breaking an instance because of LDAP configuration being overridden / some DNS conf mistake and such [19:20:17] we have plans on adding a "run puppet" button when we have salt-api up and running [19:20:19] so I end up waiting for puppet to run again [19:20:27] great :-] [19:20:48] are you adding the node entry to site.pp on puppetmaster::self instances? [19:20:53] that'll break ldap [19:20:57] if done improperly [19:21:07] Ryan_Lane: pls. (if you have time), don't forget to set up "my" openid-wiki2 [19:21:18] :-) [19:21:20] Wikinaut: it'll be way quicker if you do it [19:21:28] uh [19:21:28] Ryan_Lane: hop I never do that. I add the classes I want in labsconsole then check them in the instance configuration [19:21:32] @search mediawiki [19:21:32] Results (Found 13): morebots, labs-home-wm, labs-nagios-wm, labs-morebots, gerrit-wm, extension, revision, info, bots, labs-project, openstack-manager, wl, deployment-prep, [19:21:32] how? [19:21:47] I can never find this damn documentation [19:21:58] try google [19:21:58] from the bot anyway [19:22:15] Wikinaut: https://labsconsole.wikimedia.org/wiki/Help:Single_Node_MediaWiki [19:23:02] Ryan_Lane: ty. Hopefully, I can do this w/o further assistance... [19:23:09] let me know if you run into any issues [19:23:27] and you answer ???
[19:23:29] ;-) [19:24:36] yes [19:24:47] Coren: so, let's make a project plan for this [19:24:55] Coren: so that I know we're on the same page for design [19:25:12] RECOVERY Free ram is now: OK on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: OK: 92% free memory [19:25:25] ^^^^^^ good nagios ! [19:26:42] RECOVERY Total processes is now: OK on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: PROCS OK: 90 processes [19:27:03] hashar: Question: given that a patchset x is code-reviewed, and I add further ps x+1, x+2 ... which do not contain substantial changes... [19:27:04] I think I can submit. Can you confirm? [19:27:22] RECOVERY dpkg-check is now: OK on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: All packages OK [19:27:44] Wikinaut: depends on the not substantial changes hehe :-] [19:27:57] hehehe :-) [19:28:00] ty [19:28:13] csteipp answered me the same on the other channel [19:28:17] basically the same [19:28:25] Wikinaut: as I understand it your RSS patch has been reviewed by Ori + Chris [19:28:34] Wikinaut: so that is most probably fine. Make sure to ping sam about it [19:28:36] Yes [19:28:42] Wikinaut: (if that is deployed on wikimedia cluster) [19:28:46] Sam - who? [19:28:52] RECOVERY Current Load is now: OK on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: OK - load average: 0.09, 0.46, 0.37 [19:28:53] Reedy <-- that Sam [19:29:06] Wikinaut: if anything screws up, at worst we will revert the patch in production [19:29:31] Wikinaut: or send some patches to fix it up if it is not too badly broken :-] [19:29:32] RECOVERY Disk Space is now: OK on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: DISK OK [19:29:57] hashar: I tested it over and over, I have this in production, so I am convinced it will not break (unless I and others have overlooked something...) [19:30:08] but one can never be sure [19:30:16] I will ping Reedy [19:31:27] Ryan_Lane: ok that fixed the issue. Thank you! [19:32:56] hashar: great [19:34:27] hashar: FYI, I added Reedy to the reviewers, removed the "previous" reviewers and wrote him a nice comment. [19:34:39] So let's see what he answers. [19:34:43] Wikinaut: that would be fine :-] [19:35:07] hashar: I am happy to have finally learned git. I use it for ALL my private projects, locally, and on all machines [19:35:26] the portable Windows version of git is nice [19:35:30] for special cases.... [19:35:34] Wikinaut: I should have taught that during the Berlin hackathon. I was a bit sick though :/ [19:35:37] pls. don't comment on windows [19:35:52] I don't care about your OS / tools, as long as they are used to produce open source software! [19:35:53] hashar: de rien [19:36:01] * hashar uses a Mac [19:36:02] yep! [19:36:26] uses raspberry, Windows, Linux SLES, openSuse, Debian [19:36:31] but nobody cares [19:48:21] Ryan_Lane: That, design docs are my first order of business [19:48:49] Ryan_Lane: But, between you and me, I have much hatred for yaml. I'll cope. :-) [19:48:54] hahaha [19:48:57] what do you prefer? [19:49:10] For human editable? [19:49:10] it's easy from a user editing perspective [19:49:22] json is terrible for it [19:49:27] ini files are annoying [19:49:31] (and limiting) [19:49:37] json is teh suxx0rs for humans. [19:49:53] yes [19:50:32] I like perl-as-config, since I tend to do perl for scripting. Since you guys play with python, I'll suffer it and deal with yaml. :-) [19:51:58] I never experimented with using python for python config, though.
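The flat per-tool config file floated earlier ("it can be yaml") might look like this; the path and fields are illustrative, not an agreed schema:

    # /data/project/mytool/config.yaml -- hypothetical
    tool: mytool
    maintainers:
      - alice
      - bob

Unlike JSON, YAML takes comments and has no trailing-comma trap, which is exactly the ease-of-editing point made in the exchange above.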
[19:52:11] awwwwwwww perl [19:52:14] err [19:52:15] ewwwww [19:52:26] python for python config works [19:52:31] but yaml is better [19:52:40] Perl, done right, is most excellent. The problem with perl is that it's soooo easy to do it /wrong/. [19:52:52] it's good to use something non-language specific for config [19:52:57] yep [19:54:00] The advantage of using code as config is ... well, you can use code as config (computed values, stuff nabbed from the environment, or from an external datasource) without having to make the tool itself need to know about all of the edge cases. [19:55:00] So your do_stuff.pl doesn't need to be messed with if you've got that one use case where the config value should come from a mysql or something. [19:56:07] But for the simple case, "$foo = [ 'bar', 'baz' ]" is no worse than "foo:\n - bar\n - baz" [19:56:47] Meh. Dif'rent strokes and all that. [19:57:28] Ryan_Lane: I got a few input/output errors on my cvn vm [19:57:28] * Coren has an intense dislike of the concept of significant whitespace, which influences his opinion there. [19:57:36] rm: cannot remove `.bash_history': Input/output error [19:57:36] rm: cannot remove `.bashrc': Input/output error [19:57:44] $ source .bashrc [19:57:44] -bash: .bashrc: Input/output error [19:57:50] ls [19:57:50] ?????????? ? ? ? ? ? .bash_history [19:58:05] Sounds like ldap shiz [19:58:12] You created these files when I was in the office 2 weeks ago, remember? It was right after the gluster issue [19:58:46] It's on cvn-apache2 in /home/kirnkle [19:58:51] /home/krinkle * [20:00:43] Damianz: likely a split brain [20:00:50] Ryan_Lane: ? [20:00:54] gluster [20:01:19] Krinkle: I'll fix it for you [20:01:21] ohh I see - sorry that comment was in the wrong window... trying to fix exchange D: [20:01:22] gimme a little bit [20:01:56] Ryan_Lane: Don't worry about recovering the files (in case you are), I can re-create them in a snap if you can get them to go away. [20:02:13] Krinkle: if that's the case, then I think rm'ing them will work [20:02:39] that's what I tried at first [20:02:41] ^^ [20:02:42] ah [20:02:45] ok. one sec [20:03:05] neither open, edit or rm worked [20:08:06] Hm, small question : are we supposed to update Ubuntu ourselves in the instances ? (I don't mean upgrade) [20:09:08] Krinkle: ok. it should be fixed [20:09:17] Thx [20:16:24] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 149 processes [20:27:00] Ryan_Lane: I think you meant https://labsconsole.wikimedia.org/wiki/Help:Instances (not: Help:Single_node ) can you confirm ? [20:27:24] I followed those instructions to https://labsconsole.wikimedia.org/wiki/Special:NovaInstance .. [20:27:33] .. but I am not allowed to add a new one [20:27:58] I only see three: etherpad [20:27:59] bastion [20:28:02] openid and a submit button [20:28:09] lack of rights ? [20:28:53] Ryan_Lane: change request: pls.
set permissions for me so that I can add a new instance (if this is correct) [20:32:34] RECOVERY Free ram is now: OK on aggregator2.pmtpa.wmflabs 10.4.0.193 output: OK: 92% free memory [20:37:13] RECOVERY Free ram is now: OK on sube.pmtpa.wmflabs 10.4.0.245 output: OK: 22% free memory [20:38:43] RECOVERY Free ram is now: OK on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: OK: 27% free memory [20:39:53] RECOVERY Free ram is now: OK on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: OK: 21% free memory [20:41:53] RECOVERY Free ram is now: OK on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: OK: 27% free memory [20:50:12] PROBLEM Free ram is now: WARNING on sube.pmtpa.wmflabs 10.4.0.245 output: Warning: 14% free memory [20:55:22] PROBLEM Total processes is now: WARNING on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS WARNING: 153 processes [20:59:40] Wikinaut: Isn't there a listing beneath openid mentioning pmtpa? [21:04:56] PROBLEM Free ram is now: WARNING on conventionextension-trial.pmtpa.wmflabs 10.4.0.165 output: Warning: 13% free memory [21:05:42] [bz] (NEW - created by: Marc A. Pelletier, priority: Highest - enhancement) [Bug 45119] Add per-project service/role user accounts and groups - https://bugzilla.wikimedia.org/show_bug.cgi?id=45119 [21:07:03] scfc_de`: what do you mean exactly ? [21:11:42] PROBLEM Free ram is now: WARNING on mediawiki-bugfix-kozuch.pmtpa.wmflabs 10.4.0.26 output: Warning: 19% free memory [21:13:01] Wikinaut: you have permissions.... [21:13:38] you should see: pmtpa [Toggle, https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=create&project=openid&region=pmtpa] [21:13:53] underneath openid [21:13:57] in manage instances [21:22:52] PROBLEM Free ram is now: WARNING on integration-jobbuilder.pmtpa.wmflabs 10.4.0.21 output: Warning: 17% free memory [21:25:22] RECOVERY Total processes is now: OK on bots-bnr1.pmtpa.wmflabs 10.4.1.68 output: PROCS OK: 148 processes [21:25:35] Ryan_Lane: just as info, I do _not_ see it underneath OpenID in manage instances which is https://labsconsole.wikimedia.org/wiki/Special:NovaInstance [21:25:44] but I use your link [21:26:50] and will use the default settings [21:26:56] Wikinaut: you need to modify your project filter [21:27:00] to add the openid project in [21:27:06] ah, moment [21:27:38] ok! I now do understand. [21:27:50] it was not apparent that the "add" link would then be shown [21:28:00] well, let me now add openid-wiki2 [21:28:04] ... [21:31:25] https://commons.wikimedia.org/wiki/File:Pouso_da_Apollo_11_na_Lua.ogg [21:34:32] PROBLEM Disk Space is now: CRITICAL on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [21:35:13] PROBLEM Free ram is now: CRITICAL on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [21:35:53] PROBLEM Current Load is now: CRITICAL on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [21:36:43] PROBLEM Total processes is now: CRITICAL on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [21:37:23] PROBLEM dpkg-check is now: CRITICAL on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: Connection refused by host [21:37:38] Ryan_Lane: my PuTTY says "Server sent disconnect message. Type 2: Auth failure. Too many authentication failures for user Wikinaut". Have you seen this before ? Do I have to set up my key somewhere ? [21:37:50] on the new instance ?
[21:37:51] Wikinaut: it's probably not finished building [21:37:55] ok [21:37:56] active is misleading [21:38:10] the virtual machine is active, but until puppet finishes running it isn't done [21:38:15] (I thought this too) [21:38:47] (Wikinaut to Wikinaut: Try to be patient. Try harder.) [21:39:24] Ryan_Lane: So I could also set up an instance for testing the bleeding edge E:RSS ? [21:40:02] why not use the current instances for it? [21:40:12] RECOVERY Free ram is now: OK on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: OK: 90% free memory [21:40:16] they are development instances, that's kind of the idea :) [21:40:52] RECOVERY Current Load is now: OK on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: OK - load average: 0.81, 0.97, 0.55 [21:40:53] Last week you told me not to set up Etherpad-Lite [21:41:05] yes, because it's an openid project [21:41:08] but I also wanted to test my E:EtherpadLite [21:41:11] oh [21:41:13] you said rss [21:41:19] I read that incorrectly [21:41:36] and Etherpad -- separately, or together. As you like [21:41:42] RECOVERY Total processes is now: OK on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: PROCS OK: 84 processes [21:41:52] From my view ONE instance is enough [21:41:56] Wikinaut: you should be able to log into that instance now [21:41:57] for: [21:42:12] E:RSS + E:EtherpadLite + E:OpenID [21:42:22] RECOVERY dpkg-check is now: OK on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: All packages OK [21:44:32] RECOVERY Disk Space is now: OK on openid-wiki2.pmtpa.wmflabs 10.4.1.80 output: DISK OK [21:54:14] :-( Authenticating with public key -- Putty closes. [21:54:39] key works on the other instance (openid-wiki) [21:55:35] Wikinaut: Look up; is there a "can't create /home/you" message /above/ the motd? [21:59:51] Wikinaut: And what does the console log under https://labsconsole.wikimedia.org/wiki/Special:NovaInstance say? Has the setup of the instance including keys finished? [22:24:43] doomed again :-] [22:25:11] scfc_de`: it looks so [22:25:37] could use a fix to i-000005e9 instance. It has invalid devices in /etc/fstab ( sda3 / sdb3 ) :( [22:25:41] I'll try a reboot [22:25:49] oops I stop [22:26:03] PROBLEM SSH is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused [22:26:20] hashar: your message ^ was not related to my problem. Or ? [22:26:23] PROBLEM dpkg-check is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [22:26:35] Wikinaut: unrelated [22:26:38] ok [22:26:44] I'll try a reboot [22:27:00] basically need someone with console to fix up my instance :( [22:27:00] PROBLEM Current Load is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [22:27:02] or I can be lazy and create a new one [22:27:33] PROBLEM Disk Space is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused by host [22:27:35] bah can't even create an instance [22:28:11] hashar, do you happen to have PHPUnit/Framework.php in your phpunit install? [22:28:23] I suspect this file was deleted in newer versions [22:28:23] PROBLEM Free ram is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: CHECK_NRPE: Socket timeout after 10 seconds.
[22:28:28] Platonides: maybe :-] [22:29:04] Platonides: that is a directory with v3.7.10 [22:29:43] PROBLEM Total processes is now: CRITICAL on deployment-cache-upload-test2.pmtpa.wmflabs 10.4.1.55 output: Connection refused or timed out [22:30:12] PROBLEM host: deployment-cache-upload-test2.pmtpa.wmflabs is DOWN address: 10.4.1.55 CRITICAL - Host Unreachable (10.4.1.55) [22:30:22] look at the top of tests/selenium/installer/MediaWikiInstallerTestSuite.php: [22:30:26] require_once 'PHPUnit/Framework.php'; [22:30:26] require_once 'PHPUnit/Framework/TestSuite.php'; [22:30:40] cannot login to my new instance :-( [22:30:40] in a conf call can't really look at that [22:30:46] I suspect we should remove the first one [22:30:47] AFAIK nobody uses the selenium files [22:31:09] so we might want to remove them all, I can't remember though why they are still there. [22:31:45] the file was deleted in 2010 [22:35:30] hashar: for my new instance, do I have to add something ? my key ? or is this done automatically ? [22:35:56] Wikinaut: should be automatically done by puppet [22:36:03] though sometimes that does not work :-] [22:36:22] you can look at the instance console to find out what it is doin [22:36:23] g [22:41:43] Wikinaut: ah. I know the issue [22:42:01] I stopped the script that allows filesystem access [22:42:21] I do know the issue. Wiat [22:42:23] wait [22:42:26] wait [22:42:28] wait [22:43:50] no. It is YOUR turn. Can send you a screenshot, but do not want to upload to imgur [22:44:02] eh? [22:44:02] Creating directory '/home/wikinaut'. [22:44:04] Unable to create and initialize directory '/home/wikinaut'. [22:44:07] right [22:44:07] Wikinaut: send it to commons ? :D [22:44:18] no need for anyone to send any screenshots [22:44:20] Ryan_Lane: I CAN login to bastion [22:44:31] as I just said, I know what the issue is [22:44:34] When I then login to instance, this works but then the error comes [22:44:34] and the cause [22:44:35] yes [22:44:37] yes [22:44:39] yes [22:44:40] and I'm fixing it [22:44:51] "yes, please" [22:45:48] hashar: Ryan_Lane: the other reason (why my PuTTY) closes down is on my side: the very first connect is prompting for the server key fingerprint (you know this) [22:46:03] and one has to confirm the fingerprint once manually [22:46:10] Ryan_Lane: whenever you can, I could use some cleanup of /etc/fstab on deployment-cache-upload-test i-000005e9 . Some manifest added /dev/sda3 /dev/sdb3 in it :-] [22:46:15] (I need to document this. my job) [22:50:33] [16:55:35] Wikinaut: Look up; is there a "can't create /home/you" message /above/ the motd? <-- told ya. :-) [22:50:50] Wikinaut: it's working now [22:50:53] What is motd [22:51:06] Coren: I do not know anything... [22:51:07] I just reenabled the script that managed access [22:51:15] of the day [22:51:18] m ? [22:51:20] Message of the Day. The text blurb that reads, right now, "Problems with access..." [22:51:21] motto ? [22:51:23] ah [22:51:26] Now I know [22:51:31] okoko [22:51:54] "The Eagle has landed" [22:52:03] thanks [22:52:23] Ryan_Lane: Will you fix this home-directory creation problem, or shall I file a bug ? [22:52:30] I mean, in general [22:52:33] dude. I just said it's fixed [22:52:38] when one creates an instance [22:53:05] the script manages this [22:53:07] so your answer is: "Do not file a bug." [22:53:07] it's enabled [22:53:09] no more problem [22:53:10] yes [22:53:14] bravo [22:54:12] (going back to my new server...)
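For OpenSSH users, the bastion hop and the PuTTY-style "Too many authentication failures" above can both be handled in ~/.ssh/config; the hostnames follow the log, but this is a sketch, not the official access docs:

    # ~/.ssh/config -- sketch; adjust User and key path
    Host bastion
        HostName bastion.wmflabs.org
        User wikinaut

    Host *.pmtpa.wmflabs
        User wikinaut
        ProxyCommand ssh -W %h:%p bastion
        # offer only the named key; an agent offering every loaded key is the
        # usual cause of "Too many authentication failures"
        IdentityFile ~/.ssh/id_rsa
        IdentitiesOnly yes

The first-connection fingerprint prompt Wikinaut notes at [22:45:48] still appears once per host and has to be confirmed manually.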
[23:01:03] PROBLEM host: deployment-cache-upload-test2.pmtpa.wmflabs is DOWN address: 10.4.1.55 CRITICAL - Host Unreachable (10.4.1.55) [23:03:21] Ryan_Lane: could you possibly use the console to fix up the /etc/fstab on deployment-cache-upload-test2.pmtpa.wmflabs ? :D [23:03:32] please -;] [23:04:05] is the next step for me to select some meaningful stuff from https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=a92fd355-ae8d-4a7e-be33-7dfd29e02580&project=openid&region=pmtpa [23:04:07] ? [23:04:32] Wikinaut: follow the instructions here: https://labsconsole.wikimedia.org/wiki/Help:Single_Node_MediaWiki [23:04:36] ty [23:06:28] tl,dr [23:06:31] tl;dr [23:06:53] this seems ok https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=a92fd355-ae8d-4a7e-be33-7dfd29e02580&project=openid&region=pmtpa [23:11:30] It is not clear to me what exactly the FQDN of my instance is (to be set up as hostname see point 4 in https://labsconsole.wikimedia.org/wiki/Help:Single_Node_MediaWiki#How_to_create_a_single-node_MediaWiki_server [23:12:03] s/FQDN/fully qualified hostname/ [23:12:24] wikinaut.pmtpa.wmflabs ? :-] [23:12:46] could be accessed via http://wikinaut.instance-proxy.wmflabs.org/ (granted you have something listening on port 80) [23:12:51] hashar: no [23:13:00] that's not the instance name [23:13:13] oh [23:13:19] it's openid-wiki2.pmtpa.wmflabs [23:13:37] Wikinaut: you should either leave it blank, or use openid-wiki2.instance-proxy.wmflabs.org [23:13:42] depends on how you want to access it [23:13:58] I honestly don't think it matters, though [23:17:14] ok. The "sudo puppetd -tv" can be (please) put beneath the "submit" button on https://labsconsole.wikimedia.org/w/index.php?title=Special:NovaInstance&action=configure&instanceid=a92fd355-ae8d-4a7e-be33-7dfd29e02580&project=openid&region=pmtpa [23:17:56] or better not, because it's dangerous [23:18:04] I withdraw my ad-hoc proposal [23:25:01] Ryan_Lane: would you be able to fix the /etc/fstab or should I ask andrew tomorrow ? ;) [23:31:22] PROBLEM host: deployment-cache-upload-test2.pmtpa.wmflabs is DOWN address: 10.4.1.55 CRITICAL - Host Unreachable (10.4.1.55) [23:33:40] [bz] (NEW - created by: T. Gries, priority: Unprioritized - normal) [Bug 45214] Suggestion: when installing instances, starting puppet runs etc.: ping the developer by mail about the status - https://bugzilla.wikimedia.org/show_bug.cgi?id=45214 [23:33:58] :-) [23:34:35] Notice: Finished catalog run in 845.85 seconds [23:34:37] wikinaut@openid-wiki2:/srv$ ls -l [23:34:39] total 4 [23:34:40] drwxr-xr-x 16 root root 4096 Feb 20 23:30 mediawiki [23:34:41] wikinaut@openid-wiki2:/srv$ [23:34:43] :-) :-) :-) [23:38:39] SOS: /srv/mediawiki is there, but Error 404 when accessing http://openid-wiki2.instance-proxy.wmflabs.org/wiki [23:38:46] what am I doing wrong ? [23:39:02] perhaps apache restart [23:39:03] mom [23:40:38] no, that was not it... [23:41:00] Ryan_Lane: master, something is missing [23:41:12] RECOVERY Total processes is now: OK on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS OK: 144 processes [23:42:33] missing symlinks ??? [client 10.4.1.15] File does not exist: /var/www/wiki [23:43:25] Ryan_Lane: I have read something about it. I think the puppet class should /but did not/ set up the symlinks [23:43:42] bug or feature? [23:43:56] symlinks to /srv/mediawiki [23:44:03] filing a bug.... [23:46:00] [bz] (NEW - created by: T.
Gries, priority: Unprioritized - normal) [Bug 45215] role::mediawiki-install::labs lacks setting up of symlinks from /var/www/wiki to /srv/mediawiki - https://bugzilla.wikimedia.org/show_bug.cgi?id=45215 [23:48:13] Here we also have a problem: in the chat, the links in brackets [Bug xxxxx] are incorrectly linked ! To Mozilla !! [23:48:42] (do I get an award for finding three bugs in two minutes ?) [23:54:12] PROBLEM Total processes is now: WARNING on parsoid-roundtrip4-8core.pmtpa.wmflabs 10.4.0.39 output: PROCS WARNING: 152 processes [23:58:35] it shouldn't need a symlink [23:58:49] it uses aliases [23:58:54] Wikinaut: re-run puppet [23:59:05] fixed [23:59:08] I was impatient [23:59:35] and: I had to change the $wgServer to this .. [23:59:38] ## The protocol and server name to use in fully-qualified URLs [23:59:40] ## $wgServer = "http://openid-wiki2.pmtpa.wmflabs"; [23:59:41] $wgServer = "http://openid-wiki2.instance-proxy.wmflabs.org"; [23:59:49] to get the CSS files