[07:57:59] anyone online that can grant a right to a user on the commons beta server?
[07:58:29] need to add JvGent to GWToolset on http://commons.wikimedia.beta.wmflabs.org/
[08:21:39] anyone online that can grant a right to a user on the commons beta server?
[08:21:48] : need to add JvGent to GWToolset on http://commons.wikimedia.beta.wmflabs.org/
[09:44:08] Wikimedia Labs / tools: Update tcl-trf to version 2.1.4-dfsg-3 - https://bugzilla.wikimedia.org/62387#c5 (Andre Klapper) (In reply to Marc A. Pelletier from comment #4) > I'll try to get that tomorrow (Jun 11 2014) Wondering if that happened.
[13:21:08] Wikimedia Labs / Infrastructure: Create a cxserver user in labs LDAP - https://bugzilla.wikimedia.org/66575#c1 (Kartik Mistry) Ping?
[13:32:14] SPF|Cloud: o/
[13:32:23] :p
[13:32:47] hi a930913
[13:32:52] SPF|Cloud: The one who runs CBNG is AWOL :/
[13:33:02] Who is awol
[13:33:51] SPF|Cloud: You wanted to be accepted for the ClueBot NG review interface, right?
[13:34:01] Yes
[13:34:05] (if possible)
[13:34:38] SPF|Cloud: We haven't seen the guy who runs it for quite a while.
[13:35:03] And that person is the only user with the permissions to give access?
[13:36:17] SPF|Cloud: Yes.
[13:36:30] Ah, okay.
[13:37:49] SPF|Cloud: Although if he's not on holiday and we haven't seen him by the end of the summer, in theory a sysadmin can give access to another user.
[13:39:44] a930913: "The guy" = Damianz?
[13:41:40] scfc_de: Yeah.
[13:46:44] I'm trying ssh into a into a bastion instance for the first time and I'm getting channel 0: open failed: administratively prohibited: open failed
[13:46:44] According to The Internet this could be a large number of things and wondering if anyone could help narrow it down? I can ssh into other non bastion machines okay.
[13:48:08] Wikimedia Labs / tools: Update tcl-trf to version 2.1.4-dfsg-3 - https://bugzilla.wikimedia.org/62387#c6 (Tim Landscheidt) I think not, as the error in comment #0 still appears.
[13:48:17] I see the "If you are having access problems, please see: https://wikitech.wikimedia.org/wiki/Access#Accessing_public_and_private_instances" message but then I get "channel 0: open failed: administratively prohibited: open failed
[13:48:19] ssh_exchange_identification: Connection closed by remote host" :(
[13:51:11] I can also directly ssh to bastion.wmflaps so... maybe a port forwarding problem...
[13:56:55] mvolz: IIRC this can be an indicator that you are trying to access a host that doesn't exist.
[14:07:56] scfc_de: hmm. it is this one: https://wikitech.wikimedia.org/wiki/Nova_Resource:Services
[14:08:31] I assumed the name was services.eqiad.wmflabs ?
[14:14:21] i guess that was naive, how do I find out the hostname?
[14:19:54] mvolz: That project has no instances yet, so you can't log into any of its servers :-).
[14:20:08] (Because they do not exist.)
[14:21:36] oh.
[14:21:55] thanks :)
[14:22:54] Wikimedia Labs / Infrastructure: Create a cxserver user in labs LDAP - https://bugzilla.wikimedia.org/66575#c2 (Andrew Bogott) I'm looking in the production puppet manifests and I can't see where this user is defined there. Would you expect cxserver to exist in prod, or is there a reason we need it in...
[14:29:54] Wikimedia Labs / Infrastructure: Create a cxserver user in labs LDAP - https://bugzilla.wikimedia.org/66575#c3 (Kartik Mistry) See: https://gerrit.wikimedia.org/r/#/c/139095/6/manifests/role/cxserver.pp (in review)
[14:33:23] Wikimedia Labs / Infrastructure: Create a cxserver user in labs LDAP - https://bugzilla.wikimedia.org/66575#c4 (Andrew Bogott) that... creates a user in labs but not in production, right? So my question remains.
[14:35:09] Wikimedia Labs / Infrastructure: Create a cxserver user in labs LDAP - https://bugzilla.wikimedia.org/66575#c5 (Kartik Mistry) Missed that. Just in Beta cluster as of now.
[14:57:00] YuviPanda|brb: ping me when you return?
[15:07:49] Cron Daemon email sent with Korean text broken. Is it known bugor what should I fix?
[15:08:32] *bug or
[15:32:39] Revi: depends on your settings
[15:32:48] hmm.
[15:32:56] charset?
[15:35:53] and console encodings
[15:51:38] Wikimedia Labs / tools: Update tcl-trf to version 2.1.4-dfsg-3 - https://bugzilla.wikimedia.org/62387#c7 (Silke Meyer (WMDE)) Ping Marc-André... any news?
[16:53:38] * YuviPanda waves at andrewbogott
[16:54:06] YuviPanda: hello!
[16:54:38] This is non-urgent, but I'm messing around with http://wikitech-test.wmflabs.org and wondered if you want to take a few minutes to sort out why account creation is failing?
[16:54:49] I've made a bit of progress since I pinged you last, not sure if the next issue is interesting or not
[16:55:06] andrewbogott: sure! what do you mean by 'account creation is failing'?
[16:55:09] in mw itself?
[16:55:09] (Also right now I am eating lunch so not fully present)
[16:55:15] ok
[16:55:22] Yeah, if you create an account on the wiki it fails.
[16:55:28] andrewbogott: good point, I should eat dinner. let me grab something, I'll brb in about 15m
[16:55:34] ok!
[17:50:03] YuviPanda: any idea how I can insert something like $wgAuth = new LdapAuthenticationPlugin(); into the config?
[17:50:17] It's easy to insert wgAuth => 'new LdapAuthenticationPlugin()' but I don't think that will help me :/
[17:50:27] andrewbogott: that should work. what's your use case?
[17:50:33] andrewbogott: do you want to do things to wgAuth?
[17:50:46] andrewbogott: settings => { wgAuth => 'something' }; should work
[17:51:05] Right but the 'something' is inserted in quotes.
[17:51:13] Which, surely php will regard that as a string rather than as a function call.
[17:52:11] andrewbogott: ah, right. there's another way, moment
[17:54:35] andrewbogott: try putting it under settings.d folder?
[17:55:11] Sorry, I don't understand...
[17:55:16] you mean, within the role .pp file?
[17:55:23] andrewbogott: oh damn, wait, this is labs-vagrant, If orgot
[17:55:25] *forgot
[17:55:34] moment
[17:55:39] Instance wikitech-test-frontend
[17:55:42] if you want to poke around.
[17:55:49] (I have an ops meeting starting in 5)
[17:57:18] andrewbogott: /vagrant/settings.d
[17:57:29] andrewbogott: put a .php file there, and it gets loaded.
[17:58:10] andrewbogott: you can conditionally place a file there with your role, I guess.
[17:58:16] That will probably work, I can also probably just have a big run-on string in the manifest.
[17:58:36] But maybe instead I/we should fix the mediawiki module to support things :)
[17:58:45] andrewbogott: :D yeah, that's an option too! :D
[17:59:04] Ideally it would just take "arbitrary string" as one of its args.
[17:59:07] Dunno if that's possible though.
[18:09:45] andrewbogott: it should, though. makes things like arrays easier too.
[19:29:43] !log deployment-prep /var 0% free on deployment-bastion; looking for things to clean-up
[19:29:46] Logged the message, Master
[19:30:09] bd808: is this after adding the biglogs class?
[19:30:32] YuviPanda: No, but deployment-bastion has a second mount for other things
[19:30:37] aah, ok
[19:30:59] /var/log is eating the space though :(
[19:32:55] bd808: logrotate all the things? :)
[19:36:22] !log deployment-prep /var/log/diamond is 787M of 1.2G total logs
[19:36:25] Logged the message, Master
[19:40:32] bd808: did you figure out why?
[19:41:06] paravoid: It looks like it never rotates.
[19:41:32] That may be a problem only in labs or it may be a systemic issue
[19:42:05] probably the latter
[19:42:15] chasemp: ^^
[19:42:46] oh boy nice, okay
[19:42:49] looking into it
[19:43:18] don't you love the rebooted ops team? :)
[19:44:14] paravoid: I do! I liked the old team too, but it's nice to have enough people to fix things
[19:44:20] bd808: can you give me a specific box as an example?
[19:44:47] chasemp: deployment-bastion.eqiad.wmflabs
[19:45:10] ah ok that is a box I thought it was a generic name oops :)
[19:45:34] It ran out of disk on /var; turns out it's mostly due to the /var/log/diamond/diamond.log file
[19:46:07] seems to have bounced me, andrewbogott can ops get into vm's from all projects?
[19:46:36] That project may be weird
[19:46:41] bd808: I know at one point the setting for days max logging was not set right
[19:46:47] so it could be accumlated pre-that-setting
[19:46:51] but would like to see
[19:47:02] look at /etc/diamond/diamond.conf
[19:47:10] does the loggging section set 5 days?
[19:47:53] chasemp: with your root key, probably.
[19:48:02] But I can also just add you to a project.
[19:48:17] is this maybe a weird "bastion" host thing?
[19:48:44] chasemp: Nope. Logging section says to use rotated_file handler, but the handler itself doesn't seem to specify when to rotate
[19:49:06] bd808: can you 'puppet agent --test' ?
[19:49:13] Our puppet config in the beta project looks to not have been synced with upstream for a week or so
[19:49:17] ah
[19:49:30] bd808: hashar put it on its own puppetmaster a while ago, I think
[19:49:48] YuviPanda: I put it on it's own puppet master when we moved to eqiad :)
[19:50:11] So the answer sounds like "update puppet and force a run"
[19:50:11] bd808: ah, right. easy to confuse you and hashar :D
[19:50:29] bd808: https://gerrit.wikimedia.org/r/#/c/138789/
[19:50:57] either I guess need to mimic that in beta or import from prod?
[19:51:17] Thanks chasemp. I'm updating the puppet master now
[19:52:07] unsure of sanity there, but rm'ing that big log file and restarted diamond post that change is probably best option
[19:52:42] It looks like puppet may be jacked up on that host too
[19:53:06] /var/log/puppet.log is full of "Run of Puppet configuration client already in progress; skipping (/var/lib/puppet/state/agent_catalog_run.lock exists)"
[19:53:07] good times all around, how so tho?
[19:53:12] ah
[19:53:44] puppet agent --test -tags diamond
[19:54:00] I think can work even with a lock? used to maybe not now, probably was a bug anyway :)
[19:55:11] !log deployment-prep Truncated /var/log/diamond/diamond.log and restarted diamond on deployment-bastion
[19:55:13] Logged the message, Master
[19:55:25] That will buy me some time to fix other things
[19:55:42] This looks like the diamond.log thing we recently saw on Tools.
[19:55:52] ok, if you still have weirdness with it drop me a line
[19:56:01] scfc_de: same root cause yes, should be resolved tho?
[19:56:16] chasemp: Thanks. Will do.
[19:56:21] scfc_de: did it fill up /var?
[19:57:25] YuviPanda: if I remember yes that was it essentially. scfc_de nuked the file and restarted, but that changeset should prevent the issue
[19:57:33] cool!
[19:57:35] chasemp: No, I deleted diamond.logs and restarted diamonds, but the cause (diamond creating that) isn't "fixed". IMHO, on Labs we could forgo diamond.log.
[19:58:00] yes, it shouldn't be default in labs anyways I think
[19:58:30] it's a matter of catching it from standard in the role once it was moved into standard
[19:58:56] Create only conditionally on $::realm = labs?
[19:58:59] *!=
[19:59:11] scfc_de: I think ori has vowed to -1 any such realm changing? :)
[19:59:19] (or -2)
[19:59:27] * YuviPanda remmebers reading that in one of bd808's Patches
[19:59:49] what's the prefered way to say 'not labs' in a role
[19:59:59] !log deployment-prep Deleted /var/lib/puppet/state/agent_catalog_run.lock on deployment-bastion after verifying that no puppet processes were running
[20:00:01] Logged the message, Master
[20:00:08] chasemp: unsure what bd808 did.
[20:00:14] bbl
[20:00:31] I think it's ok in a role. Ori wants to kill it with fire in modules
[20:00:47] ah yes, makes sense
[20:00:49] ah
[20:00:51] right
[20:01:23] Long term I suppose it should all get handled in heira or something
[20:01:40] * YuviPanda read heira as heroin
[20:07:32] andrewbogott: You changed the gid for the l10nupdate group to 10002 correct?
[20:07:42] bd808: sounds right
[20:08:19] I'm getting an error from puppet about it and some weird looking `id` output
[20:08:35] `id l10nupdate` == "uid=10002(l10nupdate) gid=10002 groups=10060(l10nupdate),10002"
[20:09:10] Puppet is complaining because it can't change the gid from 10060 to 10002
[20:10:06] `groups l10nupdate` == "l10nupdate : groups: cannot find name for group ID 10002"
[20:10:24] bd808: probably my fault, hang on...
[20:14:13] bd808: better?
[20:14:54] andrewbogott: Yeah that looks better in id and groups. I'll try puppet again
[20:16:35] andrewbogott: w00t. clean puppet run too. Thanks
[20:19:40] !log deployment-prep Updated scap to 5adce72; trebuchet reported i-00000237 (deployment-videoscaler01) as not updating, but manual check shows it did sync properly
[20:19:42] Logged the message, Master
[20:22:48] so peeps if diamond is in your way you can just stop it, we won't enforce in labs anymore fyi
[20:25:23] Wikimedia Labs / deployment-prep (beta): Automate updating the puppet checkout - https://bugzilla.wikimedia.org/66683 (Greg Grossmeier) p:Unprio>Normal
[20:25:25] Wikimedia Labs / deployment-prep (beta): Automate updating the puppet checkout - https://bugzilla.wikimedia.org/66683 (Greg Grossmeier) NEW p:Unprio s:normal a:None The puppet checkout on Beta Cluster should be automatic.
[20:29:28] chasemp: The config on deployment-bastion looks good (rotate every day, keep 5) after updating puppet (and fixing a bunch of other unrelated problems)
[20:34:14] !log deployment-prep Puppet disabled on deployment-jobrunner01 since 2014-06-03; No SAL logs explaining why
[20:34:17] Logged the message, Master
[20:35:37] whoa, that's been a while
[20:36:18] !log deployment-prep Enabled puppet on deployment-jobrunner01 and forced a run
[20:36:20] Logged the message, Master
[21:16:30] !log deployment-prep Jenkins beta-scap-eqiad job broken because of missing puppet config on deployment-jobrunner01; needs role::beta::scap_target
[21:16:32] Logged the message, Master
[23:12:52] Wikimedia Labs / tools: Update tcl-trf to version 2.1.4-dfsg-3 - https://bugzilla.wikimedia.org/62387#c8 (Andre Klapper) Coren is on holidays, if I remember correctly. CC'ing Andrew Bogott, as this ticket needs to get fixed this month.
[23:50:57] !log tools Shut down diamond services and removed log files on all hosts
[23:50:59] Logged the message, Master
[23:51:25] scfc_de: assuming diamond is eventually the ganglia replacement, we should, once it is in production, deploy it for tools too
[23:51:34] not make it log as much, probably :)
[23:54:31] Yeah, as said, I would have been fine with not creating diamond.*log* on Labs; but if ops thinks the whole diamond service is unnecessary on Labs, I'm okay with that as well. But monitoring of course is and will be needed :-).
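
The policy confirmed at [20:29:28] ("rotate every day, keep 5") can be illustrated with Python's standard logging module. This is only a sketch of that kind of policy, not Diamond's actual configuration or the Puppet change discussed in the log: the function name, logger name, log format, and temporary file path are all hypothetical.

```python
import logging
import logging.handlers
import os
import tempfile


def build_daily_rotating_logger(log_path, keep=5, name="rotation-sketch"):
    """Return (logger, handler); the file rolls over at midnight and
    at most `keep` rotated files are retained (older ones are deleted)."""
    handler = logging.handlers.TimedRotatingFileHandler(
        log_path,
        when="midnight",   # roll over once per day
        interval=1,
        backupCount=keep,  # without this, rotated files pile up forever
    )
    handler.setFormatter(logging.Formatter("[%(asctime)s] %(message)s"))
    logger = logging.getLogger(name)  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger, handler


if __name__ == "__main__":
    # Hypothetical location; the log above refers to /var/log/diamond/diamond.log.
    tmp_dir = tempfile.mkdtemp()
    logger, handler = build_daily_rotating_logger(os.path.join(tmp_dir, "diamond.log"))
    logger.info("collector run finished")
    handler.close()
    print(os.listdir(tmp_dir))
```

With no size cap and no rotation schedule, a single log file can grow until it fills the partition, which matches the /var symptom reported on deployment-bastion; bounding the number of retained files is what keeps total usage predictable.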