[00:05:11] 3Wikimedia-Labs-Infrastructure: Move LabsDB aliases and NAT to DNS and LabsDB servers - https://phabricator.wikimedia.org/T63897#1007190 (10yuvipanda) I think a simpler solution is to use the -A option of dnsmasq to have it return A records for these entries. We can set that in puppet now (see `dnsmasq-nova.conf... [00:25:19] 3Wikimedia-Labs-Infrastructure: Move LabsDB aliases and NAT to DNS and LabsDB servers - https://phabricator.wikimedia.org/T63897#1007210 (10scfc) That would indeed be *much* simpler :-): ``` -A, --address=//[domain/] Specify an IP address to return for any host... [00:26:24] 3Wikimedia-Labs-Infrastructure: Move LabsDB aliases and NAT to DNS and LabsDB servers - https://phabricator.wikimedia.org/T63897#1007211 (10yuvipanda) We'll just have to replicate how things are set up in curent /etc/hosts (which do *not* use NAT rules, just pick one of labsdb1001-3 as default for a particular w... [01:16:36] 3Wikimedia-Labs-Infrastructure: Move LabsDB aliases and NAT to DNS and LabsDB servers - https://phabricator.wikimedia.org/T63897#1007218 (10scfc) Ah, you mean one `-A` option per wiki! I thought just one `-A /labsdb/10.something`. [01:17:46] 3Wikimedia-Labs-Infrastructure: Move LabsDB aliases and NAT to DNS and LabsDB servers - https://phabricator.wikimedia.org/T63897#1007219 (10yuvipanda) ah, yeah. per wiki I meant. [01:38:49] PROBLEM - Puppet failure on tools-exec-04 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [01:38:55] PROBLEM - Puppet failure on tools-webgrid-04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [01:39:14] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [01:39:50] PROBLEM - Puppet failure on tools-exec-01 is CRITICAL: CRITICAL: 30.00% of data above the critical threshold [0.0] [01:41:28] PROBLEM - Puppet failure on tools-shadow is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [01:41:57] PROBLEM - Puppet failure on tools-exec-cyberbot is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [01:42:08] PROBLEM - Puppet failure on tools-submit is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [01:44:50] PROBLEM - Puppet failure on tools-webproxy is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [01:45:54] PROBLEM - Puppet failure on tools-redis is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [01:46:40] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [01:47:32] PROBLEM - Puppet failure on tools-exec-05 is CRITICAL: CRITICAL: 62.50% of data above the critical threshold [0.0] [01:48:22] PROBLEM - Puppet failure on tools-exec-03 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [01:48:34] PROBLEM - Puppet failure on tools-exec-06 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [01:48:48] PROBLEM - Puppet failure on tools-master is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [01:49:01] PROBLEM - Puppet failure on tools-exec-09 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [01:50:42] PROBLEM - Puppet failure on tools-exec-15 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [01:51:06] PROBLEM - Puppet failure on tools-exec-11 is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0] [01:51:24] PROBLEM - Puppet failure on 
tools-exec-gift is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [01:51:44] PROBLEM - Puppet failure on tools-mail is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [01:51:50] PROBLEM - Puppet failure on tools-exec-wmt is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [01:52:06] PROBLEM - Puppet failure on tools-webgrid-06 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [01:53:00] PROBLEM - Puppet failure on tools-static is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [01:53:12] PROBLEM - Puppet failure on tools-webgrid-05 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [01:53:40] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [01:53:40] PROBLEM - Puppet failure on tools-trusty is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [02:01:45] 3Tool-Labs, operations: Replag on labsdb - https://phabricator.wikimedia.org/T88183#1007226 (10Springle) My fault. I left pt-table-checksum running between sanitarium and labsdb. Normally a painless process except when I stupidly have it monitor the wrong db1069 instance for replag. [02:06:29] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0] [02:07:05] 3Tool-Labs, operations: Replag on labsdb - https://phabricator.wikimedia.org/T88183#1007228 (10Springle) @yuvipanda, godog and I have previously theorized about pushing replag (and all other stats) from the tendril log tables to graphite. Whatever the solution, we should probably also solve T87209 by switching... [02:08:46] 3operations, Labs: MySQL on wikitech keeps dying - https://phabricator.wikimedia.org/T88256#1007230 (10yuvipanda) 3NEW [02:08:53] RECOVERY - Puppet failure on tools-exec-04 is OK: OK: Less than 1.00% above the threshold [0.0] [02:08:53] RECOVERY - Puppet failure on tools-webgrid-04 is OK: OK: Less than 1.00% above the threshold [0.0] [02:09:11] RECOVERY - Puppet failure on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [0.0] [02:09:49] RECOVERY - Puppet failure on tools-webproxy is OK: OK: Less than 1.00% above the threshold [0.0] [02:09:55] RECOVERY - Puppet failure on tools-exec-01 is OK: OK: Less than 1.00% above the threshold [0.0] [02:11:36] RECOVERY - Puppet failure on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [0.0] [02:12:00] RECOVERY - Puppet failure on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [0.0] [02:12:08] RECOVERY - Puppet failure on tools-submit is OK: OK: Less than 1.00% above the threshold [0.0] [02:12:30] RECOVERY - Puppet failure on tools-exec-05 is OK: OK: Less than 1.00% above the threshold [0.0] [02:13:23] RECOVERY - Puppet failure on tools-exec-03 is OK: OK: Less than 1.00% above the threshold [0.0] [02:14:03] RECOVERY - Puppet failure on tools-exec-09 is OK: OK: Less than 1.00% above the threshold [0.0] [02:15:51] RECOVERY - Puppet failure on tools-redis is OK: OK: Less than 1.00% above the threshold [0.0] [02:16:04] RECOVERY - Puppet failure on tools-exec-11 is OK: OK: Less than 1.00% above the threshold [0.0] [02:17:08] RECOVERY - Puppet failure on tools-webgrid-06 is OK: OK: Less than 1.00% above the threshold [0.0] [02:17:56] RECOVERY - Puppet failure on tools-static is OK: OK: Less than 1.00% above the threshold [0.0] [02:18:10] RECOVERY - Puppet failure on tools-webgrid-05 is OK: OK: Less than 1.00% above the 
threshold [0.0] [02:18:36] RECOVERY - Puppet failure on tools-exec-06 is OK: OK: Less than 1.00% above the threshold [0.0] [02:18:40] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0] [02:18:48] RECOVERY - Puppet failure on tools-master is OK: OK: Less than 1.00% above the threshold [0.0] [02:20:42] RECOVERY - Puppet failure on tools-exec-15 is OK: OK: Less than 1.00% above the threshold [0.0] [02:21:24] RECOVERY - Puppet failure on tools-exec-gift is OK: OK: Less than 1.00% above the threshold [0.0] [02:21:38] RECOVERY - Puppet failure on tools-mail is OK: OK: Less than 1.00% above the threshold [0.0] [02:21:52] RECOVERY - Puppet failure on tools-exec-wmt is OK: OK: Less than 1.00% above the threshold [0.0] [02:23:40] RECOVERY - Puppet failure on tools-trusty is OK: OK: Less than 1.00% above the threshold [0.0] [03:36:15] 3operations, Labs: MySQL on wikitech keeps dying - https://phabricator.wikimedia.org/T88256#1007295 (10Springle) virt1000 is way over extended. The kernel OOM killer is choosing mysqld, at different times triggered by memory spikes from java, apache2, and mysqld itself. I reduced -- the unpuppetized -- InnoDB... [03:38:22] can you start xtools [03:41:04] Mjbmr: it’s running... [03:41:14] thanks [03:41:21] I didn’t have to start it :) [03:41:56] hmm [03:44:03] 3operations, Labs: MySQL on wikitech keeps dying - https://phabricator.wikimedia.org/T88256#1007303 (10Springle) Also, not causing outages but indicative of something funny happening recently; heaps of 'novaold' errors: ``` 150202 2:44:02 [ERROR] Cannot find or open table novaold/volumes from the internal data... [06:42:37] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [06:54:38] PROBLEM - Puppet failure on tools-webgrid-02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [06:57:02] PROBLEM - Puppet failure on tools-exec-08 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [07:12:37] RECOVERY - Puppet failure on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [0.0] [07:19:43] RECOVERY - Puppet failure on tools-webgrid-02 is OK: OK: Less than 1.00% above the threshold [0.0] [07:27:00] RECOVERY - Puppet failure on tools-exec-08 is OK: OK: Less than 1.00% above the threshold [0.0] [08:17:58] PROBLEM - Puppet failure on tools-exec-08 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [08:18:48] PROBLEM - Puppet failure on tools-webgrid-01 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [08:19:48] PROBLEM - Puppet failure on tools-exec-04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [08:19:56] PROBLEM - Puppet failure on tools-dev is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [08:19:56] PROBLEM - Puppet failure on tools-webgrid-04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [08:20:08] PROBLEM - Puppet failure on tools-uwsgi-01 is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0] [08:20:10] PROBLEM - Puppet failure on tools-exec-catscan is CRITICAL: CRITICAL: 37.50% of data above the critical threshold [0.0] [08:20:28] PROBLEM - Puppet failure on tools-login is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [08:20:34] PROBLEM - Puppet failure on tools-exec-12 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [08:20:44] PROBLEM - Puppet failure on 
tools-webproxy is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [08:20:53] PROBLEM - Puppet failure on tools-exec-01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0] [08:21:23] PROBLEM - Puppet failure on tools-exec-02 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [08:22:05] PROBLEM - Puppet failure on tools-webgrid-03 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [08:22:33] PROBLEM - Puppet failure on tools-shadow is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [08:29:35] PROBLEM - Puppet failure on tools-exec-06 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [08:29:47] PROBLEM - Puppet failure on tools-master is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [08:30:01] PROBLEM - Puppet failure on tools-exec-09 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [0.0] [08:32:08] PROBLEM - Puppet failure on tools-exec-11 is CRITICAL: CRITICAL: 50.00% of data above the critical threshold [0.0] [08:33:10] PROBLEM - Puppet failure on tools-webgrid-06 is CRITICAL: CRITICAL: 66.67% of data above the critical threshold [0.0] [08:34:24] PROBLEM - Puppet failure on tools-exec-03 is CRITICAL: CRITICAL: 75.00% of data above the critical threshold [0.0] [08:36:54] PROBLEM - Puppet failure on tools-exec-14 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [08:37:40] PROBLEM - Puppet failure on tools-exec-13 is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [08:43:05] PROBLEM - Puppet failure on tools-exec-cyberbot is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [08:43:07] PROBLEM - Puppet failure on tools-submit is CRITICAL: CRITICAL: 55.56% of data above the critical threshold [0.0] [08:43:31] PROBLEM - Puppet failure on tools-exec-05 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [0.0] [08:43:43] PROBLEM - Puppet failure on tools-exec-10 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [08:44:18] hi [08:45:58] hi Linedwell [08:46:53] PROBLEM - Puppet failure on tools-redis is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0] [08:46:53] I have a question, since the servers reboot, none of my tools (or even a readme in my tool directory) can be displayed, is this "normal" ? [08:48:33] Linedwell: which tools are these? [08:48:45] Linedwell: I started most of the ones that were down, I might've missed some [08:49:01] these are 2 html forms [08:49:11] https://tools.wmflabs.org/linedwell/formulaire.html [08:49:24] and https://tools.wmflabs.org/linedwell/formulaire-person.html [08:50:01] (and even tools.wmflabs.org/linedwell/README can't be found, the page remains blank) [08:50:24] Linedwell: hmm, your webservice seems up, and I'm unsure why it's all getting blank pages [08:50:26] investigating [08:50:37] thanks [08:54:40] RECOVERY - Puppet failure on tools-exec-06 is OK: OK: Less than 1.00% above the threshold [0.0] [08:54:57] Linedwell: back up! [08:55:30] thanks yuvipanda, what was the problem ? (was it something i could have fixed ?) [08:55:53] Linedwell: it was mostly an underlying problem (tools-webgrid-02 has permission issues, it looks like) [08:56:10] oh, ok :D [08:56:12] Linedwell: but I did a webservice2 start, which moved your tool to the newer ubuntu trusty hosts, with newer versions of everything, and it's running great now [08:56:31] thanks you so much :) [08:56:34] yw! 
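The T63897 thread at the top of this log suggests replacing the NAT rules with one dnsmasq `-A`/`--address` entry per wiki alias, set from puppet in `dnsmasq-nova.conf`. A minimal sketch of what such entries might look like; the `<wiki>.labsdb` style names and the IP addresses below are placeholders for illustration, not the actual labsdb1001-3 assignments:
```
# Sketch only: one address= line per wiki alias, returning an A record directly
# instead of a NAT rule. Replace the documentation IPs with whichever of
# labsdb1001-3 each wiki is pinned to.
address=/enwiki.labsdb/192.0.2.1
address=/dewiki.labsdb/192.0.2.2
address=/wikidatawiki.labsdb/192.0.2.3
```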
[08:56:39] have a nice day [08:56:41] o/ [08:56:48] Linedwell: you too! [08:56:50] ty [08:57:06] RECOVERY - Puppet failure on tools-exec-11 is OK: OK: Less than 1.00% above the threshold [0.0] [08:58:06] RECOVERY - Puppet failure on tools-webgrid-06 is OK: OK: Less than 1.00% above the threshold [0.0] [08:58:14] !log tools sudo salt -G 'fqdn:tools-webgrid-*' cmd.run 'sudo chmod 777 /var/run/lighttpd' [08:59:47] RECOVERY - Puppet failure on tools-master is OK: OK: Less than 1.00% above the threshold [0.0] [09:01:47] RECOVERY - Puppet failure on tools-exec-14 is OK: OK: Less than 1.00% above the threshold [0.0] [09:02:39] RECOVERY - Puppet failure on tools-exec-13 is OK: OK: Less than 1.00% above the threshold [0.0] [09:02:59] RECOVERY - Puppet failure on tools-exec-08 is OK: OK: Less than 1.00% above the threshold [0.0] [09:03:45] RECOVERY - Puppet failure on tools-webgrid-01 is OK: OK: Less than 1.00% above the threshold [0.0] [09:03:48] 3Tool-Labs: Document connecting to labsdb from outside of labs - https://phabricator.wikimedia.org/T85294#1007793 (10yuvipanda) 5Invalid>3Open [09:04:12] 3Tool-Labs: Document connecting to labsdb from outside of labs - https://phabricator.wikimedia.org/T85294#943483 (10yuvipanda) a:5yuvipanda>3None Sorry to have ignored this for so long. Edited and re-opened. [09:04:37] PROBLEM - Free space - all mounts on tools-webproxy is CRITICAL: CRITICAL: tools.tools-webproxy.diskspace._var.byte_percentfree.value (<11.11%) [09:04:55] RECOVERY - Puppet failure on tools-dev is OK: OK: Less than 1.00% above the threshold [0.0] [09:05:11] RECOVERY - Puppet failure on tools-uwsgi-01 is OK: OK: Less than 1.00% above the threshold [0.0] [09:05:18] RECOVERY - Puppet failure on tools-exec-catscan is OK: OK: Less than 1.00% above the threshold [0.0] [09:05:30] RECOVERY - Puppet failure on tools-login is OK: OK: Less than 1.00% above the threshold [0.0] [09:05:34] RECOVERY - Puppet failure on tools-exec-12 is OK: OK: Less than 1.00% above the threshold [0.0] [09:06:27] RECOVERY - Puppet failure on tools-exec-02 is OK: OK: Less than 1.00% above the threshold [0.0] [09:07:05] RECOVERY - Puppet failure on tools-webgrid-03 is OK: OK: Less than 1.00% above the threshold [0.0] [09:07:29] RECOVERY - Puppet failure on tools-shadow is OK: OK: Less than 1.00% above the threshold [0.0] [09:08:07] RECOVERY - Puppet failure on tools-submit is OK: OK: Less than 1.00% above the threshold [0.0] [09:09:53] RECOVERY - Puppet failure on tools-webgrid-04 is OK: OK: Less than 1.00% above the threshold [0.0] [09:09:53] RECOVERY - Puppet failure on tools-exec-04 is OK: OK: Less than 1.00% above the threshold [0.0] [09:10:45] RECOVERY - Puppet failure on tools-webproxy is OK: OK: Less than 1.00% above the threshold [0.0] [09:10:53] RECOVERY - Puppet failure on tools-exec-01 is OK: OK: Less than 1.00% above the threshold [0.0] [09:11:50] RECOVERY - Puppet failure on tools-redis is OK: OK: Less than 1.00% above the threshold [0.0] [09:12:59] RECOVERY - Puppet failure on tools-exec-cyberbot is OK: OK: Less than 1.00% above the threshold [0.0] [09:13:32] RECOVERY - Puppet failure on tools-exec-05 is OK: OK: Less than 1.00% above the threshold [0.0] [09:13:40] RECOVERY - Puppet failure on tools-exec-10 is OK: OK: Less than 1.00% above the threshold [0.0] [09:14:25] RECOVERY - Puppet failure on tools-exec-03 is OK: OK: Less than 1.00% above the threshold [0.0] [09:19:59] RECOVERY - Puppet failure on tools-exec-09 is OK: OK: Less than 1.00% above the threshold [0.0] [09:24:37] RECOVERY - Free space - all 
mounts on tools-webproxy is OK: OK: All targets OK [11:34:16] 3Labs: Labs available in the new data centre (with Neutron/IPv6) - https://phabricator.wikimedia.org/T85609#1008038 (10mark) p:5High>3Normal We've deprioritized this, and it's no longer a quarterly goal for Ops. There appears to be very little demand for having Labs available in codfw at this time, and even... [12:52:18] !log shinken started shinken-server-01 manually from virt1000 [12:52:23] Logged the message, Master [13:01:39] hey yuvipanda : time to take a look at https://phabricator.wikimedia.org/T88215 ? [13:01:46] hey tonythomas [13:01:50] hey :) [13:02:01] tonythomas: I looked at it, but I've no understanding of our mail stack at all, sadly :( [13:02:10] tonythomas: I guess jeff_green is your man? [13:02:23] tonythomas: also, do you have a way to verify that it works? [13:02:26] yuvipanda: he is offline as of now :( but that day too - he was telling of meeting you/mark [13:02:46] if it works - our dig command should give the IP of deployment-mx [13:02:58] tonythomas: aaah, that requires DNS changes, doesn't it? [13:03:05] of course ! [13:03:05] you mean MX records? [13:03:14] right. I'm not fully sure how to do that. [13:03:47] yeah. mx records - like dig +short -t mx wikimedia.org gives polonium [13:03:49] tonythomas: but tools has an MX record set [13:03:53] tonythomas: so it is possible [13:03:57] yuvipanda: yay :) [13:04:18] thats what we were looking for. something for dig +short -t mx beta.wmflabs.org too [13:04:46] yuvipanda: can you dig that patch which added that mx records ? [13:04:56] tonythomas: yeah, am trying to [13:06:21] okey ! [13:06:41] tonythomas: it's not in operations/puppet [13:06:43] let me try dns [13:07:00] yup ! [13:10:00] tonythomas: bah, can't find it there either [13:10:11] hmm. That's interesting ! [13:16:08] tonythomas: it might be handhacked in, I wouldn't be surprised if that was the case... [13:16:58] tonythomas: aha! [13:17:00] tonythomas: it's in LDAP! [13:17:04] tonythomas: let me dig a bi tmore [13:17:06] *dig [13:18:55] tonythomas: which domain did you want the mx record to be in? [13:18:58] tonythomas: beta.wmflabs.org? [13:19:32] back ( got into sth ) [13:19:33] yup ! [13:19:37] beta.wmflabs.org [13:19:58] tonythomas: alright, let's do this! is there a bug for this already somewhere? [13:20:07] yup ! digging in a min m [13:20:12] https://phabricator.wikimedia.org/T88215 [13:20:33] we should be adding a new file under operations/dns/templates/ ? [13:21:15] it is beta.wmflabs.org after https://phabricator.wikimedia.org/T88204 got fixed [13:22:04] tonythomas: nope. I need to make an entry in LDAP from terbium [13:22:16] yay ! :) would be great ! [13:22:32] * tonythomas should learn more of LDAP [13:23:36] tonythomas: does deployment-mx already have a public IP? [13:23:51] yuvipanda: its a labs instance as I know. let me check again [13:24:03] it does! [13:24:06] https://wikitech.wikimedia.org/w/api.php?action=query&list=novainstances&niproject=deployment-prep&niregion=eqiad&format=json [13:24:58] great ! [13:25:15] 10.68.17.78 [13:25:27] that's the internal one [13:25:34] 208.80.155.193 ! yup [13:30:42] tonythomas: see comment on https://phabricator.wikimedia.org/T88215 [13:30:52] * tonythomas checks [13:31:50] yuvipanda: yay ! [13:31:53] thats great !~ [13:31:55] and mx.beta.wmflabs.org. ? [13:31:55] tonythomas: :) [13:32:01] tonythomas: yeah, that resolves to deployment-mx [13:32:03] that would redirect to deployment-mxc ? 
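Once the MX record discussed above is live, it can be checked from anywhere with the same `dig` incantation used for wikimedia.org earlier in the conversation. A sketch; the priority value and whether the A lookup returns deployment-mx's public address (208.80.155.193) or its internal one (10.68.17.78) are assumptions, not captured output:
```
$ dig +short -t mx beta.wmflabs.org
10 mx.beta.wmflabs.org.
$ dig +short mx.beta.wmflabs.org
208.80.155.193
```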
[13:32:12] awesome ;)) [13:32:35] tonythomas: A records can be managed from https://wikitech.wikimedia.org/wiki/Special:NovaAddress [13:33:02] I don't have access to deployment-prep, I guess [13:33:07] oh [13:33:08] right [13:33:17] tonythomas: anyway, yay MX records :) [13:33:43] of course. now we have one more thing to go to test the entire setup [13:33:54] yuvipanda: this one https://gerrit.wikimedia.org/r/#/c/186938/ [13:34:05] tonythomas: can you test / verify it after I merge? [13:34:06] should be a one line change. someone with +2 should do that [13:34:10] of course [13:34:26] err. someone should log into deployment-beta and do an sql query [13:34:34] I do not have access there, I guess [13:34:44] what sql query? [13:34:57] select * from bounce-records [13:35:05] you just want the output of that? [13:35:06] or something like that. wait let me create a bounce first [13:35:18] * tonythomas is creating a fake user with a bad email id [13:35:34] tonythomas: do you want me to merge your gerrit patch first? [13:35:34] http://deployment.wikimedia.beta.wmflabs.org/ should be creating the bounce [13:35:58] yuvipanda: of course :) [13:36:36] yuvipanda: yay ! [13:36:42] let's wait : ) [13:37:40] should be time to create a bounce. [13:37:54] or time for exim to reload in deployment-mx ? [13:37:54] !log deployment-prep added mx record to beta.wmflabs.org, for https://phabricator.wikimedia.org/T88215 via LDAP [13:37:57] Logged the message, Master [13:38:13] tonythomas: do you not have access to deployment-prep? [13:38:25] yuvipanda: let me check. I don't have sudo afaik [13:38:31] ah, hmm [13:38:44] yeah, you don't have sudo [13:39:04] yeah :( at least got access ! [13:39:27] tonythomas: yeah, we should fix that soon. I'll bring it up with greg-g at some point, I think [13:39:34] the reason we took away sudo is no longer valid... [13:39:46] tonythomas: anyway, puppet run on deployment-mx complete, exim4 restarted [13:39:55] thanks. btw. do you know the hostname of http://deployment.wikimedia.beta.wmflabs.org/ ? [13:40:06] we would want to manually email-verify the fake user [13:40:14] you know - otherwise you won't be able to send an email [13:40:18] tonythomas: all the websites go through the varnish hosts, and hit the same mediawiki backends. [13:40:29] which use the same database machines [13:40:35] so each website isn't on a different host [13:40:49] yuvipanda: but - I should be able to get into eval.php of some instance, right ? and verify the user email id ? [13:41:12] tonythomas: yeah, you should be able to do that from deployment-bastion, I think? [13:41:28] tonythomas: you can also use the 'sql' command from deployment-bastion to connect to the mysql database directly. [13:41:35] and then 'use deploymentwiki;' for that particular wiki [13:42:36] yuvipanda: okay. trying that one now. [13:42:44] tonythomas: cool. [13:43:25] yuvipanda: actually - I was planning to confirm the email in a different way [13:43:33] hmm. I have written that somewhere [13:43:47] hmm, note that I actually have no clue what all of this is doing :) [13:44:00] yeah. like this https://tttwrites.wordpress.com/2014/10/23/manually-authenticating-a-mediawiki-user-e-mail-id/ [13:44:34] tonythomas: ah, nice. [13:44:41] tonythomas: you should be able to do the same on deployment-bastion [13:44:54] yuvipanda: oh. in that case. Let me try [13:46:36] yuvipanda@deployment-bastion:/srv/mediawiki-staging/php-master/maintenance$ mwscript eval.php --wiki deploymentwiki [13:46:38] tonythomas: ^ [13:47:00] ah. 

I was exactly looking where that one was [13:47:47] loads of undefined variable exceptions here though [13:47:49] you getting that ? [13:47:51] tonythomas: yup, yup [13:47:55] ignore and plow ahead, I’d say [13:48:03] of course. creating wrong user [13:48:16] heh [13:48:19] be careeefuulllll [13:48:19] :) [13:48:25] of course :) [13:48:41] oops looks like there is an account called verptestuser [13:48:49] maybe we would be able to use that one [13:50:00] sending mail now [13:50:37] email sent ! [13:50:54] yuvipanda: can you get me the exim logs of deployment-mx ? ( just to confirm ) [13:51:03] in /var/logs/exim4/mainlog [13:51:06] in deployment-mx ? [13:51:13] ( needs sudo :\ ) [13:51:46] tonythomas: https://phabricator.wikimedia.org/P251 [13:51:48] * /var/log/exim4/mainlog [13:52:09] yaaaaaay ! [13:52:13] looks very good as of njow [13:52:20] R=mw_verp_api T=mwverpbounceprocessor [13:52:28] :D [13:52:30] it went straight back to the VERP router [13:52:35] now time to check db [13:52:44] * tonythomas crosses fingers [13:53:35] yaayyyyyy! [13:53:36] its there [13:53:40] mysql> select * from bounce_records; [13:53:46] its showing the bounce [13:53:49] \o/ [13:53:56] to confirm. sending an email again [13:54:10] yup. 2 now [13:54:31] yuvipanda: thanks. bounce handling for beta is done :) [13:54:39] tonythomas: \o/ great! [13:54:43] tonythomas: you should send out an email :) [13:54:59] yuvipanda: one more fake one ? [13:55:09] tonythomas: no, I mean, to wikitech-l or to RelEng :) [13:55:11] mailing lists [13:55:22] oh. true. will do in a while [13:55:50] tonythomas: :) [13:56:00] you could also help make https://en.wikipedia.org/wiki/Variable_envelope_return_path better ;) [13:56:26] ah. the technology there looks ancient [14:08:15] tonythomas: yup, yup :) [14:09:49] mailed to wikitech-I : ) [14:09:56] tonythomas: cool :) [16:50:41] 3Labs: Dedicated hardware for wikitech web server - https://phabricator.wikimedia.org/T88294#1008469 (10Andrew) 3NEW a:3Andrew [16:51:32] 3hardware-requests, Labs, operations: Dedicated hardware for wikitech web server - https://phabricator.wikimedia.org/T88294#1008477 (10RobH) [17:06:00] 3hardware-requests, Labs, operations: Dedicated hardware for wikitech web server - https://phabricator.wikimedia.org/T88294#1008506 (10RobH) IRC discussion update: Joe picked out host silver from spares page for this task after we chatted about requirements. I'll setup this system with a public IP address, as... [18:06:05] 3hardware-requests, Labs, operations: eqiad: Dedicated hardware for wikitech web server - silver allocated - https://phabricator.wikimedia.org/T88294#1008636 (10RobH) [18:06:56] 3Labs: Move wikitech web interface to a dedicated server - https://phabricator.wikimedia.org/T88300#1008637 (10Andrew) 3NEW a:3Andrew [18:07:00] 3hardware-requests, Labs, operations: eqiad: Dedicated hardware for wikitech web server - silver allocated - https://phabricator.wikimedia.org/T88294#1008469 (10RobH) 5Open>3Resolved I've deployed silver in row B with a public IP address. It has been installed with basic raid1.cfg (so raid1 with a /srv xfs)... 
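For reference, the database check run above (the 'sql' wrapper on deployment-bastion, then querying the bounce_records table) looks roughly like this; only the table name, the wrapper name, and the `use deploymentwiki;` step come from the conversation, the exact arguments and prompts are assumptions:
```
# On deployment-bastion, as suggested above; a sketch, not captured output.
$ sql deploymentwiki          # or connect first and then: use deploymentwiki;
mysql> SELECT COUNT(*) FROM bounce_records;
mysql> SELECT * FROM bounce_records ORDER BY 1 DESC LIMIT 5;
```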
[18:08:02] 3Labs: Move wikitech web interface to a dedicated server - https://phabricator.wikimedia.org/T88300#1008650 (10RobH) [18:08:04] 3hardware-requests, Labs, operations: eqiad: Dedicated hardware for wikitech web server - silver allocated - https://phabricator.wikimedia.org/T88294#1008651 (10RobH) [18:32:23] 3Labs: New disk partition scheme for labs instances - https://phabricator.wikimedia.org/T87003#1008761 (10yuvipanda) [18:45:25] (03PS1) 10Greg Grossmeier: Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 [18:45:40] (03CR) 10jenkins-bot: [V: 04-1] Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier) [18:45:58] (03PS2) 10Greg Grossmeier: Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 [18:46:18] (03CR) 10jenkins-bot: [V: 04-1] Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier) [18:47:30] greg-g: heh, jenkins hates you [18:48:17] shush [18:48:33] I have no idea why [18:58:39] (03CR) 10Legoktm: Announce Staging bugs to -releng (031 comment) [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier) [18:59:42] 18:46:17 yaml.scanner.ScannerError: while scanning for the next token [18:59:42] 18:46:17 found character '\t' that cannot start any token [19:05:56] (03PS3) 10Legoktm: Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier) [19:06:50] (03CR) 10Legoktm: [C: 032] Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier) [19:07:02] (03Merged) 10jenkins-bot: Announce Staging bugs to -releng [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier) [19:07:44] Cyberpower678: Hi, pirsquared aded me to erwin85 project. Can you make xcontribs.php 775 so that I can write on it, please? [19:08:12] !log tools.wikibugs Updated channels.yaml to: 490e8ba1784e8ef7b04d2f51d2697f1d670d6cb1 Announce Staging bugs to -releng [19:08:17] Logged the message, Master [19:08:41] Is xcontribs.php part of the erwin85 project? [19:09:21] Nemo_bis, ^ [19:13:39] cy yes [19:13:44] Cyberpower678: yes [19:14:27] Then you already have write access. [19:15:00] What's the current permission? [19:15:33] Cyberpower678: it's owned by you, and 755 [19:15:51] owned by me? [19:16:34] Yes [19:16:35] -rw-r--r-- 1 cyberpower678 tools.erwin85 3.8K May 28 2014 xcontribs.php [19:16:40] 3Labs: Wikitech should use shared misc-host mysql - https://phabricator.wikimedia.org/T88311#1008962 (10Andrew) 3NEW [19:16:57] That's wierd. [19:17:17] Can happen. :) [19:17:28] Maybe forgot to change ownership when I migrated the tools initially. [19:17:42] 3Labs: Wikitech should use shared misc-host mysql - https://phabricator.wikimedia.org/T88311#1008974 (10Andrew) [19:17:43] 3Labs: Move wikitech web interface to a dedicated server - https://phabricator.wikimedia.org/T88300#1008973 (10Andrew) [19:18:20] Nemo_bis, take [19:18:41] try that. You should be able chmod it yourself then. I don't have quick access atm. 
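The 'take' trick mentioned just above is the Tool Labs helper for reclaiming files left owned by another project member so they can be chmodded. Roughly, from the tool account; the `become` step, the file path, and the final mode are assumptions about this particular case:
```
# A sketch of the suggested fix, not a transcript of what was actually run.
$ become erwin85                          # switch to the tool account on tools-login
$ take ~/public_html/xcontribs.php        # hand ownership of the file to the tool
$ chmod 775 ~/public_html/xcontribs.php   # group-writable, as requested above
```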
[19:29:13] Cyberpower678: thanks, I knew there was a trick but I forgot which command [20:10:06] 3Tool-Labs-tools-Erwin's-tools: Add Gini coefficient to xcontribs - https://phabricator.wikimedia.org/T87740#1009290 (10Nemo_bis) [[https://tools.wmflabs.org/erwin85/xcontribsG.php?user=Nemo+bis&submit=Submit|Done]], but the result is not sound. The en.wiki article is useless as usual, now reading http://www.jst... [20:12:03] the en.wiki article is useless as usual [20:21:40] 3Wikibugs: Respond to "CTCP SOURCE" and "help" commands - https://phabricator.wikimedia.org/T88070#1009308 (10Aklapper) [21:03:08] 3Wikibugs: Provide more / more detailed CTCP responses. - https://phabricator.wikimedia.org/T88070 (10Legoktm) [21:03:26] 3Wikibugs: Provide more / more detailed CTCP responses. - https://phabricator.wikimedia.org/T88070#1003489 (10Legoktm) [22:01:05] 3Tool-Labs-tools-Erwin's-tools: Add Gini coefficient to xcontribs - https://phabricator.wikimedia.org/T87740#1009665 (10Nemo_bis) 5Open>3Resolved Fixed, I ended up using the Theil index which is easier to interpret and doesn't require me to add an appendix on mathematical assumptions of the formula used (e.g... [22:08:52] 3Labs: New disk partition scheme for labs instances - https://phabricator.wikimedia.org/T87003#1009704 (10mmodell) 5Open>3Resolved a:3mmodell [22:13:31] 3Tool-Labs-tools-Erwin's-tools: Add inequality index to xcontribs - https://phabricator.wikimedia.org/T87740#1009715 (10Nemo_bis) [22:14:28] 3Wikimedia-Labs-wikistats: Add 300 000 wikia wikis to stats table - https://phabricator.wikimedia.org/T38291#1009733 (10Dzahn) ah, yeah, there was just that one single duplicate. not sure how it got in there [22:51:09] (03CR) 10Greg Grossmeier: Announce Staging bugs to -releng (031 comment) [labs/tools/wikibugs2] - 10https://gerrit.wikimedia.org/r/188090 (owner: 10Greg Grossmeier)
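T87740 above closes by switching from the Gini coefficient to the Theil index. For reference, the standard Theil T definition over contribution counts x_1, …, x_N with mean μ (whether xcontribs applies any further normalisation is not stated in the log):
```
T_T = \frac{1}{N}\sum_{i=1}^{N} \frac{x_i}{\mu}\,\ln\frac{x_i}{\mu},
\qquad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad 0 \le T_T \le \ln N .
```
A value of 0 means every user contributed equally; the maximum of ln N is reached when a single user accounts for all contributions.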