[01:00:35] PROBLEM - SSH on tools-exec-1218 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:05:29] RECOVERY - SSH on tools-exec-1218 is OK: SSH OK - OpenSSH_6.6.1p1 Ubuntu-2ubuntu2~wmfprecise2 (protocol 2.0)
[02:00:03] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure: Setup real ssl certs for Beta Cluster using a restricted project - https://phabricator.wikimedia.org/T75919#785524 (Krenair) > we shouldn't wait on that ("Arriving Summer 2015") With summer *2016* fast approaching, I think it's time to scrap thi...
[02:17:06] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure, Operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#2115233 (Krenair) I'm going around a few tasks on this subject trying to merge everything together (this one, T70387, T75919, T...
[02:17:26] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure, Operations: beta: Get SSL certificates for *.{projects}.beta.wmflabs.org - https://phabricator.wikimedia.org/T50501#2115236 (Krenair)
[02:17:48] Labs, Labs-Infrastructure, Beta-Cluster-Infrastructure: Setup real ssl certs for Beta Cluster using a restricted project - https://phabricator.wikimedia.org/T75919#785524 (Krenair) Open>Invalid
[04:39:40] (PS69) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[04:42:38] (CR) Ricordisamoa: "PS69 removes another special case for statements from DragHelper.prototype.isValidDropTarget" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[05:49:16] (PS1) Ricordisamoa: HTMLPurifier: try to forbid some CSS properties [labs/toollabs] - https://gerrit.wikimedia.org/r/277062
[05:51:58] (CR) Ricordisamoa: "Not tested, of course..." [labs/toollabs] - https://gerrit.wikimedia.org/r/277062 (owner: Ricordisamoa)
[08:34:52] PROBLEM - Host tools-bastion-01 is DOWN: CRITICAL - Host Unreachable (10.68.17.228)
[13:08:31] Labs, Puppet: Receiving puppet run failure alert for instance where manual puppet runs complete fine - https://phabricator.wikimedia.org/T129403#2116118 (dschwen) Just got another one ``` Received: from root by maps-wma1.maps.eqiad.wmflabs with local (Exim 4.76) (envelope-from ) id 1af...
[13:09:16] (PS1) Tobias47n9e: Add .gitignore and ignore pycache directory [labs/tools/ptable] - https://gerrit.wikimedia.org/r/277070
[13:14:21] (PS1) Tobias47n9e: Switch the canvas to a div layout [labs/tools/ptable] - https://gerrit.wikimedia.org/r/277072
[13:58:48] Labs, Puppet: Receiving puppet run failure alert for instance where manual puppet runs complete fine - https://phabricator.wikimedia.org/T129403#2104571 (valhallasw) The emails are sent when puppet has not run for 24 hours. Specifically, the code checks the 'last_run' parameter in /var/lib/puppet/state/las...
[14:06:44] Labs, Puppet: Receiving puppet run failure alert for instance where manual puppet runs complete fine - https://phabricator.wikimedia.org/T129403#2116149 (valhallasw) ``` root@maps-wma1:~# bash -x /usr/local/sbin/puppet-run + set -e + touch /var/log/puppet.log + chmod 600 /var/log/puppet.log ++ puppet agent...
[15:24:37] Labs, Labs-Infrastructure, DBA: One row event (at least) was not correctly replicated on trwiki - https://phabricator.wikimedia.org/T129678#2116181 (jcrespo)
[15:28:17] Labs, Labs-Infrastructure, DBA: One row event (at least) was not correctly replicated on trwiki - https://phabricator.wikimedia.org/T129678#2116187 (jcrespo) Replication "is working", obviously -with incorrect data on all labs hosts. The rows are correct on production. This could be related to the fi...
[15:28:59] Labs, Labs-Infrastructure, DBA: One row event (at least) was not correctly replicated on trwiki - https://phabricator.wikimedia.org/T129678#2116189 (jcrespo) p:Triage>High
[15:29:46] Labs, Labs-Infrastructure, DBA: One row event (at least) was not correctly replicated on trwiki (db1069) - https://phabricator.wikimedia.org/T129678#2112542 (jcrespo)
[15:42:33] Labs, Labs-Infrastructure, DBA: One row event (at least) was not correctly replicated on trwiki (db1069) - https://phabricator.wikimedia.org/T129678#2116208 (jcrespo) There is one error on the logs that could be related to both events (the dates match): ``` 160308 18:08:29 [ERROR] Slave SQL: Error 'T...
[16:39:00] hi, does wm phab allow hg, or just git?
[16:43:33] Alchimista: Phabricator does support hg but the Wikimedia installation doesn't seem to have any such repos.
[16:43:41] https://phabricator.wikimedia.org/diffusion/query/34LFY6iG6KL_/#R
[16:44:23] Glaisher, that's my doubt, does the wm installation support hg, or was it removed?
[16:45:02] You should try asking at #wikimedia-devtools
[16:45:33] thanks :D
[16:46:21] I don't think not having any hg repo means that it's disallowed. It could be just that we don't have any Mercurial repos. ;-)
[16:52:36] Labs, Labs-Infrastructure, DBA: One row event (at least) was not correctly replicated on trwiki (db1069) - https://phabricator.wikimedia.org/T129678#2116285 (jcrespo)
[16:52:38] Labs, Labs-Infrastructure, DBA: Data missing in zhwiki on labs replicas - https://phabricator.wikimedia.org/T129432#2116286 (jcrespo)
[16:59:37] Labs, Labs-Infrastructure, DBA: Lost database changes on s2 for 3 hours on labs replicas - https://phabricator.wikimedia.org/T129432#2116288 (jcrespo)
[18:44:34] Labs, Puppet: Receiving puppet run failure alert for instance where manual puppet runs complete fine - https://phabricator.wikimedia.org/T129403#2116323 (dschwen) Thanks, I upgraded puppet. Let's see if that makes me compliant again :-)
[20:00:59] Labs: Allow everyone in ops group in LDAP to login to all Labs instances - https://phabricator.wikimedia.org/T87094#2116406 (Krenair) While poking around in LDAP I found this: ```krenair@bastion-01:~$ ldapsearch -x "(&(ou:dn:=sudoers)(cn:dn:=ops))" # extended LDIF # # LDAPv3 # base (defa...
[22:10:44] Labs, Labs-Infrastructure: invisible-unicorn/dynamicproxy-api should refuse to add backends to another project's domain - https://phabricator.wikimedia.org/T129800#2116584 (AlexMonk-WMF)