[00:09:45] PROBLEM - Puppet staleness on integration-slave-jessie-1001 is CRITICAL 100.00% of data above the critical threshold [43200.0]
[02:07:13] PROBLEM - Puppet staleness on deployment-elastic07 is CRITICAL 100.00% of data above the critical threshold [43200.0]
[03:42:47] PROBLEM - Free space - all mounts on deployment-bastion is CRITICAL deployment-prep.deployment-bastion.diskspace._var.byte_percentfree (<50.00%)
[05:30:01] PROBLEM - Puppet failure on deployment-memc04 is CRITICAL 20.00% of data above the critical threshold [0.0]
[05:31:31] PROBLEM - Puppet failure on deployment-memc03 is CRITICAL 40.00% of data above the critical threshold [0.0]
[05:34:55] PROBLEM - Puppet failure on deployment-sentry2 is CRITICAL 40.00% of data above the critical threshold [0.0]
[05:35:42] PROBLEM - Puppet failure on deployment-test is CRITICAL 20.00% of data above the critical threshold [0.0]
[05:36:04] PROBLEM - Puppet failure on deployment-mathoid is CRITICAL 44.44% of data above the critical threshold [0.0]
[05:36:38] PROBLEM - Puppet failure on deployment-jobrunner01 is CRITICAL 30.00% of data above the critical threshold [0.0]
[05:37:06] PROBLEM - Puppet failure on deployment-mediawiki01 is CRITICAL 33.33% of data above the critical threshold [0.0]
[05:37:24] PROBLEM - Puppet failure on deployment-elastic08 is CRITICAL 50.00% of data above the critical threshold [0.0]
[05:37:42] PROBLEM - Puppet failure on deployment-kafka02 is CRITICAL 20.00% of data above the critical threshold [0.0]
[05:38:10] PROBLEM - Puppet failure on deployment-apertium01 is CRITICAL 33.33% of data above the critical threshold [0.0]
[05:38:26] PROBLEM - Puppet failure on deployment-memc02 is CRITICAL 40.00% of data above the critical threshold [0.0]
[05:38:44] PROBLEM - Puppet failure on deployment-zookeeper01 is CRITICAL 60.00% of data above the critical threshold [0.0]
[05:39:18] PROBLEM - Puppet failure on deployment-sca02 is CRITICAL 66.67% of data above the critical threshold [0.0]
[05:39:34] PROBLEM - Puppet failure on deployment-redis01 is CRITICAL 60.00% of data above the critical threshold [0.0]
[05:39:36] PROBLEM - Puppet failure on deployment-fluoride is CRITICAL 60.00% of data above the critical threshold [0.0]
[05:40:42] PROBLEM - Puppet failure on deployment-pdf02 is CRITICAL 50.00% of data above the critical threshold [0.0]
[05:40:42] PROBLEM - Puppet failure on deployment-sca01 is CRITICAL 60.00% of data above the critical threshold [0.0]
[05:41:06] PROBLEM - Puppet failure on deployment-zotero01 is CRITICAL 77.78% of data above the critical threshold [0.0]
[05:41:07] PROBLEM - Puppet failure on deployment-upload is CRITICAL 100.00% of data above the critical threshold [0.0]
[05:41:23] PROBLEM - Puppet failure on deployment-stream is CRITICAL 50.00% of data above the critical threshold [0.0]
[05:41:59] PROBLEM - Puppet failure on deployment-db2 is CRITICAL 50.00% of data above the critical threshold [0.0]
[05:42:17] PROBLEM - Puppet failure on deployment-db1 is CRITICAL 33.33% of data above the critical threshold [0.0]
[05:43:05] PROBLEM - Puppet failure on deployment-fluorine is CRITICAL 33.33% of data above the critical threshold [0.0]
[05:43:07] PROBLEM - Puppet failure on deployment-mediawiki02 is CRITICAL 55.56% of data above the critical threshold [0.0]
[05:44:27] PROBLEM - Puppet failure on deployment-bastion is CRITICAL 70.00% of data above the critical threshold [0.0]
[05:47:29] PROBLEM - Puppet failure on deployment-cxserver03 is CRITICAL 22.22% of data above the critical threshold [0.0]
[05:47:35] PROBLEM - Puppet failure on deployment-logstash1 is CRITICAL 40.00% of data above the critical threshold [0.0]
[05:48:19] PROBLEM - Puppet failure on deployment-elastic05 is CRITICAL 60.00% of data above the critical threshold [0.0]
[05:50:00] PROBLEM - Puppet failure on deployment-mediawiki03 is CRITICAL 50.00% of data above the critical threshold [0.0]
[05:52:10] PROBLEM - Puppet failure on deployment-elastic06 is CRITICAL
62.50% of data above the critical threshold [0.0]
[06:06:46] PROBLEM - Puppet failure on deployment-salt is CRITICAL 57.14% of data above the critical threshold [0.0]
[06:12:25] PROBLEM - SSH on deployment-salt is CRITICAL - Socket timeout after 10 seconds
[06:17:13] RECOVERY - SSH on deployment-salt is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.4 (protocol 2.0)
[06:37:46] RECOVERY - Free space - all mounts on deployment-bastion is OK All targets OK
[06:44:02] PROBLEM - Puppet failure on integration-publisher is CRITICAL 30.00% of data above the critical threshold [0.0]
[07:14:02] RECOVERY - Puppet failure on integration-publisher is OK Less than 1.00% above the threshold [0.0]
[07:14:32] do we need process accounting on deployment-bastion? it's using almost 400mb on /var
[07:21:05] also why do we have atop running everywhere?
[08:21:13] Continuous-Integration-Infrastructure, operations, Blocked-on-Operations, Patch-For-Review: Build Debian package ruby-jsduck for Jessie - https://phabricator.wikimedia.org/T95008#1313971 (akosiaris) @Dzahn, I left some comments in https://gerrit.wikimedia.org/r/#/c/213954/ . As far as the "5.3.4-1w...
[08:23:31] Release-Engineering, Wikidata, Wikidata-Page-Banner: Setup Jenkins - https://phabricator.wikimedia.org/T100495#1313988 (Jdlrobson) NEW
[08:58:49] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1314032 (mmodell) I've been playing with releeph a little locally, it's actually closer to being usable than what I had previously thought.
{F169748}
[09:09:22] (CR) 20after4: [C: 2] make-wmf-branch should check that it's running the latest origin/master [tools/release] - https://gerrit.wikimedia.org/r/212757 (https://phabricator.wikimedia.org/T99998) (owner: 20after4)
[09:09:29] (Merged) jenkins-bot: make-wmf-branch should check that it's running the latest origin/master [tools/release] - https://gerrit.wikimedia.org/r/212757 (https://phabricator.wikimedia.org/T99998) (owner: 20after4)
[09:10:01] Release-Engineering, Wikidata, Patch-For-Review: make-wmf-branch should check if it is up to date - https://phabricator.wikimedia.org/T99998#1314053 (mmodell) Open>Resolved
[09:30:10] Release-Engineering, Project-Creators: SWAT Project (Tag) - https://phabricator.wikimedia.org/T99411#1314090 (mmodell) https://phab-01.wmflabs.org/releeph/product/1
[13:30:19] !log All Jenkins slaves are disconnected due to some ssh error. CI is down.
[13:30:29] Logged the message, Master
[13:35:54] !log integration-puppetmaster apparently out of memory
[13:35:58] Logged the message, Master
[13:38:08] !log restarted integration puppetmaster (memory leak)
[13:38:13] Logged the message, Master
[13:43:17] PROBLEM - Puppet failure on integration-slave-trusty-1012 is CRITICAL 50.00% of data above the critical threshold [0.0]
[13:45:15] Continuous-Integration-Infrastructure, operations: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314411 (hashar) NEW
[13:49:05] hashar, so... isn't that like unbreak now?
[13:49:13] yup
[13:49:23] investigating it with moritzm right now
[13:49:53] Continuous-Integration-Infrastructure, operations: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314426 (hashar) p:Triage>Unbreak!
Being investigated with @MoritzMuehlenhoff
[13:53:54] Continuous-Integration-Infrastructure, operations: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314445 (hashar) QChris sent a change for Gerrit which is related. https://gerrit.wikimedia.org/r/#/c/213216/ Turn off sshd MAC and KEX...
[13:58:59] Continuous-Integration-Infrastructure, operations: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314450 (hashar) Applied on hiera page https://wikitech.wikimedia.org/wiki/Hiera:Integration "ssh::server::disable_nist_kex": false...
[14:01:00] Continuous-Integration-Infrastructure, operations: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314456 (hashar) Gerrit had the same issue with T99990
[14:03:18] RECOVERY - Puppet failure on integration-slave-trusty-1012 is OK Less than 1.00% above the threshold [0.0]
[14:08:52] Continuous-Integration-Infrastructure, operations, Patch-For-Review: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314486 (hashar) p:Unbreak!>Normal Issue is fixed by cherry picked the puppet patch https://gerrit.wikimedia.o...
[14:15:30] PROBLEM - Puppet failure on integration-slave-trusty-1017 is CRITICAL 44.44% of data above the critical threshold [0.0]
[14:16:04] Yippee, build fixed!
[14:16:06] Project browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce build #485: FIXED in 58 sec: https://integration.wikimedia.org/ci/job/browsertests-GettingStarted-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/485/
[14:21:10] Beta-Cluster, MediaWiki-extensions-GettingStarted: GettingStarted on Beta Cluster periodically loses its Redis index - https://phabricator.wikimedia.org/T100515#1314525 (Mattflaschen) NEW
[14:21:20] Beta-Cluster, MediaWiki-extensions-GettingStarted: GettingStarted on Beta Cluster periodically loses its Redis index - https://phabricator.wikimedia.org/T100515#1314532 (Mattflaschen) p:Triage>High
[14:22:27] Beta-Cluster, MediaWiki-extensions-GettingStarted: GettingStarted on Beta Cluster periodically loses its Redis index - https://phabricator.wikimedia.org/T100515#1314525 (Mattflaschen) @hashar Do you know if Redis is persistent on Labs? We're using $sessionRedis from session-labs.php.
[14:27:29] !log restarting deployment-salt / some process is 100% wa/IO
[14:27:33] Logged the message, Master
[14:30:21] !log manually rebasing puppet git on deployment-salt (stalled)
[14:30:26] Logged the message, Master
[14:30:31] RECOVERY - Puppet failure on integration-slave-trusty-1017 is OK Less than 1.00% above the threshold [0.0]
[14:32:39] oh boy, lots of beta cluster fun this morning.
[14:33:46] yeah
[14:37:29] Release-Engineering, Phabricator: Next Phabricator upgrade on 2015-05-27 (tentative) - https://phabricator.wikimedia.org/T98451#1314560 (mmodell) Open>Resolved
[14:37:58] Continuous-Integration-Infrastructure, operations: Jenkins jar should ship with a more recent jsch java lib version to support hardened algorithm - https://phabricator.wikimedia.org/T100517#1314566 (hashar) NEW
[14:38:21] Release-Engineering, Phabricator: Next Phabricator upgrade on 2015-05-27 (to 2787ed1343a3d8d94d1d4b729793870f5b0c31b4) - https://phabricator.wikimedia.org/T98451#1314574 (Aklapper)
[14:41:02] good morning thcipriani and twentyafterfour :-}
[14:41:26] hashar: o/ looks like it's been a rough afternoon for you
[14:41:33] RECOVERY - Puppet failure on deployment-memc03 is OK Less than 1.00% above the threshold [0.0]
[14:41:39] thcipriani: nothing out of ordinary :-}
[14:41:45] some ssh algorithms have been changed
[14:41:56] and are not supported by the Jenkins version we are running as a master
[14:42:08] luckily Christian had the same issue with Gerrit !
[14:42:28] go java ssh implementations :)
[14:42:39] they embed some old lib apparently
[14:44:28] RECOVERY - Puppet failure on deployment-bastion is OK Less than 1.00% above the threshold [0.0]
[14:45:02] RECOVERY - Puppet failure on deployment-memc04 is OK Less than 1.00% above the threshold [0.0]
[14:46:06] RECOVERY - Puppet failure on deployment-zotero01 is OK Less than 1.00% above the threshold [0.0]
[14:46:28] RECOVERY - Puppet failure on deployment-stream is OK Less than 1.00% above the threshold [0.0]
[14:46:46] RECOVERY - Puppet failure on deployment-salt is OK Less than 1.00% above the threshold [0.0]
[14:46:58] RECOVERY - Puppet failure on deployment-db2 is OK Less than 1.00% above the threshold [0.0]
[14:47:14] RECOVERY - Puppet failure on deployment-db1 is OK Less than 1.00% above the threshold [0.0]
[14:47:28] RECOVERY - Puppet failure on deployment-cxserver03 is OK Less than 1.00% above the threshold [0.0]
[14:47:36] RECOVERY - Puppet failure on deployment-logstash1 is OK Less than 1.00% above the threshold [0.0]
[14:47:42] RECOVERY - Puppet failure on deployment-kafka02 is OK Less than 1.00% above the threshold [0.0]
[14:48:06] RECOVERY - Puppet failure on deployment-fluorine is OK Less than 1.00% above the threshold [0.0]
[14:48:08] RECOVERY - Puppet failure on deployment-mediawiki02 is OK Less than 1.00% above the threshold [0.0]
[14:48:10] RECOVERY - Puppet failure on deployment-apertium01 is OK Less than 1.00% above the threshold [0.0]
[14:48:20] RECOVERY - Puppet failure on deployment-elastic05 is OK Less than 1.00% above the threshold [0.0]
[14:48:54] Beta-Cluster, Continuous-Integration-Infrastructure: Reenable ssh MAC/KEX hardening on beta cluster and integration labs project - https://phabricator.wikimedia.org/T100518#1314596 (hashar) NEW
[14:49:06] Beta-Cluster, Continuous-Integration-Infrastructure: Reenable ssh MAC/KEX hardening on beta cluster and integration labs project - https://phabricator.wikimedia.org/T100518#1314603
(hashar)
[14:49:07] Continuous-Integration-Infrastructure, operations: Jenkins jar should ship with a more recent jsch java lib version to support hardened algorithm - https://phabricator.wikimedia.org/T100517#1314604 (hashar)
[14:50:00] RECOVERY - Puppet failure on deployment-mediawiki03 is OK Less than 1.00% above the threshold [0.0]
[14:50:19] Continuous-Integration-Infrastructure, operations, Patch-For-Review: Jenkins master / client ssh connection fails due to missing ssh algorithm - https://phabricator.wikimedia.org/T100509#1314605 (hashar) Open>Resolved a:hashar I have filled follow up tasks (T100517 and T100518). Puppet patch...
[14:51:23] Continuous-Integration-Infrastructure, Jenkins: Jenkins jar should ship with a more recent jsch java lib version to support hardened algorithm - https://phabricator.wikimedia.org/T100517#1314611 (hashar)
[14:52:11] RECOVERY - Puppet failure on deployment-elastic06 is OK Less than 1.00% above the threshold [0.0]
[14:52:21] RECOVERY - Puppet failure on deployment-elastic08 is OK Less than 1.00% above the threshold [0.0]
[14:54:17] RECOVERY - Puppet failure on deployment-sca02 is OK Less than 1.00% above the threshold [0.0]
[14:54:39] RECOVERY - Puppet failure on deployment-fluoride is OK Less than 1.00% above the threshold [0.0]
[14:54:39] RECOVERY - Puppet failure on deployment-redis01 is OK Less than 1.00% above the threshold [0.0]
[14:55:42] RECOVERY - Puppet failure on deployment-sca01 is OK Less than 1.00% above the threshold [0.0]
[14:56:02] RECOVERY - Puppet failure on deployment-mathoid is OK Less than 1.00% above the threshold [0.0]
[14:56:08] RECOVERY - Puppet failure on deployment-upload is OK Less than 1.00% above the threshold [0.0]
[14:56:38] RECOVERY - Puppet failure on deployment-jobrunner01 is OK Less than 1.00% above the threshold [0.0]
[14:57:06] RECOVERY - Puppet failure on deployment-mediawiki01 is OK Less than 1.00% above the threshold [0.0]
[14:58:27] RECOVERY - Puppet failure on
deployment-memc02 is OK Less than 1.00% above the threshold [0.0]
[14:58:43] RECOVERY - Puppet failure on deployment-zookeeper01 is OK Less than 1.00% above the threshold [0.0]
[14:59:55] RECOVERY - Puppet failure on deployment-sentry2 is OK Less than 1.00% above the threshold [0.0]
[15:00:07] Deployment-Systems, Release-Engineering: Determine weekly triage meeting for Deployment Systems - https://phabricator.wikimedia.org/T98206#1314632 (thcipriani) Created calendar event starting 6/1/15 Mondays, 10:50am–11:20am PDT. Invited everyone on this ticket.
[15:00:43] RECOVERY - Puppet failure on deployment-pdf02 is OK Less than 1.00% above the threshold [0.0]
[15:02:17] Deployment-Systems, Release-Engineering: Determine weekly triage meeting for Deployment Systems - https://phabricator.wikimedia.org/T98206#1314634 (thcipriani) Open>Resolved
[15:06:24] Continuous-Integration-Infrastructure, Jenkins: Jenkins jar should ship with a more recent jsch java lib version to support hardened algorithm - https://phabricator.wikimedia.org/T100517#1314643 (hashar) The SSH agent plugin depends on https://github.com/jenkinsci/ssh-credentials-plugin which we are runni...
[15:09:04] !log Jenkins slaves are all back up. Root cause was some ssh algorithm in their sshd which is not supported by Jenkins jsch embedded lib.
[15:09:09] Logged the message, Master
[15:20:38] RECOVERY - Puppet failure on deployment-test is OK Less than 1.00% above the threshold [0.0]
[15:50:19] what does !sal do? ;)
[15:50:28] * twentyafterfour tries it
[15:50:31] !sal
[15:50:31] https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[15:50:38] hmm
[15:53:38] just where the server admin log is
[16:36:05] For 1.25 tarballs, have all git branches been created?
Asking because of CirrusSearch issues with 1.25.1 in https://phabricator.wikimedia.org/T100520
[17:52:34] Continuous-Integration-Isolation, operations: Review Jenkins isolation architecture with Antoine - https://phabricator.wikimedia.org/T92324#1315059 (akosiaris)
[18:25:59] (CR) Polybuildr: "recheck" [tools/codesniffer] - https://gerrit.wikimedia.org/r/196880 (owner: Polybuildr)
[18:30:35] (CR) Addshore: "recheck" [tools/codesniffer] - https://gerrit.wikimedia.org/r/196880 (owner: Polybuildr)
[18:37:06] (PS4) Addshore: Change procedure for finding global variables in ValidGlobalNameSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/196880 (owner: Polybuildr)
[18:37:19] (CR) jenkins-bot: [V: -1] Change procedure for finding global variables in ValidGlobalNameSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/196880 (owner: Polybuildr)
[18:37:50] (PS5) Addshore: Change procedure for finding global variables in ValidGlobalNameSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/196880 (owner: Polybuildr)
[18:38:06] Continuous-Integration-Infrastructure: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315211 (demon) a:demon>None Unassigning from myself, I'm on vacation until the end of the month.
[18:38:07] (CR) jenkins-bot: [V: -1] Change procedure for finding global variables in ValidGlobalNameSniff [tools/codesniffer] - https://gerrit.wikimedia.org/r/196880 (owner: Polybuildr)
[18:40:32] (PS1) Polybuildr: Add a comment in a test [tools/codesniffer] - https://gerrit.wikimedia.org/r/214103
[18:40:43] (CR) jenkins-bot: [V: -1] Add a comment in a test [tools/codesniffer] - https://gerrit.wikimedia.org/r/214103 (owner: Polybuildr)
[18:40:54] Continuous-Integration-Infrastructure: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315224 (demon) p:Triage>Unbreak! Also, this needs fixing ASAP so moving to UBN.
[18:46:00] (Abandoned) Polybuildr: Add a comment in a test [tools/codesniffer] - https://gerrit.wikimedia.org/r/214103 (owner: Polybuildr)
[18:47:23] Release-Engineering, MediaWiki-ResourceLoader, Mobile-Web-Sprint-48-Voyage-of-the-Damned: ResourceLoader urls with certain version serving different content on beta labs - https://phabricator.wikimedia.org/T100542#1315239 (Jdlrobson) NEW a:Jdlrobson
[19:23:32] Beta-Cluster, MediaWiki-ResourceLoader, Mobile-Web-Sprint-48-Voyage-of-the-Damned: ResourceLoader urls with certain version serving different content on beta cluster - https://phabricator.wikimedia.org/T100542#1315348 (greg)
[19:40:01] How do I set up tests for jenkins to run for new patches submitted to Gerrit?
[19:40:08] mediawiki/extensions/SmiteSpam being the repository for a new, under-development extension.
[19:54:23] polybuildr: should just need to add some configuration for jenkins job builder in the integration/config repo in gerrit.
[19:54:51] thcipriani: I see. Okay, looking at the repo now.
[19:56:33] cool. shout here if anything in the repo especially obtuse. I'm still pretty new to jjb, but I've made a couple updates :P
[19:56:46] thcipriani: The layout.yaml file I suppose?
[19:58:23] polybuildr: yeah, layout.yaml and mediawiki-extensions.yaml likely
[19:59:29] well, maybe not mediawiki-extensions if your tests are all fairly standard.
[20:01:25] I don't expect them to be anything out of the ordinary, no.
[20:01:51] polybuildr: http://www.mediawiki.org/wiki/Continuous_integration/Tutorials/Adding_a_MediaWiki_extension
[20:02:23] thcipriani: Aha. Documentation! :D Thanks.
[20:45:15] Browser-Tests: Enable Elena and Rummana to run Jenkins jobs - https://phabricator.wikimedia.org/T100172#1315602 (hashar) Their labs accounts are not in the `wmf` LDAP group which is what grants extra permissions. Logged in with a wmf account one can see the groups at: https://integration.wikimedia.org/ci/us...
[20:50:05] Browser-Tests: Please add Elena Tonkovidova labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100560#1315629 (hashar) NEW a:hashar
[20:50:33] Browser-Tests, Ops-Access-Requests, operations: Please add Elena Tonkovidova labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100560#1315641 (hashar)
[21:14:00] Release-Engineering, Wikimedia-Git-or-Gerrit: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315710 (Aklapper)
[21:16:10] Browser-Tests, Ops-Access-Requests, operations: Please add Elena Tonkovidova labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100560#1315727 (Krenair) Why is L3 signature required to be in the wmf group?
[21:16:44] Browser-Tests, Ops-Access-Requests, operations: Please add Rummana Yasmeen labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100559#1315729 (Dzahn) I think just adding a user to the WMF LDAP group is not considered "server access", so unless we create an actual shell account, signing...
[21:19:13] Browser-Tests, Ops-Access-Requests, operations: Please add Rummana Yasmeen labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100559#1315739 (Dzahn) done. since i could confirm that user ryasmee already existed and uses an @wikimedia.org email address that is fine. [terbium:~] $ ldap...
[21:19:40] Release-Engineering, Ops-Access-Requests, Wikimedia-Git-or-Gerrit, operations: Add all Release-Engineering team as Gerrit admins - https://phabricator.wikimedia.org/T100565#1315740 (hashar) NEW
[21:19:56] Browser-Tests: Enable Elena and Rummana to run Jenkins jobs - https://phabricator.wikimedia.org/T100172#1315751 (Dzahn)
[21:19:57] Browser-Tests, Ops-Access-Requests, operations: Please add Rummana Yasmeen labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100559#1315750 (Dzahn) Open>Resolved
[21:22:30] Release-Engineering, Wikimedia-Git-or-Gerrit, Patch-For-Review: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315771 (hashar) So in short, one need to figure out the credentials to push from Gerrit reference repositories to the github mirror...
[21:22:33] Browser-Tests, Ops-Access-Requests, operations: Please add Elena Tonkovidova labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100560#1315774 (Dzahn) done. the user is already registered with an @wikimedia.org which is sufficient to show user is an empoylee and should be added to th...
[21:22:59] Browser-Tests, Ops-Access-Requests, operations: Please add Elena Tonkovidova labs account to LDAP group wmf - https://phabricator.wikimedia.org/T100560#1315776 (Dzahn) Open>Resolved
[21:23:00] Browser-Tests: Enable Elena and Rummana to run Jenkins jobs - https://phabricator.wikimedia.org/T100172#1315777 (Dzahn)
[21:34:51] Release-Engineering: Update wikitech:Deployment and mw.org documentation - https://phabricator.wikimedia.org/T100566#1315795 (greg) NEW a:greg
[21:37:50] Release-Engineering: Changes to ForrestBot for change to MW deploy cadence - https://phabricator.wikimedia.org/T100567#1315804 (greg) NEW a:Legoktm
[21:37:53] Release-Engineering: Update wikitech:Deployment and mw.org documentation - https://phabricator.wikimedia.org/T100566#1315811 (Florian) mw.org [[ https://www.mediawiki.org/w/index.php?title=MediaWiki_1.26/Roadmap&diff=1660446&oldid=1659725 | RL 1.26 Roadmap ]] and [[ https://www.mediawiki.org/w/index.php?title...
[21:38:54] Release-Engineering: Update wikitech:Deployment and mw.org documentation - https://phabricator.wikimedia.org/T100566#1315815 (greg) >>! In T100566#1315811, @Florian wrote: > mw.org [[ https://www.mediawiki.org/w/index.php?title=MediaWiki_1.26/Roadmap&diff=1660446&oldid=1659725 | RL 1.26 Roadmap ]] and [[ http...
[21:39:16] FlorianSW: who wrote forestbot again?
[21:39:24] I mean, "again" as in, I knew/was told, but now I forgot
[21:40:38] greg-g: forestbot? o.O Idk :D
[21:40:59] :)
[21:41:08] either lego or... the name I'm blanking on
[21:41:21] greg-g: do you guys use hhvm in beta?
[21:41:32] greg-g: btw, maybe i missed something, but which bot is forestbot?
[21:41:57] chasemp: yeah
[21:42:07] FlorianSW: https://phabricator.wikimedia.org/p/Forrestbot/
[21:42:29] greg-g: ah, than: no idea, sorry :)
[21:42:35] no problem :)
[21:46:19] greg-g: can you check https://wikitech.wikimedia.org/wiki/Deployments/One_week ?
:)
[21:48:03] FlorianSW: perfect :)
[21:48:07] * greg-g cleans up the old stuff
[21:48:27] aka delete
[21:48:51] :D
[21:48:58] Release-Engineering: Shorten/Simplify MW train deploy cadence to Tu->W->Th - https://phabricator.wikimedia.org/T97553#1315836 (Florian)
[21:48:59] Release-Engineering: Update wikitech:Deployment and mw.org documentation - https://phabricator.wikimedia.org/T100566#1315834 (Florian) Open>Resolved If i haven't forgotten one, that should be resolved with the change of the [[ https://wikitech.wikimedia.org/wiki/Deployments/One_week | One week plan ]].
[21:49:23] Release-Engineering, Wikimedia-Git-or-Gerrit, Patch-For-Review: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315838 (Legoktm) I don't see anything suspicious on https://github.com/orgs/wikimedia/audit-log
[21:55:11] Release-Engineering, Wikimedia-Git-or-Gerrit, Patch-For-Review: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315859 (QChris) I forced replication of `mediawiki/core`, and that seems to have done the trick (see below). Seeing if I can find an...
[22:05:22] Release-Engineering: Release/QA tasks at the Wikimedia Hackathon 2015 - https://phabricator.wikimedia.org/T92565#1315883 (Qgil) Open>Resolved
[22:19:15] Release-Engineering: Hackathon Proposal: Wikimedia Site Requests Sprint - https://phabricator.wikimedia.org/T90468#1315923 (Qgil) Did someone work on this project during #Wikimedia-Hackathon-2015? If so, please update the task with the results. If not, please remove the label.
[22:20:16] Release-Engineering, Wikimedia-Git-or-Gerrit, Patch-For-Review: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1315925 (Foxtrott) This seems to have only recreated the WMF specific tags (e.g. wmf/1.25wmf1). General version tags (e.g. 1.24.2) se...
[22:23:14] Browser-Tests: Write browser tests for DonationInterface - https://phabricator.wikimedia.org/T99955#1315936 (Qgil) Did someone work on this project during #Wikimedia-Hackathon-2015? If so, please update the task with the results. If not, please remove the label.
[22:23:16] Continuous-Integration-Infrastructure: All new extensions should be setup automatically with Zuul - https://phabricator.wikimedia.org/T92909#1315937 (Qgil) Did someone work on this project during #Wikimedia-Hackathon-2015? If so, please update the task with the results. If not, please remove the label.
[22:23:48] Continuous-Integration-Infrastructure, Collaboration-Team, Flow, Mobile-Web, VisualEditor: Create Jenkins builds for Editing across repositories (MobileFrontend, VisualEditor etc) - https://phabricator.wikimedia.org/T90647#1315961 (Qgil) Did someone work on this project during #Wikimedia-Hackat...
[22:23:52] Deployment-Systems, Release-Engineering, HHVM: HHVM RepoAuthoritative Hackathon proof of concept - https://phabricator.wikimedia.org/T91074#1315962 (Qgil) Did someone work on this project during #Wikimedia-Hackathon-2015? If so, please update the task with the results. If not, please remove the label.
[22:24:34] Deployment-Systems, Release-Engineering, HHVM: HHVM RepoAuthoritative Hackathon proof of concept - https://phabricator.wikimedia.org/T91074#1315971 (thcipriani)
[22:34:20] Release-Engineering: Hackathon Proposal: Wikimedia Site Requests Sprint - https://phabricator.wikimedia.org/T90468#1315992 (greg) Open>declined a:greg didn't happen, maybe next time
[22:34:39] Release-Engineering: Organize browsertests/Selenium training - https://phabricator.wikimedia.org/T100170#1315997 (greg)
[22:34:40] Browser-Tests, Hackathon-Mexico-City-2015: Workshop: write the first browsertests/Selenium test - https://phabricator.wikimedia.org/T94024#1315996 (greg) Open>Resolved
[22:35:30] Browser-Tests, Hackathon-Mexico-City-2015: Workshop: Fix broken browsertests/Selenium Jenkins jobs - https://phabricator.wikimedia.org/T94299#1315999 (greg) Open>Resolved
[22:35:31] Release-Engineering: Organize browsertests/Selenium training - https://phabricator.wikimedia.org/T100170#1307198 (greg)
[22:35:42] .win 43
[22:35:53] Browser-Tests, Release-Engineering, Hackathon-Mexico-City-2015, I18n: Hacking: Load i18n messages from MediaWiki to browser tests - https://phabricator.wikimedia.org/T90577#1316001 (greg)
[22:36:21] Continuous-Integration-Infrastructure: All new extensions should be setup automatically with Zuul - https://phabricator.wikimedia.org/T92909#1316007 (greg)
[22:36:32] Browser-Tests, Release-Engineering, Hackathon-Mexico-City-2015: Hacking: Investigate using the sikuli-like Applitools framework for visual testing - https://phabricator.wikimedia.org/T90884#1316008 (greg)
[22:36:45] Continuous-Integration-Infrastructure, Release-Engineering, Wikipedia-Android-App, Wikipedia-iOS-App: Create end-to-end automated test for Wikipedia native app(s) - https://phabricator.wikimedia.org/T90177#1316009 (greg)
[22:40:42] thcipriani: around?
[22:40:48] yup
[22:41:34] do you think you can rollback the train deploy today (aka, put the wikipedias back on wmf6, and the test wikis back on wmf7
[22:42:50] I don't know. Let me look.
[22:46:52] thcipriani: nvm, got mukunda on the phone :)
[22:47:46] kk, that's good :)
[22:48:43] Release-Engineering, Wikimedia-Git-or-Gerrit, Patch-For-Review: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1316031 (QChris) >>! In T100409#1315925, @Foxtrott wrote: > This seems to have only recreated the WMF specific tags (e.g. wmf/1.25wmf...
[22:55:36] Release-Engineering, Wikimedia-Git-or-Gerrit, Patch-For-Review: https://github.com/wikimedia/mediawiki/ release tags vanished - https://phabricator.wikimedia.org/T100409#1316048 (QChris) Open>Resolved a:QChris >>! In T100409#1315859, @QChris wrote: > Seeing if I can find anything in the logs....
[23:00:33] Release-Engineering: Make sync-wikiverisons check that a valid localisation cache exists when syncing new versions - https://phabricator.wikimedia.org/T100573#1316052 (Reedy) NEW
[23:01:37] Deployment-Systems, Release-Engineering: Make sync-wikiverisons check that a valid localisation cache exists when syncing new versions - https://phabricator.wikimedia.org/T100573#1316059 (greg)
[23:03:27] Deployment-Systems, Release-Engineering: Make sync-wikiverisons check that a valid localisation cache exists when syncing new versions - https://phabricator.wikimedia.org/T100573#1316067 (Reedy) p:Triage>High
[23:06:05] greg-g: what happened with the train?
[23:06:23] legoktm: which context? ;)
[23:06:31] the revert right now or the change in cadence?
[23:06:36] the revert
[23:07:06] 22:35 < ori> greg-g: the deploy of wmf8 coincides with a number of issues: a spike in 5xx errors; a spike in memcached traffic; and some strange errors in the logs.
[23:08:49] Deployment-Systems, Release-Engineering: scap only one version - https://phabricator.wikimedia.org/T100575#1316073 (Reedy) NEW
[23:11:15] ok
[23:11:26] thanks
[23:31:20] PROBLEM - Puppet failure on deployment-pdf01 is CRITICAL 100.00% of data above the critical threshold [0.0]
[23:31:24] PROBLEM - Puppet staleness on deployment-urldownloader is CRITICAL 100.00% of data above the critical threshold [43200.0]
[23:31:44] PROBLEM - Puppet failure on deployment-cache-text02 is CRITICAL 100.00% of data above the critical threshold [0.0]
[23:32:40] Release-Engineering, Ops-Access-Requests, Wikimedia-Git-or-Gerrit, operations, Patch-For-Review: Add all Release-Engineering team as Gerrit admins - https://phabricator.wikimedia.org/T100565#1316156 (Dzahn) p:Triage>Normal
[23:37:44] greg-g: also, https://www.mediawiki.org/wiki/User:Forrestbot
[23:38:03] cool
[23:47:01] Release-Engineering, Hackathon-Mexico-City-2015: Hackathon Proposal: Wikimedia Site Requests Sprint - https://phabricator.wikimedia.org/T90468#1316194 (greg) declined>Open
[23:47:08] Release-Engineering, Hackathon-Mexico-City-2015: Hackathon Proposal: Wikimedia Site Requests Sprint - https://phabricator.wikimedia.org/T90468#1059651 (greg) p:Triage>Normal
[23:47:24] Release-Engineering, Hackathon-Mexico-City-2015: Hackathon Proposal: Wikimedia Site Requests Sprint - https://phabricator.wikimedia.org/T90468#1059651 (greg) Reopening for Wikimania Hackathon, which I forgot about when I closed this.
[23:47:44] legoktm: we're probably going to need to change that bot for the new cadence :)
[23:56:01] greg-g: I don't think so? James_F is still manually creating the projects, and our plan was to have the bot scrape the Deployments page rather than hardcoding anything in but it never got that far