[00:41:38] Scap3, Services (later), User-mobrovac: Delay repooling trending service after a restart - https://phabricator.wikimedia.org/T156687#2988590 (mobrovac) >>! In T156687#2987059, @thcipriani wrote: > hrm. In looking through the code we're currently running checks per stage in concurently (with an arbitr...
[00:44:30] Scap3, Parsoid: Saying yes (y) continues to all groups - https://phabricator.wikimedia.org/T156839#2988600 (mobrovac) Open>Invalid You can say `c` once and you will not be prompted any more till the end of the deployment ;)
[00:50:14] Scap3, Parsoid: Saying yes (y) continues to all groups - https://phabricator.wikimedia.org/T156839#2988605 (mobrovac) Invalid>Open Ups, sorry misread the ticket. I thought @Arlolra was asking how not to be asked again to continue, but in fact he is experiencing the opposite: when he says `y`, Sca...
[00:57:07] Beta-Cluster-Infrastructure, Wikimedia-Site-requests, Patch-For-Review: On beta metawiki, a mix of the beta enwiki and the production metawiki logos show - https://phabricator.wikimedia.org/T125942#2988615 (tomasz) This does not appear to have helped at all, I think.
[03:13:17] Release-Engineering-Team, MediaWiki-Vagrant, Operations, Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2988743 (Krinkle)
[03:39:16] Release-Engineering-Team, MediaWiki-Vagrant, Operations, Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2988770 (bd808)
[03:40:18] Gerrit, artificial-intelligence: Patch-wrangler -- suggests the best reviewers for a patch - https://phabricator.wikimedia.org/T155851#2956819 (Tgr) There is a [[https://gerrit.googlesource.com/plugins%2Freviewers-by-blame|gerrit plugin]] that recommends reviewers based on blaming the changed lines. (See...
[03:40:39] Release-Engineering-Team, MediaWiki-Vagrant, Operations, Epic: [EPIC] Migrate base image to Debian Jessie - https://phabricator.wikimedia.org/T136429#2334744 (bd808)
[03:57:03] Project selenium-MultimediaViewer » firefox,mediawiki,Linux,contintLabsSlave && UbuntuTrusty build #283: FAILURE in 2.7 sec: https://integration.wikimedia.org/ci/job/selenium-MultimediaViewer/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=mediawiki,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/283/
[05:04:44] Continuous-Integration-Config, Wikipedia-Android-App-Backlog, Technical-Debt: Add support to peridoic CI tests for exercising arbitrary revisions - https://phabricator.wikimedia.org/T152455#2988870 (Niedzielski) I'm not sure but I believe this would require an account on Jenkins. For my own reference...
[06:47:21] Release-Engineering-Team, DBA, Operations, Phabricator, Upstream: During Phabricator upgrade on 2017-01-26, all m3 replica dbs crashed at the same time - https://phabricator.wikimedia.org/T156373#2989006 (Marostegui) db2012 caught up nicely so I believe this ticket can be closed. We can disc...
[06:48:21] Release-Engineering-Team, DBA, Operations, Phabricator, Upstream: During Phabricator upgrade on 2017-01-26, all m3 replica dbs crashed at the same time - https://phabricator.wikimedia.org/T156373#2989008 (Marostegui) Open>Resolved a:jcrespo
[09:33:01] (PS1) Hashar: Drop phpstorm-stubs [integration/config] - https://gerrit.wikimedia.org/r/335410 (https://phabricator.wikimedia.org/T153252)
[09:39:47] (CR) Hashar: [C: 2] Drop phpstorm-stubs [integration/config] - https://gerrit.wikimedia.org/r/335410 (https://phabricator.wikimedia.org/T153252) (owner: Hashar)
[09:40:38] (Merged) jenkins-bot: Drop phpstorm-stubs [integration/config] - https://gerrit.wikimedia.org/r/335410 (https://phabricator.wikimedia.org/T153252) (owner: Hashar)
[09:46:47] ohhhh
[10:57:49] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989313 (Marostegui)
[11:12:05] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989405 (Marostegui) >>! In T156905#2989343, @Volans wrote: > Looks like there was some heavy load on the server in the ~20 minutes before the OOM: > > https:/...
[11:24:33] chasemp: good morning! Thx for the check-nodepool-age script, I did a review on https://gerrit.wikimedia.org/r/#/c/335373/3 looks mostly all good to me. Thanks for that!
[11:29:50] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989430 (Paladox) @Marostegui hi, it looks like the server is running version Server version: 10.0.23-MariaDB-log Should we update it to 10.0.29 the package th...
[11:38:43] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989448 (Paladox) @Marostegui could it b a memory leak? could this https://github.com/MariaDB/server/commit/b7dc830 be the fix?
[11:46:20] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989466 (jcrespo) > However, db1048 (the slave) also crashed and that one is only supposed to have the replication thread running right? No, at around 09:39:31...
[11:47:21] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989467 (Marostegui) >>! In T156905#2989430, @Paladox wrote: > @Marostegui hi, it looks like the server is running version Server version: 10.0.23-MariaDB-log >...
[11:51:48] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989474 (Paladox) does this mean that phabricator needs improvements to it's query?
[12:27:07] Project beta-scap-eqiad build #140323: FAILURE in 1 min 24 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/140323/
[12:37:28] Yippee, build fixed!
[12:37:28] Project beta-scap-eqiad build #140324: FIXED in 1 min 49 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/140324/
[12:52:50] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989591 (Marostegui) >>! In T156905#2989474, @Paladox wrote: > does this mean that phabricator needs improvements to it's query? I believe so, or at least on t...
[13:18:06] !log upgraded deployment-prep to hhvm 3.12.12
[13:18:10] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[13:29:25] !log starting deployment-elastic* migration to jessie and moving data partition to /srv (T151326 / T151328)
[13:29:30] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[13:29:30] T151326: Upgrade cirrus / elasticsearch to Jessie - https://phabricator.wikimedia.org/T151326
[13:29:30] T151328: move data to /srv for the cirrus / elasticsearch clusters - https://phabricator.wikimedia.org/T151328
[13:29:40] ^ this should be entirely transparent, but who knows...
[13:34:42] !log shutting down and reimaging deployment-elastic05 - T151326
[13:34:45] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[13:34:46] T151326: Upgrade cirrus / elasticsearch to Jessie - https://phabricator.wikimedia.org/T151326
[13:39:46] PROBLEM - Host deployment-elastic05 is DOWN: CRITICAL - Host Unreachable (10.68.17.182)
[13:47:36] Yippee, build fixed!
[13:47:36] Project selenium-VisualEditor » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #292: FIXED in 2 min 36 sec: https://integration.wikimedia.org/ci/job/selenium-VisualEditor/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/292/
[13:48:29] PROBLEM - Puppet run on deployment-zookeeper01 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[14:06:21] !log shutting down and reimaging deployment-elastic06 - T151326
[14:06:24] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:06:25] T151326: Upgrade cirrus / elasticsearch to Jessie - https://phabricator.wikimedia.org/T151326
[14:08:04] RECOVERY - Host deployment-elastic05 is UP: PING OK - Packet loss = 0%, RTA = 2.10 ms
[14:09:57] PROBLEM - Host deployment-elastic06 is DOWN: CRITICAL - Host Unreachable (10.68.17.186)
[14:14:23] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989711 (epriestley) I'm not sure why the query is slow or requires a significant amount of memory. The structure of the query specifically tries to avoid this,...
[14:16:55] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989717 (epriestley) One vague possibility is that you may have a few documents which are extremely large: for example, perhaps someone wrote a 1GB comment on a...
[14:22:13] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989733 (jcrespo)
[14:28:29] RECOVERY - Puppet run on deployment-zookeeper01 is OK: OK: Less than 1.00% above the threshold [0.0]
[14:32:38] !log shutting down and reimaging deployment-elastic07 - T151326
[14:32:41] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:32:42] T151326: Upgrade cirrus / elasticsearch to Jessie - https://phabricator.wikimedia.org/T151326
[14:35:39] PROBLEM - Host deployment-elastic07 is DOWN: CRITICAL - Host Unreachable (10.68.17.187)
[14:52:07] !log killing test node deployment-elastic08 - T151326
[14:52:11] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:52:11] T151326: Upgrade cirrus / elasticsearch to Jessie - https://phabricator.wikimedia.org/T151326
[14:53:57] !log deployment-elastic* fully migrated to Jessie and /srv as data partition - T151326
[14:54:01] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[14:57:30] PROBLEM - Host deployment-elastic08 is DOWN: CRITICAL - Host Unreachable (10.68.18.241)
[15:06:03] RECOVERY - Host deployment-elastic06 is UP: PING OK - Packet loss = 0%, RTA = 3.10 ms
[15:08:24] RECOVERY - Host deployment-elastic07 is UP: PING OK - Packet loss = 0%, RTA = 1.34 ms
[15:12:34] hashar: thanks for the good notes, give https://gerrit.wikimedia.org/r/#/c/335373/ another look please
[15:26:37] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989959 (Marostegui) Hey @epriestley - thanks for the fast response! >>! In T156905#2989711, @epriestley wrote: > I'm not sure why the query is slow or requi...
[15:27:10] Project beta-scap-eqiad build #140341: FAILURE in 1 min 27 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/140341/
[15:36:56] Release-Engineering-Team (Long-Lived-Branches), scap, Operations: Make git 2.2.0+ (preferably 2.8.x) available - https://phabricator.wikimedia.org/T140927#2989978 (fgiunchedi) @demon I was indeed able to build stretch's git as-is on jessie, resulting in `2.11.0-2~bpo8+1`. Uploading it internally to `...
[15:37:31] Yippee, build fixed!
[15:37:31] Project beta-scap-eqiad build #140342: FIXED in 1 min 47 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/140342/
[15:41:02] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989990 (Marostegui) >>! In T156905#2989959, @Marostegui wrote: > > ``` > | 50776 | root | localhost | phabricator_search | Query | 605 |...
[15:42:41] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2989991 (epriestley) Hrrm. For comparison, a similar inner query on `secure.phabricator.com` (searching for "phabricator" instead of "affecting translatewiki.ne...
[15:48:21] Jenkins seems to be mysteriously failing, see for example https://integration.wikimedia.org/ci/job/mediawiki-extensions-hhvm-jessie/5111/console
[15:49:58] it's failing with
[15:50:01] /srv/deployment/integration/slave-scripts/bin/mw-run-phpunit-allexts.sh: line 23: 2312 Segmentation fault (core dumped) php -dzend.enable_gc=0 phpunit.php --log-junit "$JUNIT_DEST" --testsuite extensions
[15:50:04] hashar ^^
[15:50:39] Release-Engineering-Team (Long-Lived-Branches), scap, Operations: Make git 2.2.0+ (preferably 2.8.x) available - https://phabricator.wikimedia.org/T140927#2990011 (demon) That seems like a reasonable course to go for now. Then after further testing, perhaps roll it out further to the rest of the serv...
[15:51:59] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990013 (Paladox) What if we dropped the phabricator_search table then re created and then do the reindexing for mysql, would that work?
[15:56:33] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990019 (Marostegui) >>! In T156905#2989991, @epriestley wrote: > Hrrm. For comparison, a similar inner query on `secure.phabricator.com` (searching for "phabri...
[15:58:54] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990020 (epriestley) Here's another possible formulation of the query based on Googling "FULLTEXT initialization", although I haven't yet found a real descripti...
[16:02:25] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990021 (jcrespo) Maybe tuning `innodb_ft_cache_size` and `innodb_ft_total_cache_size` is needed. Little to no tuning was done after converting the table from M...
[16:03:58] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990030 (epriestley) Ah! That seems pretty broken -- the same simple query takes `0.03s` on `secure.phabricator.com`, so yours is ~22,000x slower for ~10x more...
[16:06:25] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990034 (jcrespo) > Is that something you're comfortable trying? Yes, that makes lot of sense, too.
[16:07:39] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990035 (epriestley) Tuning `innodb_ft_*` parameters may also be fruitful, but I don't have any direct experience with it to provide guidance.
[16:08:30] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990036 (Marostegui) >>! In T156905#2990030, @epriestley wrote: > Ah! That seems pretty broken -- the same simple query takes `0.03s` on `secure.phabricator.com...
[16:09:14] (CR) Hashar: "I am not sure how it would work. Specially the entries will be added to a more specific autoloader which is unlikely to actually be loaded" [integration/config] - https://gerrit.wikimedia.org/r/335215 (owner: Aleksey Bekh-Ivanov (WMDE))
[16:16:06] Continuous-Integration-Infrastructure: Jenkins is failing mediawiki-extensions-hhvm-jessie due to segfault - https://phabricator.wikimedia.org/T156923#2990073 (Anomie)
[16:16:14] * anomie files https://phabricator.wikimedia.org/T156923
[16:17:35] Continuous-Integration-Infrastructure: Jenkins is failing mediawiki-extensions-hhvm-jessie due to segfault - https://phabricator.wikimedia.org/T156923#2990089 (Anomie)
[16:18:24] Continuous-Integration-Config, Continuous-Integration-Infrastructure: Jenkins is failing mediawiki-extensions-hhvm-jessie due to segfault - https://phabricator.wikimedia.org/T156923#2990090 (Paladox)
[16:19:40] Continuous-Integration-Config, Continuous-Integration-Infrastructure: Jenkins is failing mediawiki-extensions-hhvm-jessie due to segfault - https://phabricator.wikimedia.org/T156923#2990073 (Paladox) line 42 indicates https://phabricator.wikimedia.org/diffusion/CIJE/browse/master/bin/mw-phpunit.sh;b9d91a...
[16:24:54] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990100 (epriestley) If we can't get the trivial `SELECT * WHERE MATCH(...)` case performing quickly, I believe the engine isn't going to be usable no matter ho...
[16:42:51] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990129 (Paladox) We have elastic search as an experimental option. You can enable it through the pref by going to Developer Settings and under the elastic sear...
[16:43:41] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990131 (Marostegui) Good news: ``` root@MISC m3[phabricator_search]> optimize table search_documentfield; Stage: 1 of 2 'copy to tmp table' 0.023% of stage d...
[16:48:19] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990173 (Marostegui) And the big query finished in `5 rows in set (1 min 59.41 sec)`: ``` root@MISC m3[phabricator_search]> SELECT documentPHID, MAX(fieldScore...
[16:49:07] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990174 (Paladox) but our only problem with elasticsearch is you can't reindex both indexes. So elasticsearch index is out of date. we need someway to be able...
[16:49:13] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990175 (Nemo_bis) >>! In T156905#2989343, @Volans wrote: > Looks like there was some heavy load on the server in the ~20 minutes before the OOM: > > https://g...
[16:49:33] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990176 (jcrespo) Let's still try to tune the innodb parameters. 2 minutes is also too much for a simple search.
[16:51:57] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990178 (Paladox) >>! In T156905#2990175, @Nemo_bis wrote: >>>! In T156905#2989343, @Volans wrote: >> Looks like there was some heavy load on the server in the...
[17:04:26] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990225 (Marostegui) >>! In T156905#2990176, @jcrespo wrote: > Let's still try to tune the innodb parameters. 2 minutes is also too much for a simple search. A...
[17:49:03] Continuous-Integration-Config, Continuous-Integration-Infrastructure, Operations, HHVM: MediaWiki tests causes HHVM to segfault on Jessie - https://phabricator.wikimedia.org/T156923#2990453 (hashar)
[17:54:42] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990460 (Paladox) >>! In T156905#2989959, @Marostegui wrote: > Hey @epriestley - thanks for the fast response! > > > >>>! In T156905#2989711, @epriestley wro...
[17:57:37] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations: MediaWiki tests causes HHVM to segfault on Jessie - https://phabricator.wikimedia.org/T156923#2990468 (hashar) **I am not available for the next two hours. Hopefully back at 8pm UTC** Updated the task details but I guess it...
[17:57:56] Release-Engineering-Team, Operations, DC-Switchover-Prep-Q3-2016-17: Understand the preparedness of misc services for datacenter switchover - https://phabricator.wikimedia.org/T156937#2990470 (jcrespo)
[17:58:53] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, HHVM: MediaWiki tests causes HHVM to segfault on Jessie - https://phabricator.wikimedia.org/T156923#2990497 (greg)
[18:02:39] Release-Engineering-Team, Operations, DC-Switchover-Prep-Q3-2016-17: Understand the preparedness of misc services for datacenter switchover - https://phabricator.wikimedia.org/T156937#2990505 (jcrespo) The reason I created this ticket is because, as a DBA, I have to support some of those services bel...
[18:05:31] Gerrit, Release-Engineering-Team, Operations, hardware-requests, Patch-For-Review: Requesting 1 spare misc box for Gerrit in codfw - https://phabricator.wikimedia.org/T148187#2990512 (jcrespo)
[18:05:41] Release-Engineering-Team, Operations, DC-Switchover-Prep-Q3-2016-17: Understand the preparedness of misc services for datacenter switchover - https://phabricator.wikimedia.org/T156937#2990511 (jcrespo)
[18:07:11] Release-Engineering-Team, Operations, DC-Switchover-Prep-Q3-2016-17: Understand the preparedness of misc services for datacenter switchover - https://phabricator.wikimedia.org/T156937#2990470 (jcrespo)
[18:09:22] Release-Engineering-Team, Operations, DC-Switchover-Prep-Q3-2016-17: Understand the preparedness of misc services for datacenter switchover - https://phabricator.wikimedia.org/T156937#2990527 (jcrespo)
[18:14:14] Release-Engineering-Team, Discovery, Discovery-Search, Elasticsearch, and 2 others: Setup a private elasticsearch cluster for phabricator - https://phabricator.wikimedia.org/T156939#2990532 (Paladox)
[18:15:06] Release-Engineering-Team, Discovery, Discovery-Search, Elasticsearch, and 2 others: Setup a private elasticsearch cluster for phabricator - https://phabricator.wikimedia.org/T156939#2990550 (Paladox) p:Triage>High Changing to high since the db keeps crashing due to full text indexes. T15...
[18:16:59] Release-Engineering-Team, Discovery, Discovery-Search, Elasticsearch, and 2 others: Setup a private elasticsearch cluster for phabricator - https://phabricator.wikimedia.org/T156939#2990556 (Paladox)
[18:17:57] Release-Engineering-Team, Discovery, Discovery-Search, Elasticsearch, and 2 others: Setup a private elasticsearch cluster for phabricator - https://phabricator.wikimedia.org/T156939#2990532 (Paladox) Also per T155299#2975104 suggestion there.
[18:29:01] Gerrit, Release-Engineering-Team, Operations: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990610 (RobH)
[18:29:38] Gerrit, Release-Engineering-Team, Operations: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2851729 (RobH) a:RobH>Papaul Please update this task with the network port this system is plugged into. I neglected to ask you do to that via the sub task. Then assign...
[18:32:15] Gerrit, Release-Engineering-Team, Operations, ops-codfw: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990670 (RobH)
[18:34:23] Gerrit, Release-Engineering-Team, Operations, ops-codfw: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990675 (Papaul) ge-5/0/10
[18:35:02] Gerrit, Release-Engineering-Team, Operations, ops-codfw: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990676 (RobH) a:Papaul>RobH
[18:38:00] Gerrit, Release-Engineering-Team, Operations: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990684 (RobH)
[18:44:25] Gerrit, Release-Engineering-Team, Operations: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990720 (RobH)
[18:47:13] Gerrit, Release-Engineering-Team, Operations: setup/install gerrit2001/WMF6408 - https://phabricator.wikimedia.org/T152525#2990763 (RobH)
[18:51:04] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, HHVM: MediaWiki tests causes HHVM to segfault on Jessie - https://phabricator.wikimedia.org/T156923#2990073 (thcipriani) >>! In T156923#2990468, @hashar wrote: > To spawn an instance, Nodepool pick the youngest one. I g...
[18:51:22] !log nodepool delete-image 1320 per T156923
[18:51:26] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[18:51:26] T156923: MediaWiki tests causes HHVM to segfault on Jessie - https://phabricator.wikimedia.org/T156923
[18:51:53] hopefully this should fix hhvm tests running on nodepool :\
[18:53:20] for a day
[19:00:36] #1290: The MariaDB server is running with the --read-only option so it cannot execute this statement < greg-g guess you know about this?
[19:00:40] (phabricator down)
[19:00:46] yup, see -operations
[19:01:06] sweet
[19:02:23] what an Aphront ;)
[19:27:48] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990866 (Marostegui) This has happened again and crashed the master and the slave. I have run the ALTER table to optimize the table on the master while we were...
[19:29:24] we're back ^
[19:30:06] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990885 (Paladox) p:High>Unbreak! Due to it taking down the db + forcing us to switch to elastic search. I am upping the priority to unbreak.
[19:32:43] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2990912 (greg) p:Unbreak!>High Issue has been mitigated (we're using the ES backend for everyone now), lowering priority but keeping open for any follow...
[20:00:40] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, HHVM: MediaWiki tests causes HHVM to segfault on Jessie - https://phabricator.wikimedia.org/T156923#2991055 (hashar) https://gerrit.wikimedia.org/r/#/c/323401/ had two builds run Failed https://integration.wikimedia.org...
[20:06:28] Continuous-Integration-Infrastructure, Release-Engineering-Team, Operations, HHVM: New HHVM 3.12.11 segfault at end of MediaWiki PHPUnit tests - https://phabricator.wikimedia.org/T156923#2990073 (hashar)
[20:13:43] PROBLEM - Puppet run on integration-slave-precise-1011 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [0.0]
[20:29:35] Fatal error: require_once(): Failed opening required '/srv/mediawiki/php-1.29.0-wmf.9/core/maintenance/Maintenance.php' (include_path='.:/usr/share/php:/srv/mediawiki/php') in /srv/mediawiki/php-1.29.0-wmf.9/extensions/ConfirmEdit/maintenance/GenerateFancyCaptchas.php on line 30
[20:29:42] Why is there /core/ in there? :)
[20:30:59] > var_dump( getenv( 'MW_INSTALL_PATH' ) );
[20:30:59] string(31) "/srv/mediawiki/php-1.29.0-wmf.9"
[20:36:19] Ah
[20:36:43] https://github.com/wikimedia/mediawiki-extensions-ConfirmEdit/commit/ae85f2ac6bd47a9c8fd8c95ad58c1d1afdec781c#diff-d066e10a8d8686a771ba068de6b0084aL30
[20:43:50] Project selenium-Echo » chrome,beta,Linux,contintLabsSlave && UbuntuTrusty build #290: FAILURE in 1 min 50 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=chrome,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/290/
[20:43:53] Project selenium-Echo » firefox,beta,Linux,contintLabsSlave && UbuntuTrusty build #290: FAILURE in 1 min 53 sec: https://integration.wikimedia.org/ci/job/selenium-Echo/BROWSER=firefox,MEDIAWIKI_ENVIRONMENT=beta,PLATFORM=Linux,label=contintLabsSlave%20&&%20UbuntuTrusty/290/
[20:48:43] RECOVERY - Puppet run on integration-slave-precise-1011 is OK: OK: Less than 1.00% above the threshold [0.0]
[21:33:12] Release-Engineering-Team (Long-Lived-Branches), scap, Operations: Make git 2.2.0+ (preferably 2.8.x) available - https://phabricator.wikimedia.org/T140927#2991384 (hashar) beta cluster and CI can definitely benefit from a newer git version. Can't it be pushed to `jessie-wikimedia/backports` and then...
[21:43:10] !log Update mobileapps to e48a88c
[21:43:15] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL
[22:02:14] Release-Engineering-Team, Operations, Phabricator: reinstall iridium (phabricator) as phab1001 with jessie - https://phabricator.wikimedia.org/T152129#2839436 (Dzahn)
[22:10:44] Release-Engineering-Team, Operations, Phabricator, hardware-requests, ops-eqiad: replacement hardware for iridium (phabricator) - https://phabricator.wikimedia.org/T156970#2991504 (Dzahn) procurement ticket for iridium was https://rt.wikimedia.org/Ticket/Display.html?id=6772 they (12 misc s...
[22:10:49] Release-Engineering-Team, Operations, Phabricator, hardware-requests, ops-eqiad: replacement hardware for iridium (phabricator) - https://phabricator.wikimedia.org/T156970#2991506 (Paladox)
[22:21:44] Release-Engineering-Team, Operations, Phabricator, hardware-requests, ops-eqiad: replacement hardware for iridium (phabricator) - https://phabricator.wikimedia.org/T156970#2991554 (RobH) a:mark iridium's (current eqiad phab host) specs: * 1U system * Dual Intel Xeon CPU E5-2450 v2 @ 2.5...
[22:24:33] Release-Engineering-Team, Operations, Phabricator, hardware-requests, ops-eqiad: replacement hardware for iridium (phabricator) - https://phabricator.wikimedia.org/T156970#2991562 (mmodell)
[22:28:41] Beta-Cluster-Infrastructure: Don't throttle WMF office IP(s) for account creation in beta - https://phabricator.wikimedia.org/T87841#2991566 (Tgr)
[22:43:18] Release-Engineering-Team, Operations, Phabricator, Patch-For-Review: Setup test domain for phab2001 - https://phabricator.wikimedia.org/T152132#2991626 (Dzahn) stalled>declined see reason above, we can't do this yet. Instead we are requesting new hardware for phab1001 (T156970) to test wi...
[22:49:15] Beta-Cluster-Infrastructure: Account creation throttling too restrictive on Beta Cluster - https://phabricator.wikimedia.org/T87704#2991676 (Tgr)
[22:49:17] Beta-Cluster-Infrastructure: Don't throttle WMF office IP(s) for account creation in beta - https://phabricator.wikimedia.org/T87841#2991672 (Tgr) Resolved>Open This does not seem to work. I don't think it ever did; `$wgRateLimitsExcludedIPs` is unrelated to account creation throttling. (The post-Aut...
[22:53:12] Beta-Cluster-Infrastructure: Don't throttle WMF office IP(s) for account creation in beta - https://phabricator.wikimedia.org/T87841#2991695 (Tgr) Hm, no, the code [[https://github.com/wikimedia/mediawiki/blob/master/includes/auth/ThrottlePreAuthenticationProvider.php#L103|does check]] the ping limiter even...
[23:27:45] Beta-Cluster-Infrastructure, Patch-For-Review: Don't throttle WMF office IP(s) for account creation in beta - https://phabricator.wikimedia.org/T87841#2991771 (Tgr) For the record, the way to temporarily lift the throttle is {P4853}
[23:47:19] Project beta-scap-eqiad build #140394: FAILURE in 1 min 32 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/140394/
[23:48:01] Release-Engineering-Team, DBA, Operations, Phabricator: Phabricator master and slave crashed - https://phabricator.wikimedia.org/T156905#2991826 (jcrespo) removing m3 from dbstore2001: db1043-bin.001457:753455796
[23:57:45] Yippee, build fixed!
[23:57:46] Project beta-scap-eqiad build #140395: FIXED in 1 min 53 sec: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/140395/