[00:06:30] PROBLEM - Mediawiki Error Rate on graphite-labs is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [10.0] [00:11:31] RECOVERY - Mediawiki Error Rate on graphite-labs is OK: OK: Less than 1.00% above the threshold [1.0] [00:12:57] twentyafterfour: I’m playing with a scap fetch check, and need some advice: What’s the correct way to get a path to the deploy-cache/revs/ directory that scap is deploying? [00:13:08] It seems to not be the cwd. [00:14:11] At least, my “command: bash scap/fetch_check.sh” is failing to find this checked-out shell script. [00:21:04] awight: good question. I'm not sure. I'll look into it for you [00:26:28] twentyafterfour: Looks like the checks.yaml file can be templated, which is fun—maybe there’s a variable available with the target path... [00:27:10] we can make one if it doesn't exist already [00:27:52] The scap docs are awesome, btw! [00:27:58] :) [00:31:16] PROBLEM - Long lived cherry-picks on puppetmaster on deployment-puppetmaster02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [00:33:58] twentyafterfour: fyi the only other checks.yaml I found in wikimedia repos was https://github.com/wikimedia/search-MjoLniR-deploy/blob/8ab09890485ff24b1eae69d001cca1d11af6803b/scap/checks.yaml [00:34:25] … which seems to make the same mistake I was attempting to make: the scripts that get run are actually from the previously deployed repo rather than the new one. [00:35:13] awight: interesting. [00:35:26] I thought checks.yaml were used by more repos [00:35:39] some of what's on tin has not been pushed back to the repos [00:35:46] which is problematic on it's own [00:37:07] oof! [00:37:42] Yeah uncommitted stuff is scary. [00:38:54] awight@tin:/srv/deployment$ ls */deploy/scap/checks.yaml | wc -l [00:38:56] -> 16 [00:39:01] awight: which stage is your check running? [00:39:03] * awight head-scratches [00:39:17] The tricky one is a fetch check. 
[00:39:23] thcipriani says we don't have a way to know the patch other than promote stage [00:39:35] and there is apparently a task to fix it already [00:39:41] * twentyafterfour looks for the task [00:39:41] 10Phabricator, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3929861 (10Paladox) [00:39:42] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3929484 (10elukey) I think that the kafka-jumbo hosts have their disk filled up, so el is not getting any new events :( [00:39:42] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3929874 (10Paladox) [00:39:42] 10Scap, 10Operations, 10Patch-For-Review: scap sudo violation on first puppet run - https://phabricator.wikimedia.org/T185189#3929881 (10Dzahn) p:05Triage>03High [00:39:42] 10Scap, 10Operations, 10Patch-For-Review: scap sudo violation on first puppet run - https://phabricator.wikimedia.org/T185189#3908788 (10Dzahn) p:05High>03Normal [00:41:31] twentyafterfour: The plot thickens :). /srv/deployment/citoid/deploy/scap/checks.yaml calls “depool-citoid”, which doesn’t seem to exist anywhere [00:41:44] :-o [00:42:18] Thanks, knowing that it’s a known bug is reassuring :) [00:44:33] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3929911 (10Dzahn) p:05Triage>03Low This should automatically happen once phab1001/2001 get replaced in the future. We are planning to use only SSDs unless there is... [00:48:00] 10Phabricator, 10Operations, 10Patch-For-Review: Switch phabricator from using apache to nginx - https://phabricator.wikimedia.org/T185644#3929916 (10Dzahn) I recall one time i had to restart it recently. Have there been more restarts by others? Is it really every week? 
[00:49:43] 10Phabricator, 10Operations, 10Patch-For-Review: Switch phabricator from using apache to nginx - https://phabricator.wikimedia.org/T185644#3929920 (10Paladox) @dzahn yep. Apparently it went unnoticed due to us restarting apache every week. And when we didnt restart it, phab would get slow. [00:54:22] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10MediaWiki-extensions-ORES, 10MW-1.31-release-notes (WMF-deploy-2018-02-06 (1.31.0-wmf.20)), and 2 others: How do I test my extension's maintenance scripts? - https://phabricator.wikimedia.org/T184775#3929924 (10awight) 05Open>03Resolved [01:09:57] 10Beta-Cluster-Infrastructure, 10Collaboration-Team-Triage: Create Fatal-Monitor dashboard in logstash-beta - https://phabricator.wikimedia.org/T185974#3929951 (10Etonkovidova) [01:32:42] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10MediaWiki-extensions-ORES, 10MW-1.31-release-notes (WMF-deploy-2018-02-06 (1.31.0-wmf.20)), and 2 others: How do I test my extension's maintenance scripts? - https://phabricator.wikimedia.org/T184775#3929975 (10Legoktm) @awight could you add some docume... [01:39:34] 10Release-Engineering-Team, 10MediaWiki-Core-Tests, 10MediaWiki-extensions-ORES, 10MW-1.31-release-notes (WMF-deploy-2018-02-06 (1.31.0-wmf.20)), and 2 others: How do I test my extension's maintenance scripts? - https://phabricator.wikimedia.org/T184775#3929982 (10awight) 05Resolved>03Open @Legoktm Tha... 
[01:40:25] PROBLEM - Free space - all mounts on integration-slave-jessie-1001 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1001.diskspace._srv.byte_percentfree (<44.44%) [02:30:15] PROBLEM - App Server Main HTTP Response on deployment-mediawiki05 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:35:18] RECOVERY - App Server Main HTTP Response on deployment-mediawiki05 is OK: HTTP OK: HTTP/1.1 200 OK - 46849 bytes in 9.606 second response time [03:17:45] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3929484 (10Tbayer) The problem seems to be more general: it appears that all the EventLogging tables in Hive (in the new "event" database) are likewise lagging behind,... [03:23:06] 10Phabricator, 10Operations, 10Patch-For-Review: Switch phabricator from using apache to nginx - https://phabricator.wikimedia.org/T185644#3930039 (10Dzahn) I suggest(ed) we add a puppetized cron to restart Apache once a week on Sunday. But we should also try to find out what is actually happening. [03:29:42] PROBLEM - Free space - all mounts on deployment-fluorine02 is CRITICAL: CRITICAL: deployment-prep.deployment-fluorine02.diskspace._srv.byte_percentfree (<44.44%) [03:33:57] 10Gerrit: Change on gerrit about cloning pywikibot/core repository - https://phabricator.wikimedia.org/T185949#3930070 (10Dvorapa) Docs on mediawiki contain `--recursive`. See [[https://www.mediawiki.org/wiki/Manual:Pywikibot/Gerrit#For_users]]. But the plugin seems essential to me. [04:06:34] 10Phabricator, 10Operations, 10Patch-For-Review: Switch phabricator from using apache to nginx - https://phabricator.wikimedia.org/T185644#3921927 (10Joe) So before we go to such a shotgun approach, I'd like to first: # Try to understand why apache is "stalling". Using `strace(1)` on the hanging apache proc... 
[04:23:34] 10Gerrit: Change on gerrit about cloning pywikibot/core repository - https://phabricator.wikimedia.org/T185949#3930096 (10demon) I'd rather not install a plugin with the associated maintenance costs for a single repository. [04:38:02] (03PS1) 10BryanDavis: Add mediawiki/libs/ObjectFactory [integration/config] - 10https://gerrit.wikimedia.org/r/406795 (https://phabricator.wikimedia.org/T147167) [04:58:56] (03PS2) 10BryanDavis: Add mediawiki/libs/ObjectFactory [integration/config] - 10https://gerrit.wikimedia.org/r/406795 (https://phabricator.wikimedia.org/T147167) [05:09:47] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3930141 (10elukey) So Andrew and I tried to check why the current kafka topic retention policy is not applied: ``` # The minimum age of a log file to be eligible for... [05:15:52] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3930142 (10elukey) >>! In T185952#3930033, @Tbayer wrote: > The problem seems to be more general: it appears that all the EventLogging tables in Hive (in the new "even... [05:16:18] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging, 10User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3930143 (10elukey) [05:46:44] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging, 10User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3930153 (10Tbayer) >>! In T185952#3930142, @elukey wrote: >>>! In T185952#3930033, @Tbayer wrote: >> The problem seems to be more general: it appears... 
[05:55:14] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging, 10User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3930164 (10elukey) I don't think the two things are related, but I'll triple check with Andrew tomorrow. [06:05:26] PROBLEM - Free space - all mounts on integration-slave-jessie-1001 is CRITICAL: CRITICAL: integration.integration-slave-jessie-1001.diskspace._mnt.byte_percentfree (No valid datapoints found)integration.integration-slave-jessie-1001.diskspace._srv.byte_percentfree (<22.22%) [06:30:43] PROBLEM - Puppet errors on deployment-cache-upload04 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0] [07:09:39] RECOVERY - Free space - all mounts on deployment-fluorine02 is OK: OK: All targets OK [07:10:42] RECOVERY - Puppet errors on deployment-cache-upload04 is OK: OK: Less than 1.00% above the threshold [0.0] [07:13:39] 10Release-Engineering-Team (Watching / External), 10ORES, 10Operations, 10Scoring-platform-team, and 4 others: 503 spikes and resulting API slowness starting 18:45 October 26 - https://phabricator.wikimedia.org/T179156#3930185 (10demon) Is there anything left here, now that everything in the summary is done? 
[10:30:33] Project mwext-phpunit-coverage-publish build #362: FAILURE in 19 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/362/ [10:33:05] Project mwext-phpunit-coverage-publish build #363: STILL FAILING in 18 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/363/ [10:33:22] Release-Engineering-Team, Wikibase-Quality, Wikibase-Quality-External-Validation, Wikidata: mwext-phpunit-coverage-publish for WikibaseQualityExternalValidation fails - https://phabricator.wikimedia.org/T185697#3930349 (Lucas_Werkmeister_WMDE) [10:33:47] Release-Engineering-Team, Wikibase-Quality, Wikibase-Quality-External-Validation, Wikidata: mwext-phpunit-coverage-publish for WikibaseQualityExternalValidation fails - https://phabricator.wikimedia.org/T185697#3923252 (Lucas_Werkmeister_WMDE) [11:45:58] Yippee, build fixed! [11:45:59] Project mwext-phpunit-coverage-publish build #364: FIXED in 44 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/364/ [12:04:53] Phabricator: My username shows wrong - https://phabricator.wikimedia.org/T185998#3930496 (CodeCat) [13:23:43] !log made User:Ladsgroup admin and 'crat in wikidatawiki [13:23:48] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [13:55:11] Continuous-Integration-Infrastructure, Release-Engineering-Team: npm-node-6-docker tests failing for Android project. - https://phabricator.wikimedia.org/T185931#3930593 (Dbrant) p:Triage>Unbreak! [14:10:15] Phabricator, Operations, Patch-For-Review: Switch phabricator from using apache to nginx - https://phabricator.wikimedia.org/T185644#3930673 (Paladox) For this "Try to understand why apache is "stalling". Using strace(1) on the hanging apache processes should give us some indication of what is going... [14:33:44] Hi all, I need an emergency deploy for T186002 - https://gerrit.wikimedia.org/r/#/c/406820/. Throttle rule for February 01. 
Thanks in advance. The same message is in -operation and -tech. [14:33:44] T186002: Requesting temporary lift of IP cap - https://phabricator.wikimedia.org/T186002 [14:53:14] PROBLEM - Host deployment-puppetdb01 is DOWN: CRITICAL - Host Unreachable (10.68.23.76) [15:00:10] Project mediawiki-core-code-coverage-php7 build #55: FAILURE in 9.6 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage-php7/55/ [15:00:12] Project mediawiki-core-code-coverage build #3295: FAILURE in 12 sec: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/3295/ [15:09:05] Continuous-Integration-Infrastructure, Release-Engineering-Team: npm-node-6-docker tests failing for Android project. - https://phabricator.wikimedia.org/T185931#3930840 (Dbrant) Note: I tried to "wipe out the current workspace" in the Jenkins console, and got this error: `java.nio.file.AccessDeniedExce... [15:19:56] PROBLEM - SSH on integration-slave-docker-1003 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:20:12] twentyafterfour: Are there sekret docs on developing scap? I wanted to take a stab at a simple path tweak, but didn’t want to do so blindly… [15:21:19] omg. It was right there, I just had to… scroll the page. https://doc.wikimedia.org/mw-tools-scap/dev/index.html [15:24:53] RECOVERY - SSH on integration-slave-docker-1003 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0) [15:39:59] twentyafterfour: thcipriani: I was going to fix the fetch-check path thing mentioned yesterday, and can think of two reasonable ways to go about it. Please lemme know which sounds better: * chdir to the deploy-cache/revs/ dir before running checks, or * set an environment variable like SCAP_TARGET_DIR before running checks. [16:21:44] HI [16:21:56] What's happening again with mediawiki-core-doxygen-publish?
[16:23:05] It needs to be switched off and back on [16:23:15] thcipriani for when you have a chance ^^ please :) [16:25:18] I have a suggestion: mediawiki-core-doxygen-publish and mediawiki-core-jsduck-publish could be merged into one service, mediawiki-core-publish [16:33:17] PROBLEM - Puppet errors on deployment-logstash2 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [16:45:47] Beta-Cluster-Infrastructure, Analytics, Analytics-EventLogging, User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3930959 (Tbayer) Thanks @elukey - yes, this was just a guess, based on the fact that the Hive tables stopped updating on the same day (January 26).... [16:51:51] jouncebot: next [16:57:32] RECOVERY - Free space - all mounts on deployment-kafka-jumbo-2 is OK: OK: All targets OK [16:57:53] paladox: done [16:57:56] There’s no dependency mechanism for vagrant roles, or is there? [16:57:59] thcipriani thanks :) [16:59:15] awight: "require ::name::of::role" ? [16:59:21] and/or "include" [16:59:29] RECOVERY - Free space - all mounts on deployment-kafka-jumbo-1 is OK: OK: All targets OK [16:59:35] oh hey, interesting! [17:00:18] bd808: That might be better than what I’m looking for, I meant having top-level “vagrant roles” include one another. [17:00:39] But I could go ahead and silently pull in the dependencies using puppet… [17:02:31] awight: https://github.com/wikimedia/mediawiki-vagrant/blob/master/puppet/modules/role/manifests/commons.pp#L11-L14 [17:02:47] is that what you are wanting/needing? [17:04:44] bd808: thanks, I’m not sure whether I manufactured some contrary memories…. I remember the eventbus role complaining that I needed to enable the kafka role, but looking more closely I see https://github.com/wikimedia/mediawiki-vagrant/blob/master/puppet/modules/role/manifests/eventbus.pp#L9 [17:05:21] i.e. thanks, this will work perfectly :) [17:05:59] sweet. 
Puppet can fight you at some points so if you get trapped in a corner yell for help/other eyes [17:06:38] hehe well said. I’ve grown to accept it, which I hate myself for. [17:07:23] All I can imagine is that Puppet became the industry standard thanks to the even more horrifying alternatives. [17:11:37] Basically yeah, there weren't a lot of alternatives [17:15:46] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging, 10User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3931069 (10elukey) I truncated all the eventlogging topic partitions on the deployment-kafka-jumbo hosts and restarted the brokers, let's see if the l... [17:19:29] 10Beta-Cluster-Infrastructure, 10Analytics, 10Analytics-EventLogging, 10User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3931091 (10Tgr) Thanks! Event logging in beta seems to work fine now. [17:24:45] 10Phabricator, 10Release-Engineering-Team, 10Operations: Consider ssd's for phabricator - https://phabricator.wikimedia.org/T185796#3931129 (10faidon) [17:24:48] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931132 (10faidon) [17:26:24] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931145 (10Zoranzoki21) >>! In T185971#3929911, @Dzahn wrote: > This should automatically happen once phab1001/2001 get replaced in the future. We are planning to use o... [17:27:56] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931170 (10Paladox) >>! In T185971#3931145, @Zoranzoki21 wrote: >>>! In T185971#3929911, @Dzahn wrote: >> This should automatically happen once phab1001/2001 get replac... 
[17:28:33] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931174 (10Dzahn) >>! In T185971#3931145, @Zoranzoki21 wrote: > Why you no change apache with ngnix? Please see T185644 for that topic. [17:28:47] PROBLEM - Puppet errors on deployment-snapshot01 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [17:29:19] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931176 (10Zoranzoki21) >>! In T185971#3931174, @Dzahn wrote: >>>! In T185971#3931145, @Zoranzoki21 wrote: >> Why you no change apache with ngnix? > > Please see T1856... [17:47:53] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931267 (10faidon) 05Open>03declined This has no problem statement, diagnosis, root cause analysis or evidence of I/O starvation -- and yet we're jumping to actiona... [17:52:09] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3929861 (10demon) >>! In T185971#3929911, @Dzahn wrote: > While SSDs would of course not hurt performance, Its not like Phabricator is permanently slow. I think the ga... [17:55:17] 10Phabricator, 10Release-Engineering-Team, 10Operations: Add some ssd's to phab1001 and phab2001 - https://phabricator.wikimedia.org/T185971#3931345 (10demon) Oh, and if you want to have a look at IO usage, [[ https://grafana.wikimedia.org/dashboard/db/prometheus-machine-stats?from=now-3h&to=now&orgId=1&var-... [18:16:59] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team: npm-node-6-docker tests failing for Android project. 
- https://phabricator.wikimedia.org/T185931#3931462 (10Charlotte) p:05Unbreak!>03High [18:19:12] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team: npm-node-6-docker tests failing for Android project. - https://phabricator.wikimedia.org/T185931#3928972 (10Charlotte) Changed to High priority as it's preventing our release, but not breaking prod. Ticket should be picked up with urgency on... [18:23:19] 10Continuous-Integration-Infrastructure, 10MediaWiki-Core-Tests, 10Operations, 10HHVM: HHVM 3.18.5+dfsg-1+wmf3 changes parse_url causing unit tests to fail - https://phabricator.wikimedia.org/T185024#3931478 (10MoritzMuehlenhoff) A revised fix has been released (along with 3.18.8), I'll roll that into our... [18:28:51] 10Gerrit: Suggestion: Disable complete test if is only commit message updated - https://phabricator.wikimedia.org/T186032#3931525 (10Zoranzoki21) [18:30:00] 10Gerrit: Suggestion: Disable complete test if is only commit message updated - https://phabricator.wikimedia.org/T186032#3931538 (10Zoranzoki21) [18:30:44] 10Gerrit: Suggestion: Disable complete test(s) if is only commit message updated - https://phabricator.wikimedia.org/T186032#3931525 (10Zoranzoki21) [18:34:26] 10Beta-Cluster-Infrastructure: Create Fatal-Monitor dashboard in logstash-beta - https://phabricator.wikimedia.org/T185974#3931559 (10jmatazzoni) [18:54:29] PROBLEM - Puppet errors on deployment-secureredirexperiment is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [19:10:00] 10Deployments: Localization update not reflected on arwiki - https://phabricator.wikimedia.org/T186038#3931775 (10Meno25) [19:10:36] 10Deployments, 10MediaWiki-extensions-LocalisationUpdate: Localization update not reflected on arwiki - https://phabricator.wikimedia.org/T186038#3931786 (10Meno25) [19:36:35] 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team: npm-node-6-docker tests failing for Android project. 
- https://phabricator.wikimedia.org/T185931#3931868 (Dbrant) According to [[ https://stackoverflow.com/questions/48401452/npm-cant-complete-browserify-installation-because-acorn5-object-s... [19:48:33] no_justification: https://gerrit.wikimedia.org/r/#/c/367781/10/maintenance/cleanupPreferences.php looks sane to me. I'm a little worried about random scripts having written preferences outside of gadget-/userjs- long ago… [19:48:57] Long ago: could they even use them anymore? [19:49:45] https://gerrit.wikimedia.org/r/#/c/367781/10..11 [19:50:19] Continuous-Integration-Config, Gerrit: Suggestion: Disable complete test(s) if is only commit message updated - https://phabricator.wikimedia.org/T186032#3931969 (Peachey88) [19:52:32] paladox: do you happen to know how i can turn off the Phabricator pop-up notifications while keeping the notifications number and the bell icon when i actively click on it [19:52:45] mutante nope [19:53:06] getting notifications in browser and not via email = great; getting pop-up notifications that interrupt instead of me deciding to list them = no thanks [19:53:11] mutante actually [19:53:14] try https://phabricator.wikimedia.org/settings/user/Paladox/page/notifications/ [19:53:17] https://phabricator.wikimedia.org/settings/user/dzahn/page/notifications/ [19:53:40] paladox: i am set to "Web Only" [19:53:53] what would even happen with Desktop? [19:54:03] mutante ok, click off [19:54:08] tests it [19:54:27] i was worried then i dont get the traditional notifications either [19:55:16] mutante that pref is for i think live notifications [19:55:51] paladox: you are right, surprisingly i can set to "No notifications" followed by "Send test notification" and still get a .. notification.. just not the pop-up [19:55:56] and that's what i wanted. thx [19:56:01] :) [19:56:11] yea, notifications vs. real-time notifications.. 
[19:56:17] even though the others are also "real-time" kind of [19:56:22] it's more about pop-up or not [19:56:50] they should call it "notification pop-up on/off" or something like that [19:56:56] yep [19:58:36] (03PS1) 10Skizzerz: Add DeleteUserPages extension [integration/config] - 10https://gerrit.wikimedia.org/r/406852 [20:03:51] twentyafterfour: whoa, scap-vagrant is really heavy. Think I could get away with commenting out the mediawiki-extensions pull? [20:04:54] (03PS2) 10Krinkle: Archive ContributionReporting [integration/config] - 10https://gerrit.wikimedia.org/r/404778 (https://phabricator.wikimedia.org/T185062) (owner: 10MaxSem) [20:04:57] (03CR) 10Krinkle: [C: 032] Archive ContributionReporting [integration/config] - 10https://gerrit.wikimedia.org/r/404778 (https://phabricator.wikimedia.org/T185062) (owner: 10MaxSem) [20:06:10] (03Merged) 10jenkins-bot: Archive ContributionReporting [integration/config] - 10https://gerrit.wikimedia.org/r/404778 (https://phabricator.wikimedia.org/T185062) (owner: 10MaxSem) [20:07:09] oh dear, it’s buried in “scap prep master”, apparently [20:09:30] !log Reloading Zuul to deploy https://gerrit.wikimedia.org/r/404778 [20:09:35] Logged the message at https://wikitech.wikimedia.org/wiki/Release_Engineering/SAL [20:11:57] (03CR) 10Paladox: [C: 031] Add DeleteUserPages extension [integration/config] - 10https://gerrit.wikimedia.org/r/406852 (owner: 10Skizzerz) [20:15:42] magic. I can’t even find what “scap prep” means [20:25:28] 10Continuous-Integration-Config, 10Gerrit: Suggestion: Disable complete test(s) if is only commit message updated - https://phabricator.wikimedia.org/T186032#3932130 (10Dvorapa) Sure this is not 100% needed, but what about commit style checks? [20:29:19] mutante: submit a suggestion upstream on their discourse! 
[20:31:15] 10Beta-Cluster-Infrastructure, 10ORES, 10Scoring-platform-team (Current), 10User-Ladsgroup, 10Wikimedia-log-errors: Beta Cluster ORES celery worker dies - https://phabricator.wikimedia.org/T184276#3932174 (10Halfak) [20:31:18] 10Beta-Cluster-Infrastructure, 10ORES, 10Scoring-platform-team (Current), 10User-Ladsgroup, 10Wikimedia-log-errors: Beta Cluster ORES celery worker dies - https://phabricator.wikimedia.org/T184276#3878311 (10Halfak) 05Open>03Resolved [20:31:24] 10Beta-Cluster-Infrastructure, 10ORES, 10Patch-For-Review, 10Scoring-platform-team (Current), and 2 others: Move beta cluster ORES to its own machine - https://phabricator.wikimedia.org/T184282#3932173 (10Halfak) 05Open>03Resolved [20:32:16] 10Release-Engineering-Team (Watching / External), 10Scoring-platform-team (Current), 10Wikimedia-Incident: [Spike] Write reports about why Ext:ORES is helping cause server 500s and write tasks to fix - https://phabricator.wikimedia.org/T181010#3932206 (10Halfak) [20:32:19] 10Release-Engineering-Team (Watching / External), 10Scoring-platform-team (Current), 10Wikimedia-Incident: [Spike] Write reports about why Ext:ORES is helping cause server 500s and write tasks to fix - https://phabricator.wikimedia.org/T181010#3776331 (10Halfak) [20:32:28] 10Release-Engineering-Team (Watching / External), 10Scoring-platform-team (Current), 10Wikimedia-Incident: [Spike] Write reports about why Ext:ORES is helping cause server 500s and write tasks to fix - https://phabricator.wikimedia.org/T181010#3776331 (10Halfak) [20:40:37] 10Continuous-Integration-Config, 10Gerrit: Suggestion: Disable complete test(s) if is only commit message updated - https://phabricator.wikimedia.org/T186032#3932321 (10Zoranzoki21) >>! In T186032#3932130, @Dvorapa wrote: > Sure this is resource hungry, but what about commit style checks? Oh, I forgot on it.... 
[20:43:03] twentyafterfour: iono if you have time to help with this, but I’m trying to install scap-vagrant and I think this line is failing, > lxc.mount.entry=/scap /var/lib/lxc/$clone/rootfs/srv/deployment/scap/scap ro bind 0 0 [20:43:51] Scap vagrant is incredibly fragile [20:43:57] 10Continuous-Integration-Config: Suggestion: Disable complete test(s) if is only commit message updated - https://phabricator.wikimedia.org/T186032#3932323 (10greg) [20:44:01] (also we'd like to replace it) [20:44:55] no_justification: I can see why :) [20:45:18] We're offsite thru tomorrow btw [20:45:20] Maybe with puppet? [20:45:33] aha have fun! [20:51:59] I worked around with, sudo ln -s /srv/deployment/scap/scap /vagrant/scap [21:08:23] PROBLEM - Puppet errors on deployment-mx02 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [0.0] [21:25:56] Project mwext-phpunit-coverage-publish build #386: 04FAILURE in 2 min 32 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/386/ [21:25:59] 10Scap: Add DEPLOY_DIR env var to scap command checks - https://phabricator.wikimedia.org/T154612#2917726 (10awight) >>! In T154612#2917741, @dduvall wrote: > Should we just make it the cwd during check execution? For my purposes (T181071) that would be great, so I'm submitting a patch. I'm unsure what the rig... [21:26:25] Yippee, build fixed! [21:26:26] Project mwext-phpunit-coverage-publish build #387: 09FIXED in 28 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/387/ [21:28:34] Project mwext-phpunit-coverage-publish build #389: 04FAILURE in 49 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/389/ [21:29:47] Yippee, build fixed! 
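The two options floated for the fetch-check path problem (chdir into the deploy-cache/revs/ directory before running checks, or export a variable such as SCAP_TARGET_DIR), along with the T154612 suggestion of making the deploy directory the cwd during check execution, can be combined in a few lines. What follows is only a rough sketch of the idea and not scap's actual code: `run_check` is a hypothetical helper, SCAP_TARGET_DIR is just the name proposed in the channel, and a temporary directory stands in for the real revision directory.

```python
import os
import subprocess
import sys
import tempfile


def run_check(command, target_dir):
    """Run one check command from inside target_dir (hypothetical helper).

    Option 1 from the discussion: the child's working directory is target_dir.
    Option 2 from the discussion: the same path is exported as SCAP_TARGET_DIR
    (a proposed name, not a confirmed scap variable).
    """
    env = dict(os.environ)
    env["SCAP_TARGET_DIR"] = target_dir
    return subprocess.run(
        command,
        cwd=target_dir,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


if __name__ == "__main__":
    # A temporary directory stands in for the freshly fetched revision dir.
    with tempfile.TemporaryDirectory() as rev_dir:
        probe = [
            sys.executable,
            "-c",
            "import os; print(os.getcwd()); print(os.environ['SCAP_TARGET_DIR'])",
        ]
        result = run_check(probe, rev_dir)
        print(result.stdout)
```

With both in place, a relative command like "bash scap/fetch_check.sh" would resolve against the new checkout, and scripts that prefer an absolute path could read the variable instead.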
[21:29:47] Project mwext-phpunit-coverage-publish build #390: FIXED in 1 min 13 sec: https://integration.wikimedia.org/ci/job/mwext-phpunit-coverage-publish/390/ [21:40:39] awight: Actually, I was thinking Docker or somesuch [21:40:52] We still need some containers & such so we can spawn targets to test deploy against [21:40:55] * awight gets suspicious that no_justification is from the future [21:40:58] (containers, vms, etc) [21:41:08] awight: I can't believe I'm hearing the words out of my *own* mouth! [21:42:32] Remember you can always plead the 5th when the puppet secret police catch up with us [21:52:42] no_justification apparently we will have to remove ci upstream for gerrit older than 2.13 soon (not now). uses an old os and new ones don't support java 7 heh. [21:52:42] https://gerrit-review.googlesource.com/#/c/gerrit-ci-scripts/+/139511/ [21:53:53] * paladox had to figure out how to get chrome tests working [22:52:18] Release-Engineering-Team, Scap: Scap sync-file: report the file on IRC and SAL on failure - https://phabricator.wikimedia.org/T186064#3932707 (Volans) [22:57:29] Release-Engineering-Team, Scap: Scap sync-file: report the file on IRC/SAL on canary error rate failure - https://phabricator.wikimedia.org/T186064#3932727 (Volans) [22:58:31] Release-Engineering-Team, Scap: Scap: on canary failure, report the list of failed hosts - https://phabricator.wikimedia.org/T186065#3932731 (Volans) [22:59:30] https://groups.google.com/forum/#!topic/repo-discuss/qrPo4myORfE heh [23:01:48] Release-Engineering-Team, Scap: Scap sync-file: allow to sync multiple files in different directories - https://phabricator.wikimedia.org/T186067#3932756 (Volans) [23:28:32] Beta-Cluster-Infrastructure, Analytics, Analytics-EventLogging, User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3932886 (elukey) Open>stalled [23:37:15] Release-Engineering-Team, Scap: Scap sync-file: allow to sync 
multiple files in different directories - https://phabricator.wikimedia.org/T186067#3932937 (Niharika) [23:37:28] Release-Engineering-Team, Scap: Scap: on canary failure, report the list of failed hosts - https://phabricator.wikimedia.org/T186065#3932939 (Niharika) [23:51:32] Release-Engineering-Team, Scap: Scap sync-file: report the file on IRC/SAL on canary error rate failure - https://phabricator.wikimedia.org/T186064#3932969 (Niharika)