[06:29:09] PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:29:17] PROBLEM - check users on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:29:25] PROBLEM - check load on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:29:50] PROBLEM - ORES web node labs ores-web-02 on ores.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 325 bytes in 0.024 second response time https://wikitech.wikimedia.org/wiki/ORES
[06:30:11] PROBLEM - check disk on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:46:28] RECOVERY - ORES web node labs ores-web-02 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 1009 bytes in 0.047 second response time https://wikitech.wikimedia.org/wiki/ORES
[06:53:11] PROBLEM - check load on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refused connect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:53:17] RECOVERY - check users on ORES-web01.Experimental is OK: USERS OK - 1 users currently logged in
[06:53:25] RECOVERY - check load on ORES-web01.Experimental is OK: OK - load average: 0.37, 0.17, 0.36
[06:53:40] RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[06:53:47] PROBLEM - check disk on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refused connect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:53:52] PROBLEM - check users on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refused connect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[06:54:11] RECOVERY - check disk on ORES-web01.Experimental is OK: DISK OK
[06:56:12] PROBLEM - puppet on ORES-web02.Experimental is CRITICAL: connect to address 172.16.6.234 port 5666: Connection refused connect to host ores-web-02.ores.eqiad.wmflabs port 5666: Connection refused
[07:15:59] RECOVERY - puppet on ORES-web02.Experimental is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[07:17:11] RECOVERY - check load on ORES-web02.Experimental is OK: OK - load average: 0.09, 0.30, 0.33
[07:17:47] RECOVERY - check disk on ORES-web02.Experimental is OK: DISK OK
[07:17:52] RECOVERY - check users on ORES-web02.Experimental is OK: USERS OK - 1 users currently logged in
[07:55:32] (CR) jenkins-bot: Localisation updates from https://translatewiki.net. [extensions/ORES] - https://gerrit.wikimedia.org/r/525018 (owner: L10n-bot)
[12:41:30] ORES, Scoring-platform-team, Growth-Team, WMF-JobQueue, and 2 others: Fatal error during RecentChange::notifyEdit (deferred update) from ORES/RecentChangeSaveHookHandler - https://phabricator.wikimedia.org/T225199 (akosiaris) p:Unbreak!→High Lowering priority from UBN, since that was on J...
[13:57:42] Scoring-platform-team (Current), articlequality-modeling, draftquality-modeling, drafttopic-modeling, and 3 others: Decide on criteria for releases/versioning of model repos - https://phabricator.wikimedia.org/T228215 (Halfak)
[14:00:17] Jade, Scoring-platform-team (Current): Write API contract for Jade - https://phabricator.wikimedia.org/T217904 (Halfak) I think a good next step for the updated schema is to submit a patchset to gerrit. I can review and approve there. Otherwise, the Jade API spec looks good at first glance. I'm goin...
[14:01:31] woops. forgot to make myself back.
[14:01:38] I am feeling much better today.
[14:01:43] I slept a lot yesterday.
[14:02:09] I think it might be getting to the point where we want to shut down icinga2. We get a slew of false alarms every day.
[14:02:33] paladox, do you have any suggestions other than just moving ORES-in-labs monitoring to regular icinga?
[14:03:04] Scoring-platform-team, Growth-Team: Update srwiki thresholds for goodfaith model - https://phabricator.wikimedia.org/T223273 (Acamicamacaraca) Any updates here?
[14:03:19] Nope :(
[16:00:36] paladox, got it. I'll make a task.
[16:00:40] accraze, o/
[16:00:42] good morning :)
[16:03:54] mornin!
[16:04:24] async standup?
[16:08:57] my update:
[16:08:57] Y: retrained huwiki model, started on editquality docs, took a look at MWcomments abstractions w/ Nate
[16:08:57] T: update huwiki PR, complete training docs for editquality, add updated Jade schema patchset to gerrit
[16:11:30] Y: Sick. Mostly nothing. Just managed email backlog.
[16:12:34] T: Management work mostly. Might need to do something with our opportunity fund request. Review API spec and finish reviewing wireframes in the same shot. Merge huwiki stuff (probably)
[16:34:43] Taking lunch. Back in an hour
[17:41:01] ORES, Scoring-platform-team, Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Halfak) Is there documentation for meta.stream? Should we expect the same data in a different fie...
[17:42:22] ORES, Scoring-platform-team, Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Pchelolo) I'm not sure about the doc, @Ottomata might know more, but yes, the only change is `.met...
[17:44:02] ORES, Scoring-platform-team, Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Ottomata) https://github.com/wikimedia/eventgate#streams And we set `stream_field` to `meta.strea...
[17:45:58] Scoring-platform-team, Growth-Team (Current Sprint), Patch-For-Review: Contribution page: 'Hide probably good edits' displays entries that do not get ORES highlighted - https://phabricator.wikimedia.org/T176667 (SBisson) a:SBisson
[17:48:58] ORES, Scoring-platform-team, Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Halfak) It looks like we are using "meta.topic" here: https://github.com/wikimedia/ores/blob/maste...
[17:49:35] ORES, Scoring-platform-team (Current), Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Halfak)
[17:51:51] wikimedia/ores#1345 (precache_schema_field - 3f8ccad : halfak): The build failed. https://travis-ci.org/wikimedia/ores/builds/562704949
[17:52:07] Scoring-platform-team, Wikilabels: Document Wikilabels deploy process - https://phabricator.wikimedia.org/T227135 (Halfak) p:Triage→Normal
[17:52:14] Curses!
[17:52:31] foiled again?
[17:52:50] ORES, Scoring-platform-team (Current), Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Ottomata) Great! I left a comment there. It is 'meta.stream' not meta. schema. The [[...
[17:59:23] ORES, Scoring-platform-team (Current), Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Halfak) Thanks! Was working too fast. I think I've got your changes. Please take anot...
[17:59:41] Scoring-platform-team, Growth-Team, editquality-modeling, artificial-intelligence: Update RC Filters for new ORES capacities (July, 2019) - https://phabricator.wikimedia.org/T227094 (Halfak)
[18:00:00] Scoring-platform-team, editquality-modeling, artificial-intelligence: Deploy MSFT editquality model - https://phabricator.wikimedia.org/T227024 (Halfak) p:Triage→Normal
[18:00:24] ORES, Scoring-platform-team (Current), Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Ottomata) That should do it!
[18:02:56] groceryheist, any chance you're around and available to jump on a call?
[18:32:42] halfak: I was on my bike
[18:33:07] No sweat. I've got some todo items to chase you for :D E.g. which wiki seems to have the least adoption
[18:57:51] PROBLEM - ores-extension grafana alert on icinga1001 is CRITICAL: CRITICAL: ORES extension ( https://grafana.wikimedia.org/d/000000263/ores-extension ) is alerting: Service hits for obtaining thresholds alert. https://wikitech.wikimedia.org/wiki/ORES
[18:58:22] Hmm
[19:00:06] Looks like an MW deploy issue is causing this. I'm monitoring.
[19:07:35] Scoring-platform-team, CommRel-Specialists-Support (Jul-Sep-2019): Community Relations Specialist support for Scoring Platform - https://phabricator.wikimedia.org/T217232 (Halfak)
[19:08:18] Scoring-platform-team, CommRel-Specialists-Support (Jul-Sep-2019): Community Relations Specialist support for Scoring Platform - https://phabricator.wikimedia.org/T217232 (Halfak) I just applied the template that @mcruzWMF linked me to the description of the task. Let me know if you have any questions.
[19:15:48] * accraze wanders off in search of food
[19:17:39] Scoring-platform-team: Add feature for edit namespace to edit quality models - https://phabricator.wikimedia.org/T226574 (Halfak) https://ores.wikimedia.org/v3/scores/enwiki/56782332/damaging?features ` "feature.revision.page.is_articleish": true, "feature.revision.page.is_draftspace": false, "feature.revis...
[19:17:52] Scoring-platform-team, editquality-modeling, artificial-intelligence: Add feature for edit namespace to edit quality models - https://phabricator.wikimedia.org/T226574 (Halfak)
[19:18:31] Scoring-platform-team (Current): Build tool to guess what tool was used to make reverts on Wikimedia wikis - https://phabricator.wikimedia.org/T226426 (Halfak)
[19:22:03] Scoring-platform-team (Current), Patch-For-Review: Add Andy Craze to icinga notification for ORES related monitoring - https://phabricator.wikimedia.org/T226417 (Halfak)
[19:22:58] ORES, Scoring-platform-team (Current), Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Halfak) a:Halfak
[19:23:04] Scoring-platform-team (Current), Patch-For-Review: Add Andy Craze to icinga notification for ORES related monitoring - https://phabricator.wikimedia.org/T226417 (Halfak) a:Halfak
[19:25:37] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (Halfak) @ACraze, this task should be interesting. I want to talk to you about some of ORES limitations at some point and we can work together on a roadmap that makes sense.
[19:26:06] ORES, Scoring-platform-team: [Discuss] Future ORES architecture - https://phabricator.wikimedia.org/T226193 (Halfak) p:Triage→Low a:Halfak
[19:27:22] ORES, Scoring-platform-team, Wikidata, editquality-modeling, artificial-intelligence: ORES is too slow for ORC tool - https://phabricator.wikimedia.org/T226120 (Halfak) Honestly, it is hard to believe that ORES is too slow. But it's important that I understand what the ORC tool *needs* befor...
[19:27:48] ORES, Scoring-platform-team, Wikidata, editquality-modeling, artificial-intelligence: ORES is too slow for ORC tool - https://phabricator.wikimedia.org/T226120 (Halfak) p:Triage→Normal
[19:28:31] Scoring-platform-team: Onboard Andy Craze -- Accounts and access - https://phabricator.wikimedia.org/T226416 (Halfak)
[19:29:42] OK looks like the MW deploy issue is still ongoing and we're still getting hammered by requests for ORES thresholds.
[19:30:05] I'm going to get a snapshot of this and file a task. We shouldn't be getting DOS'd by our own systems even when they are having an event.
[19:31:51] MediaWiki-extensions-ORES, Scoring-platform-team, Growth-Team: MW hammering ORES service with threshold lookup requests - https://phabricator.wikimedia.org/T228798 (Halfak)
[19:36:13] MediaWiki-extensions-ORES, Scoring-platform-team, Growth-Team: MW hammering ORES service with threshold lookup requests - https://phabricator.wikimedia.org/T228798 (Halfak) @Catrope, do you know why ORES is getting hammered with these requests? Do all of the thresholds refresh on reboot? Maybe this...
[19:36:33] ORES, Scoring-platform-team: Add all models to fakewiki - https://phabricator.wikimedia.org/T228799 (MusikAnimal)
[19:42:08] ORES, Scoring-platform-team: Add all models to fakewiki - https://phabricator.wikimedia.org/T228799 (Halfak) We'll need to come up with some clever way to produce deterministic output. Right now the editquality models use the revision Id to come up with a "prediction". We probably want to follow that p...
[20:07:32] ORES, Scoring-platform-team, Wikidata, editquality-modeling, artificial-intelligence: ORES is too slow for ORC tool - https://phabricator.wikimedia.org/T226120 (YMS) Hi Halfak. I'm sorry that I didn't reply when you created this ticket. I was on vacation at that time. Also, now I'm on a table...
[20:11:43] MediaWiki-extensions-ORES, Scoring-platform-team, Growth-Team: MW hammering ORES service with threshold lookup requests - https://phabricator.wikimedia.org/T228798 (JTannerWMF) This should be cached so the #growth-team are going to take a look at it to determine how to proceed.
[20:13:07] RECOVERY - ores-extension grafana alert on icinga1001 is OK: OK: ORES extension ( https://grafana.wikimedia.org/d/000000263/ores-extension ) is not alerting. https://wikitech.wikimedia.org/wiki/ORES
[21:44:06] finally just saw the racing videos, great stuff!
[22:12:02] MediaWiki-extensions-ORES, Scoring-platform-team, Growth-Team: MW hammering ORES service with threshold lookup requests - https://phabricator.wikimedia.org/T228798 (Catrope) Open→Invalid The thresholds are cached in memcached, and our memcached servers had some issues today. The spikes you sa...
[23:52:55] Scoring-platform-team, CommRel-Specialists-Support (Jul-Sep-2019): Community Relations Specialist support for Scoring Platform - https://phabricator.wikimedia.org/T217232 (mcruzWMF)
[23:53:19] Scoring-platform-team, CommRel-Specialists-Support (Jul-Sep-2019): Community Relations Specialist support for Scoring Platform - https://phabricator.wikimedia.org/T217232 (mcruzWMF)
[23:53:41] Scoring-platform-team, Community comms and outreach, CommRel-Specialists-Support (Jul-Sep-2019): Community Relations Specialist support for Scoring Platform - https://phabricator.wikimedia.org/T217232 (mcruzWMF)
[23:59:01] halfak: what's up?