[06:59:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:00:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.065 second response time
[07:48:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:52:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.105 second response time
[08:09:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:10:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.066 second response time
[08:43:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:46:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.074 second response time
[09:05:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:06:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.075 second response time
[09:11:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:19:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.551 second response time
[09:22:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:23:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.039 second response time
[09:24:30] It has been flip flopping a lot
[09:37:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:38:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.553 second response time
[09:41:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:42:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.051 second response time
[09:52:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[09:53:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.556 second response time
[10:02:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:04:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.569 second response time
[10:17:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:18:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.600 second response time
[10:45:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[10:46:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.072 second response time
[11:08:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:09:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.047 second response time
[11:15:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:17:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.061 second response time
[11:22:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:23:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 1.068 second response time
[11:29:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:32:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.541 second response time
[11:36:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:39:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.553 second response time
[12:17:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:18:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.075 second response time
[12:38:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:39:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.057 second response time
[12:46:53] PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:47:08] Eeek! Missed the pings. Looking into this now.
[12:47:43] RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 442 bytes in 0.550 second response time
[12:48:42] grafana-labs is such a mess
[12:51:33] RECOVERY - ORES web node labs ores-web-03 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 457 bytes in 0.679 second response time
[12:56:43] I've restarted ores-web-03's services. Let's see how long this lasts.
[13:04:05] Amir1, ^ cleaned up the last issue you brought up
[13:05:06] halfak: {{merrged}} \o/
[13:05:07] [1] https://meta.wikimedia.org/wiki/Template:merrged
[13:05:47] \o/
[15:33:09] halfak: Amir1: o/ Here's a rough diagram for scoring step we were chatting about, https://github.com/adamwight/ores-diagrams/blob/master/scoring_dfd.pdf
[15:33:24] Not sure if it's helpful to explode the "calculate score" step?
[15:34:05] The flow seems funny. It should branch, I think
[15:34:18] E.g. if the score is cached, it never goes to the task queue
[15:35:08] ah right, so an arrow from celery to the cache
[15:37:22] oops--no, "calculate score" -> cache
[15:38:29] https://commons.wikimedia.org/wiki/File:ORES_request_flow_(cache%2Bcelery).svg
[15:39:53] new version pushed.
[15:39:57] It still looks funny to me, though
[15:41:03] Not sure what "model" is doing on the left.
[15:41:45] yah any of these details are fair game to remove if you think they distract from the main point of the component data flow
[15:42:16] I put model there as a reminder that the serialized model precedes (thus left) and feeds into scoring
[15:42:42] elide?
[15:43:40] otherwise, I was actually thinking of going in the other direction, and providing more details of the inputs and intermediate steps in scoring
[15:44:06] e.g. "calculate features"
[15:49:31] *extract features
[15:55:12] Just made it weirder by splitting out the RC stream
[15:57:46] adding "precaching job" might help with that
[16:00:21] Revision-Scoring-As-A-Service-Backlog: Auto config wikilabels using dbnames - https://phabricator.wikimedia.org/T154433#2916618 (Halfak) Here's a parser that works. https://gist.github.com/halfak/57de7bc6709de574a6bfe7051370fea9
[16:00:57] {done}
[16:01:25] * awight researches "I <3 TeX" pin
[16:05:22] Another early diagram worth heads-upping about: https://github.com/adamwight/ores-diagrams/blob/master/use_cases.svg
[16:05:37] (ignore multiplicity, those lines got corrupted somehow)
[16:18:27] * awight quickly backs away from "I love LaTeX" search
[18:27:29] o/ awight
[18:27:34] sorry I got distracted
[18:28:57] no worries--I'm doing a side job and on paternity leave anyway :D
[18:30:14] Not sure I fully grasp the use_cases.svg, but I appreciate the different human actors.
[18:31:00] I see how the model comes in now.
[18:31:58] the use case diagram is pretty low-level, we might want to do another one with the bird's-eye business goals 'n' stuff
[18:34:13] Revision-Scoring-As-A-Service, Research Ideas, Wikimedia-Developer-Summit (2017): Where to surface AI in Wikimedia Projects - https://phabricator.wikimedia.org/T148690#2917207 (dr0ptp4kt)
[18:44:16] Revision-Scoring-As-A-Service, Research Ideas, Wikimedia-Developer-Summit (2017): Where to surface AI in Wikimedia Projects - https://phabricator.wikimedia.org/T148690#2917225 (Mholloway)
[21:42:15] https://github.com/wiki-ai/wikilabels/pull/144
[21:42:21] Well that was way harder than I thought
[22:54:03] Revision-Scoring-As-A-Service, ORES: Split wheels repo into Prod/WMFLabs branches and maintain independence - https://phabricator.wikimedia.org/T154436#2918187 (Halfak)
[22:54:25] Revision-Scoring-As-A-Service, Wikilabels: Auto config wikilabels using dbnames - https://phabricator.wikimedia.org/T154433#2918190 (Halfak)
[22:54:49] Revision-Scoring-As-A-Service, Wikilabels: Auto config wikilabels using dbnames - https://phabricator.wikimedia.org/T154433#2911298 (Halfak) a:Halfak
[22:55:27] Revision-Scoring-As-A-Service, Wikilabels: Auto config wikilabels using dbnames - https://phabricator.wikimedia.org/T154433#2911298 (Halfak) See https://github.com/wiki-ai/wikilabels/issues/142
[22:55:49] Revision-Scoring-As-A-Service, Wikilabels, Patch-For-Review, User-Ladsgroup: Minification and bundling for wikilabels assets - https://phabricator.wikimedia.org/T154122#2901756 (Halfak)
[22:59:34] Revision-Scoring-As-A-Service, ORES, Documentation: List ORES use cases - https://phabricator.wikimedia.org/T154440#2918203 (Halfak) Actors and use cases are a but different I think. The roles a human could take are: * Model engineer -- Feature engineering and modeling work * Operations engineer --...
[23:05:55] Revision-Scoring-As-A-Service, Research Ideas, Wikimedia-Developer-Summit (2017): What should an AI do you for you? Building an AI Wishlist. - https://phabricator.wikimedia.org/T147710#2701365 (cscott) I'm interested in expanding our use of machine translation to aid our editors. I wonder if we migh...