[06:31:18] <icinga2-wm>	 PROBLEM - check users on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:31:25] <icinga2-wm>	 PROBLEM - check load on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:32:11] <icinga2-wm>	 PROBLEM - check disk on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[06:34:09] <icinga2-wm>	 PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[07:23:10] <icinga2-wm>	 RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[07:23:17] <icinga2-wm>	 RECOVERY - check users on ORES-web01.Experimental is OK: USERS OK - 1 users currently logged in
[07:23:25] <icinga2-wm>	 RECOVERY - check load on ORES-web01.Experimental is OK: OK - load average: 0.15, 0.13, 0.24
[07:24:12] <icinga2-wm>	 RECOVERY - check disk on ORES-web01.Experimental is OK: DISK OK
[14:04:01] <wikibugs>	 Scoring-platform-team (Current), Wikilabels, editquality-modeling, artificial-intelligence: Re-label huwiki damaging and badfaith edits - https://phabricator.wikimedia.org/T223882 (Halfak) We should have the updated model deployed early this week.  Sorry for the delay!  We had a couple minor hicc...
[14:15:43] <icinga2-wm>	 PROBLEM - ssh on ORES-worker02.experimental is CRITICAL: connect to address ores-worker-02.ores.eqiad.wmflabs and port 22: No route to host
[14:15:43] <icinga2-wm>	 PROBLEM - check load on ORES-worker02.experimental is CRITICAL: connect to address 172.16.3.125 port 5666: No route to host; connect to host ores-worker-02.ores.eqiad.wmflabs port 5666: No route to host
[14:15:50] <icinga2-wm>	 PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: connect to address 172.16.3.125 port 5666: No route to host; connect to host ores-worker-02.ores.eqiad.wmflabs port 5666: No route to host
[14:15:50] <icinga2-wm>	 PROBLEM - check users on ORES-worker02.experimental is CRITICAL: connect to address 172.16.3.125 port 5666: No route to host; connect to host ores-worker-02.ores.eqiad.wmflabs port 5666: No route to host
[14:16:09] <icinga2-wm>	 PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: connect to address 172.16.3.125 port 5666: No route to host; connect to host ores-worker-02.ores.eqiad.wmflabs port 5666: No route to host
[14:16:52] <icinga2-wm>	 PROBLEM - Host ORES-worker02.experimental is DOWN: CRITICAL - Host Unreachable (ores-worker-02.ores.eqiad.wmflabs)
[14:17:29] <halfak>	 Hmm.  Looks like we might actually be down. 
[14:18:51] <icinga2-wm>	 RECOVERY - Host ORES-worker02.experimental is UP: PING OK - Packet loss = 0%, RTA = 1.09 ms
[14:19:15] <icinga2-wm>	 PROBLEM - ping4 on ORES-redis02.experimental is CRITICAL: CRITICAL - Host Unreachable (ores-redis-02.ores.eqiad.wmflabs)
[14:19:45] <icinga2-wm>	 PROBLEM - Host ORES-redis02.experimental is DOWN: CRITICAL - Host Unreachable (ores-redis-02.ores.eqiad.wmflabs)
[14:20:42] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-02 on ores.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - 2103 bytes in 0.041 second response time https://wikitech.wikimedia.org/wiki/ORES
[14:21:00] <icinga-wm>	 PROBLEM - ORES worker labs on ores.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - 2103 bytes in 1.729 second response time https://wikitech.wikimedia.org/wiki/ORES
[14:21:00] <icinga2-wm>	 RECOVERY - puppet on ORES-worker02.experimental is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[14:21:38] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-01 on ores.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 INTERNAL SERVER ERROR - 2103 bytes in 0.028 second response time https://wikitech.wikimedia.org/wiki/ORES
[14:21:45] <icinga2-wm>	 RECOVERY - Host ORES-redis02.experimental is UP: PING OK - Packet loss = 0%, RTA = 21.63 ms
[14:22:42] <icinga2-wm>	 PROBLEM - check load on ORES-worker01.experimental is CRITICAL: connect to address 172.16.3.127 port 5666: No route to host; connect to host ores-worker-01.ores.eqiad.wmflabs port 5666: No route to host
[14:22:42] <icinga2-wm>	 PROBLEM - ssh on ORES-worker01.experimental is CRITICAL: connect to address ores-worker-01.ores.eqiad.wmflabs and port 22: No route to host
[14:23:06] <icinga2-wm>	 PROBLEM - Host ORES-worker01.experimental is DOWN: CRITICAL - Host Unreachable (ores-worker-01.ores.eqiad.wmflabs)
[14:23:38] <halfak>	 Looks like this is planned downtime. 
[14:23:46] <halfak>	 But holy notification, batman. 
[14:24:00] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-02 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 6.041 second response time https://wikitech.wikimedia.org/wiki/ORES
[14:27:04] <icinga2-wm>	 RECOVERY - Host ORES-worker01.experimental is UP: PING OK - Packet loss = 0%, RTA = 179.00 ms
[14:28:56] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-02 on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/ORES
[14:29:46] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-01 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 4.788 second response time https://wikitech.wikimedia.org/wiki/ORES
[14:30:24] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-02 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 0.974 second response time https://wikitech.wikimedia.org/wiki/ORES
[14:30:40] <icinga-wm>	 RECOVERY - ORES worker labs on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 979 bytes in 1.372 second response time https://wikitech.wikimedia.org/wiki/ORES
[15:15:51] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-02 on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/ORES
[15:17:21] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-02 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 979 bytes in 4.254 second response time https://wikitech.wikimedia.org/wiki/ORES
[16:16:52] <icinga2-wm>	 PROBLEM - check users on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: No route to host; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: No route to host
[16:16:52] <icinga2-wm>	 PROBLEM - check disk on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: No route to host; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: No route to host
[16:16:55] <icinga2-wm>	 PROBLEM - ssh on ORES-web01.Experimental is CRITICAL: connect to address ores-web-01.ores.eqiad.wmflabs and port 22: No route to host
[16:17:10] <icinga2-wm>	 PROBLEM - Host ORES-web01.Experimental is DOWN: CRITICAL - Host Unreachable (ores-web-01.ores.eqiad.wmflabs)
[16:18:05] <icinga-wm>	 PROBLEM - ORES web node labs ores-web-01 on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/ORES
[16:19:10] <icinga2-wm>	 RECOVERY - Host ORES-web01.Experimental is UP: PING OK - Packet loss = 0%, RTA = 0.57 ms
[16:19:25] <icinga2-wm>	 RECOVERY - ping4 on ORES-web01.Experimental is OK: PING OK - Packet loss = 0%, RTA = 0.41 ms
[16:19:33] <icinga-wm>	 RECOVERY - ORES web node labs ores-web-01 on ores.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 981 bytes in 0.099 second response time https://wikitech.wikimedia.org/wiki/ORES
[16:24:08] <wikibugs>	 ORES, Scoring-platform-team (Current): ORES deployment, Early August 2019 - https://phabricator.wikimedia.org/T229848 (Halfak)
[16:25:52] <wikibugs>	 ORES, Scoring-platform-team (Current): ORES deployment, Early August 2019 - https://phabricator.wikimedia.org/T229848 (Halfak)
[16:25:54] <wikibugs>	 Scoring-platform-team (Current), articlequality-modeling, draftquality-modeling, drafttopic-modeling, and 3 others: Retrain models with revscoring 2.5.1 - https://phabricator.wikimedia.org/T229351 (Halfak)
[16:25:56] <wikibugs>	 Scoring-platform-team (Current), revscoring, artificial-intelligence: Wikibase references is a count of ref claims, should be reference statements - https://phabricator.wikimedia.org/T229029 (Halfak)
[16:25:58] <wikibugs>	 ORES, Scoring-platform-team (Current), Analytics-EventLogging, Analytics-Kanban, and 4 others: Fix "Must provide the 'topic' parameter" in ORES /precache endpoint - https://phabricator.wikimedia.org/T228689 (Halfak)
[16:26:05] <wikibugs>	 Scoring-platform-team (Current), editquality-modeling, User-Tgr, artificial-intelligence: Retrain damaging/goodfaith models for huwiki - https://phabricator.wikimedia.org/T228078 (Halfak)
[16:26:07] <wikibugs>	 Scoring-platform-team (Current), revscoring, artificial-intelligence: On en.wikipedia, ref tags inserted by the shortened footnote template, {{sfn}}, are not counted in ORES features - https://phabricator.wikimedia.org/T227153 (Halfak)
[16:26:07] <AsimovBot>	 [1] https://meta.wikimedia.org/wiki/Template:sfn
[16:26:36] <icinga2-wm>	 PROBLEM - check disk on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[16:26:38] <icinga2-wm>	 PROBLEM - check users on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[16:27:15] <icinga2-wm>	 PROBLEM - check load on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[16:30:40] <icinga2-wm>	 PROBLEM - puppet on ORES-web01.Experimental is CRITICAL: connect to address 172.16.3.131 port 5666: Connection refused; connect to host ores-web-01.ores.eqiad.wmflabs port 5666: Connection refused
[16:30:47] <wikibugs>	 Scoring-platform-team (Current): Develop automated release strategy from travis CI - https://phabricator.wikimedia.org/T229850 (Halfak)
[16:54:37] <icinga2-wm>	 RECOVERY - check disk on ORES-web01.Experimental is OK: DISK OK
[16:54:38] <icinga2-wm>	 RECOVERY - check users on ORES-web01.Experimental is OK: USERS OK - 1 users currently logged in
[16:55:16] <icinga2-wm>	 RECOVERY - check load on ORES-web01.Experimental is OK: OK - load average: 0.68, 0.85, 1.52
[16:55:17] <icinga2-wm>	 RECOVERY - puppet on ORES-web01.Experimental is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:14:18] <accraze>	 hey halfak is there a different proxy env var I need to set on beta? It keeps hanging when I do a git `pull`
[17:14:41] <halfak>	 hmm.  I don't think so.  Let me check what I have. 
[17:16:01] <halfak>	 No proxy needed on beta as far as I can tell
[17:16:09] <accraze>	 hmmm
[17:17:23] <accraze>	 yeah when I do a git pull I get: Failed to connect to webproxy.eqiad.wmnet port 8080: Connection timed out
[17:21:59] <halfak>	 Oh.  No proxy.  Remove that 
[17:22:24] <halfak>	 eqiad.wmnet and eqiad.wmflabs are firewalled apart 
[17:22:30] <accraze>	 ahhh
[17:25:15] <accraze>	 I forgot I had added that to my bashrc
[17:25:19] <accraze>	 alls good
[17:25:21] <accraze>	 thanks!
[17:33:55] <halfak>	 :)  no problem
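
The exchange above comes down to a stale proxy setting exported from `~/.bashrc`, which made `git pull` try to reach `webproxy.eqiad.wmnet:8080` from a host that cannot route there. As a minimal sketch, assuming the proxy was set through the usual `http_proxy` / `https_proxy` / `all_proxy` environment variables (the log does not say which one), the following shows one way to spot such variables and to run a command without them. The helper names and the sample environment are hypothetical and are not taken from the ORES deploy docs.

```python
import os
import subprocess

# Proxy-related environment variables commonly honored by HTTP clients.
# Names are compared case-insensitively, since tools differ on which
# spelling they honor.
PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy")


def find_proxy_vars(env):
    """Return the proxy-related entries of an environment mapping."""
    return {k: v for k, v in env.items() if k.lower() in PROXY_VARS}


def without_proxy(env):
    """Return a copy of env with the proxy-related entries removed."""
    return {k: v for k, v in env.items() if k.lower() not in PROXY_VARS}


def git_pull_without_proxy(repo_dir):
    """Run `git pull` in repo_dir with proxy variables stripped.

    Hypothetical helper for illustration; it is defined here but not called.
    """
    return subprocess.run(
        ["git", "pull"], cwd=repo_dir, env=without_proxy(os.environ), check=False
    )


# Made-up environment resembling the situation described in the log.
sample_env = {
    "HOME": "/home/example",
    "https_proxy": "http://webproxy.eqiad.wmnet:8080",
}
print(find_proxy_vars(sample_env))
print(without_proxy(sample_env))
```

This only works around the setting for a single command. The lasting fix is the one accraze describes: delete the export from `~/.bashrc`. If the variables are clean and the pull still hangs, `git config --get http.proxy` shows whether a proxy was also set in git's own configuration.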
[18:06:00] <accraze>	 got another question halfak
[18:06:34] <accraze>	 in the deploy docs it says to backport commits to gerrit from the wmf deploy repo...
[18:06:55] <accraze>	 does this just mean cherry-pick the commits over to the gerrit repo?
[18:08:32] <halfak>	 Hmm.  I'm not sure. 
[18:08:33] * halfak reads. 
[18:09:29] <halfak>	 Oh!  No.  This means that the gerrit (prod) and github (wmflabs) repos for ORES have totally separate histories. 
[18:09:41] <halfak>	 And thus changes need to be made in parallel or allowed to diverge on purpose. 
[18:10:08] <accraze>	 ahh ok so I need to add the changes to gerrit
[18:10:17] <accraze>	 ?
[18:13:53] <halfak>	 right.  The prod deploy repo is the primary place we update for beta and prod. 
[18:14:07] <halfak>	 The wmflabs deploy is intended to be used to experiment, but it largely matches our prod config. 
[18:14:22] <halfak>	 Every now and then we have a model that we don't want to send to prod.  E.g. the translatewiki model. 
[18:16:51] * halfak --> lunch
[19:12:51] <accraze>	 hey halfak, it looks like the gerrit mirrors of articlequality et al. have not been updated
[19:13:05] <accraze>	 is there a manual way to do that?
[19:13:25] <halfak>	 Oh yes!  We have a work-around for that.  I'll do it quick.  Damn. 
[19:17:16] * halfak pushes things to articlequality
[19:21:17] <halfak>	 accraze, https://github.com/wikimedia/editquality/pull/210
[19:21:28] <halfak>	 I just saw that I forgot to push a bunch of changes I made last week :|
[19:21:31] <halfak>	 Ha!
[19:22:56] <accraze>	 cool looks good, waiting on travis to merge
[19:23:36] <halfak>	 Blocked on gerrit nonsense. 
[19:23:40] <halfak>	 See #wikimedia-releng
[19:55:35] <halfak>	 accraze, should be good to go
[20:00:11] <accraze>	 cool seems like we're back in business
[20:02:11] <accraze>	 ahh actually the ores mirror doesn't seem to be updated still
[20:02:26] <halfak>	 Oh that's crazy. 
[20:06:28] <halfak>	 accraze, try again
[20:06:39] <accraze>	 it worked!
[20:18:56] <wikibugs>	 (PS1) Accraze: Release revscoring v2.5.1 [services/ores/deploy] - https://gerrit.wikimedia.org/r/528258 (https://phabricator.wikimedia.org/T229848)
[20:48:50] * accraze needs sustenance
[21:31:24] <wikibugs>	 (CR) Halfak: [V: +2 C: +2] Release revscoring v2.5.1 [services/ores/deploy] - https://gerrit.wikimedia.org/r/528258 (https://phabricator.wikimedia.org/T229848) (owner: Accraze)
[21:31:41] <halfak>	 merged!
[21:31:48] <halfak>	 Sorry for the delay.  Just saw it. 
[22:02:18] <accraze>	 thanks halfak!