[00:57:09] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[00:58:24] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[00:59:04] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[00:59:39] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[03:17:13] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[03:18:28] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[03:19:08] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[03:19:43] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[05:37:15] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[05:38:30] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[05:39:10] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[05:39:45] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[07:57:19] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[07:58:33] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[07:59:13] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[07:59:48] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[10:17:20] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[10:18:35] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[10:19:15] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[10:19:50] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[10:43:48] wikimedia/revscoring#1634 (zhwiki - f3917ea : halfak): The build passed. https://travis-ci.org/wikimedia/revscoring/builds/534141365
[10:54:40] wikimedia/revscoring#1635 (zhwiki - 58ac6de : halfak): The build was broken. https://travis-ci.org/wikimedia/revscoring/builds/534143285
[12:37:23] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[12:38:38] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[12:39:18] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[12:39:53] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[13:48:48] O/ halfak_ whenever you are able to work on that task for icinga2, let me know if you need help (i know you been busy so whenever your able is fine)
[14:15:26] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Serbian-Sites, artificial-intelligence: Investigate srwiki goodfaith model, why is it so bad? - https://phabricator.wikimedia.org/T199355 (Acamicamacaraca) Wow. From mud to gold :)
[14:26:06] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Serbian-Sites, artificial-intelligence: Investigate srwiki goodfaith model, why is it so bad? - https://phabricator.wikimedia.org/T199355 (Petar.petkovic) >>! In T199355#5193386, @Acamicamacaraca wrote: > Wow. From mud to gold...
[14:42:37] Scoring-platform-team (Current), Wikilabels, editquality-modeling, Serbian-Sites, artificial-intelligence: Investigate srwiki goodfaith model, why is it so bad? - https://phabricator.wikimedia.org/T199355 (Acamicamacaraca)
[14:42:44] Scoring-platform-team (Research), Serbian-Sites: New labeling campaign for srwiki - https://phabricator.wikimedia.org/T220556 (Acamicamacaraca) Open→Resolved
[14:57:26] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[14:58:41] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[14:59:21] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[14:59:56] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[16:37:36] Hey Zppix
[16:37:40] just digging in now
[16:39:30] Scoring-platform-team, VPS-project-icinga2, User-Zppix: Ensure ORES experimental hosts nrpe_local.cfg allows icinga2’s ip - https://phabricator.wikimedia.org/T223578 (Halfak) Looks like we have `allowed_hosts=172.16.7.178`. So maybe the icinga2 server changed?
[16:42:11] https://github.com/wikimedia/puppet/blob/011a633fe9f09e64715dc198d1e0fae9028d7cf1/hieradata/labs.yaml#L271
[16:42:14] Looks relevant
[16:42:43] Scoring-platform-team, VPS-project-icinga2, User-Zppix: Ensure ORES experimental hosts nrpe_local.cfg allows icinga2’s ip - https://phabricator.wikimedia.org/T223578 (Halfak) Looks like this is set from here: https://github.com/wikimedia/puppet/blob/011a633fe9f09e64715dc198d1e0fae9028d7cf1/hieradata/...
[16:43:12] halfak_: i think you have to set it manually
[16:43:30] No way. The file says it is written by puppet and to not edit it manually.
[16:44:09] paladox: ^
[16:45:07] wikimedia/revscoring#1637 (zhwiki - e36e0fc : halfak): The build was fixed. https://travis-ci.org/wikimedia/revscoring/builds/534218946
[16:45:21] ya damn right, Travis.
[16:45:38] Lol
[16:46:57] Gonna take a nap, but I'll be back in a couple of hours
[16:47:38] K
[17:07:08] hare yup, you have to change the var to the new format
[17:07:12] uh wrong ping
[17:07:19] Zppix ^^
[17:17:29] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[17:18:44] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[17:19:24] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[17:19:59] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[17:23:09] Scoring-platform-team, VPS-project-icinga2, User-Zppix: Ensure ORES experimental hosts nrpe_local.cfg allows icinga2’s ip - https://phabricator.wikimedia.org/T223578 (Paladox) @Halfak hi, you need to follow https://wikitech.wikimedia.org/w/index.php?title=Hiera%3AGit&type=revision&diff=1826346&oldid=...
[18:09:05] Jade: Schema drift between Beta Cluster and Jade repo - https://phabricator.wikimedia.org/T223747 (Harej)
[18:10:21] Jade: Schema drift between Beta Cluster and Jade repo - https://phabricator.wikimedia.org/T223747 (Reedy) `lang=sql MariaDB [enwiki]> explain jade_diff_judgment; +----------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra...
[18:11:56] Jade: Schema drift between Beta Cluster and Jade repo - https://phabricator.wikimedia.org/T223747 (Reedy) Basically, the problem is no one added incremental patch files. Just added more columns and indexes
[18:12:43] Jade: Schema drift between Beta Cluster and Jade repo - https://phabricator.wikimedia.org/T223747 (Reedy) Caused by https://github.com/wikimedia/mediawiki-extensions-Jade/commit/155d040942c83bbbd3ee795e54404d7d8af65615#diff-ac5c74b64b4b8352ef2f181affb5ac2a and https://github.com/wikimedia/mediawiki-extension...
[19:09:22] Scoring-platform-team, VPS-project-icinga2, User-Zppix: Ensure ORES experimental hosts nrpe_local.cfg allows icinga2’s ip - https://phabricator.wikimedia.org/T223578 (Halfak) Like this? https://wikitech.wikimedia.org/w/index.php?title=Hiera%3AOres&type=revision&diff=1826556&oldid=1821792
[19:37:33] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[19:38:48] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[19:39:28] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[19:40:03] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[20:16:25] PROBLEM - ores-extension grafana alert on icinga1001 is CRITICAL: CRITICAL: ORES extension ( https://grafana.wikimedia.org/d/000000263/ores-extension ) is alerting: Service hits for obtaining thresholds alert. https://wikitech.wikimedia.org/wiki/ORES
[20:16:33] PROBLEM - ores grafana alert on icinga1001 is CRITICAL: CRITICAL: ORES advanced metrics ( https://grafana.wikimedia.org/d/vAN_bQemz/ores-advanced-metrics ) is alerting: Overload errors alert. https://wikitech.wikimedia.org/wiki/ORES
[20:16:48] Opes. not us.
[20:17:53] RECOVERY - ores-extension grafana alert on icinga1001 is OK: OK: ORES extension ( https://grafana.wikimedia.org/d/000000263/ores-extension ) is not alerting. https://wikitech.wikimedia.org/wiki/ORES
[20:18:00] RECOVERY - ores grafana alert on icinga1001 is OK: OK: ORES advanced metrics ( https://grafana.wikimedia.org/d/vAN_bQemz/ores-advanced-metrics ) is not alerting. https://wikitech.wikimedia.org/wiki/ORES
[20:19:26] Scoring-platform-team, editquality-modeling, artificial-intelligence: Include pinyin for zhwiki damaging model - https://phabricator.wikimedia.org/T223750 (Halfak)
[20:22:58] Scoring-platform-team, editquality-modeling, artificial-intelligence: Include pinyin for zhwiki damaging model - https://phabricator.wikimedia.org/T223750 (zhuyifei1999) Fuzzys: * c, ch * z, zh * s, sh * en, eng * in, ing These I don't use: * an, ang * l, n * f, h * l, r * g, k * l,
[20:23:51] wikimedia/revscoring#1639 (nlwiki_better_language_assets - 2787f53 : halfak): The build passed. https://travis-ci.org/wikimedia/revscoring/builds/534270560
[21:11:17] wikimedia/revscoring#1640 (zhwiki - ec65506 : halfak): The build was broken. https://travis-ci.org/wikimedia/revscoring/builds/534281007
[21:36:43] Scoring-platform-team, VPS-project-icinga2, User-Zppix: Ensure ORES experimental hosts nrpe_local.cfg allows icinga2’s ip - https://phabricator.wikimedia.org/T223578 (Paladox) @Halfak nope, like: ` monitoring_hosts: - 127.0.0.1 - 172.16.1.180 `
[21:57:36] PROBLEM - check load on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[21:58:51] PROBLEM - check users on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[21:59:31] PROBLEM - check disk on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer
[22:00:06] PROBLEM - puppet on ORES-worker02.experimental is CRITICAL: CHECK_NRPE: Error - Could not connect to 172.16.3.125: Connection reset by peer