[08:34:19] Scoring-platform-team-Backlog, Wikilabels, editquality-modeling, artificial-intelligence: Complete Albanian Wikipedia editquality campaign - https://phabricator.wikimedia.org/T163010#3358136 (Arianit) Done. Thanks!
[14:58:11] halfak: o/
[15:01:43] CUSTOM - Host ores-05 is UP: PING OK - Packet loss = 0%, RTA = 0.46 ms test
[15:10:09] CUSTOM - Host ores-redis.01 is UP: PING OK - Packet loss = 0%, RTA = 0.58 ms test
[15:16:51] PROBLEM - mem-check on ores-redis.01 is UNKNOWN: NRPE: Unable to read output
[15:16:51] PROBLEM - mem-check on ores-web-05 is UNKNOWN: NRPE: Unable to read output
[15:16:52] PROBLEM - mem-check on ores-worker-05 is UNKNOWN: NRPE: Unable to read output
[15:16:52] PROBLEM - mem-check on ores-lb-02 is UNKNOWN: NRPE: Unable to read output
[15:17:04] PROBLEM - mem-check on ores.wmflabs.org is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:17:12] Oh woops
[15:17:29] PROBLEM - mem-check on ores-redis.01 is UNKNOWN: NRPE: Unable to read output
[15:17:30] PROBLEM - mem-check on ores-web-05 is UNKNOWN: NRPE: Unable to read output
[15:17:31] PROBLEM - mem-check on ores-worker-05 is UNKNOWN: NRPE: Unable to read output
[15:17:32] PROBLEM - mem-check on ores-lb-02 is UNKNOWN: NRPE: Unable to read output
[15:17:50] PROBLEM - mem-check on ores.wmflabs.org is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:18:07] PROBLEM - mem-check on ores-redis.01 is UNKNOWN: NRPE: Unable to read output
[15:18:08] PROBLEM - mem-check on ores-web-05 is UNKNOWN: NRPE: Unable to read output
[15:18:10] PROBLEM - mem-check on ores-worker-05 is UNKNOWN: NRPE: Unable to read output
[15:18:11] PROBLEM - mem-check on ores-lb-02 is UNKNOWN: NRPE: Unable to read output
[15:18:38] PROBLEM - mem-check on ores.wmflabs.org is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[15:18:46] PROBLEM - mem-check on ores-redis.01 is UNKNOWN: NRPE: Unable to read output
[15:18:47] PROBLEM - mem-check on ores-web-05 is UNKNOWN: NRPE: Unable to read output
[15:18:50] PROBLEM - mem-check on ores-worker-05 is UNKNOWN: NRPE: Unable to read output
[15:18:50] PROBLEM - mem-check on ores-lb-02 is UNKNOWN: NRPE: Unable to read output
[15:19:14] fixed it now
[15:23:01] PROBLEM - ssh on ores.wmflabs.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:28:08] ignore ssh on ores.wmflabs.org. it's now monitoring the site but ssh should not work for the site anyways :)
[15:53:07] Scoring-platform-team-Backlog, Wikilabels, editquality-modeling, artificial-intelligence: Complete Albanian Wikipedia editquality campaign - https://phabricator.wikimedia.org/T163010#3358441 (Halfak) Great! We'll get a model built! Thanks for your work. We'll ping you when we're ready to deploy...
[15:53:26] Scoring-platform-team, Wikilabels, editquality-modeling, artificial-intelligence: Complete Albanian Wikipedia editquality campaign - https://phabricator.wikimedia.org/T163010#3358442 (Halfak)
[15:53:38] Scoring-platform-team, editquality-modeling, artificial-intelligence: Train/test damaging & goodfaith models for Albanian Wikipedia - https://phabricator.wikimedia.org/T163009#3358443 (Halfak)
[15:57:01] halfak: o/
[15:57:34] o/ glorian_wd
[15:57:46] back in a bit. getting breakfast together.
[16:00:15] halfak: ok.
[16:00:15] I am now working on the PR. As you said yesterday I should write: external_sources_ratio = div(all_external_sources, sources). However, sources can be 0 because there could be items that have no sources at all. In this case, I am thinking of using try-except.
[16:00:15] But I want to ask if you could propose a better approach for dealing with this.
[16:00:50] I'd prefer to ask you in advance before I get a comment from you again in the PR xD
[16:15:36] BTW, I am talking about your feedback: "This processing should depend on [source, external_sources] so that work isn't duplicated."
[16:34:49] glorian. Look at all the other uses of "/" in features.
[16:35:23] they will do / max(other_feature, 1)
[16:36:27] halfak: alright
[16:36:31] I'll have a look
[16:37:16] halfak: BTW, I will try to re-submit the PR before I go to sleep today
[16:38:31] kk I should be able to review. I need a day off, but I'll be around for a few more hours.
[16:39:02] halfak: Okay :)
[17:23:04] halfak: PR submitted
[18:03:39] glorian_wd, I have serious concerns with the PR's performance.
[18:04:02] You're really opening a file and reading through it sequentially multiple times per revision scored!?
[18:04:08] I'm reading that right, right?
[18:04:35] Are you familiar with big O notation for computational complexity?
[18:04:42] halfak: hmm yes.
[18:05:09] I'll think of some better way for it
[18:05:17] What's the complexity of "_process_wikimedia_sources"?
[18:07:01] Currently, you're running O(M * N) where M is the number of sources and N is the number of exclusions.
[18:07:07] You could instead just be O(M)
[18:07:24] By putting the internal sources in a `set`
[18:07:36] You should expect (practically) constant-time lookups.
[18:07:52] Also you won't have to incur the (SUPER DUPER SLOW!!) IO of opening a file and reading it into memory.
[18:10:17] Hmm. Your concern is more about opening the file, right?
[18:10:37] not about when I unpack the item.claims
[18:11:44] Let me ask this. What's your assessment?
[18:12:08] hi all lol
[18:12:20] ha
[18:13:04] I just registered the bot for an irc cloak :)
[18:13:18] Also ores.wmflabs.org is now monitored by icinga2 too :)
[18:13:30] Saw the memory pings a few hours ago
[18:13:33] Testing?
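(Editor's sketch, not part of the log.) The two suggestions made between 16:35 and 18:07 fit together: keep the internal (Wikimedia) sources in a `set` built once, so each membership test is effectively constant time and the whole pass is O(M), and divide by `max(total, 1)` so an item with no sources yields 0 instead of raising. Every name below, including `INTERNAL_SOURCES` and the source identifiers, is hypothetical; this is not the code from the PR.

```python
# Hypothetical stand-in for the exclusion list that the PR read from a file.
# Built once at import time, so no file IO happens per revision scored.
INTERNAL_SOURCES = frozenset({"wiki-a", "wiki-b"})


def external_sources_ratio(source_ids, internal_sources=INTERNAL_SOURCES):
    """Return the share of sources that are not internal ones.

    Each `in` check against a set is (practically) constant time, so the
    loop is O(M) in the number of sources instead of O(M * N).
    """
    external = sum(1 for s in source_ids if s not in internal_sources)
    # max(..., 1) guards against items that have no sources at all.
    return external / max(len(source_ids), 1)


print(external_sources_ratio(["wiki-a", "ext-1", "ext-2", "wiki-b"]))
print(external_sources_ratio([]))
```

The `max(..., 1)` guard avoids a try-except around the division and matches the pattern described at 16:35 for other features that use "/".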
[18:14:08] halfak those were a mistake
[18:14:13] gotcha
[18:14:23] those won't work on your instances due to the command not being on the instance (it's in puppet but i had to manually copy the file and command)
[18:14:47] I was migrating things from icinga2 director to public view in the repo
[18:14:58] https://phabricator.wikimedia.org/diffusion/LICT/
[18:17:18] halfak: https://phabricator.wikimedia.org/T164671
[18:17:18] A class is missing here "başlanğıç"
[18:17:58] https://github.com/wiki-ai/wikiclass/blob/master/wikiclass/extractors/trwiki.py
[18:17:58] > ("baslağıç", re.compile(r"\bbaşlanğıç\b", re.I)), # start class
[18:17:58] It's here but not in results.
[18:20:04] Oh! Gotcha. I'll look into it. Thanks for pointing it out.
[18:20:33] paladox, didn't we have a patchset for enabling it in the base ORES role?
[18:20:50] halfak think so but i think that was rejected
[18:21:00] mavrikant, can you find me a page in that class?
[18:21:16] halfak we should just manually add it since they rejected your change.
[18:21:37] paladox, what? It was rejected?
[18:21:38] halfak: https://tr.wikipedia.org/w/index.php?title=Kategori:Ba%C5%9Flang%C4%B1%C3%A7-s%C4%B1n%C4%B1f_Vikiproje_sayfalar%C4%B1&action=edit&redlink=1
[18:21:39] all pages in here
[18:21:39] link?
[18:21:48] thanks mavrikant
[18:22:08] yep
[18:22:09] https://gerrit.wikimedia.org/r/#/c/358240/
[18:22:58] paladox, do we have a task for this?
[18:23:08] uep
[18:23:09] ep
[18:23:10] yep
[18:23:10] I want to ask akosiaris what he suggests in a place where we can have a good conversation
[18:23:11] https://phabricator.wikimedia.org/T167602
[18:23:46] i think they should just stick those checks in base for everyone to have access including labs.
[18:24:26] Scoring-platform-team, Patch-For-Review, User-Zppix: Create memory checks for instances - https://phabricator.wikimedia.org/T167602#3358540 (Halfak) @akosiaris, I saw that you -2'd https://gerrit.wikimedia.org/r/#/c/358240/. It seems we want to set up these checks. So. What's a good alternative fr...
[18:24:42] paladox, I replied there. We'll see.
[18:24:46] ok
[18:24:48] thanks
[18:25:22] halfak do you use a puppet master?
[18:25:33] same as everyone else
[18:25:38] No custom one
[18:26:07] oh
[18:43:18] mavrikant, looks like "Başlangıç" is the right string.
[18:43:55] But we were matching "Başlanğıç"
[18:44:07] So I just replaced "ğ" with "[gğ
[18:44:10] ']"
[18:44:12] Woops
[18:44:17] Does that make sense?
[18:44:38] Could it be that "Başlanğıç" never occurs even once?
[18:45:10] sorry my bad. "Başlangıç" is the true one.
[18:45:18] Oh! OK.
[18:47:25] OK re-extracting. This will take a while, but with any luck, I'll have it ready before I need to travel tomorrow.
[18:47:28] mavrikant, ^
[18:47:40] I'll become kind of unavailable next week as I do a management offsite thing.
[18:48:40] okay. thanks. we can use next week to examine results. no problem.
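(Editor's sketch, not part of the log.) The change described at 18:44, replacing "ğ" with the character class "[gğ]", might look roughly like the pattern below. It keeps the `\b...\b` and `re.I` shape of the quoted extractor line, but the variable name and test strings are assumptions, and this is not the actual wikiclass commit.

```python
import re

# Accept both spellings: the correct "başlangıç" and the previously
# matched "başlanğıç". START_CLASS_RE is a hypothetical name.
START_CLASS_RE = re.compile(r"\bbaşlan[gğ]ıç\b", re.I)

for text in ["Başlangıç-sınıf", "başlanğıç", "Taslak"]:
    print(text, bool(START_CLASS_RE.search(text)))
```

Since "Başlangıç" turned out to be the only spelling in use, a plain `g` would also work; the character class simply tolerates the older variant.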