[00:17:55] ORES, Scoring-platform-team: Review prometheus ORES rules for completeness - https://phabricator.wikimedia.org/T233448 (colewhite) Since it's not used in dashboards, what do we do with the model? I imagine it's useful, but I'm not sure how.
[01:52:32] halfak: that task should go to the backlog
[01:52:44] i have some partial progress on it that might be useful in the future, but i dropped it for now
[17:32:21] kevinbazira, if you want to get started on deployment stuff, you can run all of the commands up until it has you run "scap"
[17:32:42] You'll want to join the #wikimedia-operations channel
[18:06:58] Scoring-platform-team (Current): Onboarding Kevin Bazira -- Accounts and Access - https://phabricator.wikimedia.org/T234222 (Halfak)
[18:07:10] Scoring-platform-team (Current): Onboarding Kevin Bazira -- Accounts and Access - https://phabricator.wikimedia.org/T234222 (Halfak) Open→Resolved a: Halfak
[18:51:38] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak)
[18:53:57] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) @kevinbazira and I looked into this. We ran the following command to look up the installed versions o...
[18:54:19] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) p: Triage→High
[19:07:20] WTF, this weirdness is super weird.
[19:07:31] All of the machines have the same code checked out. All of them have the same dictionaries.
[19:07:40] But they have different counts of English words.
[19:07:41] Why?
[19:14:44] that sounds maddening!
[20:35:47] how are you measuring the count of English words, halfak ¿
[20:36:14] We're using enchant dictionaries. In this case, we're using hunspell-en.
[20:37:15] what is giving the count?
[20:48:08] We've got some code that generates a list of "word-like tokens"
[20:48:24] Then we filter that using pyenchant's dict.check() method.
[20:48:40] Anything that passes the check is included in the count of English words.
[20:49:18] I was just able to demonstrate that we get 3 different counts depending on which server we run our code on in EQIAD.
[20:49:24] What The Fark?
[20:49:43] pyenchant is also the same version?
[20:49:57] where's that piece of code?
[20:49:58] Right
[20:50:26] I'm trying to think whether, depending on the load order, the code could give different results
[20:50:41] but it seems hard to imagine such a heisenbug
[20:52:19] Right. The code is consistent too. Each host gives a consistent count!
[20:53:32] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) I confirmed that all nodes have the same git hash. I also checked to see if any other *spell-en* di...
[20:56:16] may I see the code?
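The counting approach described above (generate "word-like tokens", filter them with pyenchant's dict.check()) can be illustrated with a minimal sketch. This is not revscoring's actual implementation (the real code is linked just below in the log); the token regex and function name here are purely illustrative, and the bare "en" dictionary tag is the same one whose resolution turns out to matter later.

```python
# Minimal sketch of the word-counting check described above, using pyenchant's
# standard API. The regex and function name are illustrative, not revscoring's.
import re
import enchant

WORD_RE = re.compile(r"\w+")  # crude stand-in for "word-like tokens"

def count_english_words(text, tag="en"):
    # "en" is resolved by enchant's broker to whichever en_* dictionary it finds
    dictionary = enchant.Dict(tag)
    tokens = WORD_RE.findall(text)
    return sum(1 for token in tokens if dictionary.check(token))

if __name__ == "__main__":
    print(count_english_words("the quick brown fox eta de"))
```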
[20:56:24] I guess it's at some github repo
[20:56:31] https://github.com/wikimedia/revscoring/blob/master/revscoring/languages/english.py
[20:57:09] Most of the business logic is here: https://github.com/wikimedia/revscoring/blob/master/revscoring/languages/features/dictionary/datasources.py
[20:57:45] so len(words_to_watch) differs?
[21:00:03] len(words_to_watch) is consistent between hosts.
[21:00:14] But it is a different thing than the dictionary stuff.
[21:03:41] I've confirmed that I can replicate the problem with the "enchant" command line utility -- bypassing python
[21:03:53] Some words are "English Words" on one machine and not the other.
[21:04:26] that's an improvement
[21:04:35] maybe there are more dictionaries installed on some machines?
[21:04:45] enchant can use more formats than hunspell
[21:05:51] Yeah. I checked that. Seems that a diff of "apt-cache policy hunspell-en*" and "apt-cache policy *spell-en*" shows no differences.
[21:07:48] what about *spell-dictionary* ?
[21:08:18] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) Alright! I figured out how to replicate the issue between machines without using any of our code. In...
[21:08:20] Oh I'll try that.
[21:08:25] maybe myspell-dictionary, aspell-dictionary, ispell-dictionary or hunspell-dictionary
[21:10:21] Ha. No difference for *spell-*
[21:10:28] :/
[21:10:47] fwiw, locally it tells me "de" is misspelled but "eta" is not
[21:11:02] same as ores1009
[21:11:03] Same.
[21:11:11] On my local laptop
[21:11:30] enchant also supports configuring which backend to use
[21:12:02] is /usr/share/enchant/enchant.ordering the same between them, too?
[21:12:06] I honestly don't know what to check next.
[21:12:08] Oh I'll look
[21:12:20] I would expect them to be
[21:12:29] it is part of the package
[21:12:48] and there's probably not a local ~/.enchant
[21:13:31] No difference.
[21:13:57] perhaps if there was a .d config directory somewhere
[21:14:14] with files loaded in arbitrary order depending on how they are listed by the fs
[21:15:28] some years ago I would have wondered whether one of them could be 32 bit and the other 64
[21:15:35] not a problem we would have in 2019
[21:16:06] Right. All of these machines are managed by puppet too.
[21:16:17] So differing configurations are unlikely.
[21:16:59] a package named ienglish* ?
[21:17:22] these are used by ispell
[21:17:24] i + langname
[21:17:58] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) I made extra-double sure there was nothing different about what spelling packages were installed. `
[21:18:33] I wonder if I should force a puppet run.
[21:18:50] or a ores1001 reimage :P
[21:19:34] Heh. ores1001-1004 are the same.
[21:19:42] Actually, also 1005.
[21:19:54] 6 & 7 are the same and 8 & 9 are the same.
[21:19:56] lol
[21:20:13] All of them were built and configured at the same time.
[21:20:34] are they in different datacenters?
[21:20:44] this reminds me of the 500-mile email
[21:20:46] https://www.ibiblio.org/harris/500milemail.html
[21:21:11] Same datacenters
[21:22:07] did you see the ienglish* I mentioned above?
[21:24:50] are the locales equal, too?
[21:25:05] what if you run them with LANG=C enchant -d -d en ?
[21:25:38] Ha. Nice read.
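Since the problem reproduces with the enchant command line utility, one way to compare hosts without touching revscoring is to ask pyenchant's broker directly which providers and en_* dictionaries it can see, and which one a bare "en" request actually resolves to. This is a diagnostic sketch using standard pyenchant API, not a command that was run in the task:

```python
# Dump what the enchant broker sees on this host, for diffing between hosts.
import enchant

broker = enchant.Broker()

# Which backends (hunspell, aspell, ...) are available.
print("providers:", [p.name for p in broker.describe()])

# Which English dictionaries are registered.
print("en dicts: ", sorted(tag for tag, provider in broker.list_dicts()
                           if tag == "en" or tag.startswith("en_")))

# Which provider/dictionary file a bare "en" request resolves to.
d = broker.request_dict("en")
print("'en' resolves via:", d.provider.name, d.provider.file)
```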
[21:25:46] Why the two "-d"?
[21:27:02] er, the first should have been -l
[21:27:13] just wanted to prepend LANG=C
[21:27:17] Got it. With that, there is no change.
[21:28:59] good :)
[21:29:09] different locales?
[21:29:39] print $LANG on each server
[21:30:19] beware of not expanding it locally before it gets sent to the remote host
[21:30:55] there are other LC_* env variables that could matter, but most probably only LANG will be set
[21:35:55] Same on both hosts
[21:36:01] en_US.UTF-8
[21:36:33] env | grep LC_ ?
[21:37:32] running locale on each host?
[21:39:57] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) ` $ ssh ores1001.eqiad.wmnet "/srv/deployment/ores/deploy/venv/bin/python -c 'import enchant; print(en...
[21:40:26] No output
[21:40:30] On either.
[21:41:08] the grep may not provide data if no LC_ variable is defined
[21:41:14] but locale should list a bunch of variables
[21:42:03] Ah. Right. Identical between the hosts.
[21:44:29] well, it *is* locale related
[21:45:15] any difference in the files opened?
[21:45:17] strace -e open,openat enchant -l -d en
[21:48:16] I guess strace writes to stderr?
[21:48:19] yes
[21:48:30] 2>&1 or -o
[21:48:56] Oooo. Now we're getting somewhere!
[21:49:46] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) ` $ ssh ores1001.eqiad.wmnet "cat 'de' | strace -e open,openat enchant -l -d en" &> ores1001.strace $
[21:49:50] Platonides, ^ check that out
[21:50:16] one uses aspell, the other hspell
[21:50:24] ?
[21:50:42] no
[21:50:54] one uses Australian, the other US English
[21:51:23] Indeed!
[21:51:42] and LANG=C resets that
[21:52:43] I just tried and it didn't make a difference.
[21:52:50] ?
[21:52:51] E.g., ssh ores1001.eqiad.wmnet "cat 'de' | LANG=C strace -e open,openat enchant -l -d en" &> ores1001.strace
[21:53:12] but the LANG=C made the difference go away!
[21:53:17] Nope.
[21:53:26] Diff looks the same.
[21:53:33] "and LANG=C resets that" -- oops
[21:53:34] Sorry to be unclear
[21:53:43] sorry
[21:53:47] i took "there is no change."
[21:53:53] as meaning both hosts having the same behavior
[21:54:03] not "still different"
[21:54:16] Right. Regretfully, we haven't forced it to get into line yet.
[21:54:34] dpkg-reconfigure --force locales ?
[21:55:08] Hmm. I think maybe we should leave things here before I get too aggressive with testing things out on our production hosts.
[21:55:14] I'm very hesitant to make changes.
[21:55:26] you can view the content, then cancel rather than accept
[21:55:41] I'm not sure in which file it is stored
[21:56:09] Aha. I don't have sudo anyway
[21:56:14] Not on these hosts.
[21:56:25] well, you accept the first option, then it asks "Default locale for the system environment:"
[21:56:52] "dpkg-reconfigure: command not found"
[21:57:15] sudo has it in the path though
[21:57:25] it's in /usr/sbin
[21:57:56] "/usr/sbin/dpkg-reconfigure must be run as root"
[21:58:36] this should be able to show it as a normal user
[21:58:44] even if it prints an error about some settings:
[21:58:45] debconf-show locales locales/default_environment_locale
[21:59:03] hmm, or just debconf-show locales
[21:59:18] locales/default_environment_locale should be the interesting piece
[21:59:33] Same between the machines.
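The strace diff shows one host opening the Australian dictionary and the other the US one for the same bare "en" request. A sketch for narrowing the word-level disagreement (the "de"/"eta" examples above) is to request the regional dictionaries explicitly and compare. It assumes en_US and en_AU hunspell dictionaries are installed, which may not hold on every host, hence the dict_exists() guard:

```python
# Compare how the bare "en" tag and explicit regional dictionaries judge the
# words that behaved differently across hosts. Diagnostic sketch only.
import enchant

WORDS = ["de", "eta"]  # words reported as differing between hosts

for tag in ("en", "en_US", "en_AU"):
    if not enchant.dict_exists(tag):
        print(f"{tag}: not available on this host")
        continue
    d = enchant.Dict(tag)
    results = {word: d.check(word) for word in WORDS}
    print(f"{tag} (provider {d.provider.name}): {results}")
```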
[21:59:54] let's forget about locales
[22:00:23] both have en_US.UTF-8
[22:00:29] yet one uses aussie
[22:00:45] That's right.
[22:00:52] I wonder if it is due to some install-order issue.
[22:01:00] probably
[22:01:06] but it should be *somewhere*
[22:01:53] I have to jump into my final meetings of the day so I'll need to drop off.
[22:01:59] Thanks so much Platonides! :D
[22:02:10] you're welcome
[22:02:32] that's an interesting puzzle
[22:04:56] I see no difference between having just hunspell-en-us and only hunspell-en-au
[22:05:00] these are buster ?
[22:59:37] ORES, Scoring-platform-team, articlequality-modeling, artificial-intelligence: ORES articlequality for euwiki works differently in production - https://phabricator.wikimedia.org/T239942 (Halfak) @akosiaris, could you look at my work here and see what you think? The above commands give you a nice...
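For the handoff, the per-host check being done by hand above (ssh into each ores host and ask its venv python what enchant resolves) could be scripted. This is a sketch only: the host names and venv path are taken from the task comments, ssh access to *.eqiad.wmnet is assumed, and the remote one-liner is a stand-in for the truncated command in the task, not a reconstruction of it:

```python
# Ask each ORES host which provider/dictionary file its enchant resolves for
# "en", so the outputs can be diffed in one place. Assumes ssh access.
import subprocess

HOSTS = [f"ores100{i}.eqiad.wmnet" for i in range(1, 10)]
PYTHON = "/srv/deployment/ores/deploy/venv/bin/python"
SNIPPET = "import enchant; d = enchant.Dict('en'); print(d.provider.name, d.provider.file)"

for host in HOSTS:
    result = subprocess.run(
        ["ssh", host, f'{PYTHON} -c "{SNIPPET}"'],
        capture_output=True, text=True,
    )
    print(host, result.stdout.strip() or result.stderr.strip())
```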