[00:03:05] Scoring-platform-team (Current), DBA, Operations, cloud-services-team, Patch-For-Review: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3728377 (bd808) Open>Resolved >>! In T168584#3725668, @MoritzMuehlenhoff wrote: > Let's just keep 1003 running w/o r...
[08:07:09] Scoring-platform-team, Wikilabels, Easy, Google-Code-in-2017: pytest for flask application of wikilabels - https://phabricator.wikimedia.org/T179015#3728708 (Florian) Imported as https://codein.withgoogle.com/dashboard/tasks/5072119991894016/.
[13:54:33] halfak: buenas días
[13:54:49] halfak: Did you happen to see my question in https://phabricator.wikimedia.org/T178441#3728184 ?
[13:55:28] IMO it’s best to decouple the two major upgrades, to revscoring 2 and to celery 4
[14:10:24] whooohahaha I see your all away statuses now.
[14:31:31] Scoring-platform-team (Current), ORES, Patch-For-Review: Upgrade celery to 4.1.0 for ORES - https://phabricator.wikimedia.org/T178441#3729713 (awight) https://github.com/wiki-ai/ores/commit/907d59ba55b5
[14:36:29] Amir1: thanks! FYI, no official OK from halfak yet but we can always re-revert :)
[14:36:59] of course
[14:37:07] Let's keep it DL though :D
[14:37:22] lol it will be his surprise gift when he’s back from all the travel
[14:45:37] (PS1) Awight: Rebuild wheels, downgrade to Celery 3 [research/ores/wheels] - https://gerrit.wikimedia.org/r/388065
[14:48:36] (PS2) Awight: Rebuild wheels, downgrade to Celery 3 [research/ores/wheels] - https://gerrit.wikimedia.org/r/388065 (https://phabricator.wikimedia.org/T178441)
[14:49:21] (PS1) Awight: Downgrade to Celery 3, bump other requirements [services/ores/deploy] - https://gerrit.wikimedia.org/r/388066
[14:49:56] (PS2) Awight: Downgrade to Celery 3, bump other requirements [services/ores/deploy] - https://gerrit.wikimedia.org/r/388066 (https://phabricator.wikimedia.org/T178441)
[14:58:31] (PS3) Awight: Downgrade to Celery 3, bump other requirements [services/ores/deploy] - https://gerrit.wikimedia.org/r/388066 (https://phabricator.wikimedia.org/T178441)
[14:59:36] Amir1: I think those patches are ready to go, it looks good locally at least.
[14:59:50] https://gerrit.wikimedia.org/r/388065 https://gerrit.wikimedia.org/r/388066
[15:00:00] (CR) Ladsgroup: [C: 2] Rebuild wheels, downgrade to Celery 3 [research/ores/wheels] - https://gerrit.wikimedia.org/r/388065 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[15:00:09] whahappened. I blinked.
[15:01:27] Amir1: I just noticed something nasty—an ores wheel is included
[15:01:39] and editquality
[15:05:37] (PS1) Awight: Remove wheels which are included as source [research/ores/wheels] - https://gerrit.wikimedia.org/r/388070 (https://phabricator.wikimedia.org/T178441)
[15:06:00] (CR) Ladsgroup: [C: 2] Remove wheels which are included as source [research/ores/wheels] - https://gerrit.wikimedia.org/r/388070 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[15:06:08] (PS1) Awight: Remove local source packages from wheels [services/ores/deploy] - https://gerrit.wikimedia.org/r/388071 (https://phabricator.wikimedia.org/T178441)
[15:06:18] Thanks, hopefully that’s all for a while!
[15:10:20] o/
[15:10:45] halfak: hehe you almost missed it. I’m reverting Celery 4 so we can get a clean revscoring 2 deploy...
[15:10:50] Just got moving. I agree with splitting the celery 4 and revscoring 2 stuff.
[15:10:53] cool.
[15:11:01] Wanna kick this short patch chain? https://gerrit.wikimedia.org/r/#/c/388071/
[15:11:03] Will be a great test
[15:11:26] (CR) Halfak: [V: 2 C: 2] Remove local source packages from wheels [services/ores/deploy] - https://gerrit.wikimedia.org/r/388071 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[15:11:44] btw, I think there’s something missing from the wheel rebuild instructions, cos this time I ended up with wheels for editquality, ores, and wikiclass.
[15:11:58] Not sure how we could install just the dependencies though...
[15:12:04] makefile should have removed those.
[15:12:25] oooh, Makefile
[15:12:29] haha I did it manually
[15:12:55] https://github.com/wiki-ai/ores-wmflabs-deploy/blob/master/Makefile#L1
[15:12:57] :)
[15:13:17] I see now, nice. I’ll go ahead and update https://wikitech.wikimedia.org/wiki/ORES/Deployment the next time I do this.
[15:13:34] halfak: One more predecessor, https://gerrit.wikimedia.org/r/#/c/388066
[15:15:34] Arg. Should have just switched to an old commit in ORES rather than reverting :P
[15:15:39] This works though.
[15:16:03] huh, I wanted to keep the work since then
[15:16:14] (CR) Halfak: [V: 2 C: 2] Downgrade to Celery 3, bump other requirements [services/ores/deploy] - https://gerrit.wikimedia.org/r/388066 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[15:16:19] I was gonna make a CELERY_4 branch and continue work there...
[15:16:23] Gotcha. Makes sense
[15:16:45] I've got to head out to grab some breakfast (sitting on my hotel bed)
[15:16:52] Will be gone for maybe 20 mins.
[15:17:48] o/
[15:18:05] Does anyone else have anything better to do than marching up and down the square???
[15:32:56] Amir1: The revscoring 2 + celery 3 deployment looks solid. Feel like hammering on it? https://ores-beta.wmflabs.org/
[15:33:10] سعقث
[15:33:13] *sure
[15:33:18] What should I +2
[15:33:21] ?
[15:33:36] The code is all merged, I’m just smoke-testing beta.
[15:33:46] Planning to actually deploy it in 1.5hr
[15:40:35] Awesome
[15:40:50] Today is Wikidata day, so I'm around for fast +2s but can't work much
[15:45:12] aha cool, thanks for the help so far!
[15:46:07] I got into the database and confirmed that FetchScoreJob did its thing, and RecentChanges has thresholds etc.
[15:51:58] Scoring-platform-team, Cloud-VPS: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3729905 (awight)
[15:53:37] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3729926 (chasemp) This is the process to request it, we have a few options but they all involve building a new instance I believe: https://phabricator.wikimedia.or...
[15:55:26] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3729933 (awight) @Halfak @Ladsgroup @Sumit @Catrope Please clean up your stuff in the backup directory on that server, ores-misc-01:/srv/ores-compute-01-20170711/
[15:56:50] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3729949 (awight) @chasemp We're probably fine with a rebuild, but I was hoping that using the /srv mount would have made it easier for us to stretch our elbows out...
[15:58:53] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3729950 (awight) @Catrope nvm, your dir is just a few hundred MBs. Anyone know if user "agx" is in Phabricator?
[16:02:12] oh nice, I even get stalker notifications when “buddies” /back
[16:02:38] Important WMF meeting starting right now BTW
[16:03:52] halfak: ^
[16:04:09] I'm there :)
[16:04:48] Kk I didn’t see you in the bluejeans roster
[16:09:33] I'm on youtube and IRC :)
[16:09:40] Much easier on the battery life
[16:22:24] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3729905 (bd808) > I was hoping that using the /srv mount would have made it easier for us to stretch our elbows out a bit? If you have already mounted the full qu...
[16:27:10] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3730012 (bd808) >>! In T179592#3729950, @awight wrote: > Anyone know if user "agx" is in Phabricator? https://wikitech.wikimedia.org/wiki/User_talk:Agx
[16:45:38] Scoring-platform-team (Current), ORES, Operations, Patch-For-Review: Review and fix file handle management in worker and celery processes - https://phabricator.wikimedia.org/T174402#3730085 (Ladsgroup)
[16:48:46] Scoring-platform-team (Current), editquality-modeling, artificial-intelligence: Flagged revs approve model to fiwiki - https://phabricator.wikimedia.org/T166235#3730092 (Ladsgroup)
[16:52:23] Scoring-platform-team (Current), Wikimedia-Site-requests, User-Ladsgroup: Enable draftquality model in ORES extension for enwiki - https://phabricator.wikimedia.org/T179596#3730109 (Ladsgroup)
[16:52:35] Scoring-platform-team (Current), MediaWiki-extensions-ORES, draftquality-modeling, MW-1.31-release-notes (WMF-deploy-2017-10-24 (1.31.0-wmf.5)), and 3 others: Store draftquality data in ores extension - https://phabricator.wikimedia.org/T176183#3730123 (Ladsgroup)
[16:52:37] Scoring-platform-team (Current), Wikimedia-Site-requests, User-Ladsgroup: Enable draftquality model in ORES extension for enwiki - https://phabricator.wikimedia.org/T179596#3730122 (Ladsgroup)
[17:20:01] Scoring-platform-team, Wikimania-Hackathon-2017, Documentation, Easy, Google-Code-in-2017: [Wikimania doc sprint] docs on how to install ORES - https://phabricator.wikimedia.org/T170506#3730211 (Ladsgroup)
[17:20:53] Scoring-platform-team, Wikilabels, Easy, Google-Code-in-2017: Error messages should not contain relative paths or error codes - https://phabricator.wikimedia.org/T175726#3730212 (Ladsgroup)
[17:21:16] Scoring-platform-team: Could use some more disk space on ores-misc-01.ores-staging.eqiad.wmflabs:/srv - https://phabricator.wikimedia.org/T179592#3730218 (awight) Thanks for all the help! Our team can drop about 15GB of cruft, which will buy us time.
[17:21:33] Scoring-platform-team, Wikilabels, Easy, Google-Code-in-2017: qunit tests for wikilabels - https://phabricator.wikimedia.org/T171083#3730219 (Ladsgroup)
[17:22:08] Scoring-platform-team, MediaWiki-extensions-ORES, Easy, Google-Code-in-2017: Wikicode is not interpreted in system message - https://phabricator.wikimedia.org/T142406#3730224 (Ladsgroup)
[17:44:21] 5xx rate is spiking, that looks bad.
[17:46:18] halfak: Amir1: Can you give me your opinion of the service graphs?
[17:46:48] I don’t understand why there would be a codfw spike, and 5xx’s blowing up although I’m not sure how much of that is normal during a deployment.
[17:47:08] Memory usage dropped in half which is surprising.
[17:50:44] do you all do anything special with how python dependencies are fetched to be able to be deployed in prod? I'm looking into scap based deployment of our mjolnir repository which has a few python dependencies and wonder if i need to do anything special. Grabbing those deps from the internet has been a no-no for other languages
[17:51:36] ebernhardson: We do something special and minorly horrific
[17:51:42] but at the end of the day, it’s just wheels.
[17:51:56] * awight digs for a link
[17:52:20] https://github.com/wiki-ai/ores-wmflabs-deploy/blob/master/Makefile
[17:52:20] :) thanks
[17:52:55] hmm, so basically the wheel just wraps up all those deps into a single file?
[17:53:15] https://phabricator.wikimedia.org/source/ores-deploy-wheels/browse/master/
[17:53:48] many files :) ok.
[17:53:52] https://phabricator.wikimedia.org/source/ores-deploy/browse/master/Makefile
[17:53:54] exactly
[17:54:12] I’ve only run into a few small glitches.
[17:54:20] The most annoying part is just generating them consistently.
[17:54:42] Hence, the makefile that I only learned about today :D
[17:54:51] T
[17:55:28] There seem to be tricks about making the wheels compatible with your production platform, so I’d recommend building them on a labs machine with the same OS
[17:55:45] thankfully we have lots less dependencies, kinda. I just checked and my virtualenv directory is 283MB :S maybe we have more than i thought
[17:56:45] ok thanks for the pointers, i'll be digging into this today and tomorrow and those links should help.
[17:56:57] right on, let’s chat if you have better ideas about how to do this...
[17:57:16] halfak: I saw a bunch of 5xx for cswiki scoring, but it looks fine onwiki: https://cs.wikipedia.org/wiki/Speciáln%C3%AD:Posledn%C3%AD_změny?hidebots=1&hidecategorization=1&hideWikibase=1&limit=50&days=7&damaging__likelygood_color=c4&urlversion=2
[17:57:57] and when I visited the links (unfortunately, truncated, still battling logstash) they were fine.
[17:58:09] Our error graphs are really annoying.
[17:58:25] That is not actually 0.10 5xx errors per minute.
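Editor's sketch of the wheel-pruning step discussed above: after a rebuild, wheels for packages that are deployed as source (the log names editquality, ores, and wikiclass) should not stay in the wheels directory. This is not the actual Makefile logic, which the log only links to. The function names and the example version numbers are hypothetical; the only thing relied on is the standard wheel file naming, where the distribution name is the part before the first hyphen.

```python
from pathlib import PurePosixPath

# Packages the log says are shipped as source, so their wheels are unwanted.
LOCAL_SOURCE_PACKAGES = frozenset({"editquality", "ores", "wikiclass"})


def wheel_distribution(filename):
    """Return the normalized distribution name encoded in a wheel file name."""
    name = PurePosixPath(filename).name
    if not name.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    # Wheel names look like {distribution}-{version}-{python}-{abi}-{platform}.whl
    # and the distribution part never contains a hyphen.
    return name.split("-", 1)[0].lower().replace("_", "-")


def wheels_to_remove(filenames, local_packages=LOCAL_SOURCE_PACKAGES):
    """List the wheels that belong to locally sourced packages."""
    return [f for f in filenames if wheel_distribution(f) in local_packages]


if __name__ == "__main__":
    # Hypothetical directory listing; version numbers are made up.
    built = [
        "celery-3.1.25-py2.py3-none-any.whl",
        "ores-1.0.0-py3-none-any.whl",
        "editquality-0.4.0-py3-none-any.whl",
    ]
    for wheel in wheels_to_remove(built):
        print("would remove", wheel)
```

Listing the candidates before deleting anything keeps the manual and scripted paths comparable, which matters here since one rebuild in the log was done by hand.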
[18:05:26] Scoring-platform-team (Current), ORES: Rewind revscoring 1/2 compatibility hacks - https://phabricator.wikimedia.org/T179602#3730423 (awight)
[18:07:28] Scoring-platform-team (Current), ORES, Patch-For-Review: Deploy ORES (revscoring 2.0) - https://phabricator.wikimedia.org/T175180#3585227 (awight)
[18:10:28] Scoring-platform-team (Current), ORES, Patch-For-Review: Deploy ORES (revscoring 2.0) - https://phabricator.wikimedia.org/T175180#3730448 (awight)
[18:16:18] Scoring-platform-team (Current), ORES: Consistent TimeoutErrors when using Celery 4 - https://phabricator.wikimedia.org/T179524#3730467 (awight)
[18:16:45] Scoring-platform-team (Current), ORES: Consistent TimeoutErrors when using Celery 4 - https://phabricator.wikimedia.org/T179524#3727468 (awight)
[18:16:47] Scoring-platform-team (Current), ORES, Patch-For-Review: Upgrade celery to 4.1.0 for ORES - https://phabricator.wikimedia.org/T178441#3730468 (awight)
[18:23:08] halfak: Amir1: The revscoring 2 deployment looks like it’s gonna stick! Graphs look good, unless I’m missing something. 5xx rate seems to have a 3-hour summing window, but stopped climbing.
[18:23:24] I need to relocate, see you soon for the second shift.
[18:23:26] \o/ great!
[18:23:29] o/
[18:23:29] yesss
[18:23:32] \o
[18:23:35] took me long enough.
[18:23:56] I can focus on Celery 4 now, or do something else?
[18:54:35] yaaaay
[18:56:02] I vote Celery 4
[18:56:13] awight, ^
[18:56:23] I can understand if you want a JADE break though
[18:56:25] hehe
[18:56:34] I want to see what you're worried about re. schema versions.
[18:56:41] I have one foot in the Celery patch already, might as well continue.
[18:58:10] The schema version thing is simply, if we change from recording JADE API v1 responses to v3 responses, the fields are different and stuff. Tools that digest the log would have to understand all previous versions unless we produce a new normalized log as we were discussing. I vote for the latter, it’s simple enough and seems to be the recommended practice in Kafkaland.
[19:16:41] +1 then. I think we should side with making absolutely sure we don't lose data.
[19:17:05] E.g. any destructive changes will require saving a history of old log stuff in the original format.
[19:17:13] awight fireworks and bonfire night on sunday heh :).
[19:17:14] +1
[19:17:54] paladox: Anonymous Night!
[19:18:01] lol
[19:18:37] though i think they may celebrate it another night as sunday is remembrance sunday.
[19:19:21] did you change something in ORES a few hours ago?
[19:19:28] Platonides: I did.
[19:19:31] uhoh
[19:19:35] Do you see anything melting?
[19:19:52] We made a deployment UTC 17-18:00
[19:20:04] awight: "message": "Models ('reverted',) not available for eswiki"
[19:20:15] Platonides: ah, whew! That’s actually a healthy error.
[19:20:27] Are you seeing it in a 3rd-party tool, or from MediaWiki?
[19:20:49] eswiki now has the more advanced “damaging” and “goodfaith” models. Using “damaging” should be a drop-in replacement for reverted.
[19:21:01] ores is returning that error to a bot
[19:21:07] perfect.
[19:21:12] which then stopped working
[19:21:15] hehe
[19:21:36] so it's just s/reverted/damaging/ ?
[19:21:38] we got a bit aggressive with deprecating the reverted models. An email went out on wikitech-l, but it’s been a while.
[19:21:39] yes
[19:21:47] right
[19:22:18] thanks
[19:22:27] fwiw, https://ores.wikimedia.org/v3/scores/eswiki/
[19:22:38] that URL pattern works for any language.
[19:22:57] Platonides: Thanks for noticing, and feel free to forward the bot operator to us if needed.
[19:23:07] Sorry for the confusion Platonides! Thanks for swinging by to check on it. :)
[19:23:24] :)
[19:23:41] is it normal to see timeouterrors ?
[19:24:07] They happen occasionally, I’m worried if the volume increased, though.
[19:24:21] What do you see? Have a link?
[19:29:59] wiki-ai/ores#807 (CELERY_4 - b3c7961 : Adam Roses Wight): The build passed. https://travis-ci.org/wiki-ai/ores/builds/296436357
[19:36:25] Scoring-platform-team (Current), ORES, Patch-For-Review: Upgrade celery to 4.1.0 for ORES - https://phabricator.wikimedia.org/T178441#3730641 (awight) I've created `CELERY_4` branches for ores, wheels, and ores-prod-deploy to continue the integration work.
[19:39:37] halfak: Is this a stale project? https://gerrit.wikimedia.org/r/#/admin/projects/research/ores/deploy,branches
[19:40:01] ran across that while searching for mediawiki/services/ores/deploy
[19:40:14] Looks like it, yeah.
[19:40:24] awight: just checking a few urls at random https://ores.wikimedia.org/scores/eswiki/?revids=103093800&models=damaging
[19:40:56] Scoring-platform-team, Gerrit: Remove deprecated research/ores/deploy repo - https://phabricator.wikimedia.org/T179610#3730659 (awight)
[19:41:49] Platonides: That’s real bad, thanks for the report.
[19:42:00] halfak: ^ shucks
[19:42:16] What's wrong?
[19:42:54] Timeouts from ORES...
[19:43:07] I didn't get a timeout
[19:43:31] https://ores.wikimedia.org/v3/scores/eswiki/103093800?models=damaging
[19:43:36] Really? I am.
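Editor's sketch for bot operators hit by the "Models ('reverted',) not available" error above: the fix described in the channel is to request "damaging" instead of "reverted". The snippet below only builds the v3 URL in the shape quoted in the log (`/v3/scores/<wiki>/<revid>?models=<model>`) and makes no network request. The function name and the replacement table are hypothetical, and the substitution assumes the target wiki actually has a damaging model, as the log says eswiki does.

```python
from urllib.parse import urlencode

ORES_HOST = "https://ores.wikimedia.org"

# Per the conversation above: "damaging" is meant as a drop-in for "reverted".
DEPRECATED_MODELS = {"reverted": "damaging"}


def score_url(wiki, rev_id, model):
    """Build a v3 scoring URL, swapping a deprecated model name if needed."""
    model = DEPRECATED_MODELS.get(model, model)
    query = urlencode({"models": model})
    return "%s/v3/scores/%s/%d?%s" % (ORES_HOST, wiki, rev_id, query)


if __name__ == "__main__":
    # A bot still asking for "reverted" ends up requesting "damaging".
    print(score_url("eswiki", 103093800, "reverted"))
```

Keeping the rename in one table means a bot can keep its own model names internally and translate only at the request boundary.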
[19:43:47] https://es.wikipedia.org/wiki/?diff=103093800&models=damaging
[19:43:53] ENORMOUS page
[19:43:59] timing out for a good reason :)
[19:44:03] * halfak runs away
[19:44:07] time for lunch
[19:44:18] o/
[19:44:24] Only from eswiki though
[19:44:39] haha yeah it’s only that diff
[19:47:59] (PS1) Awight: Use branched ores to continue Celery 4 work [services/ores/deploy] (CELERY_4) - https://gerrit.wikimedia.org/r/388160 (https://phabricator.wikimedia.org/T178441)
[19:49:40] Scoring-platform-team, Bad-Words-Detection-System, revscoring, artificial-intelligence: Add language support for Icelandic - https://phabricator.wikimedia.org/T178524#3730691 (Snaevar)
[19:49:51] Scoring-platform-team, Bad-Words-Detection-System, revscoring, artificial-intelligence: Add language support for Icelandic - https://phabricator.wikimedia.org/T178524#3694816 (Snaevar) I have finished reviewing the BWDS list.
[19:58:53] Adam how are you doing? awight
[19:59:10] Zppix: o/ swell, and you?
[19:59:16] Same
[19:59:31] Anything i can do for you awight?
[19:59:56] Thanks for the offer. Did anything jump off of the maintenance/cleanup column?
[20:00:35] Nope but as im stuck on mobile til around nov 8th im kinda stuck (pc repairs)
[20:10:21] Zppix: I recently got the stack to run for me locally, so can help you if you want to set up the ORES service or anything.
[20:11:16] awight: see my last msg
[20:11:37] ah mobile phone, I was thinking laptop
[20:11:54] I remember you had that win32 horror story...
[20:12:11] hopefully they recover your docs and put a wooden stake through the OS
[20:12:24] “but I’m not dead yet!"
[20:12:55] Ah no they are just working on a hardware issue (i had sent it to my cpu manufacturer for repairs)
[20:13:24] Plus the os issue was my own doing so i doubt anyone would fix it under warranty
[20:15:22] Maybe something fortuitous will go wrong :D
[20:16:26] awight: i hope not i hate to reinstall everything again and lose my ssh keys
[20:16:48] * awight shudders thinking that your laptop went away without a recent backup in your possession
[20:17:52] awight: i was doing a backup when my cpu decided it was time to retire
[20:18:40] NOOOooooo.
[20:19:01] awight: its not getting pension either
[20:19:04] I’ve had that one. Knowing that the hard drive has one more read left in it, and it turns out to have half a read.
[20:19:12] hehe
[20:19:26] Back to work, then!
[20:19:34] "Work"
[20:29:21] Scoring-platform-team (Current), Wikimedia-Site-requests, Patch-For-Review, User-Ladsgroup: Enable draftquality model in ORES extension for enwiki - https://phabricator.wikimedia.org/T179596#3730814 (kaldari) Thanks Amir! I'll schedule this for a SWAT deployment next week if that's cool with you.
[20:29:40] Scoring-platform-team (Current), Wikimedia-Site-requests, Community-Tech-Sprint, Patch-For-Review, User-Ladsgroup: Enable draftquality model in ORES extension for enwiki - https://phabricator.wikimedia.org/T179596#3730815 (kaldari)
[20:30:20] halfak: have anything i could do for you, emails, correspondence, etc?
[20:31:30] halfak: I’ve been digging around onwiki and can’t find any of the ORES diagrams for some reason.
[20:31:43] Specifically, anything you’ve drawn about how we do Celery subtasks.
[20:32:54] I think I remember you were saying, part of our snowflakiness is that the parent doesn’t know exactly what child jobs need to be run, that takes some redis querying.
[20:33:58] halfak: Also, if you feel like kicking a trivial patch, https://gerrit.wikimedia.org/r/388160
[20:34:42] (CR) Zppix: [C: 1] Use branched ores to continue Celery 4 work [services/ores/deploy] (CELERY_4) - https://gerrit.wikimedia.org/r/388160 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[20:34:56] FYI, I made a CELERY_4 branch on wheels and ores-prod-deploy before the revert, but I used the head of ORES and un-reverted on the new branch.
[20:35:04] Zppix: ty
[20:35:08] Np
[20:35:29] (CR) Halfak: [V: 2 C: 2] Use branched ores to continue Celery 4 work [services/ores/deploy] (CELERY_4) - https://gerrit.wikimedia.org/r/388160 (https://phabricator.wikimedia.org/T178441) (owner: Awight)
[20:48:05] Scoring-platform-team (Current), MediaWiki-extensions-ORES, MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Wikimedia-log-errors: ORES extension failing to parse scoring response - https://phabricator.wikimedia.org/T179430#3730860 (greg)
[20:48:41] Scoring-platform-team (Current), ORES, Patch-For-Review: ORES service erroring, in a way that throws exceptions in Extension:ORES - https://phabricator.wikimedia.org/T179107#3730863 (greg) status?
[20:48:43] greg-g: ^ did it come back?
[20:49:18] Scoring-platform-team (Current), MediaWiki-extensions-ORES, MW-1.31-release-notes (WMF-deploy-2017-11-07 (1.31.0-wmf.7)), Patch-For-Review, Wikimedia-log-errors: ORES extension failing to parse scoring response - https://phabricator.wikimedia.org/T179430#3730864 (awight) I deployed a [[ https...
[20:50:23] Halfak did you receive my msg a few mins ago?
[20:50:42] Zppix, sorry I missed it.
[20:50:45] * halfak thinks.
[20:50:58] awight: not that I know of, it's just an UBN! task with some action so I'm curious what's next/left to do
[20:51:03] Oh! Did I ask you to check out the [[mw:ORES/FAQ]] yet?
[20:51:03] [1] https://www.mediawiki.org/wiki/ORES/FAQ
[20:51:06] Zppix, ^
[20:51:32] Yes i did and i reviewed it already it looks okay
[20:52:12] Halfak ^
[20:52:40] Zppix, still some questions without answers
[20:52:46] Want to make an attempt at them?
[20:53:00] I could
[20:53:01] I've been adding some more stuff to it recently.
[20:53:29] E.g. https://www.mediawiki.org/wiki/ORES/FAQ#What_does_ORES'_architecture_look_like?
[20:53:49] Where are you gathering the questions halfak?
[20:55:04] Mostly just on that page.
[20:55:13] If you have a new question, feel free to add it.
[20:55:23] If you want, you could start a flow thread instead.
[20:55:36] halfak let me rephrase where are you finding questions to add
[20:56:07] Zppix, oh! Mostly what people have asked me in a bunch of different contexts so far.
[20:57:01] halfak: what if i created a google form for people to submit questions for x time and then see what are asked a lot then add those to faq?
[21:07:23] Zppix, why not invite people to ask questions on the talk page?
[21:08:03] Sure then want me to write an announcement to ai-l letting people know?
[21:11:33] Is this good? https://pl.wikipedia.org/w/index.php?hidebots=1&hidecategorization=1&hideWikibase=1&hidelog=1&limit=100&days=30&enhanced=1&damaging__likelybad_color=c3&damaging__verylikelybad_color=c5&title=Specjalna:Ostatnie_zmiany&highlight=1&urlversion=2&uselang=en I think it is related with https://phabricator.wikimedia.org/T175180
[21:12:45] ^ i looked at it looks lad a bad scoring
[21:13:10] awight: ^^
[21:13:52] "lad"?
[21:14:06] I meant like
[21:14:28] Zppix: I’m not seeing it off-hand, but you could report on https://www.mediawiki.org/wiki/ORES/Issues/Edit_quality
[21:14:28] So I should start phab task?
[21:14:54] awight: all the edits on that page are marked with likely have problems
[21:15:33] Awight https://usercontent.irccloud-cdn.com/file/dFfGjkOX/IMG_0136.PNG
[21:15:45] https://usercontent.irccloud-cdn.com/file/QEaCsxRr/IMG_0137.PNG
[21:18:39] wargo: Please do file a task if you don’t mind.
[21:18:47] I see the same thing, going straight to the ORES service: https://ores.wikimedia.org/v3/scores/plwiki/50818911?models=damaging
[21:19:04] awight: bad model?
[21:19:12] I think so
[21:19:33] awight: hopefully an easy fix... :/
[21:19:37] It seems to only be plwiki
[21:19:41] (i hope)
[21:21:14] * awight runs away
[21:30:12] looking at plwiki issue
[21:30:52] Looks like there might be something weird with the model. I'll do some local tests.
[21:33:37] OK so I've confirmed that when I run the model on my local machine, it's not making bad predictions.
[21:34:08] beta has the problem
[21:34:22] I suspect there's something weird with the pl dictionary on the deploy machines.
[21:34:39] Looks like it shows up in wmflabs too.
[21:36:04] Do you have https://phabricator.wikimedia.org/T175180 locally?
[21:41:49] wargo, yeah. Working with that model. Still trying to narrow down what is causing the problem.
[21:42:01] It seems there's something weird on WMF servers.
[21:46:54] Weird. if I manually generate the score on ores-misc, it all works correctly.
[21:48:18] Confirmed that we've got the right model-version deployed.
[21:52:27] Ha! OK. So when I run this model through the "score" utility, it behaves correctly, but when I start up an ORES instance it doesn't.
[21:53:17] Scoring-platform-team, ORES: Issues with ORES model on plwiki - https://phabricator.wikimedia.org/T179621#3731071 (Wargo)
[21:55:11] wargo, thanks for posting that. I'll be moving notes there soon.
[21:57:12] OK... So when I take the features that are generated by ores.wikimedia.org and give them to the model and 'score' utility locally, it still gives an OK prediction.
[21:57:27] Even on WMF servers
[21:57:52] The only thing I could think of is that we're accidentally generating the score with the WRONG model in ORES.
[22:06:35] It doesn't look like there's anything wrong with the config.
[22:08:57] I am very confused. I tried lots of obvious things.
[22:18:54] Ha! I just rubber-ducked it and got it figured out
[22:19:03] We switched the damaging and goodfaith models!
[22:19:12] * halfak lights his own hair on fire.
[22:21:28] (PS1) Halfak: Fixes damaging/goodfaith config for plwiki [services/ores/deploy] - https://gerrit.wikimedia.org/r/388252
[22:24:35] (CR) Halfak: [V: 2 C: 2] "Tested locally. Relatively simple config change. Self-merging" [services/ores/deploy] - https://gerrit.wikimedia.org/r/388252 (owner: Halfak)
[22:25:02] Looks like we're going to need to blow out some cache.
[22:27:57] (PS2) Halfak: Fixes damaging/goodfaith config for plwiki [services/ores/deploy] - https://gerrit.wikimedia.org/r/388252 (https://phabricator.wikimedia.org/T179621)
[22:28:08] (CR) Halfak: [V: 2 C: 2] Fixes damaging/goodfaith config for plwiki [services/ores/deploy] - https://gerrit.wikimedia.org/r/388252 (https://phabricator.wikimedia.org/T179621) (owner: Halfak)
[22:29:30] deploying fix to beta
[22:47:48] deploying fix to production
[23:00:58] deploying to ores-staging while I wait for prod deploy to complete
[23:20:35] Scoring-platform-team, ORES, Patch-For-Review: Issues with ORES model on plwiki - https://phabricator.wikimedia.org/T179621#3731296 (Halfak) The deploy finished. However, it looks like a bunch of bad scores got cached and will continue to show up in Wikipedia until we can get them purged (@Ladsgroup...
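Editor's sketch of a guard against the swap diagnosed above, where plwiki's damaging and goodfaith entries pointed at each other's model. The log does not show the real ORES config, so the layout below (a mapping of wiki to model name to model file path, with the model name appearing in the file name) and every path in the example are assumptions made for illustration only.

```python
def find_mismatched_models(config):
    """Return (wiki, model_name, path) for entries whose file name lacks the model name.

    `config` is a hypothetical layout: {wiki: {model_name: model_file_path}}.
    """
    problems = []
    for wiki, models in sorted(config.items()):
        for model_name, path in sorted(models.items()):
            filename = path.rsplit("/", 1)[-1]
            if model_name not in filename:
                problems.append((wiki, model_name, path))
    return problems


if __name__ == "__main__":
    # Made-up paths that reproduce the kind of swap described in the log.
    example = {
        "plwiki": {
            "damaging": "models/plwiki.goodfaith.model",
            "goodfaith": "models/plwiki.damaging.model",
        },
        "eswiki": {"damaging": "models/eswiki.damaging.model"},
    }
    for wiki, model_name, path in find_mismatched_models(example):
        print("%s: %s points at %s" % (wiki, model_name, path))
```

A check like this only catches swaps that are visible in file names; it would not notice a correctly named file that contains the wrong model.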
[23:53:26] Scoring-platform-team (Current), DBA, Operations, cloud-services-team, Patch-For-Review: Labsdb* servers need to be rebooted - https://phabricator.wikimedia.org/T168584#3731324 (madhuvishy) fyi @Cmjohnson We are not doing the labsdb1003 reboot on Tuesday Nov 7, due to T179464.