[06:04:07] 10Analytics: Yarn NM stopping due to failures while creating native threads - https://phabricator.wikimedia.org/T281792 (10elukey) [06:07:40] 10Analytics: Yarn NM stopping due to failures while creating native threads - https://phabricator.wikimedia.org/T281792 (10elukey) [06:48:19] Good morning [06:56:40] bonjour :) [07:46:57] 10Analytics-Radar, 10BetaFeatures, 10BlueSpice, 10CheckUser, and 46 others: Prepare User group methods for hard deprecation - https://phabricator.wikimedia.org/T275148 (10Aklapper) [07:47:36] 10Analytics-Radar, 10BetaFeatures, 10BlueSpice, 10CheckUser, and 46 others: Prepare User group methods for hard deprecation - https://phabricator.wikimedia.org/T275148 (10Aklapper) Re-added codebase tags, so people interested in tickets about a codebase can find these tickets about that codebase. [07:57:23] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for next deploy (already fixed in prod)" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/684314 (owner: 10Joal) [08:29:42] (03CR) 10Kosta Harlan: [C: 03+2] HompageVisit: Add welcomesurvey-originalcontext to referer_route [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/684042 (https://phabricator.wikimedia.org/T281232) (owner: 10Gergő Tisza) [08:30:54] (03Merged) 10jenkins-bot: HompageVisit: Add welcomesurvey-originalcontext to referer_route [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/684042 (https://phabricator.wikimedia.org/T281232) (owner: 10Gergő Tisza) [09:11:08] 10Analytics, 10Dumps-Generation, 10Wikidata, 10wdwb-tech: Wikidata all-json dumps not available from 2021-04-26 - https://phabricator.wikimedia.org/T281808 (10JAllemandou) [09:46:31] (03PS1) 10Kosta Harlan: helppanel: Add machineSuggestion as valid editor_interface [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/684821 (https://phabricator.wikimedia.org/T280564) [09:47:16] 10Analytics: Requesting a kerberos identity for user sihe - https://phabricator.wikimedia.org/T281809 
(10Silvan_WMDE) [09:47:30] (03PS2) 10Kosta Harlan: helppanel: Add machineSuggestion as valid editor_interface [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/684821 (https://phabricator.wikimedia.org/T280564) [10:09:15] Hi hnowlan - Quick question - Can I merge/deploy the dual-loading oozie job? More precisely, is the C3-cluster happy to be loaded? [10:18:38] joal: it is! Assuming it's okay that the data is wiped at some point in future when we're ready to do things for real [10:19:01] Great hnowlan - I'll deploy and restart the jobs today :) [10:19:02] joal: once we see that the jobs are doing what we want, we can start with the snapshot and import process though [10:19:08] \o/ [10:19:09] excellent! [11:26:50] morning team :) [11:35:12] Hi fdans [12:11:24] 10Analytics-Clusters, 10Analytics-Kanban: Migrate eventlog1002 to buster - https://phabricator.wikimedia.org/T278137 (10hnowlan) Do we plan on decommissioning or reclaiming the hardware for the old eventlog1002? [12:11:42] 10Analytics, 10Event-Platform, 10MW-1.36-notes (1.36.0-wmf.37; 2021-03-30), 10MW-1.37-notes (1.37.0-wmf.3; 2021-04-27), and 2 others: extensions/EventBus - hard deprecate User group methods - https://phabricator.wikimedia.org/T281825 (10Vlad.shapik) [12:12:07] 10Analytics, 10Event-Platform, 10MW-1.36-notes (1.36.0-wmf.37; 2021-03-30), 10MW-1.37-notes (1.37.0-wmf.3; 2021-04-27), and 2 others: extensions/EventBus - hard deprecate User group methods - https://phabricator.wikimedia.org/T281825 (10Vlad.shapik) p:05Triage→03Medium [12:27:51] (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] "I can't +2 in this codebase yet. But this change is obviously fine." [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682752 (owner: 10Awight) [12:33:36] (03CR) 10Thiemo Kreuz (WMDE): [C: 03+1] "Note: Lowering the threshold down to 40% does the trick." 
[analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682748 (https://phabricator.wikimedia.org/T193169) (owner: 10Awight) [13:30:08] 10Analytics, 10Dumps-Generation, 10Wikidata, 10wdwb-tech: Wikidata all-json dumps not available from 2021-04-26 - https://phabricator.wikimedia.org/T281808 (10WMDE-leszek) There are lexeme dumps in https://dumps.wikimedia.org/wikidatawiki/entities/20210430/ @ArielGlenn @JAllemandou who of you would be the... [13:30:34] 10Analytics-Clusters, 10Analytics-Kanban: Migrate eventlog1002 to buster - https://phabricator.wikimedia.org/T278137 (10elukey) Yes let's fully decommission eventlog1002 once we are ok with 1003 :) [13:34:35] (03Abandoned) 10Mholloway: [DNM] Test Gerrit voting rights [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/684468 (owner: 10Mholloway) [13:46:32] 10Analytics-Clusters, 10Analytics-Kanban: Migrate eventlog1002 to buster - https://phabricator.wikimedia.org/T278137 (10hnowlan) I think I'm okay with 1003 so far - it's been running all processors since 15:00 on the 29th of April and it seems to be [[ https://grafana.wikimedia.org/d/000000377/host-overview?or... [14:09:28] o/ [14:16:41] (03CR) 10Bearloga: "> Patch Set 8:" (031 comment) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: 10Neil P. Quinn-WMF) [14:26:02] (03CR) 10Neil P. Quinn-WMF: Create content_translation_event schema (032 comments) [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: 10Neil P. 
Quinn-WMF) [14:32:57] joal: looks like the only thing on the train is my fix from last week, I'm happy to deploy it if nobody else adds anything at standup [14:41:38] 10Analytics-Clusters, 10Analytics-Kanban, 10DBA, 10Data-Services, and 2 others: Convert labsdb1012 from multi-source to multi-instance - https://phabricator.wikimedia.org/T269211 (10Bstorm) [15:03:20] (03CR) 10Bearloga: [C: 03+2] Create content_translation_event schema [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: 10Neil P. Quinn-WMF) [15:03:36] (03CR) 10Bearloga: [C: 03+2] "LGTM" [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: 10Neil P. Quinn-WMF) [15:04:03] (03Merged) 10jenkins-bot: Create content_translation_event schema [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/680798 (https://phabricator.wikimedia.org/T254891) (owner: 10Neil P. Quinn-WMF) [15:18:52] 10Analytics, 10Dumps-Generation, 10Wikidata, 10wdwb-tech: Wikidata all-json dumps not available from 2021-04-26 - https://phabricator.wikimedia.org/T281808 (10WMDE-leszek) Apologies for the confusing comment above, I misread the description. [15:25:44] (03CR) 10Mforns: [V: 03+2 C: 03+2] "LGTM!" [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/682702 (https://phabricator.wikimedia.org/T193170) (owner: 10Awight) [15:28:34] (03PS2) 10Mforns: Add cawiki to the databases we check for preferences [analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/684303 (https://phabricator.wikimedia.org/T271894) (owner: 10Awight) [15:28:43] (03CR) 10Mforns: [V: 03+2 C: 03+2] "LGTM!" 
[analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/684303 (https://phabricator.wikimedia.org/T271894) (owner: 10Awight) [16:12:25] (03PS5) 10Joal: Update refinery-cassandra to cassandra 3.11 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/681605 (https://phabricator.wikimedia.org/T280649) [16:15:28] (03PS3) 10Joal: Update cassandra jobs for double loading [analytics/refinery] - 10https://gerrit.wikimedia.org/r/681678 (https://phabricator.wikimedia.org/T280649) [16:23:54] mforns: ping staff? [16:24:00] mforns: we're in da cave [17:01:10] ok, going for train [17:01:19] milimetric: do you wish we do it together? [17:06:47] 10Analytics, 10Dumps-Generation, 10Wikidata, 10wdwb-tech: Wikidata all-json dumps not available from 2021-04-26 - https://phabricator.wikimedia.org/T281808 (10hoo) [17:08:07] 10Analytics, 10Dumps-Generation, 10Wikidata, 10wdwb-tech: Wikidata all-json dumps not available from 2021-04-26 - https://phabricator.wikimedia.org/T281808 (10hoo) The runs last week failed (and no one noticed in time, see {T281267} for that). The JSON dumps should appear as usual this week. [17:08:34] milimetric, fdans, mforns, razzi - anything you'd like me to deploy in addition to what's already in the train etherpad? 
https://etherpad.wikimedia.org/p/analytics-weekly-train [17:16:31] ok, no answer, proceeding with current status [17:17:03] (03CR) 10Joal: [C: 03+2] "Merging for deploy" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/681605 (https://phabricator.wikimedia.org/T280649) (owner: 10Joal) [17:18:22] joal: was grabbing some food, I can launch and babysit the job to make sure all's well, thanks for the deploy [17:19:32] milimetric: I just started - I'll ping you when ready :) [17:26:17] (03Merged) 10jenkins-bot: Update refinery-cassandra to cassandra 3.11 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/681605 (https://phabricator.wikimedia.org/T280649) (owner: 10Joal) [17:30:37] (03PS1) 10Joal: Update changelog for v0.1.10 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685006 [17:30:55] milimetric, mforns, fdans - anyone for a merge --^ please ? [17:32:27] (03CR) 10Fdans: [V: 03+2 C: 03+2] Update changelog for v0.1.10 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685006 (owner: 10Joal) [17:32:39] Thanks fdans :) [17:33:16] joal: I always self-merge with the changelog CR, but I respect your adherence to the laws :) [17:33:49] fdans: I do it if noone is around - otherwise I take that as an opportunity to bother you :-P [17:45:38] hm - archiva hits the same problem Andrew was having yesterday [17:50:33] May I summon an elukey for some archiva help? [17:50:54] archiva gives me: T: Failed to read artifact descriptor for commons-codec:commons-codec:jar:1.15-SNAPSHOT: Could not transfer artifact commons-codec:commons-codec:pom:1.15-SNAPSHOT from/to wmf-mirror-spark (https://archiva.wikimedia.org/repository/mirror-spark/): Failed to transfer file [17:50:55] sure! 
[17:50:59] https://archiva.wikimedia.org/repository/mirror-spark/commons-codec/commons-codec/1.15-SNAPSHOT/commons-codec-1.15-SNAPSHOT.pom with status code 500 [17:51:47] The full error line is actually: [ERROR] Failed to execute goal on project refinery-job: Could not resolve dependencies for project org.wikimedia.analytics.refinery.job:refinery-job:jar:0.1.10-SNAPSHOT: Failed to collect dependencies at com.criteo:rsvd:jar:1.0 -> org.apache.spark:spark-core_2.11:jar:2.3.1 -> net.java.dev.jets3t:jets3t:jar:0.9.4 -> commons-codec:commons-codec:jar:1.15-SNAPSHOT: [17:51:53] Failed to read artifact descriptor for commons-codec:commons-codec:jar:1.15-SNAPSHOT: Could not transfer artifact commons-codec:commons-codec:pom:1.15-SNAPSHOT from/to wmf-mirror-spark (https://archiva.wikimedia.org/repository/mirror-spark/): Failed to transfer file https://archiva.wikimedia.org/repository/mirror-spark/commons-codec/commons-codec/1.15-SNAPSHOT/commons-codec-1.15-SNAPSHOT.pom [17:51:59] with status code 500 -> [Help 1] [17:52:02] sorry for the spam elukey [17:52:26] ah lovely so archiva returns 500s [17:53:06] lemme check if there are some weird logs [17:53:57] ah so archiva is trying, behind the scenes, to fetch [17:53:58] https://dl.bintray.com/spark-packages/maven/commons-codec/commons-codec/1.15-SNAPSHOT/commons-codec-1.15-SNAPSHOT.pom [17:54:03] but this gives a 403 [17:54:07] meh [17:54:40] the whole dl.bintray.com seems returning 403, that seems weird [17:54:47] either the site is decommed or they are having issues [17:55:10] it started yesterday, andrew had issues [17:55:33] joal: ah look at https://bintray.com/ [17:55:38] Thanks for supporting Bintray! This service has now been sunset, and to assist with migration to the JFrog Platform, paid accounts can login until July 4th. [17:55:47] * elukey plays sad_trombone.wav [17:55:58] /o\ [17:56:46] do we have bintray in our poms, or is it pulled in by another dep? 
[17:57:00] ok - And this is a down-the-line dependency from com.criteo:rsvd:jar:1.0 [17:57:09] ah okok [17:57:10] hm - trying a hack [17:57:39] ideally the rsvd criteo jar should have a more recent version with this dep fixed [17:57:42] "ideally" [17:58:07] last commit 2019, nope [17:58:34] elukey: worst case we'll fork [17:58:57] elukey: we'll need to do that in any case to migrate to Spark3 [18:00:09] makes sense yes [18:00:18] maybe we can try anyway to send a pull request [18:00:30] any idea about how to unblock this specific use case? [18:00:50] elukey: I'll fork and send PR - for the moment I'm trying to hack a local solution :) [18:01:36] ack lemme know what I can do to help [18:01:56] elukey: letting me know what the problem root cause was already helped a lot :) [18:01:59] thanks mate [18:02:02] <3 [18:02:09] I am around if needed, ping me :) [18:02:12] ok I think I have a hotfix [18:05:08] 10Analytics, 10Product-Analytics: Top read repeats - https://phabricator.wikimedia.org/T280011 (10kzimmerman) Thanks @Astinson ! @JAllemandou I raised the question of defaults (like "Cleopatra") with Partnerships and @Nicholas_Perry said he's going to see if he can get more info from our partners. Moving thi... [18:11:19] 10Analytics, 10Product-Analytics: Top read repeats - https://phabricator.wikimedia.org/T280011 (10JAllemandou) Thanks @kzimmerman for the heads up :) On our side we don't forget the improvement of heuristics. [18:14:05] (03PS1) 10Joal: Fix com.criteo:rsvd dependency issue [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685028 [18:14:33] joal: I am heading to dinner, will get back in a few, ok if I leave or do you need me? [18:14:35] elukey: --^ [18:14:39] ah!
:D [18:14:47] before dinner if you can (very small :) [18:14:48] * elukey reads [18:14:52] sorry to bother elukey [18:15:04] it is a pleasure joal [18:16:14] (03CR) 10Elukey: [C: 03+1] Fix com.criteo:rsvd dependency issue [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685028 (owner: 10Joal) [18:16:26] awesome elukey - thanks - Letting you go now :) [18:16:31] no need for new source version etc.. right? [18:16:36] all good [18:16:40] super [18:16:46] all right will check in ~30 mins! [18:16:47] ttl [18:16:55] (03CR) 10Joal: [C: 03+2] "Merging hotfix for deploy" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685028 (owner: 10Joal) [18:17:21] (03CR) 10Joal: [V: 03+2 C: 03+2] Fix com.criteo:rsvd dependency issue [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685028 (owner: 10Joal) [18:38:01] joal: how is it going? [18:40:11] mmm last build seems to have failed :( [18:42:10] (03CR) 10Razzi: "Ping, I think this is worth including" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/673128 (owner: 10Razzi) [18:49:28] mwarf [18:49:29] (03PS1) 10Joal: Fix error in com.criteo:rsvd dependency fix [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685036 [18:49:58] elukey: I'm just dumb and should have been more careful - Will wait for jenkins to test that fix before self merging --^ [18:50:28] joal: ahhh I thought that the lz4 stuff was on purpose and didn't ask [18:50:52] (03CR) 10Elukey: [C: 03+1] "Should have asked, this now makes more sense" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685036 (owner: 10Joal) [18:56:51] (will check later on) [18:57:39] 10Analytics, 10Analytics-Kanban: Switch off skipTrash for some data purging - https://phabricator.wikimedia.org/T270431 (10mforns) [19:06:20] (03CR) 10Joal: [C: 03+2] "Merging hotfix for deploy" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685036 (owner: 10Joal) [19:14:17] (03Merged) 10jenkins-bot: Fix error in com.criteo:rsvd 
dependency fix [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/685036 (owner: 10Joal) [19:15:50] 10Analytics-Clusters: Could not find class ::profile::swap for an-test-client1001.eqiad.wmnet - https://phabricator.wikimedia.org/T281917 (10razzi) [19:34:17] !log refinery v0.1.10 released to Archiva [19:34:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [19:37:06] for what it's worth, I did the same thing as Luca, thought the lz4 thing was on purpose. It's not fair to Joseph that we always think he knows better :) [19:39:30] (03PS1) 10Maven-release-user: Add refinery-source jars for v0.1.10 to artifacts [analytics/refinery] - 10https://gerrit.wikimedia.org/r/685072 [19:41:42] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy of refinery" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/685072 (owner: 10Maven-release-user) [19:43:37] (03CR) 10Joal: [V: 03+2 C: 03+2] "Merging for deploy" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/681678 (https://phabricator.wikimedia.org/T280649) (owner: 10Joal) [19:46:24] !log Deploying refinery using scap [19:46:28] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:12:31] !log Deploy refinery onto HDFSb [20:12:33] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:28:30] Ok we're all set - restarting jobs! 
[20:29:34] 10Analytics: reportupdater TLC - https://phabricator.wikimedia.org/T193167 (10awight) [20:29:37] 10Analytics: [reportupdater] eliminate the funnel parameter - https://phabricator.wikimedia.org/T193170 (10awight) 05Open→03Resolved [20:29:45] !log Kill-restart referer-daily job [20:29:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:30:59] !log Kill-restart 16 cassandra jobs [20:31:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:42:07] It all looks good - the one hour of test I had for cassandra3 double loading succeeded, the referer-daily jobs flows - it's all good [20:42:31] milimetric: could you please double check the referer data before we call it done? [20:42:39] milimetric: https://hue.wikimedia.org/hue/jobbrowser/#!id=0011298-210426062240701-oozie-oozi-C [20:46:49] And with that I'll call it a day [21:00:33] looking now, was in a meeting [21:02:07] looks great, thanks jo [21:15:30] 10Analytics-Clusters: Could not find class ::profile::swap for an-test-client1001.eqiad.wmnet - https://phabricator.wikimedia.org/T281917 (10razzi) Hm, that patch fixed the underlying issue, and running the check manually produces the intended result: ` razzi@puppetdb1002:~$ /usr/lib/nagios/plugins/check_puppet... [22:27:49] 10Analytics, 10Data-release, 10Privacy Engineering, 10Research, 10Privacy: Apache Beam go prototype code for DP evaluation - https://phabricator.wikimedia.org/T280385 (10Isaac) > We probably do not want to install beam on the cluster just for this experiment so can we use jupyter rather and run beam on p... [23:39:30] PROBLEM - Hadoop NodeManager on an-worker1130 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process