[08:26:07] <dcausse> yes red indices in codfw are from dec 10 2022 or before
[08:26:55] <dcausse> those indices should "autocleanup" so something's clearly not working as expected
[08:38:16] <dcausse> wdqs1009 is at chunk 763/1042 (running since jan8)
[08:47:33] <dcausse> errand
[10:29:04] <pfischer> o/ Hi! I started looking into spark and migrating from 2 to 3 on Friday. However, I’m still struggling to get started with https://github.com/wikimedia/wikimedia-discovery-analytics. dcausse: Didn’t you mention a spark-project, that was a bit closer to java land (project-structure/test-wise)?
[10:30:22] <dcausse> pfischer: hi! sure, it's the rdf-spark-tools in the query/service/rdf repo (you might have it already checked-out as IIRC you had made a few patches in this repo already)
[10:31:10] <pfischer> Indeed, thanks!
[10:32:36] <dcausse> it has a couple spark jobs that are all scheduled from the https://github.com/wikimedia/wikimedia-discovery-analytics repo
[10:34:18] <dcausse> so I think the rough plan would be to: 1/ make the scala code spark3 compliant, 2/ make a release, 3/ update wikimedia-discovery-analytics to use that new released artifact and possibly change how we submit the job to use spark3-submit instead of spark2-submit (or something like that)
[10:39:32] <pfischer> That helps, thank you, David!
[10:45:24] <pfischer> Do we have any constraints on the Scala version (stick to 2.12+ or 3.x)?
[10:47:29] <dcausse> hm.. I think scala3 might be hard to get working so most probably safer to stick to at least 2.12, we can ask DE if they started to use scala 2.13
[10:48:56] <dcausse> for the flink app I think I'll stick to scala 2.12.13 (c.f. https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/879822/)
[10:50:42] <dcausse> seems like 2.12 should be preferred (T291464)
[10:50:43] <stashbot> T291464: Upgrade analytics-hadoop to Spark 3 + scala 2.12 - https://phabricator.wikimedia.org/T291464
[11:18:11] <gehel> Lunch
[11:24:56] <dcausse> lunch 2
[14:58:49] <dcausse> tempted to delete cirrus-integ instance in the "search" wmcloud project
[14:59:05] * gehel has no idea what this is used for
[14:59:38] <dcausse> it was cindy running on pre ES7 codebase
[15:00:03] <gehel> seems unlikely that we'll roll back to ES6 !
[15:01:33] <gehel> dcausse: k8s meeting in https://meet.google.com/wjt-srdx-cgq
[16:01:44] <gehel> pfischer: triage meeting in https://meet.google.com/eki-rafx-cxi