[00:43:12] 10Analytics-Radar, 10AbuseFilter, 10Cognate, 10ConfirmEdit (CAPTCHA extension), and 28 others: Replace PageContent(Insert|Save)Complete hooks - https://phabricator.wikimedia.org/T250566 (10DannyS712) [05:38:18] (03PS2) 10Fdans: [wip] Add filter/split component to Wikistats [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/607768 (https://phabricator.wikimedia.org/T249758) [05:40:01] (03CR) 10jerkins-bot: [V: 04-1] [wip] Add filter/split component to Wikistats [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/607768 (https://phabricator.wikimedia.org/T249758) (owner: 10Fdans) [06:07:44] hola fdans [06:07:46] morning :) [06:24:15] elukey: hellooooo [06:59:50] 10Analytics, 10Product-Analytics: Add editors_monthly data to Druid - https://phabricator.wikimedia.org/T256719 (10cchen) [07:00:03] 10Analytics, 10Product-Analytics (Kanban): Add editors_monthly data to Druid - https://phabricator.wikimedia.org/T256719 (10cchen) [07:01:04] 10Analytics, 10Product-Analytics (Kanban): Add editors_monthly data to Druid - https://phabricator.wikimedia.org/T256719 (10cchen) [07:01:06] 10Analytics, 10Product-Analytics, 10Patch-For-Review: Add dimensions to editors_daily dataset - https://phabricator.wikimedia.org/T256050 (10cchen) [07:01:22] 10Analytics, 10Product-Analytics (Kanban): Add editors_monthly data to Druid - https://phabricator.wikimedia.org/T256719 (10cchen) p:05Triage→03Medium [07:04:04] going afk for the morning, ttl! 
[08:20:44] Hi team - back from the dentist [09:16:05] (03PS1) 10Joal: Make mediawiki_history skewed join deterministic [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608567 (https://phabricator.wikimedia.org/T255548) [09:17:15] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update skewed-join strategy in Mediawiki-history to prevent errors in case of task-retry - https://phabricator.wikimedia.org/T255548 (10JAllemandou) a:03JAllemandou [09:20:44] 10Analytics, 10Analytics-Kanban: Rename pageview_actor_hourly to pageview_actor - https://phabricator.wikimedia.org/T256415 (10JAllemandou) ping @Nuria on that one, so that we rename fast if we want to do so :) [09:32:17] !log Kill/Restart mediawiki-wikitext-history job now that the current month one is done (bz2 fix) [09:32:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [09:34:49] MEH!!!! I have forgotten to bump the jar versions for mediawiki-wikitext-history :( [09:43:41] (03PS1) 10Joal: Update mediawiki-wikitext jobs to use fixed bz2 codec [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608572 (https://phabricator.wikimedia.org/T243241) [09:50:39] 10Analytics, 10Analytics-Kanban, 10Product-Analytics: 'namespace_is_content' column in pageview data returns 1, 0 and NULL as booleans in Superset/Turnilo - https://phabricator.wikimedia.org/T255222 (10JAllemandou) a:03JAllemandou [09:50:48] 10Analytics, 10Analytics-Kanban, 10Dumps-Generation, 10Patch-For-Review: Some xml-dumps files don't follow BZ2 'correct' definition - https://phabricator.wikimedia.org/T243241 (10JAllemandou) [09:51:00] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Delete pageview_actor_hourly data after 90 days - https://phabricator.wikimedia.org/T256417 (10JAllemandou) [09:51:04] 10Analytics, 10Analytics-Kanban, 10Patch-For-Review: Update skewed-join strategy in Mediawiki-history to prevent errors in case of task-retry - https://phabricator.wikimedia.org/T255548 (10JAllemandou) 
[09:51:14] 10Analytics, 10Analytics-Kanban, 10Analytics-Wikistats: Define reduce calculations needed to compute active editors per project family - https://phabricator.wikimedia.org/T249751 (10JAllemandou) [09:51:33] 10Analytics, 10Analytics-Kanban: Create intermediate dataset: pageview with actor information - https://phabricator.wikimedia.org/T255467 (10JAllemandou) [09:51:54] 10Analytics, 10Analytics-Kanban: Unique devices, retrofit with bot detection code - https://phabricator.wikimedia.org/T250744 (10JAllemandou) [09:55:12] (03PS1) 10Joal: Update namespace_is_content field in druid pageview [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608576 (https://phabricator.wikimedia.org/T255222) [10:42:20] 10Analytics-Radar, 10AbuseFilter, 10Cognate, 10ConfirmEdit (CAPTCHA extension), and 28 others: Replace PageContent(Insert|Save)Complete hooks - https://phabricator.wikimedia.org/T250566 (10DannyS712) [11:56:47] (03PS2) 10Joal: Update mediawiki-wikitext jobs to use fixed bz2 codec [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608572 (https://phabricator.wikimedia.org/T243241) [12:26:42] joal: timer for pageview actors added on launcher! 
[12:33:50] \o/ [12:33:56] Thanks elukey [13:03:27] PROBLEM - Hadoop NodeManager on analytics1068 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration [13:04:42] 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, 10CPT Initiatives (Modern Event Platform (TEC2)), and 3 others: Modern Event Platform (TEC2) - https://phabricator.wikimedia.org/T185233 (10Ottomata) [13:06:23] 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, 10CPT Initiatives (Modern Event Platform (TEC2)), and 3 others: Modern Event Platform (TEC2) - https://phabricator.wikimedia.org/T185233 (10Ottomata) [13:07:18] 10Analytics-EventLogging, 10Analytics-Kanban, 10Event-Platform, 10CPT Initiatives (Modern Event Platform (TEC2)), and 3 others: Modern Event Platform (TEC2) - https://phabricator.wikimedia.org/T185233 (10Ottomata) [13:08:06] mmmmm [13:09:11] 10Analytics, 10Analytics-EventLogging, 10Event-Platform, 10Services (watching): Modern Event Platform: Schema Repostories - https://phabricator.wikimedia.org/T201063 (10Ottomata) [13:10:58] 10Analytics, 10Analytics-EventLogging, 10Event-Platform, 10Services (watching): Modern Event Platform: Schema Repostories - https://phabricator.wikimedia.org/T201063 (10Ottomata) [13:11:07] 10Analytics, 10Analytics-EventLogging, 10Event-Platform, 10Services (watching): Modern Event Platform: Schema Repostories - https://phabricator.wikimedia.org/T201063 (10Ottomata) [13:12:29] RECOVERY - Hadoop NodeManager on analytics1068 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration [13:12:43] !log restart nodemanager on analytics1068 after GC overhead and OOMs [13:12:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log 
[13:15:01] from the logs it seems that application_1592377297555_52845 caused some overhead, probably shuffle-related? [13:15:50] or maybe it was caught in the middle [13:16:33] 20/06/30 12:59:38 ERROR RetryingBlockFetcher: Exception while beginning fetch of 995 outstanding blocks (after 1 retries) [13:16:36] java.io.IOException: Failed to connect to analytics1068.eqiad.wmnet/10.64.53.28:7337 [13:16:41] that is the shuffler port [13:23:42] (03PS2) 10Bearloga: Update webrequest hive jar version for pageview-def [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608447 (https://phabricator.wikimedia.org/T256514) (owner: 10Joal) [13:25:15] * elukey goes afk :) [13:31:33] (03CR) 10Bearloga: [C: 03+1] "LGTM" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608447 (https://phabricator.wikimedia.org/T256514) (owner: 10Joal) [13:36:31] (03PS2) 10Ottomata: Deploy camus-wmf-0.1.0-wmf10.jar [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608460 (https://phabricator.wikimedia.org/T256370) [13:42:22] joal o/ [13:42:41] whats status of refinery deploy? i also want to deploy my camus changes [13:52:51] 10Analytics-EventLogging, 10Analytics-Radar, 10Beta-Cluster-Infrastructure, 10Product-Analytics, 10Wikipedia-Android-App-Backlog (Android-app-release-v2.7.31x-P-Pryanik): Remove MobileWikiAppProtectedEditAttempt schema from Android app - https://phabricator.wikimedia.org/T254567 (10Dbrant) 05Open→03Re... [13:59:34] hey ottomata - I actually sent a bunch of patches this morning [14:00:06] ah i should review? 
[14:00:16] ottomata: most of them are small-ish, it'd be great to have them merged in the next hours, so that I could deploy them as well later [14:00:22] ottomata: please feel fre :) [14:00:25] k [14:00:58] (03CR) 10Ottomata: [C: 03+2] Make mediawiki_history skewed join deterministic [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608567 (https://phabricator.wikimedia.org/T255548) (owner: 10Joal) [14:01:42] joal i think marcel usually reviews these but maybe you can since he's gone [14:01:42] https://gerrit.wikimedia.org/r/c/analytics/refinery/+/607615 [14:01:47] we could include that in deploy too [14:01:51] ottomata: I plan on deploying after retro, so no rush (I need to care kids in the meantime) [14:02:05] (03CR) 10Ottomata: [C: 03+2] Update namespace_is_content field in druid pageview [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608576 (https://phabricator.wikimedia.org/T255222) (owner: 10Joal) [14:02:07] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Update namespace_is_content field in druid pageview [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608576 (https://phabricator.wikimedia.org/T255222) (owner: 10Joal) [14:02:40] (03PS3) 10Ottomata: Update mediawiki-wikitext jobs to use fixed bz2 codec [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608572 (https://phabricator.wikimedia.org/T243241) (owner: 10Joal) [14:02:50] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Update mediawiki-wikitext jobs to use fixed bz2 codec [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608572 (https://phabricator.wikimedia.org/T243241) (owner: 10Joal) [14:03:00] k [14:03:16] you might predict that some benevolent(?) 
dictator might cancel retro [14:03:32] ottomata: I'm not confident in merging the whitelist patch, as I have no idea if those fields are PII or not etc - I can check for correctness, but no more :( [14:03:40] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Update webrequest hive jar version for pageview-def [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608447 (https://phabricator.wikimedia.org/T256514) (owner: 10Joal) [14:03:43] ottomata: possible :) [14:03:58] yeah true joal [14:04:01] i guess we can wait on that one [14:04:03] at least for nuria [14:04:19] ottomata: might depend on the return of the Boos :) I think it's today [14:04:30] s/Boos/Boss [14:04:32] joal, i'd like to move with the camus stuff while I can [14:04:37] OH NO THE BOSS IS BACK!? [14:04:46] since it doesn't take a refinery source deploy, and i can limit to just an launcher [14:04:48] may i proceed? [14:04:52] https://gerrit.wikimedia.org/r/c/analytics/refinery/+/608460 [14:04:53] and [14:05:07] https://gerrit.wikimedia.org/r/c/operations/puppet/+/608622 [14:05:33] Need to go for kids - ottomata please move, but not too fast (triple check webrequest for instance please: ) [14:05:42] i did a webrequset job yesterday [14:05:48] into my own dir [14:05:50] suceeded [14:05:51] ok perfect [14:05:55] MOVE ON :) [14:05:56] i'm actually not changing the jar version for webrequest here [14:05:57] only el [14:06:10] we can update the jar default version later after we are sure it works for el [14:06:16] ack [14:06:25] dropping - later - (miss standup as new time) [14:06:29] ok proceeding, laters! 
[14:06:35] (03PS3) 10Ottomata: Deploy camus-wmf-0.1.0-wmf10.jar [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608460 (https://phabricator.wikimedia.org/T256370) [14:06:44] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Deploy camus-wmf-0.1.0-wmf10.jar [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608460 (https://phabricator.wikimedia.org/T256370) (owner: 10Ottomata) [14:13:42] elukey: for when you ar back [14:13:50] have we done a refinery deploy since moving to archiva1002? [14:13:55] i think we missed some artifacts [14:14:05] there are some python venvs for article-recommender? [14:14:08] i wonder if we still use them [14:17:26] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10Ottomata) I think we might be missing some referenced artifacts in refinery. ` ./hive-jdbc-1.1.0-cdh5.10.0.jar:#$# git-fat 08067db8f8120d408a324159ba981905... [14:25:32] 10Analytics: Check home/HDFS leftovers of nathante - https://phabricator.wikimedia.org/T256356 (10Halfak) We're in the process of renewing @Groceryheist's MOU after all. He'll need to keep access. We're in the process of getting the paperwork squared away right now. Sorry for the mixup. [14:28:35] ottomata: o/ [14:28:55] no deployments yet with archiva1002 [14:28:58] what's missing? 
[14:29:19] I rsynced an exact copy of archvia1001 to 1002 before making the switch [14:31:49] ah I see from the task [14:33:49] sigh ./repositories/mirrored/org/apache/hive/hive-jdbc/1.1.0-cdh5.10.0/hive-jdbc-1.1.0-cdh5.10.0.jar [14:33:57] that's another upload [14:34:20] or the cloudera's repo doesn't include it anymore [14:35:12] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10elukey) Lovely :( ` ./repositories/mirrored/org/apache/hive/hive-jdbc/1.1.0-cdh5.10.0/hive-jdbc-1.1.0-cdh5.10.0.jar.sha1 ./repositories/mirrored/org/apache... [14:35:50] yes all in mirrored [14:38:20] but https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hive/hive-jdbc/1.1.0-cdh5.10.0/ [14:40:29] ahhh elukey@archiva1002:/var/lib/archiva/repositories$ ls mirror-cloudera/org/apache/hive/hive-jdbc/1.1.0-cdh5.15.0/ [14:41:27] on archiva1002 we have 5.15 [14:42:03] but we try to pull 5.10 from git fat [14:44:32] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10elukey) Very weird: ` elukey@archiva1002:/var/lib/archiva/repositories$ ls mirror-cloudera/org/apache/hive/hive-jdbc 1.1.0-cdh5.15.0 maven-metadata.xml m... [14:45:50] ok right so nothing pulled in 5.10 yet [14:46:03] so no git-fat link and hence the error [14:48:06] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10Ottomata) I downloaded the 2 article-recommender venvs from archiva-old and uploaded them to archiva: ` Artifacts for 'article-recommender:venv:0.0.1', pac... 
[14:48:36] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10elukey) Ok I think I know what happened: 1) I made an rsync of 1001 to 1002, containing the 5.10 artifact in mirrored. 2) Cleaned up mirrored, and set up s... [14:49:02] ah snap arrived too late :) [14:49:28] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10elukey) @Ottomata I arrived too late, thanks for the fix! [14:49:56] nice, all fixed then [14:50:39] going afk again :) [14:56:17] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10Ottomata) Hm, I spoke too soon, I think the upload didn't quite work in the way I expected? Even though I uploaded all the files, there was a .jar and a -s... [15:02:18] PROBLEM - Check the last execution of eventlogging_to_druid_editattemptstep_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_editattemptstep_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:02:24] PROBLEM - Check the last execution of eventlogging_to_druid_prefupdate_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_prefupdate_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:03:41] a-team standup? [15:03:59] it's an hour later today, but I'm missing and so is Luca [15:04:11] ottomata: I'm at another meeting :) [15:04:20] ? 
my cal has it at 11 [15:05:22] PROBLEM - Check the last execution of eventlogging_to_druid_navigationtiming_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_navigationtiming_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:06:52] PROBLEM - Check the last execution of refine_sanitize_eventlogging_analytics_immediate on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refine_sanitize_eventlogging_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:08:16] PROBLEM - Check the last execution of eventlogging_to_druid_netflow_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_netflow_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:10:14] elukey: if you come back I need help [15:10:23] with archiva [15:10:37] hmm, OH i know why those are filaing, the refinery deploy i just did failed even though it told me it succeeded!!!!! [15:10:38] (he's trying to take the day off) [15:10:44] OH OOOPS [15:11:02] the timers failing you mean, are because of the deploy? [15:11:18] yes, i deployed just to an launcher 1002 and git fat is all messed up somehow because of the archiva transition \ [15:11:27] it didnt't pull any artifacts afaict [15:12:30] trying to just roll back [15:13:47] I am here sorry [15:14:02] k, I'm trying to run eventlogging_to_druid_prefupdate_hourly to backfill the gap, I'll wait 'till the main timer recovers [15:14:11] so git-fat not working on an-launcher1002? [15:14:32] Luca! 
Go relax and eat steaks and red beets [15:14:47] ahahahah yes I'll do it in a sec, feel bad for the archiva issue [15:14:49] (I dunno, it's what my parents told me to do after giving blood) [15:15:19] elukey: correct, none of the artifacts are pulled, but i think it is because SOME files are missing [15:15:26] i haven't got it quite right yet [15:15:51] milimetric: do you know how to use scap to just revert to previous deploy symlink, not actually deploy an old version? [15:16:00] even rolling back to old revision isn't working, because that tries to git fat pull [15:16:13] it actually told me the deploy succeeded once [15:16:15] nope, looking it up though [15:16:17] which is why i thought things were fine [15:16:29] but after there were no artifacts pulled [15:16:34] ottomata: can I try to drop refinery from an-launcher1002 and run puppet? [15:16:44] to see if it works or not [15:16:45] elukey: no [15:16:48] it will fail [15:16:49] i'm pretty sure [15:17:03] because the git fat pull fails because of missing artifacts [15:17:07] it seems to run one rsync command for all [15:17:16] i'm going to try to just manually update the scap symlink [15:17:44] oh no, because I tried to deploy an old revision [15:17:50] and we only keep two revs of scap deploys [15:17:52] the old one is gone [15:17:54] sigh [15:18:43] ottomata: then we may try to just run git fat pull and see what's missing, if it errors out. We can stop timers for the time being, and work on launcher [15:19:04] that could work [15:19:14] elukey: i dunno what was up with the archiva upload [15:19:18] it said everything uploaded fine [15:19:21] for the missing hive artifacts [15:19:26] but after the fact it doesn't have all the files i've uploaded [15:19:34] and I can't re-upload them properly it seems?
not sure [15:19:45] i kinda want to delete that version from archiva and upload ONLY the jar we need [15:19:50] not all the -standalone and pom stuff [15:19:57] but, pwstore NEVER WORKS for me [15:20:01] so i don't know what the archiva admin password is [15:20:24] ottomata: you are an admin, it checks ldap [15:20:29] hmmm [15:20:35] ok then admins can't delete? [15:21:04] from the UI? It should work, at least it worked for me when I dropped all the old artifacts [15:21:13] i don't see any buttons.... [15:21:17] https://archiva.wikimedia.org/#artifact/org.apache.hive/hive-jdbc/1.1.0-cdh5.10.0 [15:22:16] wait.... [15:22:37] maybe it is a config of the repo [15:23:11] elukey: so i dunno what is happening, the artifacts it is showing me in the UI are not the ones I see on the CLI [15:23:13] e.g. [15:23:25] 1.1.0-cdh5.10.0 jar 94.51 K [15:23:34] 15:22:44 [@archiva1002:/var/lib/archiva] $ ll repositories/analytics-old-uploads/org/apache/hive/hive-jdbc/1.1.0-cdh5.10.0/ [15:23:40] -rw-r--r-- 1 archiva archiva 23649691 Jun 30 14:58 hive-jdbc-1.1.0-cdh5.10.0.jar [15:23:42] !log stop timers on an-launcher1002 to ease debugging for refinery deploy [15:23:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [15:24:38] (didn't find any other ways to scap deploy besides -r REV [15:24:39] ) [15:24:51] thanks for looking milimetric [15:24:54] ottomata: from https://archiva.wikimedia.org/#artifact~analytics-old-uploads/org.apache.hive/hive-jdbc I see the trash button [15:25:00] i could copy the cached version from e.g. stat1004 over to an-launcher [15:25:01] (while being logged in) [15:25:05] but it looks like we are going to fix the deploy [15:25:16] elukey: i do not [15:25:32] can you trash that?
[15:25:32] and [15:26:10] AH elukey i think the UI is weird because I did the upload twice, hoping it would overwrite the files [15:26:23] ottomata: one thing that seems to have worked is a direct rsync from 1001 to 1002 and then a forced manual reindex from the UI, I did it for the other deps in analytics-old. [15:26:33] hm [15:26:38] elukey: can you also trash [15:26:38] PROBLEM - Check the last execution of refine_event on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit refine_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [15:26:38] https://archiva.wikimedia.org/#artifact/org.apache.hive/hive-service [15:26:43] the 5.10.0 version [15:26:43] ? [15:26:53] sure [15:27:01] hm [15:27:02] * ottomata https://archiva.wikimedia.org/#basicsearch/hive-service [15:27:15] oh sorry i just can't read, yes there is only one 5.10 version [15:27:53] ok dropped [15:28:02] I can rsync now from archiva-old, and re-index [15:28:07] oh [15:28:07] hm [15:28:10] does that work? [15:28:16] how does it know which repo to go in? [15:28:18] well it did for the other deps that were missing [15:28:20] oh duh [15:28:21] it's on disk [15:28:22] ok [15:28:34] ok doing [15:28:41] so hive-service and hive-jdbc right? [15:28:56] yes [15:29:02] they are in mirrored on old [15:29:09] you can put them wherever you think is best in new [15:29:19] i just did analytics-old-uploads because i could upload from UI [15:36:27] elukey: how goes? [15:38:29] archiva is not collaborating, one sec [15:38:49] ok [15:41:03] it is weird that I can still see https://archiva.wikimedia.org/#browse~analytics-old-uploads/org.apache.hive [15:41:26] yea, maybe the dir just sticks around even though files are removed?
[15:42:07] I removed it, weird [15:42:54] ahhh archiva what a pain [15:44:07] another solution that I have in mind is to drop the analytics-old repository, so we start fresh, and we rsync manually all the deps needed [15:44:26] I think my trick doesn't work since archiva has some state somewhere that I haven't found [15:44:42] the deps that I copied last week were two, so very quick to do [15:44:55] yeah [15:45:03] elukey: or i could just upload the two jars we need [15:45:19] i tried to upload all the artifacts for them before [15:45:29] we only need the exact jars deployed for this specific use case [15:45:46] ah ok, do you think it will be different? [15:46:23] yes, somehow it couldnt' figure out the difference between e.g. the standalone jar and the normal one [15:46:28] so it just kept the standalone one [15:46:30] for example, even if the artifacts have been dropped [15:46:30] https://archiva.wikimedia.org/#artifact/org.apache.hive/hive-jdbc [15:46:37] I still see 5.10 [15:46:52] ya interestingly that is the correct jar!!! [15:46:57] lemme try to upload that one and see what happens [15:48:01] if we really need that jar we should somehow add it as explicit dependency in source, and it will be pulled by mirror-cloudera in theory [15:48:39] ah wait! [15:48:40] https://archiva.wikimedia.org/#artifact~analytics-old-uploads/org.apache.hive/hive-jdbc [15:48:42] i think it doesn't exist in cloudera anymore? 
and i think we can't add it in source because it conflicts [15:48:45] yeah i just uploaded [15:48:53] goood [15:49:01] file looks fine on disk too [15:49:02] it does exist in the repo, I checked [15:49:04] let me try service [15:49:27] sure [15:49:35] I have to say that I am not impressed about archvia [15:49:37] *archiva [15:49:55] I know that there are probably some tips and things that we don't know about operating it [15:49:58] but it is a complete pain [15:50:08] very brittle, it breaks often [15:50:32] anyway [15:52:32] 10Analytics, 10Better Use Of Data, 10Product-Analytics: Bug: 'Include Time' option in table visualization produces "0NaN-NaN-NaN NaN:NaN:NaN" - https://phabricator.wikimedia.org/T256136 (10mpopov) Filed: https://github.com/apache/incubator-superset/issues/10203 The bug report template asks for python stackt... [15:53:13] looks good elukey [15:53:17] elukey: i agree [15:53:26] ok going to attempt a refinery deploy [15:53:30] hopefully everything is in place now [15:53:35] ottomata: good luck! [15:53:40] all right, timers are still stopped [15:53:53] we can enable once we see everything looking good [15:54:51] hmm, scap deploy did not pull artifacts [15:54:58] attempting a manual git fat pull on an-launcher1002 [15:55:31] looking good, it is pulling them [15:56:19] ok! [15:56:22] looks good elukey ! [15:56:22] -rw-r--r-- 1 analytics-deploy analytics-deploy 104904412 Jun 30 15:55 artifacts/org/wikimedia/analytics/refinery/refinery-job-0.0.128.jar [15:57:37] \o/ [15:57:54] but scap didn't work? [15:58:17] no, well [15:58:18] it succeeded [15:58:29] but did not say anything about git fat pull [15:58:34] failing or succeeding [15:59:55] 10Analytics, 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Move Archiva to Debian Buster - https://phabricator.wikimedia.org/T252767 (10Ottomata) We had to trash my uploads, and then I re-uploaded ONLY the .jar files we needed, not the extraneous sources or javadocs or poms.
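(Editorial sketch, not part of the channel log.) The deploy above reported success even though no artifacts had been pulled, and the task comment quoted earlier shows what an unpulled file looks like: a small text stub beginning with `#$# git-fat `. A minimal way to check a checkout for such stubs might look like the following. The function names are hypothetical, and this is not what scap or git-fat run themselves; it only looks for the stub prefix seen in the log.

```python
import os

# Prefix of the text stub that git-fat leaves in place of a binary that has
# not been pulled yet (as seen in the grep output quoted in the log).
GIT_FAT_COOKIE = b"#$# git-fat "


def is_git_fat_stub(path):
    """Return True if the file at `path` still starts with the git-fat stub prefix."""
    with open(path, "rb") as handle:
        return handle.read(len(GIT_FAT_COOKIE)) == GIT_FAT_COOKIE


def find_unpulled_artifacts(root):
    """Walk `root` and return the sorted relative paths of files that are still stubs."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own metadata directory; it holds no working-tree artifacts.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and is_git_fat_stub(full):
                missing.append(os.path.relpath(full, root))
    return sorted(missing)
```

Run against the deployed refinery directory, an empty result would suggest that every artifact was actually fetched; any path returned is a file that is still a stub.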
[16:00:50] 10Analytics-Radar, 10AbuseFilter, 10Cognate, 10ConfirmEdit (CAPTCHA extension), and 28 others: Replace PageContent(Insert|Save)Complete hooks - https://phabricator.wikimedia.org/T250566 (10DannyS712) [16:01:36] 10Analytics, 10CirrusSearch, 10Cognate, 10Discovery-Search, and 18 others: Replace TitleMoveComplet(e|ing) hooks - https://phabricator.wikimedia.org/T250023 (10DannyS712) [16:01:42] elukey: can you reenable timers? [16:02:27] standup a-team? [16:02:31] joining [16:04:51] ottomata: yep doing it [16:05:18] !log re-enable timers on an-launcher1002 after archiva maintenance [16:05:19] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [16:06:29] ottomata: all enabled [16:06:33] elukey: I wanted to try a jenking refinery-source this evening - do you prefer me waiting for tomorrow? [16:06:45] gr8 [16:07:01] joal: nono please go ahead, I'll be around in case I am needed, if it breaks it is good to have Andrew around :) [16:07:05] and luca's off thank you luca!!!! [16:07:11] <3 [16:07:22] Thank you elukey :) [16:08:20] no standup and no retro today everybody! nobody around! :) [16:09:17] :) [16:09:20] Deploy time then :) [16:11:47] (03PS1) 10Joal: Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608661 [16:11:53] ottomata: --^ if you have aminute [16:11:54] (03CR) 10jerkins-bot: [V: 04-1] Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608661 (owner: 10Joal) [16:11:59] WAT? [16:12:25] weiird [16:12:32] feels archiva related again [16:12:36] haha [16:12:37] no way [16:12:38] can't be [16:12:53] that's a merge failure [16:12:58] hm [16:13:00] ok [16:13:03] :) [16:13:06] joal: try to pull and rebase locally? [16:13:07] archiva-everything! [16:13:20] ottomata: just did that! 
[16:14:30] ottomata: When I click `rebase` in gerrit UI it tells me the change is already up to date with master [16:14:38] which my local repo claims as well [16:14:47] hmmm [16:14:49] very weird [16:14:56] should we ask rel-eng? [16:15:22] we could remove jenkins -1 and just try to merge? or abandoned and resbumit to see what haappens just in case [16:15:29] abandon*( [16:15:33] will try to abandon/resubmit [16:15:42] * ottomata is very bad at typing with this new ergo keyboard [16:15:50] ok [16:16:03] (03Abandoned) 10Joal: Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608661 (owner: 10Joal) [16:17:55] (03PS1) 10Joal: Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 [16:18:02] (03CR) 10jerkins-bot: [V: 04-1] Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:18:02] ottomata: --^ let's see [16:18:06] meeeeeeeh [16:18:18] whaaa [16:18:40] (03CR) 10Joal: "recheck" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:19:53] joal i just tried to rebase locally and it did rebaes, but now i can't git review [16:20:04] its trying to submit all the patches underneath of it that we just merged.... [16:20:06] weird [16:20:29] ottomata: if there were patches in addition, seems legit [16:20:52] (03PS2) 10Ottomata: Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:21:18] same ottomata [16:21:28] weird [16:21:29] :( [16:21:48] ya lets ask releng [16:22:04] ottomata: prod chan? 
[16:23:04] #wikimedia-releng [16:23:07] just posted in there [16:23:11] Ahhh [16:26:14] (03CR) 10Ottomata: [V: 03+2 C: 03+2] "recheck" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:26:20] (03CR) 10Ottomata: [C: 03+2] Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:26:22] (03CR) 10jerkins-bot: [V: 04-1] Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:26:28] (03CR) 10jerkins-bot: [V: 04-1] Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:27:10] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Add v0.0.129 to changelog.md [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/608662 (owner: 10Joal) [16:28:31] joal: i think manual merge was fine [16:28:36] ack elukey - [16:28:39] will check [16:28:41] jenkins is just nasty [16:29:19] (03PS1) 10Joal: Bump mediawiki-history-denormalize jar version [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608665 (https://phabricator.wikimedia.org/T255548) [16:29:32] ok - I can deal with that for today - I'll get back to it sometimes it guard is off [16:29:40] ottomata: the above patch please :) [16:29:59] RECOVERY - Check the last execution of refine_event on an-launcher1002 is OK: OK: Status of the systemd unit refine_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [16:30:24] !log Deploy refien [16:30:25] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [16:30:27] sorrrrryyyy [16:30:42] !log Release refinery-source v0.0.129 using jenkins [16:30:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [16:31:06] Starting build #50 for job analytics-refinery-maven-release-docker [16:31:56] joal: done, about to go for a run, shall I wait for more? 
[16:32:03] or should I wait for deploy to be 100% :) [16:33:01] ottomata: you can go - jenkins either succeeds or fail ,then I try to continue deploying or not - I'll actually go have diner with family in the mid-time - so let's recombine in some time [16:33:09] ok [16:33:17] eventlogging camus looks good [16:33:22] yey! [16:33:33] (03CR) 10Ottomata: [V: 03+2 C: 03+2] Bump mediawiki-history-denormalize jar version [analytics/refinery] - 10https://gerrit.wikimedia.org/r/608665 (https://phabricator.wikimedia.org/T255548) (owner: 10Joal) [16:33:41] good thign too,as i was looking i noticed that there are already camus imports for TemplateWizard for dates in July! [16:33:44] I'm eager to be able to backfill wdqs events as well [16:33:52] ya [16:34:01] ok great [16:34:06] Have a good run :) [16:51:57] jenkins is just jealous 'cause of recent news [16:58:19] https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/50/console looks good [16:58:36] it is interesting that we pull from maven directly [16:59:14] for example [16:59:15] Downloading from central: http://repo1.maven.org/maven2/org/codehaus/plexus/plexus-utils/3.3.0/plexus-utils-3.3.0.jar [16:59:29] and some from wmf-mirrored [16:59:49] but in theory now we could set mirror-maven-central [17:00:13] to leverage jenkins -> archiva better net speed? 
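(Editorial sketch, not part of the channel log.) The console line quoted above shows the build fetching from `central` over plain `http://`. Maven Central has required HTTPS since January 2020, so it can help to see which repository ids a build still contacts over HTTP. The sketch below tallies them from saved console text. It assumes the `Downloading from <id>: <url>` line format shown in the log, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches console lines such as:
#   Downloading from central: http://repo1.maven.org/maven2/...
DOWNLOAD_LINE = re.compile(r"Downloading from (?P<repo>[^:\s]+): (?P<url>\S+)")


def count_plain_http_downloads(console_text):
    """Return {repository id: number of downloads made over plain http://}."""
    counts = Counter()
    for match in DOWNLOAD_LINE.finditer(console_text):
        if match.group("url").startswith("http://"):
            counts[match.group("repo")] += 1
    return dict(counts)
```

Any repository id that shows up in the result is a candidate for switching to an HTTPS URL or to a mirror.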
[17:00:44] just a nit, it will not change much probably [17:01:07] Project analytics-refinery-maven-release-docker build #50: FAILURE in 30 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/50/ [17:01:12] RECOVERY - Check the last execution of eventlogging_to_druid_navigationtiming_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_navigationtiming_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [17:01:42] buuuuu [17:02:11] this is interesting [17:02:12] Failed to transfer file: http://repo1.maven.org/maven2/org/apache/maven/plugins/maven-source-plugin/maven-metadata.xml. Return code is: 501 , ReasonPhrase:HTTPS Required. [17:03:02] RECOVERY - Check the last execution of refine_sanitize_eventlogging_analytics_immediate on an-launcher1002 is OK: OK: Status of the systemd unit refine_sanitize_eventlogging_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [17:03:17] https://repo1.maven.org/maven2/org/apache/maven/plugins/maven-source-plugin/maven-metadata.xml works [17:04:42] RECOVERY - Check the last execution of eventlogging_to_druid_netflow_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_netflow_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [17:08:04] (PS1) Elukey: Pull artifacts and plugin from Archiva's central repository [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 [17:08:11] (CR) jerkins-bot: [V: -1] Pull artifacts and plugin from Archiva's central repository [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [17:08:45] (PS2) Elukey: Pull artifacts and plugin from Archiva's central repository [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 [17:08:52] (CR) jerkins-bot: [V: -1] Pull artifacts and plugin from Archiva's central
repository [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [17:09:24] RECOVERY - Check the last execution of eventlogging_to_druid_editattemptstep_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_editattemptstep_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [17:09:30] RECOVERY - Check the last execution of eventlogging_to_druid_prefupdate_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_prefupdate_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers [17:10:04] anyway, joal/ottomata I created the above patch, will be back in ~30 mins :) [17:13:06] it also says "Build timed out (after 30 minutes). Marking the build as failed." that is weird, never seen it [17:13:09] anyway [17:33:01] Here I am [17:33:13] Let's talk about the patch when you're back Luca [17:34:29] ok just checked refinery-source repo, it looks clean [17:34:36] We can test anew [17:44:44] here I am [17:45:03] Good evening elukey - I'm super sorry to bother you :( [17:45:15] nono no problem, sorry that this thing is failing [17:45:22] more than happy to help [17:45:41] I know that, and I also know you need and deserve holidays [17:45:43] anyway [17:45:57] totally recovered from the blood donation don't worry :) [17:46:06] I drank a ton of water today, feel a lot better [17:46:14] I've been reading your patch, and was confirming my understanding of the section in the pom [17:46:59] elukey: I think you deserve and need holidays whether you give blood or not - You have worked really hard in the past years - resting is important :) [17:47:11] <3 [17:47:14] <3 [17:47:18] speaking of fun things [17:47:43] yes?
[17:48:01] IIUC the pom instructs jenkins to pull from maven central, and in some cases it seems to fail due to https:// vs http://, and eventually it times out (30 mins) [17:48:11] now I am not sure if the above build was particularly unlucky [17:48:17] elukey: agreed, for plugins [17:48:49] I was reading https://maven.apache.org/pom.html#Repositories - And the enabled=false for the main maven repo means don't use it [17:49:05] Actually, don't use it for either snapshots or releases [17:49:27] BUT - we have it set to enabled=true for repository-plugins [17:49:37] ah interesting! [17:50:33] So, changing the repo URL for archiva is not correct - We want to disable the actual central, while having the mirror enabled (see below, wmf-mirrored) [17:50:46] * ottomata is back [17:51:00] Hi ottomata - We're having fun with maven :) [17:52:11] elukey: I think the change we need is to make central disabled for plugins [17:52:25] sure makes sense, going to update the patch [17:52:44] Thanks mate [17:55:51] (PS3) Elukey: Pull plugins from Archiva instead of Maven Central [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 [17:56:04] (CR) jerkins-bot: [V: -1] Pull plugins from Archiva instead of Maven Central [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [17:56:25] elukey: we've experienced that with ottomata as well - seems unrelated to the quality of the patch [17:56:32] elukey: (the -1 from jenkins) [17:57:10] yes yes today jenkins hates us [17:58:26] yup [17:58:50] I'm gonna have to get an appointment with him and a releng moderator, or it'll be nasty [17:59:18] elukey: currently trying your patch locally for me - will also test it on stat1004 [18:02:59] elukey: your patch built successfully for me locally [18:03:17] elukey: it also worked on stat1004 [18:03:29] BUT - I think I have understood why jenkins hates us [18:03:44] I have no clue how - but the jar built is v0.0.127 !!!
[18:03:49] instead of 0.0.129 [18:04:27] joal: does the output still show pulling from central or only wmf-mirrored? [18:04:30] (I can test as well) [18:04:42] elukey: so many lines I can't tell :( [18:05:25] ok - I don't know why but it seems gerrit is slow - there is actually cleanup to be done [18:05:28] doing now [18:06:05] Downloaded from wmf-mirrored: https://archiva.wikimedia.org/repository/mirrored/org/scalatest/scalatest-maven-plugin/1.0/scalatest-maven-plugin-1.0.jar (24 kB at 13 kB/s) [18:06:14] if this is part of plugins, we are good [18:06:15] \o/ [18:06:18] seems so [18:06:37] about gerrit - I see in here https://gerrit.wikimedia.org/r/admin/repos/analytics/refinery/source [18:06:42] that we have merge if necessary [18:07:22] not sure if moving to rebase if necessary would be better [18:08:05] joal: are https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/337900 and https://gerrit.wikimedia.org/r/c/analytics/refinery/source/+/294055 something that I can try to abandon? [18:08:20] they show up as merge conflict, I am curious to know if it is related [18:08:27] (for some weird reason) [18:08:43] elukey: the second one is ready for that yes :) [18:08:49] elukey: the one about flink [18:09:27] (Abandoned) Elukey: [WIP] Test Flink [analytics/refinery/source] - https://gerrit.wikimedia.org/r/294055 (owner: Joal) [18:10:31] joal: I'll abandon 337900 temporarily, we can re-open it if necessary (just checking my theory) - is it ok?
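The question at 18:04 (still pulling from central, or only wmf-mirrored?) is hard to answer by eye in a long console log. A minimal Python sketch, assuming the log contains the Maven transfer lines quoted in this channel ("Downloading from ID: URL" and "Downloaded from ID: URL"), could tally distinct artifacts per repository id and flag plain http:// URLs like the one that returned 501 earlier. The name summarize_downloads is hypothetical and not part of any existing tooling.

```python
import re
from collections import Counter

# Matches Maven transfer lines such as
#   "Downloaded from wmf-mirrored: https://archiva.wikimedia.org/... (24 kB at 13 kB/s)"
TRANSFER = re.compile(r"Download(?:ing|ed) from ([\w.-]+): (\S+)")


def summarize_downloads(log_text):
    """Return (artifact count per repository id, list of plain-http URLs)."""
    seen = set()          # (repo_id, url) pairs, so Downloading/Downloaded count once
    counts = Counter()
    insecure = []
    for repo_id, url in TRANSFER.findall(log_text):
        if (repo_id, url) in seen:
            continue
        seen.add((repo_id, url))
        counts[repo_id] += 1
        if url.startswith("http://"):
            insecure.append(url)
    return counts, insecure
```

Fed the text of the jenkins console page, the counts would show at a glance whether anything is still fetched from central, and the second list would point at requests that can hit the "HTTPS Required" error.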
[18:11:04] sure [18:11:36] (Abandoned) Elukey: [WIP] Add job computing citations diffs over text [analytics/refinery/source] - https://gerrit.wikimedia.org/r/337900 (https://phabricator.wikimedia.org/T158896) (owner: Joal) [18:11:50] (PS1) Joal: Revert "[maven-release-plugin] prepare release v0.0.129" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608700 [18:11:52] (PS4) Elukey: Pull plugins from Archiva instead of Maven Central [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 [18:11:57] (CR) jerkins-bot: [V: -1] Revert "[maven-release-plugin] prepare release v0.0.129" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608700 (owner: Joal) [18:11:59] (CR) jerkins-bot: [V: -1] Pull plugins from Archiva instead of Maven Central [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [18:12:50] nope :) [18:15:17] sorry haven't been following backscroll lemme know if i can help [18:17:53] ottomata: I am wondering - maybe the -1s are related to "merge if necessary" in https://gerrit.wikimedia.org/r/admin/repos/analytics/refinery/source ? [18:18:07] for some reason the new gerrit doesn't like source + that setting [18:18:16] ? [18:18:25] maybe elukey? [18:18:32] I am wondering if "rebase if necessary" could help [18:18:33] i can only reason about what those settings mean [18:18:34] elukey, ottomata - asking permission to force merge the revert of v0.0.129 prep [18:18:34] don't really know [18:18:39] if so, we should file a bug [18:18:51] joal yes please force anything you need [18:19:06] can I try a sec rebase if necessary?
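The pom change discussed from 17:48 to 17:52 (central disabled for plugins, wmf-mirrored left enabled) can be checked mechanically. A rough Python sketch, assuming a pom that uses the standard Maven 4.0.0 XML namespace: SAMPLE_POM is a made-up fragment for illustration, not the actual refinery-source pom, and enabled_plugin_repos is a hypothetical helper. It relies on Maven treating a repository's releases as enabled when no explicit enabled element is given.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical fragment: central explicitly disabled, mirror left at defaults.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <pluginRepositories>
    <pluginRepository>
      <id>central</id>
      <url>https://repo1.maven.org/maven2</url>
      <releases><enabled>false</enabled></releases>
      <snapshots><enabled>false</enabled></snapshots>
    </pluginRepository>
    <pluginRepository>
      <id>wmf-mirrored</id>
      <url>https://archiva.wikimedia.org/repository/mirrored/</url>
    </pluginRepository>
  </pluginRepositories>
</project>
"""


def enabled_plugin_repos(pom_xml):
    """Return ids of plugin repositories whose releases are enabled."""
    root = ET.fromstring(pom_xml)
    enabled = []
    for repo in root.findall("m:pluginRepositories/m:pluginRepository", NS):
        repo_id = repo.findtext("m:id", default="", namespaces=NS)
        # A missing <enabled> element counts as "true" (Maven's default).
        flag = repo.findtext("m:releases/m:enabled", default="true", namespaces=NS)
        if flag.strip().lower() == "true":
            enabled.append(repo_id)
    return enabled
```

With the sample fragment only wmf-mirrored comes back, which is the state the patch is aiming for.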
[18:19:18] sure [18:19:22] elukey: before I merge I guess [18:19:55] (CR) Elukey: "recheck" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [18:20:02] (CR) jerkins-bot: [V: -1] Pull plugins from Archiva instead of Maven Central [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [18:20:05] ahahha nope [18:20:29] restored prev config, joal go ahead [18:21:24] Who should I be talking to about reportupdater again? :D [18:23:06] ok elukey - merging the patches (mine and then yours) [18:23:12] ack! [18:23:20] (CR) Joal: [V: +2 C: +2] "Force merging" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608700 (owner: Joal) [18:23:28] (CR) jerkins-bot: [V: -1] Revert "[maven-release-plugin] prepare release v0.0.129" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608700 (owner: Joal) [18:24:37] elukey: I can't do it! [18:24:49] elukey: I can't remove the jenkins-bot -1 [18:25:17] ahah! [18:25:19] managed to do it [18:25:26] tricked it [18:25:54] (CR) Joal: [V: +2 C: +2] "Force merging" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608692 (owner: Elukey) [18:27:10] elukey: ok for you if I try a new jenkins job? [18:27:37] +! [18:27:38] elukey: repo is cleaned up from previous job, and your patch is merged [18:27:38] 1 [18:27:39] :D [18:27:41] ack! [18:28:07] !log trying to release refinery-source to archiva from jenkins (second time) [18:28:09] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [18:28:24] Starting build #51 for job analytics-refinery-maven-release-docker [18:28:39] here we are [18:31:01] addshore: I'm in some training now but what's up? Marcel's on vacation for a couple weeks so you're stuck with me helping you :P [18:31:58] :D [18:32:47] I have a query that I want to run for 1 wiki, taking data from hadoop and spitting data into graphite.
So far I looked at the report updater docs and it all looks a bit complicated D: It was suggested that I look into it a year or so ago, and I'm finally there :P [18:33:39] addshore: just fyi, SRE is trying to deprecate graphite [18:33:47] so you might not want to add anything new there [18:34:17] that's true, well, it's not new, just been broken for a while, but the old data is still there. I'm also fine with making it send data to somewhere else that can be visualized in grafana :D [18:36:52] unless that isn't possible in report updater yet :PP [18:39:17] elukey: I don't know why, but it feels the new archiva is a lot slower than the previous one [18:39:32] elukey: jenkins job is taking quite a bit longer [18:40:46] joal: cache should be warmed up, is it longer for jenkins jobs or in general? [18:41:39] elukey: I think it's for when you need to download - in regular mode, everything is in local cache, so almost no download happens - except for jenkins [18:42:15] elukey: the previous failed jenkins job took ~30 minutes, vs 10 minutes before [18:44:21] so slowness makes jenkins hit the 30 mins timeout [18:44:32] could be yes [18:44:56] elukey: current job has been running for ~20 minutes [18:45:11] started at 6:28pm UTC [18:45:43] maybe the repo-group thing is really flexible but slower than mirrored [18:45:58] possible - I don't know :( [18:46:13] maybe explicit mirror can help? [18:46:31] well we tried to move away from that model [18:46:37] right [18:49:19] the download speed shown by jenkins is really terrible [18:50:30] elukey: maybe there is no cache? [18:53:05] joal: I was thinking about it as well but we pulled artifacts doing our builds from stat1004 etc..
[18:55:35] Project analytics-refinery-maven-release-docker build #51: STILL FAILING in 27 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/51/ [18:56:21] Failed to transfer file: https://archiva.wikimedia.org/repository/releases/org/wikimedia/analytics/refinery/refinery/0.0.129/refinery-0.0.129.pom. Return code is: 409, ReasonPhrase: Overwriting released artifacts is not allowed.. -> [Help 1] [18:57:08] but it would have hit the timeout anyway I think [18:57:20] WAT? [18:58:13] elukey: some jars had been released !!!!! [18:58:36] MEH [18:58:56] elukey: deleting them manually now :( [19:00:08] (PS3) Ottomata: Overloaded methods to make working with default Refine related classes easier [analytics/refinery/source] - https://gerrit.wikimedia.org/r/607788 [19:00:23] elukey: how do you want us to proceed? [19:00:37] elukey: I'm gonna clean up the repo anew [19:01:06] joal: it seems that the network slowness needs some investigation, it is kinda late so I'd start it tomorrow morning if not super urgent [19:01:34] if ottomata has ideas please go forward, but it looks like something that might take a bit to track down :( [19:02:35] sheesh [19:02:46] nasty [19:02:57] still only half following, so the issue is archiva is now too slow for build? [19:02:57] makes sense elukey - deploy is important this week as there are 2 things that need to happen fast (new pageview def, backfilling wdqs events) [19:03:17] i can do that backfill without a deploy if we have to [19:03:23] as a separate manual refine job [19:03:32] all i need to do is refine them without the hostname filter [19:03:37] joal: how about I go ahead and do that [19:03:48] and we chill on deploy til tomorrow? [19:04:04] joal, ottomata - ok if I log off then?
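The 409 at 18:56 means some 0.0.129 artifacts were already in the releases repository before build #51 tried to upload them. One possible pre-flight check, sketched in Python under the assumption that Archiva serves the standard Maven repository layout (as the failing URL suggests): build the POM URL for a version and send a HEAD request before starting the release job. The helper names pom_url and already_released are hypothetical, and the groupId and artifactId used in the example call are inferred from the failing URL rather than taken from the pom.

```python
import urllib.error
import urllib.request


def pom_url(repo_base, group_id, artifact_id, version):
    """Build a POM URL following the standard Maven repository layout."""
    group_path = group_id.replace(".", "/")
    return (f"{repo_base.rstrip('/')}/{group_path}/{artifact_id}/"
            f"{version}/{artifact_id}-{version}.pom")


def already_released(url, timeout=10):
    """Return True if a HEAD request finds the artifact, False on 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other errors (403, 5xx) need a human to look at them


# Example call; groupId/artifactId inferred from the URL in the 409 message.
RELEASE_POM = pom_url(
    "https://archiva.wikimedia.org/repository/releases",
    "org.wikimedia.analytics.refinery", "refinery", "0.0.129",
)
```

Calling already_released(RELEASE_POM) before a retry would tell whether leftovers from a failed run still need to be deleted by hand. A multi-module release would need the same check for each module.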
[19:04:43] works for me ottomata - product people are eager for pageview-def update, but I don't want to make a special mess for that :( [19:04:55] tomorrow if you didn't find the culprit I'll start the investigation in my morning, I have some ideas in mind but it will require a fresh brain + coffee :D [19:05:32] also - we should not rush, people can wait one day :) [19:05:44] (in general I mean) [19:06:04] sounds good elukey [19:06:09] thanks for helping [19:06:18] i hope you don't pass out after all this work and giving blood [19:06:19] i would [19:06:22] sorry for this, hopefully it is not another Luca's PEBCAK [19:06:27] good - finishing the cleaning then resting as well [19:06:33] thanks again elukey and ottomata [19:06:35] o/ [19:07:03] (PS1) Joal: Revert "[maven-release-plugin] prepare release v0.0.129" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608712 [19:07:10] (CR) jerkins-bot: [V: -1] Revert "[maven-release-plugin] prepare release v0.0.129" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608712 (owner: Joal) [19:07:54] (CR) Joal: [V: +2 C: +2] "Merging for clean state before release retry" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608712 (owner: Joal) [19:09:33] hm joal i guess we'll have to backfill a second time after we deploy [19:09:40] for stuff refined more recently [19:09:45] i'll just backfill up to june 1 for now [19:09:52] ottomata: hm - I don't get it [19:09:59] the job is currently refining with the filter [19:10:23] ah yes - we need to backfill up to next deploy date [19:11:02] ya i'll just grab til june 1 for now to make sure we don't lose old stuff [19:11:12] perfect - thanks a lot for that! [19:13:18] Analytics, Event-Platform: Backfill wdqs_external_sparql_query without filtering on meta.domain - https://phabricator.wikimedia.org/T256797 (Ottomata) [19:16:52] Analytics, Event-Platform: Backfill wdqs_external_sparql_query without filtering on meta.domain -
https://phabricator.wikimedia.org/T256797 (Ottomata) ` sudo -u analytics kerberos-run-command analytics /usr/bin/spark2-submit \ --name refine_event_wdqs_external_sparql_query_backfill_otto0 \ --class org.wi... [19:17:12] Analytics, Analytics-Kanban, Event-Platform: Backfill wdqs_external_sparql_query without filtering on meta.domain - https://phabricator.wikimedia.org/T256797 (Ottomata) a:Ottomata [19:18:32] Analytics, CirrusSearch, Cognate, Discovery-Search, and 18 others: Replace TitleMoveComplet(e|ing) hooks - https://phabricator.wikimedia.org/T250023 (DannyS712) [19:20:31] (CR) Ottomata: Overloaded methods to make working with default Refine related classes easier (4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/607788 (owner: Ottomata) [19:30:12] Analytics-Radar, Product-Analytics: prefUpdate schema contains multiple identical events for the same preference update - https://phabricator.wikimedia.org/T218835 (MNeisler) In T249386, I used prefupdate data to determine the opt-in and opt-out rate of the `discussiontools-betaenable` property. While re... [19:32:15] gone for tonight team - tomorrow kids day. will show every now and then to see if I can help with archiva [19:32:28] nite yo [19:38:17] Analytics, Product-Analytics: Clarify the data retention extension process - https://phabricator.wikimedia.org/T256776 (LGoto) p:Triage→Medium [19:45:34] Analytics, Product-Analytics: Re-process webrequests from 2020-05-18 so that page views from latest Wikipedia app releases are counted - https://phabricator.wikimedia.org/T256516 (mpopov) need feedback from @SNowick_WMF's conversations with apps PMs [19:46:45] Analytics, Product-Analytics, Patch-For-Review: New app pageview definition needs to be deployed - https://phabricator.wikimedia.org/T256515 (kzimmerman) p:Triage→Unbreak!
[19:48:16] Analytics, Product-Analytics, Patch-For-Review: New app pageview definition needs to be deployed - https://phabricator.wikimedia.org/T256515 (mpopov) a:JAllemandou [19:48:31] Analytics, Product-Analytics, Patch-For-Review: New app pageview definition needs to be deployed - https://phabricator.wikimedia.org/T256515 (mpopov) [19:48:33] Analytics, Analytics-Kanban, Patch-For-Review, Product-Analytics (Kanban): PageviewDefinition should detect /api/rest_v1/page/mobile-html requests as pageviews - https://phabricator.wikimedia.org/T256514 (mpopov) Open→Resolved [19:48:35] Analytics, Product-Analytics: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html - https://phabricator.wikimedia.org/T256508 (mpopov) [19:49:16] Analytics, Product-Analytics, Epic: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html - https://phabricator.wikimedia.org/T256508 (kzimmerman) [20:04:04] Analytics, Product-Analytics, Epic: Identify next steps for dealing with missing mobile app pageview counts - https://phabricator.wikimedia.org/T256804 (kzimmerman) [20:05:54] Analytics, Product-Analytics (Kanban): Identify next steps for dealing with missing mobile app pageview counts - https://phabricator.wikimedia.org/T256804 (kzimmerman) p:Triage→High [20:23:34] Analytics, Product-Analytics, Epic: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html - https://phabricator.wikimedia.org/T256508 (Nuria) >so the good news is that webrequests (the source data from which pageview counts we use everywhere are derived) is retained for 90 d...
[20:25:17] Analytics, Product-Analytics, Epic: API pageview counts for 'Mobile app' are incorrect since switch to mobile-html - https://phabricator.wikimedia.org/T256508 (Nuria) The way to avoid this happening in the future is to move to event-based pageviews (MEP can help) for us, which puts 100% of the cont... [20:27:15] https://www.irccloud.com/pastebin/dqR2ocqC/ [20:27:31] milimetric: so I want to run queries like that ^^ and put the data somewhere specific in graphite [20:27:42] for each snapshot that is generated [20:29:06] Analytics, Analytics-Kanban, Product-Analytics: Data missing in event_prefupdate in Druid - https://phabricator.wikimedia.org/T256178 (Milimetric) Backfilled the data from Hive, looks good to me now. Thanks for doing all the work @nettrom_WMF! :) [20:37:35] addshore: the tricky thing is that grafana's datasources (that we currently use) can't accept historical data [20:37:40] graphite does [20:37:45] but prometheus does not [20:37:53] there is an unmaintained druid datasource for grafana that would work [20:38:00] but until someone fixes it up we can't use it [20:38:35] could you make a dashboard in superset instead? [20:39:34] Now that's a valid question! I imagine the answer is yes? all of the data could just live in another hadoop table after all.
the fact said dashboard wouldn't be public could be annoying [20:39:45] we have a whole bunch of this sort of data that we currently put in graphite [20:43:14] (PS1) Ottomata: Remove unused custom avro camus classes [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608725 [20:43:21] (CR) jerkins-bot: [V: -1] Remove unused custom avro camus classes [analytics/refinery/source] - https://gerrit.wikimedia.org/r/608725 (owner: Ottomata) [20:57:03] Analytics-EventLogging, Analytics-Radar, QuickSurveys, MW-1.35-notes (1.35.0-wmf.3; 2019-10-22), and 2 others: QuickSurveys EventLogging missing ~10% of interactions - https://phabricator.wikimedia.org/T220627 (Nuria) >but in this case we're comparing two eventlogging streams so sendBeacon is goi... [21:23:47] Analytics, Analytics-Kanban, Product-Analytics: Data missing in event_prefupdate in Druid - https://phabricator.wikimedia.org/T256178 (nettrom_WMF) Verified in Superset that data is now available, looks good to me. Thanks @Milimetric! [21:23:52] Analytics, Analytics-Kanban, Product-Analytics: Data missing in event_prefupdate in Druid - https://phabricator.wikimedia.org/T256178 (nettrom_WMF) Open→Resolved [21:24:37] Analytics, Product-Analytics (Kanban): Collect metrics/tables which might be touched by IP masking feature - https://phabricator.wikimedia.org/T255816 (kzimmerman) @Nuria IP masking changes won't be rolling out until Q2 at the earliest, but I wanted to make sure this is on your team's radar. [21:27:50] ottomata: I'm seeing if superset would work for us :) https://phabricator.wikimedia.org/T154601#6269648 Then this might be a bit easier!
[22:38:25] (CR) Nuria: Update whitelisting of Growth Team's schemas (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/607615 (https://phabricator.wikimedia.org/T255501) (owner: Nettrom) [22:39:41] (CR) Nuria: [C: +2] "Merging per Marcel's CR" [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/606734 (https://phabricator.wikimedia.org/T247417) (owner: Nuria)