[02:02:37] 10Analytics: Superset error: `Hive table 'wmf.webrequest' is corrupt. The number of files in the directory (64) does not match the declared bucket count (256) for partition` - https://phabricator.wikimedia.org/T273693 (10JAllemandou) Hi @jrbs, The problem comes from a change we made to the table on 2020-11-16 (s... [02:32:24] 10Analytics, 10observability: Modify Kafka max replica lag alert to only alert if increasing - https://phabricator.wikimedia.org/T273702 (10Ottomata) [02:41:15] 10Analytics, 10observability: Modify Kafka max replica lag alert to only alert if increasing - https://phabricator.wikimedia.org/T273702 (10Ottomata) I think [[ https://grafana-rw.wikimedia.org/explore?orgId=1&left=%5B%22now-24h%22,%22now%22,%22eqiad%20prometheus%2Fops%22,%7B%22expr%22:%22deriv(kafka_server_Re... [02:52:18] 10Analytics, 10observability: Modify Kafka max replica lag alert to only alert if increasing - https://phabricator.wikimedia.org/T273702 (10Ottomata) I added that query as [[ https://grafana-rw.wikimedia.org/d/000000027/kafka?orgId=1&from=1612309903159&to=1612320703159&var-datasource=eqiad%20prometheus%2Fops&v... [03:12:41] (03PS1) 10Eric Gardner: Update schema to handle quickview copy events [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/661273 (https://phabricator.wikimedia.org/T263663) [07:20:50] good morning [07:25:30] still a million pending blocks to replicate for an-worker1117 [07:39:28] I added some info to the refinery source doc page [07:39:29] https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Deploy/Refinery-source#How_to_deploy_with_Jenkins_%28and_related_steps%29 [07:40:00] specifically a reference of the SNAPSHOT version and how to deal with failed builds etc.. 
[07:40:05] let me know if it makes sense :) [07:41:36] in this case, since we are at 0.1.2 but we wanted 0.1.0 [07:41:58] - git push --delete origin v0.1.2 [07:42:07] - git push --delete origin v0.1.1 [07:42:15] git push --delete origin v0.1.0 [07:42:20] then [07:43:08] find -name pom.xml -exec sed -e 's/0.1.3-SNAPSHOT/0.1.0-SNAPSHOT/' -i {} \; [07:43:33] add the info to the changelog, send review and merge [07:43:35] build [07:43:42] does it make sense? [07:45:54] 10Analytics, 10SRE, 10Traffic: Downloading from Archiva.wikimedia.org seems slower than Maven Central - https://phabricator.wikimedia.org/T273086 (10elukey) To keep archives happy - I had to revert the patch since some maven build jobs issue HTTP PUT to the /repository path, meanwhile my assumption was that... [08:15:44] 10Analytics: Upgrade the Analytics Hadoop cluster to Apache Bigtop - https://phabricator.wikimedia.org/T273711 (10elukey) [08:23:51] 10Analytics: Check home/HDFS data of Bernd Sitzmann - https://phabricator.wikimedia.org/T273712 (10MoritzMuehlenhoff) [08:25:01] Hadoop upgrade scheduled :) [08:29:34] Hi elukey [08:30:43] bonjour Joseph [08:32:46] elukey: Thanks for scheduling the upgrade :) I think the duration you mentioned is ver optimistic, particularly given that last round of data-copy check will take a few hours - but eh :) [08:33:19] joal: IIRC it was not few hours, we talked about 1/2 [08:33:36] plus 2 for the upgrade, it seemed good enough [08:33:38] elukey: About refinery-procedure, you're missing removing 0.1.3 artifact from archiva :) [08:34:26] also in theory when we'll copy data the cluster will not be down [08:34:39] elukey: last round of copy/check will not be 1/2h - For instance going over /user is 2h :S [08:34:39] we'll discourage people to make changes etc.. 
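Editor's sketch of the version-revert steps quoted above (delete the stray tags, then point the pom.xml files back at 0.1.0-SNAPSHOT). This is only an illustration, not the team's tooling: `revert_snapshot_version` and `tag_delete_commands` are hypothetical helper names, the pom fragment is made up, and the functions only build strings rather than touching a repository. Note one difference from the `sed` command in the log: the sketch matches the version literally and replaces every occurrence, whereas the unescaped dots in the `sed` pattern match any character and `sed` replaces only the first match per line.

```python
def revert_snapshot_version(pom_text, wrong, wanted):
    """Return pom_text with '<wrong>-SNAPSHOT' replaced by '<wanted>-SNAPSHOT'.

    Matching is literal and applies to every occurrence in the text.
    """
    return pom_text.replace(f"{wrong}-SNAPSHOT", f"{wanted}-SNAPSHOT")


def tag_delete_commands(tags, remote="origin"):
    """Build the 'git push --delete' command line for each tag (strings only)."""
    return [f"git push --delete {remote} {tag}" for tag in tags]


# Made-up pom.xml fragment, for illustration only.
sample_pom = "<version>0.1.3-SNAPSHOT</version>"
print(revert_snapshot_version(sample_pom, "0.1.3", "0.1.0"))

# The three tags mentioned in the log.
for command in tag_delete_commands(["v0.1.2", "v0.1.1", "v0.1.0"]):
    print(command)
```

Building the commands as strings keeps the destructive step (deleting remote tags) under manual review before anything is run.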
[08:35:03] I think it is fine to go over /user the day before [08:35:30] elukey: I think we should make HDFS read-only for the last round of copy, to prevent unexpected discrepancies and keep jobs from failing [08:36:08] elukey: distcp is not resilient to file changes in the middle of a copy [08:36:33] joal: but you are using it while jobs are running no? [08:36:54] I'd like to run more tests on the folders to copy, because /wmf/data, for instance, will also take very long (even if a lot of data is not copied) [08:36:59] I am trying to find a good balance between "we are paranoid" vs "we are too paranoid :D" [08:37:23] elukey: I do run the job while jobs are running, and that's why sometimes I have failures [08:37:45] if you have a list of commands to run, I can wake up a little earlier and run them [08:38:10] (after draining the cluster) [08:38:33] elukey: I'll happily wake up early and run them early on - My point being that it'd be great if we could stall the cluster early (stop camus, drain, then read-only) [08:38:33] Also if we are in read-only things like Superset should still work, which is good for most people [08:38:44] true elukey [08:38:45] joal: yep yep [08:39:08] sorry for the misunderstanding on time taken for copy :( [08:39:45] joal: so we can do it this way - I wake up a little before you, say an hour or more, and start the cluster drain. So by the time you join, we should be in a relatively good state [08:39:57] then we can set safe mode, and start the copy [08:40:11] distcp doesn't need any tmp file etc.. right? [08:40:23] elukey: That's great about starting early, you tell me at what time we start and I wake up to be with you :) [08:40:49] joal: nah I'll likely go back to bed after stopping camus etc.. 
:D [08:41:05] early == 6/7 AM :D [08:41:05] elukey: distcp uses tmp, but it does so on the cluster running the job, namely the backup one [08:41:13] perfect [08:41:14] perfect elukey :) [08:41:27] thanks for bearing with me :) [08:41:28] I'll come up with a precise schedule in the task [08:41:30] <3 [08:41:54] let's keep 4h of downtime so people don't freak out :D [08:42:01] elukey: huhuhu :) [08:43:15] elukey: have you noticed my comment on archiva? the procedure to make version back to 0.1.0 seems good, adding removing artifacts from archiva [08:43:22] about archiva (if you have time) - is it easy to drop releases? I guess it is what I did the last time when I caused the big mess with refinery right? [08:43:29] ahahah yes I was about to ask :) [08:43:31] :) [08:44:41] so I have to go through [08:44:43] https://archiva.wikimedia.org/#artifact~releases/org.wikimedia.analytics.refinery.camus/refinery-camus [08:44:46] etc.. [08:44:48] and drop 0.1.2 [08:45:35] correct elukey - I have no better way than clicking along :( [08:46:23] elukey: asking permission to start a gentle disctp (same as yesterday that didn't raise alerts) and letting it run under your control [08:47:59] I am asking to Arzhel since there was a network alarm, but I think it should be nothing (better to wait for the green light though) [08:48:08] all right green light joal [08:48:08] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-01-20): Add edit count bucketing to all metrics - https://phabricator.wikimedia.org/T269986 (10awight) >>! In T269986#6797385, @Ottomata wrote: > I manually altered... 
[08:48:46] !log drop refinery source artifacts v0.1.2 from Archiva [08:48:50] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:51:34] ack elukey - starting job and gone [08:51:36] joal: the ones in https://archiva.wikimedia.org/#browse~releases/org.wikimedia.analytics.refinery are enough right ? [08:51:39] super [08:51:50] I think so elukey - [08:51:53] <3 [08:51:58] Thanks a lot :) [08:52:56] I have updated the docs so another instance of Luca the n00b will know what to do ;D [08:54:01] !log drop v0.1.x tags from Refinery source upstream repo [08:54:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [08:56:12] (03PS1) 10Elukey: Revert next iteration to 0.1.0-SNAPSHOT in pom.xml files [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/661337 [08:56:44] joal: when you have time --^ [08:57:43] not sure if I have to mention anything to the changelog [09:32:54] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Compensate for sampling - https://phabricator.wikimedia.org/T273454 (10Lena_WMDE) [09:34:55] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for TemplateWizard, segment all metrics - https://phabricator.wikimedia.org/T273475 (10Lena_WMDE) [09:35:17] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for CodeMirror - https://phabricator.wikimedia.org/T273471 (10Lena_WMDE) [10:02:49] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for CodeMirror - https://phabricator.wikimedia.org/T273471 (10lilients_WMDE) [10:04:52] 10Analytics-Radar, 
10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for CodeMirror - https://phabricator.wikimedia.org/T273471 (10lilients_WMDE) [10:05:22] (03CR) 10Elukey: [C: 03+2] Revert next iteration to 0.1.0-SNAPSHOT in pom.xml files [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/661337 (owner: 10Elukey) [10:08:19] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for TemplateWizard, segment all metrics - https://phabricator.wikimedia.org/T273475 (10Andrew-WMDE) [10:08:53] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for TemplateWizard, segment all metrics - https://phabricator.wikimedia.org/T273475 (10Andrew-WMDE) [10:11:38] (03Merged) 10jenkins-bot: Revert next iteration to 0.1.0-SNAPSHOT in pom.xml files [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/661337 (owner: 10Elukey) [10:12:58] all right let's try another build [10:16:38] Starting build #73 for job analytics-refinery-maven-release-docker [10:19:05] 10Analytics, 10observability: Modify Kafka max replica lag alert to only alert if increasing - https://phabricator.wikimedia.org/T273702 (10fgiunchedi) >>! In T273702#6798669, @Ottomata wrote: > I think [[ https://grafana-rw.wikimedia.org/explore?orgId=1&left=%5B%22now-24h%22,%22now%22,%22eqiad%20prometheus%2F... 
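Editor's sketch of the idea behind T273702 ("only alert if increasing"). The actual Prometheus expression is truncated in the log, so nothing here reproduces it. The sketch assumes the alert should fire only when the latest replica lag is above a threshold and the trend over the window is positive; the trend is a least-squares slope, which is the same kind of estimate Prometheus `deriv()` produces. The function names, the threshold and the sample values are all hypothetical.

```python
def linear_slope(samples):
    """Least-squares slope of (t_seconds, value) samples, in value per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one timestamp")
    return numerator / denominator


def should_alert(samples, lag_threshold):
    """Alert only if the latest lag exceeds the threshold AND lag is trending up."""
    latest_lag = samples[-1][1]
    return latest_lag > lag_threshold and linear_slope(samples) > 0


# Hypothetical lag samples: (seconds since window start, max replica lag).
recovering = [(0, 900), (60, 600), (120, 300)]  # high but shrinking
growing = [(0, 100), (60, 200), (120, 300)]     # high and still rising
print(should_alert(recovering, lag_threshold=250))
print(should_alert(growing, lag_threshold=250))
```

A broker that is catching up after a restart has high but shrinking lag, so the slope condition suppresses the alert in that case while still firing when lag keeps growing.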
[10:22:02] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for TemplateWizard, segment all metrics - https://phabricator.wikimedia.org/T273475 (10Andrew-WMDE) [10:26:14] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10WMDE-TechWish (Sprint-2021-01-20): Adjust edit count bucketing for VisualEditor in Grafana - https://phabricator.wikimedia.org/T273728 (10lilients_WMDE) [10:26:25] Project analytics-refinery-maven-release-docker build #73: 09SUCCESS in 9 min 49 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/73/ [10:27:27] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for TemplateWizard, segment all metrics - https://phabricator.wikimedia.org/T273475 (10Andrew-WMDE) [10:28:02] ah! nice! [10:28:25] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10WMDE-TechWish (Sprint-2021-02-03): Adjust edit count bucketing for VisualEditor in Grafana - https://phabricator.wikimedia.org/T273728 (10lilients_WMDE) [10:28:58] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10WMDE-TechWish (Sprint-2021-01-20): Adjust edit count bucketing for VisualEditor, segment all metrics - https://phabricator.wikimedia.org/T273474 (10lilients_WMDE) [10:31:17] 10Analytics-Radar, 10WMDE-Templates-FocusArea: Adjust edit count bucketing for VisualEditor in Grafana - https://phabricator.wikimedia.org/T273728 (10lilients_WMDE) [10:33:08] Starting build #37 for job analytics-refinery-update-jars-docker [10:33:34] (03PS1) 10Maven-release-user: Add refinery-source jars for v0.1.0 to artifacts [analytics/refinery] - 10https://gerrit.wikimedia.org/r/661342 [10:33:35] Project analytics-refinery-update-jars-docker build #37: 09SUCCESS in 26 sec: 
https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/37/ [10:35:28] (03CR) 10Elukey: [V: 03+2 C: 03+2] "LGTM!" [analytics/refinery] - 10https://gerrit.wikimedia.org/r/661342 (owner: 10Maven-release-user) [10:36:00] !log released Refinery Source 0.1.0 [10:36:03] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [10:38:27] ok so I think I can deploy refinery [10:38:31] there are two changes [10:38:41] 1) that requires a webrequest_load restart [10:39:05] 2) it is related to indexing webrequest_sampled, so I guess no action needed since new indexations will pick it up [10:47:05] also, do we need to deploy refinery to an-coord1001? [10:47:28] because it may probably be taken out the list, to speed up the deployment [10:48:15] brb [10:56:25] the complete deployment took 11 mins, that I think it is faster right? (If so Antoine is the one to thank!) [10:57:37] !log deploy refinery to hdfs [10:57:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [11:40:55] * elukey lunch! [11:41:10] (will restart webrequest_load after lunch, so I can monitor it) [11:57:35] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: Uncaught TypeError: navigator.sendBeacon is not a function - https://phabricator.wikimedia.org/T273374 (10Amorymeltzer) @Milimetric Ach yeah, it'll be some ad blocker. Haven't figured out which yet, but sorry for hassle. I've also just found T86680... [11:59:35] 10Analytics, 10observability, 10User-fgiunchedi: Setup Analytics team in VO/splunk oncall - https://phabricator.wikimedia.org/T273064 (10fgiunchedi) @razzi I've invited you to victorops, you should have received an email. Please follow the instructions at https://wikitech.wikimedia.org/wiki/VictorOps#Set_up_... [12:40:22] I'm migrating some metrics in Graphite, adjusting paths and how we calculate. I'd like to understand my options, is there a wiki page on this topic? 
[12:40:59] For example, I assume it's possible to move some metrics beginning with `Mediawiki.` to become `MediaWiki.`, even if data already exists under the target path? [12:41:13] I assume it's possible to purge deprecated paths? [12:41:19] Thanks a lot for the email elukey :) [12:41:54] And with analytics help, I assume we can purge reportupdater outputs, to regenerate updated backfill? [12:42:03] Hi awight - I don't think you're at the right place to get info on graphite - Observability would be better :) [12:42:20] oho! [12:42:35] Thanks, TIL about the channel. [12:43:36] :) [12:56:39] I have looked at under-replicated blocks data, and the slope says we should be done in a bit less than 6 hours [13:42:29] going to check the alerts in a few! [13:44:55] np elukey - I quickly looked and I think I have pinpointed the thing [13:54:28] I saw a permission error again in the druid indexation, but didn't track down the issue yet (I am doing laundry :) [13:58:51] good laundry elukey :) Indeed both failures are perms issues :( [13:59:49] so some jobs need to write under /wmf/data/wmf as user druid [14:00:08] or better, to read probably? [14:00:23] correct elukey - let's BC for a minute if you wish [14:00:54] ah and of course the edit "hourly" is indexed every month [14:01:07] sure, let's bc [14:01:09] correct elukey, same for reduced [14:12:03] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-01-20): Add edit count bucketing to all metrics - https://phabricator.wikimedia.org/T269986 (10Ottomata) Hm, @awight, I don't know why, but the schemas you mention... [14:20:40] 10Analytics, 10observability: Modify Kafka max replica lag alert to only alert if increasing - https://phabricator.wikimedia.org/T273702 (10Ottomata) So even though alertmanager is upcoming, should we continue to use `check_prometheus`? (Can't use `grafana_alert`, the dashboard is templated.) 
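Editor's sketch of the estimate mentioned at 12:56 ("the slope says we should be done in a bit less than 6 hours" for under-replicated blocks). The log does not show the actual data or method, so this assumes a plain linear extrapolation between the first and last sample; `hours_until_zero` is a hypothetical helper and the block counts are invented.

```python
def hours_until_zero(samples):
    """Linearly extrapolate when a decreasing counter reaches zero.

    samples: list of (t_seconds, count), oldest first.
    Returns hours from the latest sample, or None if the count is not decreasing.
    """
    (t_first, c_first), (t_last, c_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("samples must be ordered and span some time")
    rate = (c_first - c_last) / (t_last - t_first)  # blocks cleared per second
    if rate <= 0:
        return None
    return c_last / rate / 3600


# Invented numbers: 1.2M pending blocks, 1.0M one hour later.
estimate = hours_until_zero([(0, 1_200_000), (3600, 1_000_000)])
print(f"roughly {estimate:.1f} hours left")
```

A straight-line estimate like this is only as good as the assumption that replication throughput stays constant, which is why the chat treats it as a rough figure.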
[14:24:37] !log sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod o+rx /wmf/data/wmf [14:24:39] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:25:25] !log sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o+rx /wmf/data/wmf/edit [14:25:27] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:28:27] !log relaunch edit-hourly-druid-coord 02-2021 after chmods [14:28:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:30:38] !log kill + relaunch webrequest_load to pick up new changes after refinery deployment [14:30:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:32:34] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban: Uncaught TypeError: navigator.sendBeacon is not a function - https://phabricator.wikimedia.org/T273374 (10Ottomata) Huh, yeah we should wrap the new `navigator.sendBeacon` call with a similar check, and perhaps just log a more informative console err... [14:40:15] !log kill + restart webrequest-druid-{hourly,daily} to pick up new changes after refinery deployment [14:40:17] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:45:11] !log sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod o+rx /wmf/data/wmf/mediawiki [14:45:13] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:48:42] !log sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o+rx /wmf/data/wmf/mediawiki/history_reduced [14:48:43] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:50:05] heya teammm [14:50:57] holaaa [15:03:35] mforns: so in theory the refinery deployment should be completed [15:03:44] elukey: ok [15:03:58] elukey: when I joined this morning I read your emails [15:04:05] sound good [15:07:08] ack! 
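Editor's sketch of what the `chmod o+rx` calls logged above do to a permission mode: they add the read and execute bits for "other" and leave the rest untouched. The log does not state the directories' original modes, so the starting mode `0o750` is an assumption for illustration, and `add_other_read_execute` is a hypothetical helper operating on plain integers rather than on HDFS.

```python
import stat


def add_other_read_execute(mode):
    """Return mode with the 'other' read and execute bits set (like chmod o+rx)."""
    return mode | stat.S_IROTH | stat.S_IXOTH


def format_mode(mode):
    """Render the permission bits as a three-digit octal string."""
    return format(mode & 0o777, "03o")


assumed_before = 0o750  # hypothetical starting mode, not taken from the log
after = add_other_read_execute(assumed_before)
print(format_mode(assumed_before), "->", format_mode(after))
```

On a directory the execute bit is what allows traversal, which is why the parent directories get `o+rx` as well as the data directories; the `-R` runs in the log apply the same change recursively to the subtree.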
[15:38:33] 10Analytics, 10observability: Modify Kafka max replica lag alert to only alert if increasing - https://phabricator.wikimedia.org/T273702 (10Ottomata) @fgiunchedi is there a way to somehow smooth this alert? In [[ https://grafana-rw.wikimedia.org/explore?orgId=1&left=%5B%221612279243763%22,%221612309822249%22,... [15:39:47] 10Analytics-Clusters, 10Patch-For-Review: Improve logging for HDFS Namenodes - https://phabricator.wikimedia.org/T265126 (10razzi) @Ottomata and I discussed next steps for this ticket, and came up with the following: - Create a puppet patch that allows a hiera setting for symlinking hadoop logs into /var/log/... [15:51:31] (03PS1) 10Ottomata: Migrate TranslationRecommendation from metawiki [schemas/event/secondary] - 10https://gerrit.wikimedia.org/r/661399 (https://phabricator.wikimedia.org/T271163) [15:54:11] 10Analytics-Radar, 10WMDE-Templates-FocusArea, 10MW-1.36-notes (1.36.0-wmf.29; 2021-02-02), 10Patch-For-Review, 10WMDE-TechWish (Sprint-2021-01-20): Add edit count bucketing to all metrics - https://phabricator.wikimedia.org/T269986 (10awight) @Ottomata Oh dear. Yeah I was worried that we are your recur... [16:52:16] 10Analytics: Superset error: `Hive table 'wmf.webrequest' is corrupt. The number of files in the directory (64) does not match the declared bucket count (256) for partition` - https://phabricator.wikimedia.org/T273693 (10jrbs) >>! In T273693#6798625, @JAllemandou wrote: > Hi @jrbs, > The problem comes from a cha... [16:53:57] 10Analytics, 10Analytics-Kanban, 10Event-Platform, 10EventStreams, and 5 others: Set up internal eventstreams instance exposing all streams declared in stream config (and in kafka jumbo) - https://phabricator.wikimedia.org/T269160 (10elukey) Next step is https://wikitech.wikimedia.org/wiki/LVS#Configure_th... 
[17:24:50] 10Analytics, 10Event-Platform: Sanitize and ingest event tables defined in the event_sanitized database - https://phabricator.wikimedia.org/T273789 (10Ottomata) [17:36:56] 10Analytics: Superset error: `Hive table 'wmf.webrequest' is corrupt. The number of files in the directory (64) does not match the declared bucket count (256) for partition` - https://phabricator.wikimedia.org/T273693 (10JAllemandou) 05Open→03Declined Declining then :) [17:47:34] 10Analytics, 10GrowthExperiments, 10Growth-Team (Current Sprint): eventgate_validation_error for NewcomerTask, HomepageTask, and HomepageVisit schemas - https://phabricator.wikimedia.org/T273700 (10kostajh) I don't understand why this error is occurring ([logstash link](https://logstash.wikimedia.org/app/dis... [17:50:16] !log rebalance kafka partitions for codfw.wdqs-internal.sparql-query [17:50:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:52:20] !log rebalance kafka partitions for eqiad.wdqs-internal.sparql-query [17:52:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [18:18:27] 10Analytics: Add time interval limits to pageview API - https://phabricator.wikimedia.org/T261681 (10JAllemandou) > I'm thinking ~1 year is a good limit +1 ! > Also, are there any endpoints other than per-article and per-file for which I should implement this limit? I don't think so - top jobs are already restr... 
[18:19:59] elukey: thanks a milion for the fixes in perms - edit job succeeded and history-reduced is still going, indexing druid \o/ [18:28:14] !log rebalance kafka partitions for codfw.mediawiki.job.refreshLinks [18:28:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [18:28:44] !log rebalance kafka partitions for eqiad.mediawiki.job.refreshLinks [18:28:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [18:30:12] joal: goooooood [18:45:55] 10Analytics-Radar, 10SRE, 10ops-eqiad: Degraded RAID on an-worker1099 - https://phabricator.wikimedia.org/T273034 (10Cmjohnson) 05Open→03Resolved done [18:49:23] * razzi lunch break [18:49:35] 10Analytics, 10GrowthExperiments, 10Growth-Team (Current Sprint), 10Patch-For-Review: eventgate_validation_error for NewcomerTask, HomepageTask, and HomepageVisit schemas - https://phabricator.wikimedia.org/T273700 (10Ottomata) > I assume that if validation errors occur, this means that the event is not pe... [18:53:41] o/ anyone have time to help me figure out why a dataset I moved to `stat1007:/srv/published/datasets/one-off/isaacj/list-building/` a few hours ago still isn't showing up at https://analytics.wikimedia.org/published/datasets/one-off/isaacj/ ? [18:54:16] 10Analytics, 10GrowthExperiments, 10Growth-Team (Current Sprint), 10Patch-For-Review: eventgate_validation_error for NewcomerTask, HomepageTask, and HomepageVisit schemas - https://phabricator.wikimedia.org/T273700 (10Ottomata) Oh, re missing/null fields in https://gerrit.wikimedia.org/r/c/mediawiki/extens... [18:55:02] isaacj: I see the directory, you may have the page cached.. have you tried incognito? 
[18:55:40] hmm...that's what i figured but even a hard refresh and incognito mode didn't change anything [18:55:53] well just proves i understand even less than i thought about web browsers :) [18:55:58] I confirm I see the fir [18:56:04] s/fir/dir [18:56:10] Must be cached related [18:56:51] so weird, i even just switched to Safari which i never use and they aren't there [18:57:01] i wonder if it is varnish cache related [18:57:05] isaacj: can you paste me somewhere the result of curl -i https://analytics.wikimedia.org/published/datasets/one-off/isaacj/ ? [18:57:05] isaacj: It's data-center related [18:57:06] maybe esams has it but eqiad doesn't [18:57:08] even in pvt [18:57:26] ottomata: I think it may be related, [18:57:49] I have x-cache: cp3062 miss, cp3054 miss [18:58:00] elukey https://www.irccloud.com/pastebin/7sZ9qwdt/ [18:58:01] isaacj: does this url work for you? [18:58:01] https://analytics.wikimedia.org/published/datasets/one-off/isaacj/list-building/ [18:58:18] yep x-cache: cp1085 miss, cp1083 hit/28 [18:58:21] ottomata: haha, yeah, don't know why didn't think of that [18:58:30] (guessing that is not cached in your browser or in eqiad) [18:59:15] if the webserver doesn't return any cache header it is kept for 24h [18:59:48] oh yeah, i just switched my VPN to Europe and now i see it. so interesting [19:00:00] 10Analytics, 10Editing-team, 10Event-Platform: VisualEditorFeatureUse Event Platform Migration - https://phabricator.wikimedia.org/T267353 (10Ottomata) Hi, I haven't had time to start this yet, so I'll wait until next week. Thanks for your patience. [19:00:03] 10Analytics, 10Editing-team, 10Event-Platform: EditAttemptStep Event Platform Migration - https://phabricator.wikimedia.org/T267343 (10Ottomata) Hi, I haven't had time to start this yet, so I'll wait until next week. Thanks for your patience. 
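Editor's sketch for comparing the `x-cache` values pasted above (`cp3062 miss, cp3054 miss` versus `cp1085 miss, cp1083 hit/28`). It only splits the header into per-host entries; reading the number after `/` as a hit count is an assumption, and `parse_x_cache` and `any_cache_hit` are hypothetical helper names.

```python
def parse_x_cache(header_value):
    """Split an X-Cache value into (host, status, count) tuples.

    count is the integer after '/', if present (assumed to be a hit count),
    otherwise None.
    """
    entries = []
    for part in header_value.split(","):
        fields = part.split()
        if len(fields) != 2:
            raise ValueError(f"unexpected X-Cache entry: {part!r}")
        host, status = fields
        count = None
        if "/" in status:
            status, count_text = status.split("/", 1)
            count = int(count_text)
        entries.append((host, status, count))
    return entries


def any_cache_hit(header_value):
    """True if at least one cache host in the header reports a hit."""
    return any(status == "hit" for _, status, _ in parse_x_cache(header_value))


# The two header values quoted in the chat.
for value in ("cp3062 miss, cp3054 miss", "cp1085 miss, cp1083 hit/28"):
    print(value, "->", any_cache_hit(value))
```

Seeing a hit from one data center's caches and misses from another's is consistent with the diagnosis in the chat: a stale directory listing was cached in one location while the other fetched a fresh copy.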
[19:01:47] lexnasser: I have just created three new vms for aqs in horizon-analytics, tomorrow morning I should be able to bootstrap aqs :) [19:03:52] elukey: we shoudl probably reduce TTL for datasets [19:04:47] ottomata: I think that setting a cache header of one/two hours should be fine [19:04:55] elukey: sounds good, thanks so much for your help! [19:17:40] lexnasser: ah there might be a little issue, namely the fact that we need a deployment server for scap etc.. [19:18:00] will try to find a solution [19:18:50] elukey: quick question - When will we be able to add an-worker1117 to the backup cluster? [19:20:35] joal: tomorrow morning for sure [19:20:59] ack elukey - I see the HDFS-reamining-space curve gently moving down :) [19:21:15] ttl!! [19:21:18] * elukey afk [19:35:10] Gone for tonight - see you tomorrow team [19:53:49] 10Analytics, 10Event-Platform: Sanitize and ingest event tables defined in the event_sanitized database - https://phabricator.wikimedia.org/T273789 (10mforns) //mediawiki_client_session_tick//, IIUC, is not supposed to be kept indefinitely. Instead, we want to keep its aggregated/sessionized intermediate table... [20:03:37] !log rebalance kafka partitions for codfw.mediawiki.job.RecordLintJob [20:03:40] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:04:02] !log rebalance kafka partitions for eqiad.mediawiki.job.RecordLintJob [20:04:06] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [20:04:27] 10Analytics-Clusters: Balance Kafka topic partitions on Kafka Jumbo to take advantage of the new brokers - https://phabricator.wikimedia.org/T255973 (10razzi) [20:10:18] 10Analytics, 10GrowthExperiments, 10Growth-Team (Current Sprint), 10MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), 10Patch-For-Review: eventgate_validation_error for NewcomerTask, HomepageTask, and HomepageVisit schemas - https://phabricator.wikimedia.org/T273700 (10kostajh) Oh fun, we get some XSS attempts... 
[20:31:22] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: End wider data purge window - https://phabricator.wikimedia.org/T273815 (10Rileych) [20:35:55] 10Analytics: Build a process to check permissions when changing datasets from non-PII to PII - https://phabricator.wikimedia.org/T273818 (10Milimetric) [21:09:49] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: delete data older than 90 days - https://phabricator.wikimedia.org/T273821 (10Rileych) [21:28:22] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: delete data older than 90 days - https://phabricator.wikimedia.org/T273821 (10MMiller_WMF) [21:28:39] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: End wider data purge window - https://phabricator.wikimedia.org/T273815 (10MMiller_WMF) [21:29:15] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: End wider data purge window - https://phabricator.wikimedia.org/T273815 (10MMiller_WMF) [21:29:28] 10Analytics, 10Growth-Team, 10Product-Analytics: Growth: delete data older than 90 days - https://phabricator.wikimedia.org/T273821 (10MMiller_WMF) [21:30:21] 10Analytics, 10Growth-Scaling, 10Growth-Team, 10Product-Analytics: Growth: delete data older than 90 days - https://phabricator.wikimedia.org/T273821 (10MMiller_WMF) [21:30:40] 10Analytics, 10Growth-Scaling, 10Growth-Team, 10Product-Analytics: Growth: End wider data purge window - https://phabricator.wikimedia.org/T273815 (10MMiller_WMF) [21:37:59] !log rebalance kafka partitions for eventlogging_MobileWikiAppLinkPreview [21:38:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [21:57:33] 10Analytics-Radar, 10Product-Analytics, 10Growth-Team (Current Sprint): remove all Growth schemas from the schema whitelist - https://phabricator.wikimedia.org/T273826 (10Rileych) [22:09:28] 10Analytics, 10Product-Analytics, 10Growth-Team (Current Sprint): remove all Growth schemas from the schema whitelist - https://phabricator.wikimedia.org/T273826 
(10nettrom_WMF) [22:10:56] 10Analytics, 10Growth-Scaling, 10Growth-Team, 10Product-Analytics: Growth: End wider data purge window - https://phabricator.wikimedia.org/T273815 (10nettrom_WMF) [22:10:59] 10Analytics, 10Product-Analytics, 10Growth-Team (Current Sprint): remove all Growth schemas from the schema whitelist - https://phabricator.wikimedia.org/T273826 (10nettrom_WMF) [22:14:37] 10Analytics, 10Product-Analytics, 10Growth-Team (Current Sprint): remove all Growth schemas from the schema whitelist - https://phabricator.wikimedia.org/T273826 (10nettrom_WMF) [22:27:07] 10Analytics, 10Event-Platform: Sanitize and ingest event tables defined in the event_sanitized database - https://phabricator.wikimedia.org/T273789 (10Ottomata) Ah ok cool, but we still need to do this for new non legacy EventLogging events in general. [22:39:26] 10Analytics, 10GrowthExperiments, 10Growth-Team (Current Sprint), 10MW-1.36-notes (1.36.0-wmf.30; 2021-02-09), 10Patch-For-Review: eventgate_validation_error for NewcomerTask, HomepageTask, and HomepageVisit schemas - https://phabricator.wikimedia.org/T273700 (10Ottomata) > I don't suppose there's much t... [22:52:18] 10Analytics, 10Better Use Of Data, 10Product-Data-Infrastructure: Define acceptable usage of the `meta` object in event schemas - https://phabricator.wikimedia.org/T273293 (10jlinehan) @Ottomata, @Mholloway and I had a chance to sit down and dive into this, notes from our discussion: - Likely undesirable to... [23:22:07] 10Analytics, 10Product-Analytics: Analyze differences between checksum-based and revert-tag based reverts in mediawiki_history - https://phabricator.wikimedia.org/T266374 (10kzimmerman) p:05Low→03Medium The Growth team runs updates every week, so they're using the mw-reverted tag in those notebooks. But we...