[00:00:30] Analytics-Clusters: Review and improve Oozie authorization permissions - https://phabricator.wikimedia.org/T262660 (Nuria) +1
[00:02:02] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (Nuria) confirming with @Ottomata this can now be closed
[00:02:06] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (Nuria) Open→Resolved
[00:02:18] Analytics, Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (Nuria)
[00:07:57] Analytics, Analytics-Kanban, Product-Analytics: Add editors per country data to AQS API (geoeditors) - https://phabricator.wikimedia.org/T238365 (Nuria) This is still not working, see error on: https://wikimedia.org/api/rest_v1/metrics/editors/by-country/ro.wikipedia/5..99-edits/2020/07 Looking at...
[00:07:59] Analytics, Analytics-Kanban, Product-Analytics: Add editors per country data to AQS API (geoeditors) - https://phabricator.wikimedia.org/T238365 (Nuria) cc @mforns
[00:12:07] Pchelolo: is there still any pending code to deploy on RESTBase for the new endpoint? (cc mforns)
[00:15:48] Analytics, Analytics-Kanban, Product-Analytics: Add editors per country data to AQS API (geoeditors) - https://phabricator.wikimedia.org/T238365 (Nuria) From: https://wikimedia.org/api/rest_v1/metrics/editors/ it does not look like the "by-country" endpoint is up
[00:17:33] Analytics-Clusters, Analytics-Kanban: Review and improve Oozie authorization permissions - https://phabricator.wikimedia.org/T262660 (Nuria) a:razzi
[00:47:27] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (Nuria) We probably need to start the patch from scratch
[00:47:29] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (Nuria) Maybe @paulkernfeld is interested in this ticket?
[01:22:52] RECOVERY - Check the last execution of monitor_refine_event on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[01:47:48] Analytics, Analytics-Kanban, Platform Engineering: Add log entry details to page and user events in EventBus - https://phabricator.wikimedia.org/T263055 (Milimetric) >>! In T263055#6468006, @Pchelolo wrote: > Could you expand a bit in what MySQL data are you trying to correlate the events to? It feel...
[02:04:13] Analytics-Radar, Technical-blog-posts: Story idea for Blog: The Best Dataset on Wikimedia Content and Contributors - https://phabricator.wikimedia.org/T259559 (Milimetric) I'm sorry, I keep trying to bring it up at our team meetings but we've been busy end of quarter. I think if you have a lull and need...
[06:23:11] good morning
[06:30:03] RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:33:59] Analytics, Analytics-Kanban, Platform Engineering: Add log entry details to page and user events in EventBus - https://phabricator.wikimedia.org/T263055 (JAllemandou) My 2 cents on this. For automated usage of data, we want to be able to join representations of the same actions in a (hopefully) non-f...
[06:42:23] PROBLEM - Check the last execution of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[07:11:28] bonjour
[07:11:52] buongiorno Luca :)
[07:13:14] Analytics, Analytics-Kanban: Analytics Ops Technical Debt - https://phabricator.wikimedia.org/T240437 (elukey)
[07:13:34] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (elukey) Resolved→Open a:razzi→None Please don't close this, let's keep it as tracking task in the Tech Debt area so anybody that wants to co...
[07:13:40] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (elukey)
[07:14:23] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (elukey)
[07:21:14] joal: nice the webrequest-load workflow graphs is ok now! \o/
[07:21:21] elukey: indeed :)
[07:21:36] thanks for the troubleshooting elukey!
[07:22:07] joal: we'll have to do a little more soon with hue-next.wikimedia.org :)
[07:22:21] still not ready, but it hopefully will be today
[07:22:30] I am sure some bugs are still to be discovered
[07:23:13] No problem elukey - Let's make than py2 out
[07:23:58] also it will be the first CDH package that we ditch in favor of something else :)
[07:24:07] strong win :)
[07:26:28] I am not very happy with the current status of alerts@, it seems very difficult to get what's happening (as you were saying yesterday joal )
[07:28:26] agreed elukey - I think it'll be b
[07:28:44] better now - data-quality alarms should be gone, and refine for test as well
[07:31:45] elukey: wanna talk HDFS?
[07:34:02] Analytics-Clusters: Upgrade to Superset 0.37.x - https://phabricator.wikimedia.org/T262162 (elukey)
[07:34:05] joal: in ~30 mins?
[07:34:12] when you wish elukey
[07:43:27] Analytics, Growth-Team, Product-Analytics: Revisions missing from mediawiki_revision_create - https://phabricator.wikimedia.org/T215001 (JAllemandou) Now is my time for this. Here is some data for `simplewiki` only in `presto` for July and August 2020: * July ` select count(distinct rev_id) from even...
[07:54:58] Analytics-Clusters: Upgrade to Superset 0.37.x - https://phabricator.wikimedia.org/T262162 (elukey) I am also seeing that http://localhost:9080/superset/dashboard/7/ is broken as well. ` Sep 17 07:53:34 an-tool1005 superset[5963]: Traceback (most recent call last): Sep 17 07:53:34 an-tool1005 superset[5963]...
[08:01:45] Analytics, Growth-Team, Product-Analytics: Revisions missing from mediawiki_revision_create - https://phabricator.wikimedia.org/T215001 (JAllemandou) Things I have checked: * Similar ratio of revision made by anonymous users vs registered users between missing revisions and not missing ones. Doesn't...
[08:02:15] Analytics, Growth-Team, Product-Analytics: Revisions missing from mediawiki_revision_create - https://phabricator.wikimedia.org/T215001 (JAllemandou) Ping @Milimetric , @Ottomata and @Pchelolo on that one please :)
[08:33:16] joal: we can bc if you want!
[08:33:30] joining elukey :)
[09:06:08] Morning!
[09:09:21] o/
[09:11:02] elukey: where will we put the backup of the 1004 files?
[09:11:51] klausman: I think on the first stat100x that has space
[09:11:59] (under /srv)
[09:12:10] roger. Will do a lil' survey
[09:17:43] ack
[09:20:43] https://phabricator.wikimedia.org/P12625
[09:20:56] Looks like either 5 or 8 are best candidates
[09:28:23] yep!
[09:30:19] klausman: one nit - if you go on cumin1001 you can use: sudo cumin 'stat100*' 'df -h etc..'
[09:30:31] See #wm-security :)
[09:52:44] elukey: regarding T251938, it looks to me like running /opt/rocm-3.3.0/bin/rocm_smi.py does not require privileges
[09:52:45] T251938: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938
[09:53:11] I can run it without sudo and get useful info. But I am unsure if "No pids are using the GPU" is accurate in this mode.
[10:00:19] klausman: https://github.com/RadeonOpenCompute/ROC-smi/blob/master/rocm_smi.py#L2897
[10:00:52] Nasty.
[10:01:06] I do not like when tools do privilege escalation quietly behind my back
[10:01:30] Anyway, will make a sudoers puppet change later, now to backup stat1004!
[10:01:49] I will stop cron and all timers on 1004 now, objections?
[10:02:09] things may have changed though, I tried to sudo -u joal bash + /opt/rocm/bin/rocm-smi and it works
[10:02:50] ahhh wait they made the script smarter I think
[10:03:02] it forces the sudo only on certain occasions
[10:04:13] anyway sorry +1 for timers etc..
[10:04:16] please feel free to go
[10:07:24] Ok timers off, cron off.
[10:09:39] Does this look good? cumin1001 ~ $ transfer.py stat1004:/srv stat1008:/srv/backup-1004
[10:11:32] `sudo transfer.py stat1004.eqiad.wmnet:/srv stat1008.eqiad.wmnet:/srv/backup-1004` probably works better
[10:12:47] !log started backup of stat1004's /srv to stat1008
[10:12:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:13:24] Oh, how do I disable puppet while this runs? I suspect it might re-enable cron
[10:17:15] klausman: sudo puppet agent --disable "username - reason"
[10:17:21] (on the host)
[10:17:37] on cumin there is the handy sudo cumin 'target' 'disable-puppet "reason"'
[10:19:18] Ok, thanks
[10:19:34] And now, while the gears grind, I shall find some food.
[10:19:39] ack!
[10:31:59] RECOVERY - Check the last execution of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[11:02:19] 246G copied so far
[11:03:32] * elukey lunch!
[11:05:13] so 246G in 50 minutes, that means 83 MiB/s.
[11:06:55] OOh my, this will take 8h :-S
[11:07:04] ouch :(
[11:07:11] I could turn off encryption
[11:07:21] But I dunno how much that'll help
[11:07:30] probably not a lot
[11:07:43] yeah, it'll max out at 125MiB/s anyway, because gigabit
[11:08:22] what we can do is do the snapshot today, and reimage tomorrow morning first thing, announcing this to the mailing list (today)
[11:08:32] Yeah, I'll send a mail
[11:14:03] Could we run something like rsync stat1007:/srv/ stat1008:/srv/backup-1007/ from cumin? If so, maybe we should do that for the remaining machines, so we can do the bulk upfront, and then just sync up (relatively) quickly on the day of reimage
[11:15:33] The only other options I see to avoid lengthy backups like this are faster NICs (and switches), or plugging in a disk locally that we can detach during the reimage
[12:04:42] klausman: so the rsync daemon on the stats runs as nobody, so some files would lead to perms denied probably?
[12:05:01] I may miss something about rsync, so if you have more details lemme know :)
[12:10:00] Analytics-Clusters: Upgrade to Superset 0.37.x - https://phabricator.wikimedia.org/T262162 (elukey) I followed up with upstream on slack and they are about to merge a little change that should fix the `KeyError: 'filter'`. Since it is related to the "legacy" druid connector, as opposed to the SQL Alchemy one...
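Note: the throughput estimate discussed at 11:05-11:07 can be checked with a short Python sketch. It assumes the "G" figures reported by transfer.py are GiB and takes the 2463G total from the progress report later in the log; the function names are illustrative and not part of any tool mentioned here.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def throughput_mib_s(copied_gib: float, minutes: float) -> float:
    """Average transfer rate in MiB/s."""
    return copied_gib * GIB / MIB / (minutes * 60)


def eta_hours(total_gib: float, rate_mib_s: float) -> float:
    """Hours needed to copy total_gib at a steady rate."""
    return total_gib * GIB / MIB / rate_mib_s / 3600


# 246G copied in roughly 50 minutes (10:12 to 11:02).
rate = throughput_mib_s(246, 50)

# Assumption: the full /srv is about 2463G, as reported later in the log.
hours = eta_hours(2463, rate)

# Gigabit Ethernet line rate: 10**9 bits/s = 125 MB/s, a bit under 120 MiB/s.
gigabit_mib_s = 1e9 / 8 / MIB

print(f"{rate:.1f} MiB/s, about {hours:.1f} h total, link cap {gigabit_mib_s:.1f} MiB/s")
```

Under these assumptions the numbers match the chat: roughly 84 MiB/s and a bit over 8 hours. The 125 figure quoted for gigabit is the ceiling in MB/s; expressed in MiB/s it is closer to 119, before any protocol overhead.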
[12:12:54] No, I meant rsync through ssh
[12:13:15] (which would need to run as root, so permissions can be kept intact)
[12:14:08] I dunno if root-login on the stat machines via ssh (no sudo) would even work
[12:20:08] I don't think that we could do rsync through ssh in that way, but I admit my ignorance
[12:23:20] Analytics-Clusters, Analytics-Radar, User-Elukey: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (klausman) I did some testing just now, and it looks like the current version of `rocm_smi.py` does not try to re-execute itself through `sudo` when the `--showpidgpus`...
[12:23:43] So clearly, cumin has the necessary keys to ssh as root, judging from what I can see. I'll run a little test to see what happens :)
[12:48:30] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (Ottomata) I was asked to scope this task, so we scoped it to just the profile::analytics:: classes. I guess we can leave it open but Razzi isn't going to f...
[12:49:38] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (elukey) Yes this is what we all agreed on at the time, I'd prefer to keep a task open since there is still work to do in the tech debt area for puppet types.
[12:55:28] Analytics, Analytics-Kanban, Platform Engineering: Add log entry details to page and user events in EventBus - https://phabricator.wikimedia.org/T263055 (Pchelolo) just to be clear, I'm not in a strong opposition to adding the log_id either, just wanted to better understand why.
[13:04:51] Analytics, Growth-Team, Product-Analytics: Revisions missing from mediawiki_revision_create - https://phabricator.wikimedia.org/T215001 (Pchelolo) @Jalemayehu I assume the last table is the missing revisions. Which wiki are these from?
[13:08:30] elukey: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU#Outstanding_issues IIUC, this is no longer a problem>
[13:08:32] ?
[13:13:11] it seems so yes
[13:23:30] Should I close T248574 and adjust the wiki page?
[13:23:31] T248574: GPUs are not correctly handling multitasking - https://phabricator.wikimedia.org/T248574
[13:24:30] klausman: yes we can close
[13:26:10] Alright, will do
[13:27:17] Analytics: GPUs are not correctly handling multitasking - https://phabricator.wikimedia.org/T248574 (klausman) Open→Resolved The recent update of the GPU kernel-side drivers to using the rock-dkms package from upstream seems to have resolved this issue (parallel jobs seem to work just fine now.) Clo...
[13:27:20] Analytics-Clusters, Analytics-Kanban: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (klausman)
[13:29:55] Analytics: GPUs are not correctly handling multitasking - https://phabricator.wikimedia.org/T248574 (klausman) Resolved→Open
[13:29:58] Analytics-Clusters, Analytics-Kanban: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (klausman)
[13:31:35] Sorry for the spam
[13:31:36] Analytics: GPUs are not correctly handling multitasking - https://phabricator.wikimedia.org/T248574 (klausman) Open→Resolved
[13:31:38] Analytics-Clusters, Analytics-Kanban: Upgrade AMD ROCm to latest upstream - https://phabricator.wikimedia.org/T247082 (klausman)
[13:32:05] Also, we're at 1.1T for stat1004
[13:33:08] nice
[14:09:00] * elukey run errand before standup
[14:15:34] Analytics, Event-Platform, Operations, Wikimedia-production-error: Could not enqueue jobs from stream mediawiki.job.cirrusSearchIncomingLinkCount - https://phabricator.wikimedia.org/T263132 (dcausse) https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate?orgId=1&refresh=1m&from=now-3h&to=now shows a...
[14:26:46] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (paulkernfeld) Yeah, I can take a look at this. I have a couple questions: - What version of node and npm do you use to generate `package.json`? When I try with node 10.22.1 and...
[14:35:38] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (Pchelolo) I would suggest waiting a little on this. AQS is based on a very outdated codebase of RESTBase/hyperswitch that do not support openAPI 3 yet, and upgrading to a newer v...
[14:50:26] Analytics, Analytics-EventLogging, JavaScript, Wikimedia-production-error: OperationError: The operation failed for an operation-specific reason in generateRandomSessionId - https://phabricator.wikimedia.org/T263041 (Nuria) * all errors from the same FF 52 session (three year old browser) , does...
[14:50:28] Analytics, Analytics-EventLogging, JavaScript, Wikimedia-production-error: OperationError: The operation failed for an operation-specific reason in generateRandomSessionId - https://phabricator.wikimedia.org/T263041 (Nuria) Open→Invalid
[14:51:08] And we're past the halfway point (at 1360G)
[15:00:46] if people want to try https://hue-next.wikimedia.org/
[15:10:50] Analytics, Analytics-Kanban, Platform Engineering: Add log entry details to page and user events in EventBus - https://phabricator.wikimedia.org/T263055 (Milimetric) > Why not just use event timestamp instead of log timestamp in the ongoing update? they should be within milliseconds from each other....
[15:17:40] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (Nuria) i see, @paulkernfeld this is not a best fit then. Unassigning.
[15:20:42] elukey: Just tried hue-next quickly - I need to get used to duplicate windows (no CTRL-click to open new tabs :( but I actually manage to view oozie stuff, which is the important bit for me
[15:23:53] perfect thanks for the check :)
[16:24:10] Analytics: Separate RSVD anomaly detection into a systemd timer for better alarming with Icinga - https://phabricator.wikimedia.org/T263030 (ssingh) Yes, sure! I think I can take care of the systemd timer part.
[16:25:32] Analytics: Create a kibana dashboard for AQS hyperswitch's logs - https://phabricator.wikimedia.org/T262012 (Nuria) p:Triage→High
[16:25:42] Analytics, Analytics-Kanban: Debianize Python's pid library to be able to use it from reportupdater - https://phabricator.wikimedia.org/T262574 (Nuria) p:Triage→High
[16:30:20] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (Milimetric) So, why Go? Are we completely moving away from node? Will there be a new version of node-service-template? I think we can easily migrate to a new version of node-s...
[16:32:30] Analytics, Analytics-Kanban, Patch-For-Review: AQS is not OpenAPI 3 compliant - https://phabricator.wikimedia.org/T240995 (Pchelolo) service-template-node could be another possibility, yes. The point is that it just shouldn't rely on restbase codebase anymore. I'll come up with a more detailed prop...
[16:56:54] Analytics, Product-Analytics, Structured Data Engineering, SDAW-MediaSearch (MediaSearch-Beta), Structured-Data-Backlog (Current Work): [L] Instrument MediaSearch results page - https://phabricator.wikimedia.org/T258183 (CBogen)
[17:01:35] Analytics-Clusters, Discovery, Discovery-Search (Current work), Patch-For-Review: Move mjolnir kafka daemon from ES to search-loader VMs - https://phabricator.wikimedia.org/T258245 (EBernhardson) The daemons are moved. A few followups might be required elsewhere, but this task should be complete.
[17:20:32] curious if analytics team has thought much about this or has a process: T263157
[17:20:33] T263157: Process to check approximate correctness of analytics pipeline outputs - https://phabricator.wikimedia.org/T263157
[17:23:39] mforns: I am going to exclude oversampled from our entropy queries, ok?
[17:44:08] a-team: https://wikimedia.org/api/rest_v1/metrics/editors/by-country/ro.wikipedia/5..99-edits/2020/07
[17:44:31] nuria: oh, ok, didn't know we could do that!
[17:45:04] Pchelolo just deployed that for us, thanks!!
[17:52:09] Analytics-Clusters, Analytics-Radar, User-Elukey: Monitoring GPU Usage on stat Machines - https://phabricator.wikimedia.org/T251938 (Aroraakhil) @elukey and @klausman thanks! It works fine for me. Just to clarify, I use the following `/opt/rocm/bin/rocm-smi --showpids` without `sudo`, and it works ju...
[18:05:49] mforns: ta-tachannnnn
[18:05:53] Pchelolo: THANKS
[18:07:07] :D
[18:09:51] sorry it took me so long to fix it up. I was out yesterday and forgot about it a bunch
[18:22:27] Analytics, Analytics-Kanban, Patch-For-Review: Use MaxMind DB in piwik geo-location - https://phabricator.wikimedia.org/T213741 (razzi) This is done and configured on Matomo: https://piwik.wikimedia.org @Nuria could you please check that everything is working as expected?
[18:22:41] (PS1) Nuria: Removing oversampled data that can trigger false positives [analytics/refinery] - https://gerrit.wikimedia.org/r/628169 (https://phabricator.wikimedia.org/T251814)
[18:23:32] * elukey afk!
[18:23:41] mforns: let me know what you think but i think we should rerun entropy for UA for the last 90 days with oversampled removed
[18:25:03] nuria: yes, if oversampled can skew the distribution of UA, then let's do that definitely
[18:25:45] mforns: ok, code sent, will add note to train etherpad
[18:28:22] mforns: added https://etherpad.wikimedia.org/p/analytics-weekly-train
[18:29:43] nuria: maybe, as we only need 720 data points to calculate RSVD, we could backfill only 30 days
[18:30:20] mforns: i think we should do the whole ts to have best quality as possible
[18:30:39] nuria: but the whole ts is more than 90 days no?
[18:30:56] mforns: if we read from events it can only be 90 days
[18:31:11] oh, of course
[18:31:14] mforns: if we read from events sanitized it could be the entire ts
[18:31:30] no no, sanitized events don't have the useragent field, right?
[18:31:44] for navtiming? not sure one sec
[18:31:55] no, they are not whitelisted
[18:32:07] so we can only backfill 3 months
[18:32:20] mforns: k, still better than nothing
[18:32:37] i am kicking myself for not having thought about this earlier ahahahahaha
[18:32:38] however, the command we usually use, would also backfill the hourly traffic metrics
[18:32:53] heh, me neither
[18:33:04] wasn't even aware of the isOversample ield
[18:33:05] field
[18:33:17] mforns: ya see but i was , duh
[18:33:28] bad
[18:34:44] nuria: the person who backfills this, will have to modify the hourly coordinator (comment out the traffic snippet) before calling for backfill, otherwise the traffic metrics will also be recomputed, and alarms retriggered
[18:35:22] mforns: mmmm
[18:35:53] mforns: then let's just you and I do it?
[18:36:27] sure, let's combine after deployment
[18:36:44] are we deploying this today?
[18:36:47] ottomata: heya - I tried to open the doc, but google wants to verify the email, being the list - could you please re-share the prez with us individually?
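Note: the "720 data points, 30 days" remark at 18:29 follows if the entropy metric produces one data point per hour, which is an assumption based on the mentions of an hourly coordinator. A minimal Python sketch of the arithmetic, with illustrative names:

```python
HOURS_PER_DAY = 24  # assumption: one data point per hour


def days_needed(points: int) -> int:
    """Smallest whole number of days that yields at least `points` hourly values."""
    return -(-points // HOURS_PER_DAY)  # ceiling division


def hourly_points(days: int) -> int:
    """Number of hourly data points covered by a backfill of `days` days."""
    return days * HOURS_PER_DAY


rsvd_days = days_needed(720)
full_backfill_points = hourly_points(90)
print(rsvd_days, full_backfill_points)
```

On that assumption, 30 days is the minimum window for RSVD, while the 90-day backfill that the unsanitized events allow would give three times as many points.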
[18:36:56] I have an interview now
[18:37:42] joal: fixed
[18:37:48] thanks ottomata
[18:38:06] mforns: we can deploy next week, np
[18:38:11] ty!
[18:38:50] nuria: ok
[18:39:33] nuria:
[18:39:46] isoversample _c1
[18:39:46] false 350302
[18:39:46] true 768364
[18:40:06] oversample=true is 2/3 of the data
[18:42:01] ottomata: still not working for me :(
[18:44:29] hm
[18:44:55] now joal ?
[18:45:06] Yes!
[18:45:13] great thanks
[19:03:50] (CR) Mforns: [V: +2 C: +2] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/628169 (https://phabricator.wikimedia.org/T251814) (owner: Nuria)
[19:08:34] ottomata: it's late here - will you be there tomorrow to talk about the prez?
[19:08:43] yup!
[19:08:46] Great :)
[19:08:53] Then, I'm gone :)
[19:08:57] tty!
[19:08:59] t
[19:09:03] see you tomorrow team
[19:16:11] 1004 is aaaaalmost done backing up
[19:17:36] 2393G/2463G
[19:51:11] 2020-09-17 19:29:07 INFO: 2641297096471 bytes correctly transferred from stat1004.eqiad.wmnet to stat1008.eqiad.wmnet
[19:51:13] woohoo!
[19:53:57] Analytics, Analytics-Kanban: Debianize Python's pid library to be able to use it from reportupdater - https://phabricator.wikimedia.org/T262574 (Ottomata)
[20:07:36] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) Part 2 has been posted: https://techblog.wikimedia.org/2020/09/17/wikimedias-event-data-json-event-schemas/
[20:23:57] Analytics-Clusters, Discovery, Discovery-Search (Current work), Patch-For-Review: mjolnir-kafka-msearch-daemon dropping produced messages after move to search-loader[12]001 - https://phabricator.wikimedia.org/T260305 (EBernhardson) > Try kafka-python 2.0.1 to see if the consumer errors get fixed...
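Note: the "2/3 of the data" estimate at 18:40 can be checked directly from the two counts pasted at 18:39. A minimal Python sketch; the variable names are illustrative:

```python
# Row counts per isoversample value, as pasted in the chat at 18:39.
counts = {"false": 350302, "true": 768364}

total = sum(counts.values())
share_oversampled = counts["true"] / total

print(f"{counts['true']} of {total} rows are oversampled ({share_oversampled:.1%})")
```

The share comes out slightly above two thirds, so dropping oversampled rows removes most of the events feeding the entropy calculation.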
[20:25:52] Analytics-Clusters, Discovery, Discovery-Search (Current work): mjolnir-kafka-msearch-daemon dropping produced messages after move to search-loader[12]001 - https://phabricator.wikimedia.org/T260305 (EBernhardson)
[20:38:02] hi analytics friends, do any of the webrequest/pageview tables in turnilo have USA state breakdowns in them?
[20:44:25] Analytics, Analytics-Kanban, Platform Engineering: Add log entry details to page and user events in EventBus - https://phabricator.wikimedia.org/T263055 (Milimetric) I'm game to do it all together. The new log stream might be easier considering changing a hook's signature is involved since T240307....
[20:46:29] Analytics, Analytics-Kanban, Platform Engineering: Add log entry details to page and user events in EventBus - https://phabricator.wikimedia.org/T263055 (Milimetric)
[20:58:08] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (Nintendofan885) >>! In T253649#6472449, @srodlund wrote: > Part 2 has been posted: https://techblog.wikimedia.org/2020/09/17/wikimedias-event-data-json-event-s...
[21:03:44] cdanis: i'm not totally sure, you should check in hive
[21:03:47] webrequest
[21:03:59] https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest#Current_Schema
[21:04:00] says
[21:04:01] 😬 one of these days I'll actually learn how to use Hive
[21:04:08] Geocoded data computed during refinement using client_ip and MaxMind database contains: continent, country_code, country, subdivision, city but has nulls where information is not available
[21:04:14] subdivision might be state?
[21:04:15] or province?
[21:04:17] yeah 'subdivision' is probably it
[21:04:18] cdanis: it is so easy!
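Note: a per-state breakdown built on the geocoded data described at 21:04 could look like the Python sketch below. It assumes each row's geocoded data is available as a mapping with the keys listed on the wiki page, that unavailable values are None, and that US rows carry the country_code "US". The sample rows are made up for illustration and are not real traffic data.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional

GeoData = Mapping[str, Optional[str]]


def requests_by_subdivision(rows: Iterable[GeoData], country_code: str = "US") -> Counter:
    """Count rows per subdivision for one country; nulls are grouped as 'Unknown'."""
    counts: Counter = Counter()
    for geo in rows:
        if geo.get("country_code") != country_code:
            continue
        counts[geo.get("subdivision") or "Unknown"] += 1
    return counts


# Made-up sample rows, for illustration only.
sample_rows = [
    {"country_code": "US", "subdivision": "California"},
    {"country_code": "US", "subdivision": "California"},
    {"country_code": "US", "subdivision": "Ohio"},
    {"country_code": "US", "subdivision": None},
    {"country_code": "RO", "subdivision": "Cluj"},
]

state_counts = requests_by_subdivision(sample_rows)
for name, count in state_counts.most_common():
    print(name, count)
```

Grouping nulls under an explicit "Unknown" bucket keeps the totals honest, since the schema note says the geocoded fields are null where the information is not available.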
[21:04:26] CMON do it right now i'll show you
[21:04:26] based on the structure of the geo-maps dns repo file anyway
[21:09:07] Analytics, Event-Platform, Technical-blog-posts: Story idea for Blog: Wikimedia's Event Platform - https://phabricator.wikimedia.org/T253649 (srodlund) @Nintendofan885 We updated the URL for this a few minutes ago, which I was just about to update this ticket with :) https://techblog.wikimedia.org/20...
[22:54:23] Analytics, Analytics-Kanban, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (Nuria) k, removing from kanban
[22:54:32] Analytics, Patch-For-Review: Use types in Analytics Puppet classes/profiles/etc.. - https://phabricator.wikimedia.org/T252617 (Nuria)
[23:00:29] Analytics-Radar, Domains, Operations, Traffic, Wikimedia-General-or-Unknown: Blocking all third-party storage access requests - https://phabricator.wikimedia.org/T262996 (Krinkle) Those urls don't need to change. We just need to stop accidentally setting cookies on them. I'm 99% sure this is...
[23:00:59] Analytics-Radar, Domains, Operations, Traffic, and 2 others: Blocking all third-party storage access requests - https://phabricator.wikimedia.org/T262996 (Krinkle)
[23:28:44] (PS1) Jenniferwang: add SpecialMuteSubmit schema to EventLogging whitelist https://phabricator.wikimedia.org/T262499 [analytics/refinery] - https://gerrit.wikimedia.org/r/628235
[23:28:46] (CR) Welcome, new contributor!: "Thank you for making your first contribution to Wikimedia! :) To learn how to get your code changes reviewed faster and more likely to get" [analytics/refinery] - https://gerrit.wikimedia.org/r/628235 (owner: Jenniferwang)
[23:46:20] (PS2) DannyS712: Add SpecialMuteSubmit schema to EventLogging whitelist [analytics/refinery] - https://gerrit.wikimedia.org/r/628235 (https://phabricator.wikimedia.org/T262499) (owner: Jenniferwang)
[23:47:00] (CR) DannyS712: Add SpecialMuteSubmit schema to EventLogging whitelist (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/628235 (https://phabricator.wikimedia.org/T262499) (owner: Jenniferwang)
[23:51:40] (PS1) Jenniferwang: Add SpecialInvestigate schema to EventLogging whitelist https://phabricator.wikimedia.org/T262496 [analytics/refinery] - https://gerrit.wikimedia.org/r/628237