[00:01:07] Analytics-Kanban, Patch-For-Review: Update per-domain uniques fresh-sessions computation - https://phabricator.wikimedia.org/T167005 (Nuria)
[00:02:37] cthulchu: sorry, can you repeat it?
[00:02:49] nuria, how does the data get collected?
[00:03:00] cthulchu: what data?
[00:04:04] datasets mentioned here: https://wikitech.wikimedia.org/wiki/Analytics
[00:04:34] I was wondering about web analytics
[00:05:21] since the system name for accessing it is Piwik, I presume it's Piwik that is used for web-tracking
[00:05:37] I could actually check it on the front-end, I guess
[00:05:47] I only checked if it was GA and it doesn't look like GA
[00:07:54] nuria: thanks! This is relevant. I am thinking about comparing readers who are likely to be reading an in a primary language in their country or not.
[00:08:20] cthulchu: no, sorry, it is quite abit more complicated piwik is not used at all on wikis
[00:08:36] So I want like a country <-> % language speakers <-> wikipedia language edition triple.
[00:09:20] cthulchu: the system used on the web is called eventlogging: https://www.mediawiki.org/wiki/Extension:EventLogging
[00:09:39] I found that the world fact book provides country <-> % language speakers , but it will take work to connect those languages to ISO prefixes.
[00:10:38] nuria, very interesting, many thanks!
[00:12:03] oh, so it's wiki's own solution. Very interesting.
[00:17:00] cthulchu: or if you were asking about the system used to count pageviews etc., look at https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest (in particular the slide deck linked there )
[00:18:31] it's interesting that pageviews are served separately
[00:18:44] I would expect them to be a part of event logging
[00:18:58] you know, in case you wanna switch to something more complex like socket
[00:57:29] HaeB: thanks, that's useful
[00:58:14] Amir1: and please edit in case you learn new things about the topic ;)
[01:03:54] Sure, now I see it says the class doesn't exist
[01:04:05] It seems I made the jar in a bad way
[01:19:21] nuria: for when you're around. How can I package the refinery because logs tell me org.wikimedia.analytics.refinery.job.ClickstreamBuilder class doesn't exist meaning the package is not built properly
[01:21:10] e.g. /mnt/hdfs/var/log/hadoop-yarn/apps/ladsgroup/logs/application_1542030691525_24955/analytics1048.eqiad.wmnet_8041
[01:23:21] Analytics: ReadingDepth schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209051 (Groceryheist) A handful of thoughts: The current schema has page_title, but not page_id. We were able to recover page_id from this using the page_title and the timestamps. Isn't this...
[01:23:31] Analytics, Analytics-Data-Quality, Datasets-Webstatscollector, Language-Team: Add alarms for high volume of views to pages with replacement characters - https://phabricator.wikimedia.org/T117945 (Tbayer) See also https://meta.wikimedia.org/wiki/Talk:Pageviews_Analysis#Topviews_bug_report:_%EF%BF%...
[01:47:13] Analytics: ReadingDepth schema is whitelisting both session ids and page ids - https://phabricator.wikimedia.org/T209051 (Tbayer) >>! In T209051#4760668, @Groceryheist wrote: > A handful of thoughts: > > The current schema has page_title, but not page_id. We were able to recover page_id from this using th...
[07:20:47] 16/18 hadoop workers ready!
[08:04:55] Congrat elukey :)
[10:16:30] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install an-worker10[78-95].eqiad.wmnet - https://phabricator.wikimedia.org/T207192 (ops-monitoring-bot) Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts: ` ['an-worker1094.eqiad.wmnet'...
[10:17:10] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install an-worker10[78-95].eqiad.wmnet - https://phabricator.wikimedia.org/T207192 (elukey) >>! In T207192#4757504, @elukey wrote: > @Cmjohnson the Debian OS install is in progress, but I think that an-worker109[45] h...
[10:17:18] starting the reimage of the last two nodes
[10:28:08] joal: the decom process will be interesting: 1028 -> 1042 to decom, that contain two journal nodes
[10:28:23] elukey: greaaaaaaat :/
[10:28:56] elukey: I'm assuming we're gonna get 2 new journal-nodes in the new pack
[10:29:05] yeo
[10:30:01] I think that we should simply schedule some downtime with HDFS in safe mode
[10:30:10] move the journal nodes
[10:30:24] and then re-enable regular hdfs writes
[10:32:11] OR! we could couple it with cdh 5.15.1 upgrade
[10:32:22] (the fun part is that we tested 5.15.1 in labs :P)
[10:32:29] (so we are ready to upgrade)
[10:32:38] elukey: +1 for upgrade + journal nodes
[10:49:57] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install an-worker10[78-95].eqiad.wmnet - https://phabricator.wikimedia.org/T207192 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['an-worker1094.eqiad.wmnet', 'an-worker1095.eqiad.wmnet'] ` and were **ALL**...
[10:54:15] Analytics, Analytics-Kanban: Decommission old Hadoop worker nodes and add newer ones - https://phabricator.wikimedia.org/T209929 (elukey) p:Triage>Normal
[10:54:24] created --^
[11:01:16] all right all (new) workers ready
[11:01:31] * elukey afk for a bit
[11:28:37] Analytics, Operations, ops-eqiad, Patch-For-Review, User-Elukey: rack/setup/install an-worker10[78-95].eqiad.wmnet - https://phabricator.wikimedia.org/T207192 (elukey) Open>Resolved
[11:44:34] !log re-run pageview-hourly-wf-2018-11-20-9
[11:44:36] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:44:44] didn't really get why it failed
[11:45:00] it seemed a HDFS temporary glitch, the re-run went fine
[11:45:18] Mwarf - Didn't even noticed :/ Thanks elukey :)
[11:49:18] * elukey lunch!
[12:55:42] Analytics, Analytics-EventLogging: Resurrect eventlogging_EventError logging to in logstash - https://phabricator.wikimedia.org/T205437 (fgiunchedi) Likely adding a new kafka input for logstash, though how much data are we talking about ?
[12:59:57] Analytics, Analytics-Kanban, User-Elukey: Decommission old Hadoop worker nodes and add newer ones - https://phabricator.wikimedia.org/T209929 (elukey)
[13:08:37] joal: https://gerrit.wikimedia.org/r/#/c/operations/puppet/cdh/+/474907/
[13:08:48] if you are ok I'd merge this
[13:09:08] and restart hive-server2 at first sign of inactivity in Yarn :)
[13:09:48] many thanks elukey :)
[13:10:51] merging!
:)
[13:16:49] joal: waiting for a spot for restart
[13:17:55] elukey: probably just before next hour
[13:18:23] yep probably :(
[13:43:10] Analytics, Analytics-Kanban, Patch-For-Review, User-Elukey: JVM pauses cause Yarn master to failover - https://phabricator.wikimedia.org/T206943 (elukey) p:High>Low
[13:59:53] Analytics, Analytics-Kanban, Patch-For-Review: Move users from stat1005 to stat1007 - https://phabricator.wikimedia.org/T205846 (elukey) Update: we are still waiting for the last people moving out of stat1005, so access is still allowed but we are not doing any more rsyncs of home dirs to stat1007 (t...
[14:02:09] !log restart hive-server2 to pick up new settings - T209536
[14:02:11] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:02:12] T209536: Hive query fails with local join - https://phabricator.wikimedia.org/T209536
[14:02:19] \o/ !
[14:17:19] Analytics, Analytics-EventLogging: Resurrect eventlogging_EventError logging to in logstash - https://phabricator.wikimedia.org/T205437 (elukey) >>! In T205437#4761768, @fgiunchedi wrote: > Likely adding a new kafka input for logstash, though how much data are we talking about ? Not much! https://grafan...
[14:25:20] elukey: Heya - would ou have a minute for me?
[14:25:48] joal: 5 euros
[14:25:51] :P
[14:25:56] hm ... Later ?
[14:25:59] (on my way to batcave)
[14:26:00] :-D
[14:48:52] hey team :]
[15:03:00] o/
[15:03:46] mforns: I have a qq - do you have an idea when the EL2Druid maps support will be available for auto-ingestion?
(just organizing the banner impression work)
[15:04:12] elukey, I'd like to work on this asap
[15:04:19] <3
[15:04:24] you rock
[15:04:36] let me know if I can help
[15:04:50] elukey, not sure if there's a higher priority for me now, but I believe not (except from fixing EL sanitization)
[15:07:39] mforns: let's see what the fundraising team tells us about the druid ingestion spec
[15:08:18] elukey, about the normalized metric right?
[15:10:09] mforns: and also about the geocoded data, maybe they'll only need the value that event itself bring?
[15:10:42] (there is a country field, that is in event_ itself, not geocoded data)
[15:10:49] elukey, I see, then should I not start the map type ingestion until we have an answer?
[15:18:23] mforns: yeah let's wait
[15:18:33] ok
[15:18:55] elukey, it should be less than one week of work
[15:21:25] Analytics, Analytics-EventLogging: Resurrect eventlogging_EventError logging to in logstash - https://phabricator.wikimedia.org/T205437 (fgiunchedi) Seems small enough, I see peaks of 200 events/s a few weeks back, I guess that was validation errors you mentioned @elukey ! Anyways yeah I think we can do...
[15:39:54] Analytics-EventLogging, Analytics-Kanban: [EventLogging Sanitization] Enable older-than-90-day purging of unsanitized EL database (event) in Hive - https://phabricator.wikimedia.org/T209503 (mforns) @Tbayer @Neil_P._Quinn_WMF > On the other hand, delaying the sanitization significantly sounds like it w...
[15:43:53] rigt
[15:43:55] *right
[15:44:08] im going to take another stab at making this eventlogging service thing work for me! :D
[15:44:49] addshore: did you have a chat about MEP with Dan or Andrew?
[15:45:23] nope :(
[15:45:25] :D
[15:45:38] * addshore is starting to remember our other conversation now
[15:45:51] * addshore goes to find the link to MEP
[15:47:12] i might end up just writing something for my own usecase for now until MEP develops further
[15:49:34] I might use the EventBus extension, put just point $wgEventServiceUrl to my own service
[15:49:37] hey all, reminder i'm off today; just responding to emails and going to some meetings later, will not be at standup.
[15:49:52] if i'd be the only one missing from retro and you want to have it, I can come!
[15:49:54] just let me know
[15:53:14] maybe I'll ignore EventBus all together...
[15:53:43] addshore: i just joined the room, and now am curious...
[15:53:44] :)
[15:53:46] whatcha doing?
[15:53:49] ;)
[15:53:55] im not working, thats what im doing ;)
[15:54:29] I tried setting up the mw -> event bus -> event logging -> kafka -> wdqs updater ... pipeline a few weeks ago, and ran into issues,
[15:55:07] hmmm
[15:55:07] its for a local install / project, nothing WMF, but i was hoping it would TM (just work)
[15:55:13] ah
[15:55:24] im now thinking of different ways to tackle the problem, essentially i just need a queue / stream of page creations
[15:55:43] mw -> kafka is kinda hard, since PHP kafka clients are so bad
[15:55:46] just spoted EventRelayer and EventRelayerKafka in mw core
[15:56:01] hm? i think those might be the ones that we want to deprecate then :p but you can try!
[15:56:12] there are only 2 event streams in MW that produce to kafka directly, and we want to remove them
[15:56:14] yeh, my next thought is I probably don't need kafka, or for this to have 100% of all events, realisticaly, 99.99% would be fine
[15:56:27] aaah, if they are going then I wont use them! :D
[15:56:47] what was your problem setting up eventbus?
[15:57:32] well, i still have it setup, let me give it a go
[15:57:38] I think I might just not know what the config should be
[15:57:47] wfLoadExtension( 'EventBus' );
[15:57:51] $wgEventServiceUrl = 'http://eventlogging:8085/v1/events';
[15:58:04] and afaik that endpoint gets hit when pages are created, but nothing gets into kafka
[15:58:05] eventlogging:8085 ?
[15:58:11] does eventlogging resolve to your service?
[15:58:15] yup ;)
[15:58:25] annd, you have eventlogging-service-eventbus running producing to kafka?
[15:58:31] (are you using mw-vagrant? no right?)
[15:58:53] https://www.irccloud.com/pastebin/JdyOOXKc/
[15:59:04] nope, my own docker thing, vagrant wont fit the eventual usecase
[15:59:09] ah cool
[15:59:12] yeah that won't go to kafka
[15:59:14] but to stdout
[15:59:18] wait
[15:59:20] i overrode that
[15:59:32] well, the last thing I tried was this....
[15:59:32] command: ["stdout://", "mysql://eventuser:eventuser-pass@eventlogging-sql/events"]
[15:59:40] ah, that will go to stdout and mysql
[15:59:41] not kafka
[15:59:51] yup, but i couldnt make it actually go to mysql either
[16:00:06] i tried ith kafka before, dont have the exact comment i was running, but it will be in git history somewhere
[16:00:16] let me spin this up and remember exactly what happened
[16:00:43] your mysql thing shoudl work though
[16:00:46] if you just want events in mysql
[16:00:56] and you don't care about kafka stuff
[16:01:01] then that will be a more directly pipeline
[16:01:06] yeh, i dont really care where they go to
[16:01:15] you could make them go to a file if you don't care
[16:01:19] file:///path/to/file
[16:01:19] i was going for kafka before hand as i wanted to use the wdqs kafkapoller
[16:01:30] but im going to have to rewrite the updater anyway, so i can make it read from sql or whatever
[16:01:31] i think...
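(Editor's aside.) The output URIs quoted above name their destination in the URI scheme, which is why a command listing only `stdout://` and `mysql://...` never reaches Kafka. The following is a minimal sketch using Python's standard `urllib.parse`, not the eventlogging service's own code; the helper name `describe` is hypothetical and the three URIs are the ones quoted in the conversation.

```python
from urllib.parse import urlsplit

# Output URIs quoted in the conversation above.
OUTPUT_URIS = [
    "stdout://",
    "mysql://eventuser:eventuser-pass@eventlogging-sql/events",
    "file:///path/to/file",
]


def describe(uri):
    """Return (scheme, host, path) for an output URI (hypothetical helper)."""
    parts = urlsplit(uri)
    return parts.scheme, parts.hostname, parts.path


if __name__ == "__main__":
    for uri in OUTPUT_URIS:
        scheme, host, path = describe(uri)
        # The scheme says where events go; none of these is a Kafka URI.
        print(f"{scheme:7} host={host!r} path={path!r}")
```

Note that in `file:///path/to/file` the host part is empty and the path keeps its leading slash, so the third slash is part of the absolute path.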
[16:01:38] file wont work :)
[16:01:46] k :)
[16:02:00] a-team: On my way!, 3 min
[16:02:07] ping ottomata , milimetric , fdans
[16:02:55] nuria: going to skip standup today, just did meetings and worked on stream intake yesteerday.. if you need me for retro i can make it
[16:03:07] i'm trying to get soem little emails etc. done today before I sign off
[16:04:22] so I get 2018-11-20 16:03:40,698 [1] (MainThread) tornado.access [INFO] 201 POST /v1/events (172.25.0.10) 1.03ms
[16:04:37] also on the service startup it confirmed 2018-11-20 16:01:55,598 [1] (MainThread) root [INFO] Publishing valid JSON events to mysql://eventuser:eventuser-pass@eventlogging-sql/events
[16:04:53] but there are no tables there after triggering an event
[16:05:39] Analytics, Analytics-Kanban, Patch-For-Review: Update log_namespace, page_namespace from bigint to int - https://phabricator.wikimedia.org/T209179 (mforns) a:mforns>JAllemandou
[16:06:29] hm
[16:06:50] also, i dont really see the event itself in stdout
[16:06:59] only the fact that the /v1/events endpoint was hit
[16:07:12] hm addshore let's try file and verify that way
[16:07:18] was gonna say about mysql
[16:07:23] the default batch size is 3000, and batch time 300
[16:07:28] aaaaaaaaaah
[16:07:32] so you'd have to either emit 3000 events or wait 5 minutes
[16:07:34] but you can adjust
[16:07:36] just add
[16:07:38] that might be the thing im missing :p
[16:07:43] ?batch_size=1 to the mysql URI
[16:07:45] nuria: sorry, gas guy is here, will try to get to the end of stand up
[16:08:08] ottomata: and whats the wait time in the url? is there one?
[16:08:15] ya but if you don't see the events in stdout that is strange
[16:08:23] where does the service stdout go in your case tho!
[16:08:29] might be good to try to file:// just to verify
[16:08:33] https://www.irccloud.com/pastebin/CbIEmC6w/
[16:08:42] yeh, let me try file quickly
[16:09:09] file://tmp/output.something should work?
:P
[16:09:32] missing a /
[16:11:03] ya file:///tmp ... ya
[16:11:12] ottomata: yup, they appear in the file
[16:11:22] ok that's good
[16:11:31] maybe stdout:// has other issues
[16:11:38] that could be my setup
[16:13:20] batch_time is in seconds ottomata ?
[16:13:22] addshore: with mysql and batch_size=1
[16:13:23] anythign?
[16:13:26] yes
[16:14:56] ottomata: YES!
[16:14:57] YAY!
[16:14:59] :D
[16:15:07] this is great
[16:15:14] yeehaw!
[16:15:23] eventlogging_1 | 2018-11-20 16:14:36,956 [1] (MainThread) Log [INFO] Inserted 2 mediawiki_revision_create_3 events in 0.036758 seconds
[16:15:26] and it appears in the table
[16:15:27] win
[16:15:38] Analytics, Maps, Wikidata, Wikidata-Query-Service, Discovery-Search (Current work): Review Grafana Dashboards for search / wdqs / maps - https://phabricator.wikimedia.org/T209584 (Mathew.onipe)
[16:15:47] <3
[16:17:28] im glad I caught you on irc before rewriting the world
[16:24:18] great!
[16:24:30] yaaa always come and ask we wanna help :)
[16:27:33] ottomata: 1 other question, this eventlogging service, i can run mutliple instances right? and they won't break each other?
[16:27:51] what about if I run 1 instances that receive the same event? will they break on inserting into sql or?
[16:30:18] Hallo
[16:30:27] milimetric or anyone else who works with Dashiki:
[16:30:33] I created this JSON: https://meta.wikimedia.org/wiki/Config:Dashiki:CXMachineTranslationEngines
[16:30:46] a-team: hi aharoni
[16:30:46] Can anyone please review it and deploy it if it's correct?
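(Editor's aside.) The missing rows above came down to batching: with the defaults mentioned in the chat (batch size 3000, batch time 300 seconds) nothing is written until one threshold is hit, and `?batch_size=1` makes every event flush at once. Below is a rough sketch of that behaviour under stated assumptions. `batch_settings` and `BatchBuffer` are hypothetical names written for illustration, not the eventlogging implementation, and this buffer only checks the time threshold when a new event arrives.

```python
import time
from urllib.parse import parse_qs, urlsplit

DEFAULT_BATCH_SIZE = 3000  # events, as stated in the conversation
DEFAULT_BATCH_TIME = 300   # seconds, as stated in the conversation


def batch_settings(uri):
    """Read batch_size / batch_time overrides from the URI query string."""
    query = parse_qs(urlsplit(uri).query)
    size = int(query.get("batch_size", [DEFAULT_BATCH_SIZE])[0])
    seconds = int(query.get("batch_time", [DEFAULT_BATCH_TIME])[0])
    return size, seconds


class BatchBuffer:
    """Hypothetical buffer that flushes when either threshold is reached."""

    def __init__(self, batch_size, batch_time, clock=time.monotonic):
        self.batch_size = batch_size
        self.batch_time = batch_time
        self.clock = clock
        self.events = []
        self.started = None

    def add(self, event):
        """Queue an event; return the flushed batch, or None if still buffering."""
        if not self.events:
            self.started = self.clock()
        self.events.append(event)
        full = len(self.events) >= self.batch_size
        stale = self.clock() - self.started >= self.batch_time
        if full or stale:
            batch, self.events = self.events, []
            return batch
        return None
```

With the default settings a single test event stays in the buffer, which matches the empty database seen earlier; with `batch_size=1` the same event is returned as a batch immediately.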
[16:32:16] addshore: if you run multiple instances (on different ports) and they write to the same mysql database
[16:32:17] they won't break
[16:32:24] amazing
[16:32:25] but they will fail the insert because of duplicate key error
[16:32:31] okay, thats fine :)
[16:32:34] aharoni: you have other language-team related tabs-layout dashboards, and they all have one tab
[16:32:35] you can set replace=True
[16:32:37] in the uri
[16:32:42] if you want the insert to succeed and replace the duplicate
[16:32:47] aharoni: maybe combine them into one dashboard with multiple tabs?
[16:33:00] Mmm... let me see...
[16:33:07] aharoni: because each one means more config and overhead, and they're all related, right?
[16:33:19] aharoni: and you can link people directly to a specific tab and even a specific graph
[16:33:28] Hmm, probably OK, but let me ask the team
[16:34:11] aharoni: well, I'll gently suggest you definitely consolidate here, just like I mentioned with the jobs. The current approach is not scalable for us, if you want 10 graphs I don't want to make 10 dashboards with 10 separate jobs for each one
[16:36:29] Analytics, Anti-Harassment (AHT Sprint 33): Tracking blocks: Log direction actions - https://phabricator.wikimedia.org/T209969 (dbarratt) p:Triage>Low
[16:37:52] milimetric: so basically... if I just edit https://meta.wikimedia.org/wiki/Config:Dashiki:Interlanguage , which is already deployed, and add the new chart there, it will just work right awya?
[16:37:54] away?
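(Editor's aside.) The duplicate-key behaviour described above, where a second instance's insert of the same event fails unless replacement is requested, can be illustrated with a small stand-in. This sketch uses Python's built-in `sqlite3` rather than MySQL, and the table layout, the `uuid` key column and the `insert` helper are all hypothetical; it only shows the general insert versus replace semantics, not what the eventlogging MySQL writer does internally.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the shared MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (uuid TEXT PRIMARY KEY, payload TEXT)")


def insert(conn, uuid, payload, replace=False):
    """Insert one row; return True on success, False on a duplicate-key error."""
    verb = "INSERT OR REPLACE" if replace else "INSERT"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"{verb} INTO events (uuid, payload) VALUES (?, ?)",
                (uuid, payload),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def payload_of(conn, uuid):
    """Return the stored payload for a key, or None if absent."""
    row = conn.execute(
        "SELECT payload FROM events WHERE uuid = ?", (uuid,)
    ).fetchone()
    return row[0] if row else None
```

A plain second insert of the same key is rejected and leaves the first row in place, while the replace variant succeeds and overwrites it.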
[16:38:10] Analytics, Anti-Harassment: Tracking blocks: Log direction actions - https://phabricator.wikimedia.org/T209969 (dbarratt)
[16:42:32] aharoni: yes, definitely
[16:43:11] aharoni: and if you want to rework the dashboard itself, to call it something more generic, feel free, I can change what address it's deployed to and so on
[16:46:39] Analytics, Anti-Harassment: Tracking blocks: Log direction actions - https://phabricator.wikimedia.org/T209969 (TBolliger) Not needed for our Nov. 2018 work.
[16:49:13] milimetric: heh
[16:49:15] that was easy
[16:49:19] how didn't I think about it :)
[16:49:28] https://language-reportcard.wmflabs.org/cx2/#mt-engines
[16:49:35] aharoni: great, that's what I'm here for, to gently take the dangerous toys away :)
[16:49:40] great
[16:49:41] So this one is done
[16:49:54] aharoni: I can delete the other page if you like, let me know
[16:50:08] I'll talk to the team and we'll probably think of a better naming scheme for them, but no more requests for now :)
[16:50:09] thanks
[16:50:40] no, please don't delete it immediately, I haven't finished consolidating them all yet
[16:50:45] but some time soon
[16:50:49] I'll ping you
[17:02:06] Analytics, Analytics-Kanban, DBA, Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (Bawolff) >>! In T209031#4740258, @Anomie wrote: > Note the `actor` view will likely turn out to have similar issues. > > As s...
[17:23:56] Analytics, Analytics-Kanban, DBA, Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (Anomie) @Bawolff: You quoted by comment, but I can't see how your reply is relevant. How would fetching the minimum and maximu...
[17:39:36] mforns or fdans: can you go tomorrow to SoS filling in for joal?
Going forward the best i can think of is joal taking charge of finding a substitute every week
[17:39:43] nuria, sure
[17:41:13] I can go, unless fdans prefers to go, or we can flip a coin
[17:41:52] ok i’ll go next time mforns?
[17:42:05] cool!
[18:17:57] Analytics, Analytics-Kanban, User-Elukey: Return to real time banner impressions in Druid - https://phabricator.wikimedia.org/T203669 (AndyRussG) >>! In T203669#4731161, @elukey wrote: > @AndyRussG @Jseddon Hi! So I have something to show to you in: https://turnilo.wikimedia.org/#event_centralnoticei...
[18:18:48] a-team: moved standup next monday to acomodate meeting with DBA and others, hard to schedule as some do not share their calendars, might need to be moved around.
[18:20:28] :+1:
[18:20:44] it lied, it said that would turn into a thumbs up. #emojifail
[18:21:39] hehe
[18:29:27] (CR) Framawiki: Add "/health/summary/v1/" API endpoint (2 comments) [analytics/quarry/web] - https://gerrit.wikimedia.org/r/474532 (https://phabricator.wikimedia.org/T205151) (owner: Rafidaslam)
[18:46:23] * elukey off!
[19:17:22] Hi guys, need to get a superset user created
[19:17:38] for Jason Linehan (me)
[19:21:02] Analytics, Analytics-Kanban: Update EventLogging kafkacat examples to use jumbo - https://phabricator.wikimedia.org/T209635 (Nuria)
[19:21:20] Analytics, Analytics-Kanban: Update EventLogging kafkacat examples to use jumbo - https://phabricator.wikimedia.org/T209635 (Nuria) Thanks @nettrom_WMF
[19:22:20] Analytics-Kanban, Patch-For-Review: Rename insertion_ts to insertion_dt in pageview_whitelist tabler (convention) - https://phabricator.wikimedia.org/T208237 (Nuria) Open>Resolved
[19:25:24] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Wikistats Bug: "Anonymous Editor" is a broken link - https://phabricator.wikimedia.org/T206968 (Nuria) Open>Resolved
[19:27:33] mforns: on SoS - My turn is not tomorrow butr next week - Tomorrow is Dan's turn
[19:27:46] mforns: Thanks a lot for covering for me :)
[19:27:59] Hi hip
[19:28:51] hip: Can you please tell me your ldap username (shell account)?
[19:29:06] joal: yup, it's 'jdl'
[19:30:55] hip: Can you please test with that username and your LDAP password? https://superset.wikimedia.org
[19:31:07] works
[19:31:20] joal: thank you!
[19:31:34] no prob hip :) Enjoy dashboarding ;)
[19:40:14] Gone for tonight team - tomorrow kids day, will be there for standup (if everything goes well)
[19:44:23] Analytics, Analytics-Kanban, Patch-For-Review: eventlogging logs taking a huge amount of space on eventlog1002 and stat1005 - https://phabricator.wikimedia.org/T206542 (Nuria) @elukey: confirming that we have set up deletion for files like hdfs dfs -text /wmf/data/raw/eventlogging_client_side/eventlo...
[19:52:12] joal, ok!
[20:04:34] Analytics-Kanban, Patch-For-Review: Fix refinery-source jenkins build/release jobs - https://phabricator.wikimedia.org/T208377 (Nuria) Open>Resolved
[20:06:40] Analytics, Analytics-EventLogging, Analytics-Kanban, MW-1.32-notes (WMF-deploy-2018-10-16 (1.32.0-wmf.26)), and 3 others: Spin out a tiny EventLogging RL module for lightweight logging - https://phabricator.wikimedia.org/T187207 (Nuria) Open>Resolved
[20:59:33] Analytics, Analytics-Kanban, New-Readers, Patch-For-Review: Instrument the landing page - https://phabricator.wikimedia.org/T202592 (Isaac) Checking in here on conversions from the landing page. I ran a quick query to see how things were going from yesterday. Found very few pageviews to es-wiki w...
[21:09:04] Analytics, Analytics-Kanban, New-Readers, Patch-For-Review: Instrument the landing page - https://phabricator.wikimedia.org/T202592 (Nuria) Traffic is very small to the site about 300 people per day max.
[21:23:22] Analytics, Analytics-Kanban, New-Readers, Patch-For-Review: Instrument the landing page - https://phabricator.wikimedia.org/T202592 (Isaac) thanks @Nuria! so that puts it at about 4% from this snapshot that are clicking through the landing page to Wikipedia (mixture of the main page and the Músic...
[21:49:10] Analytics, Product-Analytics: Event counts from Mysql and Hive don't match - https://phabricator.wikimedia.org/T210006 (chelsyx)
[22:09:58] Analytics, Analytics-Kanban, DBA, Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (Milimetric) I think @Bawolff was referring to the automatic query that Sqoop generates against the table you point it at, usua...
[22:10:04] Analytics, Analytics-Kanban, DBA, Data-Services: Not able to scoop comment table in labs for mediawiki reconstruction process - https://phabricator.wikimedia.org/T209031 (Bawolff) I was assuming bases on this comment >>! In T209031#4732101, @Krenair wrote: > By the way, the query discussed on IR...
[22:10:08] Analytics, EventBus, Core Platform Team Backlog (Later), Services (later): EventBus extension started emitting rev_count as a string - https://phabricator.wikimedia.org/T210013 (Pchelolo)
[22:53:03] Analytics: Allow access to Data Lake/Hive for Niharika - https://phabricator.wikimedia.org/T210022 (Niharika)
[22:58:16] Analytics, Operations, SRE-Access-Requests: Allow access to Data Lake/Hive for Niharika - https://phabricator.wikimedia.org/T210022 (Nuria)
[23:01:45] Is anyone around in one hour to assist with verifying some eventlogging data in testwiki, as part of a SWAT? It shouldn't take more than 5-10 minutes
[23:08:45] kostajh: i can help you , data can be verified before being deployed in beta labs if it is up (they were doing some maintenance yesterday) https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/TestingOnBetaCluster
[23:09:34] nuria: thanks, the problem is we need it verified in the production stack as we are swatting patches that fix bugs particular to production
[23:09:58] kostajh: production branch must be deployed to beta before no?
[23:10:10] kostajh: isn't that teh normal flow of things?
[23:10:12] *the
[23:11:07] kostajh: so we can verify there right?
[23:11:19] kostajh: asking as i am not sure how things get deployed
[23:12:14] nuria: yes the branch is in beta now. I have access to verify events there. But bugs we ran into only occurred in production.
So the patch we are backporting to wmf4 will trigger events on testwiki when we swat it
[23:12:18] And I need to verify those
[23:12:40] I have to run now but back in 30-40 minutes
[23:16:30] Analytics, EventBus, Core Platform Team Backlog (Later), Patch-For-Review, Services (doing): EventBus extension started emitting rev_count as a string - https://phabricator.wikimedia.org/T210013 (Pchelolo) > The argument is for some reason not documented Fixed in https://www.mediawiki.org/w/...
[23:25:57] Analytics, EventBus, Services (later): Create alert on EventBus 400error rate - https://phabricator.wikimedia.org/T210031 (Pchelolo)
[23:52:13] (PS4) Rafidaslam: Add "/.health/summary/v1/" API endpoint [analytics/quarry/web] - https://gerrit.wikimedia.org/r/474532 (https://phabricator.wikimedia.org/T205151)
[23:53:18] (PS5) Rafidaslam: Add "/.health/summary/v1/" API endpoint [analytics/quarry/web] - https://gerrit.wikimedia.org/r/474532 (https://phabricator.wikimedia.org/T205151)
[23:53:19] nuria: beta cluster seems to be done now in any case https://en.wikipedia.beta.wmflabs.org/
[23:53:23] *down
[23:53:48] kostajh: i cannot help you there, maybe cloud channel?
[23:54:58] nuria: yeah, it's ok for now. what I'd like to do is, once RoanKattouw swats https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/474997, I'll generate some events on testwiki.
If you could then pull the EditorJourney events for the user ID I send to you, that would be wonderful
[23:55:30] kostajh: teh event logging pipeline processes thousands of events per second
[23:55:51] kostajh: as such is not real time but rather events are processed in batch every hour
[23:56:15] kostajh: that is why we encourage to test in beta which is real time
[23:56:54] kostajh: i can help you verify events but not that they validate until probably over an hour after they have been emitted
[23:57:14] hmm, Morten was able to pull events for me when we swatted a fix yesterday, but maybe we just got lucky
[23:58:03] (and I'd ask him but he's on a plane right now)
[23:58:07] kostajh: i can pull events from kafka but that does not ensure that they validate
[23:58:14] kostajh: what was the original issue?
[23:58:28] kostajh: that you are trying to fix?
[23:59:25] it's kind of a long story, but the short version is that the usage of deferredupdates for some of the logging has resulted in 1) a problem with a cache value we need to retrieve not being available and 2) events being out of order chronologically
[23:59:48] nuria ^