[00:07:56] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 2 others: MediaWiki\Storage\RevisionAccessException from line 217 of /srv/mediawiki/php-master/includes/Storage/RevisionStore.php Could not determine title for page ID 7... - https://phabricator.wikimedia.org/T183505#3855830
[00:26:49] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 3 others: MediaWiki\Storage\RevisionAccessException from line 217 of /srv/mediawiki/php-master/includes/Storage/RevisionStore.php Could not determine title for page ID 7... - https://phabricator.wikimedia.org/T183505#3855842
[07:45:20] Analytics, Analytics-EventLogging: EventLogging has invalid @covers tags - https://phabricator.wikimedia.org/T183528#3856142 (Legoktm) p:Triage>Lowest
[08:05:06] Analytics, Analytics-EventLogging: EventLogging has invalid @covers tags - https://phabricator.wikimedia.org/T183528#3856216 (Legoktm)
[10:46:30] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 4 others: MediaWiki\Storage\RevisionAccessException from line 217 of /srv/mediawiki/php-master/includes/Storage/RevisionStore.php Could not determine title for page ID 7... - https://phabricator.wikimedia.org/T183505#3856382
[12:55:18] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 4 others: MediaWiki\Storage\RevisionAccessException from line 217 of /srv/mediawiki/php-master/includes/Storage/RevisionStore.php Could not determine title for page ID 7... - https://phabricator.wikimedia.org/T183505#3856574
[13:06:10] Analytics-Cluster, Analytics-Kanban, Operations, Traffic, User-Elukey: TLS security review of the Kafka stack - https://phabricator.wikimedia.org/T182993#3856606 (BBlack) https://en.wikipedia.org/wiki/There_are_known_knowns * Known Knowns: * Restrict ciphersuite selection to a strong FS+AE...
[13:47:05] Analytics, Operations, Research, Traffic, and 4 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#3773297 (BBlack) Open Q's from my POV: 1) Flipping these Edges/Safaris to `origin` is going to deny us information on internal referrer...
[13:49:23] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 5 others: RevisionStore.php Could not determine title for page ID X and revision ID Y in EventBus::createRevisionAttrs - https://phabricator.wikimedia.org/T183505#3856704 (Addshore)
[14:21:12] Analytics-Data-Quality, Analytics-Kanban, Datasets-Webstatscollector, Language-Team, and 5 others: Investigate anomalous views to pages with replacement characters - https://phabricator.wikimedia.org/T117945#3856775 (mforns) Hi all, I've been looking into this for a while. It seems that this iss...
[14:26:22] Analytics-Data-Quality, Analytics-Kanban, Datasets-Webstatscollector, Language-Team, and 5 others: Investigate anomalous views to pages with replacement characters - https://phabricator.wikimedia.org/T117945#3856785 (mforns) @Tbayer @Nuria I'd say it will be difficult to troubleshoot this issu...
[14:30:06] ...gust of wind...
[14:50:30] hi
[14:59:05] hi milimetric :]
[15:45:00] is it just you and me mforns?
[15:45:07] dunno...
[15:45:27] I have to take off today anyway :) Steph had to run out and I'm alone with the baby
[15:45:35] Hi spark folks (joal, ottomata, ??). Anybody available for a debugging question.
[15:45:51] Shilad: they're both on break, but mforns and I can give it our best shot
[15:46:13] np milimetric
[15:46:36] I just saw your message above. I think my WMF work schedule is basically the opposite of most people's regular WMF schedules!
[15:47:06] I'm trying to debug a Spark job, and I wanted to get at some of the logs through Hue.
[15:47:24] They are located at URLs like http://analytics1035.eqiad.wmnet:8042/
[15:47:53] But DNS fails for that host. Are they actually publicly accessible. If so, where can I get DNS entries for them?
[15:48:26] Shilad: right, yes, those URLs are not publicly accessible, there are some ways to tunnel to them that I can never remember, but first
[15:48:35] Shilad: have you looked in yarn.wikimedia.org?
[15:48:50] Yes!
[15:49:00] ok, and that's where you're getting the URLs
[15:49:02] Is that a better source for the log info?
[15:49:13] Shilad: that's where I usually go for spark stuff, yea
[15:49:33] Shilad, I think you can use oozie or yarn in the command line to see the logs like oozie job -info, then get the application id and then use yarn log in the command line?
[15:50:10] wait... but there was a pretty site somewhere for the spark logs, mforns do you know?
[15:50:17] They have the same problem. I can definitely use command line.
[15:50:26] yeah, they all link to those URLs Shilad mentioned
[15:50:33] cmdline doesn't seem to have everything, though
[15:50:42] I'm a little confused by this.
[15:50:46] milimetric, never saw spark logs on website...
[15:50:53] aha, Shilad how about this: https://yarn.wikimedia.org/proxy/application_1512469367986_50103/
[15:51:57] and then you can click on the various tasks
[15:52:38] WOW!
[15:53:07] I didn't know about that site. Super helpful!
[15:53:28] you can do: yarn logs -applicationId
[15:53:30] this was the "pretty" site i remembered, but is this all you need or more in-depth
[15:55:14] Once the job is done, it looks like that site goes away, but that is fine.
[15:57:19] oh, Shilad the yarn logs, funny enough, are only there after the job is done, because it has to collect them from all the data nodes
[15:57:37] but when the job's finished, mforns's command should get you all the logs
[15:58:46] Thanks! Scanning through the logs now.
SO MANY LOGS
[16:01:14] sorry a-team 2Fauth
[16:30:42] o/
[16:30:50] Are we doing the large scale data analysis meeting today?
[16:30:54] I don't have agenda items
[16:31:04] And I thought we consolidated that meeting away.
[16:31:14] ottomata[m], ^
[16:48:01] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 5 others: RevisionStore.php Could not determine title for page ID X and revision ID Y in EventBus::createRevisionAttrs - https://phabricator.wikimedia.org/T183505#3855484 (mwjames) > [mediawiki/core@master] [MCR] Add...
[16:56:51] Analytics, EventBus, MediaWiki-General-or-Unknown, Multi-Content-Revisions, and 5 others: RevisionStore.php Could not determine title for page ID X and revision ID Y in EventBus::createRevisionAttrs - https://phabricator.wikimedia.org/T183505#3857378 (Addshore) @mwjames Yup, you are absolutely ri...
[17:12:27] Analytics, Analytics-EventLogging: EventLogging has invalid @covers tags - https://phabricator.wikimedia.org/T183528#3856142 (Umherirrender) There is a php file named "JsonSchema.php" with multi classes. Would adding ".php" fix this? (https://phabricator.wikimedia.org/diffusion/EEVL/browse/master/tests/p...
[17:43:27] halfak: sorry we're mostly on break, I should've pinged you
[17:43:36] I'm not sure why that meeting is on the calendar, we did consolidate
[17:43:55] Ok no worries at all. :)
[17:55:47] millimetric and mforns: Still trying to debug a spark issue. Can I run something by you?
[17:57:51] yeah Shilad
[17:58:51] (btw Shilad you're stuck with me for now, Marcel's in India and it's like the middle of the night for him)
[17:59:06] Got it! I am trying to count num users who viewed each page, which should be easy: RDD[pageId, userHash] -> distinct -> groupByKey -> map(key, value.size)
[17:59:18] I am getting out of memory, but I can't imagine where this should happen.
[17:59:42] value.size is on an iterator of userHashes, so there should be no memory use there.
[17:59:51] I can't imagine where this is going wrong!
[17:59:52] Shilad: wait what's your input data?
[18:00:10] shilad.sessions hive table, which is sessionized page views and searches.
[18:01:18] and how much memory are you starting Spark with?
[18:01:46] I've been bumping it up incrementally to see if that fixes things, but it's now ridiculous: 24G for driver, 16 for executors.
[18:01:56] ok, yeah, that's quite a bit
[18:02:18] hm, ok, so logically that means however the RDD is partitioning the data it has one partition that's too big
[18:02:40] oh... or too many partitions
[18:02:47] Do you think the groupByKey is loading the values for a key into memory? Because the main page may kill that if so.
[18:02:52] how many unique pageId/userHash pairs are there?
[18:02:53] I can't imagine why it would do that, though.
[18:02:57] that seems like there would be a lot
[18:04:05] Getting an estimate now...
[18:05:16] reading about map a bit
[18:06:48] 10 billionish
[18:07:03] So not small, but this is what Spark is supposed to do!
[18:07:07] right
[18:07:25] I'm very noobish at spark, I'd of course just do this in Hive and not think twice about it :)
[18:07:56] Hah! Maybe that's actually the answer, sadly. I actually can run the hive query fine to replicate the spark results.
[18:08:01] so it depends how the RDD is being partitioned, I think the partitions have to fit in memory
[18:08:03] It takes 5 minutes, but is okay.
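A note on the out-of-memory question above: Spark's RDD `groupByKey` has to hold all the values for a single key in memory, so a very hot key such as the main page can exhaust an executor even when only `value.size` is needed. Keeping a running total per key (in Spark, roughly `distinct` followed by a `map` to `(pageId, 1)` and `reduceByKey`) avoids building those per-key collections. The sketch below is a plain-Python stand-in rather than Spark code; the function names and the sample pairs are hypothetical. It only illustrates that both shapes give the same counts while holding different amounts of per-key state.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[int, str]  # (pageId, userHash), as in the RDD described in the chat


def count_by_grouping(pairs: Iterable[Pair]) -> Dict[int, int]:
    """groupByKey-style: gather every user hash for a page, then take the size.

    The per-page set grows with the number of distinct viewers, which is
    the memory risk for a very popular page.
    """
    groups: Dict[int, Set[str]] = defaultdict(set)
    for page_id, user_hash in pairs:
        groups[page_id].add(user_hash)
    return {page_id: len(users) for page_id, users in groups.items()}


def count_by_reducing(pairs: Iterable[Pair]) -> Dict[int, int]:
    """distinct + reduceByKey-style: dedupe pairs, then keep one integer per page.

    In this single-process sketch the dedup set is held in memory; in Spark
    the distinct step is spread across partitions instead.
    """
    counts: Dict[int, int] = defaultdict(int)
    for page_id, _user_hash in set(pairs):
        counts[page_id] += 1
    return dict(counts)


# Hypothetical sample: page 1 is viewed by users a, b, d (a twice), page 2 by c.
sample = [(1, "a"), (1, "b"), (1, "a"), (2, "c"), (1, "d")]
grouped = count_by_grouping(sample)
reduced = count_by_reducing(sample)
```

With the reducing shape, the state kept for each page is a single counter, no matter how many users viewed it.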
[18:08:22] yeah, maybe until the Spark master is back
[18:15:33] Shilad: I was trying to read this code by Joseph that deals with the same problem in mediawiki history reconstruction: https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/denormalized/DenormalizedRunner.scala#L292
[18:16:06] we were running out of memory in several places, and he made some custom partitioning and split the data, later zipping it back up
[18:16:19] but I couldn't figure it out enough to help you
[18:17:02] Huh. Interesting. I'll take a look. Thanks!
[18:19:25] and this one, it's not exactly applicable because the problem there was we had a graph of events that was too big, so he split up the graph into subgraphs, but the solution might be inspiring: https://github.com/wikimedia/analytics-refinery-source/blob/3bc0a6b42f2eef14d7ad90953e0e90a6ee151035/refinery-job/src/main/scala/org/wikimedia/analytics/refinery/job/mediawikihistory/utils/SubgraphPartitioner.scala#L171
[18:20:07] yeah, good luck Shilad. If you need any help with SQL, I've got your back :)
[18:31:48] Analytics-Cluster, Analytics-Kanban, Patch-For-Review: Support multi DC statsv - https://phabricator.wikimedia.org/T179093#3857667 (Krinkle) @Ottomata Cool! Also thanks for raising the concern with regards to replication/MirrorMaker. The replication delay and message congestion after (even a short) d...
[19:07:33] Analytics, ChangeProp, EventBus, MediaWiki-JobQueue, and 2 others: Separate ChangeProp and JobQueue Redis - https://phabricator.wikimedia.org/T183586#3857772 (Pchelolo) p:Triage>Normal
[19:13:21] Analytics, ChangeProp, EventBus, MediaWiki-JobQueue, and 3 others: Separate ChangeProp and JobQueue Redis - https://phabricator.wikimedia.org/T183586#3857823 (mobrovac) +1. Given the way Redis is used in CP and CP4JQ, we might want to set these separate instances up in Kubernetes as early as Q4!
[19:32:06] Analytics, Analytics-EventLogging: EventLogging has invalid @covers tags - https://phabricator.wikimedia.org/T183528#3857861 (Legoktm) https://phpunit.de/manual/4.8/en/appendixes.annotations.html#appendixes.annotations.covers.tables.annotations doesn't mention anything about being able to cover whole files?
[19:38:28] milimetric: I tried pushing everything into Spark's DataFrame API (rather than using RDDs). Much uglier but it works! https://pastebin.com/hm2ZYC8Q
[19:39:01] Spark must be able to optimize the SQLish code more effectively than the RDDs.
[19:53:23] interesting. sql-as-the-greatest-common-denominator ftw :)
[19:53:38] happy holidays and happy new year everyone, I'm out
[19:57:35] Analytics, Analytics-EventLogging: EventLogging has invalid @covers tags - https://phabricator.wikimedia.org/T183528#3857878 (Umherirrender) It is not used often: https://codesearch.wmflabs.org/search/?q=@covers.*\.php&i=nope&files=&repos= ``` extensions / Cognate tests/phpunit/ServiceWiringTest.php 1...