[00:15:19] Analytics, Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969969 (Legoktm) [00:19:39] Analytics, Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969992 (Jdlrobson) > IMHO DNT should be respected whenever possible. I don't think that's being disputed here. The question is whether something... [00:22:22] Analytics, Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969999 (Nuria) >The question is whether something that tracks page-view-like things should be consistent with how we collect page views (which d... [00:24:16] Analytics-Kanban: English Wikivoyage traffic spike possible bot - https://phabricator.wikimedia.org/T187244#3969063 (kaldari) Seems to mostly be caused by a sustained spike in desktop-only views to the Zimbabwe page. Strange. [00:41:09] Analytics, Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3970026 (Nuria) Quick and dirty estimate of dnt on: https://phabricator.wikimedia.org/T127571 [01:28:32] Analytics-Kanban, Research-landing-page, Patch-For-Review: Pageviews/Stats on research.wikimedia.org - https://phabricator.wikimedia.org/T186819#3970359 (diego) >>! In T186819#3968880, @Nuria wrote: > Sorry, there are two levels of access: 1) LDAP (usual credentials) 2) piwik That means to login whe... 
[03:18:45] 10Analytics-Kanban: English Wikivoyage traffic spike possible bot - https://phabricator.wikimedia.org/T187244#3969063 (10Tbayer) The Zimbabwe page shows an increase of 1000-2000 per day per https://tools.wmflabs.org/pageviews/?project=en.wikivoyage.org&platform=all-access&agent=user&range=latest-20&pages=Zimbab... [04:52:03] 10Analytics-Kanban: English Wikivoyage traffic spike possible bot - https://phabricator.wikimedia.org/T187244#3970656 (10kaldari) Oh, your right :) Maybe it's an example of something that's happening on a lot of pages though. The desktop vs. mobile percentage looks very suspicious (<1% mobile) and the increase s... [06:06:12] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969855 (10Urbanecm) Is collecting agregated data TRACKING? I don't think so. Therefore collecting those data is possible no matter if the user's s... [08:43:22] hello people! [08:43:45] I have been working on removing jmxtrans support this morning (https://gerrit.wikimedia.org/r/410396), since I forgot that without a rebuild it doesn't support java 8 [08:43:53] except from that, everything looks good [08:44:15] the last step after the alarms is to remove java7 from all the hosts [08:51:11] 10Analytics, 10Discovery, 10Wikidata, 10Wikidata-Query-Service, 10Wikimedia-Stream: Increase kafka event retention to 14 or 21 days - https://phabricator.wikimedia.org/T187296#3970994 (10Smalyshev) p:05Triage>03Normal [08:51:30] 10Analytics, 10Discovery, 10Wikidata, 10Wikidata-Query-Service, 10Wikimedia-Stream: Increase kafka event retention to 14 or 21 days - https://phabricator.wikimedia.org/T187296#3971006 (10Smalyshev) p:05Normal>03Triage [09:08:37] elukey: Hi ! Very successful operation yesterday night - Congrats :) [09:12:41] elukey: congrats! 
:) [09:13:34] \o/ [09:13:49] joal: congrats to both, we worked hard on this together :) [10:58:24] still working on monitoring, tried to remove java 7 from analytics1028 [10:58:39] it removes a bunch of things, like default-jre, jmxtrans,etc.. [10:58:57] so I suspect that the mapred/zkfc issues will go away when removing these [11:01:08] yeah, congrats. that's very nice, with hadoop migrated we have few systems in the cluster left on Java 7 [11:01:37] moritzm: we are also going to upgrade druid asap too :) [11:29:58] ok rolling out the new monitors [11:30:06] let's see how it goes [11:30:18] I've also cleaned up analytics1028/29 from java 7 [11:30:21] so far so good [12:43:27] java 7 removed from all the workers [12:43:43] !log openjdk-7-* packages removed from all the Hadoop workers [12:43:48] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:43:55] !log jmxtrans removed from all the Hadoop workers [12:43:58] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [12:51:38] joal: about zkfc and mapred I think that it is only a matter of removing java 7 from the hosts [12:51:55] later on let's try that on an1002 and restart zkfc? [12:52:18] elukey: works for me, lets's not forget to remove the JAVA_HOME patc [12:52:35] yep! [12:53:26] I got what's happening.. if I run . 
/usr/lib/bigtop-utils/bigtop-detect-javahome on an1001 I get [12:53:48] elukey@analytics1001:~$ env | grep JAVA [12:53:48] JAVA_HOME=/usr/lib/jvm/default-java [12:53:48] JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8 [12:54:15] when openjdk-7 gets removed default-java will follow [12:54:27] and the script will finally have no choice but to use java 8 [12:54:39] Ahhhh [12:54:57] I mean, that is wise from that script not to try to use a non-existant version of something :) [12:55:56] the script for some reason prefers java 7 over java 8, not sure why [12:56:22] I think it is an old "convention" when 8 was still not that popular [12:56:54] elukey: I think actually the script prefers "default" java, which in our special case is j7 - if default-java was j8, then [12:58:17] joal: could be an explanation yes [12:58:26] on jessie default-java is 7, on stretch 8 [12:58:33] if any, didn't check :) [12:58:35] PROBLEM - HDFS capacity reimaing GBs on analytics1001 is CRITICAL: CRITICAL - hadoop-hdfs-capacity-gb-remaining is 248451 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=47&fullscreen [12:59:04] ????? [12:59:04] this is a new alarm, probably wrong :) [12:59:30] Ah - Ah - with 240Tb left, I hope we're ok :) [13:01:02] it is weird, I've set the critical to 100000 [13:03:22] mmmm [13:05:32] ahhhh lol! [13:05:55] gt != lt ? [13:05:55] this metric should be le than the current value, not ge! [13:05:59] yeah [13:06:00] :D [13:06:02] :D [13:11:07] joal: [13:11:08] elukey@analytics1002:~$ . /usr/lib/bigtop-utils/bigtop-detect-javahome [13:11:11] elukey@analytics1002:~$ env | grep JAVA [13:11:11] :) [13:11:13] JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64 [13:11:25] \o/! 
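The false alarm a little earlier (a "GB remaining" metric that went CRITICAL at 248451 with the critical threshold set to 100000) comes down to the direction of the comparison. A minimal sketch of that idea follows, assuming a hypothetical helper `check_threshold` with a hypothetical `direction` parameter; it is not the actual monitoring check used on the cluster, only an illustration of "lower is worse" versus "higher is worse".

```python
def check_threshold(value, critical, direction="lt"):
    """Return 'CRITICAL' or 'OK' for a metric value.

    direction="lt": alert when the value falls below the threshold
                    (e.g. GB of capacity remaining).
    direction="gt": alert when the value rises above the threshold
                    (e.g. an error count).
    Both the function and the parameter names are illustrative assumptions.
    """
    if direction not in ("lt", "gt"):
        raise ValueError("direction must be 'lt' or 'gt'")
    breached = value < critical if direction == "lt" else value > critical
    return "CRITICAL" if breached else "OK"


# Values taken from the alert in the log above.
remaining_gb = 248451
critical_gb = 100000

# Comparing in the wrong direction fires even with plenty of space left.
wrong = check_threshold(remaining_gb, critical_gb, direction="gt")
# For a "remaining capacity" metric, lower is worse.
right = check_threshold(remaining_gb, critical_gb, direction="lt")
print(wrong, right)
```

With the comparison flipped, the same value and threshold go from CRITICAL to OK, which matches the RECOVERY seen later in the log.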
[13:14:44] !log removed java 7 packages from analytics100[12] [13:14:47] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:14:52] also fixed the manual edits [13:15:49] analytics1003 is the only one left with java 7, but it still depends on jmxtrans so cannot remove it for now [13:20:42] RECOVERY - HDFS capacity reimaing GBs on analytics1001 is OK: OK - hadoop-hdfs-capacity-gb-remaining is 248721 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=47&fullscreen [13:20:52] yep :) [13:21:57] is Kafka compatible with Java 8? [13:22:59] moritzm: Jumbo is on java 8, analytics will hopefully be decommed next quarter and main eqiad/codfw will move to java 8 when upgraded to Kafka 1.0 [13:23:11] (this is my understanding) [13:23:45] perfect [13:24:24] archiva is on java 7 [13:24:41] joal: do you think that we can upgrade it ? I don't see any blocker [13:25:07] elukey: not on my side :) [13:30:08] it works probably fine with 8, at some point we should also update this, the archiva release running on meitnerium is four years old [13:30:17] this == archiva itself [13:30:58] so 2.3 runs with 1.7+, but not sure about 2.0 [13:35:18] !log installed openjdk-8 on meitnerium, manually upgraded java-update-alternatives to java8, restarted archiva [13:35:22] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:35:27] so archiva seems to be running with java 8 now [13:36:00] (checked via lsof) [13:36:09] joal: whenever you have time do you mind to test a build? [13:37:17] elukey: testing [13:37:25] <3 [13:38:45] (brb) [13:38:51] elukey: failed :( [13:38:56] ouch [13:39:02] what does it say? [13:39:18] Failed to transfer file: https://archiva.wikimedia.org/repository/mirrored/org/apache/maven/reporting/maven-reporting-api/2.0.9/maven-reporting-api-2.0.9.jar. 
Return code is: 500 , ReasonPhrase:Server Error [13:40:06] weird I don't see it in the logs [13:40:20] I just tried to download it and it works [13:40:46] mind to retry? [13:40:53] to understand if it is temp or not [13:41:12] ah yes now I see them [13:41:16] same issue [13:41:38] elukey: looks like some plugin jars are not workiong as exepcted [13:42:04] INFO | jvm 1 | 2018/02/14 13:37:29 | 2018-02-14 13:37:29.637:WARN:oejs.ServletHandler:/repository/mirrored/commons-cli/commons-cli/1.0/commons-cli-1.0.jar [13:42:07] INFO | jvm 1 | 2018/02/14 13:37:29 | java.util.ConcurrentModificationException [13:42:19] super weird [13:42:24] all right, rolling back [13:42:38] :( [13:43:22] joal: can you check now? [13:43:38] good [13:43:40] :( [13:43:45] Seems realyy j8 related :( [13:44:00] well now we have data :) [13:44:24] !log rollback java 8 upgrade for archiva - issues with Analytics builds [13:44:32] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:45:08] joal: I asked too much :D [13:45:33] all right, going afk for a couple of hours tops, will be back for meeting! [13:45:36] *meetings [13:59:27] ottomata[m]: Added better logging for schema-superset checking, and also added tests using nulls -- I'm interested to test that with you when you're in :) [14:56:30] hello teaam [15:22:56] 10Analytics-Kanban, 10Analytics-Wikistats, 10Hindi-Sites: Hindi Wikiversity is not showing in Wikimedia Stats - https://phabricator.wikimedia.org/T183682#3971985 (10Jayprakash12345) Any Progress? [15:27:14] 10Analytics, 10Discovery, 10Wikidata, 10Wikidata-Query-Service, 10Wikimedia-Stream: Increase kafka event retention to 14 or 21 days - https://phabricator.wikimedia.org/T187296#3972050 (10Ottomata) I think we can do this just for the mediawiki eventbus topics on the jumbo cluster. 
[15:35:23] Analytics-Kanban, Analytics-Wikistats: Create Daily & Monthly pageview dump with country data and Visualize on UI - https://phabricator.wikimedia.org/T90759#3972060 (fdans) [15:35:25] Analytics-Cluster, Analytics-Kanban, Analytics-Wikistats: Add map component to Wikistats 2 - https://phabricator.wikimedia.org/T181529#3972058 (fdans) Open>Resolved a:fdans [15:40:03] o/ [15:42:32] Analytics: Make the Wikistats 2 UI responsive - https://phabricator.wikimedia.org/T186812#3972081 (fdans) [15:42:35] Analytics, Analytics-Wikistats: Make Wikistats 2's UI responsive - https://phabricator.wikimedia.org/T182764#3972083 (fdans) [15:49:31] Analytics-Kanban, Analytics-Wikistats, Hindi-Sites: Hindi Wikiversity is not showing in Wikimedia Stats - https://phabricator.wikimedia.org/T183682#3972117 (fdans) Hey @Jayprakash12345 this maybe won't yet cover all your use cases, but have you tried using the new Wikistats? https://stats.wikimedia.... [15:50:02] ottomata, what do you think about this whitelist format? https://pastebin.com/UK1ikfDp [15:53:10] mforns: into it [15:53:10] q [15:53:14] Analytics-Kanban, Analytics-Wikistats, Hindi-Sites: Hindi Wikiversity is not showing in Wikimedia Stats - https://phabricator.wikimedia.org/T183682#3972132 (Jayprakash12345) @fdans Fine Sir, I was asking because whether we open this task or not [15:53:16] thanks :] [15:53:27] in order to keep any data, someone has to whitelist every field? [15:53:33] even if most of the PII is only in a few fields? [15:53:35] just a q [15:54:01] also, what would happen if keep for a nested object [15:54:05] userAgent: keep [15:54:06] ? [15:55:12] of [15:55:12] or [15:55:15] TableName1: keep [15:55:16] ? 
[15:55:35] ottomata, good questions :] [15:55:55] NICELY DONE cc joal elukey [15:56:18] that annoying format is on purpose, to force specifying all fields to keep [15:56:28] java 8ers gonna 8 [15:56:57] as I imagined, userAgent: keep is a format error [15:57:04] (it's so annoying when phab auto hides some comments intasks) [16:14:08] mforns: seems like it might be good to support keeping tables/nested objects? we control the whitelist in puppet or refinery, right? so we get to review? [16:14:08] if the field has children, you should open a new indentation in the yaml whitelist [16:14:09] yes [16:14:09] but we can do that later [16:14:09] this is good for now [16:14:09] i'm thinking of eventbus events, where there are things like `prior_state` [16:14:09] which is a subobject that contains many/most of the fields of the current object [16:14:10] e.g. https://github.com/wikimedia/mediawiki-event-schemas/blob/master/jsonschema/mediawiki/page/move/2.yaml#L131 [16:14:10] ottomata, mmmmh, I don't like to allow whitelisting full nested objects, because if someone adds a field inside it that has sensitive data, it passes unsanitized without any change [16:14:10] mforns: also, it probably needs to be case insensitive [16:14:10] since some fields get lowercased in hive [16:14:10] case insensitive is OK, I agree [16:14:10] mforns: good point [16:14:10] k [16:14:10] ottomata, and about the fact that all elements need the word keep? [16:14:11] we could do it in a way that the simple presence of the fieldName means to whitelist [16:14:11] but then, in yaml, you have to use: - fieldName [16:14:12] and then, when nested fields come, I don't think you can shuffle list elements and nested elements can you? [16:14:41] ottomata, it seems we could do it like: https://pastebin.com/jYQMGb3r [16:15:37] I think my brain is broken [16:15:59] I was getting duplicate log output until really late last night, so I slept on it. 
Read a whole bunch more about the logging module, fixed up some code, tried again [16:16:07] now I'm getting triplicate log output [16:16:27] O.o [16:17:28] it's great for a single module, but when you try to factor your code out and have logging defined in one central place it becomes a steaming pile of poop [16:17:50] xDDD [16:24:44] ottomata, we could allow whitelisting full nested objects. But for EventLogging specifically, I'd like to prevent that, maybe by adding a whitelistFullNested=false flag to the call to WhitelistSanitization.scala [16:25:29] But for other databases that we fully controll its contents, we could whitelist full tables in a single line, which is interesting [16:25:52] (in meetings sorryyyy) [16:25:58] np, sorry [16:42:39] ottomata: do you want to do a quick ops-sync? [16:43:12] elukey: ya as soon as meeting over :) [16:44:08] in bc elukey [16:45:57] nuria_: i think i thought i was more up-to-speed on this than i might be [16:46:28] nuria_: is this idea that dep tracking is a use-case for an event platform something that everyone shares? 
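Going back to the whitelist discussion above: the rules mentioned there (every field to keep is named explicitly, nested objects open a new level in the whitelist, field names match case-insensitively, and `keep` on a whole nested object is a format error unless explicitly allowed) can be sketched roughly as below. This is only an illustration under assumptions: the whitelist is taken to be already parsed from YAML into plain dicts, non-whitelisted fields are assumed to be dropped, and the names `sanitize`, `WhitelistFormatError` and `allow_full_nested` are hypothetical (the last one loosely mirrors the `whitelistFullNested` flag floated in the chat). It is not the code in WhitelistSanitization.scala.

```python
class WhitelistFormatError(ValueError):
    """Raised when the whitelist does not match the shape of the data."""


def sanitize(event, whitelist, allow_full_nested=False):
    """Return a copy of `event` containing only whitelisted fields.

    `whitelist` maps field names to "keep" (leaf fields) or to a nested
    dict (sub-objects). Names are compared case-insensitively, since
    some fields get lowercased in Hive.
    """
    rules = {str(name).lower(): rule for name, rule in whitelist.items()}
    result = {}
    for field, value in event.items():
        rule = rules.get(field.lower())
        if rule is None:
            continue  # not whitelisted: assumed to be dropped
        if isinstance(value, dict):
            if isinstance(rule, dict):
                result[field] = sanitize(value, rule, allow_full_nested)
            elif rule == "keep" and allow_full_nested:
                result[field] = value  # whole sub-object kept as-is
            else:
                raise WhitelistFormatError(
                    "nested field %r must list its children" % field)
        elif rule == "keep":
            result[field] = value
        else:
            raise WhitelistFormatError(
                "leaf field %r must be marked 'keep'" % field)
    return result


whitelist = {"pageTitle": "keep", "UserAgent": {"browser_family": "keep"}}
event = {
    "pagetitle": "Zimbabwe",
    "ip": "203.0.113.7",
    "userAgent": {"browser_family": "Firefox", "os_family": "Linux"},
}
sanitized = sanitize(event, whitelist)
print(sanitized)
```

Requiring children to be listed means a sensitive field added later inside a nested object is dropped by default rather than passing through unsanitized, which is the concern raised in the discussion.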
[16:46:39] i assumed we were treating it as part of the system [16:48:30] (PS1) Fdans: Bunch of small map UI fixes [analytics/wikistats2] - https://gerrit.wikimedia.org/r/410488 [16:53:31] Analytics-Kanban: The size of metric areas in the dashboard should scale to available window space - https://phabricator.wikimedia.org/T187345#3972459 (fdans) [16:53:40] Analytics-Kanban: The size of metric areas in the dashboard should scale to available window space - https://phabricator.wikimedia.org/T187345#3972459 (fdans) a:fdans [16:53:45] urandom: (cc ottomata ) as i understand it not all use cases need inter relation among entities [16:54:12] that's true, but many important ones do [16:54:36] urandom: thus while allowing those has to be part of the design from the beginning it does not preclude implementation of non dep use cases [16:54:47] right, sure [16:54:57] urandom: the tier-1 cases do yes, changeprop, and cache purging [16:55:06] you could define a MVP that didn't include that functionality [16:55:12] urandom: (cc ottomata ) but not all [16:55:25] urandom: but this is what faidon said, we alredy have that home grown [16:55:35] sorry, have what? 
[16:55:41] urandom: the MVP [16:56:12] urandom: we already digest and process events in eventloggin, event bus and we expose those in storage and event streams [16:56:30] urandom: those are kind of the mvps of all this [16:56:36] yeah, i meant MVP for a new architecture [16:57:02] urandom: yes, then ok [16:58:35] nuria_: i'm not necessarily trying to advocate for anything here (yet :)), just understand where the dissonance lies (almost certainly with me) [16:58:40] urandom: but (and i might not have this right) i do not think that dependency tracking must be solved at the same time, it should inform the design [16:58:58] urandom: ya same here, i think we all have different mental pictures in mind [16:59:22] urandom: that is why fist we need a design doc [16:59:57] nuria_: right, which i guess is the context of my questions here; i'm trying to understand what we have (this program doc) [17:00:47] urandom: i'm indifferent on dep tracking in the program [17:01:06] i was originally against it, but i think marko really wants it, and he and I think he sees dep traclking == stream processing [17:01:12] whereas i see it as a use case of stream processing [17:01:20] nuria_: if this is v2, or the final product for our "MVP", (eventlogging, eventbus, changeprop, et al), then dependency tracking would be a part of that final design, even if treated as an advanced feature that came after others were put in place [17:01:23] ping fdans standup [17:01:45] i think mark and I agree on what the scope of dep tracking over next FY year is (mostly research that informs event data platform design) [17:01:46] mark* [17:01:47] ottomata: there we go; i don't understand the use-case pov [17:01:50] marko* [17:02:08] ottomata: do you think of present day changeprop as a use-case as well? [17:02:09] (we are in meetings ... 
:) ) [17:02:34] ya, i do, i think marko sees change-prop & eventbus as the same system, i see them as separate [17:03:43] if change-prop were to be re-implemeted on top of event data platform, it would likely be as a stream processing app deployed to the system (e.g. something like flink) [17:04:06] (caveat: i am using flink as an example, not a decided tech) [17:04:10] ottomata: this sounds like an issue of semantics [17:04:13] yeah [17:04:22] ottomata: event data platform invokes an idea, for me [17:04:28] which is why i'm indifferent about dep tracking in the program [17:04:44] urandom: in my mind, event data platform :: confluent stream data platform [17:04:51] ottomata: stream processing another, one that implies event data [17:05:03] - kafka, schema registry, http proxy, kafka connectors (for state store integration), and stream processing system [17:05:28] urandom: going to forward you an email to blog posts... :) [17:05:33] ottomata: ok, but if you look at how we do stream *processing* here today [17:06:09] ottomata: ok, but if you look at how we do stream *processing* here today, then i think dependency tracking as a part of that system, makes more sense [17:06:24] ottomata: it sounds like you're focused on a subset (which is fine) [17:07:15] ottomata: it may also be fine to exclude everything not in that subset, but that's what we're doing, we need to get on the same page [17:07:30] but if that's what we're doing, i mean [17:08:08] urandom: aye, marko really wanted it in the program, and I think we are on the same page as to the work that will be done for dep tracking over the next year, so i'm fine with it either way [17:08:30] maybe we should scale down what is says in that outcome and just mention that dep tacking architecture and edp archs are tightly linked, and should be researched together? 
[17:08:58] btw, marko wrote those outputs there, so i dunno exactly [17:08:58] ottomata: it almost sounds like you're calling everything outside that subset as a "use-case", which might also be fine, again so long as everyone else understands what that means [17:09:49] Kafka question: if I subscribe to a number of topics each of which has tons of messages, how Kafka decides which messages to return so neither of the topics is starved? are there any configs that control it? [17:09:50] ottomata: it really sounds like what we need to decide is complete scope [17:10:42] SMalyshev: that I don't know, and would be kafka client speciifc [17:11:06] ottomata: it almost (as a late-comer to these discussions), it almost sounds like the processing of events as we do for changeprop and jobqueue are expanding the scope in a manner that isn't entirely welcomed [17:11:08] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Analytics Cluster to Java 8 - https://phabricator.wikimedia.org/T166248#3972607 (10elukey) [17:11:40] ottomata: which again might be the right pov, but it should be explicit and understood by all [17:12:51] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Analytics Cluster to Java 8 - https://phabricator.wikimedia.org/T166248#3289857 (10elukey) Cluster upgraded to java8 and java 7 packages removed from all analytics hosts except analytics1003 due to T184794 (jmxtrans depends... [17:13:22] urandom: oh? not sure i understand? [17:14:16] which part might not be welcome, a non home grown generic stream processing platform? [17:14:47] ottomata: well, you've remarked more than once that "marko really wanted it in the program", as a means of explaining why it is there [17:14:52] haha [17:14:53] oh [17:15:16] ottomata: that would seem to indicate that you don't see it as in-scope [17:15:41] ottomata: for Java consumer, standard Apache library? 
[17:15:55] ottomata: and since you're leading the charge here, your idea of in-scope is going to be treated as canonical [17:16:20] I am kinda concerned that it would consume one topic and leave other lagging for a long time... [17:18:51] hmm, ok, urandom i think ji see, sounds like we need to talk with marko more? i think marko and I are on the same page about what work will be done for dep tracking over next FY, and that that research work will inform some edp design (mostly choice of stream processing framework). [17:18:59] SMalyshev: yeah it'd be in the kafka java client [17:19:29] ottomata: ok [17:19:39] ottomata: i don't think anyone else got that impression in that meeting [17:19:43] ottomata: including me [17:19:45] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3972647 (10Tbayer) >>! In T187277#3969999, @Nuria wrote: >>The question is whether something that tracks page-view-like things should be consistent... [17:19:58] that we marko and I agreed on work that will be done? [17:21:42] ottomata: there were some pretty serious questions from sre about whether its inclusion was justified [17:22:16] oh elukey no job queue mig meeting? [17:22:17] ottomata: which seems to indicate a lack of clarity that there is agreement it is useful work that ought to be done [17:22:19] thoguht i saw it on my cal [17:22:32] aye i see [17:22:45] urandom: what do you think? [17:22:49] should it be included? [17:23:53] ottomata: [17:23:53] diomerda crespi [17:23:59] ahhahaha sorry [17:24:18] this is not intended for this chat and it is a horrible thing that a friend just pasted to me in chat in italian [17:24:21] apologies :D [17:24:24] i see in services chat [17:24:25] haha [17:24:28] we can do this just for the mediawiki eventbus topics on the jumbo cluster. 
[17:24:41] man IRC is getting weird today :) [17:24:44] https://gerrit.wikimedia.org/r/#/c/405687/16/eventlogging/handlers.py was the correct one ottomata :D [17:25:19] SMalyshev: i don't know much about how Java client works, but usually lower level topic-partition consumers are threaded in some way, and hand their message back to a queue that your client code consumes from? [17:25:29] i'd expect that it'd be smart about it [17:25:58] ottomata: it's all in the scope [17:26:06] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3972691 (10Tbayer) And for context, the overarching issue here is that EventLogging's main use case has been studying UI interactions, where exclud... [17:26:18] ottomata: i think it should be done (as do others), the question is where it fits in [17:26:43] ottomata: if not as a part of this work, than as it's own thing ('cause i can't think of anything it'd be a closer fit to). [17:27:08] ottomata: but given a finite number of programs, then yes, i do think the scope should include this [17:27:10] urandom: I think we just have to word more precisely what is happening with dep tracking as part of eventplatform next year [17:27:27] nuria_: yeah, agreed [17:27:53] urandom: but that's it, as we all agree that design of even system needs to include use cases that are more complex that a disjointed event traveling [17:28:25] urandom: do you want to edit wiki and add a more focused outcome /output [17:28:36] urandom: outcome: "desired results" [17:28:55] urandom: output; "a thing your team produces for desired results" [17:29:15] urandom: would you want to give it a try ? [17:29:18] cc ottomata [17:31:48] nuria_: yeah; I need to keep Pchelolo and mobrovac synced on this as well though, and both are traveling atm [17:32:03] nuria_: what is the timeline here again? 
[17:34:43] brb [17:49:36] elukey: style question for you [17:50:10] we run these python jobs from cron, and we need to wrangle stdout and stderr, and we have two options from what I can tell [17:50:40] 1. python duplicates ERROR and above to both stdout and stderr, cron then just has to send stdout to the log and stderr to email [17:51:36] 2. python does the default thing and logs only up to WARNING to stdout, and ERROR and above to stderr, cron then has to be smarter and combine stdout with stderr before logging, and also send stderr to email [17:52:32] The second option is harder for the crons, and I don't know if it would combine the log lines in the same order they were written (does it merge/sort on timestamp?) [17:52:44] I have code for the 1st version working, so it's no problem [17:53:01] it's just a style thing, python does the 2nd option by default and the code I wrote messes with that a little bit [17:54:41] ottomata[m]: yes, what I am trying to figure out is how to control what the threads are returning so there's a good mix. Right now I'm getting huge batch of messages from one topic and then huge batch from another, and that's not good because deduplication between topic does not work between batches (at least in current code) [17:55:12] so I have to do extra work. It'd be nice if there were smaller batches including mix of topics... but I am not sure what to tweak to make it happen [17:55:15] will read more docs [17:57:53] SMalyshev: sorry, ran out of internet at cafe, was typing: you won't get necessarily predictable split (e.g. round robin or somethign) from all topics all the time, but in general i don't think a topic will be starved [17:57:58] milimetric: so in theory 1. would be good for cron but not for "cli" usage, since you'd get duplicated output right? 
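The two options described above can be sketched with the standard `logging` module. This is a minimal illustration, not the code milimetric actually wrote: `configure_logging`, `MaxLevelFilter` and the `duplicate_errors` flag are hypothetical names. With `duplicate_errors=True` (option 1) stdout receives everything and stderr receives ERROR and above; with `duplicate_errors=False` (option 2) stdout stops at WARNING. Existing handlers are removed before new ones are attached, since attaching handlers more than once is a common cause of the duplicate or triplicate output mentioned earlier.

```python
import io
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Let through only records at or below max_level."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level


def configure_logging(name, duplicate_errors, out=None, err=None):
    """Attach one stdout-style and one stderr-style handler to a logger."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid a second copy via the root logger
    for handler in list(logger.handlers):
        logger.removeHandler(handler)  # safe to call configure twice
    fmt = logging.Formatter("%(levelname)s %(message)s")

    out_handler = logging.StreamHandler(out)
    out_handler.setFormatter(fmt)
    if not duplicate_errors:
        # Option 2: stdout carries INFO and WARNING only.
        out_handler.addFilter(MaxLevelFilter(logging.WARNING))

    err_handler = logging.StreamHandler(err)
    err_handler.setLevel(logging.ERROR)  # ERROR and above, for cron email
    err_handler.setFormatter(fmt)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger


def run_demo(duplicate_errors):
    """Log two records into in-memory streams and return what each got."""
    out, err = io.StringIO(), io.StringIO()
    log = configure_logging("demo_job", duplicate_errors, out=out, err=err)
    log.info("started")
    log.error("failed")
    return out.getvalue(), err.getvalue()


option1_stdout, option1_stderr = run_demo(duplicate_errors=True)
option2_stdout, option2_stderr = run_demo(duplicate_errors=False)
```

Under option 1 the stdout log keeps the INFO lines surrounding each error in written order, at the cost of seeing errors twice on an interactive terminal. Adding `%(asctime)s` to the format string would give each line a timestamp, which helps if the two streams end up in separate files.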
[17:58:02] urandom: sorry, ran out of internet, not sure if that convo continuted :) [17:58:44] ottomata: I don't need it to be exact, just some tweak so I don't get 500/1000 events from single topic and none from others, but instead say 100 from 5 topics [17:58:47] elukey: eh, but you can always 2> /dev/null and you'd be fine on the CLI [17:59:01] ok, so I'll do 1. then [17:59:21] many thanks to Marcel for helping to calm me down :) [17:59:42] O.o :] [17:59:58] milimetric: why 1? can't we just simple send output to stdout, err to stderr and then use tee? [18:00:35] milimetric: in cron we send stdout to a file, then we pipe stderr to | tee [18:00:47] it will log to a (different) file errors and send the email [18:00:49] it should work [18:00:55] and cli usage would be fine as well [18:01:17] but when you combine stderr and stdout, does it merge them in the right order by timestamp? [18:02:01] do you mean for cli? [18:02:15] no, for the cron use case [18:02:28] it would go to different files [18:02:32] so if I do nothing special in Python, it will send ERRORs only to stderr [18:02:38] oh, yeah, that's not great [18:02:44] because the info around the errors might be useful [18:03:02] true [18:03:55] but timestamp would help [18:04:28] usually we check errors, see the stacktrace of whatever, and then look for the rest [18:04:28] right, you could cat ERROR.log >> INFO.log | sort ... ew... [18:04:51] it is a sneaky problem, there is not win win [18:05:27] uh... 
up to you, I already have python working to do it the option 1 way, it's easy and doesn't mess up anything on the CLI really since we don't run these manually almost ever [18:06:25] well up to us to decide, I have 0 authority in the subject :D [18:07:28] the reason I asked was you were saying something about cleaning up the cron jobs, I wasn't sure if this was something you wanted to address (making all the log / email handling the same) [18:07:35] elukey: ^ [18:07:53] if you don't care, I'll just leave it as option 1. for now and we can change it very easily later [18:10:27] I don't particularly love option 1 due to the duplication of stderr, this is why i am not completely convinced [18:10:28] 10Analytics-Kanban, 10Analytics-Wikistats, 10Hindi-Sites: Hindi Wikiversity is not showing in Wikimedia Stats - https://phabricator.wikimedia.org/T183682#3972885 (10Milimetric) a:03Milimetric [18:11:04] the perfect end result for me would be [18:11:17] 10Analytics-Kanban, 10Analytics-Wikistats, 10Hindi-Sites: Hindi Wikiversity is not showing in Wikimedia Stats - https://phabricator.wikimedia.org/T183682#3860400 (10Milimetric) I'm sorry, I was going to fix this before this month's run and I forgot. I will fix it now and new data will be available by March... [18:11:18] - stdout + stderr handling expected flows of data [18:11:27] - cli runs without duplicate [18:11:52] - cron line + magic redirection -> one file for stdout+stderr, stderr goes to mail [18:12:06] ok, no problem [18:12:14] will do that [18:12:25] I am not sure how easy is that though [18:12:30] tomorrow I'll try to play with it [18:12:31] (right now some of the jobs do the duplication, btw, and some don't) [18:12:39] :( [18:12:48] don't want to be pedantic about this [18:12:54] sorry [18:13:06] not at all, it should all be the same, and that sameness should be something sensible [18:13:25] making it so, thanks for the thoughts [18:13:51] all right let me know if you want me to review code etc.. 
going offline but I'll check tomorrow morning :) [18:13:57] * elukey off! [18:16:49] elukey: ah remind me, so yes to +3 dbstore replacement nodes? [18:16:57] (PS4) Milimetric: [WIP] Saving in case laptop catches on fire [analytics/refinery] - https://gerrit.wikimedia.org/r/408848 [18:18:05] (PS1) Milimetric: Add hiwikiversity to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/410519 (https://phabricator.wikimedia.org/T183682) [18:18:20] (CR) Milimetric: [V: 2 C: 2] Add hiwikiversity to sqoop list [analytics/refinery] - https://gerrit.wikimedia.org/r/410519 (https://phabricator.wikimedia.org/T183682) (owner: Milimetric) [18:19:55] ottomata: yeah, that was what we said yesterday [18:20:25] ok thanks [18:20:28] will keep in budget then [18:21:03] Analytics, EventBus, Operations, hardware-requests, and 2 others: SSDs for main Kafka clusters - https://phabricator.wikimedia.org/T166341#3972939 (RobH) Can you guys provide me with the exact hostnames of the kafka hostnames you want upgraded? I see quite a few, and the hostnames of kafka and a... [18:22:15] Analytics, EventBus, Operations, hardware-requests, and 2 others: SSDs for main Kafka clusters - https://phabricator.wikimedia.org/T166341#3972942 (Ottomata) @robh I am talking with Faidon about this right now for budgeting next FY. I think we are not going to add SSDs, but instead, get a couple... [18:22:56] Heya ottomata - do ou want us to discuss JsonRefine now? [18:23:13] ooo yes, joal gimme 5-10 mins? (eating lunch) [18:23:38] k ottomata [18:25:04] (CR) Nuria: "Nice, thanks for taking care of this" [analytics/refinery] - https://gerrit.wikimedia.org/r/410519 (https://phabricator.wikimedia.org/T183682) (owner: Milimetric) [18:29:20] ok joal bc! 
[18:30:33] 10Analytics, 10EventBus, 10Operations, 10hardware-requests, and 2 others: SSDs for main Kafka clusters - https://phabricator.wikimedia.org/T166341#3972952 (10Ottomata) For reference, the Kafka cluster nodes are defined in Puppet in [[ https://github.com/wikimedia/puppet/blob/production/hieradata/common.yam... [18:31:08] 10Analytics, 10EventBus, 10Operations, 10hardware-requests, and 2 others: SSDs for main Kafka clusters - https://phabricator.wikimedia.org/T166341#3972956 (10RobH) So it looks like kafka[12]00[123] are all misc systems with 4 * 4TB LFF hot swap bays. Those cannot easily be converted to SFF, since Dell doe... [18:32:01] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969855 (10bd808) How will the EventLogging client side javascript know if an event is a "page view like" or the more common UI interaction measure... [18:32:05] 10Analytics, 10ChangeProp, 10EventBus, 10MediaWiki-JobQueue, and 3 others: [EPIC] Develop a JobQueue backend based on EventBus - https://phabricator.wikimedia.org/T157088#3972961 (10Ottomata) [18:32:09] 10Analytics, 10EventBus, 10Operations, 10hardware-requests, and 2 others: SSDs for main Kafka clusters - https://phabricator.wikimedia.org/T166341#3972959 (10Ottomata) 05Open>03declined Ya, sounds good. (probably no SSDs for future nodes, FYI) [18:32:51] (03PS2) 10Joal: Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 (owner: 10Ottomata) [18:34:18] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3972964 (10Gilles) The problem is that the mechanics of EventLogging aren't collecting just aggregate counts, they're keeping individual records ab... [18:37:43] (03CR) 10Mforns: [C: 04-1] "Hey! 
Going over all the mentioned points in the task:" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (owner: 10Fdans) [18:38:21] (03CR) 10Mforns: [C: 04-1] "Sorry for the messy format :[" [analytics/wikistats2] - 10https://gerrit.wikimedia.org/r/410488 (owner: 10Fdans) [18:38:53] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3972974 (10Nemo_bis) To exclude "DNT requests" from pageview counts, we'd at least need to keep tracking the proportion of such requests in each ti... [18:52:43] (03CR) 10jerkins-bot: [V: 04-1] Add dataframe conversion to new schema function [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 (owner: 10Ottomata) [18:55:27] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog (Tracking): Spike: Explore an API for logging events sampled by session - https://phabricator.wikimedia.org/T168380#3973019 (10Jdlrobson) [18:56:24] 10Analytics, 10Analytics-EventLogging, 10Readers-Web-Backlog (Tracking): Spike: Explore an API for logging events sampled by session - https://phabricator.wikimedia.org/T168380#3973021 (10Jdlrobson) Does Wikimedia-Events have a project? [19:25:29] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3973126 (10Tbayer) >>! In T187277#3972957, @bd808 wrote: > How will the EventLogging client side javascript know if an event is a "page view like"... [19:32:10] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3973158 (10Tbayer) Furthermore, let's keep in mind that the preview EL event being discussed here (T184793) would always be immediately preceded by... 
[19:37:36] 10Analytics, 10Analytics-EventLogging: Failure in eventlogging schema for mediawiki/revision/visibility-change - https://phabricator.wikimedia.org/T187362#3973172 (10awight) [19:57:38] (03CR) 10Ottomata: "Joal, I think this is our problem:" [analytics/refinery/source] (jsonrefine) - 10https://gerrit.wikimedia.org/r/410241 (owner: 10Ottomata) [20:14:00] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3973410 (10Nuria) >appy to review older discussions if you find the links, but for now it looks like it's consensus that DNT does not apply to aggr... [20:38:21] 10Analytics, 10InteractionTimeline, 10Anti-Harassment (AHT Sprint 15): Measure how many unique people visit the Timeline - https://phabricator.wikimedia.org/T187374#3973471 (10dbarratt) [20:40:33] 10Analytics: Measure DNT usage across geographies and wikis - https://phabricator.wikimedia.org/T187376#3973474 (10Nuria) [20:41:23] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3973496 (10Nuria) >To exclude "DNT requests" from pageview counts, we'd at least need to keep tracking the proportion of such requests in each time... [20:45:18] 10Analytics: Measure DNT usage across geographies and wikis - https://phabricator.wikimedia.org/T187376#3973474 (10Tgr) I imagine the most relevant dimension would be across browsers, as different browsers might make the option easier or harder to find, word it differently etc. (or even make it on by default as... [21:34:59] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969855 (10Fjalapeno) Yeah it sounds like we are trying to use EL “the wrong way” Previews just aren’t page views. I realize in some ways we want... 
[21:53:33] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3969855 (10Ottomata) > Yeah it sounds like we are trying to use EL “the wrong way” > I think this issue is a clear marker that we need to figure... [22:06:38] 10Analytics, 10Analytics-Wikistats: Pageviews by Country Monthly should specify month in question - https://phabricator.wikimedia.org/T187389#3973745 (10Nuria) [22:14:04] 10Analytics-Kanban: Small map UI changes - https://phabricator.wikimedia.org/T187205#3967632 (10Nuria) @fdans I think you need to fix commit message so it gets linked to this bug: https://gerrit.wikimedia.org/r/#/c/410488/ [22:18:00] 10Analytics-Cluster, 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Design map visualization on UI - https://phabricator.wikimedia.org/T175422#3973788 (10Nuria) [22:18:07] 10Analytics-Kanban, 10Analytics-Wikistats: Create Daily & Monthly pageview dump with country data and Visualize on UI - https://phabricator.wikimedia.org/T90759#3973790 (10Nuria) [22:18:09] 10Analytics-Cluster, 10Analytics-Kanban, 10Analytics-Wikistats, 10Patch-For-Review: Design map visualization on UI - https://phabricator.wikimedia.org/T175422#3592991 (10Nuria) 05Open>03Resolved [22:18:58] 10Analytics-Kanban, 10Patch-For-Review: Launch top per country pageviews on UI - https://phabricator.wikimedia.org/T185510#3973792 (10Nuria) @fdans please add points [22:19:21] 10Analytics-Kanban, 10Research-landing-page, 10Patch-For-Review: Pageviews/Stats on research.wikimedia.org - https://phabricator.wikimedia.org/T186819#3973795 (10Nuria) 05Open>03Resolved [22:20:05] 10Analytics-Kanban, 10Patch-For-Review: Do not show a split section if there is nothing to split by - https://phabricator.wikimedia.org/T186813#3973796 (10Nuria) 05Open>03Resolved [22:20:31] 10Analytics, 10Patch-For-Review: Update druid to latest release (0.10) - 
https://phabricator.wikimedia.org/T164008#3973799 (10Nuria) [22:20:33] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review, 10User-Elukey: Upgrade Analytics Cluster to Java 8 - https://phabricator.wikimedia.org/T166248#3973797 (10Nuria) 05Open>03Resolved [22:20:39] 10Analytics-EventLogging, 10Analytics-Kanban, 10Patch-For-Review: Remove EL capsule from meta and add it to codebase - https://phabricator.wikimedia.org/T179836#3973800 (10Nuria) 05Open>03Resolved [22:59:42] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3973927 (10Tgr) Some people in this discussion seem to labor under a misinterpretation of what DNT means. DNT expresses the user preference of not... [23:05:56] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3973952 (10Ottomata) > Given that EventLogging collects neither unique identifiers not IP / useragent by default FYI, we do collect user agent, but... [23:28:39] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3974013 (10Jdlrobson) > Replying to https://phabricator.wikimedia.org/T184793#3971935 here (this seems a better place to have that conversation) > @Jdlrobson, maybe we can schedule a quick hangout to go down Tilma... [23:40:37] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3974070 (10Ottomata) Just an event on your cal, feel free to move :) [23:40:46] 10Analytics, 10Analytics-EventLogging: Should it be possible for a schema to override DNT in exceptional circumstances? - https://phabricator.wikimedia.org/T187277#3974071 (10Tgr) >>! In T187277#3972957, @bd808 wrote: > What is the damage to the movement if "page view like" events are undercounted by 11% as lo... 
[23:51:56] 10Analytics-Kanban: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#3974111 (10Ottomata) @Jdlrobson I don't know what your working hours are, GCal automatically declines. Put an event on my calendar or find me on IRC tomorrow. (I'm not working this Friday). [23:53:20] 10Analytics, 10DC-Ops, 10Operations, 10ops-codfw: Decomission eventlog2001 - https://phabricator.wikimedia.org/T182397#3974123 (10RobH) 05Open>03Resolved a:03RobH >>! In T182397#3866939, @MoritzMuehlenhoff wrote: > This host still shows up in puppetdb, i.e. misses the deactivate step (e.g. visible in... [23:56:34] 10Analytics, 10Analytics-Wikistats: Wikistats pageviews by country table view - https://phabricator.wikimedia.org/T187407#3974151 (10Nuria) [23:56:46] 10Analytics-Kanban, 10Operations, 10ops-eqiad: Degraded RAID on analytics1055 - https://phabricator.wikimedia.org/T172809#3974169 (10RobH)