[01:45:44] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4170212 (Tbayer) @mforns How far are we on this? I see we have a table already that appears to be updated continuously (gre...
[02:07:47] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4170242 (Nuria) Source_page is available on the event data: hive (event)> select event.source_url, event_source_title fro...
[06:24:33] !log roll restart of all middlemanagers on druid100[123] - realtime tasks piled up for hours
[06:24:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[06:24:36] joal: --^
[06:25:23] not really sure what happened, there were no realtime datapoints for banner impressions and the overlord console showed the middlemanagers with a lot of tasks in their "running" queues
[06:29:30] super weird, I'll check later on
[07:10:02] Hi elukey - thanks for having shaken the middle managers
[07:10:25] elukey: I think that yesterday's issues with the job restart led to a wrong state of druid RT tasks
[07:11:04] elukey: from the streaming job side, no error, and now I can see data in druid datasource
[07:14:15] !log Rerun webrequest-druid-daily-wf-2018-4-30
[07:14:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[07:45:25] joal: yep looks good now, I hoped it was a transient error, fiuuu
[07:45:26] :)
[08:16:14] (CR) Fdans: [V: 1] "@Joal: dry run tested successfully in stat1005" [analytics/refinery] - https://gerrit.wikimedia.org/r/429410 (https://phabricator.wikimedia.org/T188556) (owner: Fdans)
[09:30:19] Analytics, EventBus, GlobalRename, MediaWiki-JobQueue, and 5 others: Global renames get stuck at metawiki - https://phabricator.wikimedia.org/T193254#4170558 (alanajjar) @Pchelolo @mobrovac @Tgr Would be helpful if we create tracking task for stuck renames? (like T169440)
[09:30:27] Analytics, EventBus, MediaWiki-JobQueue, Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4170561 (mobrovac)
[09:33:40] Analytics, EventBus, GlobalRename, MediaWiki-JobQueue, and 5 others: Global renames get stuck at metawiki - https://phabricator.wikimedia.org/T193254#4170563 (mobrovac) >>! In T193254#4169720, @Tgr wrote: > So we should probably get the core bug fixed. I filed {T193471} for this. >>! In T193254...
[09:45:03] Analytics, ChangeProp, EventBus, ORES, and 2 others: Drop "non bot" condition from ORES changeprop rules - https://phabricator.wikimedia.org/T187927#4170577 (mobrovac) Open>Resolved
[10:11:45] Analytics, EventBus, GlobalRename, MediaWiki-JobQueue, and 5 others: Global renames get stuck at metawiki - https://phabricator.wikimedia.org/T193254#4170642 (Tgr) >>! In T193254#4170563, @mobrovac wrote: > I filed {T193471} for this. Thanks! > I think we should go ahead and switch these two fo...
[11:22:41] Analytics: Varnishkafka does not play well with varnish 5.2 - https://phabricator.wikimedia.org/T177647#4170769 (R4q3NWnUx2CEhVyr) Found it... it was my fault no issue with the API. I also found out that there is a "vut->idle_f" callback that can be set for rd_kafka_poll so it will be easier than what I tho...
[11:44:00] !log False positive only in webrequest-load-check_sequence_statistics-wf-upload-2018-5-1-6
[11:44:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:15:52] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4170881 (Tbayer) >>! In T186728#4170242, @Nuria wrote: > Source_page is available on the event data: > > hive (event)> se...
[13:25:40] o/
[13:26:09] yoohoo
[13:31:18] ottomata: wanna merge that puppy?
[13:31:35] yeah!
[13:31:37] Analytics, Analytics-Kanban, EventBus, Patch-For-Review, Services (watching): Upgrade to Stretch and Java 8 for Kafka main cluster - https://phabricator.wikimedia.org/T192832#4170988 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['ka...
[13:32:11] so fdans i'm going to merge, and then run the script with different args to verify it works yaa?
[13:32:35] ottomata: sounds good
[13:39:22] Analytics, EventBus, JobRunner-Service, MediaWiki-Database, and 2 others: Wikimedia\Rdbms\LoadBalancer::{closure}: found writes pending - https://phabricator.wikimedia.org/T191282#4170994 (Tgr) p:Triage>High "High" at the point of filing this task was 100/hour. Now it's a something like 4...
[13:42:27] Analytics, EventBus, JobRunner-Service, MediaWiki-Database, and 2 others: Wikimedia\Rdbms\LoadBalancer::{closure}: found writes pending - https://phabricator.wikimedia.org/T191282#4171001 (Tgr) Almost all seem to come from the job queue, unfortunately I don't think the job name is recorded.
[13:43:42] fdans: looks like it works to me! took me a min to realize i couldn't create the hardlinks without being root
[13:43:59] \o/
[13:44:11] couple of notes: can you make a patch that sets mode => 0744 on the $archive_script
[13:44:11] ?
[13:44:16] it should be executable
[13:44:27] doing now
[13:44:49] you can also remove the .sh from the end of $archive_path (you don't need to remove it from the file in git puppet)
[13:44:57] that way it looks more like a usual script/command
[13:45:15] it also might be nice to add
[13:45:17] set -ex
[13:45:20] instead of just set -e
[13:45:31] since you are piping stdout into devnull anyway
[13:45:38] that way it is easier to troubleshoot if something isn't working
[13:45:46] -x will print out the commands in the script as they are being run
[13:46:10] yea i was doing that to debug :) adding
[13:46:35] oo actually
[13:46:36] wait
[13:46:44] don't do it, it looks like set -x prints them to stderr
[13:47:02] fdans: maybe just add a little message then right before you run the commands
[13:47:02] like
[13:47:23] echo "Hardlink copying $MAXMIND_DB_SOURCE_DIR to $MAXMIND_DB_ARCHIVE_DIR/$CURRENT_DATE"
[13:47:24] and
[13:47:45] echo "Copying $MAXMIND_DB_ARCHIVE_DIR/$CURRENT_DATE into HDFS at $HDFS_ARCHIVE_DIR"
[13:48:04] or whatever
[13:56:51] Analytics, EventBus, JobRunner-Service, MediaWiki-Database, and 2 others: Wikimedia\Rdbms\LoadBalancer::{closure}: found writes pending - https://phabricator.wikimedia.org/T191282#4171013 (Pchelolo) > Almost all seem to come from the job queue, unfortunately, I don't think the job name is recorde...
[14:00:57] Analytics: Varnishkafka does not play well with varnish 5.2 - https://phabricator.wikimedia.org/T177647#4171020 (R4q3NWnUx2CEhVyr) I created a review https://gerrit.wikimedia.org/r/#/c/430069/ I am still lacking some testing particularly with regards to kafka handling.
[14:10:09] Analytics, Analytics-Kanban, EventBus, Patch-For-Review, Services (watching): Upgrade to Stretch and Java 8 for Kafka main cluster - https://phabricator.wikimedia.org/T192832#4171039 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['kafka1003.eqiad.wmnet'] ``` and were **ALL** succ...
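Editor's sketch of the logging suggestion made at 13:47: keep `set -e`, and print a short message to stdout before each step instead of relying on `set -x`, whose trace goes to stderr. The variable names are the ones quoted in the chat. The function name `archive_maxmind`, the placeholder default paths, and the exact `cp` and `hdfs` invocations are assumptions for illustration, not the contents of the real archive script.

```shell
#!/bin/bash
# Sketch only: archive_maxmind and the default paths are hypothetical.
# The variable names are the ones quoted in the chat.
set -e

MAXMIND_DB_SOURCE_DIR="${MAXMIND_DB_SOURCE_DIR:-/tmp/maxmind-demo/source}"
MAXMIND_DB_ARCHIVE_DIR="${MAXMIND_DB_ARCHIVE_DIR:-/tmp/maxmind-demo/archive}"
HDFS_ARCHIVE_DIR="${HDFS_ARCHIVE_DIR:-/tmp/maxmind-demo/hdfs-archive}"
CURRENT_DATE="${CURRENT_DATE:-$(date +%Y-%m-%d)}"

archive_maxmind() {
    local target="$MAXMIND_DB_ARCHIVE_DIR/$CURRENT_DATE"

    # Progress messages go to stdout, so a cron entry that sends stdout to
    # /dev/null stays quiet while a manual run shows what is happening.
    echo "Hardlink copying $MAXMIND_DB_SOURCE_DIR to $target"
    mkdir -p "$MAXMIND_DB_ARCHIVE_DIR"
    # cp -l (GNU coreutils) creates hard links instead of copying file data.
    # Assumes $target does not exist yet for this date.
    cp -rl "$MAXMIND_DB_SOURCE_DIR" "$target"

    if command -v hdfs > /dev/null 2>&1; then
        echo "Copying $target into HDFS at $HDFS_ARCHIVE_DIR"
        hdfs dfs -put -f "$target" "$HDFS_ARCHIVE_DIR"
    else
        echo "hdfs command not found, skipping the HDFS copy"
    fi
}
```

Calling `archive_maxmind` after setting the four variables runs both steps. Because of `set -e`, a failing `cp` or `hdfs` call stops the script right after the message that names the step.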
[14:17:57] Analytics, Analytics-Kanban, EventBus, Patch-For-Review, Services (watching): Upgrade to Stretch and Java 8 for Kafka main cluster - https://phabricator.wikimedia.org/T192832#4171053 (Ottomata) Reverted partman: https://gerrit.wikimedia.org/r/#/c/430072/
[14:29:46] ottomata: yayyyyy they are back to working! https://gerrit.wikimedia.org/r/#/c/430071/
[14:30:02] fdans: does karma testing work for you on dashiki?
[14:30:11] I upgraded something and mine's hanging with some stupid error
[14:30:29] not an error really, just says it's waiting for Phantom
[14:30:33] milimetric: lemme download, I haven't touched dashiki with this computer
[14:30:41] tried switching to Chrome, didn't work, strange
[14:31:54] Analytics, Analytics-Kanban: Upgrade Kafka on jumbo cluster to 1.1.0 (latest) - https://phabricator.wikimedia.org/T193495#4171069 (Ottomata) p:Triage>Normal
[14:32:38] wow nice fdans
[14:33:29] milimetric: which command are you running for the tests?
[14:33:36] karma start
[14:34:21] same thing happening with chrome
[14:34:23] hm...
[14:35:26] merged fdans
[14:35:37] thank youuu
[14:35:53] ottomata: the puppet patch is also ready for cr :P
[14:36:30] Hi my fellow js-masters
[14:36:57] milimetric: i'm getting
[14:36:59] I could do with some help - Pivot seems broken for datasource Banner Activity :(
[14:37:01] https://www.irccloud.com/pastebin/fWcWy5dI/
[14:37:11] joal: batcave?
[14:37:15] sure
[14:37:16] OMW
[14:37:42] huh fdans weird
[14:39:42] OMW is Oh My Wiki?
[14:40:24] dsaez: generally on my way, but i think it should be on my wiki from now on
[14:40:44] ;)
[14:42:09] elukey: there still is an issue with druid :(
[14:42:46] fdans: you tested the script changes as is?
[14:43:05] e.g. if i merge and don't test myself we are confident the cron will be ok? :)
[14:45:09] joal: yeah tasks are piling up
[14:45:11] like this morning
[14:45:17] elukey: :(
[14:45:48] what version of tranquility are we using?
[14:45:54] elukey: looking at that
[14:45:58] maybe it is not fully compatible with 0.10?
[14:46:45] elukey: tranquility-spark_2.11 version 0.8.2
[14:47:30] elukey: also, coordinator says that some datasources are not fully loaded (even without realtime)
[14:50:02] elukey: the version we use is the latest from maven
[14:50:37] https://github.com/druid-io/tranquility/releases looks like the last tagged, but in the readme they write 0.9.0 ?
[14:51:27] Analytics, Analytics-Kanban: Upgrade Kafka on jumbo cluster to 1.1.0 (latest) - https://phabricator.wikimedia.org/T193495#4171107 (Ottomata)
[14:51:50] hm elukey
[14:51:52] elukey: where's pivot running again?
[14:51:57] elukey: What machine does pivot run on?
[14:52:02] milimetric: thorium
[14:52:04] thx
[14:52:08] hello :)
[14:52:36] Analytics, Analytics-Wikistats, ORES, Scoring-platform-team: Discuss Wikistats integration for ORES - https://phabricator.wikimedia.org/T184479#4171115 (Halfak) One that I think would be interesting for Wikistats is a count of the number of non-redirect main namespace articles that fall into a se...
[14:52:39] joal: where did you see that the coordinator didn't load everything?
[14:52:43] elukey: I'm gonna try to kill the tasks that should be finished
[14:52:51] in coord main UI page
[14:52:57] red dots elukey
[14:55:25] the UI is a bit difficult to read, does it mean that for some reason segments are not loaded?
[14:55:32] correct elukey
[14:56:51] but it doesn't tell us which ones right?
[14:58:52] nothing weird in the logs, I am not sure where to look for weirdness
[14:59:47] a-team I’ll be a couple min late to standup
[15:00:14] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4171157 (Nuria) Ah, sorry I missed that you asked for this earlier. We shall talk internally about whether we can retain tho...
[15:01:48] https://github.com/druid-io/tranquility/pull/233 doesn't look good..
[15:02:14] ping fdans
[15:03:56] aaah sorry elukey, I forgot you were off, I owe you some vacation time
[15:05:00] milimetric: for what?? :D :D
[15:05:49] joal: I am checking http://localhost:8081/druid/coordinator/v1/loadstatus?full (after tunnelling to druid1001)
[15:05:50] elukey: I manually killed RT tasks for banner, cleaner now, but man this is no good :(
[15:06:27] ah snap now I see why, historical logs are showing weirdness
[15:06:44] io.druid.java.util.common.ISE: JavaScript is disabled
[15:06:45] elukey: what type?
[15:06:47] ahahahah
[15:07:02] elukey: this is the problem we have with pivot, shouldn't be related to loading
[15:07:25] ah is it known?? Didn't see it before
[15:07:33] logs are horrible
[15:07:46] full of those
[15:07:46] elukey: nope, new stuff
[15:07:56] but wouldn't pivot affect brokers?
[15:08:31] elukey: brokers transfer queries to historicals
[15:09:17] mmm it is not what I knew, I thought they (brokers) were making requests pulling data from one or more historicals and caching it
[15:09:52] elukey: broker receives the request from client, then, if no result cached, forwards it to historical nodes
[15:10:10] okok
[15:10:18] anyhow, the first occurrence of the javascript issue is
[15:10:18] 2018-04-30T15:32:56,492 ERROR io.druid.query.ChainedExecutionQueryRunner: Exception with one of the sequences!
[15:10:21] io.druid.java.util.common.ISE: JavaScript is disabled
[15:10:23] that is when we did the upgrade
[15:10:40] elukey: yessir, I think it is
[15:11:27] elukey: My understanding of the problem with realtime tasks: the realtime tasks don't manage to hand off segments to historical nodes (which correlates with the issue we see of historicals not having all segments for various datasources)
[15:12:17] elukey: No realtime data for the tasks I have killed, while before killing them, data was available
[15:12:37] but how does realtime data also affect, say, webrequests?
[15:12:52] elukey: I don't think banner affects webrequest
[15:13:04] elukey: I think historical nodes are having issues loading segments
[15:13:24] sure sure
[15:13:29] this is my understanding as well
[15:14:24] ok
[15:14:42] elukey: we are in the cave, talking about druid in a minute - do you want to join?
[15:15:05] are you guys handling it? If so I'd prefer to skip for today
[15:15:24] elukey: we're discussing it now - please leave, it's a holiday ;)
[15:15:55] ack thanks, will re-check later, ping me on hangouts if needed
[15:15:59] cheers
[15:20:15] ah joal look at druid1002's historical log
[15:20:16] Exception loading segment[pageviews-daily_2018-04-01T00:00:00.000Z_2018-05-01T00:00:00.000Z_2018-05-01T02:37:22.821Z_1]
[15:20:25] Caused by: java.lang.IllegalArgumentException: Could not resolve type id 'hdfs' into a subtype of [simple type, class io.druid.segment.loading.LoadSpec] at [Source: N/A; line: -1, column: -1]
[15:20:35] elukey: recent error?
[15:20:54] yeah
[15:20:58] and before I can see dataSource='pageviews-daily', binaryVersion='9'}}
[15:20:58] Right
[15:21:14] same thing that was happening yesterday on druid1001
[15:21:37] Yes, I recall that
[15:21:43] elukey: restart d1002?
[15:21:55] :(
[15:22:00] !log restart druid-historical on druid1002 - Caused by: java.lang.IllegalArgumentException: Could not resolve type id 'hdfs' into a subtype of
[15:22:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:22:09] it is loading segments now
[15:22:17] * joal thanks elukey a lot for still working while saying he wouldn't
[15:23:04] elukey: Do you think we should shake d1003 just in case?
[15:23:32] I am checking it now, seems fine :(
[15:23:59] I am wondering if there is some inconsistency when loading cached segments generated by druid 0.9.2
[15:24:09] elukey: segment loading loops: down to ~10, then back up to 85
[15:24:12] ah there you go, same error on 1003
[15:24:17] elukey: something is wrong :(
[15:26:39] elukey: it seems to look better
[15:33:38] I should be back :)
[15:33:45] Hello :)
[15:33:51] !log restart historical on druid1003 - exceptions in the logs
[15:33:52] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:34:03] * joal will thank elukey again and again today :)
[15:34:21] ah now 100% loaded
[15:34:28] uff I hope that it will not come back again
[15:34:42] elukey: SO DO I !!!!
[15:34:44] joal: what about real time then?
[15:34:54] elukey: realtime works super fine
[15:34:56] look at that
[15:35:17] so the historicals were causing this mess
[15:35:22] https://gist.github.com/jobar/964c58b84396bd1d2532f2ffab55290d
[15:35:28] yessir elukey
[15:35:51] elukey: RT tasks couldn't hand off their files to historical, therefore they didn't wanna die
[15:36:14] elukey: And actually, even with manual killing of the tasks, data is here
[15:37:08] that explains it yes
[15:37:35] luckily we followed Druid's instructions to avoid any production issue while upgrading
[15:37:40] * elukey grumbles
[15:37:45] elukey: the fact that data didn't disappear while I had the tasks killed manually makes me feel the thing is kinda resilient (when it works ...)
[15:37:53] elukey: indeed ...
[15:38:20] all right, logging off for real now, otherwise Marika will kill me :P
[15:38:23] byyyyeee
[15:38:25] o/
[15:38:46] Bye elukey - Thanks again !
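Editor's sketch of the segment-loading check discussed between 14:52 and 15:34. The coordinator UI only showed red dots, and elukey queried `/druid/coordinator/v1/loadstatus` through a tunnel to see what was missing. Without the `?full` parameter that endpoint returns a flat JSON map of datasource name to percentage loaded, and the hypothetical helper below lists every datasource under 100%. It assumes datasource names contain no commas, colons, quotes, braces or spaces. The sample values in the comment are illustrative, not taken from the incident.

```shell
#!/bin/bash
# Sketch only: not_fully_loaded is a hypothetical helper name.
# Reads the JSON map from /druid/coordinator/v1/loadstatus on stdin,
# for example {"pageviews-daily":85.5,"webrequest":100.0} (made-up values),
# and prints each datasource that is below 100% loaded.
not_fully_loaded() {
    # One "name:percent" pair per line, then keep the pairs under 100.
    tr ',' '\n' | tr -d '{}" ' | awk -F: 'NF == 2 && $2 < 100 { print $1 }'
}
```

With the tunnel from the chat in place, it could be used as `curl -s http://localhost:8081/druid/coordinator/v1/loadstatus | not_fully_loaded`. An empty result would correspond to the "100% loaded" state reported at 15:34.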
[15:43:04] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4171281 (Nuria) We will be doing the work of adding these fields and adding a work item to quantify the impact into identit...
[15:51:58] https://www.irccloud.com/pastebin/KVUfFyG9/
[15:55:05] Analytics, EventBus, JobRunner-Service, MediaWiki-Database, and 2 others: Wikimedia\Rdbms\LoadBalancer::{closure}: found writes pending - https://phabricator.wikimedia.org/T191282#4171298 (Tgr) ({T142313} is the task for adding context information to logs.)
[16:04:31] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171321 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1047.eqiad.wmnet'] ``` T...
[16:04:36] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171322 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1048.eqiad.wmnet'] ``` T...
[16:08:20] (PS4) Framawiki: Add export to HTML [analytics/quarry/web] - https://gerrit.wikimedia.org/r/427020 (https://phabricator.wikimedia.org/T117644)
[16:13:04] (CR) Framawiki: Add export to HTML (1 comment) [analytics/quarry/web] - https://gerrit.wikimedia.org/r/427020 (https://phabricator.wikimedia.org/T117644) (owner: Framawiki)
[16:34:08] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171411 (Ottomata)
[16:36:41] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171423 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1048.eqiad.wmnet'] ``` and were **ALL** successful.
[16:37:10] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171426 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1047.eqiad.wmnet'] ``` and were **ALL** successful.
[17:03:37] Analytics, Analytics-Kanban: Upgrade Kafka on jumbo cluster to 1.1.0 (latest) - https://phabricator.wikimedia.org/T193495#4171540 (Ottomata) Done in deployment-prep, looks good!
[17:10:25] Analytics, EventBus, MediaWiki-JobQueue, Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4171556 (Pchelolo)
[17:10:30] Analytics, EventBus, MediaWiki-JobQueue, Services (done): The .meta.domain is incorrect in EventBus when other wiki is used - https://phabricator.wikimedia.org/T192363#4171552 (Pchelolo) Open>Resolved a:Pchelolo This has been deployed, the domain is now reported correctly.
[17:11:37] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171561 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1046.eqiad.wmnet'] ``` T...
[17:11:41] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171562 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1045.eqiad.wmnet'] ``` T...
[17:28:23] Analytics, Analytics-Kanban: Upgrade Kafka on jumbo cluster to 1.1.0 (latest) - https://phabricator.wikimedia.org/T193495#4171599 (Ottomata) @elukey, everything is working in deployment-prep. Any objections if I start this tomorrow (wednesday?)
[17:33:49] (CR) Zhuyifei1999: [C: 1] Add export to HTML [analytics/quarry/web] - https://gerrit.wikimedia.org/r/427020 (https://phabricator.wikimedia.org/T117644) (owner: Framawiki)
[17:56:20] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171741 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1046.eqiad.wmnet'] ``` and were **ALL** successful.
[17:57:12] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4171743 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1045.eqiad.wmnet'] ``` and were **ALL** successful.
[18:03:32] Analytics-Dashiki, Analytics-Kanban: TimeseriesData tests broken in Dashiki - https://phabricator.wikimedia.org/T193513#4171759 (Milimetric)
[18:03:48] Analytics-Dashiki, Analytics-Kanban: TimeseriesData tests broken in Dashiki - https://phabricator.wikimedia.org/T193513#4171774 (Milimetric) p:Triage>High
[18:09:18] Analytics, Collaboration-Team-Triage, EventBus, MediaWiki-JobQueue, and 2 others: Make EchoNotification job JSON-serializable - https://phabricator.wikimedia.org/T192945#4171780 (jmatazzoni)
[18:17:32] Analytics, Analytics-Kanban: Upgrade Kafka on jumbo cluster to 1.1.0 (latest) - https://phabricator.wikimedia.org/T193495#4171827 (elukey) >>! In T193495#4171599, @Ottomata wrote: > @elukey, everything is working in deployment-prep. Any objections if I start this tomorrow (wednesday?) None, let me know...
[18:22:51] (PS1) Milimetric: Fix broken tests and karma versions [analytics/dashiki] - https://gerrit.wikimedia.org/r/430096 (https://phabricator.wikimedia.org/T193513)
[18:23:48] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: TimeseriesData tests broken in Dashiki - https://phabricator.wikimedia.org/T193513#4171876 (Milimetric)
[18:35:01] Analytics, EventBus, Wikimedia-Logstash, Patch-For-Review, Services (watching): EventBus HTTP Proxy service does not report errors to logstash - https://phabricator.wikimedia.org/T193230#4171923 (Ottomata) I've deployed this to prod, and restarted a few EventBus instances, but they don't send...
[18:42:57] milimetric: Found it! https://github.com/druid-io/druid/pull/3818
[18:43:19] I'll follow up tomorrow morning with Luca (http://druid.io/docs/0.10.0-rc2/development/javascript.html)
[18:43:24] nice joal
[18:43:34] yeah, sounds like a config tweak and we should be ok
[18:43:51] here's why pivot does what it does: https://github.com/druid-io/druid/pull/3818#issuecomment-272591722
[18:44:31] makes sense milimetric
[18:46:09] Analytics, EventBus, Wikimedia-Logstash, Patch-For-Review, Services (watching): EventBus HTTP Proxy service does not report errors to logstash - https://phabricator.wikimedia.org/T193230#4171975 (Ottomata) Ah, this did not actually work in deployment-prep. The logs I saw in logstash there we...
[18:46:27] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4171977 (Tbayer) Thanks! And sure, the future pageviews sanitizing project should apply here too (and e.g. I don't think we...
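Editor's sketch for the "config tweak" mentioned at 18:43. The chat does not spell out the change, so this assumes it is the `druid.javascript.enabled` property, which Druid leaves at `false` unless it is set explicitly. The helper name `javascript_status` and the file it is pointed at are hypothetical. It only reports what a given `runtime.properties` file currently says, using the last assignment found.

```shell
#!/bin/bash
# Sketch only: javascript_status is a hypothetical helper name.
# Prints "enabled" when the last druid.javascript.enabled assignment in the
# given properties file is "true", otherwise "disabled" (Druid's default).
javascript_status() {
    local props_file="$1"
    local last value

    # Keep only the final assignment, since later lines override earlier ones.
    last="$(grep -E '^[[:space:]]*druid\.javascript\.enabled[[:space:]]*=' \
        "$props_file" 2>/dev/null | tail -n 1)"
    value="$(printf '%s' "$last" | cut -d= -f2- | tr -d '[:space:]')"

    if [ "$value" = "true" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Running it against each Druid service's properties file would show which ones still raise the `JavaScript is disabled` error seen in the historical logs. A missing file is reported as "disabled".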
[18:48:01] Analytics: Publish data on seen page previews - https://phabricator.wikimedia.org/T193524#4171981 (Tbayer)
[18:52:10] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4172021 (Tbayer)
[18:52:12] Analytics: Publish data on seen page previews - https://phabricator.wikimedia.org/T193524#4172020 (Tbayer)
[18:52:32] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4172022 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1044.eqiad.wmnet'] ``` T...
[18:52:37] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4172023 (ops-monitoring-bot) Script wmf-auto-reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1043.eqiad.wmnet'] ``` T...
[19:14:01] Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2018-02-27 (1.31.0-wmf.23)), Patch-For-Review: Record and aggregate page previews - https://phabricator.wikimedia.org/T186728#4172176 (Nuria) >On the other hand, as a reminder, we already make some link usage data with (source, target) In aggreggat...
[19:24:51] (PS1) Milimetric: Add ability to pivot a file in tabs layout [analytics/dashiki] - https://gerrit.wikimedia.org/r/430113 (https://phabricator.wikimedia.org/T126279)
[19:36:09] (CR) Nuria: [V: 2 C: 2] "Tests work for me, they run despite npm install giving some errors." [analytics/dashiki] - https://gerrit.wikimedia.org/r/430096 (https://phabricator.wikimedia.org/T193513) (owner: Milimetric)
[19:38:47] Analytics, Analytics-Kanban, EventBus, Wikimedia-Logstash, and 2 others: EventBus HTTP Proxy service does not report errors to logstash - https://phabricator.wikimedia.org/T193230#4172492 (Ottomata)
[19:41:04] Analytics, Analytics-EventLogging, Analytics-Kanban, EventBus, Services (watching): Modern Event Platform (with EventLogging of the Future (EoF)) - https://phabricator.wikimedia.org/T185233#4172546 (Ottomata)
[19:58:11] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4172617 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1044.eqiad.wmnet'] ``` and were **ALL** successful.
[19:58:23] PROBLEM - HDFS corrupt blocks on analytics1001 is CRITICAL: 5 ge 5 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=39&fullscreen
[19:58:26] Analytics, Analytics-Kanban, Patch-For-Review: Reimage the Debian Jessie Analytics worker nodes to Stretch. - https://phabricator.wikimedia.org/T192557#4172618 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1043.eqiad.wmnet'] ``` and were **ALL** successful.
[19:58:45] ooo
[19:59:20] hm
[19:59:40] weird, i expect that to go down and recover shortly...i just finished reimaging 2 hosts
[19:59:43] will watch
[20:18:57] (PS2) Milimetric: Add ability to pivot a file in tabs layout [analytics/dashiki] - https://gerrit.wikimedia.org/r/430113 (https://phabricator.wikimedia.org/T126279)
[20:21:12] nuria_: k, the pivot thing is ready and documented, you can review and test if you want, but I added the other front endy folks too
[20:21:27] pivot thing: dashiki pivoting, not Imply Pivot
[20:33:11] Analytics, EventBus, MediaWiki-JobQueue, Goal, and 3 others: FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus - https://phabricator.wikimedia.org/T190327#4172699 (Pchelolo)
[21:09:58] ottomata: you have a minute?
[21:42:49] Analytics, Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Some metrics don't work in the topic selector - https://phabricator.wikimedia.org/T188268#4172865 (sahil505)
[21:44:04] fdans: this may just be a silly github UI question, but the uap-core folder in https://github.com/wikimedia/analytics-ua-parser looks empty, how does one access the actual content? Is that a direct reference to upstream?
[22:05:45] Analytics, Product-Analytics: Assess impact of ua-parser update on core metrics - https://phabricator.wikimedia.org/T193578#4172950 (Tbayer)
[22:07:45] HaeB: latest code https://gerrit.wikimedia.org/r/#/c/429527/
[22:24:15] RECOVERY - HDFS corrupt blocks on analytics1001 is OK: (C)5 ge (W)2 ge 1 https://grafana.wikimedia.org/dashboard/db/analytics-hadoop?orgId=1&panelId=39&fullscreen
[22:26:47] Analytics, Product-Analytics: Assess impact of ua-parser update on core metrics - https://phabricator.wikimedia.org/T193578#4172950 (Nuria) >What percentage of monthly Wikipedia unique devices (a core metric we report to the board on a monthly basis, calculated based on webrequests with agent_type = 'us...
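Editor's sketch of how the two "HDFS corrupt blocks" alert lines can be read. It assumes "5 ge 5" means the measured value reached the critical threshold, and "(C)5 ge (W)2 ge 1" means critical threshold 5, warning threshold 2 and a measured value of 1. The function name is hypothetical and the defaults are simply the numbers printed in the alert. This is not the monitoring check itself.

```shell
#!/bin/bash
# Sketch only: classify_corrupt_blocks is a hypothetical helper name.
# Usage: classify_corrupt_blocks COUNT [WARN] [CRIT]
classify_corrupt_blocks() {
    local count="$1"
    local warn="${2:-2}"   # assumed from "(W)2" in the RECOVERY line
    local crit="${3:-5}"   # assumed from "(C)5" in the RECOVERY line

    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL"
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}
```

Under that reading, the 19:58 alert is `classify_corrupt_blocks 5` and the 22:24 recovery is `classify_corrupt_blocks 1`. On a Hadoop node, `hdfs fsck / -list-corruptfileblocks` is one way to see which files are affected while the count is elevated.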
[22:40:13] (PS3) Nuria: [WIP] UA parser specification changes for OS version [analytics/ua-parser/uap-java] (wmf) - https://gerrit.wikimedia.org/r/429527 (https://phabricator.wikimedia.org/T189230)
[22:40:14] Analytics, Product-Analytics: Assess impact of ua-parser update on core metrics - https://phabricator.wikimedia.org/T193578#4173091 (Tbayer) >>! In T193578#4173032, @Nuria wrote: >>What percentage of monthly Wikipedia unique devices (a core metric we report to the board on a monthly basis, calculated ba...
[22:47:48] Analytics, Product-Analytics: Assess impact of ua-parser update on core metrics - https://phabricator.wikimedia.org/T193578#4173107 (Nuria) >but without some actual data I would not be comfortable relying on the assumption for our core metrics. Certainly the algorithm itself hasn't relied on it so far. m...