[07:06:31] morning! [07:06:50] from the failed oozie coordinator: [07:06:50] org.apache.oozie.command.CommandException: E0800: Action it is not running its in [OK] state, action [0026973-170228165458841-oozie-oozi-W@mark_raw_dataset_done] [07:07:33] it might be something like the last time, so either hdfs or yarn daemons on one of the debian nodes misbheaving [07:14:11] !log restarted webrequest-load-wf-misc-2017-3-20-3 [07:14:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [07:26:42] but I can't find anything.. [07:26:43] mmmm [07:43:32] 10Analytics, 06Operations, 15User-Elukey: Investigate recent Kafka Burrow alarms for EventLogging - https://phabricator.wikimedia.org/T160886#3113658 (10elukey) [07:46:20] 10Analytics, 06Operations, 15User-Elukey: Investigate recent Kafka Burrow alarms for EventLogging - https://phabricator.wikimedia.org/T160886#3113672 (10elukey) [07:47:53] opened --^ for the Burrow lag alert [07:48:27] (brb) [08:09:43] 10Analytics, 10Analytics-EventLogging, 10DBA, 10ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#3113686 (10Tgr) Probably some Labs box or personal vagrant install still has ImageMetrics installed. I guess this is a m... [08:21:01] hello team :] [08:21:17] o/ [08:23:08] \o [08:23:27] xD this does not look good [08:56:25] 10Analytics, 10Analytics-General-or-Unknown, 10Wikidata, 07Story: [Story] Statistics for Wikidata API usage - https://phabricator.wikimedia.org/T64873#3113711 (10Lydia_Pintscher) @Addshore yes except it is broken. [09:02:54] 10Analytics, 10Analytics-General-or-Unknown, 10Wikidata, 07Story: [Story] Statistics for Wikidata API usage - https://phabricator.wikimedia.org/T64873#3113718 (10Addshore) >>! In T64873#3113711, @Lydia_Pintscher wrote: > @Addshore yes except it is broken. Ahhh, I was unaware of this! ``` PHP Fatal error... [09:18:16] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3113734 (10Addshore) [09:18:36] 10Analytics, 10Analytics-General-or-Unknown, 10Wikidata, 07Story: [Story] Statistics for Wikidata API usage - https://phabricator.wikimedia.org/T64873#3113747 (10Addshore) I have created a task that should fix this dashboard @ T160888 [09:19:21] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3113734 (10Addshore) p:05Triage>03High [09:23:42] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3113734 (10elukey) Seems a perfect match with merge time of https://gerrit.wikimedia.org/r/#/c/341570 [09:55:38] addshore: o/ - going to check what's wrong with the rsync, will let you know in a bit [09:56:51] elukey: awesome! :) [09:57:03] I'm sur the logs (whereever they are) will probably make it fairly obvious :) [10:04:20] I am pretty sure it is a network acls problem, but I need to figure out which rule needs to be changed.. in theory we did all the work needed, in practice somethig is missing :D [10:10:08] elukey: if it helps this is the origional ticket that the rsync was added for https://phabricator.wikimedia.org/T118739 [10:11:20] wait... not [10:11:24] thats totally unrelated ... :D [10:15:11] elukey: Hi ! 
I think we need a deep investigation on why some jobs fail erratically :( [10:15:39] joal: hello! I tried this morning but found nothing in the logs [10:15:40] :( [10:15:46] same for me this weekend [10:16:07] addshore: I tried to run the rsync manually and it starts from api.log-20170224.gz, but it takes a long time to send the files (30/50GB each) [10:17:46] hi joal :] [10:17:56] Hi mforns [10:18:57] elukey, joal, I'm about to create the temporary keyspace+tables in prod cassandra to test legacy pageviews load job [10:19:32] k mforns - same advise as for fdans: Please use an explicitly different keyspace name (for us not to mistake) [10:20:28] I've changed the original keyspace name to test_lgc_pageviews_per_project_test and the other only change is 'eqiad': 3 instead of 'datacenter1': 3 [10:20:45] do you think this is an enough recognizable keyspace name? [10:21:13] sounds good mforns :) Thanks ! [10:21:22] cool thx for the check [10:23:28] done [10:25:05] elukey, joal, anything against me launching the job to load legacy pageviews into prod cassandra now (5.5GB gzipped)? [10:25:18] mforns: no problem for me :) [10:26:20] k joal :], elukey? [10:26:21] +1 [10:27:36] thx :] [10:36:37] addshore: I am trying to run the rsync manually, I am seeing some discrepancies between modification times on mwlog1001 and stat1002, so rsync is restarting from api.log-20170224.gz [10:36:55] aaaah [10:37:03] so It's trying to sync all of the old stuff too [10:37:07] yeah [10:38:29] aaaaand the notifications are going to MAILTO=otto@wikimedia.org [10:38:31] :P:P:P [10:40:44] hahaha [10:41:46] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3113859 (10elukey) I checked on mwlog1001 and stat1001, the network connection seems fine and not blocked. For some reason files... [10:54:19] joal: the two failed misc/maps coordinators seemed to break in mark_raw_dataset_done [10:54:59] I don't recall what the step does [11:01:39] elukey: I think the job I restarted this weekend failedin refine stage [11:02:39] elukey: mark_raw_dataset_done flagged the raw data as to be checked from statistics perspective [11:05:51] joal: yep it fails in the refine stage looking via -job info [11:06:01] -job logs shows mark_raw_dataset_done failures [11:06:12] :( [11:06:38] I was wondering what things are done in that step, like writing in there, reading etc.. there could be some I/O that fails [11:06:50] maybe buried under some logs :D [11:07:42] elukey: indeed, the mark dataset done is about writing a empty file ! [11:07:48] (03PS7) 10Mforns: Add oozie workflow to load projectcounts to AQS [analytics/refinery] - 10https://gerrit.wikimedia.org/r/339421 (https://phabricator.wikimedia.org/T156388) [11:11:09] joal: ok grepping error on hdfs logs helps [11:11:10] I can see [11:11:11] /var/log/hadoop-hdfs/hadoop-hdfs-datanode-analytics1042.log:2017-03-19 23:02:26,545 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: analytics1042.eqiad.wmnet:50010:DataXc [11:11:15] eiver error processing WRITE_BLOCK operation src: /10.64.36.130:53639 dst: /10.64.53.22:50010 [11:11:32] elukey: Had not seen that! [11:14:18] it says Connection reset by peer [11:14:21] that is weird :D [11:14:32] hm [11:16:57] well I find them in other non debian hosts too [11:17:02] so it might be a red herring [11:21:25] elukey: you're investigating the misc error, right? 
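The checks being discussed above boil down to three views of the same failure: Oozie's own record of the workflow, the aggregated YARN container logs, and the datanode logs that surfaced the DataXceiver errors. A condensed sketch of those commands, plus a look at the empty done-flag that mark_raw_dataset_done is supposed to write — the workflow/application IDs are the ones quoted in this log, while the done-flag path is a placeholder, not the real refinery layout:

```bash
# Oozie: overall workflow state vs. per-action error log (assumes OOZIE_URL is set).
oozie job -info 0026973-170228165458841-oozie-oozi-W
oozie job -log  0026973-170228165458841-oozie-oozi-W

# YARN: aggregated container logs for the underlying application.
sudo -u hdfs yarn logs -applicationId application_1488294419903_58912 | less

# HDFS datanode side: the grep that surfaced the WRITE_BLOCK / connection-reset errors.
grep ERROR /var/log/hadoop-hdfs/hadoop-hdfs-datanode-*.log | grep -E 'WRITE_BLOCK|Connection reset'

# mark_raw_dataset_done only writes an empty flag file; check whether it landed
# (directory below is an illustrative placeholder).
hdfs dfs -ls /wmf/data/raw/webrequest/webrequest_misc/hourly/2017/03/20/03/
```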
[11:26:41] (03PS8) 10Mforns: Add oozie workflow to load projectcounts to AQS [analytics/refinery] - 10https://gerrit.wikimedia.org/r/339421 (https://phabricator.wikimedia.org/T156388) [11:27:55] (03PS14) 10Joal: Add oozie jobs for mw history denormalized [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341030 (https://phabricator.wikimedia.org/T160074) [11:27:57] (03PS8) 10Joal: Add oozie job for standard metrics computation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/342197 (https://phabricator.wikimedia.org/T160151) [11:30:12] hey joal :] I applied some corrections to the oozie job, that was buggy. I'm also thinking to load partial data, like 1 year only, so that I can check faster. Is there any drawback in loading partial data? [11:30:38] mforns: not that I can think of [11:30:41] k [11:30:49] mforns: You can easily drop some files [11:31:16] mforns: since the files are generated automatically, formnat is coherent, so working 1 file should be the same as wrorking 10 [11:31:23] joal, sure [11:31:33] mforns: hopefully ;) [11:31:38] :] [11:36:13] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114094 (10elukey) This is probably the current error: ``` elukey@stat1002:/a/mw-log/archive/api$ sudo -u stats /usr/bin/rsync -... [11:36:43] joal: yes the last one [11:36:52] but IIRC it is the same as the maps one [11:37:06] elukey: cause the error I see on the maps one (the one from this weekend) are different :( [11:37:29] what is the coord id? [11:39:46] elukey: workflow: 0024822-170228165458841-oozie-oozi-W [11:41:48] E0800: Action it is not running its in [OK] state, action [0024822-170228165458841-oozie-oozi-W@mark_raw_dataset_done [11:41:52] isn't thi the same one? [11:42:40] elukey: Not from oozie job --info 0024822-170228165458841-oozie-oozi-W [11:45:32] yes but it fails in refine too as the other workflow [11:46:14] mmm I can also see [11:46:15] org.apache.oozie.command.CommandException: E0800: Action it is not running its in [OK] state, action [0024822-170228165458841-oozie-oozi-W@mark_add_partition_done] [11:46:37] elukey: how? [11:47:59] joal: how? [11:48:04] sorry didn't get the question [11:48:18] elukey: How did you get this error? [11:48:41] ah with -log instead of -info [11:49:16] Ah ok :) [11:50:16] (sorry slow monday, it takes a while to parse and execute :) [11:50:55] elukey: I checked yarn logs for the job that is given in the --info result: sudo -u hdfs yarn logs --applicationId application_1488294419903_58912 [11:51:04] elukey: Errors I see are VERY weird [11:51:54] (03PS9) 10Joal: Add oozie job for standard metrics computation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/342197 (https://phabricator.wikimedia.org/T160151) [11:51:57] (03PS9) 10Joal: Add oozie job loading MW history in druid [analytics/refinery] - 10https://gerrit.wikimedia.org/r/328154 (https://phabricator.wikimedia.org/T141473) [11:56:28] like https://yarn.wikimedia.org/jobhistory/logs/analytics1036.eqiad.wmnet:8041/container_e42_1488294419903_58918_01_000001/job_1488294419903_58918/hdfs/syslog/?start=0 ? 
[11:56:35] org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at SUCCESS_FINISHING_CONTAINER [11:58:24] nope, not that one [11:59:39] 10Analytics, 10Analytics-General-or-Unknown, 06WMDE-Analytics-Engineering, 10Wikidata, and 2 others: [Story] Statistics for Wikidata API usage - https://phabricator.wikimedia.org/T64873#3114115 (10Addshore) 05Open>03Resolved a:03Addshore [12:00:07] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114123 (10elukey) Restarted with one day of contimeout and --progress: ``` elukey@stat1002:/a/mw-log/archive/api$ sudo -u stats... [12:01:44] brb [12:06:02] (03PS10) 10Joal: Add oozie job loading MW history in druid [analytics/refinery] - 10https://gerrit.wikimedia.org/r/328154 (https://phabricator.wikimedia.org/T141473) [12:20:02] (03PS10) 10Joal: Add oozie job for standard metrics computation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/342197 (https://phabricator.wikimedia.org/T160151) [12:20:04] (03PS11) 10Joal: Add oozie job loading MW history in druid [analytics/refinery] - 10https://gerrit.wikimedia.org/r/328154 (https://phabricator.wikimedia.org/T141473) [12:27:50] 06Analytics-Kanban, 10Fundraising-Backlog, 13Patch-For-Review: Productionize banner impressions druid/pivot dataset - https://phabricator.wikimedia.org/T155141#3114172 (10JAllemandou) Hi @Jseddon , We setup a (non-production) near-realtime job a while ago indeed. Couple of weeks ago, we upgraded our cluster... [12:41:05] joal, yt? [12:41:10] I am mforns [12:41:16] :] [12:41:29] I assume that aqs_loader's password has to be passed by hand? [12:41:42] mforns: correct [12:41:53] and mforns, no _ in name [12:41:56] aqsloader [12:42:03] oh yes [12:42:45] joal, is aqsloader's password the same as aqs's password? [12:45:03] nope - Giving you password in provate [12:45:56] thx! [12:56:29] a-team: I have a fully succesfully tested flow for denormalized history :) [12:56:32] YAY ! [12:56:47] \\\o/// [13:04:39] \o/ [13:07:57] mforns: if you want to double check, only change noticeable is the addition of a dataset specific for hive synchronisation in https://gerrit.wikimedia.org/r/#/c/341030/14/oozie/mediawiki/history/datasets.xml (see comment at the beginning of file) [13:08:30] Before merging, I'll ask ottomata to confirm he's ok with it (flag and dataset names) [13:08:36] And now, a break :) [13:08:38] Laters lads [13:08:43] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114299 (10Ottomata) > should we change the MAILTO option in the stats crontab Oo, does it email me? For sure let's change it. 
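Two small follow-ups to the rsync thread above (T160888): the manual catch-up run with a long connection timeout and progress output, and pointing the cron's MAILTO somewhere other than one person's inbox. The rsync module and paths below are assumptions for illustration — the real source and destination are whatever puppet configures on mwlog1001/stat1002 — and the mail address is a stand-in, not a decision:

```bash
# Manual catch-up run, dry-run first (-n); drop -n for the real copy.
# --contimeout is the daemon-connection timeout mentioned above (one day = 86400s).
sudo -u stats rsync -avn --contimeout=86400 --progress \
    mwlog1001.eqiad.wmnet::udp2log/archive/api.log-201703* /a/mw-log/archive/api/

# In the owning crontab, route output/failures to a watched address, e.g.:
#   MAILTO=analytics-alerts@example.org
```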
[13:08:47] bye joal [13:14:31] the pageview coordinator that failed reports TA_TOO_MANY_FETCH_FAILURE as misc/maps [13:14:34] sigh [13:20:38] so the mapred job failed for [13:20:38] java.io.FileNotFoundException: /var/lib/hadoop/data/g/yarn/logs/application_1488294419903_65012/container_e42_1488294419903_65012_01_000001 [13:21:59] !log restarted pageview-hourly-wf-2017-3-20-11 [13:22:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [13:22:25] 10Analytics, 06Operations, 15User-Elukey: Investigate recent Kafka Burrow alarms for EventLogging - https://phabricator.wikimedia.org/T160886#3114320 (10Ottomata) > it seems that we should set broker.version.fallback=0.9.0.1 everywhere Hm, maybe, except that the burrow alarms we got are all for consumer proc... [13:24:24] 10Analytics, 06Operations, 15User-Elukey: Investigate recent Kafka Burrow alarms for EventLogging - https://phabricator.wikimedia.org/T160886#3114326 (10elukey) Nono everything auto-recovered by itself, I just reported the errors in the task/email. The only thing that died was mirrormaker but after the puppe... [13:39:58] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114349 (10elukey) The above job failed, so I tried to rsync only the March files and I can see success: ``` elukey@stat1002:~$... [13:48:05] joal: \o/ [13:48:08] that's awesome [13:50:31] i'm having a hard time distinguishing between all the alerts today, which are testing, handled already, or real [13:51:30] milimetric: all handled (more or less), crazy day :) [13:51:40] there seems to be a hdfs issue ongoing though [13:52:01] hm, a particular box or all over? [13:53:06] difficult to tell, sometimes there is a failure that seems (not really sure) that might be related to not finding things on hdfs [14:00:44] thing is, it doesn't happen only on Debian nodes [14:00:50] that was my first suspicion [14:03:55] 10Analytics, 06Operations, 15User-Elukey: Investigate recent Kafka Burrow alarms for EventLogging - https://phabricator.wikimedia.org/T160886#3114387 (10elukey) >>! In T160886#3114320, @Ottomata wrote: > > This reminds me though...we upgraded librdkafka everywhere except for eventlog1001! It is still runni... [14:24:50] elukey, no logs in yarn neither [14:30:03] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114426 (10elukey) So after my tests we should see files syncing correctly during the next couple of days, and this will unblock... [14:30:49] 10Analytics, 06Operations, 15User-Elukey: Investigate recent Kafka Burrow alarms for EventLogging - https://phabricator.wikimedia.org/T160886#3114439 (10elukey) The prev comment is of course not related, sorry. [14:31:29] mforns: appid? [14:32:05] ah I should have it [14:37:53] 10Analytics-Tech-community-metrics: Have "Last Attracted Developers" information for Gerrit (already exists for Git) - https://phabricator.wikimedia.org/T151161#3114457 (10Aklapper) That `git_demographics_newcomers` visualization shown in the "Attracted developers" widget is based on the `author_min_date` field... [14:42:13] elukey, sorry 1 sec [14:42:25] elukey, application_1488294419903_65135 [14:42:37] yeah I checked, really weird [14:42:40] no headphones [14:42:46] we might need joal's superpowers [14:42:46] no logs? 
:( [14:42:49] ok [14:53:21] 10Analytics, 10Analytics-Dashiki: Change default timeline for browser reports to be recent (not 2015) - https://phabricator.wikimedia.org/T160796#3111134 (10Milimetric) ok, so we have two things to consider: default timespans and override timespans. So two questions, keeping in mind this would change all dash... [15:00:38] 10Analytics, 10Analytics-EventLogging, 10DBA, 10ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#3114522 (10Nuria) >Probably some Labs box or personal vagrant install still has ImageMetrics installed. ' Neither would... [15:03:09] (03CR) 10Ottomata: [C: 031] Add snapshot to sqoop and namespace_map scripts [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341586 (https://phabricator.wikimedia.org/T160152) (owner: 10Joal) [15:12:48] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3113734 (10Nuria) We have to fix this at this time but ... have we though to publish this info via kafka instead of udp2log? medi... [15:19:34] (03CR) 10Ottomata: "Couple of nits." (032 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341030 (https://phabricator.wikimedia.org/T160074) (owner: 10Joal) [15:20:09] milimetric: , this reportupdater job is ready for merge? if so i'll do it [15:20:09] https://gerrit.wikimedia.org/r/#/c/343246 [15:20:19] OH< You said it was [15:20:20] thanks [15:23:05] yep, it's ready [15:23:45] 10Analytics, 06WMDE-Analytics-Engineering, 10Wikidata: [Task] dashboard showing browser usage distribution for Wikidata - https://phabricator.wikimedia.org/T130102#3114564 (10Nuria) {F4079411} [15:24:38] 10Analytics, 06WMDE-Analytics-Engineering, 10Wikidata: [Task] dashboard showing browser usage distribution for Wikidata - https://phabricator.wikimedia.org/T130102#3114566 (10Nuria) Will be closing this task as http://pivot.wikimedia.org ( to which wikimedia de has access) provides this data. [15:26:22] 10Analytics, 06WMDE-Analytics-Engineering, 10Wikidata: [Task] dashboard showing browser usage distribution for Wikidata - https://phabricator.wikimedia.org/T130102#3114575 (10Nuria) Also, have in mind that browser usage is really not that different per project and overall, the overall info should be sufficie... [15:26:39] 10Analytics, 06WMDE-Analytics-Engineering, 10Wikidata: [Task] dashboard showing browser usage distribution for Wikidata - https://phabricator.wikimedia.org/T130102#3114587 (10Nuria) 05Open>03Resolved [15:27:08] 10Analytics, 10Analytics-Cluster, 06DC-Ops, 06Operations, 10ops-eqiad: Analytics1028 hdfs daemon died because of disk errors - https://phabricator.wikimedia.org/T159632#3114589 (10Ottomata) @Cmjohnson can we take care of this this week? 
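Since the working theory earlier was a single misbehaving HDFS/YARN daemon (and analytics1028 already has a known disk problem), it can help to ask the masters which workers look unhealthy before digging through per-job logs. A quick sketch, nothing cluster-specific assumed beyond running as the hdfs superuser; the fsck path is illustrative:

```bash
# Datanode report: capacity, last contact, dead or decommissioning nodes.
sudo -u hdfs hdfs dfsadmin -report | less

# NodeManagers: anything UNHEALTHY or LOST shows up here.
yarn node -list -all

# Optional and heavier: block-level check scoped to a suspect directory tree.
sudo -u hdfs hdfs fsck /wmf -files -blocks -locations | tail -n 40
```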
[15:30:37] done Cron[reportupdater_limn-edit-data-edit-beta-features]/ensure: created [15:30:42] milimetric: ^ [15:30:59] cool, thx [15:31:54] 06Analytics-Kanban: Implement Pages Created & Count of Edits full vertical slice - https://phabricator.wikimedia.org/T131779#3114619 (10Nuria) [15:31:56] 06Analytics-Kanban: Load edit history data into Druid - https://phabricator.wikimedia.org/T131786#3114618 (10Nuria) 05Open>03Resolved [15:32:11] 06Analytics-Kanban: Implement Pages Created & Count of Edits full vertical slice - https://phabricator.wikimedia.org/T131779#2178389 (10Nuria) [15:32:13] 06Analytics-Kanban: Load edit history data into Druid - https://phabricator.wikimedia.org/T131786#2178517 (10Nuria) 05Resolved>03Open [15:32:40] 06Analytics-Kanban: User history in hadoop - https://phabricator.wikimedia.org/T134793#3114624 (10Nuria) [15:32:42] 10Analytics, 07Spike: Spike - Slowly Changing Dimensions on Druid - https://phabricator.wikimedia.org/T134792#3114622 (10Nuria) 05Open>03Resolved [15:33:39] (03PS15) 10Joal: Add oozie jobs for mw history denormalized [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341030 (https://phabricator.wikimedia.org/T160074) [15:33:41] (03PS11) 10Joal: Add oozie job for standard metrics computation [analytics/refinery] - 10https://gerrit.wikimedia.org/r/342197 (https://phabricator.wikimedia.org/T160151) [15:33:43] (03PS12) 10Joal: Add oozie job loading MW history in druid [analytics/refinery] - 10https://gerrit.wikimedia.org/r/328154 (https://phabricator.wikimedia.org/T141473) [15:34:54] 06Analytics-Kanban: Redact data so it can be public - https://phabricator.wikimedia.org/T145091#3114631 (10Nuria) Not resolved but declined as we are importing data from labs, and that data is already public. [15:35:09] 06Analytics-Kanban: Redact data so it can be public - https://phabricator.wikimedia.org/T145091#3114632 (10Nuria) 05Open>03Resolved [15:35:11] 06Analytics-Kanban: Wikistats 2.0. - https://phabricator.wikimedia.org/T130256#3114633 (10Nuria) [15:36:50] (03CR) 10Joal: "Done !" (032 comments) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341030 (https://phabricator.wikimedia.org/T160074) (owner: 10Joal) [15:37:39] 10Analytics, 10Analytics-Dashiki: Dashiki should load stale data if new data is not available due to network/api conditions - https://phabricator.wikimedia.org/T138647#3114639 (10Nuria) [15:40:23] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114646 (10Addshore) >>! In T160888#3114531, @Nuria wrote: > We have to fix this at this time but ... have we though to publish t... [15:41:32] 06Analytics-Kanban, 10Analytics-Wikistats: Develop site map and overview and detail page wireframes - https://phabricator.wikimedia.org/T153466#3114653 (10Nuria) Work on visuals can be seen here, closing: https://phabricator.wikimedia.org/T152033 [15:41:38] 06Analytics-Kanban, 10Analytics-Wikistats: Visual Language for http://stats.wikimedia.org replacement - https://phabricator.wikimedia.org/T152033#3114655 (10Nuria) [15:41:40] 06Analytics-Kanban, 10Analytics-Wikistats: Develop site map and overview and detail page wireframes - https://phabricator.wikimedia.org/T153466#3114654 (10Nuria) 05Open>03Resolved [15:41:54] 06Analytics-Kanban: Wikistats 2.0. 
- https://phabricator.wikimedia.org/T130256#3114657 (10Nuria) [15:45:21] 10Analytics, 10Analytics-EventLogging, 06Collaboration-Team-Triage, 10MediaWiki-ContentHandler, and 5 others: Multiple MediaWiki hooks are not documented on mediawiki.org - https://phabricator.wikimedia.org/T157757#3114686 (10Nuria) p:05Triage>03Low [15:46:35] 10Analytics: Update undocumented EventLogging mediawiki hooks - https://phabricator.wikimedia.org/T158331#3033632 (10Nuria) https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/1ed64ccc32bcd56149b7a95f16c895239edd64b1/includes/EventLoggingHooks.php#L79 [15:50:59] 10Analytics, 06WMDE-Analytics-Engineering, 10Wikidata: [Task] dashboard showing browser usage distribution for Wikidata - https://phabricator.wikimedia.org/T130102#3114710 (10Addshore) Thanks for the look to pivot @Nuria! I'll add a link to it from our list of dashboards. https://pivot.wikimedia.org/#pagevie... [15:51:35] 10Analytics, 10Analytics-Dashiki: Change default timeline for browser reports to be recent (not 2015) - https://phabricator.wikimedia.org/T160796#3114711 (10Nuria) p:05Triage>03High [15:51:42] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114712 (10Ottomata) Oh! Is all this API data from mw-log udp2log already in Kafka then? If so, we can use kafkatee now to writ... [15:54:31] 10Analytics, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114716 (10Addshore) >>! In T160888#3114712, @Ottomata wrote: > Oh! Is all this API data from mw-log udp2log already in Kafka th... [15:55:16] 10Analytics, 06Editing-Analysis: editor-engagement dashboard on edit-analysis stopped updating on ~ 2017-02-21 - https://phabricator.wikimedia.org/T160807#3111614 (10Nuria) Helen was working on this dashboards but we do not think her work was finished. Let us know how you want to proceed, this data is availabl... [15:55:30] 10Analytics, 06Editing-Analysis: editor-engagement dashboard on edit-analysis stopped updating on ~ 2017-02-21 - https://phabricator.wikimedia.org/T160807#3114721 (10Nuria) p:05Triage>03Normal [15:56:09] 10Analytics, 10Analytics-Cluster: Filter local IPs before checking for geo info - https://phabricator.wikimedia.org/T160822#3114737 (10Nuria) p:05Triage>03Normal [15:58:40] 10Analytics, 10Analytics-General-or-Unknown, 10Wikidata: Grafana: "wikidata-api" doesn't update anymore - https://phabricator.wikimedia.org/T160825#3112111 (10Nuria) ping @Addshore [15:59:39] 06Analytics-Kanban, 06Operations, 06WMDE-Analytics-Engineering, 15User-Addshore: /a/mw-log/archive/api on stat1002 no longer being populated - https://phabricator.wikimedia.org/T160888#3114753 (10Nuria) [16:00:11] 10Analytics, 10Analytics-General-or-Unknown, 10Wikidata: Grafana: "wikidata-api" doesn't update anymore - https://phabricator.wikimedia.org/T160825#3112111 (10JAllemandou) Just had a quick look at oozie jobs, and they seem successfull. Let's trouble that with @addshore. 
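On the idea floated in the T160888 comments above — reading the api.log stream from Kafka (for example via kafkatee) instead of udp2log plus rsync — a quick way to confirm whether messages are actually flowing is to peek at the topic with kafkacat. The broker and topic names below are guesses purely for illustration and would need checking against the real setup:

```bash
# Consume the last 10 messages and exit (-o -10: ten before the end, -e: exit at end of partition).
kafkacat -C -b kafka1012.eqiad.wmnet:9092 -t mediawiki.api-request -o -10 -c 10 -e
```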
[16:01:21] 10Analytics, 10Analytics-General-or-Unknown, 06WMDE-Analytics-Engineering, 10Wikidata, 15User-Addshore: Grafana: "wikidata-api" doesn't update anymore - https://phabricator.wikimedia.org/T160825#3114766 (10Addshore) [16:02:13] 10Analytics, 14Trash: --- Items above are triaged ----------------------- - https://phabricator.wikimedia.org/T115634#3114768 (10Nuria) p:05Triage>03High [16:02:39] 10Analytics, 10Analytics-General-or-Unknown, 06WMDE-Analytics-Engineering, 10Wikidata, 15User-Addshore: Grafana: "wikidata-api" doesn't update anymore - https://phabricator.wikimedia.org/T160825#3112111 (10Addshore) This will be fixed once T160888 is closed. The data for this dashboard still comes from t... [16:03:01] 10Analytics, 10Analytics-General-or-Unknown, 06WMDE-Analytics-Engineering, 10Wikidata, 15User-Addshore: Grafana: "wikidata-api" doesn't update anymore - https://phabricator.wikimedia.org/T160825#3112111 (10Addshore) a:03Addshore [16:03:43] 10Analytics, 14Trash: --- Items above are triaged ----------------------- - https://phabricator.wikimedia.org/T115634#1728670 (10Nuria) 05Open>03stalled p:05High>03Lowest [16:05:42] 10Analytics, 10Dumps-Rewrite: Improve mediawiki data redaction - https://phabricator.wikimedia.org/T146444#3114786 (10Nuria) Analytics is importing data for mediawiki edit reconstruction from labs, data is public thus redaction is not needed. [16:06:48] 10Analytics, 10Dumps-Rewrite: Improve mediawiki data redaction - https://phabricator.wikimedia.org/T146444#3114787 (10Nuria) 05Open>03declined [16:07:22] 10Analytics: Make Spark 2.1 easily available on new CDH5.10 cluster - https://phabricator.wikimedia.org/T158334#3114789 (10Nuria) p:05Triage>03Normal [16:07:51] 10Analytics: Make Spark 2.1 easily available on new CDH5.10 cluster - https://phabricator.wikimedia.org/T158334#3033748 (10Nuria) Will help us solve oozie-hive issues with HiveContext (currently we are working around those) [16:08:27] 10Analytics: Alarm on data quality issues - https://phabricator.wikimedia.org/T159840#3114797 (10Nuria) p:05Triage>03Normal [16:13:25] 10Analytics: Alarm on data quality issues - https://phabricator.wikimedia.org/T159840#3080411 (10Milimetric) Take a look at Prophet for forecasting that could automatically detect data quality issues: https://facebookincubator.github.io/prophet/ [16:15:04] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#3114835 (10Nuria) Why don't we have a meeting to talk about how do we want to evolve api error logs? We have now api publishing into hadoop but also we are maintaining udp2... [16:15:12] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#3114836 (10Nuria) p:05Triage>03Low [16:16:15] 10Analytics, 10Analytics-Dashiki, 13Patch-For-Review: Create dashboard for upload wizard - https://phabricator.wikimedia.org/T159233#3061266 (10Nuria) what is the priority of this work? Does it fall under editing? reading? 
[16:18:13] 10Analytics, 10Analytics-Dashiki: Add error component to Dashiki - https://phabricator.wikimedia.org/T157697#3114853 (10Nuria) p:05Triage>03Low [16:18:31] 10Analytics, 10Analytics-Dashiki, 07Easy: Add error component to Dashiki - https://phabricator.wikimedia.org/T157697#3114855 (10Nuria) [16:20:17] 10Analytics, 10Analytics-Cluster: Hadoop: Remove priority queue and add a new one with lower-than-user priorty - https://phabricator.wikimedia.org/T156841#3114858 (10Nuria) p:05Triage>03Normal [16:21:20] 10Analytics: Find out what happens to the old rows in the revision table - https://phabricator.wikimedia.org/T142535#2538504 (10Nuria) p:05Triage>03Lowest [16:22:48] 10Analytics, 10Analytics-Cluster: Hadoop: Remove priority queue and add a new one with lower-than-user priorty - https://phabricator.wikimedia.org/T156841#3114861 (10JAllemandou) [16:23:35] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#2944303 (10Anomie) > We have now api publishing into hadoop but also we are maintaining udp2log infrstructure. I note the hadoop data and the udp2log data serve different... [16:23:48] 10Analytics: User limits for stat machines. Limit space on /home dir and possibly /tmp - https://phabricator.wikimedia.org/T151904#3114866 (10Nuria) p:05Triage>03Normal [16:24:50] 10Analytics: Meta-statistics on MediaWiki history reconstruction process - https://phabricator.wikimedia.org/T155507#3114867 (10Nuria) p:05Triage>03Normal [16:37:17] 10Analytics, 06Editing-Analysis: editor-engagement dashboard on edit-analysis stopped updating on ~ 2017-02-21 - https://phabricator.wikimedia.org/T160807#3111614 (10Neil_P._Quinn_WMF) @Nuria, this is the new dashboard that Helen created. It stopped working on February 21—none of us (Helen included) touched it... [16:39:14] 10Analytics, 06Editing-Analysis: editor-engagement dashboard on edit-analysis stopped updating on ~ 2017-02-21 - https://phabricator.wikimedia.org/T160807#3114936 (10Nuria) Right, I understand. I am wondering if these are needed in the light of all data for edit reconstruction being available in pivot. Let us... [16:51:19] ottomata, mforns, milimetric: SHould we go and merge those patches? [16:51:39] joal, mediawiki history ones? [16:51:45] yessir [16:52:53] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#2944303 (10Ottomata) > We have now api publishing into hadoop but also we are maintaining udp2log infrstructure. This really should be stated as API logs published to Kafka... [16:52:59] 06Analytics-Kanban, 10DBA, 13Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3114957 (10Nuria) @Marostegui I think @otto can run script in our end, let us know if that is OK with you and we will take a small outage and run script tomorrow [16:55:23] 06Analytics-Kanban, 10DBA, 13Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3114962 (10jcrespo) @Marostegui is out today- if you can wait, I would wait one extra day to be sure he can make it. 
[16:58:57] (03CR) 10Ottomata: [C: 031] Add oozie jobs for mw history denormalized [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341030 (https://phabricator.wikimedia.org/T160074) (owner: 10Joal) [16:59:34] joal: +1 for all merges :) [16:59:42] i gotta run out for a bit, will be back later [16:59:49] ottomata: no prob [16:59:49] feel free to merge and deploy refinery [17:00:11] mforns: any luck with oozie? :( [17:00:13] if you do that, i'll push a puppet change for cron today [17:00:15] milimetric, mforns, : you +1 ottomata view as well? [17:00:17] (without merging) [17:00:26] ok ottomata [17:00:51] yeah, joal, I don't have time to look at it any more, but what I've seen looks great [17:00:52] joal, totally +1 [17:00:55] +1 [17:01:08] MERRRRRRRGING ! [17:01:21] There are like, 5 or 6 patches :) [17:01:49] elukey, yes a bit, thanks to joseph we found a log in yarn ui that said: problems reading a zipped file [17:02:29] we checked the temporary data file, but it was ok, and we concluded that it was the cassandra jar as you and me were talknig before [17:02:38] sigh [17:02:43] beback in a bit [17:02:45] so the log was in yarn? [17:03:05] the day that I will be able to debug a oozie failure I'll be probably 70 years old [17:03:11] milimetric nuria: how do I cancel all queued up reportupdater jobs? (I use crontab that runs a script that runs update_report.py and I need to cancel all of them so I can edit a start date in a config yaml [17:03:15] the problem was me, overriding the refinery_directory instead of the oozie_directory when launching the job [17:03:25] (03CR) 10Joal: [V: 032 C: 032] "Self Merging for deploy." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/322103 (https://phabricator.wikimedia.org/T160155) (owner: 10Milimetric) [17:03:27] elukey, yes in yarn (ui) [17:03:31] not command line [17:03:39] sure sure [17:03:45] thanks for the update :) [17:04:05] (03CR) 10Joal: [C: 032] "Self merging for deploy." [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/325312 (https://phabricator.wikimedia.org/T144717) (owner: 10Joal) [17:04:25] bearloga, you can just kill reportupdater [17:04:37] bearloga: you can just change the date too, the report will pick it up on the next run [17:04:48] but bearloga, wait, you should be scheduling that through our puppet config? [17:05:39] (03CR) 10Joal: [V: 032 C: 032] "Self merging for deploy." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341030 (https://phabricator.wikimedia.org/T160074) (owner: 10Joal) [17:05:58] 06Analytics-Kanban, 13Patch-For-Review: Create hive tables and queries for metrics computation out of mediawiki denormalized history - https://phabricator.wikimedia.org/T160155#3115001 (10Nuria) [17:06:13] elukey, np, I'm having still problems, but there's progress at least :] [17:06:14] mforns milimetric: is there a way to find out which pid is reportupdater's so I can kill it? [17:06:32] ps auxfw | grep reportupdater [17:06:46] thank you!!! [17:07:09] (03CR) 10Joal: [V: 032 C: 032] "Self merging for deploy." (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/342030 (https://phabricator.wikimedia.org/T160153) (owner: 10Joal) [17:07:51] (03CR) 10Joal: [V: 032 C: 032] "Self merging for deploy." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/341586 (https://phabricator.wikimedia.org/T160152) (owner: 10Joal) [17:08:27] (03CR) 10Joal: [V: 032 C: 032] "Self merging for deploy." 
[analytics/refinery] - 10https://gerrit.wikimedia.org/r/342197 (https://phabricator.wikimedia.org/T160151) (owner: 10Joal) [17:08:51] milimetric mforns: thank you! :) [17:08:56] (03CR) 10Joal: [V: 032 C: 032] "Self merging for deploy." [analytics/refinery] - 10https://gerrit.wikimedia.org/r/328154 (https://phabricator.wikimedia.org/T141473) (owner: 10Joal) [17:09:00] np bearloga :] [17:10:53] * joal feels lighter with all those patches merged :) [17:11:33] (03Merged) 10jenkins-bot: Add mediawiki history spark jobs to refinery-job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/325312 (https://phabricator.wikimedia.org/T144717) (owner: 10Joal) [17:19:42] (03PS1) 10Joal: Correct typo in mediawiki history job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/343672 [17:20:05] (03CR) 10Joal: [V: 032 C: 032] Correct typo in mediawiki history job [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/343672 (owner: 10Joal) [17:26:20] (03PS1) 10Joal: Update changelog.md to 0.0.43 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/343674 [17:27:05] elukey: --^ [17:27:09] if you have a minute [17:29:27] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#3115099 (10Tgr) This seems like an "if it works, don't try to fix it" thing to me. udp2log is used by all MediaWiki logging, having the API as an additional client is no ma... [17:30:34] (03PS1) 10Joal: Bump jar version in webrequest load job [analytics/refinery] - 10https://gerrit.wikimedia.org/r/343676 [17:32:27] 06Analytics-Kanban: Synchronise changes for productionisation of mediawiki history jobs - https://phabricator.wikimedia.org/T160154#3115103 (10JAllemandou) [17:36:21] (03CR) 10Elukey: [C: 031] Update changelog.md to 0.0.43 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/343674 (owner: 10Joal) [17:40:20] elukey: Thanks ! Can I move forward and deploy? [17:44:39] hm, looks like I have lost elukey :) [17:45:44] joal: no problem for me but I'll be afk in a bit :( [17:46:12] elukey: I'll deploy refinery-source, then wait for ottomata for the rest [17:46:58] (03CR) 10Joal: [C: 032] Update changelog.md to 0.0.43 [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/343674 (owner: 10Joal) [17:47:23] !log Deploy refinery-source to archiva [17:47:24] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [17:50:18] super thanks :) [17:57:49] 06Analytics-Kanban, 10Pageviews-API, 13Patch-For-Review: Monthly aggregate endpoint returns unexpected results and invalid timestamp - https://phabricator.wikimedia.org/T156312#3115183 (10MusikAnimal) @Nuria Works great, thank you! :) [18:06:04] ottomata: Please let me know when you arrive, I'd need some help :) [18:10:34] * elukey goes afk team! see you next week! [18:10:45] ciao elukey [18:10:47] o/ [18:12:45] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#3115248 (10Nuria) > Replacing it with Kafka does not seem to have any benefit. I think it does:. We have an outstanding request of publishing cached requests from webreques... [18:13:41] bearloga: do you have your reportupdater jobs scheduled in puppet? [18:14:27] nuria: we do not. we want to have fine control over the environment the scripts run in [18:14:52] bearloga: what things are you interested in controlling? do they run on 1002? 
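For the reportupdater question above (cancelling in-flight runs before editing a start date in the config YAML): the advice in the log is simply to find and kill the running process, since the edited start date is picked up on the next scheduled run anyway. A compact version of those steps:

```bash
# Find running reportupdater processes ([r] avoids matching the grep itself).
ps auxfw | grep [r]eportupdater

# Stop them by full command-line match.
pkill -f reportupdater

# Confirm what the crontab will launch next.
crontab -l | grep -i report
```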
[18:19:49] 06Analytics-Kanban, 10DBA, 13Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3099166 (10Tbayer) Sorry folks, am I reading this correctly: All active EL tables will be renamed and replaced by the new version (with the parsed UAs)? That would be a pr... [18:22:01] 10Analytics, 10Analytics-Cluster, 06Operations, 06Research-and-Data, 10Research-management: GPU upgrade for stats machine - https://phabricator.wikimedia.org/T148843#3115297 (10Halfak) a:05ellery>03Halfak [18:25:49] 06Analytics-Kanban, 10DBA, 13Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3115312 (10Nuria) >That would be a pretty disruptive change, which at the very least should be announced beforehand to minimize confusion about unexpected results and give... [18:33:39] a-team, legacy pageview data is loaded to cassandra prod in test keyspace, quick check looks good :] [18:33:54] tomorrow will vet it more [18:34:11] wow, nice [18:34:25] awesome mforns :) [18:36:57] (03PS9) 10Mforns: Add oozie workflow to load projectcounts to AQS [analytics/refinery] - 10https://gerrit.wikimedia.org/r/339421 (https://phabricator.wikimedia.org/T156388) [18:37:49] byeeee [18:38:32] 06Analytics-Kanban, 10DBA, 13Patch-For-Review: Change length of userAgent column on EL tables - https://phabricator.wikimedia.org/T160454#3115371 (10Tbayer) >>! In T160454#3115312, @Nuria wrote: >>That would be a pretty disruptive change, which at the very least should be announced beforehand to minimize con... [18:46:32] nuria: , yt? [18:46:40] yessir [18:46:46] ^ ottomata [18:46:51] i don't think we should add instructions for how to get to dev tools in a browser [18:46:54] its different in every browser [18:47:03] and even the instructinos you wrote for chrome are different for me [18:47:10] ah, right, ok, please removed [18:47:17] "ah, right, ok, please remove" [18:47:23] k done [18:47:50] ottomata: sounds good, i tried a bit to see if there was an easy way to get with couple lines a d3 bubble plot we could add to demo [18:48:03] ottomata: but couldn't get anything super concise for console [18:48:13] yeah [18:48:28] ottomata: Heya [18:48:32] yo! [18:48:43] ottomata: can I borrow you for ops time now, or do go for that later? [18:48:56] ottomata: jenkins build broke :( [18:50:37] now is good [18:50:44] k ottomata [18:51:19] I don't really now what the error is, I have ideas, but no clear info [18:51:22] ottomata: https://integration.wikimedia.org/ci/job/analytics-refinery-release/55/console [18:52:08] oh actually, did not readf carefully enough ottomata - Seems to be an authorization error [18:52:16] nuria: howdy! sorry, had to take care of medical stuff. okay, so we've got our stuff running on 1002. our main concerns are: keeping the repo (https://github.com/wikimedia/wikimedia-discovery-golden/) clone in sync with origin and control over the library of R packages (stat1002:/a/discovery/r-library) our scripts require, and making sure certain reports are [18:52:16] run after the other ones (we use reportupdater to make forecasts, so we have metric calculations scheduled to run first) [18:52:35] Could not transfer artifact org.wikimedia.analytics.refinery:refinery:pom:0.0.43 from/to archiva.releases (https://archiva.wikimedia.org/repository/releases/): Failed to transfer file: https://archiva.wikimedia.org/repository/releases/org/wikimedia/analytics/refinery/refinery/0.0.43/refinery-0.0.43.pom. 
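The 401 above means Archiva rejected whatever credentials the release job presented when PUTting to the releases repository. In Maven that mapping lives in the `<servers>` block of the settings.xml Jenkins injects, keyed by the same id (`archiva.releases`) that the error message shows. A minimal sketch of the shape of that section — the username matches the archiva-ci account discussed below, and the password is obviously a placeholder:

```bash
# Illustrative only: the section the injected settings.xml needs to carry.
cat > /tmp/example-settings.xml <<'EOF'
<settings>
  <servers>
    <server>
      <!-- must match the distributionManagement/repository id in the POM -->
      <id>archiva.releases</id>
      <username>archiva-ci</username>
      <password>********</password>
    </server>
  </servers>
</settings>
EOF
```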
[18:52:35] hm [18:52:40] yea [18:53:01] bearloga: let's talk about this on our sync up [18:53:07] sure! [18:53:13] hm yeah joal that is weird [18:53:25] ottomata: ACLs, password change, reimage? [18:53:33] Those are the things I can think of [18:53:45] yeah ,but it is 401 directly, irght? so not network ACL [18:53:48] the http req went through [18:53:58] archiva is returning 401 to maven [18:54:07] checking archiva logs...(if i rememer how) [18:54:29] ottomata: You've been doing too much product-infra recently ;) [18:54:33] haha [18:54:36] joal: also lack of space [18:54:42] ? [18:54:51] in archiva you mean nuria [18:54:53] ? [18:54:56] right [18:58:19] hmmm org.apache.archiva.security.ArchivaServletAuthenticator [] - Authorization Denied [ip=127.0.0.1,permission=archiva-upload-repository,repo=releases] : no matching permissions [18:58:22] nuria lack fo space? [18:58:38] ottomata: when moving things arround in archiva no? [18:59:01] ottomata: but hey, lack of permits seems like it is [18:59:35] yeehaw! [18:59:35] https://blog.wikimedia.org/2017/03/20/eventstreams/ [18:59:48] nuria ya seems like enough space on archiva box [19:00:02] congrats ottomata - Awesome piece of work :) [19:01:05] ottomata: ok, the gifs work how we envision them, great, I think the first paragraph sets teh right tone so everyone can relate [19:01:45] ya [19:01:56] i will reach out to the folks who submitted the demos shortly [19:02:09] joal: has this ever happened before? with archiva failling due to perms? [19:02:29] ottomata: not that I know :( [19:02:56] ottomata: could be related to a change in jenkins? [19:03:58] possible [19:04:46] joal: are you logged into jenkins? (just checking) [19:05:07] I am, but I can recheck ! [19:05:12] (03Abandoned) 10Milimetric: Updated result of validation after creating cohort. [analytics/wikimetrics] - 10https://gerrit.wikimedia.org/r/263911 (owner: 10Wassan.anmol) [19:05:28] ottomata: I think jenkins wouldn't have allowed me to trigger a release [19:06:35] yeah [19:06:36] i think so too [19:06:43] joal: did another build succeed? [19:06:43] https://integration.wikimedia.org/ci/job/analytics-refinery-release/56/ [19:07:16] yes, I tried a build, and it succeeded - problem really seems to come after the build (at release action) [19:08:29] 10Analytics-Tech-community-metrics: Have "Last Attracted Developers" information for Gerrit (already exists for Git) - https://phabricator.wikimedia.org/T151161#3115662 (10Dicortazar) @Aklapper we have to work in two ways with this analysis: # Improve the performance of the analysis (it's quite slow nowadays...)... [19:08:33] oh, that's a separate jenkins job? [19:09:43] 10Analytics, 10MediaWiki-API: Copy cached API requests from raw webrequests table to ApiAction - https://phabricator.wikimedia.org/T155478#3115686 (10Tgr) > Having mediawiki requests being published to kafka for cached/non cached and error logs would make importing all that data into 1 place a lot easier. We... [19:09:43] ottomata: I think build is just a part of release [19:10:27] yes, but, hm, what is differenet about how you launche job 56 vs 55? 
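Cleaning up the half-finished release so the Jenkins job can be retried, as discussed above, comes down to removing the v0.0.43 tag locally and on the remote and moving master back to the commit before the release plugin's version-bump commits; the target commit below is a placeholder:

```bash
# Drop the tag created by the failed release, locally and on the remote.
git tag -d v0.0.43
git push origin :refs/tags/v0.0.43

# Rewind master to the last good commit before the release commits.
git checkout master
git reset --hard <pre-release-commit>   # placeholder
git push --force origin master
```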
[19:10:38] (i've only released with jenkins like once) [19:10:54] 10Analytics: Improve SSH access information in onboarding documentation - https://phabricator.wikimedia.org/T160941#3115695 (10Milimetric) [19:11:08] ottomata: release url: https://integration.wikimedia.org/ci/job/analytics-refinery-release/m2release/ [19:11:19] ottomata: build url: https://integration.wikimedia.org/ci/job/analytics-refinery-release/build?delay=0sec [19:11:47] joal: ah ok, so build just builds to see if it can build [19:11:52] and release builds and releases? [19:11:59] correct ottomata [19:12:05] k [19:12:21] maybe madhuvishy is around and would have an idea? [19:12:55] ya, madhuvishy if you are around, i'm looking to find out why auth failed on a maven release from jenkins. all i see on archiva is that auth was denied [19:13:09] hello :) [19:13:13] we are wondering if somehting has changed on jenkins side...maybe it doesn' thave archiva pw anymore? not sure where or how that is configured [19:13:14] hello! :) [19:13:21] Hi madhuvishy :) [19:13:28] Thanks for answering [19:16:03] (03CR) 10Addshore: [C: 032] Add rcenhancedfilters beta feature [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/343545 (owner: 10Addshore) [19:17:06] ottomata: I see build 56 succeeded [19:17:25] yes, madhuvishy that is just a build [19:17:26] build works fine [19:17:32] its uploading jars to archiva that fails [19:17:41] so build & release together fail [19:17:52] in https://integration.wikimedia.org/ci/job/analytics-refinery-release/55/console [19:18:00] Failed to transfer file: https://archiva.wikimedia.org/repository/releases/org/wikimedia/analytics/refinery/refinery/0.0.43/refinery-0.0.43.pom. Return code is: 401, ReasonPhrase:Unauthorized. [19:18:11] ah perform maven release? [19:18:13] yes [19:18:18] that also does build btw [19:18:21] right [19:18:26] its the release part of it that is failling [19:18:52] do to auth error when PUTing to release repository in archiva [19:18:55] due* [19:19:01] ottomata: did the archiva-ci creds change? [19:19:04] no [19:19:14] but, maybe jenkins changed? where /how does jenkins get them? [19:19:35] i don't think so [19:19:36] from https://integration.wikimedia.org/ci/credentials/ [19:20:52] i just logged into archiva with the pw i have for archiva-ci [19:21:12] joal: maybe while we poke around, you should just try again? [19:21:18] maybe it was some fluke we can all hold our noses at... [19:21:30] i'll tail archiva logs [19:21:56] ottomata: I think it'll fail due to git tags :( [19:21:59] oh no [19:22:01] yea [19:22:07] ottomata: do you have rights to remove git tags? [19:22:41] ottomata: We maybe could rewing git, remove the tag, and try again? [19:22:48] ok [19:22:49] i think so [19:22:54] tag is v0.0.43 [19:22:55] ? [19:23:09] correct ottomata [19:23:29] ottomata: i wonder if someone edited https://integration.wikimedia.org/ci/configfiles/ [19:23:43] ottomata: can you rewrite history ? Or should we just overwrite commits with ol.d versions? [19:23:46] i'm looking at the Global maven settings.xm [19:23:48] xml [19:23:59] and it's missing the server credential entries [19:24:01] looking too...i'm going to add these links to the nice wiki tech docs you wrote too [19:24:03] oh! [19:24:03] hm [19:24:10] which will explain [19:24:22] oh but [19:24:27] [19:24:28] right? [19:24:33] maybe it just doesn't display them? 
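For the build-vs-release distinction being untangled here: the Jenkins "Perform Maven Release" action is essentially a front-end for the standard Maven release plugin, so a rough command-line equivalent — shown only to illustrate where the injected settings (and thus the server credentials) come in, with the settings path and version numbers as assumptions — would be:

```bash
mvn -s /path/to/injected-settings.xml release:prepare release:perform \
    -DreleaseVersion=0.0.43 -DdevelopmentVersion=0.0.44-SNAPSHOT
```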
[19:24:50] it def tried to PUT to the correct URI [19:24:55] it jsut failed auth [19:25:49] 10Analytics, 10EventBus, 06Services (next): Create mediawiki.page-restrictions-change event - https://phabricator.wikimedia.org/T160942#3115730 (10Pchelolo) [19:26:40] yeah that's a bit weird but may be [19:27:01] madhuvishy: do you think its worth just fixing up git and trying again? [19:27:29] yeah - i'm not sure what could have changed on jenkins - the creds are the same and are still there [19:27:35] ok, i'm going to try it... [19:29:51] ok, i deleted git tags and reset master [19:29:54] retrying release [19:29:55] https://integration.wikimedia.org/ci/job/analytics-refinery-release/57/ [19:31:53] ottomata: need to go for diner, can I let you with [19:32:00] the thing? I'll come back after [19:32:24] ya [19:32:30] no joal [19:32:31] oop [19:32:32] shah [19:32:33] haha [19:32:34] np* [19:32:37] no problem :) [19:32:48] huhu :) [19:32:55] thanks [19:35:53] ottomata: same error :( [19:36:03] ottomata: gone for real now [19:41:12] rats [19:47:10] madhuvishy: why are there two maven settings.xml files? [19:47:11] https://integration.wikimedia.org/ci/configfiles/ [19:48:35] and, madhuvishy do you remember if that file used to have more settings visible? like how to find archiva? [19:49:40] ottomata: only the global one is being used [19:49:44] we can remove the other [19:49:46] ok [19:49:48] let's do that [19:49:56] done [19:49:56] i'm not super sure about the settings being visible [19:50:03] but may be we can add and see? [19:50:11] there were two [19:50:16] deploy and release i think [19:50:23] we should only need the archiva-ci one [19:50:23] no? [19:50:34] that's the one that jenkins should use to upload to archiva [19:50:42] yeah the same archiva creds for both [19:50:53] 10Analytics, 10EventBus, 06Services (next): Create mediawiki.page-restrictions-change event - https://phabricator.wikimedia.org/T160942#3115840 (10Pchelolo) One minor problem: normally in our events for changed properties we would have old values and new values, but the `ArticleProtectionComplete` hook provi... [19:50:59] but i vaguely remember server.release and server.deploy or something like that [19:50:59] but, i'm confused abouthow server creds work [19:51:14] do we need to specify serverId? [19:51:16] https://integration.wikimedia.org/ci/configfiles/editConfig?id=org.jenkinsci.plugins.configfiles.maven.MavenSettingsConfig83ca7e26-00f0-4ec8-b6fa-68de9208f702 [19:51:38] yup pretty sure [19:51:57] 10Analytics, 10EventBus, 06Services (next): Create mediawiki.page-restrictions-change event - https://phabricator.wikimedia.org/T160942#3115841 (10Jdlrobson) That sounds fine for my use case. I just need to be able to know a page has become protected or unprotected. [19:52:14] ottomata: i got it from the settings.xml you had on the server [19:52:25] 10Analytics, 10EventBus, 06Services (next): Create mediawiki.page-restrictions-change event - https://phabricator.wikimedia.org/T160942#3115842 (10Ottomata) If we do that, we should probably still design the schema so that old value fields are still there, but optional. [19:53:09] hmm, ok [19:53:14] i see [19:53:54] madhuvishy: did I make a change? [19:53:55] https://integration.wikimedia.org/ci/configfiles/editConfig?id=org.jenkinsci.plugins.configfiles.maven.MavenSettingsConfig83ca7e26-00f0-4ec8-b6fa-68de9208f702 [19:54:12] yeah [19:54:15] the form is a little weird [19:54:15] ok [19:54:17] trying again then.. 
[19:54:32] don't know why they got removed, if they did [19:54:44] ya we'll see if this helps, who knows... [19:55:32] (03Merged) 10jenkins-bot: Add rcenhancedfilters beta feature [analytics/wmde/scripts] - 10https://gerrit.wikimedia.org/r/343545 (owner: 10Addshore) [19:56:44] hm madhuvishy: [19:56:45] your Apache Maven build is setup to use a config with id org.jenkinsci.plugins.configfiles.maven.MavenSettingsConfig.ArchivaCredentialsSettings but can not find the config [19:56:59] from https://integration.wikimedia.org/ci/job/analytics-refinery-release/58/console [19:58:34] :| [19:58:58] ottomata: i'm pretty sure that was the id before [19:59:10] yeah [19:59:29] you can only change it by creating a new file i think [20:01:22] change it? hm, myabe becuase we deleted that other config? [20:02:31] ah hmmm [20:02:41] one sec fixing [20:03:03] ok... [20:05:28] ottomata: there's archiva-ci and archiva-deploy [20:05:39] are there two different creds? [20:05:41] archiva-ci shoudl be the one jenkins shoudl use [20:05:45] -deploy was around before it existed [20:05:47] ah right [20:05:55] was used to deploy via cli [20:06:00] 'deploy' aka release [20:06:07] we can probably remove that from jenkins [20:07:36] ottomata: okay https://integration.wikimedia.org/ci/configfiles/editConfig?id=org.jenkinsci.plugins.configfiles.maven.MavenSettingsConfig [20:07:42] removed the global one [20:08:11] https://integration.wikimedia.org/ci/job/analytics-refinery-release/configure under Build -> Advanced [20:08:25] it's using provided settings.xml and this one is selected [20:08:28] and the id is right [20:10:02] ah advanced [20:10:02] ok [20:10:07] ok, so i shoudl try again? [20:10:07] :) [20:11:41] I guess! [20:18:09] joal: i'm building now, we will see! [20:18:12] k [20:18:13] but ya, go to bed! [20:18:59] ottomata: I'll be off tomorrow afternoon - I'll deploy refinery (if jenkins gets fixed tonight) [20:19:23] ottomata: tomorrow morning, and we'll see for cron either tomorrow evening or on wednesday - ok? [20:20:06] k ja! [20:23:29] ottomata: I think it's a success :) [20:23:32] Awesome :) [20:23:41] Thanks again madhuvishy ! :) [20:23:57] * joal disappears in the night [20:25:51] yeah it succeeded! so I guess those settings got removed from that file somehow [20:27:54] great! who knows?! thanks a bunch madhu [20:28:01] it woulda taken me forever to find all those links [20:28:18] no problem :) [20:35:14] 10Analytics, 07Documentation: Improve SSH access information in onboarding documentation - https://phabricator.wikimedia.org/T160941#3115980 (10Aklapper) [21:09:07] 10Analytics, 10Analytics-Dashiki: Change default timeline for browser reports to be recent (not 2015) - https://phabricator.wikimedia.org/T160796#3116118 (10Krinkle) >>! In T160796#3114499, @Milimetric wrote: > 1. would you like the default to be X days (and if so, what's X)? Or should the default stay all-ti... [22:45:46] 10Analytics, 10Analytics-EventLogging, 10DBA, 10ImageMetrics: Drop EventLogging tables for ImageMetricsLoadingTime and ImageMetricsCorsSupport - https://phabricator.wikimedia.org/T141407#3116413 (10Tgr) Some broken spider then, which has cached some old JS for infinity. All requests are from the same clien... [23:42:16] 10Analytics, 06Developer-Relations, 10MediaWiki-API, 06Reading-Admin, and 3 others: Is User-Agent data PII when associated with Action API requests? - https://phabricator.wikimedia.org/T154912#3116568 (10Tgr) Anything that involves tracking usage levels is infeasible without user agents, for the same reas...