[06:37:48] 10Analytics, 10Analytics-EventLogging, 10DBA: dbstore1002 crashed - https://phabricator.wikimedia.org/T170308#3442644 (10Marostegui) 05Resolved>03Open x1 broke with: ``` Could not execute Update_rows_v1 event on table wikishared.echo_unread_wikis; ``` Looks like that table needs to be reimported [06:54:58] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3442659 (10MoritzMuehlenhoff) Ok, can you open a separate task for the Spark/firewall issue with more details/ste... [07:33:18] 10Analytics, 10Analytics-EventLogging, 10DBA: dbstore1002 crashed - https://phabricator.wikimedia.org/T170308#3442731 (10Marostegui) 05Open>03Resolved I have fixed the issue with the missing record of the table. [07:56:42] 10Analytics: Upgrade AQS to node 6.11 - https://phabricator.wikimedia.org/T170790#3442777 (10elukey) [08:54:22] 10Analytics: Please Check Pivot Data on Campaign Banners - https://phabricator.wikimedia.org/T170792#3442840 (10GoranSMilovanovic) [09:03:15] GoranSM: o/ [09:03:31] elukey: Hi [09:03:34] I am reading https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_controlling_the_import_process to familiarize myself with your task since our sqoop expert is on vacation [09:04:09] the IO error that you are experiencing in HDFS seems valid, sqoop tries to write in /user but you don't have perms [09:04:25] I don't remember if you also tried without --target-dir [09:04:42] (or --warehouse-dir) [09:05:37] elukey: you're great, thank you. I haven't tried without the --target-dir or --warehouse-dir parameters. Also, I want to avoid using the main database - I am building a large-scale system, and the usage of the goransm database is a must have. [09:06:57] the other thing that I suspect is that if you use --username research and not --target-dir you'll find your table files in /user/research/tablename/...
[09:10:03] elukey: You know what confuses me? The --username and --password parameters in Sqoop are referring to the MySQL access and should have nothing to do with any access rights on the HDFS system, right? [09:10:34] elukey: So, why would the values of those parameters direct where the Hive tables would go in our filesystem? [09:10:53] GoranSM: sorry you are probably right, --username/pass are for mysql and your username will be matched against hdfs [09:11:56] so I'd suggest to test without --target-dir to see if it ends up in /user/etc.. [09:12:07] elukey: I see. Again, there is no way for me to access MariaDB from production by using goransm as a username, right - I must use research as username, since it corresponds with the respective MySQL group's access rights? [09:12:52] GoranSM: afaik yes, the access model at the moment is not fine grained for the mysql analytics dbs [09:13:10] but your username (the one that launches the sqoop command) will be used for HDFS [09:13:26] I still need to figure out why sqoop tries to create metadata files in /usr [09:13:29] err /user [09:13:43] causing the import to fail for permissions [09:14:19] GoranSM: another thing that we could test is --warehouse-dir /user/goransm instead of --target-dir [09:14:39] elukey: Ok, now that's clear. And one final, formal but important question: am I violating any of our internal rules and guidelines if I start using /user/research/tablename/... as a storage instead of my goransm database on Hadoop? Of course, assuming that everything works out after removing the --target-dir parameter? [09:15:02] elukey: I've already tested with --warehouse-dir, nothing different happens.
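Note on the exchange above: a minimal Python sketch of where a plain Sqoop table import is expected to land in HDFS, based on the Sqoop user guide's description (default is a directory named after the table inside the launching user's HDFS home; --target-dir and --warehouse-dir are mutually exclusive). The function name `resolve_import_dir` and its parameters are hypothetical and only model the documented behaviour; this is not Sqoop code, and it does not cover where Hive places a table after --hive-import. The MySQL --username deliberately does not appear as an input, which is the point settled in the conversation.

```python
from typing import Optional


def resolve_import_dir(hdfs_user: str, table: str,
                       target_dir: Optional[str] = None,
                       warehouse_dir: Optional[str] = None) -> str:
    """Model the HDFS directory a plain `sqoop import --table` writes to.

    hdfs_user is the account that launches the sqoop command, NOT the
    MySQL --username (hypothetical helper, for illustration only).
    """
    if target_dir and warehouse_dir:
        # The user guide states these two options cannot be combined.
        raise ValueError("--target-dir is incompatible with --warehouse-dir")
    if target_dir:
        # --target-dir names the destination directory itself.
        return target_dir.rstrip("/")
    if warehouse_dir:
        # --warehouse-dir names a parent; the table gets a subdirectory.
        return f"{warehouse_dir.rstrip('/')}/{table}"
    # Default: a directory named after the table in the user's HDFS home.
    return f"/user/{hdfs_user}/{table}"
```

For example, `resolve_import_dir("goransm", "wbc_entity_usage")` gives `/user/goransm/wbc_entity_usage`, regardless of which MySQL account is passed to --username.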
[09:15:47] GoranSM: I am not sure about /user/research, theoretically there should not be any issues but I'd follow up with somebody from research to double check [09:16:07] I need to go afk for a bit now (dentist) but I'll read later [09:16:21] feel free to write your progress in here, I'll catch up asap [09:16:22] :) [09:16:28] * elukey afk (dentist) [09:16:45] elukey: Thank you very much for the support. I will report back. [10:38:50] 10Analytics, 10CirrusSearch, 10Discovery, 10MediaWiki-extensions-WikibaseRepository, and 2 others: Define metrics for search result quality for the entity selector widget on wikidata. - https://phabricator.wikimedia.org/T170400#3443106 (10daniel) 05Open>03Resolved a:03daniel @Jan_Dittrich defined rel... [11:17:03] * elukey lunch! [11:53:32] @elukey After I've dropped the --target-dir sqoop parameter, nothing changed. By the way, there is no such thing as /user/research/ on /mnt/hdfs/. [11:54:01] elukey: Again, when I run the command, the wbc_entity_usage table is created in the main database, and the convention is not to work from there. [11:54:30] elukey: Sincerely, I think we should focus on figuring out why I can't write to the goransm Hadoop database on production. [11:57:01] elukey: And, by the way, on /mnt/hdfs/user/goransm I have somehow managed to create the wbc_entity_usage files, it happened a few days ago, and I can't rm it because it's a read-only filesystem. Now, in spite of the fact that there's some data there, when I do beeline: use goransm; show tables; there's no such thing as a wbc_entity_usage table (only centralnoticebannerhistory, for which I have no idea why is it found in the [11:57:27] elukey: Please, help. HELP. HEEEEEEEEEEEELLLLLPPPP :'( [12:08:07] Goran seems not online [13:05:49] GoranSM: so /mnt/hdfs has nothing to do with HDFS file system, since it is a fuse mount.. [13:06:11] let's take it out of the picture, it only serves as quick read only unix fs [13:06:44] any put/get/delete/etc..
needs to happen on hdfs [13:55:21] 10Analytics-Kanban, 10DBA, 10Operations, 10Patch-For-Review, 10User-Elukey: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073#3443582 (10elukey) Fixed the predump script that wasn't able to backup to `/srv/backup` on bohrium, now everything should be ok.... [13:57:24] 10Analytics-Kanban: Finish Oozie job to create interlanguage links dataset - https://phabricator.wikimedia.org/T170818#3443602 (10Milimetric) [13:57:28] 10Analytics-Cluster, 10Analytics-Kanban, 10WMDE-Analytics-Engineering, 10Patch-For-Review, 10User-Addshore: Move statistics::wmde jobs from stat1002 -> stat1005 - https://phabricator.wikimedia.org/T170472#3443615 (10Ottomata) Weird, time should definitely be available. I guess it's not installed by defa... [13:58:34] hey everyone [13:59:15] ottomata: woo! [13:59:38] :) [13:59:38] hiii [14:01:20] backups for piwik should work now :) [14:01:59] helo [14:02:13] mforns: o/ [14:02:21] hey elukey :] [14:02:40] shall we start the cleaner for 2014 on dbstore1002? It is running out of disk space :( [14:03:31] the script seems working fine up to now on db1047 [14:04:50] 10Analytics-Kanban: Finish Oozie job to create interlanguage links dataset - https://phabricator.wikimedia.org/T170818#3443666 (10Milimetric) [14:04:54] 10Analytics, 10Analytics-Cluster, 10Language-Team, 10MediaWiki-extensions-UniversalLanguageSelector, and 3 others: Migrate table creation query to oozie - https://phabricator.wikimedia.org/T170764#3443664 (10Milimetric) [14:05:01] 10Analytics-Kanban, 10DBA, 10Operations, 10Patch-For-Review, 10User-Elukey: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073#3443667 (10jcrespo) FYI, we uniformized not a long time ago local dumps going to /srv/backups on database hosts, in case you wan... 
[14:06:04] 10Analytics-Kanban, 10DBA, 10Operations, 10Patch-For-Review, 10User-Elukey: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073#3443674 (10elukey) @jcrespo sorry it is `/srv/backups`, missed a 's' in my last post :( [14:07:39] mforns: related to https://phabricator.wikimedia.org/T170457 the IE 11 problem is just that it doesn't have Array.find defined, we need to polyfill it, that should be the only problem (or we can just use the lodash version) [14:08:00] milimetric, no... it wasn't that [14:08:08] (03CR) 10Ottomata: "This should probably live in the hive/ directory with other create table statements." (031 comment) [analytics/refinery] - 10https://gerrit.wikimedia.org/r/365517 (https://phabricator.wikimedia.org/T170764) (owner: 10Amire80) [14:08:14] I was just writing on the task [14:08:30] oh, ok, cool [14:08:36] nvm me then :) [14:08:44] milimetric, I already tried to replace find, then assign, and then Promises wouldn't work :] [14:09:10] the thing was, we were bundling babel-promises, but not importing them... [14:09:17] I added import 'babel-polyfill'; in the main.js [14:09:18] ah ok. I really don't understand why the IE team is still holding onto like web strategy from 1995... so weird [14:09:24] and everything seems to work now [14:09:44] ok, cool, I was wondering how exactly that works [14:09:46] somebody with sqoop's magical abilities can check https://phabricator.wikimedia.org/T170052 ? [14:09:51] does it increase the build a lot? [14:09:54] I'll check it elukey [14:10:43] thanks! [14:12:28] holy cow that task has a looooong history [14:12:47] milimetric, it increases it from 962 KB to 1.02 MB [14:12:55] milimetric: yeah it was the pass leak, you can go to the end [14:13:02] mforns: that's fine [14:13:06] k :] [14:15:40] GoranSM: hi - are you around? 
[14:16:03] I can help with sqoop, but I'm not sure exactly what the goal is since in the task you seem to be trying a few things [14:16:16] (which seem like work-arounds, but I'm not sure which is the main goal) [14:18:08] 10Analytics-Kanban, 10Analytics-Wikistats: Define, Document (and test) Desktop and Mobile browser support for wikistats 2.0 - https://phabricator.wikimedia.org/T170457#3443725 (10mforns) Hey, I've been digging into the IE11 problem. After implementing a couple methods for IE to work (find and assign - which... [14:18:30] fdans, yt? [14:18:57] mforns: hellooo [14:19:02] heloooo [14:19:07] helolooo [14:19:21] helolololooo [14:19:25] emm.. can you please test again the Safari Mobile? [14:19:30] xD [14:19:31] sure! [14:19:37] wait, I'll deploy to develop [14:20:12] sorry push [14:21:19] fdans, is it possible for you to test on Safari from develop? [14:21:26] I pushed already [14:21:35] I thiiiiink so! [14:21:41] k [14:23:09] 10Analytics, 10Services, 10WMF-Legal: License for pageview data - https://phabricator.wikimedia.org/T170602#3436402 (10GWicke) The general license disclaimer quoted above was added by legal. It focuses on contributions, so doesn't really apply to read-only metrics. If you would like to change the license fo... [14:23:35] 10Analytics-Kanban, 10Operations, 10Wikimedia-Stream, 10hardware-requests, 10Patch-For-Review: Decommission RCStream - https://phabricator.wikimedia.org/T170157#3443765 (10Ottomata) [14:23:46] 10Analytics, 10WMF-Legal, 10Services (watching): License for pageview data - https://phabricator.wikimedia.org/T170602#3443767 (10GWicke) [14:24:42] 10Analytics-Kanban, 10Operations, 10Wikimedia-Stream, 10hardware-requests, 10Patch-For-Review: Decommission RCStream - https://phabricator.wikimedia.org/T170157#3443768 (10Ottomata) @Robh, I started the decommission process for rcs1001 and rcs1002, but may have taken it farther than I should. I followed... [14:27:06] (03CR) 10Mforns: [V: 032 C: 032] "LGTM!" 
[analytics/reportupdater-queries] - 10https://gerrit.wikimedia.org/r/365415 (https://phabricator.wikimedia.org/T170286) (owner: 10Nuria) [14:31:15] !log set innodb_flush_log_at_trx_commit on bohrium to 1 (default value)- T164073 [14:31:15] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [14:31:15] T164073: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073 [14:32:16] * fdans got in tantrum because he couldn't access a local ip address from his phone, then realised wi-fi was off [14:33:05] mforns: wikiselector is working now!! [14:33:13] fdans, woohooo! [14:33:22] thank you for that :D [14:33:35] ok, will update docs [14:33:43] thanks for testing :] [14:34:41] 10Analytics, 10Analytics-Cluster: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826#3443840 (10Ottomata) [14:34:50] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3433298 (10Ottomata) Oook, done! T170826 [14:35:31] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3443844 (10Ottomata) For now though, what should I do? The icinga alerts for DENY are currently ACKed for stat10... 
[14:40:13] 10Analytics-Cluster, 10Analytics-Kanban, 10Operations, 10Patch-For-Review: Firewalls appear to be preventing spark executors from talking to spark driver on stat1005 - https://phabricator.wikimedia.org/T170496#3443855 (10MoritzMuehlenhoff) [14:43:07] 10Analytics-Kanban, 10Documentation, 10Services (watching): Document revision-create event for EventStreams - https://phabricator.wikimedia.org/T169245#3443859 (10Ottomata) Also done here: https://wikitech.wikimedia.org/wiki/EventStreams#API [14:43:13] fdans, do you have a desktop Safari? [14:43:23] yessir [14:43:43] anything in particular you want me to look at mforns [14:45:25] fdans, no, just wikistats overall [14:45:54] fdans, cause you tested Mobile Safari no before? [14:46:02] yes [14:46:58] ok [14:47:38] mforns: tiny thing... in safari, the table view doesn't have headers with dark green background [14:47:50] the background is white, so headers are invisible [14:47:55] fdans, aha [14:47:57] mh [14:47:59] ok [14:48:26] rest of staff works great [14:48:30] stuff* [14:48:43] fdans, aha, this looks like a CSS problem rather than transpiling/polyfills no? [14:48:50] (back in 3, going to buy a bottle of water) [14:48:56] k [14:49:03] mforns: that seems to be the case, I can look at it don't worry [14:49:09] 10Analytics-Kanban, 10EventBus, 10Easy, 10Services (watching): EventBus logs don't show up in logstash - https://phabricator.wikimedia.org/T153029#3443874 (10Ottomata) Ah ok, I remember what's going on here. So https://phabricator.wikimedia.org/T150106#2777178 is about eventlogging error event logs confli... [14:49:14] np it's just for the docs [14:52:06] 10Analytics-Kanban, 10EventBus, 10Easy, 10Services (watching): EventBus logs don't show up in logstash - https://phabricator.wikimedia.org/T153029#3443887 (10Ottomata) Ah, I think the conflict was with the `eventlogging_EventError` topic. This data contains EventLogging Analytics events that did had error... 
[14:56:35] 10Analytics-Kanban, 10Analytics-Wikistats: Define, Document (and test) Desktop and Mobile browser support for wikistats 2.0 - https://phabricator.wikimedia.org/T170457#3443907 (10mforns) Final update for browser tests. The following browsers are supported by Wikistats2 and have been tested with positive resul... [14:58:51] 10Analytics-Kanban, 10Analytics-Wikistats: Review by legal department of text on wikistats site - https://phabricator.wikimedia.org/T163229#3443913 (10fdans) @Slaporte the text in the footer was changed to @Nuria's suggested one a couple of weeks ago: {F8788723} Is there anything you'd like to add or modify? [14:59:04] 10Analytics-Cluster, 10Analytics-Kanban, 10Language-Team, 10MediaWiki-extensions-UniversalLanguageSelector, and 3 others: Migrate table creation query to oozie - https://phabricator.wikimedia.org/T170764#3443915 (10Milimetric) a:05Amire80>03Milimetric [15:05:57] elukey: thanks for stat1005 cron error [15:05:59] investigating... [15:11:13] 10Analytics, 10Analytics-Cluster: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826#3443826 (10EBernhardson) > All Hadoop workers need to be able to talk to the REPL port, while (I think) only localhost needs to reach the web GUI port, since it... [15:13:33] 10Analytics, 10Analytics-Cluster: Enable base::firewall on stat boxes after restricting Spark REPL ports. - https://phabricator.wikimedia.org/T170826#3443971 (10Ottomata) Ahh, yeah maybe the GUI port is only for local mode. [15:34:02] GoranSM: let's have a meeting during your work hours and I can try and help out. I'll try to be around early tomorrow (I'm in New York) to set it up. [15:38:11] 10Analytics-EventLogging, 10Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3444117 (10Ottomata) Ok, events have been backfilled. 
However, I accidentally backfilled TOO many events, in that I did not filter out bot events during... [15:38:25] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Contributors-Analysis, and 6 others: Record an event every time a new content namespace page is created - https://phabricator.wikimedia.org/T150369#3444118 (10Ottomata) Ok, events have been backfilled. However, I accidentally backfilled TOO man... [15:41:52] milimetric: is there public info/discussion about the auto confirm page create thing? [15:45:01] 10Analytics-Kanban: Upgrade Druid to 0.9.2 as a temporary measure - https://phabricator.wikimedia.org/T170590#3444156 (10Nuria) [15:49:51] 10Analytics, 10Analytics-Data-Quality: Please Check Pivot Data on Campaign Banners - https://phabricator.wikimedia.org/T170792#3444179 (10Nuria) [15:51:17] 10Analytics: Upgrade AQS to node 6.11 - https://phabricator.wikimedia.org/T170790#3442777 (10Nuria) We have version: nuria@aqs1008:~$ node -v v6.9.1 [15:52:27] ottomata: there's a long email chain and a task or two in phab, what would you like to know? [15:53:18] 10Analytics: Upgrade AQS to node 6.11 - https://phabricator.wikimedia.org/T170790#3442777 (10Nuria) This is about upgrading debian package but we should probably rebuild deps with this version and deploy. 
[15:54:25] milimetric: actually, i found some pretty good links, village pump [15:54:32] i described this problem to a friend, and she wanted to read more [15:54:48] 10Analytics-Kanban: Clean up PageContentSaveComplete event if there are no data users - https://phabricator.wikimedia.org/T170720#3444239 (10Nuria) a:03Nuria [15:54:57] oh yeah, cool [15:56:35] 10Analytics, 10Analytics-EventLogging: Alarm on errors on /var/log/upstart/eventlogging* files - https://phabricator.wikimedia.org/T170620#3444257 (10Nuria) p:05Triage>03High [15:56:38] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Replacement of stat1002 and stat1003 - https://phabricator.wikimedia.org/T152712#3444259 (10Ottomata) [15:56:51] 10Analytics: Upgrade AQS to node 6.11 - https://phabricator.wikimedia.org/T170790#3444262 (10Nuria) p:05Triage>03Normal [15:58:05] 10Analytics, 10Analytics-EventLogging: Alarm on errors on /var/log/upstart/eventlogging* files - https://phabricator.wikimedia.org/T170620#3437262 (10Nuria) Or rather alarm in process flapping [16:02:09] 10Analytics-Kanban, 10WMF-Legal, 10Services (watching): License for pageview data - https://phabricator.wikimedia.org/T170602#3444295 (10Nuria) [16:02:20] 10Analytics-Kanban, 10WMF-Legal, 10Services (watching): License for pageview data - https://phabricator.wikimedia.org/T170602#3436402 (10Nuria) [16:05:05] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3444318 (10Nuria) [16:06:23] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3434437 (10Nuria) Any of us during ops week should be able to do it, ping @Ottomata , what is the best way to free space [16:07:18] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3444339 (10Nuria) [16:10:22] 10Analytics-Kanban: Measure Community 
Backlog. - https://phabricator.wikimedia.org/T155497#3444378 (10Nuria) [16:11:34] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3444383 (10Ottomata) Look for what is causing the used space. Looks like /var/log/daemon.log* was really huge. Not really sure why, I see lots of errors about puppet failing, bu... [16:12:31] 10Analytics, 10Data-release: Wikipedia Clickstream dataset. Programmatic Access - https://phabricator.wikimedia.org/T134231#3444406 (10Nuria) Scheduling on this quarter pending third party contributions [16:15:55] 10Analytics, 10Patch-For-Review: Implement a standard page title normalization algorithm (same as mediawiki) - https://phabricator.wikimedia.org/T126669#3444420 (10Nuria) [16:15:57] 10Analytics, 10Pageviews-API: Track page views by page ID rather than title (handles moved pages) - https://phabricator.wikimedia.org/T159046#3444419 (10Nuria) [16:16:45] 10Analytics, 10Patch-For-Review: Implement a standard page title normalization algorithm (same as mediawiki) - https://phabricator.wikimedia.org/T126669#2020367 (10Nuria) [16:16:47] 10Analytics, 10Pageviews-API: Track page views by page ID rather than title (handles moved pages) - https://phabricator.wikimedia.org/T159046#3055240 (10Nuria) [16:17:38] 10Analytics: Alarm on data quality issues - https://phabricator.wikimedia.org/T159840#3080411 (10Nuria) p:05Normal>03High [16:22:16] 10Analytics-Kanban, 10Analytics-Wikistats: Addition of Unique Devices metric - https://phabricator.wikimedia.org/T170461#3444456 (10Nuria) a:03fdans [16:22:45] 10Analytics-Kanban, 10WMF-Legal, 10Services (watching): License for pageview data - https://phabricator.wikimedia.org/T170602#3444460 (10Nuria) a:03mforns [16:25:07] 10Analytics-Kanban, 10Analytics-Wikistats: Cleanup Routing code - https://phabricator.wikimedia.org/T170459#3444468 (10Nuria) a:03mforns [16:27:49] !log set innodb_flush_log_at_trx_commit on bohrium to 2 and 
sync_binlog=300 to reduce iowait - T164073 [16:27:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log [16:27:51] T164073: Puppetize Piwik's Database and set up periodical backups - https://phabricator.wikimedia.org/T164073 [16:28:29] 10Analytics-Kanban, 10Analytics-Wikistats: Set up continuos integration for wikistats 2.0 UI - https://phabricator.wikimedia.org/T170458#3444503 (10Nuria) a:03fdans [16:35:01] 10Analytics-Data-Quality, 10Analytics-Kanban: Pageview drop in ro.wikipedia hu.wikipedia and fr.wikipedia - https://phabricator.wikimedia.org/T170845#3444534 (10Nuria) [16:40:19] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-kafka01 out of disk space - https://phabricator.wikimedia.org/T170523#3444604 (10Nuria) The cause is likely logging of mediawiki revision-create events last week [16:51:29] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Contributors-Analysis, and 5 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3444668 (10Nuria) [17:01:46] 10Analytics-Data-Quality, 10Analytics-Kanban: Pageview drop in ro.wikipedia hu.wikipedia and fr.wikipedia - https://phabricator.wikimedia.org/T170845#3444715 (10Tgr) [17:06:32] all right piwik fully puppetized and bacula backups should happen once a week from now on [17:06:41] it looks that this journey has ended :) [17:07:13] really happy that without the IOs traffic the cpu iowait is zero now (together with some "relaxed" mysql settings though) [17:07:30] all right, going afk people, see you tomorrow! [17:09:06] 10Analytics-Data-Quality, 10Analytics-Kanban: Pageview drop in ro.wikipedia hu.wikipedia and fr.wikipedia - https://phabricator.wikimedia.org/T170845#3444733 (10Tgr) YoY in June it is a 61% drop for rowiki, 53% for huwiki, 42% for frwiki, 34% for dewiki. As mentioned in the mailing list, it is entirely driven... 
[17:17:44] 10Analytics-Data-Quality, 10Analytics-Kanban, 10Reading-Web-Backlog: Pageview drop in ro.wikipedia hu.wikipedia and fr.wikipedia - https://phabricator.wikimedia.org/T170845#3444766 (10Tgr) Also, the decrease started in early 2016, not this April, and has been pretty linear since then. Since the drop has been... [17:28:46] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Contributors-Analysis, and 5 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3444814 (10Nuria) Creating ticket so @kaldari 's team and Analytics can coordinate on this regard [17:32:20] 10Analytics-Data-Quality, 10Analytics-Kanban, 10Reading-Web-Backlog: Pageview drop in ro.wikipedia hu.wikipedia and fr.wikipedia - https://phabricator.wikimedia.org/T170845#3444834 (10Nuria) >Since the drop has been gradual and there has been no change in traffic labelled as spider/bot (and its tiny compared... [17:44:22] ottomata: https://gerrit.wikimedia.org/r/#/c/365662/ :D [17:44:30] I checked and the analytics-wmde user has the time command now! [17:47:38] 10Analytics, 10Data-release: Wikipedia Clickstream dataset. Programmatic Access - https://phabricator.wikimedia.org/T134231#3444973 (10DarTar) Wonderful. I imagine the plan is not to generate daily aggregates though, right? The task description seems to suggest so. [17:49:40] 10Analytics, 10Data-release: Wikipedia Clickstream dataset. Programmatic Access - https://phabricator.wikimedia.org/T134231#3444979 (10Nuria) To be clear, the part we are aiming to do this quarter (if we get a third party contributtor) is just the creation of dataset, let me make sure there is no other ticket... [17:49:42] addshore: merged :) [17:49:46] thanks! [17:50:48] 10Analytics, 10Data-release: Wikipedia Clickstream dataset. 
Programmatic Access - https://phabricator.wikimedia.org/T134231#2258736 (10Nuria) Ya , sorry, this is the one: https://phabricator.wikimedia.org/T158972 [17:52:53] 10Analytics, 10Data-release: Wikipedia Clickstream dataset. Programmatic Access - https://phabricator.wikimedia.org/T134231#3445005 (10DarTar) Got it, shall we close this as a dupe? [17:53:18] 10Analytics, 10Research-and-Data: productionize ClickStream dataset - https://phabricator.wikimedia.org/T158972#3445008 (10DarTar) [17:53:25] 10Analytics, 10Data-release: Wikipedia Clickstream dataset. Programmatic Access - https://phabricator.wikimedia.org/T134231#3445009 (10Nuria) No, this is a different ticket, is about API access not dataset creation. [17:59:24] milimetric: i guess public geowiki stuff isn't used anymore? [17:59:25] http://gp.wmflabs.org/ [18:00:57] ottomata: yeah, it stopped being useful when we had to redact it [18:02:16] ok, there are jobs monitoring that it is working :) [18:02:23] apparently they don't work very well :p [18:04:22] heh, this makes sense. Yeah, I think I removed that domain name when we killed limn1 [18:06:18] milimetric: do we need to continue to push to the public data repo then? [18:06:46] i guess it doesn't hurt... [18:06:53] although it would make configuration easier [18:06:56] won't have to push to a remote gerrit repo [18:07:00] I don't think so, though that whole process could really use an overhaul. I think we had slated that for Q1, but that was before double baby and other things that came up [18:07:05] yeah [18:08:48] haha milimetric uh oh [18:08:49] PYTHON_SHIM_BASE_DIR_ABS=/home/milimetric/geowiki-dependencies/ [18:08:57] ottomata decided that it's not worth the effort to puppetize each [18:08:57] # and every of erosen's python repos. 
[18:09:02] 10Analytics-Cluster, 10Analytics-Kanban, 10Security, 10User-Addshore: Access rights for HDFS on stat100* for Sqoop tasks - https://phabricator.wikimedia.org/T170052#3418073 (10Milimetric) @GoranSMilovanovic I pinged you in IRC but in case you missed it, let's meet to go over the problem here. I wrote the... [18:09:03] # That's bad. [18:09:03] # Like really bad. [18:09:04] # But at least it allows us to run the scripts for now. [18:09:11] :) [18:09:17] oof ok [18:09:21] lemme sycn your homedir right now.. [18:09:22] yep, I just copied that from Christian [18:09:54] sorry, I can go and cleanup everything I know of on stat1002 if you want [18:10:09] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Replacement of stat1002 and stat1003 - https://phabricator.wikimedia.org/T152712#3445063 (10Addshore) [18:10:11] 10Analytics-Cluster, 10Analytics-Kanban, 10WMDE-Analytics-Engineering, 10Patch-For-Review, 10User-Addshore: Move statistics::wmde jobs from stat1002 -> stat1005 - https://phabricator.wikimedia.org/T170472#3445062 (10Addshore) 05Open>03Resolved [18:11:50] this is stat1003 [18:11:53] but maybe? [18:14:59] (03PS1) 10Ottomata: No longer push to public data repo, we don't use it [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365669 (https://phabricator.wikimedia.org/T152712) [18:15:05] milimetric: ^ [18:15:42] (03CR) 10jerkins-bot: [V: 04-1] No longer push to public data repo, we don't use it [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365669 (https://phabricator.wikimedia.org/T152712) (owner: 10Ottomata) [18:18:19] (03CR) 10Ottomata: [V: 032 C: 032] No longer push to public data repo, we don't use it [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365669 (https://phabricator.wikimedia.org/T152712) (owner: 10Ottomata) [18:20:54] ottomata: on it, should I move stuff to stat1006? [18:21:04] 5? how is it 1003 -> 1006 and 1002 -> 1005? 
[18:23:20] question: in the wmf.webrequest table, the last hour of traffic (this one) is not reflected? [18:23:38] ya that [18:23:43] milimetric: what else needs moved? [18:23:46] i rsynced your home dir [18:23:51] gonna do that for everybody, but just haven't done it yet [18:24:02] oh ok, that should be fine then [18:24:04] (03PS1) 10Ottomata: Remove checking of public web page data, since it no longer exists [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365671 (https://phabricator.wikimedia.org/T152712) [18:24:12] I don't know of anything, just looking [18:24:20] (03CR) 10Ottomata: [V: 032 C: 032] Remove checking of public web page data, since it no longer exists [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365671 (https://phabricator.wikimedia.org/T152712) (owner: 10Ottomata) [18:24:33] I always thought the geowiki stuff ran out of /srv/geowiki [18:24:51] it does milimetric [18:24:56] but the deps are in your home dir :/ [18:24:59] ah [18:25:02] gotcha [18:25:06] i've rsynced over /srv/geowiki too [18:25:07] oh right, I remember [18:26:04] I have a cron to update VE deployments, but I don't think those exist anymore so it's ok if that job dies [18:26:16] ok, nothing else on 1003 or 1002 [18:26:46] oh ottomata on 1002 I have the thing that updates the GeoIP databases [18:27:02] there's a cron that runs /a/milimetric/GeoIP-toolbox/update_data_files.sh [18:27:19] aye, milimetric i haven't started migrating any user stuff yet [18:27:27] k [18:27:31] so anything that is purely managed by just you can wait [18:27:39] i'm trying to get all the automated /puppetized stuff over first [18:27:45] then I will sync homedirs [18:27:52] and send announcement about migrating, and set a timeline [18:28:02] oh ok, I thought I missed the train, but I'm early, gotcha [18:28:15] ya, but, if you want to go ahead and do yours, feel free [18:28:21] one new rule [18:28:29] don't create /srv/$USERNAME dirs anymore [18:28:33] just use your homedir [18:28:36] (it is
actually on /srv) [18:28:51] i'll rsync your 1002 home now too [18:28:57] in case you want to keep being early :) [18:32:14] ottomata: in the wmf.webrequest table, the last hour of traffic (this one) is not reflected? [18:35:39] SMalyshev: it takes a while for the data to get bucketed up by the hour, get refined, and land in HDFS under wmf.webrequest [18:35:53] definitely this current hour is not in there because it's not over yet [18:36:05] and the last hour usually takes about an hour to get through, so usually it's about 2 hours behind [18:36:12] milimetric: is there a way to see what's happening right now? [18:36:17] 10Analytics-Kanban, 10Research-and-Data, 10Research-collaborations, 10Research-management, 10Patch-For-Review: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#3445373 (10ggellerman) [18:36:23] 10Analytics, 10Reading-analysis, 10Research-and-Data, 10Research-consulting: [Epic] Update official Wikimedia press kit with accurate numbers - https://phabricator.wikimedia.org/T117221#3445379 (10ggellerman) [18:36:31] actually, milimetric i betcha it is a python dep problem [18:36:33] I don't need much refinement, just some requests [18:36:34] SMalyshev: yeah, there are some streaming jobs that do just that, what do you need [18:36:35] since we synced everything over from ubuntu [18:36:47] SMalyshev: you get them a little bit sooner in wmf_raw.webrequest [18:36:49] 10Analytics: Making geowiki data public - https://phabricator.wikimedia.org/T131280#3445413 (10ggellerman) [18:36:53] 10Analytics, 10Research-and-Data-Archive: geowiki data for Global Innovation Index - https://phabricator.wikimedia.org/T131889#3445411 (10ggellerman) 05Open>03Resolved [18:36:53] milimetric: all queries to wdqs from certain ip [18:37:08] 10Analytics: Delete eventlogging alerts e-mail list - https://phabricator.wikimedia.org/T170864#3445422 (10Nuria) [18:37:33] SMalyshev: the easiest way is to consume from Kafka yourself with a filter, which is
what kafkatee is for. [18:37:46] 10Analytics: Delete eventlogging alerts e-mail list - https://phabricator.wikimedia.org/T170864#3445439 (10Nuria) https://lists.wikimedia.org/mailman/admindb/eventlogging-alerts [18:38:01] milimetric: but kafka is only forward, not? or I can go back too? [18:38:39] well, you can specify an offset to start consuming, so you can guess and find the time you need (time-based consuming coming in v.10) [18:39:37] milimetric: but not in hive, right? [18:40:20] SMalyshev: right, for it to be in Hive it has to get processed and put in HDFS, that takes a while 'cause it's a lot of data [18:40:32] SMalyshev: right, not in hive, there is no real time info in hive either, the newest one will be the hour previously refined [18:40:41] SMalyshev: if you set up streaming jobs you can land stuff in Druid for real-time querying [18:41:02] ^ errrrrrrrr not totally true milimetric :/ since we don't have tranquility pupppetized [18:41:21] yes, not officially supported at the moment [18:41:24] SMalyshev: this dataset in pivot might help: https://pivot.wikimedia.org/#webrequest/totals/2/EQUQLgxg9AqgKgYWAGgN7APYAdgC5gQAWAhgJYB2KwApgB5YBO1Azs6RpbutnsEwGZVyxALbVeAfQlhSY4AF9kwYhBkc86FWs7AKVOoxZt1XTDnxEylJQaat2nbub7VBS4XPwiFSrQ43Kqv74MmIASsTkAObiSgAmAK4MxNq8AAoAjAAiVMxg1OYAtBnypYoA2gC6yOQJADZ1SoSkYMwo5cDNrcDVVTX1dUA [18:41:25] but for one-off things people just consume from kafka I think, right? [18:41:37] milimetric: not really [18:41:47] milimetric: people either use pivot webrequest dataset [18:42:00] right now I don't really need that much.... 
I have some abuse going on on query service and I am trying to fgure out the full picture [18:42:09] milimetric: or consume the prior hour [18:42:22] and see if I can set up some controls to detect it before it brings the service down [18:42:27] milimetric: if they want past data [18:42:43] SMalyshev: ya you can consume from kafka, either from CLI on stat1004 (or other boxes too), or in spark-streaming [18:42:54] SMalyshev: for forward data kafka (as milimetric said) would work [18:42:59] ottomata: I was trying to find docs on that, kafkacat? kafkatee? [18:43:02] kafkacat [18:43:27] SMalyshev: https://wikitech.wikimedia.org/wiki/Kafka#Produce.2FConsume_to_kafka [18:43:51] milimetric: ok, thanks, will read that! [18:43:51] kafkacat is described there a bit, take a look at the help for it on stat1002 [18:44:00] ottomata, SMalyshev, milimetric: but if you need data from couple past hours, best source is webrequest dataset on pivot (sampled) [18:44:02] like kafkacat --help [18:44:30] nuria_: yeah, he was saying he needs it more current 'cause he's looking into abuse, makes sense [18:45:15] milimetric: we need to fix gewoiki on stat1006! :o [18:45:27] i could revert, but i'd rather not [18:45:38] ottomata: I saw a message alluding to that but I think some IRC messages got dropped, what happened? [18:45:52] python deps i think [18:45:58] because yours are from ubuntu [18:45:58] where do I see errors? 
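A minimal sketch of the "consume from Kafka yourself with a filter" idea discussed above (18:37 to 18:43), assuming the consumed messages arrive as one JSON object per line and that the client address is stored under a field named `ip`. The function name, the field name, and the sample records are illustrative assumptions, not taken from the log or from any Wikimedia tooling.

```python
import json


def filter_by_ip(lines, ip, ip_field="ip"):
    """Yield decoded JSON records whose ip_field equals ip.

    Blank lines and lines that are not valid JSON objects are skipped,
    since a live stream may contain partial or malformed messages.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            continue
        if isinstance(record, dict) and record.get(ip_field) == ip:
            yield record


# Hypothetical sample messages (documentation-range IP addresses).
sample_lines = [
    '{"ip": "192.0.2.1", "uri_path": "/sparql"}',
    '{"ip": "198.51.100.7", "uri_path": "/sparql"}',
    'not json at all',
    '{"ip": "192.0.2.1", "uri_path": "/bigdata/namespace"}',
]

matches = list(filter_by_ip(sample_lines, "192.0.2.1"))
for record in matches:
    print(json.dumps(record))
```

The same generator accepts any iterable of lines, so `sys.stdin` could be passed in place of `sample_lines` when the output of a command-line consumer such as kafkacat is piped into the script.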
[18:46:13] ah, makes sense [18:46:25] nuria_: I look at pivot and it's very nice but I get way less requests than are in the logs [18:46:29] milimetric: run [18:46:33] SMalyshev: yes, it is sampled due to volume [18:46:33] sudo -u stats /usr/bin/python /srv/geowiki/scripts/geowiki/process_data.py -o /srv/geowiki/logs --wpfiles /srv/geowiki/scripts/geowiki/data/all_ids.tsv --daily --start=`date --date='-2 day' +\%Y-\%m-\%d` --end=`date --date='0 day' +\%Y-\%m-\%d` --source_sql_cnf=/srv/geowiki/.globaldev.my.cnf --dest_sql_cnf=/srv/geowiki/.research.my.cnf [18:46:56] SMalyshev: noted here: https://pivot.wikimedia.org/ [18:47:00] SMalyshev: 1/128 [18:47:08] nuria_: aha, I see [18:47:15] SMalyshev: i know strange sampling rate but for abuse should do it [18:47:26] SMalyshev: just think of it 1/100 [18:47:58] yeah so 115/hr is really 10k/hr... Looks plausible, that's what I see in the logs [18:50:09] SMalyshev: and you known about throttling settings per IP that went in not that long ago, right/ [18:51:10] nuria_: hmm not sure... [18:51:14] probably not :) [18:53:23] SMalyshev: do check it out in varnish they might not apply as those should throttle traffic to 100 reqs per sec per IP (I *think* i remember those right) [18:58:20] nuria_: I have limit 5 req/ip/server but that's not enough unfortunately :( [18:58:28] ottomata: I'm debugging the thing, maybe there's a way to just change the code [18:58:48] it seems to be a relatively minor problem, just mysql.connection is different somehow [18:58:52] milimetric: maybe so ya [18:58:54] i.e. in 99% of cases it is enough but in this particular one somebody found a way to cause trouble even with this [18:59:26] SMalyshev: ah i see, ip is switching then, right? does webrequest data in pivot help answer your questions? 
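A quick sketch of the sampling arithmetic from the exchange above (18:47): the Pivot webrequest dataset is sampled at 1/128, so an observed count is multiplied by 128 to estimate the real traffic. The function name and its default argument are illustrative assumptions, not part of Pivot or any Wikimedia tool.

```python
def estimate_total(sampled_count, sampling_denominator=128):
    """Scale a count observed in a 1/N sampled dataset to an estimated total."""
    if sampled_count < 0:
        raise ValueError("sampled_count must be non-negative")
    if sampling_denominator <= 0:
        raise ValueError("sampling_denominator must be positive")
    return sampled_count * sampling_denominator


observed_per_hour = 115

# Exact scale-up for the 1/128 sampling rate.
exact = estimate_total(observed_per_hour)

# The "just think of it as 1/100" rule of thumb from the chat.
rough = estimate_total(observed_per_hour, 100)

print(exact, rough)
```

Both estimates land on the order of 10k requests per hour, which matches the figure quoted from the logs in the conversation.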
[19:01:31] milimetric: python-mysqldb version changed from 1.2.3-2ubuntu1 on stat1002 to 1.3.7-1.1 on stat1003 [19:01:34] sorry on stat1006 [19:01:38] stat1003 -> stat1006 [19:01:41] ottomata: do you have the outage document for eventlogging page-create issues? [19:01:47] so ya, maybe not a home dir dep [19:02:13] ottomata: yeah, it's weird though, it seems that a bunch of methods were removed from cursor.connection, which seems super strange [19:02:13] nuria_: I added this section here: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging#Data_Quality_Issues [19:02:20] haven't created incident doc yet [19:02:21] can do now [19:02:32] nuria_: no, the ip is the same. it's just that the query is too heavy even for 5 reqs/s apparently [19:02:48] maybe it hits some bug in blazegrap or soemthing... [19:02:53] SMalyshev: ah ok, even with lower request ration they can clog system [19:03:17] yup [19:12:36] 10Analytics, 10Analytics-EventLogging, 10Analytics-Kanban, 10Contributors-Analysis, and 3 others: Visualize page create events for all wikis - https://phabricator.wikimedia.org/T170850#3445646 (10Aklapper) @Nuria: You may want to remove a good bunch of the copied project tags? [19:13:07] nuria_: https://wikitech.wikimedia.org/wiki/Incident_documentation/20170711-EventLogging [19:14:09] ottomata: ok, edit data quality docs for eventlogging [19:14:41] ottomata: sorry! [19:14:44] * nuria_ editing [19:14:58] and not doing a good job [19:15:57] nuria_: ? 
[19:16:15] ottomata: sorry, [19:16:23] ottomata: metyping all wrong [19:16:43] ah k [19:16:44] :) [19:17:31] ottomata: finally (mediawiki tables get me every time), updated doc with: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging#Data_Quality_Issues [19:17:41] ah great, thanks [19:18:01] ottomata: seems to be running, I just hacked up the code a bit, will submit a change to match [19:18:09] this whole thing's so ugly [19:18:22] 10Analytics-EventLogging, 10Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3445675 (10Ottomata) FYI, added: https://wikitech.wikimedia.org/wiki/Incident_documentation/20170711-EventLogging [19:18:30] ok great [19:18:36] we have just made it 1% less ugly [19:18:40] by removing this public data push [19:18:44] that was pretty nasty [19:23:13] (03PS1) 10Milimetric: Update code to work with MySQLDb 1.3.7 [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365687 [19:23:36] (03CR) 10Milimetric: [V: 032 C: 032] Update code to work with MySQLDb 1.3.7 [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365687 (owner: 10Milimetric) [19:24:42] (03CR) 10jerkins-bot: [V: 04-1] Update code to work with MySQLDb 1.3.7 [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365687 (owner: 10Milimetric) [19:24:55] (03CR) 10jerkins-bot: [V: 04-1] Update code to work with MySQLDb 1.3.7 [analytics/geowiki] - 10https://gerrit.wikimedia.org/r/365687 (owner: 10Milimetric) [19:27:01] milimetric 1, jenkins 0 [19:28:33] ok ottomata, I merged and pulled manually so that job should be fine now, though it'd be nice to keep an eye on it just in case something else is messed up that didn't get executed this time [19:28:52] ok great yeah [19:29:00] it runs now? 
[19:29:13] yeah, I am running it now again just to check merge was ok [19:29:25] gerat [19:29:26] seems fine [19:55:21] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (10Nuria) a:03Nuria [19:55:57] 10Analytics-Kanban, 10Beta-Cluster-Infrastructure: deployment-eventlogging03 out of disk space - https://phabricator.wikimedia.org/T170522#3434423 (10Nuria) I have dropped logs in var/log and also srv/eventlogging but still there is a lot of space consumed by db that i have not been able to drop [20:01:49] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445820 (10Ottomata) [20:02:15] nuria_: ^^ haven't looked at ticket, but that box also has a mysql database with el events [20:03:51] ottomata: ya, i tried to drop it but it no work [20:04:02] ottomata: i left cmd running on a screen for like days [20:04:13] ottomata: rebooting now and trying something else [20:04:59] nuria_: can always just wipe and make a new box :D [20:05:28] ottomata: where are all the druid ports defined and what's running on each one? [20:06:02] https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Ports#Druid [20:06:16] sweet, thx [20:06:33] milimetric: Hi, my timezone is CEST, meaning pretty much whatever is good for NYC time is good for me (8h advantage) [20:06:49] milimetric: In relation to the sqoop thing [20:08:11] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445840 (10Ottomata) [20:10:52] GoranSM: cool, so I'll ping you tomorrow morning then around 09:00 my time? 
That means 15:00 your time I think [20:11:39] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445844 (10GoranSMilovanovic) Hi, as of the account: goransm, from your list - it's mine. I work as contractor Data Analyst for WMDE, have LDAP set, access to Labs and Produc... [20:12:06] milimetric: Great. Thank you for offering help on this. I'll be around. [20:12:10] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445848 (10Mholloway) [20:15:32] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445861 (10bmansurov) Hi, I (bmansurov) need access to these boxes as I query stats tables from time time as part of event logging analyses or other data related tasks. Pleas... [20:15:37] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445862 (10Mholloway) [20:18:16] 10Analytics-Cluster, 10Analytics-Kanban, 10Security, 10User-Addshore: Access rights for HDFS on stat100* for Sqoop tasks - https://phabricator.wikimedia.org/T170052#3445867 (10GoranSMilovanovic) @Milimetric Hi, I know about this and/or similar Python script. It has nothing to do with this; I work from R, f... [20:20:16] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445872 (10Mholloway) [20:20:44] 10Analytics-Cluster, 10Analytics-Kanban, 10Security, 10User-Addshore: Access rights for HDFS on stat100* for Sqoop tasks - https://phabricator.wikimedia.org/T170052#3445873 (10Milimetric) I mean, whether it's Python or R has nothing to do with the access level. My point is, there's an example of a sqoop s... 
[20:22:55] nuria_: thanks for telling me about pivot btw, I didn't know until now how much I missed it :) [20:24:33] SMalyshev: excallent, we love it when people send us love (cc fdans mforns_ milimetric elukey joal ) [20:28:20] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445888 (10Niedzielski) [20:28:24] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445890 (10MusikAnimal) For me, see T156986. I am staff and regularly use stat1002 as a safer alternative over prod for data research. Also as the author of [[ https://tools.... [20:29:24] (03PS6) 10Smalyshev: Add tagger for Wikidata Query Service requests [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/364542 (https://phabricator.wikimedia.org/T169798) [20:29:37] 10Analytics-Cluster, 10Analytics-Kanban, 10Security, 10User-Addshore: Access rights for HDFS on stat100* for Sqoop tasks - https://phabricator.wikimedia.org/T170052#3445892 (10GoranSMilovanovic) Here's what I need to do: I need to migrate all wbc_entity_usage tables - for those projects for which these the... [20:30:49] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445895 (10Dbrant) [20:32:33] 10Analytics-EventLogging, 10Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3432990 (10kaldari) @Ottomata: Why are we filtering out bot events? I don't care much one way or the other, but was curious about the rationale. Also, if... 
[20:37:20] (03CR) 10Nuria: [C: 032] Add tagger for Wikidata Query Service requests [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/364542 (https://phabricator.wikimedia.org/T169798) (owner: 10Smalyshev) [20:38:07] mforns_: also please let me know of my latest changes here: https://gerrit.wikimedia.org/r/#/c/364518 [20:39:38] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445937 (10Niharika) [20:39:40] (03CR) 10Mforns: [C: 032] "LGTM!" [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/364518 (https://phabricator.wikimedia.org/T164021) (owner: 10Nuria) [20:40:35] 10Analytics-EventLogging, 10Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3445943 (10Nuria) @kaldari : sorry this was confusing, we only filter bot events on non-eventbus events for non-mediawiki bots, mostly to not have data t... [20:41:31] 10Analytics-Data-Quality, 10Analytics-Kanban, 10Reading-Web-Backlog (Tracking): Pageview drop in ro.wikipedia hu.wikipedia and fr.wikipedia - https://phabricator.wikimedia.org/T170845#3445950 (10Jdlrobson) [20:44:22] (03Merged) 10jenkins-bot: Tag annotation should reflect that a tagger can return several tags [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/364518 (https://phabricator.wikimedia.org/T164021) (owner: 10Nuria) [20:49:15] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3445985 (10TJones) [20:58:16] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446050 (10Amire80) I'm in the staff and I use stat1002 almost daily for collecting statistics about interlanguage links. 
[20:59:37] 10Analytics-Kanban, 10Analytics-Wikistats: Implement some example metrics as Druid queries - https://phabricator.wikimedia.org/T170882#3446060 (10Milimetric) [20:59:48] 10Analytics-Kanban, 10Analytics-Wikistats: Backend for wikistats 2.0 - https://phabricator.wikimedia.org/T156384#2973128 (10Milimetric) [21:00:25] (03PS1) 10Milimetric: Implement Wikistats metrics as Druid queries [analytics/refinery] - 10https://gerrit.wikimedia.org/r/365806 (https://phabricator.wikimedia.org/T170882) [21:01:11] (03PS1) 10Milimetric: Clean up comments from create table statement [analytics/refinery] - 10https://gerrit.wikimedia.org/r/365809 [21:01:28] (03CR) 10Milimetric: [V: 032 C: 032] Clean up comments from create table statement [analytics/refinery] - 10https://gerrit.wikimedia.org/r/365809 (owner: 10Milimetric) [21:05:27] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446105 (10Jdcc-berkman) I've been doing research with Zhou Zhou at WMF Legal (who I believe is zhousqaured on the list above). I'm external, so I should have an expiration d... [21:06:24] 10Analytics-EventLogging, 10Analytics-Kanban: ChangesListHighlights events missing from MySQL starting 2017-07-11 - https://phabricator.wikimedia.org/T170486#3446117 (10kaldari) @Nuria, @Ottomata: Now I'm more confused. How do we define "non-mediawiki bots"? Relying on the user agent doesn't seem reliable. Peo... [21:06:38] (03PS7) 10Smalyshev: Add tagger for Wikidata Query Service requests [analytics/refinery/source] - 10https://gerrit.wikimedia.org/r/364542 (https://phabricator.wikimedia.org/T169798) [21:15:37] nuria_: I've rebased https://gerrit.wikimedia.org/r/#/c/364542/ [21:16:23] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446154 (10Tgr) I use EL/Hive occasionally for various investigations. 
[21:37:02] 10Analytics-Cluster, 10Analytics-Kanban, 10Patch-For-Review: Replacement of stat1002 and stat1003 - https://phabricator.wikimedia.org/T152712#3420768 (10Dzahn) The new server ran out of disk space. 13:29 < icinga-wm> PROBLEM - Disk space on stat1006 is CRITICAL: DISK CRITICAL - free space: / 0 MB (0% inode=... [21:49:55] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446466 (10pmiazga) Hi, I (pmiazga) work as contractor Software Engineer WMF and I need access to these boxes as I query stats tables from time to time as part of event loggi... [21:51:55] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446476 (10atgo) Hello! Staff as well and need continued access. [22:03:16] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446527 (10debt) [22:06:01] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446531 (10Samwalton9) Hey - I'm a WMF contractor and needed access to test T115119. Development has stalled so I probably don't need my access anymore. [22:18:20] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446584 (10Jalexander) Still need on my end. Biggest use is hive/beeline access for relatively routine subpoena/legal data gathering (one might need to happen today for examp... [22:20:36] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446587 (10jrbs) [22:24:23] 10Analytics-Cluster, 10Analytics-Kanban: Audit users and account expiry dates for stat boxes - https://phabricator.wikimedia.org/T170878#3446589 (10pmiazga)