[02:39:52] harej: you need a research collaboration (formal one) with resercrh team
[02:40:08] *with research team
[02:40:11] let me dig url
[02:41:58] harej: https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations
[02:42:39] harej: full disclosure that -at this time, this year- i think the team cannot handle more collaborations but I would e-mail and ask
[02:43:09] harej: other than formal collaborations we do not provide ad-hoc access
[08:27:32] Hi team
[09:16:30] (PS2) Joal: Upgrade restbase-modules to latest [analytics/aqs] - https://gerrit.wikimedia.org/r/384590 (https://phabricator.wikimedia.org/T178312)
[09:18:12] fdans: Good morning! We haz dataz :D
[09:18:48] awyisssssss joal
[09:19:00] https://wikimedia.org/api/rest_v1/metrics/edits/aggregate/en.wikipedia/user/content/daily/20170801/20171001
[09:19:18] * joal dances in front of his computer
[09:19:44] \o/
[09:23:48] what is this measuring here?
[09:25:37] harej: we are testing (very much alpha release, unstable, can break and so on) endpoints serving edits/editors/edited-pages statistics
[10:23:34] oh joal I'm so excited :D
[10:23:44] hehehe :)
[10:28:06] fdans: Can I help to get visualization of those metrics?
[10:29:29] (PS2) Fdans: Add central notice component and detect adblock [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383798 (https://phabricator.wikimedia.org/T177491)
[10:36:54] joal: I'm currently investigating something fishy with the config + router, let's catch up during the afternoon and I'll show you what we have?
[10:37:08] fdans: You're awesome :)
[10:37:41] noooooo you are!
[11:09:21] (CR) Mobrovac: [C: 1] "Nice!" [analytics/aqs] - https://gerrit.wikimedia.org/r/384590 (https://phabricator.wikimedia.org/T178312) (owner: Joal)
[12:13:46] joal: will new_registered_users always return 0 for the last month when querying from a past date to the present?
[12:14:51] https://usercontent.irccloud-cdn.com/file/cldp28GV/Screen%20Shot%202017-10-17%20at%2014.14.30.png
[12:17:42] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Make Qgil a fallback for Bitergia access (lock-in) - https://phabricator.wikimedia.org/T178381#3690278 (Aklapper)
[12:54:09] elukey, hi!
[12:54:37] weird fdans !
[12:54:50] hey joal :]
[12:54:54] hey fdans :]
[12:55:00] Hi mforns :)
[12:55:50] joal, BTW, I'm trying to execute the banner data script with the hdfs user (by hand) but when I do sudo -u hdfs the PYTHONPATH is not defined
[12:56:09] and I can not export it with sudo -u hdfs
[12:56:18] do you know how to achieve that?
[12:56:32] * from stat1005
[12:56:40] I don't mforns - I think we need elukey or ottomata
[12:56:52] ok
[12:59:22] fdans: This is a bug at indexation! Crap
[13:28:22] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Make Qgil a fallback for Bitergia access (lock-in) - https://phabricator.wikimedia.org/T178381#3690511 (Aklapper)
[13:28:44] (CR) Ppchelko: [C: 1] Upgrade restbase-modules to latest [analytics/aqs] - https://gerrit.wikimedia.org/r/384590 (https://phabricator.wikimedia.org/T178312) (owner: Joal)
[14:07:45] mforns: I know!!
[14:07:47] luca taught me
[14:07:55] ok, the simplest way is to sudo -u hdfs /bin/bash
[14:08:01] that drops you into the shell as the hdfs user
[14:08:03] milimetric, hehehe I already found a way
[14:08:05] then you can export and run whatever
[14:08:07] oh
[14:08:13] I see!
[14:08:37] I did some crazy other thing too, but I find this easier to understand and therefore remember
[14:08:38] I did: sudo -u hdfs PYTHONPATH=...
[14:09:02] it turns out sudo lets you pass env vars
[14:09:42] but your solution is powerful! hehe I think next time I will take your advice :]
[14:12:36] it's not mine, it's luca's, I was doing something super crazy, nice to know sudo takes env vars
[14:21:31] joal: are the bytes-related metrics up as well?
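Editor's note on the sudo exchange above: a minimal sketch of the second approach mentioned in the chat (passing `VAR=value` on the sudo command line), written as a small Python helper that only builds the command. The path and script name are hypothetical placeholders, since the chat elides the real value as `PYTHONPATH=...`. Whether sudo accepts the variable depends on the local sudoers policy.

```python
import shlex


def sudo_with_env(user, env, command):
    """Build an argv list of the form: sudo -u USER VAR=value ... COMMAND."""
    assignments = [f"{name}={value}" for name, value in env.items()]
    return ["sudo", "-u", user, *assignments, *command]


# Hypothetical path and script name, for illustration only.
argv = sudo_with_env(
    "hdfs",
    {"PYTHONPATH": "/path/to/refinery/python"},
    ["python", "banner_script.py"],
)

# Print the command for inspection instead of running it.
print(shlex.join(argv))
```

The other approach from the chat, `sudo -u hdfs /bin/bash` followed by an `export` in that shell, needs no helper and is probably easier to remember.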
[14:21:43] not sure what's wrong with this url in that case
[14:21:44] https://wikimedia.org/api/rest_v1/metrics/bytes/net/en.wikipedia.org/all-editor-types/all-page-types/monthly/2015101700/2017101700
[14:22:37] Analytics-Kanban, Performance-Team: Explore NavigationTiming by faceted properties - EventLogging refine - https://phabricator.wikimedia.org/T166414#3690723 (mforns)
[14:41:18] fdans: https://wikimedia.org/api/rest_v1/metrics/bytes-difference/net/aggregate/en.wikipedia.org/all-editor-types/all-page-types/monthly/2015101700/2017101700
[14:41:38] ooooo thanks dan
[14:42:23] joal: are you ok with me doing another pull request to restbase for just little language tweaks?
[14:43:23] easier than doing a review and making you do all the work
[14:45:25] omg joal
[14:45:28] there is datas
[14:45:43] fdans: https://wikimedia.org/api/rest_v1/metrics/bytes-difference/net/en.wikipedia.org/all-editor-types/all-page-types/monthly/2015101700/2017101700
[14:45:54] yes nuria_ - We haz dataz :)
[14:45:58] yes milimetric clarified
[14:46:07] oh missed milimetric call - sorry
[14:46:17] milimetric: Please go ahead with a new PR :)
[14:46:32] milimetric: Mine has already been merged, so a new one would be better
[14:47:07] and milimetric, since language is pretty much the same in restbase and our v1 files, changing both would be awesome (sorry for doubling the workload)
[14:47:51] I'm soooooo happy right now <3
[14:47:52] joal: no problem at all, happy to be useful
[14:53:44] fdans: So am I :0
[14:53:48] :) sorry
[15:07:33] Analytics-Kanban, Patch-For-Review: Fix MediaWiki snapshot cleaner cron job - https://phabricator.wikimedia.org/T178256#3690869 (Nuria)
[15:17:15] joal: druid-public-broker.svc.eqiad.wmnet
[15:17:23] ottomata: port 8082?
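Editor's note on the metrics URLs exchanged earlier in the log: a minimal sketch that assembles the `bytes-difference/net/aggregate` URL from its path segments, which makes the difference from the failing `bytes/net/...` URL easier to see. The function and parameter names are assumptions made for illustration; the segment order is copied from the URL pasted in the chat, and the endpoints were described as alpha and unstable at the time.

```python
BASE = "https://wikimedia.org/api/rest_v1/metrics"


def bytes_difference_net_aggregate(project, editor_type, page_type,
                                   granularity, start, end):
    """Join the path segments in the order seen in the pasted URL."""
    segments = [
        BASE, "bytes-difference", "net", "aggregate",
        project, editor_type, page_type, granularity, start, end,
    ]
    return "/".join(segments)


url = bytes_difference_net_aggregate(
    "en.wikipedia.org", "all-editor-types", "all-page-types",
    "monthly", "2015101700", "2017101700",
)
print(url)
```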
[15:17:26] oh ya
[15:17:26] ya
[15:17:42] Ok ottomata, will submit a patch
[15:17:43] (CR) Milimetric: "comment on a comment, but otherwise +2" (2 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/384530 (https://phabricator.wikimedia.org/T178302) (owner: Mforns)
[15:43:31] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3690942 (Nuria) >Please do not change the settings without an opportunity for the apps teams to review the tradeoffs involved Please do analyze the tradeoffs , but aga...
[15:48:51] Analytics: Survey dashboard layout for dashiki - https://phabricator.wikimedia.org/T178399#3690954 (Nuria)
[16:12:11] Analytics-Cluster, Analytics-Kanban, Language-Team, MediaWiki-extensions-UniversalLanguageSelector, and 3 others: Migrate table creation query to oozie for interlanguage links - https://phabricator.wikimedia.org/T170764#3691012 (Nuria)
[16:20:22] (PS14) Joal: Update mediawiki-history-reduced oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/379000 (https://phabricator.wikimedia.org/T174174)
[16:20:27] milimetric: --^
[16:20:48] I think we'll be good with that milimetric - Also fixes mediawiki-history-beta
[16:21:01] great, joal, I'll review after lunch first thing
[16:21:02] thank you
[16:22:41] milimetric: I'll launch a new indexation now, in case we experience failure
[16:23:01] good
[16:28:49] (PS15) Joal: Update mediawiki-history-reduced oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/379000 (https://phabricator.wikimedia.org/T174174)
[16:28:56] milimetric: Actually a good idea ... Sorry for the newly introduced bug
[16:30:07] looks good now
[16:30:17] Away for dinner a-team
[16:32:19] i am always intimidated by https://gerrit.wikimedia.org/r/#/c/379000/15/oozie/mediawiki/history/reduced/generate_mediawiki_history_reduced.hql
[16:38:49] If it's a data lake, and it's processed using stuff in the refinery repository, then does the refinery process water? Or is it a lake of gasoline?
[16:38:54] * AndyRussG confiscates matches
[16:48:31] AndyRussG: 5 minutes without friends
[16:58:04] wikimedia/mediawiki-extensions-EventLogging#704 (wmf/1.31.0-wmf.4 - 2cecfd7 : Timo Tijhof): The build has errored.
[16:58:04] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/compare/wmf/1.31.0-wmf.4
[16:58:04] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/289129103
[17:18:46] Analytics, DBA, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3691322 (faidon)
[17:34:05] PROBLEM - HDFS capacity used percentage on analytics1001 is CRITICAL: CRITICAL: 72.41% of data above the critical threshold [90.0]
[17:35:19] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3691372 (Nuria) ping @JMinor
[17:50:40] mforns: were you able to run the cleaner script for mediawiki and delete teh snapshots we do not need?
[17:50:42] *the
[17:50:55] nuria_, yes, there are 6 snapshots right now
[17:51:00] mforns: thanks
[17:51:01] I checked the data and looks good
[17:51:04] mforns: k
[17:59:46] ACKNOWLEDGEMENT - HDFS capacity used percentage on analytics1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [90.0] ottomata Should be able to delete some data soon.
[18:16:25] (PS2) Mforns: Fix refinery banner activity cleaner to allow for email alerts [analytics/refinery] - https://gerrit.wikimedia.org/r/384530 (https://phabricator.wikimedia.org/T178302)
[18:17:58] (CR) Mforns: "Thanks for the review, milimetric! The comment makes total sense." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/384530 (https://phabricator.wikimedia.org/T178302) (owner: Mforns)
[18:18:32] (CR) Milimetric: [V: 2 C: 2] Fix refinery banner activity cleaner to allow for email alerts [analytics/refinery] - https://gerrit.wikimedia.org/r/384530 (https://phabricator.wikimedia.org/T178302) (owner: Mforns)
[18:19:25] thx! :]
[18:23:17] (CR) Mforns: [C: 2] "LGTM!" [analytics/refinery] - https://gerrit.wikimedia.org/r/379000 (https://phabricator.wikimedia.org/T174174) (owner: Joal)
[18:42:24] RECOVERY - HDFS capacity used percentage on analytics1001 is OK: OK: Less than 60.00% above the threshold [85.0]
[19:44:02] Analytics-Kanban: Archive tables to hadoop: MobileWikiAppToCInteraction_10375484_15423246 and Edit_13457736_15423246 - https://phabricator.wikimedia.org/T177960#3691949 (Nuria) MobileWikiAppToCInteraction_10375484_15423246 is ready to be dropped, on archive db on hdfs cc @elukey
[20:00:58] joal: i get this feeling that sqoop cannot deal with "." in column names, does thi ssound familiar at all?
[20:01:02] *this
[20:01:29] nuria_: we've not run into this with mw-history, but this is very possible
[20:02:17] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Readers-Web-Backlog (Tracking): Schema:Popups suddenly stopped logging events in MariaDB, but they are still being sent according to Grafana - https://phabricator.wikimedia.org/T174815#3691968 (Nuria) Docs look good, thanks.
[20:02:35] joal: man , sqoop is a source of unexpected surprises
[20:03:11] indeed nuria
[20:03:25] nuria_: sqoop type conversion is another funny thing :)
[20:03:55] joal: ya, some columns i just could not coerce into any type that was not string after trying for a while
[20:04:12] joal: i was just mentioning this to tilman
[20:04:37] joal: and i tried casting, java column blah blah
[20:04:54] joal: i ended up doing convert(x using utf8)
[20:05:18] nuria_: for blobs to be read as strings, this is best option I think
[20:05:30] joal: even bigint columns
[20:05:44] wow - that's unexpected
[20:06:01] joal: I guess there must be a magical combination of parameters
[20:06:10] joal: but I tried a ton
[20:07:26] nuria_: it depends a lot on how they've been encoded - We've not had any issues with ints, some with string/blobs, some with booleans
[20:07:41] joal: ya, i saw it on mw code
[20:08:05] joal: boy , i read those python scripts like 10 times
[20:08:34] joal: and i bet that there is probably a way to do it with this data, i was not able to find it
[20:12:18] joal: ok, yes, for future reference columns with "." need to be double quoted
[20:14:14] I think I'll forget it nuria_, but it's good to know we found it :)
[21:30:17] Analytics-EventLogging, Analytics-Kanban, MW-1.31-release-notes (WMF-deploy-2017-10-10 (1.31.0-wmf.3)), Patch-For-Review: PageContenSaveComplete. Stop collecting - https://phabricator.wikimedia.org/T177101#3647234 (Nuria) Events no longer flowing in: https://grafana.wikimedia.org/dashboard/db/eve...
[21:32:44] Analytics-EventLogging, Analytics-Kanban: Refine should parse user agent field as it is done on refinery pipeline - https://phabricator.wikimedia.org/T178440#3692276 (Nuria)
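Editor's note on the sqoop exchange above: a minimal sketch of the two workarounds mentioned in the chat, written as string helpers for building an import query. The table and column names are hypothetical, and the helpers are not taken from the refinery scripts. The chat does not say at which layer the double quoting of dotted column names applies, so `quote_if_dotted` simply mirrors the remark as stated. The `$CONDITIONS` placeholder is the token sqoop expects in free-form `--query` imports.

```python
def as_utf8(column):
    """Wrap a column in CONVERT(... USING utf8), keeping its name as alias."""
    return f"convert({column} using utf8) as {column}"


def quote_if_dotted(column):
    """Double quote a column name that contains a dot, as noted in the chat."""
    return f'"{column}"' if "." in column else column


def build_query(table, plain_columns, text_columns):
    """Assemble a select statement; text_columns get the utf8 conversion."""
    select_list = list(plain_columns) + [as_utf8(c) for c in text_columns]
    return f"select {', '.join(select_list)} from {table} where $CONDITIONS"


# Hypothetical table and column names, for illustration only.
query = build_query("some_table", ["id"], ["payload"])
print(query)
print(quote_if_dotted("event.action"))
```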