[04:51:15] Analytics, ContentTranslation: Limn language dashboard: eswiki graph is wrong/stuck - https://phabricator.wikimedia.org/T99074#3032149 (Nuria)
[08:24:17] team going afk for (hopefully) 1 hour (workers at home)
[08:24:18] ttl!
[08:24:20] * elukey afk
[09:18:33] Analytics-Kanban: Check abnormal pageviews for XHamster - https://phabricator.wikimedia.org/T158071#3032452 (Tbayer) >>! In T158071#3029918, @Nuria wrote: >>What would be the SEO benefit of scraping the page? > eh... traffic, of course, as a result of better positioning on search. Are you saying that Google...
[09:24:45] Analytics-Kanban, Operations, User-Elukey: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#3032454 (elukey) Details from cp1058: ``` -- VCL_call BACKEND_FETCH -- VCL_return fetch -- FetchError no backend connection -- Timestamp Beresp: 148721...
[10:46:34] Analytics-Kanban, Operations, Traffic, User-Elukey: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#3032576 (elukey)
[10:54:26] Analytics-Kanban, Operations, Traffic, User-Elukey: Periodic 500s from piwik.wikimedia.org - https://phabricator.wikimedia.org/T154558#3032616 (elukey) ``` elukey@oxygen:/srv/log/webrequest$ grep piwik archive/5xx.json-20170216 | jq -r '[.http_status,.dt]| @csv' | awk -F":" '{print $1}'| sort | u...
[12:30:35] joal can I show you something for a second?
[12:32:56] fdans: I think he is travelling until this afternoon
[12:33:34] oh that's right, I misunderstood his email
[12:33:43] thank you elukey :)
[12:34:04] :)
[12:34:07] * elukey lunch!
[13:09:48] leila: hi :)
[13:10:00] awesome meeting invite
[13:10:07] batcave?
[13:10:25] haha, milimetric. Let's give it a try, not sure if my connection will handle it.
[13:10:34] oh, that's cool, irc works fine
[13:11:13] ok, let's do IRC then. It's very loud in this cafe, lunch time.
[13:11:24] let me pull up the latest email, milimetric.
[13:13:54] leila: o/
[13:14:27] Buongiorno elukey. ;)
[13:21:35] heloo
[13:22:39] !log updated firewall rules for Analytics VLAN
[13:22:46] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:25:22] Analytics-Kanban, Operations, netops: Review ACLs for the Analytics VLAN - https://phabricator.wikimedia.org/T157435#3032910 (elukey) After running `tcpdump ip6` on a couple of hosts I realized that the puppet agent contacts puppetmaster1001 via IPv6. I added a special term called `puppet` to `analyt...
[14:37:36] Analytics: Serbian Wikipedia edits spike 2016 - https://phabricator.wikimedia.org/T158310#3033082 (Milimetric)
[14:37:43] milimetric: got a second to batcave?
[14:39:37] good morning elukey!
[14:39:42] ottomata: o/
[14:39:43] should we try to fit in the upgrade before standup?
[14:39:49] sure
[14:40:36] give me 15ish is it ok?
[14:40:39] k
[14:41:54] ottomata: do you know how the mediacounts end up in http://dumps.wikimedia.your.org/other/mediacounts ? There is an email about missing data and I was curious
[14:43:42] yea i saw that too, was going to look into it
[14:43:51] oozie job i think...
[14:43:54] checking
[14:44:16] ya two jobs
[14:44:19] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0024518-160420145651441-oozie-oozi-C/
[14:44:22] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0024519-160420145651441-oozie-oozi-C/
[14:44:32] and then they get rsynced i think
[14:44:37] to dumps.wm.org from stat1002
[14:46:58] Analytics, Research-and-Data: geowiki data for Global Innovation Index - https://phabricator.wikimedia.org/T131889#3033103 (Milimetric) Here's a better query that accounts for how that table handles dates (ts I think is the run-time, start and end are the range): select country, sum(edits) edits fro...
[14:47:14] ahhh it is not in a bundle
[14:47:20] hue tricked me
[14:47:43] if the data is not there we could just fire a new coordinator
[14:48:06] (only for that piece of data missing)
[14:48:24] maybe, am looking, i think the data is there...maybe archive job failed, looking still
[14:49:19] elukey ya!
[14:49:23] the archive job for that day totally failed!
[14:49:27] https://hue.wikimedia.org/oozie/list_oozie_coordinator/0024519-160420145651441-oozie-oozi-C/
[14:49:31] go to 2nd page
[14:49:36] going to rerun..
[14:50:12] nice
[14:55:18] ottomata: going in the cave!
[14:55:51] elu be there 2 mins
[14:56:50] ottomata: batcave-
[14:56:54] argh batcave-2
[14:57:56] there
[14:58:16] elukey: https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcave-2
[14:58:17] ?
[14:59:30] hey fdans, batcave?
[15:04:41] milimetric: sure! sorry went out for a second
[15:04:59] omw
[15:06:49] joal, yt?
[15:07:04] Hey mforns, just arrived :)
[15:07:08] hi!
[15:07:14] can batcave for 5 mins?
[15:09:12] sure mforns :)
[15:09:16] maybe batcave-2
[15:09:21] ok :] OMW
[15:09:53] joal, maybe batcave-3!
[15:29:50] zareen: another big query is running; is that expected?
[15:30:21] joal: yes, just running 1 to check something
[15:32:12] zareen: This is not cool - While I understand you have needs, I have spent quite some amount of time to make you be able to run small queries. You going for a huge one without even telling us is not nice. As nuria said, we've managed not to enforce user quota on the cluster so far - You might be the one making us have to make that change.
[15:33:28] (PS5) Mforns: Add spark job to aggregate historical projectviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388)
[15:35:05] joal: i thought it would be okay to run just one. in zareen.webrequest_extract there are a couple values of wmf_last_access which seem weird.
most of these dates are formatted like "11-Jan-2017" but a few are like "11-jan-2017" or "31-DEc-2016" - different lower case/upper case in dates and those behave differently in my table. i was checking if that occurs only in zareen.webrequest_extract or also
[15:35:05] in wmf.webrequest
[15:36:37] joal: this is not for the normal queries which I run often to collect data, but to check for abnormal rows
[15:36:38] zareen: I don't actually care the reason - you not even talking to us is not cool. I won't kill it, but please be so kind as to acknowledge the effort we put into helping you.
[15:38:50] joal: i'm sorry, i didn't realize i had to ask to run one-off queries. i've completely stopped running the queries to collect data from wmf.webrequest. perhaps, i should have asked you about the rows before running this query, but thought if i could figure it out on my own it would be better than taking up more of your time.
[15:40:28] joal: as i've said, i really do appreciate the help you've already provided and don't mean to undermine your efforts. creating zareen.webrequest_extract has been very useful and helpful for my work
[15:58:52] Analytics, ChangeProp, EventBus, Revision-Scoring-As-A-Service-Backlog, and 2 others: Rewrite ORES precaching change propagation configuration as a code module - https://phabricator.wikimedia.org/T148714#2730864 (Halfak) p:Triage>Normal
[16:01:13] a-team: hola!
[16:01:30] aholaY!
[16:01:45] mforns, joal, ottomata : stadduppp
[16:02:18] mforns, joal, ottomata, fdans : stadduppp
[16:02:43] going!
[16:18:03] Analytics: Security Upgrade for piwik - https://phabricator.wikimedia.org/T158322#3033379 (Nuria)
[16:42:10] milimetric nuria fdans mforns updated mocks please review before we meet tomorrow: https://www.dropbox.com/sh/wyqf29o81rbnk5f/AABMqc24sUggcl9ws3_FtPvca?dl=0
[16:42:28] ashgrigas, hi! will do, thanks :]
[16:42:34] mforns: break break?
[16:42:37] mforns thanks!
[16:42:56] joal, break? :] what?
[16:43:16] breaking your break? batcave?
[16:44:46] joal, ok!
[16:52:19] Analytics: Add UDF to translate from wikiCode to projectName - https://phabricator.wikimedia.org/T158330#3033556 (mforns)
[16:53:01] Analytics: Serbian Wikipedia edits spike 2016 - https://phabricator.wikimedia.org/T158310#3033071 (Nuria)
[16:55:50] Analytics-Kanban: Split unique devices data for Asiacell and non-Asiacell traffic in Iraq - https://phabricator.wikimedia.org/T158237#3033580 (Nuria) a:Nuria
[16:58:16] Analytics-Kanban: Create EventStreams swagger spec docs endpoint - https://phabricator.wikimedia.org/T158066#3033592 (Nuria)
[17:02:57] Analytics-Kanban: Blog post about druid - https://phabricator.wikimedia.org/T157978#3033612 (Milimetric) p:Triage>Normal a:Milimetric
[17:03:11] Analytics: upgrade druid to 0.9.2 - https://phabricator.wikimedia.org/T157977#3033616 (Milimetric)
[17:03:36] Analytics: Add UDF to translate from wikiCode to projectName - https://phabricator.wikimedia.org/T158330#3033618 (Milimetric) p:Triage>Normal
[17:03:41] Analytics: Agreggate banner dataset for long term retention - https://phabricator.wikimedia.org/T157582#3033619 (Milimetric) p:Triage>Normal
[17:03:47] Analytics: upgrade druid to 0.9.2 - https://phabricator.wikimedia.org/T157977#3022179 (Milimetric) p:Triage>Normal
[17:05:28] Analytics-Kanban: Fix description of webrequest table - https://phabricator.wikimedia.org/T157951#3021381 (Milimetric) p:Triage>Normal
[17:09:40] Analytics: Update undocumented EventLogging mediawiki hooks - https://phabricator.wikimedia.org/T158331#3033632 (Milimetric)
[17:25:07] Analytics-Kanban, Discovery-Analysis, Browser-Support-Apple-Safari: Visits/searches from Safari 10 location bar search suggestions - https://phabricator.wikimedia.org/T157796#3033673 (mpopov) a:mpopov Hm...
seeing something that might indicate they're using our API :) looking deeper into it…
[17:25:49] mforns: only way I got it working is by using spark2.0 (as expected) - I please you to keep working with us ;)
[17:29:56] joal, oh! OK, xD don't worry
[17:30:17] mforns: I'm going to review your patch now
[17:30:31] joal, do we want to switch to spark2.0?
[17:31:17] mforns: We do, but it's a significant change
[17:32:04] mforns: I'm gonna merge your change as is, then wait for the CDH upgrade, and ask my beloved ops if we can make spark 2.1 our default (more work than 1.6 for them)
[17:34:36] mforns: One last nit - Could you go straight to -cdh5.10.0 in pom (I just realized this will be the version that'll be deployed)
[17:34:40] sorry :/
[17:35:03] joal, np will do
[17:35:44] joal: in general am for it, file a task?
[17:35:48] maybe we can just make it easier to use
[17:35:50] if not the default
[17:35:55] like, officially installed jars or something
[17:35:58] (PS6) Mforns: Add spark job to aggregate historical projectviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388)
[17:36:03] Sure ottomata, I like the idea of having your support
[17:39:04] Analytics: Make Spark 2.1 easily available on new CDH5.10 cluster - https://phabricator.wikimedia.org/T158334#3033748 (JAllemandou)
[17:39:15] ottomata: --^
[17:39:20] Thanks :)
[17:40:27] (CR) Joal: [C: 1] "LGTM ! Thanks mforns for having tested all my ideas ;)" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/337593 (https://phabricator.wikimedia.org/T156388) (owner: Mforns)
[17:40:38] thanks joal!
[17:41:08] Analytics: Make Spark 2.1 easily available on new CDH5.10 cluster - https://phabricator.wikimedia.org/T158334#3033748 (Ottomata) +1, i betcha we could just load the jars into hdfs and have a special wrapper script to use them. MAYBE. :)
[17:59:58] ottomata: I don't know what AP REF is.
It sounds like some sort of mapping has been done somewhere
[18:03:04] ¯\_(ツ)_/¯
[19:05:54] milimetric: did we conclude that the data we were looking at this morning your time was from WP only?
[19:06:16] so we have Tu, We, Thu nights?
[19:16:55] Analytics-Kanban: Check abnormal pageviews for XHamster - https://phabricator.wikimedia.org/T158071#3034178 (MusikAnimal) >>! In T158071#3032452, @Tbayer wrote: > Yes, see e.g. https://en.wikipedia.org/wiki/PageRank . But links to a page are not the same as the number of times a page was "scraped", so I still...
[19:55:44] milimetric, yt?
[19:56:52] hey
[19:56:56] on a call, one min
[19:59:25] mforns: hey, back
[19:59:26] sup
[19:59:57] hey milimetric :] I'm looking into translating wiki codes into project names, can you brainbounce for a sec?
[20:00:50] nuria: Do you want to spend a minute looking at IQ uniques?
[20:01:06] joal: sure, give me 5 minutes
[20:01:12] sure
[20:01:25] mforns: omw cave
[20:05:25] Analytics-Kanban: Check abnormal pageviews for XHamster - https://phabricator.wikimedia.org/T158071#3034409 (Nuria) >Are you saying that Google use our pageview numbers as a search result ranking signal? That would be huge and surprising news which should be more widely known. What is the source for this? n...
[20:05:51] joal: batcave looks occupied, somewhere else?
[20:05:58] https://hangouts.google.com/hangouts/_/wikimedia.org/a-batcav-2
[20:06:03] nuria: --^
[20:07:32] nuria: you there ?
[20:07:50] joal: sorry, no, give me a sec
[20:07:58] nuria: sure, np, was wondering
[20:44:16] Analytics-Kanban, Browser-Support-Apple-Safari, Discovery-Analysis (Current work): Visits/searches from Safari 10 location bar search suggestions - https://phabricator.wikimedia.org/T157796#3034500 (debt)
[20:45:40] nuria: confirmation for daily (all domains): underestimates are 427 and offset 2345 --> Not viable
[20:45:56] Let's try with global domains next week :)
[20:46:07] Good night a-team , gone for weekend !
[20:46:28] joal: super thanks
[20:47:09] nite!
[20:47:27] madhuvishy: PAWS rock :)
[20:47:43] madhuvishy: Thanks again for putting that up ;)
[20:47:54] Gone for real now!
[20:48:46] joal: :D I'm glad
[21:06:30] Quarry: Users blocked from account creation on meta can not use Quarry - https://phabricator.wikimedia.org/T157342#3002152 (Tgr) Should be fixed with the next train.
[21:21:52] bye team, cya tomorrow
[21:42:17] Analytics, EventBus, Wikimedia-Stream, service-template-node, Services (watching): Tests for swagger spec stream routes in EventStreams - https://phabricator.wikimedia.org/T150439#3034632 (Ottomata) Or, perhaps we should just wait until this happens: https://github.com/go-swagger/go-swagger/i...
[22:24:51] Analytics, EventBus: log-events topic emitted in EventBus - https://phabricator.wikimedia.org/T155804#3034749 (Mattflaschen-WMF) There is an interface that maps pretty well to a schema: https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/logging/LogEntry.php;f1284cbadae948be64e11363...
[22:30:44] Analytics, EventBus, Services (watching): log-events topic emitted in EventBus - https://phabricator.wikimedia.org/T155804#3034759 (Pchelolo)
[22:35:38] Analytics, EventBus, Services (watching): log-events topic emitted in EventBus - https://phabricator.wikimedia.org/T155804#2955138 (Pchelolo) Do we know the approximate rate of events here?
[22:47:00] Quarry, Labs, Tool-Labs: Clarify Tool Labs' rules to see if Quarry and PAWS are allowed to be hosted there - https://phabricator.wikimedia.org/T152212#3034852 (scfc) p:Triage>Low
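
Note on the 15:35 exchange: zareen describes wmf_last_access values that differ only in month capitalisation ("11-Jan-2017" vs "11-jan-2017" or "31-DEc-2016"). A minimal sketch of how such rows could be spotted and normalised on a small extract, assuming the values have already been pulled out of the table as plain `DD-Mon-YYYY` strings. The function names here are hypothetical and are not part of refinery or any Wikimedia tooling.

```python
from collections import Counter


def normalize_last_access(value: str) -> str:
    """Return the date with the month abbreviation in 'Jan' style casing.

    Assumes the input looks like DD-Mon-YYYY; anything else raises ValueError.
    """
    day, month, year = value.strip().split("-")
    return f"{day}-{month.capitalize()}-{year}"


def find_case_variants(values):
    """Count the values whose casing differs from the normalised form."""
    return Counter(v for v in values if v != normalize_last_access(v))


sample = ["11-Jan-2017", "11-jan-2017", "31-DEc-2016", "11-jan-2017"]
variants = find_case_variants(sample)
for value, count in variants.items():
    print(value, "->", normalize_last_access(value), f"({count} rows)")
```

Running this over a sampled extract rather than the full table keeps the check cheap, which is the concern joal raises above.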