[00:36:19] Analytics, Community-Tech-Sprint: Investigation: How can we improve the speed of the popular pages bot - https://phabricator.wikimedia.org/T164178#3250240 (Niharika) >>! In T164178#3246056, @Stevietheman wrote: >>>! In T164178#3245629, @Niharika wrote: >> @Stevietheman I can't find a reference for it now... [05:50:49] Analytics, DC-Ops, Operations, ops-eqiad, Patch-For-Review: Decom/Reclaim analytics1027 - https://phabricator.wikimedia.org/T161597#3250586 (elukey) Thanks @Dzahn! Next time I will not put the host in role spare but I'll remove everything! [06:11:02] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad, User-Elukey: Analytics hosts showed high temperature alarms - https://phabricator.wikimedia.org/T132256#3250605 (elukey) Checked on analytics10[32,33] and mcelog shows no events after Chris' maintenance. [06:17:34] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad, User-Elukey: Analytics hosts showed high temperature alarms - https://phabricator.wikimedia.org/T132256#3250612 (elukey) Hosts remaining to do: * analytics1060.eqiad.wmnet * analytics1029.eqiad.wmnet * analytics1037.eqiad.wmnet * an... [06:35:00] Analytics, Reading Epics, Wikipedia-iOS-App-Backlog, Spike, iOS-app-v5.5.0-Snake-On-A-Magic-Towel: Research and define initial technical requirements for app analytics - https://phabricator.wikimedia.org/T164801#3250617 (Nuria) [07:04:13] * elukey commutes to the office [07:51:47] Analytics, DC-Ops, Operations, ops-eqdfw: SATA errors for stat1004 in the dmesg - https://phabricator.wikimedia.org/T162770#3250730 (elukey) @Cmjohnson sorry for the late response, didn't notice your answer! So we have two sw raid10 already running, so I'd say AHCI (so not hw raid) but please l...
[10:13:38] in EL we have table attributes called "event_action.abort.mechanism" [10:13:41] mmmm [10:13:47] never seen dots in there [10:14:42] atm I came up with ^[A-Za-z0-9_\.]+(\[[a-z0-9_]+\])?$ [10:15:12] to allow current attribute names and the new format that Marcel proposed for whitelisting single JSON fields, attributename[jsonfieldname] [10:19:35] I thought that parsing that would have been trivial but I am finding a lot of corner cases [10:19:55] for example, we were discussing that a white list with something like [10:20:02] TableName attributename [10:20:11] Tablename attributename[somefield] [10:20:19] is probably something wrong [10:25:05] (the user wanted to keep only a JSON field but forgot to remove the prev whitelist for the whole json structure) [10:56:54] I think I found a good data structure for the problem, will talk with Marcel when he's online :) [12:13:46] Analytics-Tech-community-metrics: Redirect korma.wmflabs.org - https://phabricator.wikimedia.org/T164924#3251231 (Nemo_bis) [12:30:22] Analytics-Tech-community-metrics: Redirect korma.wmflabs.org - https://phabricator.wikimedia.org/T164924#3251294 (Aklapper) p:Triage>Lowest [12:39:04] elukey: do you have a couple minutti per parlare nella batcaverna? [12:40:20] fdans: only if you keep talking italian and english [12:40:43] ma of courso! [12:48:09] Joal, I won't make it to the live systems meeting this morning [12:48:16] FYI. [12:48:25] halfak|Mobile: Ok sir [12:48:36] halfak|Mobile: nothing really new on my side [12:48:41] Just about to hey on my bike to head into the University.
:D [12:48:54] halfak|Mobile: Have a good one :) [12:58:25] (CR) Ottomata: EventLogging JSON -> Hive (8 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal) [12:58:44] (PS23) Ottomata: EventLogging JSON -> Hive [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924) (owner: Joal) [13:03:58] hey elukey I'm here [13:04:30] hola! [13:08:02] (PS10) Mforns: Update banner monthly job to reuse index [analytics/refinery] - https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: Joal) [13:09:29] elukey, hey what's uppppp [13:09:29] I saw your ping about EL [13:11:45] nono I was writing stuff in the chan [13:18:41] fdans: I think that the require is not needed [13:19:15] fdans: in the plugin we have require => File['/usr/local/lib/eventlogging'] [13:19:23] that is defined in eventlogging::server [13:19:32] this will create the implicit dependency [13:19:41] ohh I see [13:20:32] elukey: removing require, pushing [13:21:04] fdans: wait a sec [13:21:16] ./modules/eventlogging/files/filters.py:5:52: W292 no newline at end of file [13:21:25] ERROR: InvocationError: '/home/jenkins/workspace/operations-puppet-tests-jessie/.tox/pep8/bin/flake8' [13:21:29] dammit [13:21:48] every time [13:22:10] also I am reviewing filters.py [13:22:19] it seems super easy but shouldn't we handle exceptions? [13:22:52] json.loads IIRC raises them if it doesn't like the json [13:24:42] moreover: [13:25:08] elukey: the filter should probably return True if json load fails right?
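Editor's sketch (not part of the log): the behaviour being proposed here, a filter that drops an event only when it is positively flagged as a bot and lets anything it cannot parse through so that downstream code deals with it, could look roughly like the following. The names `exclude_bots` and `is_bot` both come up a few messages later in the conversation. The assumptions that the filter receives the raw JSON string and that `is_bot` sits at the top level of the decoded object are mine and may not match the real filters.py.

```python
import json


def exclude_bots(raw_event):
    """Return False only for events positively identified as bots.

    Hypothetical layout: raw_event is a JSON string whose decoded form is a
    dict with an optional top-level boolean 'is_bot' (an assumption, not
    taken from the actual EventLogging code).
    """
    try:
        event = json.loads(raw_event)
    except (ValueError, TypeError):
        # Malformed or missing JSON: keep the event and let downstream
        # consumers report the problem.
        return True
    if not isinstance(event, dict):
        # Valid JSON but not an object: again, not this filter's problem.
        return True
    # A missing 'is_bot' key must not raise KeyError; treat it as "not a bot".
    return event.get('is_bot') is not True
```

Using `dict.get` rather than indexing sidesteps the KeyError case raised in the discussion, and catching `ValueError` covers JSON decoding failures on both Python 2 and Python 3.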
[13:25:09] 1) I'd couple all the code changes in one block (aligning '=') and then I'd add a comment about why we are doing it [13:25:30] 2) I thought I had more but I was wrong :D [13:25:48] fdans: yeah I am not sure how the exception will be handled, but it is worth considering the case [13:26:42] or say for some reason 'is_bot' is not in the data structure [13:26:48] ---> KeyError [13:26:59] my view is that that function should only return false if it has found a bot/spider, and let downstream handle problems with json [13:27:04] yeah that makes sense [13:27:42] it might be an option but please check what happens if exclude_bots craps out :D [13:28:18] mforns: I have a data structure that should represent easily the new whitelist format [13:28:26] elukey, aha [13:28:35] mforns: so something like [13:28:58] a hash with all the table names specified in the whitelist as keys [13:29:37] each table-key points to another hash, that represents what I call the attribute prefix [13:29:55] if there is no json [field] suffix, it will point to a empty list [13:30:00] otherwise to a list of fields [13:30:32] >> 3:25 PM  1) I'd couple all the code changes in one block (aligning '=') and then I'd add a comment about why we are doing it [13:30:38] elukey, reading [13:30:38] not sure what you mean with this [13:31:38] fdans: I'd couple the definition of the plugin with the new variables that you created ($filter_function and $filter_scheme), with a brief comment on top explaining why we put those things in there [13:31:49] moreover, the '=' of the two new variables are not aligned [13:32:10] it makes my OCD problem worse :D [13:33:21] elukey, makes sense [13:33:41] plus I had a chat with Jaime about where to run the script etc.. [13:33:48] so there a couple of ? to answer [13:34:17] for example, say we run the script locally to each db [13:34:22] on the slaves first [13:34:43] if the purge happens and then the sync script runs, will the latter restore the rows from the master? 
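Editor's sketch (not part of the log): the whitelist structure elukey describes at 13:28 to 13:30 (table name, then attribute prefix, then a list of JSON fields that is empty when the whole attribute is whitelisted) can be combined with the pattern quoted at 10:14:42. The function name `parse_whitelist`, the one-entry-per-line input format, the case-sensitive table names, and the choice to raise `ValueError` on the conflicting case discussed at 10:19 to 10:25 are all assumptions made for illustration. The pattern accepts the same strings as the one in the log; only capture groups were added.

```python
import re
from collections import defaultdict

# Same language as the regex quoted in the log, with capture groups added
# for the attribute prefix and the optional [jsonfieldname] suffix.
ATTRIBUTE_RE = re.compile(r'^([A-Za-z0-9_\.]+)(?:\[([a-z0-9_]+)\])?$')


def parse_whitelist(lines):
    """Build {table: {attribute: [json_fields]}} from 'Table attribute' lines.

    An empty list means the whole attribute is whitelisted.
    """
    whitelist = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError('expected "TableName attribute": {!r}'.format(line))
        table, spec = parts
        match = ATTRIBUTE_RE.match(spec)
        if match is None:
            raise ValueError('invalid attribute: {!r}'.format(spec))
        attribute, field = match.groups()
        fields = whitelist[table].get(attribute)
        if fields is None:
            # First time this attribute is seen for this table.
            whitelist[table][attribute] = [field] if field else []
        elif (field is None) != (fields == []):
            # One entry keeps the whole attribute, another keeps a single
            # JSON field: the corner case flagged as "probably wrong".
            raise ValueError('conflicting entries for {} {}'.format(table, attribute))
        elif field is not None and field not in fields:
            fields.append(field)
    return dict(whitelist)
```

For example, `parse_whitelist(['Edit userAgent[os_family]', 'Edit id'])` would give `{'Edit': {'userAgent': ['os_family'], 'id': []}}`. Rejecting the mixed case outright is only one option; a real implementation might prefer to warn and keep the narrower entry.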
[13:44:21] (CR) Mforns: [C: 2] "LGTM! Please, merge if tested thx!" [analytics/refinery] - https://gerrit.wikimedia.org/r/352784 (https://phabricator.wikimedia.org/T164730) (owner: Joal) [13:56:28] (PS11) Mforns: Update banner monthly job to reuse index [analytics/refinery] - https://gerrit.wikimedia.org/r/347653 (https://phabricator.wikimedia.org/T159727) (owner: Joal) [14:00:20] fdans: holaaa standuppp [14:06:20] Analytics-Tech-community-metrics: Usable links for specific users or repositories - https://phabricator.wikimedia.org/T164934#3251580 (Nemo_bis) [14:07:36] Analytics-Tech-community-metrics: "Wiki Editions" should be "Wiki edits" - https://phabricator.wikimedia.org/T164935#3251594 (Nemo_bis) [14:07:51] Analytics-Kanban: Pageview hourly data in Pivot is not showing up correctly - https://phabricator.wikimedia.org/T164586#3238782 (Nuria) This was resolved by turning off the real time job, there were two granularities being written to the same place and druid did not like it [14:08:36] Analytics-Kanban, Patch-For-Review: Getting different versions of the same file - https://phabricator.wikimedia.org/T163338#3251609 (Nuria) Open>Resolved [14:08:51] Analytics-Kanban, Easy, Patch-For-Review: Story: VitalSignsUser selects Monthly Pageviews metric - https://phabricator.wikimedia.org/T75331#3251610 (Nuria) Open>Resolved [14:09:03] Analytics-Kanban: Piwik improvements - https://phabricator.wikimedia.org/T163000#3251612 (Nuria) [14:09:05] Analytics-Kanban, Patch-For-Review, User-Elukey: Metrics and Dashboards for Piwik - https://phabricator.wikimedia.org/T163204#3251611 (Nuria) Open>Resolved [14:09:38] Analytics-Kanban, User-Elukey: Review the recent Varnishkafka patches - https://phabricator.wikimedia.org/T158854#3251615 (Nuria) Open>Resolved [14:09:58] Analytics-Kanban: Productionize Edit History Reconstruction and Extraction - https://phabricator.wikimedia.org/T152035#3251619 (Nuria)
[14:10:00] Analytics-Kanban: Synchronise changes for productionisation of mediawiki history jobs - https://phabricator.wikimedia.org/T160154#3251618 (Nuria) Open>Resolved [14:40:24] fdans: running puppet on eventlogging03 [14:40:49] Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/eventlogging/filters.py [14:41:07] I should have run the puppet compiler first :P [14:43:36] weird pcc looks good [14:45:13] now it works [14:45:14] mmmmm [14:45:23] there is something weird about ordering then [14:45:54] fdans: in any case, you can check in deployment-prep now [14:46:26] sorry elukey, was afk for a sec [14:50:53] so elukey I don't understand, did this fail? [14:54:04] first run it failed, then it succeeded [14:55:49] fdans: I am currently thinking if it is good to have a file that is stored under a class module in the role [14:56:08] but you can test the change now, I need to think about it a bit, tomorrow I'll let you know [14:56:13] Analytics-Cluster, Analytics-Kanban, Operations, ops-eqiad, User-Elukey: Analytics hosts showed high temperature alarms - https://phabricator.wikimedia.org/T132256#3251818 (Cmjohnson) [15:01:16] joal: 1040 is broken too, needs to be repaired as 1030 :( [15:01:18] joal: looking at redirects for wikidata [15:01:22] joal: i think [15:01:57] elukey: :( [15:02:12] elukey: broken memory ? [15:02:27] maintenance issue, T164942 [15:02:27] T164942: Analytics1040 system board repair needed - https://phabricator.wikimedia.org/T164942 [15:04:29] elukey: tested, all seems to behave correctly! [15:04:31] arf [15:04:48] bot requests are not being passed to mysql [15:05:00] fdans: niceeezzz [15:05:19] elukey: you mean filters.py? [15:05:44] yeah, I am not sure about source => 'puppet:///modules/eventlogging/filters.py'..
maybe we'd need to put it in the role's files [15:06:00] because technically we are referencing a module's file from a role [15:06:19] Analytics-Tech-community-metrics: "Wiki Editions" should be "Wiki edits" - https://phabricator.wikimedia.org/T164935#3251886 (Aklapper) p:Triage>Lowest It's in `grimoirelab/panels/dashboards5/mediawiki.json` and `grimoirelab/panels/dashboards5/overview.json` [15:08:01] Analytics-Tech-community-metrics: Usable links for specific users or repositories - https://phabricator.wikimedia.org/T164934#3251893 (Aklapper) Open>Invalid No. [[ https://www.mediawiki.org/wiki/Community_metrics#wikimedia.biterg.io | "You can share URLs of dashboards with applied filters by selecti... [15:09:22] fdans: going to change the code review and re-upload [15:09:37] elukey: coooool! [15:09:43] mi piace [15:13:45] elukey, can you please restart pivot :) (i have deleted a test data set) [15:14:52] mforns: done! [15:14:59] elukey, thanks! [15:17:08] nuria_: I have confirmation that the bulk of the diff for wikidata comes from per-domain on m.wikidata.org having no Last-access cookie set while still accepting them [15:17:43] joal: ya, i just looked at redirects on wikidata [15:17:45] The bizarre thing nuria_ is that there is NO last-access-global null for mobile [15:18:26] joal: indeed bizarre [15:18:40] joal: that redirect must be handled somewhere different [15:20:10] is the active user count for Wikipedia easily available somewhere? [15:20:13] fdans: https://gerrit.wikimedia.org/r/#/c/352582/7 [15:20:25] as in, users who have been logged in recently [15:20:45] fdans: so basically I put the filters.py under the role dir, to avoid cross reference modules/roles [15:20:49] do we even have data for it now that user sessions are super long? [15:21:59] elukey: oh I see, that makes sense [15:22:14] tgr: I'm sure we don't have that in hive - any chance in eventlogging mforns ? [15:22:40] tgr: for wikipedia at large?
or an specific wikipedia en.wikipedia .. etc? [15:23:04] tgr: ah , you mean users login, actual users? or devices? [15:23:05] nuria_: preferably the first, but I would take anything [15:23:16] tgr: we do not store any data for users [15:23:25] tgr: we just count devices [15:23:48] whichever, I just need the magnitude [15:23:53] logged-in only, though [15:24:39] number of pageviews with a session cookie would work as well [15:25:40] tgr: mmmm, i think we discard all cookies when storing data ... looking [15:26:22] nuria_: we are planning a new service for logged-in users (reading lists), we have data from apps to approximate what percentage of users use it, I'm trying to project an absolute usage number from that [15:27:03] tgr: it might be that you need to instrument to get that data. I can see: https://meta.wikimedia.org/wiki/Schema:LoginUserAgent [15:27:36] thx [15:27:47] tgr: if that is still on teh works that might give you an approximation, let's see [15:28:20] fdans: refreshed the change in deployment-prep and ran puppet on deployment-eventlogging03, all good [15:28:30] these days a login lasts for 365 days though so that's hard to get a reliable estimate from [15:28:45] otherwise I could just use user_touched in the DB [15:29:05] elukey: deploying requires this change to be merged [15:29:06] https://gerrit.wikimedia.org/r/#/c/352579/ [15:29:12] tgr: nah, that schema is no longer active, do not bother [15:29:21] nuria_ mforns any chance you can take a look at it? [15:29:50] fdans: no, we should not need to merge code to test on beta [15:30:02] fdans: you can pull code directly there via gerrit [15:30:04] I mean for prod [15:30:14] I'm already testing that code on beta :) [15:30:27] fdans: wait is that code review in beta right now? [15:30:38] fdans: ah ok, are you done testing on beta? 
[15:31:03] fdans: as in tested what you changed but also testing teh system in general making sure your change did not break anything [15:31:11] tgr: looking ta something else [15:31:13] *at [15:31:53] elukey: it was [15:31:57] tgr: i can think of a proxy for you that might work [15:32:12] let me give it another lap just to be sure that everything is ok [15:32:21] fdans: from who? Andrew? Jenkins is still saying -1 [15:33:03] yeah CI in eventlogging is broken atm, there's a task open [15:33:32] it says ./tests/test_topic.py:97:1: W391 blank line at end of file [15:33:43] weird [15:33:50] ahhhh this one is eventlogging [15:33:51] okok [15:34:48] fdans: so the puppet change needs to go BEFORE the scap one right? [15:35:07] elukey: there is a scap change? [15:35:33] there are two changes, one for EL to add the "filter" handler [15:35:44] and one in puppet, which is the one we discussed today [15:36:24] sure but you deploy el with scap :) [15:36:33] I said "scap change" to mention it [15:36:35] sorry [15:36:40] which one needs to go out first? [15:36:58] the EL one [15:37:10] super [15:37:19] all right so I'd say to do it tomorrow if the patch gets reviewed [15:37:21] ok? [15:37:42] but I'm retesting all the stuff now to make sure I haven't missed anything [15:37:51] I get the deplooyment crazies [15:38:51] tgr: still looking, give me a sec [15:39:18] so I'm going to try and deploy the change in beta from the tin, in a non-hacky way [15:39:25] and see if it works [15:40:35] tgr: would it work as an estimation of logins per wiki to look at requests for Special:UserLogin [15:40:39] tgr: ? [15:41:14] tgr: you can use nocookies attribute also to further filter your search [15:41:31] nuria_: logins don't easily translate to users [15:42:36] some users log in every day/week, others once a year [15:43:15] as an upper bound it could come in handy though, thanks for the suggestion [15:43:51] what's the best place to look at unique device counts? the reportcard 2.0? 
[15:44:06] tgr: let me understand what you need [15:44:19] tgr: you need the number of users that right now have a valid session? [15:45:03] elukey: scap is still giving me the ol "reference is not a tree error" when trying to deploy a change that is not merged to master [15:45:43] ideally I'd need the number of unique persons (or devices, not sure which is better) who have been logged in in the last day or month [15:46:23] if we have the number of valid sessions at a single point in time, that could be useful too [15:49:07] fdans: how did andrew solve the issue? [15:49:20] DA OPS HAMMER [15:49:29] basically, we are going to create an API, I need to know how many requests that API can be expected to get. The Android app has something near-identical to the API (but it uses local storage, not Wikipedia), we know how much the average user uses it. So I was thinking of doing (request frequency for average Android user) * (all logged-in users) / (all Android users) as a (poor) estimate [15:49:48] and I followed by just modifying stuff in the EL host repo [15:53:35] tgr: ok, i cannot find anything that would work for what you need exactly but you can 1) gather that data via eventlogging 2) estimate it with requests of loginpage [15:53:56] as every user that has logged in has requested that page and it is thus an upper bound [15:54:14] tgr: It appears on x-analytics : "ns=-1;special=Userlogin;WMF-Last-Access=09-May-2017;WMF-Last-Access-Global=09-May-2017;https=1 " [15:54:27] tgr: as "special=Userlogin" [15:55:06] tgr: let me know what you think [15:55:28] (sorry, elukey ^) [15:57:20] Analytics-Dashiki, Analytics-Kanban, MW-1.29-release (WMF-deploy-2017-04-04_(1.29.0-wmf.19)), Patch-For-Review: Move Dashiki config from CommonSettings to extension - https://phabricator.wikimedia.org/T161038#3252131 (Nuria) Open>Resolved [15:57:29] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Change default timeline for browser reports to be recent (not 2015) -
https://phabricator.wikimedia.org/T160796#3252132 (Nuria) Open>Resolved [15:57:36] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Refactor aqs api and usage for simplicity - https://phabricator.wikimedia.org/T161933#3252133 (Nuria) Open>Resolved [15:57:44] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: annotations should show on tab layout - https://phabricator.wikimedia.org/T162482#3252134 (Nuria) Open>Resolved [15:57:52] Analytics-Kanban: Pageview hourly data in Pivot is not showing up correctly - https://phabricator.wikimedia.org/T164586#3252135 (Nuria) Open>Resolved [16:00:30] fdans: mmm we keep using it, maybe the solution is not right.. what is the error that you are seeing? [16:00:35] nuria_: thanks! to gather the data, I would need to add a logged-in flag to the Last-Access thing, that (or rather the ensuing debates about privacy) seems too much effort. The login count will be useful, but I think counting users with a recent user_touched setting is an easier way to get more or less the same data [16:01:08] https://www.irccloud.com/pastebin/cvVuU682/ [16:01:13] elukey: ^ [16:01:40] what's the best way to get device counts (not logged-in, just the generic one)? reportcard 2.0? It would be useful as an upper bound [16:02:29] tgr: you can also instrument the login page on eventlogging for a while and after remove the instrumentation [16:03:38] tgr: device counts per domain (mobile and desktop are counted separately) are here: https://analytics.wikimedia.org/dashboards/vital-signs/#projects=eswiki/metrics=UniqueDevices [16:03:49] tgr: split by domain [16:08:02] fdans: still trying to deploy? It says that the lock is held [16:08:15] cool, thanks! [16:08:48] fdans: anyhow I tried to clean up in deployment-tin (it was a mess) properly cherry picking in /srv/etc..
:D [16:08:59] then cleaned up on eventlogging03 and running puppet [16:09:09] still can't use scap deploy -u beta though [16:09:18] need to go now but will double check later ok? [16:09:30] * elukey afk! [16:09:35] elukey: sure! [16:33:48] milimetric, yt? [16:43:49] mforns: hey, was just in my 1/1 [16:44:09] I'm measurably sadder today for missing standup [17:01:31] mforns: I have no idea why but that meeting wasn't on my calendar even though I said yes to the invite and the email was in my inbox [17:01:41] sorry about that, looked like you were fine though [17:01:50] is that why you pinged me or something else? [17:02:04] hey milimetric, sorry for the emergency ping, I was a bit intimidated :P [17:02:10] yea yea [17:02:31] milimetric, do you want to batcave for 5 mins to discuss the meeting? [17:02:36] oh, I didn't take it as an emergency ping, you gotta tag your pings like milimetric YOU THERE?!!! [17:02:44] yeah, for sure [17:02:44] omw [17:02:47] xD [17:27:18] Analytics, Fundraising-Backlog, MediaWiki-extensions-CentralNotice: Make banner impression counts available somewhere public - https://phabricator.wikimedia.org/T115042#1713344 (mforns) Hi folks! Following up on the meeting that we just had ([[ https://docs.google.com/document/d/1R3G04PCe3xZAR2azWPz... [18:52:53] Analytics, Analytics-EventLogging, Collaboration-Team-Triage, Edit-Review-Improvements: [betalabs] eventlogging does not record raw count for RecentChangesTopLinks e - https://phabricator.wikimedia.org/T164976#3252928 (Etonkovidova) [18:53:06] Analytics, Analytics-EventLogging, Collaboration-Team-Triage, Edit-Review-Improvements: [betalabs] eventlogging does not record raw count for RecentChangesTopLinks - https://phabricator.wikimedia.org/T164976#3252943 (Etonkovidova) [19:36:24] Yay ! Internet is back ! [20:15:21] (CR) Gehel: "Very minor style comments..."
(4 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/327855 (https://phabricator.wikimedia.org/T162054) (owner: EBernhardson) [20:40:08] fdans: sorry for the late ping, now everything works [20:40:24] how did you add your changes to deployment-tin ? [20:41:00] because I just cherry picked with the gerrit commands (anonymous user - top right corner of the code review) [20:41:14] and I don't see the detached branch [20:41:18] and scap works [20:45:40] restarted also eventlogging in eventlogging03 [20:45:58] fdans: going afk again, let's sync tomorrow morning! [21:40:07] bye team ttyl