[00:06:41] Analytics-Data-Quality, Operations, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3939712 (Dzahn) p:Triage>Normal @Tbayer purely from a ticket triaging perspective: since the ticket title is "vet reliability of the...
[00:09:39] Analytics, Operations, ops-eqiad: rack/setup/install notebook100[34] - https://phabricator.wikimedia.org/T183935#3939716 (RobH) a:RobH>elukey
[00:10:40] Analytics, Operations, ops-eqiad: rack/setup/install notebook100[34] - https://phabricator.wikimedia.org/T183935#3867856 (RobH) a:elukey>Gehel These are finishing their initial puppet runs and are ready to be pushed into service role. Escalating to @elukey.
[00:10:43] Analytics, Operations, ops-eqiad: rack/setup/install notebook100[34] - https://phabricator.wikimedia.org/T183935#3939722 (RobH) a:Gehel>elukey
[00:28:45] Analytics-Kanban, Discovery-Analysis, MobileApp, Wikipedia-Android-App-Backlog: Bug behavior of QTree[Long] for quantileBounds - https://phabricator.wikimedia.org/T184768#3939730 (Nuria) Tested that initializing Qtree like QTree(131072,-16,1,_,None,None) fixes issue with precision, now i wish I c...
[00:34:28] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3939747 (Dzahn) ``` .. Fri Feb 2 00:23:13 2018 - INFO: - device disk/1: 99.30% done, 21s remaining (estimated) Fri Feb 2 00:23:34 2018 - INFO: - device disk/1: 100.00%...
[00:39:03] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3939748 (Dzahn) @elukey so yea, now we'd have to restart the instance from ganeti, as the comment above says rebooting from within the instance won't do it. You said above...
[01:12:16] Analytics-EventLogging, Analytics-Kanban: Hive EventLogging tables not updating since January 26 - https://phabricator.wikimedia.org/T186130#3939824 (Tbayer) >>! In T186130#3939177, @Ottomata wrote: > Ok, still not sure why that one job was stuck, but after killing it, the next scheduled run seemed to ha...
[01:35:00] (Abandoned) Yuvipanda: Set URL prefix the stupid way [analytics/quarry/web] - https://gerrit.wikimedia.org/r/304764 (owner: Yuvipanda)
[01:35:20] (Abandoned) Yuvipanda: Use configured redis host for sessions [analytics/quarry/web] - https://gerrit.wikimedia.org/r/304758 (owner: Yuvipanda)
[02:49:56] Analytics-Data-Quality, Operations, Traffic: Vet reliability of the response_size field for data analysis purposes - https://phabricator.wikimedia.org/T185350#3940039 (BBlack) @Dzahn I was planning to follow up a bit on some of the remaining questions above, just haven't gotten there yet :)
[04:06:53] Analytics, Operations, Research, Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#3940150 (Tgr) @Nuria any thoughts about the next step? Should we just enable the fallback and check the data to see if it had any unexpe...
[06:41:58] Analytics-Kanban, EventBus, Pywikibot-core, Patch-For-Review: EventStreams doesnt find any messages anymore - https://phabricator.wikimedia.org/T184713#3893064 (Dalba) The default `requests` version available on Toolforge is `2.2.1`. Is it OK to require `2.9`?
[07:11:18] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3940290 (elukey) >>! In T186020#3939748, @Dzahn wrote: > @elukey so yea, now we'd have to restart the instance from ganeti, as the comment above says rebooting from within...
[07:41:07] morninggg
[07:41:37] so we'd need to reboot the archiva host/vm
[07:41:49] to get the new 100G disk
[07:42:01] it is fine anytime that we are not building right?
[07:54:24] elukey: good for me !
[07:55:41] hello joal ! :)
[07:56:04] Hi elukey :)
[07:56:20] elukey: I hope you're not too jetlagged :)
[07:57:27] joal: I woke up at 5:30 AM, not super bad but not good either :D
[07:57:38] surely not that good indeed :)
[07:57:55] elukey: it's almost lunch time for you then !!! :D
[08:00:11] ahahah
[09:33:21] Analytics-Kanban, EventBus, Pywikibot-core, Patch-For-Review: EventStreams doesnt find any messages anymore - https://phabricator.wikimedia.org/T184713#3940468 (Xqt) @Dalba: It is required to use EventStreams after the last change on server side. I checked every release of requests and sseclient...
[09:56:50] !log unique_devices-per_project_family-monthly-wf-2018-1 after failure
[09:56:51] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:10:41] Analytics-Kanban, EventBus, Pywikibot-core, Toolforge, Patch-For-Review: EventStreams doesnt find any messages anymore - https://phabricator.wikimedia.org/T184713#3940561 (Xqt)
[13:01:45] joal: thanks for the review!
[13:03:25] :)
[13:04:18] do you think that 2 or 4 is ok for the moment?
[13:15:21] elukey: best would be to match tasks with kafka partitions
[13:19:07] joal: ah that makes sense, so for the moment only one mapreduce job would do the trick
[13:19:52] elukey: I wonder how many tasks we use for other camus jobs though - I know that for webrequest the rule I stated is respected, but for others I can't recall :)
[13:20:31] it definitely makes sense to have this mapping, didn't think about it
[13:41:29] hey yall
[13:41:38] I'm gonna try and work today, see if baby lets me :)
[13:41:50] o/
[13:44:32] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3940931 (elukey) Is it a simple gnt-instance reboot meitnerium.wikimedia.org right?
[13:55:25] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3940943 (Dzahn) Yea, or gnt-instance shutdown gnt-instance startup
[13:56:05] joal: ok if I reboot archiva?
[13:56:20] elukey: +1
[14:04:21] rebooting via ganeti console
[14:08:19] ottomata[m]: hey! I see you've uploaded librdkafka1 0.11.3 to jessie-wikimedia/main
[14:08:54] cp1008 was upgraded already, I've restarted all varnishkafka instances there and they seem to work fine
[14:09:31] what's the reason for the upload though?
[14:18:54] ema: iirc it was for an issue on scb nodes, but vk will eventually point to kafka jumbo that is running kafka 1.0 now, so definitely better to have an up to date librdkafka
[14:27:45] joal: while copying over data from one partition to the other something went awol and the underlying host died for some time :(
[14:28:02] elukey: :(
[14:28:17] elukey: I'm not in need for it now, but that's uncool
[14:28:41] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3940974 (elukey) New disk in place, added ext4 and everything looks good.
I mounted /dev/vdb1 to /mnt/archiva and started a cp -a from /var/lib/archiva to that dir, but ga...
[14:28:51] all up and running now
[14:29:08] * joal claps for elukey
[14:29:26] didn't do anything, it restarted by itself :P
[14:29:49] I've only said a lot of imprecations
[14:29:54] not sure if they count
[14:31:05] elukey: They always do - At least to keep us quiet
[14:40:30] Analytics-Kanban, Operations, User-Elukey: Expand meitnerium's root partition to 100G - https://phabricator.wikimedia.org/T186020#3940981 (akosiaris) How nice :(. But it does look like disk IO is a possible reproduction scenario for T181121. I 'll empty ganeti1005 to avoid having any worse problems d...
[14:44:37] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, User-Elukey: Verify duplicate entry warnings logged by the m4 mysql consumer - https://phabricator.wikimedia.org/T185291#3940989 (elukey) Super interesting that deployment-eventlog02 shows the same duplicate behavior in the logs, and als...
[14:44:52] this is really interesting --^
[14:56:00] Gone to catch Lino at the creche - Back for standup
[15:15:27] Analytics-Kanban, Operations, ops-eqiad: BBU alarms flapping for analytics1038 - https://phabricator.wikimedia.org/T185409#3941041 (elukey) >>! In T185409#3921916, @RobH wrote: > This is an older R720xd, and uses an older H710 controller. > > While @Cmjohnson can check for a spare when back onsite,...
[15:26:24] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, User-Elukey: Verify duplicate entry warnings logged by the m4 mysql consumer - https://phabricator.wikimedia.org/T185291#3941047 (elukey) Other thought: Magnus seems to suggest a call to poll(0) every time a async produce is done, so I a...
[15:34:37] * elukey afk for a bit
[15:40:23] Analytics, Operations, Research, Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#3941061 (Nuria) I see no problem with changset, my comment was about pointing out that "Flipping these Edges/Safaris to origin is going...
[15:45:36] \o
[15:47:20] sheesh. Last night i put together a NN based on 'a Neural Click Model for Search'. Pointed it at ~4MB of user session data, and overnight it only managed 47 epochs :S I guess this really needs gpu's :P
[15:47:42] Analytics, Operations, Research, Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#3941078 (Nuria) Safari sessions still appear "shorter": {F12961703} Compare to chrome ones: {F12961706} So Safari is sending us no da...
[15:47:56] and wrong room :P
[16:05:24] ebernhardson: we understood some words there eh?
[16:20:59] o/
[16:21:22] I just deleted the "Large scale data analysis" meeting we had today because we rescheduled that a long time ago
[16:21:29] So I think this was an oversight of mine.
[16:21:33] Let me know if I'm mistaken.
[16:21:45] joal, milimetric, ottomata[m] ^
[16:29:57] nope, that's right halfak, I thought I deleted it too, but maybe there's a bunch of instances roaming around with different periodicity
[16:36:53] hey a-team, do we have stand-up today? just joined
[16:37:07] yeah, I'll be there
[16:38:55] k
[16:38:59] mforns: how did the flight go?
[16:39:09] good! you?
[16:39:25] long but good, no delays :)
[16:39:26] slept a lot
[16:41:45] i see in https://phabricator.wikimedia.org/T148843 we ordered an analytics server with an AMD FirePro GPU. is it possible to request access?
[16:42:27] (i also see that it potentially never worked, tickets aren't too clear)
[16:46:14] ebernhardson: yeah the latter :)
[16:47:25] tl;dr is that we'd need to figure out a way to heavily test new settings that potentially involve reboots etc.., so not really suitable at the moment for stat1005 (where the gpu is)
[16:47:48] so no short term plan to make it work but we'll work on it hopefully
[16:48:39] ok, well good to know. thanks!
[16:50:57] Hi ebernhardson - Woaw ... Even with GPU, if it's that slow, you'll never converge !
[16:55:37] Thanks halfak for meeting cleanup :)
[16:56:31] :)
[16:58:06] joal: yea it's surprisingly slow for something not very complex.
[16:58:21] * ebernhardson blames laptop
[16:58:27] ebernhardson: possible !
[17:00:31] ebernhardson: can you take advantage of micro-batches with multi-core?
[17:00:36] elukey: coming to standup?
[17:09:22] !log Webrequest upload 2018-02-02 hours 9 and 11 dataloss warning have been checked - They are false positive
[17:09:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:09:52] Analytics-EventLogging, Analytics-Kanban, Beta-Cluster-Infrastructure, User-Elukey: EventLogging broken in beta - https://phabricator.wikimedia.org/T185952#3941312 (Nuria) a:Nuria
[17:10:47] Analytics-Kanban, Patch-For-Review: Wikiselector Perf issues on Chrome - https://phabricator.wikimedia.org/T185334#3941318 (Nuria) a:Nuria>mforns
[17:13:10] joal: i think it is, at least it's using multiple cores. But only 2.5 (laptop only has 2 real cores)
[17:13:31] ebernhardson: 2 cores is not that much to train a ML model :)
[17:13:53] joal: lol, indeed. But i thought 4MB was toy data sizes :P
[17:14:00] ebernhardson: true !
[17:14:18] ebernhardson: possibly depends on the size of your work units
[17:24:16] (CR) Joal: [V: 2 C: 2] "Good for me :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/406064 (owner: Ladsgroup)
[17:29:48] Analytics, Operations, Patch-For-Review: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3941375 (Cmjohnson)
[17:30:24] Analytics, Operations: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3941379 (RobH)
[17:31:31] elukey: everytime you talk about system-d I have that in mind -- re
[17:31:32] elukey: eventlog1002 is doing its initial puppet run
[17:31:39] should i assign it to you for implementation once that's done?
[17:31:39] elukey: https://en.wikipedia.org/wiki/System_D
[17:32:51] Analytics, Operations: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3941388 (RobH)
[17:34:15] Analytics, Operations: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3922494 (RobH) a:RobH>Ottomata So due to both Faidon and Mortiz's comments, I've gone ahead and installed with stretch. If it needs to be re-imaged to fall back to an older distro, then dhcp...
[17:34:23] Analytics, Operations: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3941396 (RobH)
[17:34:49] Analytics-Kanban, Operations, ops-eqiad: BBU alarms flapping for analytics1038 - https://phabricator.wikimedia.org/T185409#3941409 (Cmjohnson) Can this be done around 1500UTC 6 Feb? I will be swapping out another bbu at the same time.
[17:35:04] Analytics, Operations: setup/install eventlog1002.eqiad.wmnet - https://phabricator.wikimedia.org/T185667#3941410 (elukey) >>! In T185667#3936830, @faidon wrote: > I had a look at both `modules/eventlogging/files/eventloggingctl` and `modules/eventlogging/templates/upstart/*`. They all seemed fairly easy...
[17:36:34] Analytics-Kanban, Operations, ops-eqiad: BBU alarms flapping for analytics1038 - https://phabricator.wikimedia.org/T185409#3941413 (elukey) >>! In T185409#3941409, @Cmjohnson wrote: > Can this be done around 1500UTC 6 Feb? I will be swapping out another bbu at the same time. Fine to me! We have a b...
[17:45:21] anybody knows if I need to do some special configuration to make kafka timestamps work? I try to fetch any timestamps on my local kafka setup and I get always -1
[17:48:33] Analytics: Wikistat Betas: expand topic explorer by default - https://phabricator.wikimedia.org/T186335#3941423 (Nuria)
[17:48:42] Analytics: Wikistat Beta: expand topic explorer by default - https://phabricator.wikimedia.org/T186335#3941432 (Nuria)
[17:48:50] SMalyshev: I have not played with that - can't help :(
[17:49:54] Analytics: Wikistat Beta: expand topic explorer by default - https://phabricator.wikimedia.org/T186335#3941423 (Nuria)
[17:49:56] Analytics-Kanban, Analytics-Wikistats: Wikistats Beta - https://phabricator.wikimedia.org/T186120#3941434 (Nuria)
[17:57:45] SMalyshev: we had that problem before right? don't we have a ticket about it?
[17:58:34] Analytics-Kanban, Patch-For-Review: Remove AppInstallIId from EventLogging purging white-list - https://phabricator.wikimedia.org/T178174#3941448 (Nuria) Ping @keynote2k who will be handling these type of questions going forward
[18:00:29] nuria_: not that I know of... I'm not even sure it's a problem because it seems to be working on production kafka... but my local kafka is acting weird
[18:00:44] maybe I just didn't figure out the right way to do it
[18:00:48] SMalyshev: let me dig i have no memory
[18:00:56] SMalyshev: nah, this is happen before
[18:05:53] there's also another question. I'm looking at the "dt" timestamps in meta in the mediawiki.revision-create stream and they seem not to be monotonous
[18:06:08] is this normal? where these come from?
[18:07:12] even stranger, they jump like this: 2018-01-23T00:39:44Z
[18:07:12] 2018-01-23T01:39:45Z
[18:07:12] 2018-01-23T00:39:45Z
[18:07:24] by exactly an hour. No idea what's up there.
[18:41:11] SMalyshev: back onto this
[18:41:58] SMalyshev: they are produced directly by mediawiki
[18:42:19] aha, so those are mediawiki ones, not the same as kafka ones?
[18:44:18] SMalyshev: so there will be two timestamps I think (one mw filled in when event is produced) one added by kafka (outside schema) when event gets to the system
[18:44:30] SMalyshev: let me triple check that original event includes a timestamp
[18:45:17] ahh I found the problem.. it's my code. turns out these are timestamps with timezones and my code ignored it
[18:45:28] SMalyshev: schema: https://github.com/wikimedia/mediawiki-event-schemas/blob/master/jsonschema/mediawiki/revision/create/2.yaml
[18:45:45] it assumed they are all utc (which I thought everything in mediawiki is) but those are not
[18:46:58] SMalyshev: is this the field you are talking about? https://github.com/wikimedia/mediawiki-event-schemas/blob/master/jsonschema/mediawiki/revision/create/2.yaml#L32
[18:47:10] nuria_: yes that one
[18:47:21] ok I'll fix my code then to do timezones
[18:48:02] SMalyshev: ok, ya, still the kafka time for timebased consumption is internal to kafka I think and thus could be very different
[18:48:22] SMalyshev: "very" meaning not same exact millisec
[18:48:36] yes, that's ok I just need to use the right one
[18:49:01] another question: do we have a plan to extending storage beyond 7 days?
[18:52:56] SMalyshev: that one i do not know the answer to ottomata[m] is the person to ask (cc elukey )
[18:53:57] ok I can make phab task I just wanted to know if it doesn't exist yet.
It's not urgent for now since I'm still working on the code, but once the code is done we'll need it
[20:41:13] Analytics, Operations, Research, Traffic, and 6 others: Referrer policy for browsers which only support the old spec - https://phabricator.wikimedia.org/T180921#3941978 (Tgr) @Nuria the changeset did not change Wikimedia referrer policy, just made MediaWiki able to do so. What I would like is to...
[22:56:49] Quarry: Add a way to navigate from a quarry output URL to the query - https://phabricator.wikimedia.org/T185665#3942330 (zhuyifei1999) Open>stalled
[23:39:11] SMalyshev: for extension of the kafka window?
[23:39:28] SMalyshev: phab sounds good. I actually do not know if the limitation there is space or other
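
Side note on the 18:45 finding (the "dt" values carry timezone offsets and the consumer code ignored them): a minimal Python sketch of normalizing such timestamps to UTC before comparing them. This is not SMalyshev's code. The helper name `to_utc`, the choice to treat naive values as UTC, and the sample `+01:00` offset are assumptions for illustration; the log only shows the values as the consumer rendered them.

```python
from datetime import datetime, timezone


def to_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp and return it as an aware UTC datetime."""
    # datetime.fromisoformat() on older Python 3 versions rejects a trailing
    # "Z", so rewrite it as an explicit +00:00 offset first.
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:
        # Assumption: a timestamp without any offset is already UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


# Hypothetical raw values: the middle one carries a +01:00 offset, which
# looks like a one-hour jump if the offset is dropped.
raw = [
    "2018-01-23T00:39:44Z",
    "2018-01-23T01:39:45+01:00",
    "2018-01-23T00:39:45Z",
]

normalized = [to_utc(ts) for ts in raw]
is_monotonic = all(a <= b for a, b in zip(normalized, normalized[1:]))
```

With the offsets applied, the sample sequence is non-decreasing again; comparing the raw strings or dropping the offset would reproduce the apparent jump of exactly an hour.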