[00:30:42] Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2651084 (Neil_P._Quinn_WMF) >>! In T135762#2506770, @ellery wrote: > In order to do statistical testing, you would need to compare the fraction of users who clicked the button...
[07:04:35] Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2651429 (ellery) @Neil_P._Quinn_WMF I'm saying that for any online AB test you to be able to group the experimental data by user. The proposed framework does not provide a mec...
[07:16:53] Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2651433 (ellery) @Nuria I certainly don't disagree that segmentation must be done at the user level. I'm saying that the test statistics (or metrics as you are calling them)...
[07:40:58] !log restart cassandra on aqs100[456] for T130861 - only aqs1004 is taking live traffic
[07:40:59] T130861: Investigate and implement possible simplification of Cassandra Logstash filtering - https://phabricator.wikimedia.org/T130861
[07:41:01] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[07:54:43] this is a bit weird
[07:55:05] I did nodetool-a drain && systemctl restart cassandra-a
[07:55:15] all went fine and the host came back to life
[07:55:19] but
[07:55:34] nodetool-a status now reports for aqs1004-a a load of 638.08 GB
[07:55:43] meanwhile it was ~2.5TB before
[07:55:55] and I've never seen this behavior before
[07:56:07] aqs1004 served traffic fine during the restart
[07:59:12] Load - updates every 90 seconds
[07:59:12] The amount of file system data under the cassandra data directory after excluding all content in the snapshots subdirectories.
[07:59:28] root@aqs1004:/srv# du -hs *
[07:59:28] 639G cassandra-a
[07:59:29] 657G cassandra-b
[08:00:21] so it makes sense from a disk perspective
[08:00:48] maybe nodetool status was listing inconsistent data before the restart?
[08:02:16] joal: --^ (when you have time)
[08:14:47] elukey: never experienced that before :(
[08:19:27] it makes sense from what I am reading that the load is now GBs and not TB, because it reflects what is actually stored
[08:19:38] but theoretically it should update by itself
[08:20:13] elukey: weird though that it was 2.5T before, no?
[08:20:36] oh yes
[08:21:03] I checked aqs1003 and it is TBs but it is consistent with du -hs /var/lib/cassandra
[08:21:55] elukey: right
[08:21:56] ah and we also need to restart the Hadoop cluster
[08:23:30] joal: https://grafana.wikimedia.org/dashboard/db/aqs-cassandra-system
[08:23:32] :(
[08:23:44] disk load shows a big drop
[08:24:20] elukey: you only restarted aqs1004-a, right?
[08:25:28] yes
[08:25:50] ok, at least this is coherent :)
[08:26:17] but it doesn't make a lot of sense
[08:26:31] agreed
[08:26:37] I mean, aqs1004-a is now the only metric that makes sense
[08:26:45] right
[08:26:47] because the other metrics are weird
[08:26:50] ahahhaah
[08:26:58] new cluster new joy
[08:27:23] at least, you take the time to really understand it :)
[08:27:29] I never did that on the old one
[08:29:11] elukey@neodymium:~$ sudo -i salt aqs100[456]* cmd.run 'du -hs /srv/cassandra*'
[08:29:14] aqs1004.eqiad.wmnet: 639G /srv/cassandra-a 657G /srv/cassandra-b
[08:29:17] aqs1006.eqiad.wmnet: 678G /srv/cassandra-a 623G /srv/cassandra-b
[08:29:20] aqs1005.eqiad.wmnet: 658G /srv/cassandra-a 663G /srv/cassandra-b
[08:29:29] horrible paste but the numbers are there
[08:30:03] so I am almost sure that this is due to inconsistent metrics reported by cassandra
[08:30:10] that is something concerning
[08:30:24] ok, something that makes sense is total space used across cluster
[08:32:14] https://issues.apache.org/jira/browse/CASSANDRA-10430
[08:33:37] weird, looking other ones
[08:33:55] we have 2.2.6 :(
[08:34:57] hm
[08:36:24] but it must be something that affects metrics reported by Cassandra
[08:36:32] elukey: agreed
[08:37:06] elukey: du gives correct results, and those match nodetool cfstats, so I think it's load that is inaccurate
[08:38:22] I am going to restart aqs1004-b too to double check this ok?
[08:39:01] elukey: ok, maybe another machine (5-a or b for instance?)
[08:39:09] elukey: nevermind, doesn't change anything
[08:39:19] super, proceeding
[08:44:17] same thing happened for aqs1004-b
[08:45:02] I assume there is a bug somewhere ... But that's not great
[08:47:33] elukey: I'm booked for BigDataApacheConf :)
[08:48:20] I need to send the trip proposal today, too swamped during the past days :(
[08:48:46] elukey: np, just letting you know :)
[08:49:30] elukey: the thing that makes me happy is the disk read throughput being way lower on new aqs
[08:51:45] it seems really good for the moment, disk load weirdness aside
[08:52:09] yup
[08:52:20] I would be tempted to finish the rolling restart and add aqs100[56] to LVS, but maybe waiting for urandom might be a safer option
[08:52:34] elukey: I think you can go
[08:52:35] I am going to open a phab task for this weirdness
[08:52:57] elukey: I really think error is metric reporting
[08:53:16] oh yes I am 99% convinced too but we invested too much time in loading :D
[08:53:28] paranoia level exceeded I know
[08:53:29] :D
[08:53:29] :D
[09:06:21] joal: check cfstats on aqs1005-a
[09:06:24] Space used (total): 3557234299432
[09:06:30] this is for per article flat
[09:08:14] so maybe disk load takes into account other values rather than mere disk space used? Maybe also disk cache?
[09:12:26] ok opened a task https://phabricator.wikimedia.org/T146130
[09:12:34] will wait for urandom before proceeding :D
[09:30:00] elukey: maybe compaction involves having old data to be deleted?
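Sketch (not from the channel): the mismatch discussed above can be checked by converting each quoted figure to bytes and comparing it with what `du` measured. The helper names `toBytes` and `looksInconsistent`, the 10% tolerance, and the reading of `G`/`GB`/`TB` as binary (1024-based) units are assumptions made for illustration.

```javascript
'use strict';

// Assumption: sizes use binary multiples, as `du -h` does.
const UNITS = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3, TB: 1024 ** 4 };

// Parse strings such as "638.08 GB", "639G", "2.5TB" or a plain byte count.
function toBytes(text) {
  const match = /^\s*([\d.]+)\s*([KMGT]?)B?\s*$/i.exec(String(text));
  if (!match) {
    throw new Error(`cannot parse size: ${text}`);
  }
  const unit = `${match[2].toUpperCase()}B`; // "" becomes "B", "G" becomes "GB"
  return Math.round(parseFloat(match[1]) * UNITS[unit]);
}

// True when the reported value differs from the on-disk value by more than
// the given relative tolerance (hypothetical default: 10%).
function looksInconsistent(reported, onDisk, tolerance = 0.1) {
  const reportedBytes = toBytes(reported);
  const diskBytes = toBytes(onDisk);
  return Math.abs(reportedBytes - diskBytes) / diskBytes > tolerance;
}

// Figures quoted in the log for aqs1004-a and aqs1005-a.
console.log(looksInconsistent('638.08 GB', '639G'));     // load after restart vs du
console.log(looksInconsistent('2.5TB', '639G'));         // load before restart vs du
console.log(looksInconsistent('3557234299432', '658G')); // cfstats bytes vs du
```

With the figures pasted in the channel, the post-restart load agrees with `du`, while the pre-restart load and the cfstats value for aqs1005-a do not.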
[09:33:33] it might be yes, good point
[09:43:18] Analytics, Cassandra: Inconsistent Cassandra disk load shown in metrics and nodetool status - https://phabricator.wikimedia.org/T146130#2651703 (elukey)
[10:16:39] Analytics-Tech-community-metrics: Remove operations/kafka from Git metrics - https://phabricator.wikimedia.org/T146135#2651767 (Qgil)
[10:50:39] elukey: spark UI seems to have issues rendering correctly on yarn.w.o : https://yarn.wikimedia.org/proxy/application_1472219073448_117202/
[10:51:32] buuuuu
[10:51:34] checking
[10:55:03] this is weird, only for spark
[10:55:17] and there seems nothing different from other links from the URL perspective
[10:55:20] mmmmm
[10:55:46] elukey: not from url, but content is different (yarn proxies to spark ui)
[10:56:42] but in apache I can see http 200
[10:56:55] with positive content len
[10:56:58] what the hell
[10:57:01] :(
[10:59:08] so I tried in localhost and the page is returned
[10:59:21] so there must be something weird between httpd and varnish
[11:00:05] I am going to ask to traffic asap :(
[11:42:03] Analytics-Tech-community-metrics: Missing time units for percentile values - https://phabricator.wikimedia.org/T145425#2651877 (Lcanasdiaz) a:Dicortazar @Dicortazar can you have a look at what Andre reports?
[12:14:04] * elukey lunch!
[12:31:38] Analytics-Tech-community-metrics: Remove operations/kafka from Git metrics - https://phabricator.wikimedia.org/T146135#2651944 (Aklapper) Steps: Go to https://wikimedia.biterg.io/ click "Git", search "Dana Powers", see it's about https://gerrit.wikimedia.org/r/operations/debs/python-kafka `operations/debs/p...
[12:32:35] Analytics-Tech-community-metrics: Git repo blacklist config not applied on wikimedia.biterg.io? - https://phabricator.wikimedia.org/T146135#2651947 (Aklapper) p:Triage>Normal
[13:13:00] good morning
[13:17:36] o/
[13:18:49] hey nuria_. re T146064, I think it's best if I pair with joal and learn this myself. It will become handy in the future.
[13:18:49] T146064: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064
[13:19:05] I'll ping J about it.
[13:19:53] Analytics, Research-and-Data-Backlog, Research-collaborations, Research-management: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2652013 (leila) @JAllemandou Can I pair with you when you're working on this? It's useful for me to learn this from you, cuz...
[13:20:22] leila: am happy to help with that too
[13:20:24] Analytics, Research-and-Data, Research-collaborations, Research-management: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2652018 (leila) a:leila>None
[13:20:35] oh, if joal is working on it, i'll let him
[13:20:42] but if you are and just need some help, can do
[13:20:48] yeah, ottomata. thanks though. :)
[13:57:17] Analytics-Kanban: Reportupdater calculations for Pages Created and Edit counts - https://phabricator.wikimedia.org/T141479#2652112 (mforns)
[14:01:45] ottomata: I have a doubt about pivot and service runner.. both of them are trying to bind the port afaics, so this could be a problem when running pivot no?
[14:02:26] possibly, depends on how the pivot http server is created
[14:02:27] looking at code...
[14:02:39] it is created in ../build/server/www.js
[14:02:44] after running gulp
[14:02:50] sigh I didn't check this part before today
[14:04:18] maybe we need that service-runner uses directly server/app.js ?
[14:09:26] hmm, elukey i think if you copy www.js, or even if you modify it to return the server object
[14:09:39] you just need the entrypoint to return an http server object
[14:09:45] www.js creates it
[14:09:53] but doesn't return or export it
[14:10:26] so, you could copy/paste all of www.js into a ./app.js (or something) in your deploy repo
[14:10:34] and wrap it in a function
[14:10:34] like
[14:11:14] createPivotServer(options) {
[14:11:15] ...
[14:11:15] server.listen(config_1.PORT);
[14:11:15] return server;
[14:11:15] }
[14:11:30] options would be passed in from the service runner config.yaml file
[14:11:40] so uhh, we could even modify it so that it took its config from there
[14:11:41] but I thought that service-runner was responsible for that, calling server/app.js as entry point?
[14:11:42] instead of cli
[14:11:46] yes,
[14:11:53] you'd set the entrypoint to whatever
[14:11:56] it is
[14:12:05] but, you need the entrypoint to return a server to service-runner
[14:12:14] hmmm
[14:12:14] ah okok
[14:12:21] i dunno, maybe an express app works?
[14:12:22] not sure.
[14:12:31] if it does then you might be able to point it at app.js
[14:12:34] build/server/app.js
[14:12:35] that is
[14:12:38] because it exports the app
[14:12:44] not sure about that though
[14:12:47] mobrovac: ^?
[14:18:30] elukey: ottomata: mind restating the q?
[14:18:31] a lot of backlog here to read
[14:18:32] :P
[14:20:00] mobrovac: what does the service runner entrypoint need to do?
[14:20:12] start the server and bind it
[14:20:19] does it need to return the server?
[14:20:23] yes
[14:20:36] ja so, the entry point needs to be a function that returns a started http server
[14:20:38] cf the service template's app.js
[14:20:41] not express app, right?
[14:21:01] lemme paste the link for ya
[14:21:46] ottomata: elukey: https://github.com/d00rman/service-template-node/blob/master/app.js#L214-L230
[14:22:07] this is the export require'd by service-runner on service start-up
[14:22:38] elukey: ya so that example there is returning a Promise of an http server, which service runner knows what to do with
[14:22:39] specifically, you are interested in https://github.com/d00rman/service-template-node/blob/master/app.js#L198-L209
[14:22:42] it works with just a plain http server too
[14:23:13] a slightly simpler example i'm doing for kasocki: https://github.com/ottomata/eventstreams/blob/master/app.js
[14:23:30] app.js exports just a function
[14:23:38] that when called with options returns an http server
[14:24:16] ja so elukey, since build/server/www.js does the http server initing, i think you can just copy/paste that code into your own app.js (or whatever file), and wrap it in a function that returns the http server
[14:24:34] you can then also modify the code to work with options assed passed in from service-runner config.yaml file
[14:24:39] as passed*
[14:24:40] hah
[14:25:18] ahahahhha
[14:25:51] * mobrovac likes assed options
[14:27:48] haha
[14:32:03] ok so service runner delegates the creation of the http server basically
[14:34:20] Analytics, Research-and-Data, Research-collaborations, Research-management: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2652199 (JAllemandou) @leila : Sure ! When do you want us to proceed?
[14:35:20] Analytics, Research-and-Data, Research-collaborations, Research-management: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2652202 (leila) Some time later tonight your time?
[14:36:01] leila: heya :) Might be easier here for synchro :)
[14:36:57] leila: I'll have an hour in 1h30, would that match your schedule?
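Sketch (not from the channel) of the entry point shape discussed between 14:20 and 14:32: a module that exports a function taking the options and returning a Promise of a started HTTP server. It uses only Node's built-in `http` module. The names `createPivotServer` and `resolvePort`, the fallback port 9090, the placeholder request handler, and the assumption that the port arrives as `options.config.port` are all illustrative and not taken from pivot or service-runner.

```javascript
'use strict';

const http = require('http');

// Hypothetical helper: read the port from the options handed to the entry
// point, falling back to an assumed default when none is configured.
function resolvePort(options) {
  const conf = (options && options.config) || {};
  return conf.port !== undefined ? conf.port : 9090;
}

// Entry point sketch: create the server, bind it, and resolve with the
// started server object so the caller can manage it.
function createPivotServer(options, handler) {
  // Placeholder handler; a real deployment would pass the application here.
  const onRequest = handler || ((req, res) => {
    res.end('ok\n');
  });
  const server = http.createServer(onRequest);
  return new Promise((resolve, reject) => {
    server.once('error', reject); // e.g. the port is already bound
    server.listen(resolvePort(options), () => resolve(server));
  });
}

module.exports = createPivotServer;
```

In the setup discussed above, the placeholder handler would be swapped for whatever build/server/app.js exports, provided that export turns out to be usable as a plain request handler, which the channel left unconfirmed.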
[14:37:12] Analytics-Kanban, EventBus, Wikimedia-Stream: Kasocki Prototype - https://phabricator.wikimedia.org/T145095#2652203 (Ottomata) Alrighty, I feel pretty good about this prototype. https://github.com/wikimedia/kasocki (Also [[ https://phabricator.wikimedia.org/diffusion/WKSK/ | diffusion ]]) I've gott...
[14:39:14] Analytics-Kanban: Pageview hourly stores records that are not really pageviews and those end up on top endpoint? - https://phabricator.wikimedia.org/T145922#2652204 (Jdlrobson) FYI @JMinor (since apps might be surfacing non page views) - 404.php has been the most read article for quite some time. @greg I'm...
[14:52:39] Analytics, Research-and-Data, Research-collaborations, Research-management: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2652247 (JAllemandou) I'll have an hour between 6pm and 7pm CEST.
[14:56:51] Analytics, Research-and-Data, Research-collaborations, Research-management: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064#2649398 (Nuria) @JAllemandou : Can you guys loop in nathaniel, so we can learn oozie so items like this one can be done with just CRS...
[15:01:31] elukey: standduppp
[15:09:14] mobrovac: did you have a chance to look at gulp and that repo?
[15:09:38] milimetric: nope, will do today
[15:17:22] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2652339 (Milimetric) @Ladsgroup, I'm getting numbers for August and July, unless you want something else.
[15:23:07] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2652361 (Ladsgroup) Nah, that's fine. Thanks.
[15:40:40] Analytics-Kanban, EventBus, Wikimedia-Stream: Public Event Streams - https://phabricator.wikimedia.org/T130651#2141764 (Milimetric) SSE sounds cool but it's not supported in IE or Edge. We can support a REST interface into our data by implementing a kasocki client that is configured by the URI it's...
[15:43:05] leila: https://phabricator.wikimedia.org/T146064 is purely development, I would also loop nathaniel cause it involves gerrit/CRs and testing that would need to be done in order to have things in place, bug fixes would require gerrit credentials and such to deploy code.
[15:43:18] a-team: faidon's talk at scaledot: https://www.youtube.com/watch?v=646mJu5f2cQ
[15:44:04] oh elukey, joal, ops sync?
[15:44:06] or we skip?
[15:44:09] i'm ok with skipping
[15:45:29] ottomata: I think that joal is talking with Marcel and Dan, so we can skip if nothing big comes up
[15:45:41] or better, if we don't have anything major
[15:46:04] i'd have talked about cassandra (but I need to chat with urandom first) and vk (that is pending code review)
[15:46:14] and the cluster restart that is upcoming, but usual stuff
[15:46:37] and this will be a reboot of all nodes since kernels are affected
[15:46:38] sigh
[15:46:46] nuria_: got you.
[15:48:14] k cool, we skip :)
[15:48:18] schana: do you have some time tonight or tomorrow to work with joal on T146064? this is a task related to a research collaboration we're doing, and although I'm happy to learn how to do this from joal, it seems doing it requires some proper se work. ;)
[15:48:19] T146064: Oozie job to extract data for WDQS research - https://phabricator.wikimedia.org/T146064
[15:48:44] * schana reads
[15:54:55] yes leila, I can be available anytime from 1030 to around 1330 Pacific - tomorrow can also work if need be
[15:55:15] thanks, schana. I'll schedule something with joal then.
[15:56:08] elukey: i'm around now; sorry, i didn't reply sooner, i had a meeting
[15:56:10] leila, schana : if ok with you. I'll drive schana, but A-team would rather not own the job since it's research
[15:57:07] urandom: hi! sorry to bug you each time :(
[15:57:14] joal: that's fine given that schana is up for it.
[15:57:20] urandom: just wanted to have your opinion on https://phabricator.wikimedia.org/T146130
[15:57:31] joal: can you schedule something for schana and yourself, 1030-1330 PST today?
[15:57:34] I am overly paranoid but it took too long to load that cluster :D
[15:57:51] leila: were you planning on attending?
[15:58:15] I guess joal is saying no, schana. so ignore me. add me as optional if you want to.
[15:58:32] leila, schana , joal: we can help as much as needed, the idea is to get other teams acquainted with oozie
[15:59:01] leila, schana : that being said oozie is kind of a pain to deal with (java circa 2000, pretty much)
[15:59:11] joal: would european time work better?
[15:59:29] (it would for me)
[15:59:29] leila, schana : so we can get you guys started, you can learn how to test and we can work on CRs as needed.
[15:59:57] schana: european time is better indeed
[16:00:11] leila, schana : in any oozie job there is going to be revisions of the data and likely bugfixes, the idea is that after initial training you are well positioned to do those.
[16:00:32] schana: tomorrow morning, 11CEST?
[16:00:32] makes sense, nuria_. :)
[16:00:39] joal: perfect
[16:00:49] milimetric: elukey, i'm starting to wonder if we can/should use pivot as src/ submodule at all
[16:01:21] i think we need a package.json to specify service-runner as a dependency, otherwise, how will it be installed?
[16:01:42] leila, schana: ok, thank you, you guys let me know if you think the setup is working
[16:01:58] elukey: (looking btw)
[16:02:01] ottomata: we're operating on a fork, we could add it to our version of pivot
[16:02:06] will do nuria_
[16:02:16] nuria_: yup
[16:02:24] hm
[16:02:28] yeah but hmm
[16:02:29] ottomata: I wanted to use service::node
[16:02:31] do we really want to do that?
[16:02:49] I'm not sure what the alternatives are
[16:03:07] schana, leila, nuria_ : invites sent
[16:03:11] I guess we could make another repo that installs pivot as a dependency
[16:03:18] so package.json would have pivot, service-runner
[16:03:23] thanks joal
[16:03:24] elukey: has 1005 and 1006 since been restarted?
[16:03:30] (gotta go make dinner / have lunch, bbl)
[16:03:37] urandom: nope, I stopped
[16:03:58] milimetric: why another repo?
[16:04:01] ottomata: what would be the alternative? use pivot as node module?
[16:04:07] maybe pivot is in npm, and we can treat it like any node dependency
[16:04:14] and just freeze with npm install . && git commit
[16:04:31] i guess hm,
[16:04:32] it is not available now in npm IIRC
[16:04:36] package.json can specify a git uri
[16:04:43] and we have a git repo for it
[16:04:46] so we can do it like that
[16:04:58] (like it is in npm but I think the status is blocked)
[16:05:02] yeah hmmm
[16:05:10] ok i have an idea... lemme try this in pivot/deploy repo'
[16:06:12] elukey: so... wtf?
[16:06:20] ahahhah
[16:06:30] same thing that I said this morning
[16:06:32] 1005 and 1006 report the same disk usage (df -h)
[16:06:37] same as 1004
[16:06:41] yeah but disk load is weird
[16:06:54] but that sstable load is... well, obviously wrong
[16:10:31] I tried to go back 30 days in grafana and the values are not right, so it might be something with the dashboard itself.. but nodetool and cfstats show terabytes
[16:10:42] and the funny thing is that we don't have that amount of data in the partitions
[16:12:24] no, you don't :)
[16:12:46] elukey: this is a bug, i'm trying to see if it's a known one
[16:13:01] elukey: have you done any repairs?
[16:14:00] elukey: guessing it might be LCS related, since we are not seeing it in restbase
[16:14:28] nope didn't issue any repair
[16:14:43] we have heavily loaded the cluster up to some days ago
[16:15:14] so there might be some bug that confuses the load calculations?
[16:15:26] yeah, i assume so
[16:15:31] I tried to check in apache jira but didn't find anything substantial
[16:15:40] what is restbase using as compaction?
[16:15:50] s/as/for/
[16:15:53] elukey: same here, but it might be a tricky thing to search for
[16:15:55] DTCS
[16:17:04] elukey: so, i can't imagine this had anything to do with the logging change, i guess it's just the restart
[16:17:45] oh yes yes I just wanted to know if it was safe to proceed with the restart
[16:17:58] and, i doubt it's indicative of a serious issue; it probably won't hurt anything insofar as serving results
[16:18:14] yep 1004 didn't have any trouble so far
[16:18:21] I monitored requests with httpry
[16:18:25] elukey: yeah, it should be
[16:18:52] the only value in *not* restarting might be in trying to figure out the bug
[16:19:02] but i dunno that it would help
[16:19:35] and you kinda wanna restart after that config change
[16:19:43] Just To Be Sure :)
[16:19:52] :D
[16:21:24] all right so I am going to complete the restarts
[16:21:29] kk
[16:21:33] and then keep the disk load monitored
[16:21:42] because maybe the bug will come out again
[16:22:09] !log restarting cassandra on aqs1005
[16:22:10] yeah, it would be helpful to monitor, and see if it's something that just accumulates, or if it's tied to something (like the bulk import)
[16:22:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[16:22:54] urandom: yeah, what I wanted to discuss with you was if there might have been data loss or obvious "OH NOES PLEASE STOP" issues
[16:23:13] but it looks like we are facing a simple calculation bug
[16:23:51] elukey: milimetric q. how did you get the build/ dir?
[16:23:54] i'm trying to follow instructions
[16:23:55] but i get
[16:23:58] [16:23:34] No gulpfile found
[16:24:21] elukey: it's definitely weird!
[16:24:34] i can see why you would be concerned :)
[16:24:35] ottomata: I followed the instructions in the src dir
[16:24:57] i'm doing the same
[16:29:11] hmmm ok, somehow npm install via git uri does not exactly do git clone
[16:29:19] my node_modules/imply-pivot does not have gulpfile
[16:29:20] hm
[16:43:14] milimetric: do you know anything about how npm works with git urls?
[16:43:29] i would have guessed that it just cloned the repo into node_modules/
[16:51:20] Analytics, Cassandra: Inconsistent Cassandra disk load shown in metrics and nodetool status - https://phabricator.wikimedia.org/T146130#2652672 (elukey) All cassandra instances restarted, the behavior outlined in the task's description recurred. As agreed with @Eevans we are going to keep it monitored to...
[16:52:28] a-team: I'll join half an hour later the meeting since I am in Travel training 101
[17:03:25] !log aqs100[56] added to LVS and serving live traffic
[17:03:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log, Master
[17:03:34] we are at full speed now
[17:03:36] \o/
[17:06:34] elukey: \o/
[17:06:43] milimetric: i can't get pivot to build
[17:06:59] mobrovac: npm install && gulp?
[17:07:05] milimetric: i cloned your repo, and then npm install && npm install tsc && ./compile
[17:07:26] uh... I donno what ./compile is, I never did that
[17:07:33] haha
[17:07:34] ok
[17:07:38] trying your way
[17:07:41] but I have gulp, so "sudo npm i -g gulp"
[17:07:44] (if you don't)
[17:08:25] it's installed in node_modules, so i use ./node_modules/.bin/gulp rather than global installs
[17:08:31] * mobrovac doesn't like global npm modules
[17:09:48] milimetric: ok, so basically you'd need this extra gulp step after npm install for the build section, right?
[17:10:04] yes mobrovac
[17:10:18] (sorry, I'm in our staff meeting now)
[17:10:22] np
[17:10:44] milimetric: tl;dr that is a feature that ought to be relatively easy to add
[17:10:50] to service-runner
[17:11:05] elukey: ^^^
[17:12:57] elukey: also, it should be easy to modify src/server/www.ts to make it compatible with service-runner
[17:13:24] it basically just needs to return the promisified server object
[17:14:26] * elukey takes notes
[17:17:16] Analytics-Kanban: Pageview hourly stores records that are not really pageviews and those end up on top endpoint? - https://phabricator.wikimedia.org/T145922#2652782 (greg) >>! In T145922#2652204, @Jdlrobson wrote: > @greg I'm a little concerned that 404s are being hit so frequently - worth investigating in a...
[17:20:15] thx mobrovac
[17:21:01] grazie Marko :)
[17:21:36] np guys
[17:31:27] milimetric: actually, allowing you to specify an extra step in the build process won't help you here because the src submodule will still stay untouched
[17:31:59] milimetric: i think the easiest thing for you to do here is to remove build/ from .gitignore and .npmignore and commit the build before building the deploy repo
[17:32:00] bye elukey!
[17:32:05] byeeee o/
[17:32:19] nuria_: I'll remove aqs100[123] from service tomorrow
[17:32:24] k, mobrovac that's what I was thinking the hacky way was. I can think of ways to do it, but they're much too complicated
[17:32:25] thx
[17:32:26] OMG!!!
[17:32:30] it is happening!!!!
[17:32:31] :)
[17:32:35] yes :)
[17:32:59] nuria_: let's chat tomorrow about raising the throttling limit ok?
[17:33:06] elukey: yes
[17:33:13] or if you have a safe value that you want to apply I can try tomorrow
[17:34:42] keeping it as-is for at least 24h after putting the new cluster in prod would be a safe thing to do elukey
[17:35:05] mobrovac: yep in fact I wanted to do it tomorrow after I remove aqs100[123] :)
[17:35:15] :)
[17:35:37] I am looking forward to see aqs100[123] decommed but I am also super paranoid
[17:36:12] * elukey goes afk!
[18:19:02] hmm mobrovac milimetric, can't say i like it much :/
[18:19:30] certainly ugly
[18:20:16] we could make the build/ dir stick into the deploy repo, but it's questionable if that would work
[18:21:15] mobrovac: how does the global npm install -g pivot work?
[18:21:25] as documented in the pivot readme
[18:21:27] for non dev install?
[18:21:39] does it run gulp while installing to create build/...?
[18:22:11] ### Install
[18:22:11] Next simply run:
[18:22:11] ```
[18:22:11] npm i -g imply-pivot
[18:22:11] ```
[18:22:11] **That's it.** You are ready to Pivot.
[18:22:50] We could do that but then we have no control, we have to upstream any changes we want.
[18:22:59] i'm not suggesting we do that
[18:23:03] but i don't understand how it works
[18:23:14] if running gulp is a requirement to install
[18:24:27] oh yeah, probably that's what it does, want me to try?
[18:24:54] (nom install does compile after fetching the sources in plenty of other modules)
[18:25:00] *npm
[18:26:59] i tried to gulp-build it for the deploy repo, but that doesn't work because the dirs are different
[18:28:26] so
[18:29:03] i've been trying to make a repo that has service-runner and pivot as node deps, but the pivot that gets installed via npm install . does not have build/ or gulpfile.js (because of .npmignore)
[18:29:15] i don't know how or why npm -g install would be different
[18:29:18] than npm install .
[18:29:34] how does npm -g install imply-pivot know to run gulp?
[18:29:44] and npm install imply-pivot not?
[18:31:52] are you sure that one runs gulp but the other doesn't?
[18:36:50] mobrovac: no
[18:36:52] i'm not
[18:37:17] but, according to readme you can do npm install -g and then pivot will work
[18:37:23] and they don't say it needs extra steps
[18:37:28] hard to test though, since it's not on npm anymore
[18:54:36] Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2653227 (Nuria) >@ellery, I talked about this with @mpopov Friday, and he told me that Discovery uses unique tokens as standard practice in their experiments. (They set an >ex...
[18:58:32] Analytics, Operations, Performance-Team, Traffic: A/B Testing solid framework - https://phabricator.wikimedia.org/T135762#2653269 (Nuria) @Neil_P._Quinn_WMF , @ellery Please have in mind that in any of discovery's test there is no knowledge as to whether the user is part of another test (ex: hov...
[19:52:15] this pivot code is nasty
[20:14:20] Analytics, Easy: [REQUEST] Extract search queries from HTTP_REFERER field for a Wikibook - https://phabricator.wikimedia.org/T144714#2653765 (debt) I'm not exactly sure what the Discovery Analysis team can do right now for this ticket...removing it from our board for now. Please re-add us if there is any...
[20:49:02] Analytics-Kanban, EventBus, Wikimedia-Stream: Public Event Streams - https://phabricator.wikimedia.org/T130651#2653887 (Nuria) @ottomata: I cannot see any disadvange of using socket.io other than it has some features that we might no use. It is a well known library and boiler plate code to use kasock...
[21:15:38] elukey: nice latency improvement from switching to ssds!
[21:35:52] wikimedia/mediawiki-extensions-EventLogging#601 (wmf/1.28.0-wmf.20 - e622988 : thcipriani): The build has errored.
[21:35:52] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/commit/e622988fc537
[21:35:52] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/161463449
[22:00:16] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2654116 (Milimetric) Queries done. The data's in the milimetric.wikiproject_medicine_page_counts Hive table. I collected the steps I took in this gist: https://gist.github.com/milimetric/e77e22a736cef4c973a2666...
[23:07:42] Analytics-Kanban: Make top pages for WP:MED articles - https://phabricator.wikimedia.org/T139324#2654503 (Doc_James) Looking good. Does it generate pageviews for entire projects in a given month yet? About 33K pages for WPMED. James