[03:26:39] (Abandoned) MaxSem: Record sum of all wikis for geo tag counts [analytics/discovery-stats] - https://gerrit.wikimedia.org/r/324822 (owner: MaxSem)
[09:02:23] https://phabricator.wikimedia.org/T155654 - rack and set up aqs100[7-9]
[09:02:26] \o/
[09:18:05] buongiorno ateam!
[09:24:10] Ola !
[09:25:31] Yay, new machines for aqs :)
[09:25:59] buenos dias fdans :P
[09:27:04] 🚢🇮🇹
[09:53:57] joal: I asked a couple of questions to Eric about the Cassandra bootstrap and the picture that I described yesterday seems mostly right
[09:54:13] Great :) also makes sense to me :)
[09:55:01] the only part that I was missing is the streaming strategy during bootstrap, namely the node trying to find the smallest subset of replicas for a certain key range that satisfies some criteria like closest to the instance etc..
[09:55:34] and since num_replicas == num_racks for restbase and us, each rack has at least one copy of each key range
[09:56:01] so the new instance/node bootstraps from the instances of the same rack if possible
[09:56:13] I think to avoid consistency violations?
[09:56:38] theoretically we'll have three new racks
[09:56:42] to minimize network load I suspect
[09:56:48] yeah that one too
[10:02:07] So, with three new racks, no 'in same rack' issue, therefore no issue :)
[10:02:10] right ?
[10:07:11] I have no idea where data will be streamed though :P
[10:07:37] probably from only one of the other racks?
[10:07:52] IIRC this was what we were discussing with Eric..
[10:07:57] now it makes a bit more sense :D
[10:08:20] so my plan is to set up OS, partitions, etc.. without configuring cassandra
[10:08:34] then scap deployment etc..
[10:08:39] k
[10:08:48] and in the meantime, we discuss how to configure the instances
[12:16:29] * elukey lunch!
[12:25:56] Analytics, Operations: Move cloudera packages to a separate archive section - https://phabricator.wikimedia.org/T155726#2952404 (MoritzMuehlenhoff)
[13:09:00] Hi a-team, is there a more clever way to know how many bytes were added/removed by a revision than "select CAST(rev.rev_len AS SIGNED)-CAST(revParent.rev_len AS SIGNED) from (select * from revision where rev_id=13268975) as rev, (select * from revision where rev_id=13265148) as revParent;" run at the replica db?
[13:09:27] (the first rev_id is the rev_id I want to find, the second one is its parent rev.
[13:09:29] )
[13:10:04] Hi Urbanecm
[13:10:41] Urbanecm: If you're looking for historical data and have access to hive, we have a new dataset providing the value you're after
[13:11:06] Urbanecm: When I say historical, I mean up to end-of-2016
[13:11:27] Yes, I'm examining activity of seniors in the past year.
[13:11:43] Just a question. What is hive?
[13:11:56] (as I don't know what it is I don't have access to it probably)
[13:12:05] Urbanecm: hive is a querying system over hadoop (big data platform)
[13:12:13] Urbanecm: probably true :)
[13:12:31] Can you link to some page about it?
[13:12:31] Urbanecm: first, do you have access to stat1002?
[13:12:33] No.
[13:12:54] sure Urbanecm: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive
[13:12:54] I have access to toollabs only.
[13:13:01] thx
[13:13:20] Arf Urbanecm ... I think you won't get access to the dataset I'm talking about then
[13:13:51] Urbanecm: BUT, we are heavily working on providing it to labs, so hopefully in a not too far away future...
[13:15:22] Is there some phab ticket for the task?
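For reference, the revision table's rev_parent_id column lets the replica query above be written as a single self-join, so the parent rev_id does not have to be looked up by hand. A minimal sketch, assuming the Tool Labs "sql" wrapper and using cswiki_p purely as a placeholder database name:

    # Hypothetical example; the wiki name and the sql wrapper are assumptions.
    echo "SELECT CAST(rev.rev_len AS SIGNED) - CAST(parent.rev_len AS SIGNED) AS bytes_changed
          FROM revision rev
          JOIN revision parent ON parent.rev_id = rev.rev_parent_id
          WHERE rev.rev_id = 13268975;" | sql cswiki_p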
[13:16:58] Urbanecm: T152788 is the parent task about the thing
[13:16:58] T152788: Pull data for edit reconstruction from labs. - https://phabricator.wikimedia.org/T152788
[13:17:15] thx
[13:17:23] Analytics: Pull data for edit reconstruction from labs and push it back after reconstruction - https://phabricator.wikimedia.org/T152788#2952605 (JAllemandou)
[13:17:46] In the meanwhile, is it better to use the labs replica and a python script to get the data I want, or should I request some kind of access instead?
[13:18:50] joal, ^
[13:26:00] Urbanecm: if you either work for wmf or have an NDA I suggest requesting access to the cluster
[13:26:52] No, I don't have an NDA and I don't work for the WMF.
[13:27:35] Urbanecm: That leaves you with python scripts then :(
[13:27:45] Urbanecm: sorry for not having a better answer yet
[13:28:31] It's good a-team is working on it! Thanks and have a good day!
[13:28:47] Thanks !
[14:01:22] * milimetric going to a cafe to work, might be offline for a while
[14:04:36] Analytics, Operations: Move cloudera packages to a separate archive section - https://phabricator.wikimedia.org/T155726#2952404 (Ottomata) +1 I like this idea.
[14:04:42] Analytics, Analytics-Cluster, Operations: Move cloudera packages to a separate archive section - https://phabricator.wikimedia.org/T155726#2952704 (Ottomata)
[14:08:34] Amir1: o/
[14:08:56] I saw that you asked for access to hue.. Why do you need it? (curiosity)
[14:08:58] elukey: hey
[14:09:45] It's the UI around hadoop. So it would be easier to use instead of logging in and making requests I guess
[14:09:51] correct me if I'm wrong
[14:11:04] we generally use it to manage/check jobs, meanwhile beeline on stat100[24] would be better suited for running queries etc..
[14:11:30] elukey: Okay, understood
[14:11:41] so I don't need it
[14:11:54] I thought it's something like quarry.wmflabs.org for hadoop
[14:14:02] there is a query editor as far as I know
[14:14:13] (to interface with Hive)
[14:14:43] let me ask the team today during standup (what is the best tool to use for querying)
[14:14:47] and I'll come back to you asap
[14:17:03] Analytics, Analytics-Cluster, Operations: Move cloudera packages to a separate archive section - https://phabricator.wikimedia.org/T155726#2952729 (MoritzMuehlenhoff)
[14:44:21] joal: chasemp here, hw issue in my mac so
[14:44:40] I am off to apple store so will ping later to sync up
[15:23:50] hi all, unless I'm mistaken it seems that the pageviews API is lagging data? any info about the situation would be greatly appreciated
[15:27:56] joal: from hue it seems that cassandra-coord-pageview-per-project-hourly is missing one hour of data yesterday?
[15:28:12] correct elukey, just checked that
[15:28:24] I think nuria made the same mistake as the other ;)
[15:29:10] andrew_____: thanks for the report, we'll try to fix asap :)
[15:29:47] much appreciated!
[15:30:06] any way for a layperson to help out with this process other than to report it here or at analytics@?
[15:30:28] and which is preferred? IRC or email?
[15:31:10] report in here is good, but email might be more visible for more people.. usually IRC ping first, and then we'll send an email if the issue is big and likely not going to be solved soon
[15:31:30] does that sound good to you?
[15:34:21] that definitely works for us!
[15:34:57] elukey: re-running the webrequest jobs (upload, text)
[15:35:00] ok?
[15:38:34] joal: yep!
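To make the hue-vs-beeline point above concrete, a Hive query from one of the stats hosts is usually just a beeline one-liner. A rough sketch (the table and partition values are illustrative, not something run in this conversation):

    # Illustrative only: count text-cache requests per host for one hour.
    beeline -e "
      SELECT uri_host, COUNT(*) AS requests
      FROM wmf.webrequest
      WHERE webrequest_source = 'text'
        AND year = 2017 AND month = 1 AND day = 18 AND hour = 19
      GROUP BY uri_host
      LIMIT 10;"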
[15:39:21] !log Launched 0080149-161121120201437-oozie-oozi-B to recover from missing webrequest-load 2017-01-18 19:00
[15:39:23] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:43:45] as a point of curiosity, what causes these cassandra jobs to fail?
[15:44:42] andrew_____: It's not cassandra, it's further up the chain
[15:45:39] andrew_____: The pageview-api data loading jobs are dependent on the pageview extraction jobs to be successful, themselves dependent on the webrequest load jobs to be successful.
[15:46:10] andrew_____: Yesterday, a new version of our code was deployed, and a mistake was made at job restart time, basically forgetting an hour
[15:47:04] andrew_____: This forgotten hour is now being computed; once done, the dependent jobs will start
[15:47:36] very cool
[15:47:56] andrew_____: The power of oozie (http://oozie.apache.org/)
[15:48:08] the new version of code was for the webrequest load jobs or the pageview extraction jobs?
[15:48:40] andrew_____: It was a non-functional refactor of our java code used in many jobs as UDFs
[15:48:53] andrew_____: please let me know when I become unclear :)
[15:51:08] !log Launched 0080172-161121120201437-oozie-oozi-B to recover from missing webrequest-load 2017-01-18 19:00 with a correct setup this time
[15:51:08] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:55:20] joal, hi! do you know how to commit changes to a Diffusion project?
[15:58:04] I did that once: using Differential (was it?), through Arcanist
[15:58:10] https://www.mediawiki.org/wiki/Phabricator/Differential
[15:58:14] mforns: --^
[15:58:37] joal, :] I looked for it in wikitech...
[15:58:39] mforns: which made me install PHP :(((
[15:58:39] thanks!
[15:58:43] oh...
[16:00:57] ottomata, elukey , joal : standdduppp
[16:01:01] Analytics: Measure Community Backlog. - https://phabricator.wikimedia.org/T155497#2953085 (Milimetric) >>! In T155497#2951019, @Halfak wrote: > It seems that, measuring the backlog will require a really clear and coherent strategy for identifying backlog items. Yes, exactly. Maybe the sequence can be: *...
[16:01:41] comiiiinggg
[16:24:54] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953137 (Nuria) @mobrovac Can you point us to a dashboard where the memory improvement is apparent?
[16:27:26] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2949682 (GWicke) @nuria: https://grafana.wikimedia.org/dashboard/db/restbase?panelId=4&fullscreen
[16:29:58] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953172 (Milimetric) nice, this time window shows the decrease with the upgrade: https://grafana.wikimedia.org/dashboard/db/restbase?panelId=4&fullscreen&from=now-7d&to=now
[16:31:31] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953175 (Nuria) @GWicke : is that a measure that aggregates for all machines? I was thinking memory improvements would be visible in the server dashboard per host. For example this o...
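The recovery logged above follows a standard Oozie pattern: submit a new coordinator run that covers only the missing hour. A hedged sketch of what that looks like on the command line (the property names and config path follow refinery conventions and are assumptions here, not the exact command that was run):

    # Hypothetical re-run of webrequest-load for the missing hour;
    # assumes OOZIE_URL is already set in the environment.
    oozie job -run \
      -config oozie/webrequest/load/coordinator.properties \
      -D start_time=2017-01-18T19:00Z \
      -D stop_time=2017-01-18T20:00Z \
      -D webrequest_source=text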
[16:33:12] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953187 (Nuria) But the memory improvements do not show here for example: https://grafana.wikimedia.org/dashboard/file/server-board.json?var-server=restbase2001&var-network=eth0&from...
[16:35:29] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953191 (Milimetric) Step right now: testing on beta with https://wikitech.wikimedia.org/wiki/Analytics/AQS#Beta (getting an ssh key error while trying to deploy at the moment)
[16:38:41] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#2953193 (Nuria) Deploy to beta Deploy to test wiki Test Deploy to production Enable extension by enabling json config
[16:43:10] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2949682 (Pchelolo) > But the memory improvements do not show here for example: https://grafana.wikimedia.org/dashboard/file/server-board.json?var-server=restbase2001&var-network=eth0...
[16:44:16] Analytics-Kanban, Easy, Google-Code-In-2016, Patch-For-Review: Add monthly request stats per article title to pageview api - https://phabricator.wikimedia.org/T139934#2953199 (Nuria) Remains to be done: * Updating configuration for frontend restbase Is this on yaml? * Deploying to prod.
[16:56:01] elukey: I tried scap deploy on deployment-tin and got "connection to deployment-aqs01.deployment-prep.eqiad.wmflabs failed and future stages will not be attempted for this target"
[16:56:40] I checked my permissions and I'm an admin in the deployment-prep labs project
[16:57:24] is deployment-aqs01.deployment-prep.eqiad.wmflabs up and running?
[16:57:27] checking
[16:57:47] yes, it seems to be working
[17:01:39] ah 'Agent admitted failure to sign using the key.'
[17:02:08] so it might be an issue with the keyholder
[17:04:30] Analytics-Kanban: Run a 1-off sqoop over the new labsdb servers - https://phabricator.wikimedia.org/T155658#2953246 (Nuria) Steps: * Technical information of dbs from chase * Synchronization with moritz to open access to labs db hosts from analytics network, sqooping is happening from hadoop * We are going...
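A hedged sketch of how the 'Agent admitted failure to sign' error above is usually narrowed down on the deploy host: check that keyholder has the deploy key armed, then attempt the same SSH hop scap would make through the keyholder proxy (the socket path and remote user are assumptions based on the typical setup, not commands from this conversation):

    # Hypothetical debugging commands on deployment-tin.
    sudo keyholder status
    SSH_AUTH_SOCK=/run/keyholder/proxy.sock \
      ssh deploy-service@deployment-aqs01.deployment-prep.eqiad.wmflabs true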
[17:09:41] Analytics-Dashiki, Analytics-Kanban: Add Map component to dashiki layout - https://phabricator.wikimedia.org/T153921#2953272 (Nuria) * visualizer to spatially visualize the TimeSeries using the country field to display things on a map * legend of map * add a couple tests
[17:09:48] Analytics-Dashiki, Analytics-Kanban: Add Map component to dashiki layout - https://phabricator.wikimedia.org/T153921#2953273 (Nuria)
[17:11:02] Analytics-EventLogging, Analytics-Kanban: Add user_agent_map field to EventCapsule - https://phabricator.wikimedia.org/T153207#2953277 (Nuria) a: Nuria>fdans
[17:11:58] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953279 (Nuria) * Deploy the new debian package with node6 with depooling/pooling
[17:14:11] Analytics, Operations, netops, Patch-For-Review: Open temporary access from analytics vlan to new-labsdb one - https://phabricator.wikimedia.org/T155487#2953282 (Nuria)
[17:24:07] Analytics, Fundraising-Backlog, Patch-For-Review: Productionize banner impressions druid/pivot dataset - https://phabricator.wikimedia.org/T155141#2953308 (Nuria) * Oozie work to source data from webrequest into a banner impressions table, indexing job in druid loads data into druid * Add step to wor...
[17:24:38] Analytics-Kanban, Fundraising-Backlog, Patch-For-Review: Productionize banner impressions druid/pivot dataset - https://phabricator.wikimedia.org/T155141#2953309 (Nuria)
[17:33:46] I've been thinking all afternoon
[17:33:51] mforns looks like something familiar with the green hood on and the headphones
[17:34:04] xD
[17:34:10] yoda?
[17:34:16] no, but close!
[17:34:20] xDDD
[17:34:30] https://lumiere-a.akamaihd.net/v1/images/databank_ewok_01_169_747db03a.jpeg?region=0%2C49%2C1560%2C780
[17:34:41] hehehehehe
[17:34:56] plenty of earthly colors :]
[17:35:05] hell yeah
[17:51:09] elukey: regarding the issue with the keyholder, is that on my end or ...?
[17:51:27] like, can you run scap deploy on that box with no problem?
[17:51:44] milimetric: trying to solve it now :)
[17:52:03] oh, cool, thank you
[17:56:44] milimetric: ah weird! scap deploy worked for me
[17:56:57] hm, so it's just me
[17:57:00] maybe because ops can deploy everywhere nowadays
[17:58:03] elukey: when I ssh into deployment-tin do I have to do anything special like forward keys or something?
[17:58:23] nono because the keyholder will take care of everything
[17:58:37] I can ssh into both deployment-tin and deployment-aqs01 just fine
[17:58:54] ah wait!
[17:59:09] now I remember.. in labs the keyholder is weird, there was a ticket about it..
[17:59:15] let me find it in phab
[17:59:36] (jumping into a meeting but I'll check IRC if you find something)
[17:59:38] thanks!
[18:02:26] milimetric: https://phabricator.wikimedia.org/T116206#2251441 and the next ones
[18:02:53] https://phabricator.wikimedia.org/T116206#2251658
[18:03:05] now I remember that we had the same issue with Joseph
[18:04:28] but Joseph is not in deploy-service now..
[18:04:33] joal: you there?
[18:04:46] yes
[18:04:51] in talk with ottomata
[18:05:33] ah okok!
[18:05:40] whenever you have time let me know
[18:29:32] milimetric: problem solved!
[18:29:38] you should now be able to deploy
[18:30:03] thcipriani helped, you were not in the deploy-service group on deployment-tin
[18:30:06] thanks!
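Since the fix above was purely a group-membership one, the quick way to confirm it from deployment-tin is to check the group and then re-try the deploy. A small sketch (the deploy-repo path is an assumption about the usual scap3 layout):

    # Sanity check after being added to the group, then a test deploy to beta.
    groups milimetric        # should now include deploy-service
    cd /srv/deployment/analytics/aqs/deploy && scap deploy "test deploy after group fix"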
[18:30:18] if you want to try it out just to make sure
[18:30:23] (whenever you have time)
[18:32:24] elukey: ok with you if I upgrade node on this box?
[18:32:30] any preferred way to upgrade here?
[18:32:46] I sometimes get frustrated and do things that I think you would be horrified by
[18:33:31] ahahhaah
[18:33:57] milimetric: do you want to deploy the new change first, test it, deploy to prod and then start over for the node upgrade?
[18:34:12] sure
[18:34:44] nuria was going to do the feature deploy, I'll catch up with her after this meeting
[18:35:22] all right, for the node upgrade i believe that you'll need sudo on the host
[18:35:41] so we could do it tomorrow if you want, after you guys test the change today
[18:35:45] would it be fine milimetric ?
[18:38:56] elukey: i think that would be best, 1) feature 2) node. i can start that after 11 today
[18:39:02] sorry, in an hour
[18:40:10] Analytics, ChangeProp, Citoid, ContentTranslation, and 12 others: Node 6 upgrade planning - https://phabricator.wikimedia.org/T149331#2953730 (Jdforrester-WMF)
[18:43:13] elukey: here :)
[18:43:25] elukey: I guess I can try to help deploying aqs
[18:46:03] joal: solved!
[18:46:41] nuria: exactly what I was suggesting :) So at this point I am going to log off, will check later on!
[18:50:05] * elukey afk!
[18:53:36] thanks elukey :)
[18:58:31] k nuria, I'll work on the dashiki extension deployment now then, let me know if you run into trouble with the aqs build
[19:00:12] see you tomorrow!
[19:00:30] bye fdans !
[19:04:00] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953835 (Nuria) So @pchelo: there is no graph I can look at (in terms of memory) to assess the effect of the deployment?
[19:05:55] Analytics-Kanban, Services (blocked), User-mobrovac: Upgrade AQS to node 6 - https://phabricator.wikimedia.org/T155642#2953841 (Pchelolo) @Nuria Gabriel shared the graph averaged across all hosts for RESTBase earlier here. We don't have per-host information of memory usage for RESTBase
[19:12:55] Going offline a-team :)
[19:12:58] tomorrow !
[19:13:03] bye joal !
[19:43:06] ottomata, yt?
[19:44:54] I'm about to arc diff my change to the phab task, but the system doesn't ask me which task I want to put it on... I'm afraid of executing the command and having my patch end up somewhere that needs to be cleaned up after... How do you specify the phab task your change belongs to??
[19:45:23] mforns: ya
[19:45:44] which task? hm, i think you just do that by putting the TXXX in the commit message?
[19:45:49] hmmm
[19:45:51] not sure though
[19:46:01] aha
[19:47:23] ottomata, in this diff the description does not contain any references to the phab task... https://phabricator.wikimedia.org/rWKSE7d7ba440381f3ab0d9aa5df40f852bc6c2662f84
[19:49:22] hm, mforns i dunno, hmmm
[19:49:36] ottomata, OK, I'll try without
[19:49:45] https://phabricator.wikimedia.org/D524
[19:49:52] https://phabricator.wikimedia.org/T154328
[19:49:52] yeah
[19:49:55] put it in commit message
[19:49:56] right?
[19:50:04] oh hm
[19:50:11] no, that didn't auto link it in the task like it does with gerrit
[19:50:31] oh it did
[19:50:32] sort of
[19:50:33] https://phabricator.wikimedia.org/p/Ottomata/ mentioned this in https://phabricator.wikimedia.org/rWKSE3167eb280159d28c022e710bc5dbe1397c6f011d.
[19:53:53] ottomata, I tried putting in the task ID, but it wouldn't let me...
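On the Differential question above: as far as I understand Phabricator's keyword handling, a revision gets properly attached to a task when the commit message (which arc uses for the revision summary) contains "Ref Txxx" or "Fixes Txxx"; a bare Txxx mention only produces the "mentioned this in" note seen above. A hedged sketch of the flow, reusing the task number from this conversation with an illustrative commit message:

    # Illustrative arcanist flow; the commit text is made up, the task id is T154328.
    git commit -a -m "Set charset=utf-8 in Content-Type response header" -m "Ref T154328"
    arc diff HEAD~1    # create/update the Differential revision from the last commit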
[19:54:09] I executed it without task ID, and I got: https://phabricator.wikimedia.org/D539
[19:54:27] now looking if there's a way to link it with the task from phab
[19:56:45] Analytics-Kanban, EventBus, Wikimedia-Stream, Patch-For-Review: Set charset=utf-8 in Content-Type response header from sse.js client - https://phabricator.wikimedia.org/T154328#2907516 (mforns)
[19:58:20] Analytics-Kanban, EventBus, Wikimedia-Stream, Patch-For-Review: Set charset=utf-8 in Content-Type response header from sse.js client - https://phabricator.wikimedia.org/T154328#2954059 (mforns) I don't know how to make the revision appear in this view... Anyway, above is the link to the revision...
[20:00:52] ottomata, I'm writing some comments on it still
[20:00:55] mforns: nice! one tiny comment
[20:00:56] oh! :)
[20:00:57] ok
[20:01:01] weren't ready for review eh?
[20:01:17] go ahead!
[20:01:17] i submitted a comment
[20:01:24] I meant comments in the phab revision
[20:01:26] ok
[20:14:34] ottomata, added a couple comments in phab, I guess you already know what I meant from the code, but could you read them? there are a couple important ones :] thanks!
[20:16:44] ja
[20:17:46] ottomata, added a last comment!
[20:19:44] oh mforns ok you found 2 things
[20:19:47] ok first, id=0
[20:19:53] i think it would be good to allow id=0
[20:19:57] OK
[20:19:58] although it won't matter for kafkasse use
[20:20:01] will change
[20:20:19] as for case sensitive headers
[20:20:20] hm
[20:20:20] yeah
[20:20:22] could be confusing
[20:20:30] i guess http server is just lowercasing them?
[20:20:37] looks like it
[20:20:48] we could lowercase them before merging the dicts
[20:20:58] ja probably a good idea
[20:21:06] thus we would accept both camel and lower
[20:21:19] K
[20:22:01] yeah
[20:22:02] k, thanks mforns
[20:22:14] np, will change
[20:31:08] Analytics-Dashiki, Analytics-Kanban, Patch-For-Review: Add extension and category (ala Eventlogging) for DashikiConfigs - https://phabricator.wikimedia.org/T125403#1986718 (greg) Please follow the checklist at https://www.mediawiki.org/wiki/Review_queue, thanks! Notably, security review is needed be...
[20:35:51] mobrovac: hola. I am trying to build aqs (./server.js build --deploy-repo --force --review -c config.test.yaml) and i get an error: "git submodule exited with code 128 destination path src already exists and it is not an empty directory".. any ideas? cc @Pchelolo @milimetric
[20:36:59] nuria: it looks like there is a dir in the deploy repo called src with files in it and it's not a git repo
[20:37:36] in the deploy repo, do "rm -rf src && git checkout -- src && git pull && git submodule update --init"
[20:37:41] that should fix it
[20:38:19] mobrovac: k
[20:41:37] mobrovac: also, ahem that script uses "git review " which i do not have installed nor do i ever use, maybe changing that to a regular git push would be better
[20:42:13] mobrovac: what needs to happen via git review after the build?
[20:42:39] nuria: it uses git review only if you pass it the --review param
[20:42:53] omit it and it won't try to submit the patch
[20:43:12] Analytics-Kanban, User-Elukey: Ongoing: Give me permissions in LDAP - https://phabricator.wikimedia.org/T150790#2796693 (spatton) Howdy, can I be added to the wmf LDAP group? Wikitech username is **Spatton**. I'm a member of the online fundraising team. Thank you.
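Putting mobrovac's advice together, the non-review variant of the AQS deploy build looks roughly like this. A sketch only: the first two commands are the ones quoted above, while the repo layout and the final commit/push step are assumptions about how the build leaves the deploy repo:

    # In the deploy repo: clean up the broken src submodule first.
    rm -rf src && git checkout -- src && git pull && git submodule update --init
    # From the source checkout: rebuild the deploy repo without submitting to gerrit.
    ./server.js build --deploy-repo --force -c config.test.yaml
    # Back in the deploy repo: commit (if the build did not already) and push to master.
    git add -A && git commit -m "Update aqs to <new sha>" && git push origin master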
[20:43:24] nuria: after the build, the changes need to appear in the master branch of the deploy repo, so it can also be a straight push if you have the rights for it
[20:47:57] mobrovac: k, thanks will do
[20:48:05] np
[20:51:12] (PS1) Nuria: Update aqs to a7eb80d [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/333104
[20:59:15] ottomata, modified the patch as combined :]
[20:59:24] bye team, see you tomorrow!
[20:59:28] byee!
[21:31:19] (CR) Nuria: [V: +2 C: +2] Update aqs to a7eb80d [analytics/aqs/deploy] - https://gerrit.wikimedia.org/r/333104 (owner: Nuria)
[22:39:45] nuria: switching channel :)
[22:39:48] so I can see now
[22:40:06] "The table already exists, and it cannot be upgraded to the requested schema (Error: Schema change, but no version increment.).\",\"keyspace\":\"local_group_default_T_pageviews_per_article_flat\" etc..
[22:42:53] elukey: ah, nice, how did you see this error again?
[22:43:08] I removed the logstash config (and disabled puppet)
[22:43:14] everything is in journalctl now
[22:43:45] what is the command to 1) restart node? 2) disable puppet?
[22:44:08] sudo puppet agent --disable
[22:44:11] and then
[22:44:18] sudo systemctl restart aqs
[22:46:02] Analytics: Disable queries for recent data on stats.grok.se - https://phabricator.wikimedia.org/T155785#2954605 (Krinkle)
[22:46:19] Analytics, Tool-Labs-tools-Pageviews: Disable queries for recent data on stats.grok.se - https://phabricator.wikimedia.org/T155785#2954564 (Krinkle)
[22:47:13] nuria: /home/elukey/log.txt for the full output in text
[22:47:53] much better to read
[22:47:56] (on aqs01)
[22:48:02] elukey: makes sense, cause this deploy has a change on the compaction, we probably just need to alter the table by hand
[22:50:04] elukey: or .. not.. maybe something else?
[22:50:39] I tried to restart cassandra to see its health, and now it is failing
[22:50:42] LOL
[22:51:05] Error opening zip file or JAR manifest missing : /srv/deployment/cassandra/jmx_exporter/lib/jmx_prometheus_javaagent-0.8-SNAPSHOT.
[22:51:21] that is the new one that Eric is deploying for the prometheus exporter
[22:52:40] so that needs to be deployed to aqs
[22:52:48] but I don't know how
[22:53:55] elukey: i also wonder if the error is related to the compaction change, is that considered a "schema change"?
[22:54:59] nuria: mmmm it should be a keyspace change, so probably yes.. but it might be a restbase weirdness, we haven't deployed in beta for a while I think
[22:55:31] elukey: it is restbase "calling" it a schema change: https://github.com/wikimedia/restbase-mod-table-cassandra/blob/master/lib/schemaMigration.js#L18
[22:56:23] elukey: but I am not sure it is being triggered by the compaction change
[22:58:04] so the first problem is to solve the aqs cassandra issue, but I believe I'd need to deploy some jar to the host
[22:58:11] very new things
[22:58:15] then restbase
[22:58:34] I'll work on it tomorrow morning :(
[22:58:40] elukey: ya, there are several things.
[22:58:52] elukey: no worries!!!!
[22:58:58] Pchelolo: yt?
[22:59:20] nuria: a moment, we're in the middle of upgrading node in scb
[22:59:25] Pchelolo: k
[23:00:04] so under /srv/deployment/cassandra/jmx_exporter there should be something, but I can't even see the repo on deployment-tin
[23:00:54] so it might be a new puppet change weirdness
[23:03:45] elukey: do not worry, we will look at it tomorrow. i will troubleshoot some with Pchelolo when he is available about schema changes
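For the two failures above, a short triage checklist on deployment-aqs01, hedged: the aqs unit name and the jar path come from the conversation, while the cassandra unit name is an assumption about the single-instance setup there:

    sudo journalctl -u aqs --since today | grep -i schema    # the RESTBase "no version increment" error
    sudo journalctl -u cassandra --since today               # why Cassandra now fails to start
    ls -l /srv/deployment/cassandra/jmx_exporter/lib/        # is the prometheus javaagent jar actually there?
    nodetool status                                          # once Cassandra is back up, check it is UN (Up/Normal)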
[23:03:49] https://wikitech.wikimedia.org/wiki/Cassandra/PrometheusJmxExporter
[23:04:33] grrr
[23:04:40] okok I'll double check tomorrow morning :)
[23:04:52] sorry! I wanted to help but nothing good came out
[23:05:03] hopefully tomorrow aqs01 will be fixed :)
[23:05:07] bye! o/
[23:28:21] Pchelolo: available? (ok to say no)
[23:28:46] nuria: sorry, we're still pretty occupied with scb upgrade
[23:28:51] some bumps on the road
[23:28:55] Pchelolo: k
[23:59:20] nuria: things seem to calm down, what's up?