[03:59:43] (PS1) Joal: Add zero carrier to druid pageviews [analytics/refinery] - https://gerrit.wikimedia.org/r/346235 (https://phabricator.wikimedia.org/T161824)
[07:26:57] joal: o/
[07:27:32] I am going to reimage an1032,an1033 and an1034 if you don't mind
[07:37:22] stopped the nodemanagers on those hosts in the meantime (so the containers will drain)
[08:14:03] !log restarted webrequest-load-wf-text-2017-4-4-6
[08:14:04] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:18:23] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153163 (ops-monitoring-bot) Script wmf_auto_reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['analytics1032.eqiad.wmnet', 'analytics1033....
[08:18:47] started the reimages
[08:50:50] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153194 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1032.eqiad.wmnet', 'analytics1033.eqiad.wmnet', 'analytics1034.eqiad.wmnet'] ``` Of...
[10:18:50] * fdans is afk for his dinner break :)
[10:37:54] new hosts back serving traffic :)
[11:09:00] fdans: how is Japan?
[11:09:02] :)
[11:19:29] * elukey commuting to the office
[11:54:01] reimaging analytics10[36,37,38]
[11:55:10] elukey: japan is super crazy and awesome as usual :D
[11:56:04] although my place is awesome for working at night, super quiet and the desk is more comfortable than the one I have at home
[11:56:18] you are in Tokio now right?
[11:56:54] elukey: nope, Kanazawa!
[11:58:30] ah watching the sea and the Koreas :)
[12:16:51] Analytics-Tech-community-metrics, Developer-Relations (Apr-Jun 2017): Identify Wikimedia's most important/used info panels in korma.wmflabs.org - https://phabricator.wikimedia.org/T132421#3153564 (Aklapper)
[12:19:18] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153569 (ops-monitoring-bot) Script wmf_auto_reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['analytics1036.eqiad.wmnet', 'analytics1037....
[12:45:20] Analytics-Tech-community-metrics, Developer-Relations, Differential: Make MetricsGrimoire/korma support gathering Code Review statistics from Phabricator's Differential - https://phabricator.wikimedia.org/T118753#3153603 (Aklapper)
[13:09:59] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153643 (ops-monitoring-bot) Script wmf_auto_reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['analytics1038.eqiad.wmnet'] ``` The log can...
[13:10:22] mmmm wmf-autoreimage with multiple hosts is not super reliable
[13:10:39] an1038's reimage didn't start at all :/
[13:35:40] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153680 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1038.eqiad.wmnet'] ``` and were **ALL** successful.
[13:45:36] 1036/7 are back in service
[13:45:51] we have officially more Debian nodes than Ubuntu ones in Hadoop :)
[13:46:04] <3 !
[13:46:52] aaand the latest ones have the 4.9 kernel
[13:48:45] niiiiiiice elukey :D
[13:51:00] proceeding further with 1039 and 1051
[13:51:12] I am leaving behind the journal nodes that are a bit delicate
[13:51:40] the plan is to complete the worker nodes, then reimage the journal nodes one at the time and finally the master nodes
[13:52:12] maybe an1002 first, then we flip it active to make sure that everything is good and after a couple of days an1001
[13:52:25] (PS1) Joal: [WIP] Add Spark schema handler to refinery-core [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924)
[13:52:51] ottomata --^ Have fun :)
[14:07:37] letting 1039 and 1051 to drain containers
[14:08:10] joal: exactly 10 hosts remaining among the workers :)
[14:08:17] hehe :)
[14:08:19] (including jorunal nodes)
[14:08:23] *journal
[14:08:32] I should be able to finish this week
[14:08:37] I'm going to start talking to halfak then :)
[14:12:52] halfak: o/
[14:21:34] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153884 (ops-monitoring-bot) Script wmf_auto_reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['analytics1039.eqiad.wmnet', 'analytics1051....
[14:22:26] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153887 (ops-monitoring-bot) Script wmf_auto_reimage was launched by elukey on neodymium.eqiad.wmnet for hosts: ``` ['analytics1039.eqiad.wmnet', 'analytics1051....
[14:25:57] elukey: i'll do some hosts this afternoon! what we got left?
[14:27:43] ottomata: hiiiiiiiiiiiiiiiiiiiii
[14:27:46] hiii
[14:28:27] 1028,1035,1052 (journal nodes) and 105[34567]
[14:30:01] ok, so we can finish up the POWs (Plain Ol' Workers?) today
[14:30:10] and then do the rest more carefully
[14:30:11] ?
[14:30:54] oh, did we say we were going to move journal nodes around?
[14:30:55] off of those ones?
[14:31:00] so to reinstall?
[14:31:01] hmmmm
[14:31:16] since /var/lib/hadoop/journal is on sda?
[14:31:45] hmm, we could probably cp the journal dir onto sdb during the reimage, and then just copy it back after
[14:37:05] ottomata: yep I wanted to ask how to proceed :)
[14:37:08] need some brainstorm
[14:38:17] y
[14:38:18] a
[14:38:36] ah ottomata the new hosts have Linux 4.9
[14:38:38] elukey: i bet its simpler to cp the data to /var/lib/hadoop/b or something just for the reimage
[14:38:45] and then cp it back to journal dir
[14:38:48] oh, cool
[14:39:07] the only metric that changed a bit is Active Memory in https://grafana.wikimedia.org/dashboard/db/analytics-hadoop
[14:39:27] but nothing to worry about, probably a better usage of the page/disk cache?
[14:39:34] the rest looks fine
[14:39:41] ah dmesg now needs sudo
[14:39:51] but nothing else that I noticed
[14:40:02] joal: you are a gentleman and a scholar
[14:40:30] ottomata: ?
[14:40:37] elukey: that's namenode mem?
[14:40:44] joal: for schema differ
[14:40:46] :)
[14:40:46] ottomata: I like the compliments :)
[14:41:01] we haven't reimaged namenode, right?
[14:41:01] ottomata: should do the thing for minimal cost, no ?
[14:41:05] why would namenode heap drop
[14:41:10] yeah, that is so simple and pretty
[14:41:26] joal: is that spark 2 onlye?
[14:41:27] only?
[14:41:44] ottomata: should work fine with what we have (1.6)
[14:41:54] ok great, just say spark-hive 2.1 in pom or something
[14:41:59] joal: am confused about line 22
[14:42:04] if (res.size != (s1.fieldNames ++ s2.fieldNames).toSet.size)
[14:42:13] res is new fields, right?
[14:42:21] union.diff(intersection)
[14:42:22] This is why I added a comment, but it's not clear enough :)
[14:42:30] correct
[14:42:31] i understand the coment, just not the code :)
[14:42:32] so
[14:42:35] huhu
[14:42:42] s1 ++ s2 is union again, no?
[14:42:57] so, i would expect new fields to != full union
[14:43:04] ottomata: .toSet removes dups
[14:43:08] oh
[14:43:09] SET
[14:43:10] like set set
[14:43:12] right
[14:43:17] i think my lazy eye read just Seq
[14:43:42] Yeah, expressive languages makes it important to read carefully :)
[14:43:48] i think i need to write this down, hang on...
[14:44:52] joal
[14:44:54] u.diff(i)
[14:44:58] if u == ab
[14:45:00] sorry
[14:45:02] yeah
[14:45:03] u == abc
[14:45:07] and i == c
[14:45:09] u.diff(i)
[14:45:12] ==? ab
[14:45:13] ?
[14:45:28] correct ottomata
[14:45:30] ok
[14:45:55] diff as in difference here (set accepting dups-wise)
[14:46:09] ok, so yeah, still confuse then
[14:46:11] let's say
[14:46:13] s1 = ab
[14:46:16] s2 = abc
[14:46:24] i == c
[14:46:28] no
[14:46:36] sorry
[14:46:36] ah ya
[14:46:37] i = ab
[14:46:47] s1.diff(s2) = []
[14:47:02] s2.diff(s1) = [c]
[14:47:11] when using sequences
[14:47:14] hmm oh ok
[14:47:55] Given this, the `diff` and `union` value I use could be renamed to prevent misunderstanding
[14:48:01] ottomata: --^
[14:48:04] ok, still confused
[14:48:08] s1 = ab
[14:48:09] s2 = abc
[14:48:25] i = s1.intersect(s2) == ab
[14:48:27] right?
[14:48:34] yes
[14:48:38] u = abc
[14:48:40] right?
[14:48:51] or, u == ababc
[14:48:51] ?
[14:48:57] u = abc
[14:48:59] ok right
[14:49:10] res = u.diff(i) == c
[14:49:11] right?
[14:49:42] s1 ++ s2.set == (ab ++ abc).set == abc
[14:49:46] ?
[14:49:57] s1 ++ s2.set == (ab ++ abc).set == abc NOPE
[14:50:05] ok
[14:50:06] Here we're not talking stes
[14:50:14] (ab ++ abc) = ababc
[14:50:22] That's why I need to remove intersect
[14:50:29] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3153961 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1039.eqiad.wmnet', 'analytics1051.eqiad.wmnet'] ``` Of which those **FAILED**: ```...
[14:50:51] OHHH
[14:50:53] fields vs names
[14:50:55] right.
[14:51:01] ottomata: nono Active Memory for all the workers, last graph in our board.. an100[12] are still ubuntu :)
[14:51:02] res might have multiple fields with the same name
[14:51:46] ottomata: another way would be to use toSet as for names
[14:52:01] hmm, ya elukey but i'd expect nodes to use less mem after restart anyway, but i they stay that way, then great! :)
[14:52:48] joal: ok, but in my example, it'lll still break, no?
[14:52:53] let's say my example has no duplicate name types
[14:53:02] s1 = ab and s2 = abc
[14:53:13] res will be c
[14:53:18] nope
[14:53:20] no?
[14:53:27] with current code, res is abc :)
[14:53:37] because union of ab a
[14:53:38] i = ab
[14:53:55] ab.union(abc) = ababc
[14:54:02] OHHHH
[14:54:20] ottomata: We could use toSet and make it easier to understand :)
[14:54:27] because s1.a != s2.a because we are talking StructField records, not name
[14:54:28] s
[14:54:33] (s1 ++ s2).toSet
[14:54:54] ottomata: CORRECT !
[14:55:14] got it
[14:55:26] ottomata: and actually, more than that: because we're talking sequence, not set
[14:55:41] event with same name and types, we'd have ababc
[14:56:04] ok, joal i think what you have is fine, maybe just for help for future dummies like me, add a a simple example in the comments
[14:56:05] ?
[14:57:41] ottomata: batcave 1 minute before standup?
[15:01:16] ottomata: ping startuppppp
[15:01:19] OO
[15:01:24] ottomata: ping standddupppp
[15:01:34] joal: sorry missed that ping!
[15:12:16] Analytics-Kanban, Analytics-Wikistats: Visual prototype for community feedback for Wikistats 2.0 iteration 1. - https://phabricator.wikimedia.org/T157827#3154007 (Milimetric) We will be implementing the following changes in the design before it goes out for consultation: Wikistats 2.0 design updates...
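The schema-differ confusion above comes down to Scala's `Seq.diff`, `++`, and `intersect` being multiset (sequence) operations: on sequences of `StructField`s, `union.diff(intersection)` returns the whole deduplicated union rather than only the new fields, while `(s1 ++ s2).toSet` dedups as intended. A minimal sketch of the same semantics in Python (field names and types are illustrative, not taken from the actual patch):

```python
from collections import Counter

# Model StructFields as (name, type) tuples; s1 = "ab", s2 = "abc".
s1 = [("a", "int"), ("b", "string")]
s2 = [("a", "int"), ("b", "string"), ("c", "long")]

# Scala's Seq ++ keeps duplicates, like list concatenation: "ababc".
union = s1 + s2
# Seq.intersect keeps one copy of each common element: "ab".
intersection = [f for f in s1 if f in s2]

# Seq.diff is a multiset difference: it removes only ONE occurrence per
# element of `intersection`, so a and b each survive once -> "abc",
# the full union, not just the new field c.
diff = list((Counter(union) - Counter(intersection)).elements())
assert sorted(diff) == s2

# Deduplicating first, as with Scala's (s1 ++ s2).toSet, gives only the
# genuinely new field.
new_fields = set(union) - set(intersection)
assert new_fields == {("c", "long")}
```

This is why, in the chat above, `res` came out as `abc` instead of `c` for `s1 = ab`, `s2 = abc`.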
[15:13:06] Analytics-Kanban, Analytics-Wikistats: Create and monitor Round2 consultation page - https://phabricator.wikimedia.org/T162155#3154009 (Milimetric)
[15:19:18] ottomata: 1039 and 1051 back serving traffic
[15:19:44] (CR) Ottomata: Use hive query instead of parsing non existent sampled TSV files (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346197 (owner: Ottomata)
[15:19:50] Analytics-Kanban: Pagecounts all sites data issues - https://phabricator.wikimedia.org/T162157#3154040 (Nuria)
[15:19:54] elukey: yeehaw
[15:20:09] elukey: if you want, we can try to do a journalnode together after standup
[15:20:09] today
[15:20:21] Analytics-Kanban: Pagecounts all sites data issues - https://phabricator.wikimedia.org/T162157#3154053 (Nuria) https://github.com/wikimedia/analytics-reportcard-data/edit/master/datafiles/rc_page_requests.csv
[15:25:09] ottomata: sure!
[15:46:24] (PS2) Joal: [WIP] Add Spark schema handler to refinery-core [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346291 (https://phabricator.wikimedia.org/T161924)
[15:46:28] ottomata: --^
[15:46:31] here you go :)
[15:48:40] yeehaw!
[15:48:43] thanks joal
[15:50:12] (CR) Nuria: [V: 2 C: 2] Add zero carrier to druid pageviews [analytics/refinery] - https://gerrit.wikimedia.org/r/346235 (https://phabricator.wikimedia.org/T161824) (owner: Joal)
[15:50:39] Analytics-Kanban, Patch-For-Review: Add zero carrier to pageview_hourly data on druid - https://phabricator.wikimedia.org/T161824#3154139 (Nuria) a:JAllemandou
[16:06:40] nuria, milimetric: it seems that the projectcount issue comes from meta.m
[16:06:53] joal: on meeting , can talk later
[16:07:11] In hive, when excluding meta.% from counted domain over year 2014 grouped by month, I get reasonable results
[16:08:16] huh, weird. Close to the old numbers? 22ish billion per month?
[16:09:06] yup milimetric
[16:10:06] well, that's weird and unexpected then. What's hiding in meta? Could the project mapping have mapped all mobile hits to meta as duplicates or something?
[16:11:23] I don't think so milimetric (or maybe, but can't imagine why)
[16:11:42] and when saying meta milimetric, I mean the abbrev: meta.m and meta.mw
[16:12:04] Just thinking 'cause they both have m.
[16:12:27] oh wait, right
[16:13:04] .m was overloaded for something else
[16:17:49] !log Restart webrequest-load-wf-text-2017-4-4-14
[16:17:50] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[16:21:09] thanks joal :)
[16:21:19] np elukey ;)
[16:21:21] we are reimaging 1051
[16:21:26] err 1052 (journal node)
[16:21:31] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3154217 (ops-monitoring-bot) Script wmf_auto_reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1052.eqiad.wmnet'] ``` The log can b...
[16:39:37] a-team: I have updated https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Doc_proposal with page moves proposal -- It's a loooooong list, but I'd love if you could take a few minutes to give an opinion :)
[16:40:08] will do
[16:46:22] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3154270 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1052.eqiad.wmnet'] ``` and were **ALL** successful.
[16:52:02] ottomata: sqoop failed again (non-modified version) on 2 wikis (same as previous)
[16:54:56] same wikis?
[16:55:03] with retry?
[16:55:09] must be something wrong with those wikis
[16:55:22] no retry, not modified versionm
[16:55:43] ottomata: I'll rerun them manually for this time, next month we'll monitor again
[16:56:10] rerun them manually?
[16:56:18] joal: we have succeeded with these wikis before
[16:56:18] ?
[16:56:24] yes ottomata
[16:56:25] hm
[16:56:27] weird
[16:56:32] last run (2017-03) I did the same
[16:57:00] * elukey afk!
[16:57:01] byeee
[17:03:48] joal: and manual run succeeds just fine?
[17:03:57] ottomata: currently trying anew
[17:04:06] ottomata: looks fine
[17:13:40] ottomata: new run worked just fine
[17:13:54] ottomata: let's wait for next month and check if the new version helps
[17:15:23] ok
[17:26:43] !log Restart mediawiki-history-denormalize-wf-2017-04
[17:26:44] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:27:59] ottomata: all good with 1052??
[17:28:06] its still chowning!
[17:28:13] gooooood
[17:28:24] will recheck in a couple of hours!
[17:30:43] nuria: , any other thoughs on https://gerrit.wikimedia.org/r/#/c/346197/ or can I merge?
[17:31:06] (CR) Nuria: [C: 2] Use hive query instead of parsing non existent sampled TSV files [analytics/refinery/source] - https://gerrit.wikimedia.org/r/346197 (owner: Ottomata)
[17:31:25] ottomata: merged, sorry, got caught up on budget land again
[17:31:31] np
[17:31:33] danke
[17:33:06] oh joal still there? i remember another thought I had about the schema differ
[17:42:36] ottomata: shoot
[17:44:06] joal: so partitions
[17:44:15] hm
[17:44:29] currently all partitions must be in the schema ja?
[17:44:30] hmm
[17:44:35] i guess i can make a partition schema
[17:44:42] and use it in the jointSchemas
[17:44:46] or even just always use it as base schema
[17:44:47] hmMmM
[17:44:49] like
[17:44:56] I don't get it ottomata :)
[17:44:57] revision, year, month, day
[17:44:59] year month day
[17:45:02] are not in the json record
[17:45:12] Arf !
[17:45:12] the are in the filesystem
[17:45:20] timestamp is in the record
[17:45:47] ottomata: there is a trick for spark to read partition and integrate them in base schema
[17:45:52] oh?
[17:45:55] ottomata: using basePath
[17:46:02] even if partitoin is not hive style
[17:46:07] 2017/04/03/00
[17:46:08] ?
[17:46:14] ottomata: Ah nope
[17:46:17] yeah
[17:46:20] camus sucks at that
[17:46:26] ottomata: Ahh, sorry forgot that this data is camus written
[17:46:29] maybe we should investigate fixing that part about camus
[17:46:38] i mean, we could import from kafka....but then we'd have to tons of manual stuff
[17:46:45] And ottomata, for schema to be correct, I need partitions to be field defined
[17:46:56] right, but joal, i could do that manually and set them in the job
[17:47:01] this will basically be a refine job
[17:47:02] so
[17:47:08] make base schema be the one that contains the partitions
[17:47:29] and, then as part of refine, we just add the partition fields either from fs dirs or from timestamp
[17:47:39] ottomata: makes sense
[17:47:47] ok, i'll thikn/work on that
[17:47:52] or
[17:47:56] we could fix camus to do hive partitions
[17:48:00] that might not be so hard
[17:48:04] and could be somethign we want anyway
[17:48:08] what is the basePath trick?
[17:48:33] elukey: fyi, an52 back, journalnode looks good
[17:48:39] i had to restart journalnode proc an extra time
[17:48:48] i think puppet brought it up before we copied journal data back in
[17:48:49] I think you can tell spark to read data to a full-path (like a single partition), and tell it to use a specific portion of the path as BasePath to extract partition values
[17:48:54] going to delete the extra copy in data/b
[17:49:28] hm
[17:50:55] ottomata: Will leave for tonight - tomorrow again !
[17:51:20] ok, laters!
[17:51:22] thanks joal!
[17:58:32] joal: back from budget land
[18:06:56] Hey nuria: really quick before I leave: seems that the diff is related to meta.[m|mw]
[18:07:05] I have not investigated further
[18:07:18] joal: ok, will do some plots
[18:07:24] joal: data is on hive right?
[18:07:44] nuria: yes, wmf.projectcount_raw
[18:08:05] only?
[18:08:11] wasn't part of it on another table?
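The partition discussion above boils down to this: Camus writes plain time-bucketed directories like `2017/04/03/00` rather than Hive-style `key=value` partition paths, so Spark's partition discovery (the `basePath` trick) cannot extract the fields, and a refine step has to derive them from the directory names or the record timestamp. A hypothetical Python sketch, with an assumed path layout that is not the actual refinery code:

```python
import re

# A Camus-style path: time buckets, but no key=value pairs for Spark's
# partition discovery to pick up (base path is made up for illustration).
camus_path = "hdfs://analytics/camus/webrequest_text/2017/04/03/00"

def partition_fields(path):
    """Derive year/month/day/hour fields from the trailing time buckets."""
    m = re.search(r"/(\d{4})/(\d{2})/(\d{2})/(\d{2})$", path)
    if m is None:
        raise ValueError("no time-bucketed partition found in %s" % path)
    keys = ("year", "month", "day", "hour")
    return dict(zip(keys, (int(g) for g in m.groups())))

fields = partition_fields(camus_path)

# The same information rendered Hive-style, which partition discovery
# could read directly if camus wrote its output this way:
hive_style = "/".join("%s=%d" % (k, fields[k])
                      for k in ("year", "month", "day", "hour"))

assert fields == {"year": 2017, "month": 4, "day": 3, "hour": 0}
assert hive_style == "year=2017/month=4/day=3/hour=0"
```

Either fix works: teach camus to emit the Hive-style layout, or keep the layout and have the refine job inject the derived fields into each record's schema.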
[18:08:15] cc joal
[18:08:56] nuria: projectcounts_all_site, but problem can be found on projectcounts_raw only if looking at 2014
[18:10:51] joal: ya, changed plots see
[18:11:46] joal: https://analytics.wikimedia.org/dashboards/reportcard/#pagecounts-dec-2007-dec-2016
[18:11:53] joal: check it out, it includes meta
[18:12:14] nuria: allright - We have our thing
[18:12:25] now, what the heck is that ????
[18:12:27] joal: as , it matches PERFECTLY right?
[18:12:43] joal: we have the what
[18:12:48] nuria: if not perfectly, perfectly enough for me not to look further :)
[18:12:49] joal: now we need the why
[18:12:55] indeed
[18:13:01] !log starting jessie upgrade of analytics105[34]
[18:13:02] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:13:04] joal: will do some digging
[18:13:09] nuria: It'll be tomorrow for me ;)
[18:13:35] maybe the living history of mediawiki halfak might have ideas ...?
[18:13:35] cc milimetric see spike on reportcard, i included meta now : https://analytics.wikimedia.org/dashboards/reportcard/#pagecounts-dec-2007-dec-2016
[18:13:46] halfak the ELDER?
[18:13:50] we'll see ! Tomorrow a-team
[18:13:53] o/
[18:13:55] wat
[18:14:10] ok, I need to clesar this off before leaving ;)
[18:14:17] huhu
[18:14:41] halfak: any idea of what happened to meta? https://analytics.wikimedia.org/dashboards/reportcard/#pagecounts-dec-2007-dec-2016
[18:15:04] halfak: in between 2012/2015
[18:16:03] anyway, living for now (halfak I'll try to bother tomorrow on python/spark thing)
[18:16:24] cool
[18:16:26] looking
[18:16:31] will have thoughts for tomorrow
[18:16:32] o/
[18:20:41] Analytics-Kanban: Pagecounts all sites data issues - https://phabricator.wikimedia.org/T162157#3154496 (Nuria)
[18:21:00] Analytics-Kanban: Pagecounts all sites data issues - https://phabricator.wikimedia.org/T162157#3154040 (Nuria) {F7216784}
[18:23:02] bearloga: so i know what is going on when it comes to data, both you and Deskana move under contributors still working on search data? or rather editing data or ahem ... we do not know?
[18:25:22] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3154500 (ops-monitoring-bot) Script wmf_auto_reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1053.eqiad.wmnet', 'analytics1054.eq...
[18:26:23] nuria: short answer: we do not know. long answer: what happens to Discovery's analysis team is one of the points of contention right now and the whole thing is unclear at the moment.
[18:27:24] bearloga: ok, understood, just let me know when the dust settles so we know who does what so we know who to loop in what initiatives cc Deskana
[18:49:08] milimetric: yt?
[18:49:50] ottomata: yt?
[18:50:08] ya hey
[18:50:20] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3154665 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1053.eqiad.wmnet', 'analytics1054.eqiad.wmnet'] ``` and were **ALL** successful.
[18:56:20] yeah nuria, what's up
[18:56:48] milimetric: upgraded conf for reportcard, spike comes from meta: https://analytics.wikimedia.org/dashboards/reportcard/#pagecounts-dec-2007-dec-2016
[18:56:54] milimetric: does it ring a bell at all?
[18:57:29] yeah, joseph found that in hdfs this morning, I can't figure out why meta would have more data in it
[19:26:05] milimetric: we also need to change banner on wikistats to point to round2 consultation (or redirect from https://www.mediawiki.org/wiki/Wikistats_2.0_Design_Project/RequestforFeedback/Round1)
[19:26:35] right, but I have to finish writing it first
[19:26:50] I'll patch wikistats to point to it as soon as I finish and publish the new prototype changes
[19:26:54] Analytics-Kanban, Analytics-Wikistats: Create and monitor Round2 consultation page - https://phabricator.wikimedia.org/T162155#3154895 (Nuria) Let's make sure wikistats banner points to this page before we start the consultation: https://www.mediawiki.org/wiki/Wikistats_2.0_Design_Project/RequestforFeedb...
[19:50:04] !log beginning jessie reimage for analytics105[56]
[19:50:12] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[20:11:26] ottomata: niceeeee \o/
[20:25:25] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3155030 (ops-monitoring-bot) Script wmf_auto_reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1056.eqiad.wmnet'] ``` The log can b...
[20:25:58] 1055 is taking a while to drain
[20:28:32] this is massive, only 4 nodes left with ubuntu in the cluster
[20:29:51] Analytics-Kanban: Pagecounts all sites data issues - https://phabricator.wikimedia.org/T162157#3154040 (Nuria) No data for wikidata in cassandra: 0 rows) cassandra@cqlsh> select * from "local_group_default_T_lgc_pagecounts_per_project".data where "_domain"='analytics.wikimedia.org' and "access-site"='desk...
[20:33:10] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3155081 (ops-monitoring-bot) Script wmf_auto_reimage was launched by otto on neodymium.eqiad.wmnet for hosts: ``` ['analytics1055.eqiad.wmnet'] ``` The log can b...
[20:43:17] Analytics-Kanban: Pagecounts all sites data issues - https://phabricator.wikimedia.org/T162157#3155114 (Nuria) Pageviews on meta definitely not real, see wikistats: https://stats.wikimedia.org/wikispecial/EN/ReportCardTopWikis.htm#lang_meta
[20:50:10] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3155140 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1056.eqiad.wmnet'] ``` and were **ALL** successful.
[20:58:47] Analytics-Kanban, Operations, User-Elukey: Reimage all the Hadoop worker nodes to Debian Jessie - https://phabricator.wikimedia.org/T160333#3155144 (ops-monitoring-bot) Completed auto-reimage of hosts: ``` ['analytics1055.eqiad.wmnet'] ``` and were **ALL** successful.
[21:35:29] Analytics, Fundraising-Backlog: Storage for banner history data - https://phabricator.wikimedia.org/T161635#3155335 (ggellerman) p:Triage>Normal
[22:25:46] PROBLEM - Hadoop DataNode on analytics1054 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.hdfs.server.datanode.DataNode
[23:50:21] Analytics, DBA, Operations: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3155832 (jcrespo) > @leila, we can dump and copy to analytics-store, as long as there aren't any database.table name collisions. I hope you are aware that if for any reason...