[01:38:28] Analytics, Patch-For-Review, User-bd808, cloud-services-team (Kanban): Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3649673 (Nuria) Let us know if we can also delete the tables this schema was logging to
[01:40:15] Analytics, Patch-For-Review, User-bd808, cloud-services-team (Kanban): Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3649674 (bd808) >>! In T166712#3649673, @Nuria wrote: > Let us know if we can also delete...
[01:44:45] (CR) BryanDavis: [C: 1] fix name and link of host from labs [analytics/quarry/web] - https://gerrit.wikimedia.org/r/370606 (owner: Quiddity)
[05:13:20] (CR) Zhuyifei1999: [C: 2] fix name and link of host from labs [analytics/quarry/web] - https://gerrit.wikimedia.org/r/370606 (owner: Quiddity)
[05:13:44] (Merged) jenkins-bot: fix name and link of host from labs [analytics/quarry/web] - https://gerrit.wikimedia.org/r/370606 (owner: Quiddity)
[09:19:16] Hi a-team, I'm sick (Lino and Melissa as well :(, I won't work today, probably neither tomorrow - I will send an email for confirmation
[09:20:54] * elukey hugs joal
[09:30:32] Analytics-Cluster, Analytics-Kanban, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3650057 (elukey) Opened https://github.com/prometheus/jmx_exporter/issues/193 to ask to upstream if snake case could be implemented for other fields...
[09:52:22] Analytics-Cluster, Analytics-Kanban, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3650208 (fgiunchedi) Thanks @elukey and @Ottomata ! That would indeed be helpful to have and more readable too. FWIW I don't feel particularly stron...
[11:01:59] * elukey lunch!
[12:15:04] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3650388 (Marostegui) @Nuria can we get rid of these tables finally?
[12:58:19] Analytics-Cluster, Analytics-Kanban, Patch-For-Review, User-Elukey: Use Prometheus for Kafka JMX metrics instead of jmxtrans - https://phabricator.wikimedia.org/T175922#3650506 (elukey) New dashboard in https://grafana.wikimedia.org/dashboard/db/prometheus-kafka
[13:31:11] Analytics-Cluster, Analytics-Kanban, Patch-For-Review, User-Elukey: Use Prometheus for Kafka JMX metrics instead of jmxtrans - https://phabricator.wikimedia.org/T175922#3608057 (elukey) I just verified that all metrics that we had in the Kafka dashboard are currently showed by the new prometheus...
[13:39:13] Analytics-Cluster, Analytics-Kanban: Mirror topics from main Kafka clusters (from main-eqiad) into jumbo-eqiad - https://phabricator.wikimedia.org/T177216#3650677 (Ottomata)
[13:42:58] hiii mornin elukey ! :D
[13:43:04] if you have a minute, could you look at https://gerrit.wikimedia.org/r/#/c/381489/ ?
[13:43:13] i refactored some jmx exporter stuff there so want your opinion
[13:48:02] ottomata: sure! In the meantime, https://grafana.wikimedia.org/dashboard/db/prometheus-kafka :)
[13:48:34] if/when we change kafka metrics names it will be a 5 min find/replace job
[13:49:09] ya looking goooOOOd i saw it
[13:50:07] I like the code change a lot
[13:50:36] running pcc but it seems awesome
[14:02:43] gr8!
[14:03:11] elukey: merging
[14:03:54] ack!
[14:04:22] hmm actually, mabye I shoulda waited til we decide on metric names for sure. i guess it won't hurt?
[14:04:59] nah we'll just have to remember to silence the hosts
[14:05:04] k
[14:15:13] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Readers-Web-Backlog (Tracking): Schema:Popups suddenly stopped logging events in MariaDB, but they are still being sent according to Grafana - https://phabricator.wikimedia.org/T174815#3650853 (elukey) Created https://wikitech.wikimedia....
[14:16:15] a-team: we have only 6% free space left for dbstore1002, so we need to take some actions asap :)
[14:16:58] elukey, qq: what was the plan again? are we going to deactivate mediawiki replication in db1047? Or keep both replications there?
[14:18:05] mforns: db1047 for the moment will not be touched, but its replacement (that it should already be in the DC now, hopefully) will only focus on log database replication
[14:18:21] meanwhile dbstore1002 will only get wiki replication data
[14:18:33] but to be effective we'd need to drop the log database in there
[14:18:37] elukey, so for jobs that query mediawiki tables, they should keep pointing to dbstore1002?
[14:18:50] yep exactly
[14:18:52] ok
[14:18:57] thanx
[14:19:02] the things to do are, in my opinion:
[14:19:32] 1) verify that the data for the log database on dbstore1002 is the same as db1047 (so not extra tables that we are not aware of, etc..)
[14:20:02] 2) alert people using analytics-store for the log database that we'd need reclaim space as soon as possible
[14:20:22] ideally I'd like to ask them to move directly to the brand new replacement of db1047, that will be way faster
[14:20:32] but I am still not sure where the host is :D
[14:33:36] sorry team, I had some trouble this morning at the apartment, I
[14:33:40] I'm online now
[14:34:36] elukey: I agree with your plan, I can write the email to the lists (analytics and research I guess?)
[14:34:51] elukey: is it possible to drop just a few of the bigger tables so we limit the impact?
[14:35:06] then make another announcement that we're dropping more later?
[14:35:30] milimetric: sure sure, I'll ask to nuria_ the first ones to drop that we have already sqooped, no need to drop anything else right now.. but it would be great to have a plan for the following days :)
[14:37:18] Analytics-Kanban, Analytics-Wikistats: Pageview retrieval does not work if one of the fails requests - https://phabricator.wikimedia.org/T176261#3650927 (mforns)
[14:39:44] elukey: yeah, if min/max timestamp is the same on db1047 for a table, then it makes sense that we can drop it.
[14:55:34] Analytics-Kanban, Analytics-Wikistats: Wikistats unique devices metrics needs some copy that says "monthly" - https://phabricator.wikimedia.org/T176240#3650987 (fdans)
[14:56:56] Analytics-Kanban, Analytics-Wikistats: Wikistats2 bugs (4/4) - Detail page - https://phabricator.wikimedia.org/T170940#3650990 (fdans)
[15:07:00] Analytics-Cluster, Analytics-Kanban, monitoring, Patch-For-Review, User-Elukey: Use Prometheus for Kafka JMX metrics instead of jmxtrans - https://phabricator.wikimedia.org/T175922#3651006 (faidon)
[15:12:05] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651025 (Nuria) Sounds (per e-mail conversation) that reserachers are interested on the data: https://lists.wikimedia.org/pipermail/wiki-research-l/2017-July/005931.html so i do not think we can remove them
[15:23:15] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3651040 (fgiunchedi)
[15:25:04] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3360724 (Nuria) We can drop the following tables: TO_DROP_ImageMetricsCorsSupport_11686678 _EchoInteraction_5782287 and per ticket: https://phabricator.wikimedia.org/T171629 we can drop: PageCr...
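A minimal sketch of the check suggested at 14:39:44 (same min/max timestamp on both hosts before a table is dropped). Several things here are assumptions and not taken from the log: that each table has a column named `timestamp`, that both hosts are reachable through Python DB-API connections, and the helper names `timestamp_range`, `same_range` and `make_demo_db`, which are hypothetical. Two in-memory SQLite databases stand in for the real hosts so the example runs anywhere; the row count is compared too, as a cheap extra guard.

```python
import re
import sqlite3

# Identifiers cannot be bound as SQL parameters, so restrict them to a safe pattern.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def timestamp_range(conn, table, column="timestamp"):
    """Return (min, max, row_count) for `column` in `table` (column name is an assumption)."""
    if not (_SAFE_NAME.fullmatch(table) and _SAFE_NAME.fullmatch(column)):
        raise ValueError(f"unexpected identifier: {table!r} / {column!r}")
    cur = conn.cursor()
    cur.execute(f"SELECT MIN({column}), MAX({column}), COUNT(*) FROM {table}")
    return tuple(cur.fetchone())


def same_range(conn_a, conn_b, table):
    """True when both connections report the same min, max and row count."""
    return timestamp_range(conn_a, table) == timestamp_range(conn_b, table)


def make_demo_db(rows):
    """Build an in-memory SQLite stand-in for one host (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PageCreation_7481635 (timestamp TEXT)")
    conn.executemany(
        "INSERT INTO PageCreation_7481635 VALUES (?)", [(r,) for r in rows]
    )
    return conn


rows = ["20170101000000", "20170930235959"]
host_a = make_demo_db(rows)
host_b = make_demo_db(rows)
host_c = make_demo_db(rows + ["20171001000000"])  # one extra, newer row

print(same_range(host_a, host_b, "PageCreation_7481635"))  # True
print(same_range(host_a, host_c, "PageCreation_7481635"))  # False
```

Matching ranges only show that both copies cover the same time span with the same number of rows; they do not prove the rows are identical, so this is a sanity check before a drop, not a full comparison.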
[15:29:05] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3651067 (Nuria) Let's scoop and archive: MediaViewer_10867062_15423246
[15:29:32] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3651068 (Ottomata) Upstream author replied in https://github.com/prometheus/jmx_exporter/issues/193#issuecomment-333522441, says he t...
[15:34:36] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651096 (Marostegui) Open>declined How can something last written in 2013 still be useful? ``` root@db1052:/srv/sqldata/enwiki# ls -lh moodbar_feedback*.ibd -rw-rw---- 1 mysql mysql 40M Mar 12 2013 moo...
[15:37:18] Analytics-Kanban: hashing code for refinery - https://phabricator.wikimedia.org/T177224#3651112 (Nuria)
[15:40:15] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651160 (Reedy) I don't see why it can't just be exported to an sql dump file, and archived somewhere. Possibly then imported to another db cluster (analytics or something) if someone wants it in future
[15:42:11] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3651179 (Nuria)
[15:42:32] Analytics-Kanban, Analytics-Wikistats, Patch-For-Review: Create Druid public cluster such AQS can query druid public data - https://phabricator.wikimedia.org/T176223#3617909 (Nuria)
[15:44:11] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3651200 (Eevans) Isn't snakecase the //Prometheus// convention? If we're concerned about consistency, aren't we concerned with these...
[16:03:29] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651306 (jcrespo) declined>Open If those tables are not in use in production, those tables have to be dropped from production boxes. Unless someone else wants to become the owner of mediawiki dbs, that...
[16:08:16] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3651321 (Ottomata) > We used some sample config (from Prometheus upstream, IIRC) for Cassandra that rewrote the very verbose MBean n...
[16:09:19] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3651325 (Ottomata) > Isn't snakecase the Prometheus convention? AFAIK, snakecase isn't an option for JMX metrics. It is either as t...
[16:09:58] Analytics: Private geo wiki data in new analytics stack - https://phabricator.wikimedia.org/T176996#3651340 (fdans) p:Normal>High
[16:13:06] Analytics, EventBus, Wikimedia-Stream: Hits from private AbuseFilters aren't in the stream - https://phabricator.wikimedia.org/T175438#3651354 (fdans) @Nirmos we'll be looking at this but it's possible that this behaviour is by design.
[16:20:02] Analytics-Kanban, Beta-Cluster-Infrastructure: deployment-kafka01 - disk is full - https://phabricator.wikimedia.org/T174742#3651385 (fdans)
[16:23:39] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3651408 (Eevans) >>! In T177078#3651321, @Ottomata wrote: >> We used some sample config (from Prometheus upstream, IIRC) for Cassand...
[16:33:14] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3651448 (Nuria) next ones: MediaViewer_10867062_15423246 299.91 MobileWikiAppToCInteraction_10375484_15423246 140.57 Edit_13457736_15423246 130.85 MobileWikiAppSearch_10641988_15423246 83.53
[16:34:10] doing lunch, but I'll be around
[16:39:08] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651477 (Nuria) >I don't see why it can't just be exported to an sql dump file, and archived somewhere. Possibly then imported to another db cluster (analytics or something) if someone wants it in future I thin...
[16:53:45] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651546 (jcrespo) See my proposal, it keeps both manuel and you happy (but requires work from those wanting to keep these around, which I think is fair :-P).
[17:06:12] Analytics, Proton, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board: Implement Schema:Print purging strategy - https://phabricator.wikimedia.org/T175395#3592394 (phuedx) {5d450b910d4e228aa6c657f3acb858f50736c840} was -1d by @mforns.
[17:07:10] a-team: going to remove the following tables https://phabricator.wikimedia.org/T168303#3651045 from all the el databases
[17:07:52] looks good elukey, and we're waiting on sqoop to finish to remove the other big ones one by one
[17:08:07] let us know what the size looks like after this first drop
[17:10:34] MariaDB [log]> show tables like '%Page';
[17:10:35] Empty set (0.00 sec)
[17:10:49] PageCreation_7481635_1542324 and PageCreation_7481635 are not there ?
[17:11:01] this is dbstore1002
[17:11:52] elukey: show tables like 'Page%' right?
[17:11:52] nuria_: --^
[17:12:42] yesss sorry
[17:12:48] but the tables are not there
[17:13:10] I have only
[17:13:10] | PageCreation_7481635 |
[17:13:11] | PageCreation_7481635_15423246 |
[17:13:54] elukey: wait .. me no compredou
[17:14:01] those are the two we can delete
[17:14:14] https://www.irccloud.com/pastebin/KKSMviOM/
[17:14:16] on the task you mentioned PageCreation_7481635_1542324 and PageCreation_7481635
[17:14:21] plus the
[17:14:33] elukey: yeah, these are the ones from the task, that you just pasted here
[17:14:42] they're the same names:
[17:15:11] ahhh sorry I've read them from the other task
[17:15:13] my bad
[17:15:23] just wanted to confirm, my brain was confused
[17:15:23] elukey: ahhh
[17:15:29] elukey: ay ay
[17:15:40] elukey: ok so we can delete those two
[17:17:06] ah wait this is why I was confused - in the dbstore1002's task I can see "and per ticket: https://phabricator.wikimedia.org/T171629 we can drop: PageCreation_7481635_1542324 and PageCreation_7481635"
[17:17:20] https://phabricator.wikimedia.org/T168303#3651045
[17:17:26] but those are not the ones on dbstore1002
[17:17:39] same PageCreation prefix but different revision
[17:17:46] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651673 (Nuria) >but requires work from those wanting to keep these around, which I think is fair @jcrespo from research team?
[17:18:00] that is probably fine as far as I can tell
[17:18:34] (please people be patient with a poor ops dropping tables :P)
[17:20:48] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651683 (Nuria) No, ok, you mean, analytics right? To be clear we do not have a use for that data ourselves but I think it should not be deleted if it is of interest for reserarch. Would you be so kind as to ou...
[17:21:32] https://www.irccloud.com/pastebin/KKSMviOM/ are matching the ones that I can see on the db
[17:21:35] okok
[17:21:40] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651686 (jcrespo) Whoever wants to keep them! I say it is fair because normally when you ask **if** to keep them around, everybody is for it; if you ask **who** wants to keep around and take care of archiving i...
[17:24:12] milimetric: sorry for the extra ping, but PageCreation_7481635 and PageCreation_7481635_15423246 are the correct ones right?
[17:24:47] elukey: I checked and re-checked those names several times, but I'm dyslexic, one second, let me diff it
[17:25:34] in the task there are "PageCreation_7481635_1542324 and PageCreation_7481635" meanwhile on the db there are "PageCreation_7481635 and PageCreation_7481635_15423246"
[17:26:09] I am pretty sure it is fine but I want to be extra careful since after I press enter the data is gone :D
[17:26:27] elukey: aha! so there is that tiny difference
[17:26:30] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651708 (Nuria) @jcrespo: there is a staging database in the analytics replicas, could those tables be copied there before you delete them for all wikis? That is the best I can think of right now.
[17:26:31] but I'm sure that's a typo
[17:26:47] all right, I thought the same but I wanted a confirmation
[17:26:49] proceeding then
[17:26:51] because there's no such thing as _1542324 (it's always supposed to be _15423246)
[17:27:12] +1
[17:27:34] elukey: yes +1 to dan, sorry
[17:28:10] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651713 (jcrespo) Yes, that was actually my implicit suggestion. Other things can be suggested, and we will help, we just need to take them outside the *wik* dbs.
[17:28:19] done!
[17:29:17] space freed basically un-noticed :(
[17:30:05] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3651718 (Nuria) @jcrespo ok, i wasn't clear. Then maybe we can put them in a better-names database like "mediawiki-archive"?
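A minimal sketch of the name comparison done by hand between 17:10 and 17:27, where the task listed `PageCreation_7481635_1542324` but the database held `PageCreation_7481635_15423246`. The helper name `check_drop_list` and the 0.9 similarity cutoff are assumptions for illustration; the table names are the ones quoted in the log. It uses Python's standard `difflib` to flag requested names that do not exist and to suggest the closest existing name.

```python
import difflib


def check_drop_list(requested, existing, cutoff=0.9):
    """Split requested names into exact matches and misses with close candidates.

    Returns (exact, missing), where `missing` maps each name that is not in
    `existing` to a list of similar existing names, best match first.
    """
    existing_set = set(existing)
    exact = [name for name in requested if name in existing_set]
    missing = {
        name: difflib.get_close_matches(name, existing, n=3, cutoff=cutoff)
        for name in requested
        if name not in existing_set
    }
    return exact, missing


# Names as written in the task versus names reported by the database.
requested = ["PageCreation_7481635_1542324", "PageCreation_7481635"]
existing = ["PageCreation_7481635", "PageCreation_7481635_15423246"]

exact, missing = check_drop_list(requested, existing)
print("exact:", exact)
for name, candidates in missing.items():
    print(f"not found: {name}; closest existing: {candidates}")
```

A suggestion from this kind of check is only a hint that a name may be a typo; a person still has to confirm the intended table before anything is dropped, as happened in the channel.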
[17:30:10] elukey: ok, and the other two i listed should be safe to drop too, working now on blacklisting pagecontentsavecomplete
[17:30:31] elukey: which will not free space but will stop it from being eaten
[17:30:51] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3651719 (elukey) >>! In T168303#3651045, @Nuria wrote: > We can drop the following tables: > > TO_DROP_ImageMetricsCorsSupport_11686678 > > > _EchoInteraction_5782287 > > and per ticket: https:...
[17:44:12] nuria_: if you are ok I'd step away from the keyboard and check your updates tomorrow morning
[17:44:36] elukey: sure, i am importing now the media viewer table but it will take like 20 mins
[17:47:16] super thanks :)
[17:47:19] * elukey off!
[17:47:31] byeeee team!!
[17:54:17] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review, Readers-Web-Backlog (Tracking): Schema:Popups suddenly stopped logging events in MariaDB, but they are still being sent according to Grafana - https://phabricator.wikimedia.org/T174815#3651818 (Nuria) @Tbayer Waiting for your confirmatio...
[18:11:25] Analytics: Making geowiki data public - https://phabricator.wikimedia.org/T131280#3651863 (Ijon) Thank you, @Milimetric. That would be great.
[18:13:47] milimetric: i think cluster is super busy with end of the month jobs so the scooping is going to take forever
[18:17:01] nuria_: makes sense, true
[18:17:49] we could do something more manual like mysqldump / copy to stat1005 and stat1004 for redundancy, and when it's done copy up to HDFS ?
[18:22:12] taking off for my midday break, nuria_, but ping me with what you want to do and I can help in a couple hours
[18:22:19] milimetric: np
[18:28:26] ottomata: yt?
[18:30:19] oming!
[18:42:41] Analytics, Proton, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board: Implement Schema:Print purging strategy - https://phabricator.wikimedia.org/T175395#3651958 (mforns) BTW, thanks a lot for taking the time and effort to create that puppet change!
[19:02:22] (PS2) Mforns: Replace references to dbstore1002 by db1047 [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/380751 (https://phabricator.wikimedia.org/T176639)
[19:05:29] (CR) Mforns: Replace references to dbstore1002 by db1047 (3 comments) [analytics/limn-flow-data] - https://gerrit.wikimedia.org/r/380751 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[20:24:23] fdans: did you talked to release engineering about migrating depots?
[20:26:21] @nuria_: not yet, will talk to them tomorrow eu time :)
[21:12:26] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3652306 (Nuria) This table is going to free about 40G but not much more, running tests on data now: -rw-r--r-- 3 hdfs nuria 7.4 G 2017-10-02 18:45 hdfs://analytics-hadoop/user/nuria/MediaViewer...