[01:40:30] Analytics, Proton, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board: Implement Schema:Print purging strategy - https://phabricator.wikimedia.org/T175395#3652765 (bmansurov) @Tbayer, would you please take a look at Mforns' comment at https://gerrit.wikimedia.org/r/#/c/379829/3?...
[03:45:44] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3652812 (Nuria) Table MediaViewer_10867062_15423246 can be dropped, it is now available on archive db in hadoop.
[05:54:48] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3652929 (elukey) Cleaned up some eventlogging_cleaner logs, new status of dbstore1002: ``` /dev/mapper/tank-data 6.4T 6.0T 475G 93% /srv /dev/sda1 37G 32G 3.2G 91% / ``` The ro...
[06:04:10] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3652947 (Marostegui) I like Jaime's solution :-) Placing them on the staging database would work pretty well for me too, as we could simply issue the drop on the master and replication will not touch them for t...
[07:31:25] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3653045 (elukey) >>! In T168303#3652812, @Nuria wrote: > Table MediaViewer_10867062_15423246 can be dropped, it is now available on archive db in hadoop. Sanity check: ``` # dbstore1002 MariaDB [log]...
[07:48:11] Analytics-Kanban, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3653063 (elukey) Removed the super huge /var/log/eventlogging_sync.log files, now the root looks like: ``` /dev/sda1 37G 5.8G 29G 17% / ```
[09:48:08] Analytics-Kanban, Patch-For-Review, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3653222 (jcrespo) ``` root@dbstore1002[(none)]> SELECT CONCAT(table_schema, '.', table_name), -> CONCAT(ROUND(table_rows / 1000000, 2), 'M')...
[09:50:34] Analytics-Kanban, Patch-For-Review, User-Elukey: dbstore1002 /srv filling up - https://phabricator.wikimedia.org/T168303#3653232 (jcrespo) The plan is to stop s5 replication, and convert wb_terms to TokuDB.
[11:04:03] * elukey lunch!
[11:09:45] (PS2) Mforns: Replace references to dbstore1002 by db1047 [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/380755 (https://phabricator.wikimedia.org/T176639)
[11:12:05] (Abandoned) Mforns: Replace references to dbstore1002 by db1047 [analytics/limn-language-data] - https://gerrit.wikimedia.org/r/380756 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[11:18:51] (PS2) Mforns: Remove reference to old cname x1-analytics-slave [analytics/limn-ee-data] - https://gerrit.wikimedia.org/r/380757 (https://phabricator.wikimedia.org/T176639)
[11:19:55] (Abandoned) Mforns: Replace references to dbstore1002 by db1047 [analytics/limn-multimedia-data] - https://gerrit.wikimedia.org/r/380758 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[12:13:58] (CR) Elukey: [C: 1] Replace references to dbstore1002 by db1047 [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/380755 (https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[12:14:42] (CR) Elukey: [C: 1] Remove reference to old cname x1-analytics-slave [analytics/limn-ee-data] - https://gerrit.wikimedia.org/r/380757
(https://phabricator.wikimedia.org/T176639) (owner: Mforns)
[12:39:18] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653409 (elukey) >>! In T177078#3651470, @Ottomata wrote: > Ah with fancy schmancy regex matching, with each regex actually matching...
[13:06:37] Analytics-Kanban, User-Elukey: Alarms on pageview API latency increase - https://phabricator.wikimedia.org/T164243#3226937 (elukey) I think that it would be great to set up those alarms directly on Prometheus metrics since we are moving away from jmx trans. It should be a matter of enabling the prometheu...
[13:07:25] Analytics, Analytics-Cluster, Operations, Patch-For-Review, User-Elukey: rack/setup/install new kafka nodes kafka-jumbo100[1-6] - https://phabricator.wikimedia.org/T167992#3653443 (elukey) Open>Resolved
[13:24:28] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653496 (Ottomata) Phewwwwwww let's think about this a little more. If we are about to say that to use JMX Exporter, you must create...
[13:35:09] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653516 (elukey) >>! In T177078#3653496, @Ottomata wrote: > Phewwwwwww let's think about this a little more. If we are about to say...
[13:51:31] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3646534 (Gehel) In short: anything is fine by me except lower casing everything. Since I was asked, I'll add my opinion. I don't rea...
[13:53:11] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653578 (Ottomata) > Because this is not Java, it is prometheus These are metrics from Java. Prometheus is monitoring the metrics i...
[14:08:04] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653611 (elukey) >>! In T177078#3653578, @Ottomata wrote: >> Because this is not Java, it is prometheus > > These are metrics from J...
[14:14:01] elukey: you around?
[14:15:42] milimetric: o/
[14:15:57] hey, cave for a second? I wanna pick your brain and it's probably easier in real time
[14:16:56] milimetric: sure, grabbing my headphones
[14:17:52] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653625 (Eevans) >>! In T177078#3653578, @Ottomata wrote: >> as far as I can see from this task there is not much agreement in using...
[14:18:50] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653626 (Eevans) >>! In T177078#3653611, @elukey wrote: >>>! In T177078#3653578, @Ottomata wrote: >>> Because this is not Java, it is...
[14:20:18] Analytics-Cluster, Analytics-Kanban, monitoring, User-Elukey: Decide on casing convention for JMX metrics in Prometheus - https://phabricator.wikimedia.org/T177078#3653630 (Ottomata) Oook, so it sounds like we don't want just to lower, and both elukey and eevans don't want camelCase. So, the onl...
[14:21:47] Analytics, Release-Engineering-Team (Kanban): Move Wikistats 2 from Differential to Gerrit - https://phabricator.wikimedia.org/T177288#3653631 (fdans)
[14:23:33] Analytics, Release-Engineering-Team (Kanban): Move Wikistats 2 from Differential to Gerrit - https://phabricator.wikimedia.org/T177288#3653651 (fdans)
[14:30:56] ottomata: o/
[14:31:32] if you are ok I'd merge the acl change, and then we can set a basic auth all rule for the ANONYMOUS user
[14:33:16] Analytics, Release-Engineering-Team (Kanban): Move Wikistats 2 from Differential to Gerrit - https://phabricator.wikimedia.org/T177288#3653672 (hashar) I have copied the repository with: ``` git clone ssh://vcs@git-ssh.wikimedia.org/source/wikistats.git cd wikistats.git git remote add gerrit ssh://gerrit...
[14:34:41] Analytics, Release-Engineering-Team (Kanban): Move Wikistats 2 from Differential to Gerrit - https://phabricator.wikimedia.org/T177288#3653674 (fdans)
[14:35:18] Analytics, Release-Engineering-Team (Kanban): Move Wikistats 2 from Differential to Gerrit - https://phabricator.wikimedia.org/T177288#3653675 (hashar)
[14:35:26] elukey: will that work even if we don't have any keys set up yet?
[14:37:02] ottomata: if my experiments are correct, on the plaintext port everybody should be authenticated as "ANONYMOUS", that is the catch-all user for the non-authenticated ones
[14:38:29] so it should only be a matter of restarting the cluster and then allowing basic produce/consume acls for ANONYMOUS
[14:38:53] (ah also the acls to allow brokers to communicate between each other)
[14:44:53] elukey: will the brokers need keys to ID themselves though?
[14:45:52] Analytics, Patch-For-Review, Release-Engineering-Team (Kanban): Move Wikistats 2 from Differential to Gerrit - https://phabricator.wikimedia.org/T177288#3653697 (hashar)
[14:48:09] ottomata: good point, lemme double check
[14:50:39] root@thorium:/srv/src/wikistats-v2# git pull
[14:50:40] fatal: unable to access 'https://phabricator.wikimedia.org/source/wikistats.git/': The requested URL returned error: 403
[14:50:49] !log restarted failed workflow 0057215-170829140538136-oozie-oozi-W (druid monthly banner activity)
[14:50:50] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:51:23] did we change anything on the wikistats repo?
[14:51:25] like auth etc..
[14:51:46] otherwise it might be one of those random weird issues with git clone
[14:51:57] elukey: not that I know
[14:52:52] elukey: my clone uses ssh://vcs@git-ssh.wikimedia.org/source/wikistats.git though
[14:53:15] https://phabricator.wikimedia.org/T177288 ?
[14:53:40] seems now on gerrit
[14:55:25] https://gerrit.wikimedia.org/r/#/admin/projects/analytics/wikistats2
[14:55:37] * elukey pings fdans
[14:55:38] :)
[14:55:59] HELLOOOO
[14:56:06] we are on the gerrits now!
[14:56:26] as of half an hour ago
[14:56:31] elukey milimetric
[14:56:41] uh...
[14:56:51] oh, I guess technically we did say after the quarter's over :)
[14:56:55] ok, fair enough
[14:57:20] fdans / mforns: thanks very much for the review on the topic selector, I love mforns's suggestions
[14:57:22] Analytics-Kanban: Backup some files from HDFS with checksumming on/after copy - https://phabricator.wikimedia.org/T177224#3653744 (Ottomata) p:Triage>High a:Ottomata
[14:57:40] fdans: should I change the git clone on thorium to pull from gerrit?
[14:57:50] Differential seems to return a 403 now
[14:57:55] fdans: yeah, you should've let luca know, 'cause we need to change the puppet code
[14:58:24] I'm sorry
[14:58:57] I got word from nuria yesterday to ping releng about this, and I certainly didn't expect to have it done this quick
[14:59:21] fdans: all fine, it is only a puppet failure, fixing it in a sec
[15:00:38] ping fdans mforns
[15:00:44] ping joal
[15:00:53] ah no joal sick
[15:09:36] fdans: RECOVERY - puppet last run on thorium is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[15:09:39] fixed :)
[15:10:06] graaaaaaazieeeee elukey!!!!
[15:11:38] fdans: I basically changed the origin to https://gerrit.wikimedia.org/r/analytics/wikistats2 ok?
[15:11:46] (and also the puppet refs)
[15:12:00] yes, that's great :)
[15:12:36] super :)
[16:00:29] mforns: i think that last JIRA you linked to is what I need
[16:00:33] but isn't merged or in our version
[16:01:57] Analytics, DBA: Drop MoodBar tables from all wikis - https://phabricator.wikimedia.org/T153033#3653939 (Nuria) @marostegui: let's put them on a mediawiki-archive database, the staging database (if I am not mistaken) has open permits for everyone to delete /update. If that is possible i think that would...
[16:02:52] Analytics-Kanban: Backup some files from HDFS with checksumming on/after copy - https://phabricator.wikimedia.org/T177224#3653941 (Ottomata)
[16:43:19] ottomata, :/
[17:05:44] Analytics, Proton, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board: Implement Schema:Print purging strategy - https://phabricator.wikimedia.org/T175395#3654073 (phuedx) a:bmansurov>Tbayer Per T175395#3652765, this is awaiting feedback from @Tbayer.
[17:06:20] nuria_: staff? :)
[17:07:15] elukey: yes coming
[17:07:22] elukey: sorry, just out from meeting
[17:53:03] wikimedia/mediawiki-extensions-EventLogging#700 (wmf/1.31.0-wmf.2 - aa97dbe : Translation updater bot): The build has errored.
[17:53:03] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/compare/wmf/1.31.0-wmf.2
[17:53:04] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/282838500
[18:37:27] fdans: btw, I hadn't pressed Submit on this but the release branch is there now: https://gerrit.wikimedia.org/r/#/admin/projects/analytics/wikistats2,branches
[18:37:52] milimetric: awesome, thank you :)
[19:16:54] wikimedia/mediawiki-extensions-EventLogging#701 (wmf/1.30.1-wmf.2 - aa97dbe : Translation updater bot): The build has errored.
[19:16:54] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/compare/wmf/1.30.1-wmf.2
[19:16:54] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/282878012
[19:50:16] Analytics, Proton, Readers-Web-Backlog, Patch-For-Review, Readers-Web-Kanban-Board: Implement Schema:Print purging strategy - https://phabricator.wikimedia.org/T175395#3654958 (Tbayer) I (HaeB) have responded at https://gerrit.wikimedia.org/r/#/c/379829/3 .
[21:29:12] nuria_: mforns, no luck with my final ideas.
[21:29:12] but
[21:29:14] i return to this
[21:29:15]   hdfs dfs -text file:///path/to/local/file1 >/dev/null
[21:29:17] WORKS
[21:29:18] so.
[21:29:21] there MUST be a way
[21:29:31] works how?
[21:29:35] ottomata:
[21:29:43] to get a local ChecksumFileSystem that knows how to verify against the local .crcs
[21:29:49] aah
[21:29:50] ok
[21:30:05] if i copyToLocal some files, let's say file1
[21:30:10] i get the contents of file1 locally
[21:30:12] and .file1.crc
[21:30:18] if I do hdfs dfs -text file:///path/to/local/file1
[21:30:21] it will output the file
[21:30:22] BUT
[21:30:26] if i edit file1 locally
[21:30:28] and then run
[21:30:28] hdfs dfs -text file:///path/to/local/file1
[21:30:32] it throws that exception
[21:30:33] kaput
[21:30:34] so
[21:30:40] it must be checking against SOMETHING
[21:30:48] dunno what it could be checking other than the local .crc
[21:30:54] there has got to be a way to do it
[21:30:58] nothing
[21:31:00] Ya
[21:31:01] via regular APIs
[21:31:02] there is
[21:31:20] hmmm, OH, right
[21:31:21] wait
[21:31:22] right
[21:31:35] that works because it is just checking locally
[21:31:42] which is not the same as comparing HDFS
[21:31:45] checksum
[21:31:45] hm
[21:31:50] ya
[21:32:20] but still
[21:32:25] it is not that far off
[21:32:32] just an md5 away
[21:32:44] of *something* inside that crc
[21:34:27] yeah
[21:34:40] we will figure it out
[22:20:38] Analytics-Kanban: Final steps to expose project family unique devices data - https://phabricator.wikimedia.org/T167539#3655324 (Milimetric)
[22:20:40] Analytics-Kanban: Update html language for per-domain uniques - https://phabricator.wikimedia.org/T168477#3655323 (Milimetric)
[22:56:56] nuria_: whatever happened to the webrequest tagging project?
[22:57:02] did we put that off or...
[23:15:59] nuria_: nvm, I see org.wikimedia.analytics.refinery.hive.GetWebrequestTagsUDF
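
Note on the 21:29 to 21:34 thread: the behavior described there (reading the local file1 succeeds until file1 is edited, after which it no longer matches the checksums stored next to it in .file1.crc) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the helper names `chunk_crcs` and `verify`, the 512-byte chunk size, and the in-memory list of CRC32 values are hypothetical and are not Hadoop's actual .crc file format or API. As pointed out in the chat, a check of this kind only validates the local copy against locally stored checksums; it does not compare anything against the checksum held by HDFS.

```python
import os
import tempfile
import zlib

CHUNK_SIZE = 512  # assumed chunk size, chosen only for this illustration


def chunk_crcs(path, chunk_size=CHUNK_SIZE):
    """Return one CRC32 value per fixed-size chunk of the file."""
    crcs = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crcs.append(zlib.crc32(chunk) & 0xFFFFFFFF)
    return crcs


def verify(path, recorded_crcs, chunk_size=CHUNK_SIZE):
    """True if the file's per-chunk CRC32 values match the recorded ones."""
    return chunk_crcs(path, chunk_size) == recorded_crcs


def demo():
    """Record checksums, verify, edit one byte, verify again."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file1")  # hypothetical local copy
        with open(path, "wb") as f:
            f.write(b"a" * 2000)
        recorded = chunk_crcs(path)  # stands in for the sidecar checksums
        ok_before = verify(path, recorded)
        with open(path, "r+b") as f:  # simulate a local edit of file1
            f.seek(100)
            f.write(b"X")
        ok_after = verify(path, recorded)
    return ok_before, ok_after
```

Calling `demo()` checks the file once before and once after the one-byte edit; only the first chunk's CRC changes, which is enough for the second check to fail.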