[01:14:35] Analytics-Kanban, Patch-For-Review: Mediawiki History Druid indexing failed - https://phabricator.wikimedia.org/T170493#3468720 (Nuria) mmm... operator error: format is yyyy-MM-dd HH:mm:ss.S!
[01:16:41] (PS3) Nuria: Modifying ingestion spec after additions to edit history [analytics/refinery] - https://gerrit.wikimedia.org/r/366327 (https://phabricator.wikimedia.org/T170493)
[07:18:23] Analytics, EventBus, Scap: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3467218 (Joe) indeed.
[07:41:53] Analytics, EventBus, Scap: eventlogging-service-eventbus scap deployments should depool/pool during deployment - https://phabricator.wikimedia.org/T171506#3469035 (elukey) Puppet was broken on kafka2002: ``` Error: Could not set home on user[eventlogging]: Execution of '/usr/sbin/usermod -d...
[10:55:19] Analytics: Upgrade AQS to node 6.11 - https://phabricator.wikimedia.org/T170790#3469578 (elukey) p:Normal>High Installed new debian packages on deployment-aqs01.eqiad.wmflabs / deployment-aqs02.eqiad.wmflabs / deployment-aqs03.eqiad.wmflabs, ready for a test. Since this upgrade contains security upda...
[11:05:30] * elukey lunch
[12:13:25] !log executed sudo apt-get remove openjdk-8-jre openjdk-8-jre-headless on druid nodes
[12:13:26] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:19:05] hi team :]
[12:19:25] mforns: o/
[12:19:36] elukey! :]
[12:19:40] the script has completed its work on dbstore1002 :)
[12:19:50] we can check and then restart the full purge
[12:20:37] elukey, \o/ until end of 2014 right
[12:20:38] ?
[12:21:48] yep! actually let me grab the interval
[12:21:53] since I've run it two times
[12:22:03] k
[12:23:38] should be {'start_ts': '20090501170620', 'batch_size': 10000, 'end_ts': '20141231170620'}
[12:45:53] mforns: which aqs url would you use to query uniques for say all of wikiquote?
[12:46:05] fdans, mmmmm
[12:46:05] I seem to be only able to query all of wikipedia
[12:48:03] fdans, how do you get wikipedia family?
[12:48:28] I can not get data from it
[12:48:37] mforns: if i do
[12:48:38] https://wikimedia.org/api/rest_v1/metrics/unique-devices/wikipedia.org/all-sites/monthly/2016050500/2017050500
[12:48:44] that works
[12:49:20] ohhh but I guess that's just for the splash page
[12:49:23] not the whole family
[12:49:46] but.... aren't you supposed to be able to query a whole family in uniques mforns ?
[12:50:08] ah ok, monthly one sec
[12:53:48] fdans, mmmmm, not sure
[12:54:13] the numbers are too low to be the whole wikipedia family
[12:54:18] yeah
[12:54:38] maybe the unique devices per project family are not yet loaded in cassandra
[12:55:01] mforns: should they be in theory?
[12:56:02] fdans, I don't think so: https://phabricator.wikimedia.org/T143927
[12:56:25] they are still being vetted, and the task for making them public is still untouched
[12:57:25] oh I misunderstood everything then... I thought I had to deal with a special case in the uniques metric where we can query all uniques for a family
[12:57:35] (but not for all projects)
[13:06:10] !log stop cassandra load bundle, restarting AQS for jvm updates
[13:06:10] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:06:32] a-team: I am restarting cassandra on aqs, let me know if you notice anything weird
[13:41:32] * fdans lunch 🇪🇸
[14:13:05] ottomata: if you have time after standup I'd need some help in figuring out what's happening on the analytics labs kafka brokers
[14:13:11] I can see auth going on and happening
[14:13:19] but can't produce to any topic
[14:13:21] for weird issues
[14:13:29] for example, ISR are always 1/3
[14:13:42] +1 elukey i'm for you whenever
[14:13:54] super thanks!
[14:33:22] hey ottomata. I sent Nithum an email about testing TensorFlow on AMD GPU, per the discussion we had in the last chill hour.
you are in bcc fyi, I'll let you know if I hear back.
[14:39:27] oh great, thank you
[14:46:17] milimetric, do you have 10 mins before standup?
[14:46:23] Analytics, Analytics-Wikistats: Fix Wikistats build in Jenkins - https://phabricator.wikimedia.org/T171599#3470314 (fdans)
[14:46:24] yes, cave?
[14:46:28] yep!
[14:59:12] Analytics-Kanban: Initial Launch of new Wikistats 2.0 website - https://phabricator.wikimedia.org/T160370#3470412 (Nuria)
[14:59:14] Analytics-Kanban, Analytics-Wikistats: Define, Document (and test) Desktop and Mobile browser support for wikistats 2.0 - https://phabricator.wikimedia.org/T170457#3470411 (Nuria) Open>Resolved
[15:00:41] fdans: standdduppp
[15:14:27] ottomata: (ignorant me asking) what did you change for the mysql consumers in EL? I can still see two of them, the m4 and the eventbus one in the logs
[15:15:35] (PS4) Nuria: Modifying ingestion spec after additions to edit history [analytics/refinery] - https://gerrit.wikimedia.org/r/366327 (https://phabricator.wikimedia.org/T170493)
[15:24:37] !log restart cassandra loading after maintenance via hue
[15:24:38] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[15:31:51] milimetric: my druid spec change is ready to merge if data looks good in druid, cc mforns_ : https://gerrit.wikimedia.org/r/#/c/366327/4/oozie/mediawiki/history/druid/load_mediawiki_history.json.template
[15:32:12] lookin
[15:33:42] mforns: let me know when you want to data-vet eventlogging
[15:34:09] elukey, in 10 minutes?
[15:35:08] oook!
[15:37:43] nuria_, change looks good, is the data in pivot? probably in a "hidden" dataset? do you have a link?
[15:38:43] mforns: then later we can get back to the router thing
[15:38:54] milimetric, OK!
[15:42:20] mforns: you can take a bit more time, I am debugging kafka with Andrew
[15:42:39] elukey, ok, will ping milimetric first then if he's ok
[15:52:51] Off to the dentist :(
[15:52:58] I am the most miserable man on earth
[15:53:14] (Be thankful you have access to a dentist.)
[15:54:33] harej: not really complaining about the dentist, but my own teeth :)
[15:59:41] Ah, yes.
[16:00:14] * harej is somewhat bitter because a doctor just canceled on him
[16:01:34] mforns: whenever you are ready, I keep working on kafka in the meantime
[16:01:43] elukey, ok, let's do it
[16:01:48] batcave!
[16:38:14] ottomata: can you ssh to zk2-1.analytics.eqiad.wmflabs ?
[16:41:13] grrrr
[16:41:14] no
[16:41:33] mforns: dataset should be visible but maybe snapshot is not where it should be? https://pivot.wikimedia.org/#mediawiki-history-beta/
[16:41:45] k
[16:41:58] mforns: no, it is, i imported may
[16:41:58] elukey: 2017-07-25T16:17:02.940437+00:00 zk2-1 nslcd[487]: [b127f8] no available LDAP server found: Server is unavailable
[16:42:01] from horizon log
[16:42:13] maybe kill it and make a new one :/
[16:42:13] dunno
[16:42:31] elukey: or just apply a zk server on one (or all) of the kafka3 nodes
[16:42:34] mforns: ... mmmm wait
[16:42:34] since you know those work :/
[16:43:59] :(
[16:51:17] mforns: dataset is this one: https://pivot.wikimedia.org/#mediawiki-history-beta
[16:51:24] nuria_, ok thanks!
[16:51:43] mforns: it can be seen cause timestamps are in the new format yyyy-MM-dd HH:mm:ss.S
[16:51:59] mforns: it includes up to the month of may
[16:52:05] k
[17:10:10] wikimedia/mediawiki-extensions-EventLogging#675 (wmf/1.30.0-wmf.11 - 019d339 : Kunal Mehta): The build has errored.
[17:10:10] Change view : https://github.com/wikimedia/mediawiki-extensions-EventLogging/compare/wmf/1.30.0-wmf.11
[17:10:10] Build details : https://travis-ci.org/wikimedia/mediawiki-extensions-EventLogging/builds/257396082
[17:13:18] Analytics: Add purge info for Kartographer schema - https://phabricator.wikimedia.org/T171622#3471039 (mpopov)
[17:14:03] Analytics, Discovery, Discovery-Analysis: Add purge info for Kartographer schema - https://phabricator.wikimedia.org/T171622#3471052 (mpopov)
[17:40:49] Heyas analytics folks, anyone about who can chat with me about the new druid100[456] systems setup?
[17:40:53] ottomata: ^ ?
[17:41:21] I need to make the racking task for chris, and I can see that druid100[123] all exist in different racks and rows. Are these new systems replacing the old, or adding to the pool?
[17:43:08] I'm assuming adding to the pool as the existing druid100[123] are fairly new still
[17:43:11] robh heya
[17:43:14] adding to pool!
[17:43:29] so the more diverse the better, but you could pair them in the same racks, so we'd have 2 in each rack, if you like
[17:43:35] ok, so ideally in different rows (so right now we have in a,c,d, toss one of the new in b, and then in different racks in the others)
[17:43:41] sure ya
[17:43:46] i think we can get by without using the same rack
[17:43:53] but i'll mention it's acceptable if nothing else is available
[17:43:59] k cool
[17:44:00] jessie or stretch? (existing are jessie)
[17:44:00] danke
[17:44:04] jessie
[17:44:09] we need java 7
[17:44:11] cool
[17:44:11] or it doesn't work :/
[17:44:22] gotta upgrade hadoop to java 8 before we can use more stretch
[17:44:27] ottomata: elukey@zk1-1:~$ \p/
[17:44:31] \o/
[17:44:35] the orig!?
[17:44:37] elukey, I checked another table, and I found that clientIp values are still there, so we're not updating in vain :]
[17:44:58] ottomata: yep!
It was suffering from the LDAP outage, puppet broken == no updates to the CA
[17:45:09] elukey, plus the sanitization worked well, because before the timestamp limit, clientIp was NULL and after not.
[17:45:09] * elukey hugs mforns
[17:45:15] hehehe
[17:45:33] mforns: eventlogcleaner /usr/local/bin/eventlogging_cleaner --whitelist /etc/eventlogging/whitelist.tsv --older-than 570 --newer-than 940 --batch-size 100000
[17:45:41] +1?
[17:45:59] elukey, why 570 and 940?
[17:46:03] {'end_ts': '20160102174556', 'batch_size': 10000, 'start_ts': '20141228174556'}
[17:46:14] +2
[17:46:20] * elukey runs the script
[17:46:48] ottomata: oh, when we finish these installs
[17:46:50] hand off to you?
[17:46:52] or someone else?
[17:47:44] Analytics, Analytics-Cluster, Operations, ops-eqiad: rack/setup/install druid100[456].eqiad.wmnet - https://phabricator.wikimedia.org/T171626#3471295 (RobH)
[17:47:56] i listed you, but we can change if needed.
[17:48:37] Analytics, Analytics-Cluster, Operations, ops-eqiad: rack/setup/install druid100[456].eqiad.wmnet - https://phabricator.wikimedia.org/T171626#3471329 (RobH)
[17:58:23] fdans: let me know if you need help with npm issue and CI
[18:06:15] Analytics: Stop collecting Data for pageCreate schema, archive table on hdfs - https://phabricator.wikimedia.org/T171629#3471480 (Nuria)
[18:06:54] robh ya that's fine
[18:07:02] cool
[18:07:11] thank you
[18:08:47] Analytics-Kanban: Archive PageContentSaveComplete in hdfs while we continue collecting data - https://phabricator.wikimedia.org/T170720#3471501 (Nuria)
[18:10:39] Analytics: Stop collecting Data for PageCreation schema, archive table on hdfs - https://phabricator.wikimedia.org/T171629#3471556 (Nuria)
[18:16:28] ottomata: I am able to run the kafka producer without any errors!
[18:16:30] \o/
[18:16:50] tomorrow I'll restart doing more careful tests
[18:16:57] but it looks like I am unblocked
[18:17:17] mforns: let me know if you see any problems on data on mediawiki history
[18:17:30] I was also using the wrong zk endpoint (without the /kafka/mothership2etc..) for the ACLs
[18:17:45] feel better now :)
[18:17:46] thanks ottomata !
[18:17:51] * elukey goes offline for today :)
[18:26:58] nuria_, data looks good as far as I can see
[18:28:35] great elukey :)
[18:28:39] laters!
[18:48:40] mforns: ok merging new spec
[18:48:51] k! +2
[18:49:04] mforns: https://gerrit.wikimedia.org/r/#/c/366327/
[18:49:12] mforns: take a last look, it cannot hurt
[18:49:26] (CR) Mforns: [V: 2 C: 2] Modifying ingestion spec after additions to edit history [analytics/refinery] - https://gerrit.wikimedia.org/r/366327 (https://phabricator.wikimedia.org/T170493) (owner: Nuria)
[18:57:25] milimetric, I'm looking into using router-view's code, if I get stuck I ping you, ok?
[18:57:39] mforns: I looked at it too, it's pretty complicated
[18:57:53] so I'm happy to brain bounce
[18:58:04] I feel like I'm carelessly sending you down a hard path
[18:58:06] k milimetric wanna cave?
[18:58:19] DISCLAIMER: I have no idea how hard this is, it could be bad :)
[18:58:22] k, sure, omw
[19:15:19] Analytics, Analytics-Cluster, Operations, ops-eqiad: rack/setup/install druid100[456].eqiad.wmnet - https://phabricator.wikimedia.org/T171626#3471295 (Cmjohnson)
[19:26:23] Analytics, Analytics-Wikistats: Add piwik to wikistats 2.0 site - https://phabricator.wikimedia.org/T171642#3471959 (Nuria)
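
Note on the 17:45 exchange ("why 570 and 940?"): the interval pasted at 17:46:03 looks like the two day offsets subtracted from the time the command was run. Below is a minimal sketch of that arithmetic, not the actual eventlogging_cleaner code. The function name `purge_interval` is hypothetical, and the run time of 17:45:56 on 2017-07-25 is an assumption inferred from the `174556` suffix of the pasted timestamps.

```python
from datetime import datetime, timedelta

# Timestamp layout of the pasted start_ts / end_ts values (YYYYMMDDHHMMSS).
TS_FORMAT = "%Y%m%d%H%M%S"


def purge_interval(now, older_than_days, newer_than_days):
    """Hypothetical helper: map the two day offsets to (start_ts, end_ts).

    Rows older than `older_than_days` and newer than `newer_than_days`
    fall between now - newer_than_days and now - older_than_days.
    """
    start = now - timedelta(days=newer_than_days)
    end = now - timedelta(days=older_than_days)
    return start.strftime(TS_FORMAT), end.strftime(TS_FORMAT)


# Assumed run time, inferred from the timestamps in the log.
run_time = datetime(2017, 7, 25, 17, 45, 56)
start_ts, end_ts = purge_interval(run_time, older_than_days=570, newer_than_days=940)
print(start_ts, end_ts)  # 20141228174556 20160102174556
```

Under that assumption the sketch gives the same start_ts and end_ts as the pasted dict. It does not model batch_size, which appears as 100000 in the command and 10000 in the dict.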