[07:31:55] Hi team
[08:12:28] Analytics, Operations, Traffic: varnishkafka statsv and webrequest crashed on cp1081 - https://phabricator.wikimedia.org/T231331 (Nuria) Open→Resolved
[08:12:33] Analytics, Operations, Traffic: varnishkafka statsv and webrequest crashed on cp1081 - https://phabricator.wikimedia.org/T231331 (Nuria) Agreed, closing.
[08:13:08] (PS2) Joal: Correct oozie workflows for error-emails [analytics/refinery] - https://gerrit.wikimedia.org/r/532361 (https://phabricator.wikimedia.org/T228747)
[08:15:25] Good morning fdans :)
[08:15:35] I've reviewed the patch on partitions: it looks good :)
[08:16:30] fdans: I wonder about the dt field though - Didn't we agree on adding it in addition to partitions?
[08:29:16] Analytics, Analytics-EventLogging, EventBus, CPT Initiatives (Modern Event Platform (TEC2)), Core Platform Team Workboards (Clinic Duty Team): Develop a library for JSON schema backwards incompatibility detection - https://phabricator.wikimedia.org/T206889 (Nuria) > but I really think we shou...
[08:54:56] Analytics, Tool-Pageviews: Load media requests data into cassandra - https://phabricator.wikimedia.org/T228149 (Nuria)
[09:09:46] Analytics, Multimedia, Tool-Pageviews: Statistics for views of individual Wikimedia images - https://phabricator.wikimedia.org/T210313 (Nuria) The mediawiki beacon that is measuring pageviews is defined at: https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/InitialiseSett...
[09:14:08] joal: hmmmm I feel like dt makes more sense when there is no counting at all and the table is a list of events (there's probably a name for that but I don't know it)
[09:14:19] like in the event db or in webrequest
[09:15:31] fdans: I disagree - dt informs on time the same as partition, except that it's present inside the data files in a format facilitating datetime management (and also facilitating druid loading :)
[09:17:10] joal: hmmm doesn't dt in event contain an instantaneous time to a seconds granularity though, as opposed to year/month/day/hour?
[09:17:35] fdans: it does, but the dt convention is about the format
[09:17:59] fdans: we could have a dt field for pageview, it would have second = 0 everywhere
[09:18:07] and even minute = 0 too
[09:19:02] ok, adding dt
[09:20:47] Analytics, Multimedia, Tool-Pageviews: Statistics for views of individual Wikimedia images - https://phabricator.wikimedia.org/T210313 (Nuria) From data from 2018-08-28 (sampled 1/128) from 23 million requests for files (to upload.wikimedia.org) about 137K are recorded as views from media-viewer. I...
[09:20:48] fdans: thanks :)
[09:29:42] (PS8) Fdans: Change partition structure to year/month/day/hour. [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817)
[09:36:06] fdans: if you confirm you have tested the query in --^, I'll happily merge :)
[09:36:20] joal: I'm testing now quickly
[09:36:33] fdans: You read my mind :)
[09:44:30] nuria: if by any chance you're around - shall I put back the wikidata file for geoeditors yearly?
[09:56:43] (PS9) Fdans: Change partition structure to year/month/day/hour. [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817)
[09:57:01] ok joal I'm happy -- tested in the cluster
[09:57:20] awesome fdans :) Will merge in a bit
[10:12:30] (CR) Joal: [V: +2 C: +2] "Merging !" [analytics/refinery] - https://gerrit.wikimedia.org/r/532725 (https://phabricator.wikimedia.org/T229817) (owner: Fdans)
[10:14:12] (PS3) Joal: Correct oozie workflows for error-emails [analytics/refinery] - https://gerrit.wikimedia.org/r/532361 (https://phabricator.wikimedia.org/T228747)
[10:15:04] (CR) Joal: [V: +2 C: +2] "merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/532361 (https://phabricator.wikimedia.org/T228747) (owner: Joal)
[10:19:19] (PS4) Joal: Update oozie job for yarn queue to work [analytics/refinery] - https://gerrit.wikimedia.org/r/531682 (https://phabricator.wikimedia.org/T231002)
[10:22:19] (CR) Joal: [V: +2 C: +2] "merging" [analytics/refinery] - https://gerrit.wikimedia.org/r/531682 (https://phabricator.wikimedia.org/T231002) (owner: Joal)
[11:17:39] (PS4) Joal: Update geoditors-yearly oozie job [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655)
[11:18:56] (CR) Joal: [V: +2] "As per Leila request, I added wikidata geoeditors generation." (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/533169 (https://phabricator.wikimedia.org/T215655) (owner: Joal)
[11:59:35] (PS1) Ladsgroup: Track *.jar files as git lfs [analytics/wmde/toolkit-analyzer-build] - https://gerrit.wikimedia.org/r/533507 (https://phabricator.wikimedia.org/T230015)
[13:10:49] when did we switch to this headless chrome testing thing?
[13:10:52] in wikistats
[13:56:37] (PS3) Fdans: Transition data rows to using time ranges instead of timestamps [analytics/wikistats2] - https://gerrit.wikimedia.org/r/531148 (https://phabricator.wikimedia.org/T230514)
[14:54:24] joal one thing i wanted to run by you before we start loading to cassandra
[14:54:56] right now, in addition to the "external" referral value, we have "external (search engine)"
[14:55:05] you think this will be problematic?
[14:56:24] fdans: Do we want to change the parentheses (URL encoding difficulties)?
[14:56:48] except for that, no problem comes to my mind :)
[14:57:05] I need to drop to get the kids - I'll be there at standup
[14:57:10] yeah that's what I was thinking but I don't think parentheses present a problem
[14:57:33] anyway, we can deal with that at the cassandra loading level
[15:17:17] Hi all! Quick question here... What would be the correct way for client-side code to check that eventLogging is installed, since it's considered a soft dependency? Previously we used mw.config.get( 'wgEventLoggingBaseUri' ) to check, but that's broken now...
[15:19:40] Maybe just check that mw.eventLog is there?
[16:03:51] Yo team - standup?
[16:04:18] Ah - no standup actually :)
[16:04:22] Ok :)
[18:13:39] PROBLEM - Disk space on Hadoop worker on an-worker1090 is CRITICAL: DISK CRITICAL - free space: /var/lib/hadoop/data/k 26 GB (0% inode=99%): /var/lib/hadoop/data/e 26 GB (0% inode=99%): /var/lib/hadoop/data/b 25 GB (0% inode=99%): /var/lib/hadoop/data/d 26 GB (0% inode=99%): /var/lib/hadoop/data/g 26 GB (0% inode=99%): /var/lib/hadoop/data/c 26 GB (0% inode=99%): /var/lib/hadoop/data/m 28 GB (0% inode=99%): /var/lib/hadoop/data/i
[18:13:39] 99%): /var/lib/hadoop/data/j 27 GB (0% inode=99%): /var/lib/hadoop/data/h 26 GB (0% inode=99%): /var/lib/hadoop/data/f 26 GB (0% inode=99%): /var/lib/hadoop/data/l 16 GB (0% inode=99%): https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[18:45:14] Analytics: Superset + Turnilo access for Verena Lindner + Raja Gumienny (WMDE) - https://phabricator.wikimedia.org/T231677 (fdans)
[18:46:40] (back)
[19:48:35] PROBLEM - Disk space on Hadoop worker on an-worker1087 is CRITICAL: DISK CRITICAL - free space: /var/lib/hadoop/data/m 26 GB (0% inode=99%): /var/lib/hadoop/data/l 25 GB (0% inode=99%): /var/lib/hadoop/data/b 25 GB (0% inode=99%): /var/lib/hadoop/data/i 26 GB (0% inode=99%): /var/lib/hadoop/data/f 26 GB (0% inode=99%): /var/lib/hadoop/data/j 23 GB (0% inode=99%): /var/lib/hadoop/data/d 26 GB (0% inode=99%): /var/lib/hadoop/data/k
[19:48:35] 99%): /var/lib/hadoop/data/e 26 GB (0% inode=99%): /var/lib/hadoop/data/c 25 GB (0% inode=99%): /var/lib/hadoop/data/g 26 GB (0% inode=99%): /var/lib/hadoop/data/h 16 GB (0% inode=99%): https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[21:53:56] 'active editors' stats seems to include spammers who are blocked, could you please exclude them?
[22:01:11] PROBLEM - Disk space on Hadoop worker on an-worker1087 is CRITICAL: DISK CRITICAL - free space: /var/lib/hadoop/data/m 25 GB (0% inode=99%): /var/lib/hadoop/data/l 24 GB (0% inode=99%): /var/lib/hadoop/data/b 26 GB (0% inode=99%): /var/lib/hadoop/data/i 26 GB (0% inode=99%): /var/lib/hadoop/data/f 26 GB (0% inode=99%): /var/lib/hadoop/data/j 22 GB (0% inode=99%): /var/lib/hadoop/data/d 26 GB (0% inode=99%): /var/lib/hadoop/data/k
[22:01:11] 99%): /var/lib/hadoop/data/e 25 GB (0% inode=99%): /var/lib/hadoop/data/c 25 GB (0% inode=99%): /var/lib/hadoop/data/g 26 GB (0% inode=99%): /var/lib/hadoop/data/h 16 GB (0% inode=99%): https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
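
Note on the dt discussion (09:15 to 09:19): a minimal sketch of how a dt value could be derived from hourly partition values, so that minute and second are always 0 as joal describes. The helper name `partition_to_dt` is hypothetical, and the ISO 8601 UTC string format is an assumption; the log says only that dt "is about the format" and does not spell the format out.

```python
from datetime import datetime, timezone


def partition_to_dt(year: int, month: int, day: int, hour: int) -> str:
    """Build a dt string from year/month/day/hour partition values.

    Hypothetical helper. The output format (ISO 8601, UTC, 'Z' suffix)
    is an assumption, not something the log confirms.
    Minute and second are always 0 because the data is hourly.
    """
    ts = datetime(year, month, day, hour, tzinfo=timezone.utc)
    return ts.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # Example partition: year=2019/month=8/day=30/hour=9
    print(partition_to_dt(2019, 8, 30, 9))
```

Unlike the four separate partition columns, a single string like this lives inside the data files and can be parsed directly as a timestamp, which is the point joal makes at 09:15:31.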