[00:01:25] (CR) jerkins-bot: [V: -1] Update mediawiki-history page reconsruction [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388267 (owner: Joal)
[00:59:31] (PS2) Joal: Rename latest/historical fields in mw-history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265
[01:21:46] (PS3) Joal: Rename latest/historical fields in mw-history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265
[01:24:50] (PS2) Joal: Update mediawiki-history page reconsruction [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388267
[08:21:33] Analytics-Kanban, User-Elukey, cloud-services-team (Kanban): Remove logging from labs for schema https://meta.wikimedia.org/wiki/Schema:CommandInvocation - https://phabricator.wikimedia.org/T166712#3731752 (elukey) p:Triage>Normal a:Nuria
[08:48:41] o/
[08:49:29] joal,fdans - if you have 2 mins would you mind to try to connect to the log db on db1108 via stat100$something to verify that the research account works?
[08:50:58] on it elukey
[08:51:16] thanks! I am sure that works fine but better be sure
[08:51:29] mysql -h db1108.eqiad.wmnet on stat1005 or similar
[08:54:46] elukey: I'm getting ERROR 1044 (42000): Access denied for user 'research'@'%' to database 'db'
[08:55:21] after running mysql -h db1108.eqiad.wmnet
[08:55:27] and 'use db'
[08:59:46] the db is 'log' :D
[08:59:58] but you were able to get to the mysql console right?
[09:01:10] fdans: --^
[09:01:11] omg elukey I haven't had my coffee yet
[09:01:15] ahhahaha
[09:01:32] yes, I was able to get to the console
[09:01:40] \o/
[09:01:42] thanks!
[09:01:53] no prrrrroblem!
[09:59:38] elukey: Heya !
[09:59:54] o/
[09:59:58] elukey: I have been able to log into the mysql host, but I have not been able to connect to the log db :(
[10:00:54] joal: what error do you get?
[10:01:04] elukey: Access denied
[10:01:10] ERROR 1044 (42000): Access denied for user 'research'@'%' to database 'log'
[10:01:19] from stat1005
[10:07:36] I've merged the X-Cache-Status patch we've discussed yesterday
[10:07:57] thanks!
[10:08:06] let me know if anything is weird!
[10:08:55] Ok ema - Just to be sure that I'm correct on expectations: x_cache field doesn't change, and cache_status is now more complete?
[10:09:10] correct
[10:09:14] Great :)
[10:09:58] ema: Do you mind adding a line to https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest#Changes_and_known_problems_since_2015-03-04?
[10:14:36] joal: should I bump record_version?
[10:15:16] ema: Not needed I think - It requires a refinery deploy, and as long as it's documented, it's fine by me :)
[10:16:08] ema: What would be great is to add a comment stating that this change is coming from the raw data and not the refinery process, therefore no record-version bump
[10:16:16] ok!
[10:16:22] Thanks ema :)
[10:20:26] joal: done
[10:21:08] Again many thanks ema :)
[10:21:17] thank you!
[10:21:21] ema - when have you merged it?
[10:21:54] joal: you should be able to connect now
[10:22:01] elukey: testing
[10:22:31] joal: 09:47 utc
[10:22:52] ema: I shall see some different rows for hour 9 :)
[10:22:54] Ok !
[10:22:59] elukey: success !
[10:23:18] \o/
[10:26:33] ema question for you -
[10:27:17] ema: https://gist.github.com/jobar/ef4a1599d270c0a7df068a8a707b6984
[10:27:33] ema: Those are counts per cache_status values for hour 9 on webrequest_misc
[10:27:56] ema: I wonder what "int" and therefore "int-front" values mean?
[10:29:31] joal: that stands for "internally generated", from the point of view of varnish. So it's a response that varnish returned, not the application layer. An example is 301s for TLS redirects
[10:29:42] see curl -I http://en.wikipedia.org | grep -i cache
[10:30:47] ema: makes a lot of sense :)
[10:30:53] ema: Thanks for the explanation!
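A minimal Python sketch of the "int" classification ema describes above: a response counts as internally generated when its cache_status is "int" or starts with "int-". The sample rows, the helper names, and the non-"int" status values ("hit-front", "miss") are assumptions for illustration only; just "int" and "int-front" come from the discussion, and none of this is real webrequest data.

```python
from collections import Counter

# Hypothetical sample rows: (cache_status, http_status, user_agent).
# Invented for illustration; not taken from webrequest_misc.
ROWS = [
    ("int-front", 301, "Java/1.8.0_60-ea"),
    ("int-front", 301, "Java/1.8.0_60-ea"),
    ("hit-front", 200, "Mozilla/5.0"),
    ("miss", 200, "Mozilla/5.0"),
    ("int", 503, "curl/7.52.1"),
]


def is_internal(cache_status):
    """True when varnish itself generated the response ("int" or "int-...")."""
    return cache_status == "int" or cache_status.startswith("int-")


def count_by_status(rows):
    """Count rows per cache_status value."""
    return Counter(status for status, _, _ in rows)


def internal_share(rows):
    """Fraction of rows whose response was internally generated."""
    if not rows:
        return 0.0
    internal = sum(1 for status, _, _ in rows if is_internal(status))
    return internal / len(rows)


if __name__ == "__main__":
    print(count_by_status(ROWS))
    print(internal_share(ROWS))
```

Grouping the same rows by user agent as well would be one way to spot a single client that only ever receives internally generated responses.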
[10:31:23] why does misc have such an insane amount of int responses is interesting though
[10:31:49] see https://grafana.wikimedia.org/dashboard/db/varnish-caching?refresh=15m&panelId=2&fullscreen&orgId=1&var-cluster=misc&var-site=All for an example of where we use those stats
[10:34:15] there's apparently something hammering http://stream.wikimedia.org/socket.io/1/ and those are all TLS redirects
[10:35:56] ema: Do you want me to analyze that our a bit more, or do you go for it yourself?
[10:36:21] joal: looks like those are all requests coming from google compute engine, User-Agent: Java/1.8.0_60-ea
[10:36:52] ema: like, people listening to event-stream that have forgotten to put an 's' at http?
[10:37:03] that's what it looks like indeed :(
[10:37:39] mwarfb
[10:38:14] best thing is that after the redirect they get a 404
[10:38:16] oh man
[10:39:36] Muhahahaha
[10:39:48] excuse me ema - this is kinda funny and not funny
[10:39:54] :)
[10:40:14] ema: wasn't 'int' used to indicate an error state in x-cache? I am probably not remembering correctly
[10:40:58] elukey: int is anything generated by varnish, so yes 503 fetch errors for example would fall into that category too
[10:41:13] ahhhhh
[10:41:16] okok :)
[10:53:17] https://goo.gl/fjtA6D /o\
[10:54:02] :(
[10:54:30] if you add User Agent to the dimensions to split on, Java/1.blah never ever gets anything but 301s
[10:55:20] !log Kill mediawiki-history oozie job to prevent computing october snapshot before fixing reconstruction process
[10:55:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:56:08] that's wrong ema - I wonder if there is anything we should do
[10:57:32] joal: so, this is not causing any functional issues at the moment, but I agree that it's not very cool. I'll discuss it with brandon when he shows up :)
[10:57:51] thanks mate :)
[13:29:36] mforns: o/
[13:29:44] I am running eventlogging_cleaner on db1108
[13:29:51] 100k batches with 2s of sleep
[13:29:58] https://grafana.wikimedia.org/dashboard/file/server-board.json?var-server=db1108&refresh=1m&orgId=1
[13:30:11] doesn't seem that anything is running from the metrics lol
[13:30:47] it is flying
[13:40:45] :D
[13:49:03] (PS1) Joal: Update mw-history page reconstruction (restores) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388444
[13:52:43] Analytics-Kanban: Rename historical fields in mediawiki-history - https://phabricator.wikimedia.org/T179689#3733022 (JAllemandou)
[13:53:00] Analytics-Kanban: Rename historical fields in mediawiki-history - https://phabricator.wikimedia.org/T179689#3733036 (JAllemandou) a:JAllemandou
[13:53:40] Analytics-Kanban: Fix mediawiki history page reconstruction bug (similar timestamps) - https://phabricator.wikimedia.org/T179074#3733040 (JAllemandou)
[13:56:05] Analytics-Kanban: Fix mediawiki-history page reconstruction bug (restores) - https://phabricator.wikimedia.org/T179690#3733045 (JAllemandou)
[13:57:35] Analytics-Kanban: Fix mediawiki-history page reconstruction bug (restores final) - https://phabricator.wikimedia.org/T179692#3733080 (JAllemandou)
[13:57:56] Analytics-Kanban: Fix mediawiki-history page reconstruction bug (restores) - https://phabricator.wikimedia.org/T179690#3733045 (JAllemandou)
[13:58:13] Analytics-Kanban: Fix mediawiki-history page reconstruction bug (restores) - https://phabricator.wikimedia.org/T179690#3733045 (JAllemandou)
[13:58:26] Analytics-Kanban: Fix mediawiki-history page reconstruction bug (restores) - https://phabricator.wikimedia.org/T179690#3733045 (JAllemandou) a:JAllemandou
[13:59:24] (PS4) Joal: Rename latest/historical fields in mw-history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265 (https://phabricator.wikimedia.org/T179689)
[14:00:28] (PS3) Joal: Update mediawiki-history page reconsruction [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388267 (https://phabricator.wikimedia.org/T179074)
[14:01:26] (PS2) Joal: Update mw-history page reconstruction (restores) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388444 (https://phabricator.wikimedia.org/T179690)
[14:01:50] milimetric, mforns --^ I have documented my work in tasks, and provided 3 patches as planned
[14:02:26] I'll be gone from now up to tonight to enjoy end-of-afternoon with family, I'll check with you when I'll be back tonight
[14:02:47] I hope to be able to launch a recomputation later this weekend :)
[14:06:40] (PS2) Joal: Rename latest/historical fields in mw-history [analytics/refinery] - https://gerrit.wikimedia.org/r/388266 (https://phabricator.wikimedia.org/T179689)
[14:07:29] Gone a-team - See you later tonight (have a good weekend elukey, you'll be gone when I'll get back ;)
[14:08:12] joal: you too!
[14:12:09] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3733144 (Ottomata)
[14:12:43] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3731162 (Ottomata)
[14:12:51] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3731162 (Ottomata)
[14:31:44] Analytics, Analytics-EventLogging, Operations, Ops-Access-Requests: Requesting Sharvani Haran to be added to researchers group - https://phabricator.wikimedia.org/T179611#3730724 (herron) Hello, membership to group `researchers` would provide access to `stat1006` but not `stat1005`. Could you ple...
[14:36:25] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3733192 (Ottomata)
[14:38:18] Analytics, Analytics-EventLogging, Operations, Ops-Access-Requests: Requesting Sharvani Haran to be added to researchers group - https://phabricator.wikimedia.org/T179611#3730724 (Ottomata) deployment-eventlog02 access is handled by cloud services and/or release engineering folks, one of them can...
[15:58:25] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Make Qgil a fallback for Bitergia access (lock-in) - https://phabricator.wikimedia.org/T178381#3733460 (Qgil) I got the permissions in the GitHub repo and now I have an account in GitLab: https://gitlab.com/qgil
[16:02:46] Analytics-EventLogging, Analytics-Kanban, Patch-For-Review: Resolve EventCapsule / MySQL / Hive schema discrepancies - https://phabricator.wikimedia.org/T179625#3733475 (Ottomata)
[16:24:17] elukey you froze
[16:24:35] no worries since it was the end of our meeting
[16:25:01] nuria_: hangouts hates me for some reason
[16:25:03] :(
[16:25:12] elukey: no worries see ya monday
[16:25:21] all right thanks! Have a good weekend :)
[16:32:17] nuria_: do you know if tilman is using userAgent for Popups analysis/experiment?
[16:32:28] ottomata: yes
[16:32:35] ottomata: they are
[16:32:50] hmm ok
[16:32:58] this isn't easy to do in hive, actually, mforns might know more about this than me
[16:33:02] sorry
[16:33:04] in spark*
[16:33:13] i had a lot of trouble modifying columns directly in dataframes
[16:33:19] mforns: got a few mins for brain bounce on this?
[16:33:30] yesss
[16:33:32] bc
[16:49:32] (CR) Milimetric: [C: -1] Update mediawiki-history page reconsruction (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388267 (https://phabricator.wikimedia.org/T179074) (owner: Joal)
[17:02:44] (CR) Milimetric: "looks good without running the tests. Just one suggestion to use "mostRecent" instead of "current"" (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265 (https://phabricator.wikimedia.org/T179689) (owner: Joal)
[17:04:08] (CR) Milimetric: [V: 2 C: 2] Rename latest/historical fields in mw-history [analytics/refinery] - https://gerrit.wikimedia.org/r/388266 (https://phabricator.wikimedia.org/T179689) (owner: Joal)
[17:06:25] elukey: (d1108) sounds exciting! i just connected to it and ran a first small query. the difference wasn't big (38.07 sec vs. 47.65 sec earlier on analytics-store), but i'll make sure to try out with larger queries too
[17:11:22] HaeB: thanks! I am running the eventlogging_cleaner script now (issuing batches of 100K updates every 2s) so it might be a bit slower
[17:12:09] (and afaics it is flying for those updates)
[17:22:28] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Make Qgil a fallback for Bitergia access (lock-in) - https://phabricator.wikimedia.org/T178381#3733702 (Aklapper) @Qgil: Thanks. I've asked Bitergia to add https://gitlab.com/qgil to the Gitlab group. After that has happened, you can...
[17:24:06] (CR) Milimetric: [C: 2] "Just a question on a comment that wasn't entirely clear, so +2 in general." (2 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388444 (https://phabricator.wikimedia.org/T179690) (owner: Joal)
[17:44:18] Analytics, Analytics-EventLogging, Operations, Ops-Access-Requests: Requesting Sharvani Haran to be added to researchers group - https://phabricator.wikimedia.org/T179611#3733790 (Sharvaniharan) @Ottomata access for only "researchers" will be sufficient for now. Thank you for updating the ticket...
[17:50:38] (CR) Fdans: Add central notice component and detect adblock (3 comments) [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383798 (https://phabricator.wikimedia.org/T177491) (owner: Fdans)
[17:50:46] (PS3) Fdans: Add central notice component and detect adblock [analytics/wikistats2] - https://gerrit.wikimedia.org/r/383798 (https://phabricator.wikimedia.org/T177491)
[17:53:02] hey ottomata. I can't find the documentation for sqoop, I'm looking at https://wikitech.wikimedia.org/wiki/Analytics/Systems/Data_Lake/Edits/Pipeline/Data_loading but that's probably not the right place.
[17:54:48] leila: nuria wrote this doc a while ago about transferring el tables to hdfs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Administration#Dumping_data_via_sqoop_from_eventlogging_to_hdfs
[17:54:59] not sure if it helps :)
[17:55:34] it does, elukey. thanks! :)
[18:04:59] leila: note that it still has problems preserving data types https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging#Hadoop._Archived_Data
[18:06:57] * elukey off!
[18:31:45] ah, thanks HaeB
[19:06:11] lzia: the type issue depends on types on incoming data, if you import from mw it shall be easier
[19:07:53] got it nuria_. miriam, ^
[19:08:49] :) thanks leila and nuria
[19:11:02] :) thanks lzia and nuria_
[20:36:29] Analytics, Analytics-EventLogging: Timestamp format in Hive-refined EventLogging tables is incompatible with MySQL version - https://phabricator.wikimedia.org/T179540#3734357 (Tbayer) >>! In T179540#3730832, @Ottomata wrote: >> As documented in https://meta.wikimedia.org/wiki/Schema:EventCapsule , EventL...
[20:44:07] Heya milimetric - still here?
[20:45:31] hey joal
[20:45:33] cave?
[20:45:52] yes milimetric !
[20:58:08] checked, everything looks good joal
[20:58:25] at least from the big picture size, it's grown as much as it has in the past
[20:58:45] awesome milimetric :)
[20:59:04] milimetric: will provide patches, if you can + that's great ;)
[20:59:27] (PS5) Joal: Rename latest/historical fields in mw-history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265 (https://phabricator.wikimedia.org/T179689)
[20:59:42] (CR) Joal: Rename latest/historical fields in mw-history (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265 (https://phabricator.wikimedia.org/T179689) (owner: Joal)
[21:00:48] joal, you still there???
[21:00:54] Hey mforns :)
[21:00:58] hey!
[21:01:01] mforns: It's my night shift ;)
[21:01:06] can you cave for a min?
[21:01:32] hehe
[21:01:40] mforns: I can :)
[21:03:15] oh joal on the current -> mostRecent change, I meant the variables in the code too, sorry for the confusion
[21:03:24] like currentBlocks -> mostRecentBlocks
[21:03:34] milimetric: weird I didn't find them - will repost !
[21:03:48] joal: you might have looked for word boundaries?
[21:03:59] probably milimetric :)
[21:14:08] milimetric: I checked, all other instances of "current" were used as "the one thing currently being worked" - It's good we updated the 2 that were different :0
[21:14:13] milimetric: Thanks for catching !
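A small Python sketch of the word-boundary pitfall milimetric suggests above: a whole-word search for "current" does not match a camelCase identifier such as currentBlocks, because there is no word boundary between "current" and "Blocks". The sample source text and the function names are hypothetical; only the currentBlocks -> mostRecentBlocks rename comes from the discussion.

```python
import re

# Hypothetical snippet standing in for the real refinery source code.
SOURCE = (
    "val currentBlocks = blocks.filter(_.isCurrent)\n"
    "// the current state being processed\n"
)


def find_whole_word(text):
    """Whole-word search: misses camelCase identifiers like currentBlocks."""
    return re.findall(r"\bcurrent\b", text)


def find_prefixed_identifiers(text):
    """Find identifiers that start with 'current' followed by an uppercase letter."""
    return re.findall(r"\bcurrent[A-Z]\w*", text)


def rename_prefixed_identifiers(text):
    """Rename currentXyz -> mostRecentXyz, leaving the plain word 'current' alone."""
    return re.sub(r"\bcurrent([A-Z]\w*)", r"mostRecent\1", text)


if __name__ == "__main__":
    print(find_whole_word(SOURCE))
    print(find_prefixed_identifiers(SOURCE))
    print(rename_prefixed_identifiers(SOURCE))
```

As joal notes, not every "current" carries the "most recent" meaning, so candidates found this way would still need a manual check before renaming.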
[21:14:56] (PS6) Joal: Rename latest/historical fields in mw-history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265 (https://phabricator.wikimedia.org/T179689)
[21:17:32] (CR) Milimetric: [V: 2 C: 2] Rename latest/historical fields in mw-history [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388265 (https://phabricator.wikimedia.org/T179689) (owner: Joal)
[21:18:32] (PS4) Joal: Update mediawiki-history page reconstruction [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388267 (https://phabricator.wikimedia.org/T179074)
[21:18:55] (CR) Joal: "Done !" (3 comments) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388267 (https://phabricator.wikimedia.org/T179074) (owner: Joal)
[21:23:44] (PS3) Joal: Update mw-history page reconstruction (restores) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388444 (https://phabricator.wikimedia.org/T179690)
[21:24:02] (CR) Joal: Update mw-history page reconstruction (restores) (1 comment) [analytics/refinery/source] - https://gerrit.wikimedia.org/r/388444 (https://phabricator.wikimedia.org/T179690) (owner: Joal)
[22:58:21] Analytics-Tech-community-metrics, Developer-Relations (Oct-Dec 2017): Merge ~1350 duplicated Phab accounts in the Bitergia DB - https://phabricator.wikimedia.org/T179745#3734718 (Aklapper)
[23:24:48] Analytics: Reading Common Crawl data from hadoop / webproxy performance - https://phabricator.wikimedia.org/T179748#3734765 (EBernhardson)