[09:36:38] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c9 (nuria) Summing up, somehow we are generating requests in the client like: "http://es.m.wikipedia.org/http://es.m.wikipedia.org/wiki/Wikipedia:Portada" which, according to http are valid req... [09:37:08] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371 (nuria) NEW>RESO/INV [09:40:25] Analytics / Wikimetrics: Wikimetrics is not supporting mlwiki cohort - https://bugzilla.wikimedia.org/69462 (nuria) NEW p:Unprio s:normal a:None Attempting to upload an indic language cohort today we discovered that Wikimetrics is not supporting mlwiki cohort - server error for list includi... [09:43:23] hola springle, [09:43:37] can i ask you about labs db and replication lag? [09:46:17] nuria: !ask :) [09:47:00] Is it possible to have a high replication lag (like > 24hrs) between labs db and production? [09:47:15] or rather is that something that has ever happened? [09:47:36] that is possible, but only if something is broken [09:48:37] nuria: yes, it's happened. we've had broken replication due to bugs or user transaction blocking [09:48:43] is that something we should guard against on our applications? or is it too unlikely instance? [09:48:48] i'm not aware of any recently [09:49:03] "too unlikely of an instance" [09:49:43] guard against how? replication is asynchronous. it's always possible that lag can occur, hopefully small but maybe large, and applications need to be aware [09:51:58] We have reports that run every night that would be sensitive to a lag of more than 24 hours, small lag is no issue [09:55:26] springle: what is the best way to automatically monitor the replication lag? I saw we have an event_log table that seems to keep track of replication, is querying that the best way to know replication is OK? [09:55:41] https://git.wikimedia.org/blob/operations%2Fsoftware.git/c57ddf6b82f046f893de8e70bda15e4d57b4ae25/dbtools%2Fevents_labsdb.sql [10:02:49] nuria: ops.event_log table has nothing to do with replication lag. i think most people who care look at timestamps in active tables [10:03:53] springle: ok, so we look at our data rather than look somewhere to see when replication happened, right? [10:04:46] it's possible to grant user accounts access to REPLICATION CLIENT which allows the SHOW SLAVE STATUS command, but to my knowledge that isn't done on labsdbs [10:05:03] we'd have to chat with Coren [10:07:43] springle: is there anything (alarm, script?) ongoing that monitors the replication delay for enwiki, dewiki.. etc on labs? [10:08:50] not presently. there was, but we're halfway through a migration and the new multi-source replication requires new monitoring [10:10:08] nuria: https://icinga-admin.wikimedia.org/cgi-bin/icinga/status.cgi?host=db1053&nostatusheader [10:10:26] that is the sanitarium server (or one of them), which is the labsdb master in the replication tree [10:11:35] if that is all green, chances are high that labsdb is also fine. we have events in place on labsdb watching for replag that should make it hard for users to block [10:12:29] eventually labsdbs will appear with multiple channels like https://icinga-admin.wikimedia.org/cgi-bin/icinga/status.cgi?host=dbstore1002&nostatusheader << that's analytics-store [10:13:03] ok, let me talk to the team and see how they want to monitor this best, if you are to set up alarms will it be OK for our team to receive them?
(just to be informed, we obviously cannot take any action) [10:13:13] hmm, i've passed icinga-admin urls. that may not help you much [10:13:46] i see them cause i have icinga permits from EventLogging [10:13:57] excellent :) [10:14:07] yes, you guys can be notified if you wish [10:17:02] nuria: i think we could expose slave lag in a table. we've recently started information_schema_p on labsdb [10:17:20] springle: that would be excellent for us [10:17:40] as we could query the table and make sure not to run "current day" reports [10:18:01] our daily runs backfill, and yesterday's reports, if left empty, will get backfilled tomorrow [10:18:28] springle: should i write a bug describing the use case to create the table? [10:18:39] sorry, "the use case for which [10:18:58] yes [10:18:58] it will be great to have a table to consult replication lag" [10:19:02] assign to me [10:19:10] ok, what project should it be under? [10:19:23] no idea :) pick something [10:19:42] ok, analytics then [10:24:25] Analytics / General/Unknown: Create a table in labs with replication lag data - https://bugzilla.wikimedia.org/69463 (nuria) NEW p:Unprio s:normal a:None I am creating this bug at the request of springle. It will be very useful to be able to consult replication lag on a table with wide acc... [10:25:24] Analytics / General/Unknown: Create a table in labs with replication lag data - https://bugzilla.wikimedia.org/69463 (nuria) a:Sean Pringle [10:25:41] many thanks again springle for your help [10:28:52] :) [10:29:41] (PS7) Nuria: Removing usage of celery chains from report scheduling [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/150475 (https://bugzilla.wikimedia.org/68840) (owner: Milimetric) [10:35:23] Analytics / General/Unknown: Create a table in labs with replication lag data - https://bugzilla.wikimedia.org/69463#c1 (nuria) We schedule reports by project and i imagine replication will be reported per host, not per project so a global measure of how replication is working on the labs cluster will be... [12:10:17] ohai Ironholds_ [12:10:48] Ironholds_: wanted to let you know that the URL format for Mobile App requests will change super slightly - action=mobileview will come before format=json, but that's it. [12:13:07] (PS6) Yuvipanda: Show only latest run of query in queries list [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153626 [12:13:16] (CR) jenkins-bot: [V: -1] Show only latest run of query in queries list [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153626 (owner: Yuvipanda) [12:23:53] (PS7) Yuvipanda: Show only latest run of query in queries list [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153626 [12:30:54] (CR) Yuvipanda: [C: 2] Show only latest run of query in queries list [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153626 (owner: Yuvipanda) [12:31:02] (CR) Yuvipanda: [C: 2] Move check_sql into QueryRevision model [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153620 (owner: Yuvipanda) [12:31:07] (Merged) jenkins-bot: Move check_sql into QueryRevision model [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153620 (owner: Yuvipanda) [12:31:11] (Merged) jenkins-bot: Show only latest run of query in queries list [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153626 (owner: Yuvipanda) [12:53:08] Analytics / Tech community metrics: Remove severity related graphs from bugzilla_response_time.html - https://bugzilla.wikimedia.org/69179#c1 (Quim Gil) NEW>PATC This should do it.
https://github.com/Bitergia/mediawiki-dashboard/pull/52 [13:04:58] Analytics / Tech community metrics: Gerrit metrics: details about review queues - https://bugzilla.wikimedia.org/58428#c3 (Quim Gil) ASSI>RESO/WON p:Normal>Lowest After using http://korma.wmflabs.org/browser/gerrit_review_queue.html on a daily basis, I think it already offers the informatio... [13:13:38] Analytics / Tech community metrics: Wrong data at "Update time for pending reviews waiting for reviewer in days" - https://bugzilla.wikimedia.org/68436#c6 (Quim Gil) Any idea of what is happening with DataValues? Also, projects like Parsoid, SmashPig, and gerrit.wikimedia.org_integration_docroot appea... [13:21:22] (PS1) Yuvipanda: Use Unicode, not String [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153785 [13:27:00] yoooo [13:27:03] (PS2) Yuvipanda: Use Unicode instead of String and force utf8 connections [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153785 [13:27:03] qchris: :) [13:27:08] ottomata: :-) [13:27:14] (CR) Yuvipanda: [C: 2] Use Unicode instead of String and force utf8 connections [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153785 (owner: Yuvipanda) [13:27:19] (Merged) jenkins-bot: Use Unicode instead of String and force utf8 connections [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153785 (owner: Yuvipanda) [13:27:20] mornin [13:27:27] mornin [13:27:32] do you think it would be better or worse to filter eventlogging by: [13:27:40] a. using grep before awk? [13:27:53] or [13:27:53] b. filtering in the awk scripts? [13:28:00] (some filtering is already done in the awk scripts) [13:28:36] I thought about using grep before awk. But then we'd have to teach grep about the columns (Referer column might get in the way. Unlikely ... but still) [13:28:50] ah, true [13:28:50] Yes, we can filter in the awk scripts. [13:29:00] But it turned out that analytics people do not like awk. [13:29:07] haha, you mean...dan? [13:29:09] So I'd prefer to keep awk usage to a minimum. [13:29:56] hello analytics people! i'm going to bring my sed & awk book for you! [13:29:57] to SF [13:30:15] ottomata: Not sure about him ... but! The fewer languages, the better. [13:30:25] And grep is pretty much everywhere already. [13:30:27] yeah, true [13:30:28] i'm fine with it [13:30:29] cool [13:30:50] awk really is pretty handy though, i'm not sure what out there is better than awk for awk's purpose :p [13:30:57] * qchris likes awk a lot! [13:31:16] Well there is xml + xslt :-) [13:31:40] hah [13:31:43] uh [13:32:07] your statement that that exists is true. [13:32:17] Please ... could someone go "Yes, totally qchris. xml + xslt just rocks!" ? [13:32:37] um, for stream parsing text into fields? [13:33:00] haha, qchris must really like oozie then...:) [13:33:00] Meh. No love for it these days :-) [13:33:12] ottomata: I do! [13:35:57] (PS10) Ottomata: Add Oozie bundle for Icinga monitoring of webrequest datasets [analytics/refinery] - https://gerrit.wikimedia.org/r/152050 (owner: QChris) [13:36:31] qchris: i meant to submit that yesterday, but for some reason didn't! [13:36:56] :-) [13:54:28] ottomata: does Icinga distinguish between a service's name and its description? [13:54:31] So: [13:54:33] hive_partition_webrequest-bits [13:54:35] vs. [13:54:41] Raw webrequest bits data imported into HDFS and Hive. [13:54:41] ? [13:54:59] Because the send_nsca man page says that one should use the description [13:55:15] But we call it 'name' and use 'hive_partition_webrequest-bits' [13:55:22] hm, i believe so...
[13:55:23] https://gerrit.wikimedia.org/r/#/c/151963/3/manifests/role/analytics/refinery.pp [13:55:27] let me check the icinga confs [13:55:52] The 'puppet freshness' service has [13:56:02] uppercase P in puppet in the description name [13:56:13] and the submit_check_result also uses upper case P. [13:56:23] Other than that, I could not find passive checks. [13:56:55] i think you are right [13:57:01] the 'name' here is for puppet only [13:57:02] not for icinga [13:57:24] But that would mean a new patch set for change 151963 [13:57:30] I can merge the refinery part, right? [13:59:09] hm, i think i want to change the name of the argument then [13:59:14] to match send_nsca [13:59:19] Ok. [13:59:24] and change the argument description in oozie to not refer to puppet [13:59:47] No CR+2 then :-( [13:59:57] ha, s'ok, will get that in in a sec [14:12:12] (PS11) Ottomata: Add Oozie bundle for Icinga monitoring of webrequest datasets [analytics/refinery] - https://gerrit.wikimedia.org/r/152050 (owner: QChris) [14:15:44] (PS1) Yuvipanda: Remove QueryRepository, use SQLAlchemy directly [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153794 [14:30:56] (PS1) Yuvipanda: Remove QueryRevisionRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153796 [14:30:58] (PS1) Yuvipanda: Remove QueryRunRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153797 [14:31:00] (PS1) Yuvipanda: Remove UserRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153798 [14:31:02] (PS1) Yuvipanda: Fix NPE when creating a new query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153799 [14:31:04] (CR) jenkins-bot: [V: -1] Remove QueryRevisionRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153796 (owner: Yuvipanda) [14:31:07] (CR) jenkins-bot: [V: -1] Remove QueryRunRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153797 (owner: Yuvipanda) [14:31:11] (CR) jenkins-bot: [V: -1] Remove UserRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153798 (owner: Yuvipanda) [14:31:55] (CR) jenkins-bot: [V: -1] Fix NPE when creating a new query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153799 (owner: Yuvipanda) [14:32:28] (PS2) Yuvipanda: Remove UserRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153798 [14:32:30] (PS2) Yuvipanda: Fix NPE when creating a new query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153799 [14:32:32] (PS2) Yuvipanda: Remove QueryRevisionRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153796 [14:32:34] (PS2) Yuvipanda: Remove QueryRunRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153797 [14:37:53] Analytics / General/Unknown: Create a table in labs with replication lag data - https://bugzilla.wikimedia.org/69463#c2 (nuria) Please note that this table needs to exist on the labs side, not on the production side.
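A minimal sketch of the "timestamps in active tables" approach springle describes above, as the guard nuria wants before the nightly report runs. The host and database names, the choice of recentchanges, and the 24-hour threshold are illustrative assumptions, not the team's actual setup:

```python
# Sketch: estimate labsdb replication lag from the newest recentchanges
# timestamp, per springle's "timestamps in active tables" suggestion.
# Host, database, and threshold are illustrative placeholders.
import datetime
import os
import pymysql  # assumes a MySQL client library is available

MAX_ACCEPTABLE_LAG = datetime.timedelta(hours=24)

conn = pymysql.connect(host='enwiki.labsdb', db='enwiki_p',
                       read_default_file=os.path.expanduser('~/.my.cnf'))
try:
    with conn.cursor() as cur:
        # rc_timestamp is stored in MediaWiki's yyyymmddhhmmss format
        cur.execute("SELECT MAX(rc_timestamp) FROM recentchanges")
        raw = cur.fetchone()[0]
        ts = raw.decode() if isinstance(raw, bytes) else raw
        latest = datetime.datetime.strptime(ts, '%Y%m%d%H%M%S')
finally:
    conn.close()

lag = datetime.datetime.utcnow() - latest
if lag > MAX_ACCEPTABLE_LAG:
    print("lag %s exceeds threshold; skipping current-day reports" % lag)
```

Once the lag table requested in bug 69463 exists in information_schema_p, the same guard would query that table instead of sampling recentchanges.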
[14:55:13] (CR) Yuvipanda: [C: 2] Remove QueryRepository, use SQLAlchemy directly [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153794 (owner: Yuvipanda) [14:55:16] (CR) Yuvipanda: [C: 2] Remove QueryRevisionRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153796 (owner: Yuvipanda) [14:55:19] (Merged) jenkins-bot: Remove QueryRepository, use SQLAlchemy directly [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153794 (owner: Yuvipanda) [14:55:21] (CR) Yuvipanda: [C: 2] Remove QueryRunRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153797 (owner: Yuvipanda) [14:55:23] (Merged) jenkins-bot: Remove QueryRevisionRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153796 (owner: Yuvipanda) [14:55:25] (CR) Yuvipanda: [C: 2] Remove UserRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153798 (owner: Yuvipanda) [14:55:29] (Merged) jenkins-bot: Remove QueryRunRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153797 (owner: Yuvipanda) [14:55:31] (CR) Yuvipanda: [C: 2] Fix NPE when creating a new query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153799 (owner: Yuvipanda) [14:55:33] (Merged) jenkins-bot: Remove UserRepository [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153798 (owner: Yuvipanda) [14:55:38] (Merged) jenkins-bot: Fix NPE when creating a new query [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153799 (owner: Yuvipanda) [15:06:28] hola springle [15:43:20] (PS2) Milimetric: [WIP] Ensure wikimetrics session is always closed [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/153616 (https://bugzilla.wikimedia.org/68833) [15:43:40] (PS2) Milimetric: Fix slow Rolling Active Editor metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) [15:48:00] (PS3) Milimetric: Fix slow Rolling Active Editor metric [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) [15:49:13] (CR) Milimetric: Fix slow Rolling Active Editor metric (1 comment) [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/149482 (https://bugzilla.wikimedia.org/68596) (owner: Milimetric) [15:51:45] YuviPanda: awesome you're talking to Sean about making event logging public, heartfelt +2 [16:15:02] (PS3) Milimetric: [WIP] Ensure wikimetrics session is always closed [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/153616 (https://bugzilla.wikimedia.org/68833) [16:16:26] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c10 (nuria) Need to look at IP ranges as I looked at languages and wikipedias for geographic commonality and that might not be the best. [16:19:46] nuria: I'm working on merging your "removing the chain" patch [16:19:52] k [16:19:57] great thank you [16:20:09] let me know if there is something i should do [16:20:12] I'll upload a new patchset, it's a million times easier than explaining - but feel free to revert, just a sec [16:20:35] (PS8) Milimetric: Removing usage of celery chains from report scheduling [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/150475 (https://bugzilla.wikimedia.org/68840) [16:20:46] nuria: ^ you can take a look [16:21:12] https://gerrit.wikimedia.org/r/#/c/150475/7..8/wikimetrics/schedules/daily.py [16:22:02] milimetric: did you run tests?
cause w/o the signature tests were failing [16:22:21] yeah, the parallel_reports test ran fine [16:23:06] it must've been something else, that signature thing is just another syntax to use, and it's usually used as shorthand when you're passing tasks around [16:23:14] but let me know if tasks don't run for you [16:23:32] can you run manual tests? [16:23:49] yeah, I ran it, I was saying above [16:24:02] I'm running it again, just in case it's nondeterministic :) [16:28:21] milimetric, YuviPanda, are you subscribed to wikidata mailing list? http://lists.wikimedia.org/pipermail/wikidata-l/2014-August/004293.html [16:29:04] i think we should allocate either data:graph: or graph: namespace on commons :) [16:36:12] nuria: btw, the tests run fine still, do they not run for you? [16:37:48] milemtric, let me run them again with patches 7 and 8 [16:37:55] sorry milimetric [16:39:30] they do run milimetric so i guess were good [16:39:45] *we're [16:39:49] k, cool, i'll rebase and merge [16:39:57] (PS9) Milimetric: Removing usage of celery chains from report scheduling [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/150475 (https://bugzilla.wikimedia.org/68840) [16:40:06] (CR) Milimetric: [C: 2] Removing usage of celery chains from report scheduling [analytics/wikimetrics] - https://gerrit.wikimedia.org/r/150475 (https://bugzilla.wikimedia.org/68840) (owner: Milimetric) [16:41:21] woa.... that merged into my session thing WITHOUT ISSUES. What?! :D [16:42:21] yurikR: I'm not subscribed to anything right now, I feel hugely overwhelmed with email [16:42:41] but I agree, a "graph:" namespace on commons would be great [16:43:48] milimetric, hehe, i hear you, agree, this way it will contain lots of embeddable graphs ( ) [16:44:24] or [16:51:41] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c11 (nuria) I take my prior comment back, this looks like a proxy issue, not a client issue. Data below for requests that match "orghttp" in the month of August thus far in zero, mobile and sample... [16:58:47] qchris: oops, meant to ping you here [16:58:56] Ok. That channel it is. [16:59:28] ok so ready? [16:59:28] https://gerrit.wikimedia.org/r/#/c/152050/ [17:00:03] I've been in meetings/away since you uploaded that. Let me have a look again. [17:00:47] k [17:00:57] mostly just change variable name and documentation [17:04:35] (CR) QChris: [C: -1] Add Oozie bundle for Icinga monitoring of webrequest datasets (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/152050 (owner: QChris) [17:05:53] (CR) QChris: Add Oozie bundle for Icinga monitoring of webrequest datasets (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/152050 (owner: QChris) [17:07:53] (PS12) Ottomata: Add Oozie bundle for Icinga monitoring of webrequest datasets [analytics/refinery] - https://gerrit.wikimedia.org/r/152050 (owner: QChris) [17:08:32] (CR) QChris: [C: 2 V: 2] Add Oozie bundle for Icinga monitoring of webrequest datasets [analytics/refinery] - https://gerrit.wikimedia.org/r/152050 (owner: QChris) [17:08:53] Is there anything I can do on the puppet part of it too? [17:11:07] not sure! [17:11:19] Meh. I could only nag on tabs vs. spaces. [17:11:24] The rest I do not understand. [17:11:59] oo that was copy pasted, looks like that file is not consistent anyway :( [17:12:14] :-D [17:12:22] qchris: interested in an explanation? [17:12:47] Sure. Up to now, I think I only have a rough clue. [17:12:54] I'd love to understand more of it.
[17:13:09] so, the monitor_service does some fancy puppet magic to get icinga config files on the icinga host [17:13:25] the config files set up a nagios_service with the provided parameters there [17:13:27] So whenever the freshness is no longer met, the analytics_cluster_data_import-FAIL is run. And oozie updates the freshness, right? [17:13:53] yes, that's right [17:13:56] you got it :) [17:14:15] Oh. Ok. :-) [17:14:44] i'm going to get the oozie part running first [17:14:51] Nope. [17:14:58] We need send_nsca on the data nodes. [17:15:05] That is in the puppet part :-) [17:15:32] (At least it screamed at me when I tried) [17:15:46] qchris: i realized that for location i had checked the language+wiki before but not the IP (i know, retarded) so yes, there is commonality of IPs [17:16:12] oh, yes, hm, send_nsca must be there, true [17:16:16] the icinga stuff doesn't have to be set up though [17:16:19] ok, puppet first [17:16:56] oh, I am not yet including that ::check class anywhere [17:16:56] cool [17:17:04] so we can just merge this and get send_nsca installed [17:17:19] Sounds great :-) [17:18:19] nuria: If you're looking at the /zero/ tsvs, common IPs are somewhat expected, as the traffic (mostly) comes through carrier IPs, or Opera IPs. [17:19:15] nuria: If you want to disambiguate, you can use the X-Forwarded-For header [17:19:18] qchris: but the "overall common" and the "bad" common do not match [17:20:08] ok. [17:26:11] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c12 (nuria) IPs with most issues in zero are not the most used IPs so, again, this points to a proxy issue. [17:29:11] Analytics / Wikimetrics: Wikimetrics can't run a lot of recurrent reports at the same time - https://bugzilla.wikimedia.org/68840#c6 (nuria) We removed chains to simplify and be able to better test our code, the biggest gain on performance however comes from the migration of labs db hosts to maria db.... [18:07:34] yurikR: that's super interesting! [18:07:45] yurikR: I want to get it deployed on meta as well, so the research graphs can use this [18:08:19] YuviPanda|groggy, i don't mean it will only be available on commons - rather this should be the storage spot [18:08:30] yurikR: right. that'll be doubly awesome [18:08:50] but it will not preclude it from running (and storing) data on other wikis [18:08:55] yurikR: although I'm wary of storing laaarge amounts of data with ContentHandler [18:09:12] YuviPanda|groggy, want to +2 a few things? deploying it to prod now (not enabling yet) [18:09:24] w00t sure [18:09:25] link me [18:09:35] YuviPanda|groggy, [18:09:36] https://gerrit.wikimedia.org/r/#/c/153840/ [18:10:12] yurikR: done [18:10:24] YuviPanda|groggy, working on wmf16 patch, sec [18:10:25] yurikR: that just enables branching, right [18:10:35] ?? [18:11:00] the patch I just merged :) [18:11:12] that auto-adds extension to the new branches generated [18:11:21] which means it won't touch 16 [18:11:24] doing it manually [18:11:25] right [18:11:30] but doesn't check them out in prod or anything [18:11:52] yurikR: by deploying it you mean only on zerowiki, right?
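For reference on the passive-check flow ottomata and qchris walk through at 17:13–17:17: an Oozie-launched step refreshes a check's freshness by piping a tab-separated host/service/status/message line into send_nsca. A rough sketch under that assumption; the NSCA host, config path, and monitored host name are placeholders, and per the 13:54–13:59 exchange it is the Icinga service description (not the puppet-only name) that identifies the check:

```python
# Sketch: submit a passive Icinga/Nagios check result through send_nsca,
# roughly what the Oozie monitoring workflow does when a dataset is fresh.
# NSCA host, config path, and host/service strings are placeholders.
import subprocess

def submit_passive_check(monitored_host, service_description, status, message,
                         nsca_host='icinga.example.org'):
    # status codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
    line = '\t'.join([monitored_host, service_description,
                      str(status), message]) + '\n'
    proc = subprocess.Popen(
        ['/usr/sbin/send_nsca', '-H', nsca_host, '-c', '/etc/send_nsca.cfg'],
        stdin=subprocess.PIPE)
    proc.communicate(line.encode('utf-8'))
    return proc.returncode

# e.g. refresh the bits import check, keyed on its description:
submit_passive_check('analytics-worker.example',
                     'Raw webrequest bits data imported into HDFS and Hive.',
                     0, 'dataset partition present')
```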
[18:12:27] YuviPanda|groggy, not even there - it will be on the servers, but not enabled anywhere [18:12:40] makes it much easier to enable it via configs first on beta, etc [18:12:45] ah cool [18:13:18] although maybe i should enable it on zerowiki since that's where i will be playing with it the most at first [18:13:22] will see if i have enough time [18:14:24] yurikR: \o/ cool [18:18:18] (PS1) Yurik: Cleaned up log parsing and filtering [analytics/zero-sms] - https://gerrit.wikimedia.org/r/153845 [18:18:39] (CR) Yurik: [C: 2 V: 2] Cleaned up log parsing and filtering [analytics/zero-sms] - https://gerrit.wikimedia.org/r/153845 (owner: Yurik) [18:47:22] phuedx: around for a bit of idea bouncing? [19:39:26] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c13 (Yuri Astrakhan) Nuria, are you saying one of our proxies is causing this? Or is it some common proxy software that many carriers are using that sets incorrect HOST value when forwarding req... [19:41:56] Analytics / General/Unknown: zero.log contains duplicate host in logs - https://bugzilla.wikimedia.org/69371#c14 (nuria) Well, neither. By looking at the data it looks to be caused by a proxy, but I do not think it is "common" software, as the percentage of data affected seems pretty small. [19:43:49] ottomata: would you be so kind to look at this change: https://gerrit.wikimedia.org/r/#/c/153390/ [19:44:22] corresponding wikimetrics change has been merged: https://gerrit.wikimedia.org/r/#/c/150475/ [19:44:40] (change has been tested on dev) [19:45:55] hm, ok, nuria, this seems like the type of thing you'd want to parameterize and change in role class, no? [19:46:15] by lowering concurrency in the module, you lower the default for all users of the module, independent of whatever environment it is [19:46:36] is MAX_PARALLEL_PER_RUN no longer a proper config? [19:46:43] ottomata: that is the intention as db connection pool is limited [19:47:11] no, we no longer use MAX_PARALLEL_PER_RUN [19:47:18] hmm, ok [19:47:23] now, the setting could be parameterized regardless [19:47:30] i can do that [19:47:37] ottomata, epic sql question: I select ... into outfile, without specifying the directory, and I expected it to end up in stat1003's /tmp. can you point? [19:47:38] i think it is, right? [19:47:56] naw, into outfile operates on the mysql server [19:48:03] unfortunately [19:48:18] so, you won't have access to it that way [19:48:41] I see. so, I have to change my home permission, and write it there? [19:48:41] nuria: sorry, just merged! what more needs parameterized? [19:48:53] naw, you can't use into outfile unless you have access to the mysql server node [19:48:53] ottomata: np [19:48:54] which you don't [19:49:03] (right? double checking...) [19:49:25] ottomata: change works fine, i was just taking your suggestion to make it better [19:49:31] The SELECT ... INTO OUTFILE form (http://dev.mysql.com/doc/refman/5.0/en/select-into.html) of SELECT (http://dev.mysql.com/doc/refman/5.0/en/select.html) writes the selected rows to a file.
The file is created on the server host [19:49:52] nuria, i think you are fine with this change as is, if you intend to change the default everywhere anyway [19:49:58] you need this to remove the MAX_PARALLEL thing anyway [19:49:59] so it's good [19:50:28] leila: read http://dev.mysql.com/doc/refman/5.0/en/select-into.html [19:50:33] scroll down to the part where it says [19:50:35] statement is intended primarily to let you very quickly dump a table to a text file on the server machine [19:50:38] ok ottomata, will send you corresponding module change for puppet [19:50:40] k [19:50:54] "However, if the MySQL client software is installed on the remote machine, you can instead use a client command such as mysql -e "SELECT ..." > file_name to generate the file on the client host." [19:51:32] gotcha. trying. thanks. [20:03:46] ottomata, mysql -e "select *" > file_name results in ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) [20:04:05] is it related to mysql-server being installed or not? [20:04:55] nope, you need to specify the hostname you are trying to connect to [20:05:10] mysql -hs1-analytics... [20:05:12] or whatever it is [21:12:48] (PS1) Yuvipanda: Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 [21:12:53] (CR) jenkins-bot: [V: -1] Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 (owner: Yuvipanda) [21:13:49] (CR) Yuvipanda: [C: 2] Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 (owner: Yuvipanda) [21:13:53] (CR) jenkins-bot: [V: -1] Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 (owner: Yuvipanda) [21:13:57] (CR) Yuvipanda: [C: -2] Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 (owner: Yuvipanda) [21:14:44] (PS2) Yuvipanda: Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 [21:15:25] (CR) Yuvipanda: [C: 2] Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 (owner: Yuvipanda) [21:15:33] (Merged) jenkins-bot: Add user page for each user [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153942 (owner: Yuvipanda) [21:33:56] Analytics / Tech community metrics: Wrong data at "Update time for pending reviews waiting for reviewer in days" - https://bugzilla.wikimedia.org/68436#c7 (Jeroen De Dauw) DataValues still has a copy on Gerrit?!
This stuff was moved to GitHub ages ago https://github.com/DataValues/ [21:35:49] (PS1) Yuvipanda: Add a link to user profile in drop down [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153947 [21:36:20] (PS1) Yuvipanda: Rename 'Query Runs' to Recent Queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153948 [21:37:07] (CR) Yuvipanda: [C: 2] Add a link to user profile in drop down [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153947 (owner: Yuvipanda) [21:37:12] (Merged) jenkins-bot: Add a link to user profile in drop down [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153947 (owner: Yuvipanda) [21:37:14] (CR) Yuvipanda: [C: 2] Rename 'Query Runs' to Recent Queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153948 (owner: Yuvipanda) [21:37:19] (Merged) jenkins-bot: Rename 'Query Runs' to Recent Queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153948 (owner: Yuvipanda) [21:39:55] (PS1) Yuvipanda: Fix bug where you seem to be the person whose profile you're seeing [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153949 [21:40:07] (CR) Yuvipanda: [C: 2] Fix bug where you seem to be the person whose profile you're seeing [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153949 (owner: Yuvipanda) [21:40:13] (Merged) jenkins-bot: Fix bug where you seem to be the person whose profile you're seeing [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153949 (owner: Yuvipanda) [21:43:06] milimetric, hey [21:43:19] hi [21:43:51] so multimedia doesn't show in the zero logs then? [21:45:03] doesn't seem so [21:45:08] (PS1) Yuvipanda: Order queries in profile by recentness of creation [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153950 [21:45:14] i think it's because varnish doesn't do the analysis on it [21:46:09] (CR) Yuvipanda: [C: 2] Order queries in profile by recentness of creation [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153950 (owner: Yuvipanda) [21:46:14] (Merged) jenkins-bot: Order queries in profile by recentness of creation [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153950 (owner: Yuvipanda) [21:46:17] so much self merging [21:46:50] hm, so yurikR this change would affect only zero logs or it would be available everywhere, but the zero partners are interested in the figures? [21:47:06] *affect only zero requests, rather [21:47:09] this change? [21:47:18] as in, the reduced file sizes [21:47:32] it's already public, we have been using it for a month or two [21:47:41] first on smaller partners [21:47:42] right, I know [21:47:50] we basically change HTML to request smaller images [21:47:54] nothing else [21:48:02] but only when it's hit from zero, right? [21:48:18] only when it's a hit from a specific subset of zero partners [21:48:36] we are rapidly increasing that set [21:48:52] unified design is using small images by default [21:50:41] cool, so yurikR I assume you know about the cache log format: https://wikitech.wikimedia.org/wiki/Cache_log_format [21:50:46] and that the reply size is field 7 there [21:51:07] so basically, we have a hairy problem then [21:51:13] a. get zero request with images [21:51:23] b. find images that would be requested [21:51:29] c. find sizes of those [21:51:38] d. find size if they weren't requested through zero [21:51:41] something like that?
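A rough sketch of the per-carrier byte accounting this exchange is circling: with the Cache_log_format linked above, reply size is field 7, and if zero traffic were tagged in X-Analytics (the "group-by zero" idea yurikR raises next), the aggregation collapses to a line scan. The space-delimited split and the X-Analytics field position are assumptions to adjust against the real log layout:

```python
# Sketch: sum reply sizes (field 7 of the cache log format) per
# X-Analytics "zero=" carrier tag, reading log lines from stdin.
# Assumes space-separated fields and X-Analytics as the last field.
import sys
from collections import defaultdict

bytes_by_carrier = defaultdict(int)
requests_by_carrier = defaultdict(int)

for line in sys.stdin:
    fields = line.rstrip('\n').split(' ')
    if len(fields) < 7:
        continue
    try:
        reply_size = int(fields[6])   # field 7: reply size in bytes
    except ValueError:
        continue                      # e.g. '-' when the size is missing
    x_analytics = fields[-1]          # assumed position of X-Analytics
    for kv in x_analytics.split(';'):
        if kv.startswith('zero='):
            carrier = kv[len('zero='):]
            bytes_by_carrier[carrier] += reply_size
            requests_by_carrier[carrier] += 1

for carrier in sorted(bytes_by_carrier):
    print('%s\t%d bytes\t%d requests' %
          (carrier, bytes_by_carrier[carrier], requests_by_carrier[carrier]))
```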
[21:52:03] milimetric, yes hairy :) Ideally we should be marking all traffic, not just multimedia, with X-Analytics tag [21:52:29] in which case we simply group-by zero [21:52:32] right [21:52:36] and divide by image count [21:53:25] otherwise we would have to do a ton of these weird manipulations :) [21:53:38] yep, I don't see a way around it right now [21:54:02] you mean to analyze backwards or to do it at all? [21:54:03] what ori was saying - to just curl and get the size, that would still mean you need to know what images are being requested from zero [21:54:28] i don't see a simple hack is what I mean [21:54:41] yes, i did that analysis once - tons of work, cannot easily repeat the test, and practically useless [21:54:49] yeah, it sucks [21:55:04] let me get bblack [21:55:04] yurikR: let me brainbounce with the europe folks tomorrow morning and we'll see if we can think of something better? [21:55:30] i really think we should start marking all traffic, not just text [21:56:00] ops will be happy [21:56:14] milimetric, another major issue, much more pressing [21:56:19] what do i do with tons of logs [21:56:28] where should i put them on stat1002 [21:56:39] those 4GB you were talking about? [21:57:09] yurikR: ^ [21:57:14] yep [21:57:24] and a small python script to go with it [21:57:39] btw, that python script needs a number of libs, e.g. S3 access [21:57:45] 4GB is really small compared to the stuff on there [21:57:57] location? [21:57:57] so I wouldn't worry too much, /a/ is the usual spot [21:58:01] cron? [21:58:08] root for me [21:58:09] :) [21:58:11] cron would need to be puppetized, do you know where? [21:58:29] is it in a separate puppet repo? [21:58:32] from the main prod? [21:58:43] no, operations/puppet, lemme point you to the spot though, or at least an example [21:59:42] yurikR: https://git.wikimedia.org/blob/operations%2Fpuppet/a64123c2abaa4934c131b2a760bf8619e5d5ea6e/manifests%2Fmisc%2Fstatistics.pp#L246 [21:59:50] but wait, yurikR, what exactly are you doing there? [22:00:06] doing where? :) [22:00:11] i download stats from S3 [22:00:17] like, what's the cron doing, what stats are these, etc [22:00:19] logs from our SMS partner [22:00:22] sorry if you already talked this over [22:00:25] np [22:00:30] and parse them [22:00:59] would labs not be ok to do this? [22:01:10] need high sec [22:01:13] gotcha [22:01:28] well, a lot of people have access to stat1002 though, but they all are NDA [22:01:44] 1) download new files, 2) combine them into one 3) sort/uniq 4) process into stats 5) generate dashboards 6) profit [22:02:05] though, not sure how "high" the sec you need, if our NDA is not enough, you can set permissions restrictively [22:02:20] that's fine, our partner salt+hash phone numbers [22:02:42] and search strings are no different from regular logs [22:02:54] can you reach S3 from stat1002? [22:03:39] hm, i can't ping google, so probably not [22:05:58] milimetric, stat1002 is bolted down from external access? wow [22:06:05] kinda makes sense, but still [22:07:04] yeah... hm... [22:07:20] yurikR: as usual, I think this is a new usecase [22:07:27] these machines are typically locked down like this [22:07:48] DarTar: when you folks have to analyze external data, how do you get it on the stats machines in prod? [22:08:33] hey milimetric: give me a few mins, brb [22:08:35] np [22:09:19] DarTar: yurikR was asking, I have to bounce in a few minutes, he's gotta grab private data and analyze it somewhere. 
So if you guys have a way of doing that, just let him know [22:10:06] np, chat soon :) and milimetric, i'm actively pushing the graph ext out, all the depl patches are ready, pending sec review [22:10:31] so we should start making it pretty [22:10:35] by default [22:10:37] yurikR: worst case, you can use labs and encrypt everything with a secret key. Should be fairly hard to crack that [22:11:04] ok yurikR, do you have examples of graphs that you're going to write on top of it? [22:11:31] maybe make a few pages and point me to them, I'll work on making them as pretty as possible without fancy interaction for now [22:13:26] the way one would do "hover" in vega world is pretty manual: paint a transparent layer over the whole graph, then grab x/y coords on mouseover and do the right thing. For now the simplest way would be to extend the vega spec and have some declarative hover setup that we interpret [22:14:04] milimetric, i'm planning to do a limn dashboard replacement as the very first step [22:14:15] since vega is much easier to host on zerowiki inside a page [22:15:06] milimetric, in reality, the simpler, the better. That's what i kinda liked about limn from the dashboard perspective - throw some data at it, and it looks pretty ) [22:15:16] yurikR: sounds like we'd need a template for the vega json then [22:16:05] when it's deployed, or up somewhere to test, let me know and we'll work on the first graph together [22:16:28] and you can teach me how to make templates and I can fiddle with it until it's prettier [22:19:59] milimetric, are you talking about CSS for it? [22:20:10] because you won't be able to change JS as part of the data [22:20:28] right, no JS, just the vega definition [22:21:19] hey yurikR, milimetric [22:21:37] hi DarTar [22:21:48] milimetric, we could do js as part of the extension [22:22:07] yurikR: mind summarizing the thread? I had to disconnect and missed parts of it [22:22:20] yeah, yurikR, sure, I mean, making the graph interactive will need to be done on the extension [22:22:20] which part - we covered 3 issues :) [22:22:28] making it prettier can be done solely with the Vega definition for now [22:22:39] milimetric, do you have vagrant? [22:22:46] yurikR > when you folks have to analyze external data, how do you get it on the stats machines in prod? [22:23:20] yurikR: yes, I think it would help to have a public hosted wiki of this so we can collaborate [22:23:31] I put one up in labs but porno spammers... [22:23:41] DarTar, right - our external partner stores logs on amazon cloud. I have a python script to pull it and parse it and call it George... I meant generate dashboards [22:24:19] milimetric, that's easy on labs. I had an instance somewhere [22:24:25] will revive it [22:24:45] will email about it later tonight [22:25:00] yurikR: so the question is where to analyze that data in prod? [22:25:07] and store [22:25:17] currently it's on my laptop ) [22:25:25] not good really [22:25:29] how about import it to stat1003 [22:25:32] crunch it there [22:25:51] and if the aggregates can be shared publicly have them rsync'ed to stat1001 [22:25:52] DarTar, does stat1003 have 1) access to outside world 2) python 3) backup?
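A sketch of steps 1–3 of the pipeline yurikR lays out at 22:01 above (download new files from S3, combine them, sort/uniq), assuming the boto S3 library he mentions needing; the bucket name, key prefix, and /a/ paths are placeholders:

```python
# Sketch: pull new partner log files from S3, then combine and dedupe
# them, per steps 1-3 described at 22:01. Bucket, prefix, and local
# paths are placeholders; assumes boto is installed and credentialed.
import os
import subprocess
import boto

conn = boto.connect_s3()  # reads AWS credentials from the environment
bucket = conn.get_bucket('partner-sms-logs')   # placeholder bucket name
local_dir = '/a/zero-sms/incoming'

# 1) download files we have not fetched yet
for key in bucket.list(prefix='logs/'):
    dest = os.path.join(local_dir, os.path.basename(key.name))
    if not os.path.exists(dest):
        key.get_contents_to_filename(dest)

# 2) + 3) combine into one file and sort/uniq it
with open('/a/zero-sms/combined.log', 'w') as out:
    subprocess.check_call('cat %s/* | sort | uniq' % local_dir,
                          shell=True, stdout=out)
```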
[22:26:17] aggregates should have limited sharability - they are public but hidden [22:26:20] sorry guys i gotta bounce, good luck [22:26:30] thx milimetric will ping you later [22:26:40] oh I thought you wanted just to upload it somewhere for the purpose of analysis [22:27:01] so, yes: stat1003 has python [22:27:03] DarTar, that python script does everything including the analysis [22:27:12] and downloading (with the help of s3 lib) [22:28:04] and yes you should be able to fetch data from wherever it lives [22:28:53] in terms of backup, I keep backups of my own code and data or have it in a repo so I don’t know the level of backup support you would need [22:29:18] as to where to host private data, that’s a trickier one [22:29:38] DarTar, host aggregates or logs? [22:29:39] qchris set up a few places in prod that are password protected [22:30:12] I imagine you would like to give selected people access to dashboards or processed/aggregate data, right? [22:30:29] i don't need a password-protected area atm, because soon we will store aggregates on zerowiki [22:30:44] and i will have full control over that [22:31:06] re backups - does stat1003 data get backed up? [22:32:11] public data gets rsynced to stat1001 and there’s a few directories that are also replicated, you should ask the devs about what gets backed up where systematically [22:32:46] thx DarTar, might work. who manages it? [22:33:27] ottomata is the person you want to talk to [22:33:56] yurikR: the rsyncing DarTar's talking about is the piece of puppet i pointed you at above [22:35:22] oh, excellent. I will take a look at that puppet in depth. So it seems stat1003 is a better alternative to stat1002 as it has external access [22:35:46] just need to figure out how to put python libs that i need there, test it and run :) [22:35:56] and do the cron magic (haven't done that yet) [22:39:46] (PS1) Yuvipanda: [WIP] Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 [22:39:51] (CR) jenkins-bot: [V: -1] [WIP] Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 (owner: Yuvipanda) [22:41:47] (PS2) Yuvipanda: [WIP] Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 [22:41:52] (CR) jenkins-bot: [V: -1] [WIP] Let users star and unstar queries [analytics/quarry/web] - https://gerrit.wikimedia.org/r/153963 (owner: Yuvipanda) [23:07:56] Analytics / Wikimetrics: Story: Community has documentation on chosen dashboard architecture and alternatives - https://bugzilla.wikimedia.org/67125 (Kevin Leduc) p:High>Low [23:10:26] Analytics / Wikimetrics: Story: EEVS user does not see reports for projects without databases - https://bugzilla.wikimedia.org/69297 (Kevin Leduc) p:Normal>Highes [23:13:58] Analytics / Wikimetrics: Story: EEVSuser has agregate metrics - https://bugzilla.wikimedia.org/68193 (Kevin Leduc) p:Normal>Low
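Closing the loop on leila's SELECT ... INTO OUTFILE question (19:47–20:05 above): since INTO OUTFILE writes on the database server, the client-side equivalent is mysql -e with an explicit -h host, redirected to a local file. A sketch; the host name and defaults file are placeholders for whatever the analytics slave actually uses:

```python
# Sketch: dump query results to a local file on the stat host via the
# mysql client, per the 19:47-20:05 exchange. Host and defaults file
# are placeholders. Batch-mode output is tab-separated with a header.
import subprocess

query = "SELECT page_id, page_title FROM enwiki.page LIMIT 10"
with open('/tmp/dump.tsv', 'w') as out:
    subprocess.check_call(
        ['mysql', '--defaults-file=/etc/mysql/conf.d/research-client.cnf',
         '-h', 'analytics-store.eqiad.wmnet', '-e', query],
        stdout=out)
```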