[00:19:13] nuria: yt
[00:19:28] * springle looks at clock
[00:19:33] probably not :)
[01:12:13] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1072393 (Deskana) >>! In T89019#1063749, @Nuria wrote: > Have you instrumented your code? No. We're reaching out first because we want...
[01:37:35] Analytics-EventLogging: Verify that each existing table corresponds to its schema definition - https://phabricator.wikimedia.org/T91031#1072639 (ori) NEW
[01:41:00] Analytics-EventLogging: Verify that each existing table corresponds to its schema definition - https://phabricator.wikimedia.org/T91031#1072678 (ori)
[02:03:44] springle: ta-ta-channnnn I am back!
[02:04:02] springle: are we ready to switch the db box?
[02:04:06] cc ori
[02:04:16] it's done
[02:04:50] :)
[02:05:40] nuria: too slow ;)
[02:08:46] springle: niceeee
[02:08:54] also ori: niceeee
[02:09:21] springle: did ori do any sanity checking on EL?
[02:10:38] springle: ah yes, he did, i see now the ticket with the null fields
[02:11:06] springle: should i still use this page to monitor db?
[02:12:04] nuria: this page?
[02:12:09] xhttps://tendril.wikimedia.org/host/view/db1020.eqiad.wmnet/3306
[02:12:12] sorry
[02:12:16] https://tendril.wikimedia.org/host/view/db1020.eqiad.wmnet/3306
[02:12:38] s/db1020/db1046/
[02:13:45] springle: ok, is db1046 the master now?
[02:13:57] correct
[02:15:44] springle: and what host do we use to access the db to query? we used to do >mysql --defaults-extra-file=/etc/mysql/conf.d/research-client.cnf --host db1020.eqiad.wmnet -e "select * from log.MobileWebUIClickTracking_10742159 where uuid='ac4855a8b2d5592f9f6d3dde95f0083e'";
[02:16:01] springle: is there another slave we query from?
[02:16:26] what sort of queries?
[02:17:27] just simple selects like that, or ad-hoc bigger stuff?
[02:17:55] simple record lookups you can "--host db1046.eqiad.wmnet"
[02:18:27] larger/slower/ad-hoc queries i advise keeping off the master; use analytics-store
[02:19:53] springle: from stat1003 I get: Access denied for user 'research'@'10.64.36.103' (using password: YES)
[02:20:55] springle: when running the above query
[02:21:19] 'research' user has not had access to the EL master since db1047 (a year?)
[02:22:11] neither db1020 nor db1046 have a 'research' grant. are we sure this worked?
[02:25:54] springle: this was working yesterday:
[02:26:05] springle: mysql --defaults-extra-file=/etc/mysql/conf.d/research-client.cnf --host db1020.eqiad.wmnet -e "select * from log.MobileWebUIClickTracking_10742159 where uuid='ac4855a8b2d5592f9f6d3dde95f0083e'"
[02:26:17] springle: from1003
[02:26:52] stat1003 that is
[02:28:23] nuria: we're missing something... db1020 has not changed since yesterday except to have EL traffic moved. in fact it still has a copy of EL data. it definitely does not have a 'research' user
[02:28:32] * springle goes to check admin log
[02:28:54] springle: ah SORRY, i mistyped
[02:29:05] --host dbstore1002.eqiad.wmnet
[02:29:12] ah :)
[02:29:17] that will still work
[02:29:34] springle: SORRY big time
[02:29:35] you could also "--host analytics-store.eqiad.wmnet"
[02:29:50] analytics-store is a CNAME for dbstore1002
[11:58:17] Analytics, Wikidata: active user statistics that have less lag than wikistats - https://phabricator.wikimedia.org/T88121#1073209 (JanZerebecki) The previous query was a bogus result, because the data is usually only kept 2 days in that table.
[13:18:03] Analytics, Wikidata: active user statistics that have less lag than wikistats - https://phabricator.wikimedia.org/T88121#1073309 (JanZerebecki) $ date -d '-30days' --iso 2015-01-28 $ date -d '-1days' --iso 2015-02-26 [wikidatawiki]> select count(*) as c, rc_user_text from recentchanges where rc_timestamp...
[13:27:23] Analytics, Wikidata: active user statistics that have less lag than wikistats - https://phabricator.wikimedia.org/T88121#1073317 (Lydia_Pintscher) Open>Resolved a:Lydia_Pintscher Sweet. Thanks!
[13:32:02] Analytics, Wikidata, § Wikidata-Sprint-2015-02-25: active user statistics that have less lag than wikistats - https://phabricator.wikimedia.org/T88121#1073320 (JanZerebecki)
[14:57:18] (PS8) Mforns: [WIP] [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251)
[14:57:30] (CR) jenkins-bot: [V: -1] [WIP] [analytics/limn-mobile-data] - https://gerrit.wikimedia.org/r/192319 (https://phabricator.wikimedia.org/T89251) (owner: Mforns)
[15:06:53] Analytics-Kanban: Adhoc Analysis: Guided Tour activations - https://phabricator.wikimedia.org/T90942#1073439 (Milimetric) a:Milimetric>Ironholds
[15:45:07] Analytics-Volunteering, Engineering-Community, Phabricator, Project-Creators, and 3 others: Analytics-Volunteering and Wikidata's Need-Volunteer tags; "New contributors" vs "volunteers" terms - https://phabricator.wikimedia.org/T88266#1073595 (Aklapper)
[16:02:07] (PS5) Milimetric: Analyze edit success rate by user type [analytics/limn-edit-data] - https://gerrit.wikimedia.org/r/192944 (https://phabricator.wikimedia.org/T89729)
[17:18:56] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1073774 (Nuria) Questions. Schema related: >I could explain here, but the schema itself documents that: https://meta.wikimedia.org/wik...
[17:31:36] ottomata: have a minute?
[17:42:25] yes!
[17:42:50] nuria: ^
[17:43:10] ottomata: would you happen to know whether search requests also come to the cluster?
[17:46:08] they do not
[17:46:19] well
[17:46:22] i take that back
[17:46:28] they do not from the cirrus or elasticsearch
[17:46:28] ottomata: aham...
[17:46:30] yes?
[17:46:38] but, the http requests to cirrus search should
[17:46:41] so i think you should be able to find them
[17:46:55] i think anyway
[17:47:01] not certain of that
[17:47:17] ottomata: are there search logs elsewhere?
[17:47:50] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1073880 (Nuria) Summing up IRC conversation: Looks like @Deskana is interested in two things: 1) How our users use search 2) number of...
[17:48:10] hm, they used to.
[17:48:13] ah yes
[17:48:16] they did from lsearchd
[17:48:20] not from new search
[17:58:32] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1073937 (Nuria) Using EL client side to estimate search pageviews you will be missing: 1. Bot searches 2. Non js enabled,capable clien...
[18:00:27] Analytics-Cluster, Analytics-Kanban, operations: Upgrade Analytics Cluster to Trusty, and then to CDH 5.3 - https://phabricator.wikimedia.org/T1200#1073949 (kevinator)
[18:09:38] Analytics-EventLogging, Analytics-Kanban, Wikimedia-Search: Estimate maximum throughput of Schema:Search (capacity) {oryx} - https://phabricator.wikimedia.org/T89019#1074015 (Nuria) To sum up: measuring pageviews with EL is not effective cause you need to instrument all clients to get a somewhat accur...
[18:19:57] Analytics, Analytics-Kanban: Review and clean up JS on new graphoid service - https://phabricator.wikimedia.org/T90147#1074090 (kevinator)
[18:58:48] (PS1) QChris: Adjust legacy_tsv description in status dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/193410
[18:58:50] (PS1) QChris: Add mediacounts to status dump script [analytics/refinery] - https://gerrit.wikimedia.org/r/193411
[18:58:52] (PS1) QChris: Collapse caption lines in table of dump script, if they agree [analytics/refinery] - https://gerrit.wikimedia.org/r/193412
[19:01:08] ottomata: I think all the items that blocked announcement of the mediacounts files are dealt with. They are still backfilling, but are there any issues/blockers you want to see addressed before I discuss with ezachte on how he wants them announced?
[19:10:08] hm, nope, go for it!
[19:10:52] k. Thanks.
[19:11:45] nuria: going to run to the post office, back in time for meeting fingers crossed.
[19:12:16] i've gone postal!
[19:12:19] https://en.wikipedia.org/wiki/Going_postal
[19:12:29] ottomata|postal: k
[19:55:14] (Abandoned) OliverKeyes: (WIP) project class/variant extraction UDF [analytics/refinery/source] - https://gerrit.wikimedia.org/r/188588 (owner: OliverKeyes)
[19:57:25] mforns: you around?
[19:57:29] yep
[19:57:36] quick chat in the cave?
[19:57:38] we have meeting in 3 mins
[19:57:41] sure
[20:25:59] Analytics-Kanban: Failure Types by User Type - https://phabricator.wikimedia.org/T91123#1074634 (Milimetric) NEW a:Milimetric
[20:26:01] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on project refinery-core: Execution default-compile of goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile failed: Plugin org.apache.maven.plugins:maven-compiler-plugin:3.2 or one of its dependencies could not be resolved: The following artifacts could not be resolved: org.codehaus.plexus:plexus-utils:jar:1.5.1, org.apache.maven.shared:maven-shared-utils:jar:0.1, org.apache.maven.shared:maven-shared-incremental:jar:1.1, com.google.collections:google-collections:jar:1.0, junit:junit:jar:3.8.2: Failure to find org.codehaus.plexus:plexus-utils:jar:1.5.1 in http://archiva.wikimedia.org/repository/mirrored/ was cached in the local repository, resolution will not be reattempted until the update interval of wmf-mirrored has elapsed or updates are forced
[20:26:09] wtf?
[20:29:33] uh... archiva problem? classpath? I'm just saying java words...
[20:29:35] I have no idea
[20:29:37] * milimetric runs away
[20:30:53] Analytics, Engineering-Community, ECT-February-2015, ECT-March-2015: Analytics Team Offsite - Before Wikimania - https://phabricator.wikimedia.org/T90602#1074648 (Rfarrand)
[20:35:50] Analytics-Cluster, Analytics-Kanban, Scrum-of-Scrums: Create Daily & Monthly pageview dump with country data - https://phabricator.wikimedia.org/T90759#1074660 (Nuria) I am not sure we will be able to provide content size. Doesn't seem like we would. Please note that this format does not include countr...
[20:36:04] hah. nuria?
[20:36:20] Ironholds: yes
[20:36:28] any idea about the error ^
[20:36:51] Ironholds: ya, something is missing on archiva
[20:37:14] Ironholds: blow up your local cache just in case (~/.m2)
[20:37:24] and try to re-download all deps
[20:38:04] Ironholds: if you get the same error it means a new dep is needed on archiva ( org.codehaus.plexus:plexus-utils:jar:)
[20:38:58] hmn; ta
[20:39:32] Ironholds: makes sense?
[20:40:13] yep; I'll try.
[20:41:17] nuria, yeah, no luck
[20:44:59] Ironholds: ottomata is your man (unless you added that dep yourself)
[20:46:17] junit?
[20:46:22] naw
[20:46:24] ottomata, ping? ;p
[20:46:33] I will just keep going around engineers until someone knows what's up.
[20:49:06] Ironholds: yo
[20:49:33] ottomata, error above, any idea?
[20:49:53] lemme seee...
[20:50:10] what are you compiling? did you add a new dependency?
[20:50:29] refinery-source, a fresh download from gerrit
[20:50:49] ha, we have many versions of plexus-utils mirrored
[20:50:51] but not 1.5.1 :p
[20:50:51] http://archiva.wikimedia.org/#basicsearch/plexus-utils
[20:50:55] really?
[20:50:56] hm
[20:51:54] rm -rf ~/.m2/org/codehaus
[20:51:56] and try again now.
[20:52:33] Ironholds: ^
[20:52:34] same error
[20:52:37] ?
[20:52:45] wait, do it one more time, with the remove, i'm watching some logs
[20:52:55] what?
[20:53:16] rm -rf ~/.m2/org/codehaus
[20:53:23] rm -rf ~/.m2/org/codehaus; mvn test
[20:53:26] have now run twice, no dice.
[20:53:30] oh
[20:53:31] sorry
[20:53:36] rm -rf ~/.m2/repository/org/codehaus
[20:53:58] same error
[20:54:04] nuh uh
[20:54:29] wait, no, you're right
[20:54:37] we have
[20:54:37] http://archiva.wikimedia.org/#artifact/org.codehaus.plexus/plexus-utils/1.5.1
[20:54:38] :)
[20:55:00] you good? Ironholds?
[20:55:06] now it's only claiming all the other dependencies can't be met :D
[20:55:15] blast ~/.m2/repository then
[20:55:18] and redownload everything
[20:55:26] rm -rf ~/.m2/repository
[20:55:47] ha, actually we need to update CDH dependencies to 5.3.0 :p
[20:56:00] you are lucky i'm actually doing archiva stuff right now anyway :)
[21:01:54] ottomata, yay!
[21:09:39] ottomata, it works!
[21:10:55] cool
[21:15:25] (PS1) OliverKeyes: Exclude edit attempts from being counted as pageviews. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193486
[21:15:49] ottomata, the result of that: ^
[21:19:31] Ironholds: i'm sure you are, but just checking, you keeping the wiki definition in sync with the implementation?
[21:20:25] not yet; I'll make the changes when I've done all of the hand-coding so I can log it all at once
[21:20:29] at which point it becomes yinz problem
[21:21:05] ok
[21:21:15] can you also update changelog.md for the 0.0.8-SNAPSHOT
[21:21:22] OH! i forgot to update that last time I deployed
[21:21:44] where it says 0.0.7-SNAPSHOT, remove the -SNAPSHOT part, and then add a new section above for 0.0.8-SNAPSHOT
[21:21:57] that way we can keep track of logical changes to each version
[21:22:17] like, "when did the pageviews stop counting edits? oh, 0.0.8, cause the changelog says so"
[21:22:17] sure
[21:22:19] danke
[21:22:26] put that in there and you will get +2!
[21:22:39] hmmmm
[21:22:42] actually>>....>
[21:22:48] i think you don't need a regex for that, no?
[21:23:02] (PS2) OliverKeyes: Exclude edit attempts from being counted as pageviews. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193486
[21:23:05] done
[21:23:10] ottomata, action=edit?
[21:23:12] http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#contains(java.lang.CharSequence)
[21:23:16] nope, but I'm optimising for the future
[21:23:16] it isn't a regex
[21:23:18] just a static string
[21:23:20] hmm
[21:23:28] the future being "some other moron will make a change like this and not tell us"
[21:23:30] ok
[21:23:50] but I've been told my code isn't up to engineering standards anyways, so if you want to change that it's your call.
[21:24:02] Ironholds: if you would please, replace v0.0.7-SNAPSHOT with v0.0.7
[21:24:11] 0.0.7 has already been released
[21:24:16] and make yours say
[21:24:19] v0.0.8-SNAPSHOT
[21:24:32] This is my fault for not changing this the last time I released
[21:25:34] sure
[21:25:42] what does SNAPSHOT mean in this context?
[21:26:56] it's a maven convention, i think
[21:27:00] we can actually release snapshots
[21:27:05] with the same version number
[21:27:14] it's kinda like WIP of this version, but maven convention
[21:27:26] we keep -SNAPSHOT on the version name until it is released
[21:35:16] gotcha
[21:35:21] so, it's WIP but only for Maven?
[21:35:40] let me reiterate how Maven's elevator pitch can only honestly consist of "!Ant" ;p
[21:36:03] (PS3) OliverKeyes: Exclude edit attempts from being counted as pageviews. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193486
[21:36:06] done and done
[21:50:29] (CR) Ottomata: [C: 2 V: 2] Exclude edit attempts from being counted as pageviews. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193486 (owner: OliverKeyes)
[21:50:31] gotcha :)
[22:06:35] ottomata: if you have 10 min to chat some time today, please ping me. thanks!
[22:07:34] now is good
[22:07:35] leila:
[22:07:36] batcave?
[22:07:50] great. actually, the background noise is too much on my end. here is fine.
[22:07:52] if you don't mind
[22:08:05] ottomata, ^
[22:08:06] sure
[22:08:30] re Ashwin's access: given the group he is in now, can he have access to stat1002 databases?
[22:09:32] what are stat1002 databases?
[22:10:06] sorry. analytics-store
[22:10:37] basically, where xxwiki databases are, ottomata.
[22:11:59] ah
[22:12:12] um, technically yes, but he would need research db password
[22:12:20] which he is supposed to get by being in the researchers group
[22:13:58] I see, ottomata. so, I guess there are two options: giving him the research password, or changing his group. With my limited understanding I think the latter is better, given that we don't make extra special cases we need to keep track of. what do you think?
[22:14:21] ja, he needs added to that group
[22:14:24] that is the right thing to do
[22:14:32] which will probably require the 3 day access thing :/
[22:14:46] got it, ottomata. he has already been through the 72 hour thing once
[22:15:18] should I make a card for this, ottomata?
[22:15:26] or this should happen on your end?
[22:15:29] a phab ticket yeah, the usual access request process
[22:15:33] naw, gotta do it that way :/
[22:16:07] okay. I'll make a request. Toby has already approved this before, will we need his approval again?
[22:18:10] yup
[22:20:26] ottomata: just to make sure, we're asking for analytics-privatedata-users?
[22:20:31] no
[22:20:32] researchers
[22:20:50] you just want him to access the mysql research slaves, right?
[22:21:45] yes
[22:22:04] k ja researchers
[22:23:42] leila, that will give him access to stat1003
[22:23:46] and to read the file
[22:23:50] /etc/mysql/conf.d/research-client.cnf
[22:24:29] I see, ottomata. how can he move data from there to the cluster then, ottomata?
[22:25:04] I'm thinking not having everything in one place may become a problem?
[22:28:06] hmmMMM
[22:28:09] he wants to sqoop, eh?
[22:28:09] hm
[22:28:19] this is so annoying! ha
[22:28:20] hm.
[22:28:30] sqoop, or extract there, and write map-reduce?
[22:28:39] yeah um, no i see the issue. um
[22:28:57] I'm leaning towards Bob-level access.
[22:29:07] this is sounding more and more like we should have a hadoop client on stat1003.
[22:29:08] yes
[22:29:21] leila, i think toby already approved anyway, so you should go for analytics-privatedata-users access then
[22:29:24] there is a mysql file that is readable
[22:29:30] or
[22:29:31] hm
[22:29:36] I hear you, ottomata.
[23:10:12] hey aaron
[23:10:49] dammit, otto has gone home :(
[23:10:59] nuria, you know the stripping of http://www. in the request logs?
[23:11:59] where is that done?
[23:23:38] (PS1) OliverKeyes: Avoid excluding Wikidata pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193511
[23:24:05] (PS2) OliverKeyes: Avoid excluding Wikidata pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193511
[23:24:33] (Abandoned) OliverKeyes: Avoid excluding Wikidata pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193511 (owner: OliverKeyes)
[23:27:07] (PS1) OliverKeyes: Avoid excluding Wikidata pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193513
[23:27:50] (PS2) OliverKeyes: Avoid excluding Wikidata pageviews [analytics/refinery/source] - https://gerrit.wikimedia.org/r/193513
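
Note on the 21:22-21:23 exchange about whether the action=edit check needs a regex: the linked String.contains method does a plain substring test, which may be all a static marker needs. The sketch below is only an illustration of that point. The class name EditAttemptCheck, both method names, and the sample query strings are hypothetical and are not taken from analytics/refinery/source; the only thing drawn from the log is the literal "action=edit".

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the refinery implementation.
final class EditAttemptCheck {
    // The static string mentioned in the chat.
    private static final String EDIT_MARKER = "action=edit";

    // Regex form of the same check; Pattern.quote keeps the marker literal.
    private static final Pattern EDIT_PATTERN =
            Pattern.compile(Pattern.quote(EDIT_MARKER));

    // Plain substring test, as suggested via the String.contains link.
    static boolean viaContains(String uriQuery) {
        return uriQuery != null && uriQuery.contains(EDIT_MARKER);
    }

    // Equivalent regex test, leaving room for a richer pattern later.
    static boolean viaRegex(String uriQuery) {
        return uriQuery != null && EDIT_PATTERN.matcher(uriQuery).find();
    }

    public static void main(String[] args) {
        // Assumed sample query strings, chosen only to exercise both paths.
        String[] samples = {"?title=Foo&action=edit", "?title=Foo", "?action=history"};
        for (String q : samples) {
            System.out.println(q + " contains=" + viaContains(q) + " regex=" + viaRegex(q));
        }
    }
}
```

For a fixed literal the two methods agree on every input, so the choice is mostly about readability versus keeping a pattern around for future changes, which is the trade-off raised at 21:23:16. Either form is a substring match, so a longer value that merely starts with the marker would also match.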